Monthly Archive: July 2017

Jul 09

Bringing order to S3 bucket chaos with Chaos Sumo

This summer I’ve been working on getting my digital clutter organized. With the cost of Amazon S3 being so low, it’s the perfect dumping ground for someone (me) who finds it cost-prohibitive or impractical to keep large amounts of data locally. Appropriately, Amazon S3 calls each file collection a “bucket”. What’s in my buckets? All kinds of stuff: mostly documents, spreadsheets, and log files from my website, plus a smattering of things that didn’t make the cut to stay on my primary laptop drive all those years ago.


Recently I was introduced to a platform from Chaos Sumo that adds a level of intelligence to S3. If you’ve used S3, you know a couple of specific things about it: nothing correlates the data across your buckets, and S3 can’t tell you much about what’s in each bucket. Chaos Sumo addresses both.


To explore these capabilities, I signed up for the Chaos Sumo Beta.

Chaos Sumo Beta Login Screen

Before you can ask Chaos Sumo to reveal the contents of your S3 buckets, you must give it access to your S3 files. To do that, you first need to create an Amazon IAM (Identity and Access Management) role. It sounds complicated, but it’s effectively the equivalent of giving someone a guest account on your computer, or a temporary alarm code to your home or business. You can revoke Chaos Sumo’s access at any time. The process is covered thoroughly, step by step, in their help section. In my experience so far, this was the most complicated step, and it really wasn’t that hard. Chaos Sumo is entirely cloud-hosted and provided as a service, so once account access is set up, there’s no software or server to maintain.
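
To give a feel for what such a role involves, here is a minimal Python sketch of the two policy documents a read-only, cross-account IAM role typically carries. It is not Chaos Sumo's documented setup (their help section is the authority on that); the account ID, external ID, and bucket name are made-up placeholders, and the function names are hypothetical.

```python
import json


def trust_policy(trusted_account_id, external_id):
    """Build the policy that says WHO may assume the role (the 'guest')."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Placeholder principal: the third party's AWS account.
            "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
            "Action": "sts:AssumeRole",
            # An external ID is a common safeguard for third-party access.
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }


def read_only_bucket_policy(bucket_names):
    """Build the policy that says WHAT the role may do: list and read only."""
    bucket_arns = [f"arn:aws:s3:::{name}" for name in bucket_names]
    object_arns = [f"{arn}/*" for arn in bucket_arns]
    return {
        "Version": "2012-10-17",
        "Statement": [
            # ListBucket applies to the bucket itself ...
            {"Effect": "Allow", "Action": "s3:ListBucket", "Resource": bucket_arns},
            # ... while GetObject applies to the objects inside it.
            {"Effect": "Allow", "Action": "s3:GetObject", "Resource": object_arns},
        ],
    }


if __name__ == "__main__":
    # All values below are illustrative placeholders.
    print(json.dumps(trust_policy("123456789012", "example-external-id"), indent=2))
    print(json.dumps(read_only_bucket_policy(["my-example-bucket"]), indent=2))
```

Deleting a role like this later is what revokes the access, which is the "temporary alarm code" idea in practice.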

After configuration, your next step is to explore your physical buckets. Chaos Sumo has placed some limitations on usage during the beta, but it’s enough to get you started. The initial version discovers all file types, though for now the datasets it can auto-discover are based on CSV and LOG files. I’ll cover this in a subsequent post when I get into exploration and correlation of datasets across multiple physical buckets.
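
As a rough illustration of what "discovery" means at its simplest, the Python sketch below tallies a list of object keys by file extension and picks out the CSV and LOG files as dataset candidates. This is only an illustration, not how Chaos Sumo works internally; the sample keys and function names are invented, and in practice the keys could come from something like boto3's `list_objects_v2` paginator.

```python
from collections import Counter
import posixpath

# Assumption: only these extensions count as dataset candidates,
# mirroring the beta's CSV and LOG focus.
DATASET_EXTENSIONS = {".csv", ".log"}


def summarize_keys(keys):
    """Count object keys by lowercase extension ('' means no extension)."""
    counts = Counter()
    for key in keys:
        if key.endswith("/"):
            # Skip zero-byte "folder" placeholder keys.
            continue
        _, ext = posixpath.splitext(key)
        counts[ext.lower()] += 1
    return counts


def dataset_candidates(keys):
    """Return the keys whose extension marks them as possible datasets."""
    return sorted(
        key for key in keys
        if posixpath.splitext(key)[1].lower() in DATASET_EXTENSIONS
    )


if __name__ == "__main__":
    # Invented sample keys, standing in for one bucket's listing.
    sample = [
        "docs/budget.xlsx",
        "logs/2017/access.log",
        "logs/2017/error.LOG",
        "exports/customers.csv",
        "old-laptop/",
        "old-laptop/README",
    ]
    print(summarize_keys(sample))
    print(dataset_candidates(sample))
```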

Chaos Sumo Bucket Discovery

If you would like to check out Chaos Sumo for yourself, you can sign up for the beta and tell them you heard about it from @StartsWithV.

