OH, CRAP! Dealing with the data explosion
Today, I’m going to talk about crap. No, this isn’t an episode of Hoarders. I’m going to talk about crap in the digital sense, not the “I’m not emotionally capable so I’m going to allow myself to be buried in mountains of stuff” way we may associate with the word. I’m going to discuss it in the context of data.
Once upon a time, a significant number of people produced tangible goods such as cars, food, and clothes. Manufacturing has given way to service-centered occupations (as shown in this Population Reference Bureau report) which require us to utilize our mental skills more than our physical ones. We’ve gone from being craftsmen to knowledge workers. Now most of us produce data and boy, do we produce a lot! Why do I say this? Well, I recently read a blog post called The Coming Data Explosion and in it the author discussed how people are producing more online data than ever before. In 2009 alone 281 exabytes of data were created, compared to just 5 in 2002.
What’s an exabyte? This is what it looks like, mathematically speaking:
1,000,000,000,000,000,000
To get a sense of how much that is, let’s look at my laptop. Its hard drive has room for 100 gigabytes of data storage. A gigabyte is a billion bytes, a byte being the most commonly used unit of measurement in computing, so my laptop holds 100 billion bytes. With this in mind, a single exabyte is 10 million of my laptops’ worth of data, and the 281 exabytes generated last year would fill about 2.8 billion of them.
That, my friends, is a lot of crap.
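For anyone who wants to check the arithmetic, here is a quick sketch in Python. It assumes decimal units (1 gigabyte = 10^9 bytes, 1 exabyte = 10^18 bytes), and the names in it are my own, made up for illustration.

```python
# Back-of-the-envelope check, assuming decimal (SI) units.
GIGABYTE = 10**9    # bytes in a gigabyte
EXABYTE = 10**18    # bytes in an exabyte

LAPTOP_CAPACITY = 100 * GIGABYTE   # the 100 GB laptop drive from the post
DATA_2009 = 281 * EXABYTE          # the 281 exabytes quoted for 2009


def laptops_filled(total_bytes, capacity_bytes=LAPTOP_CAPACITY):
    """Return how many whole drives of the given capacity the data would fill."""
    return total_bytes // capacity_bytes


print(f"Laptops per exabyte: {laptops_filled(EXABYTE):,}")
print(f"Laptops for 2009's data: {laptops_filled(DATA_2009):,}")
```

Under those assumptions one exabyte fills 10 million of these drives, and 281 exabytes fills roughly 2.8 billion. If you count a gigabyte as 2^30 bytes instead, the totals shift a little but the order of magnitude stays the same.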
But for innovators, this mountain of crap data is a potential goldmine. As the saying goes, one person’s trash is another person’s treasure. In the right hands, data can drive intelligent decision making and create opportunities for cost savings, increased productivity, or better quality in product offerings or services. It just needs the right conditions for success. The 4 necessary conditions are the following:
- Data access
- Buy-in from leadership
- Disciplined experimentation
- Quick and clear reporting
Gaining access to the data is the first hurdle to overcome. To paraphrase noted author Jac Fitz-enz, “We do not have a scarcity of data, but we may have difficulty collecting it easily.” How is it stored and in what format? How easy is it to tap into? What data do we need that’s not in-house? Do we trust it, meaning is the data accurate? Is there a platform by which the data can be compiled and manipulated without negatively impacting other systems (e.g., a test environment)?
The next hurdle is obtaining leadership support. That means answering the question, “How will this data create value for the organization?” Be prepared to present a business case which shows how this initiative will increase revenues through improvements in service, quality, and/or productivity. Perhaps creating conditions for deeper data analysis will highlight opportunities for better product offerings. This is one of the ways in which Harrah’s uses its data to remain the #1 casino company, based on revenue.
Disciplined experimentation is the next step once data access and leadership support have been secured. By disciplined experimentation I mean two things: it must be rigorous (that is, done in a high-quality manner) and consistent (able to reproduce results repeatedly). Utilizing the principles of the scientific method (observation, hypothesis, testing, and review) is a good way to create this sort of process.
The final condition is quick and clear reporting. The people running the experiments must be able to quickly share relevant insights with key stakeholders, such as peers, partners, and collaborators. Reporting needs to be cogent so as to support leadership’s ability to make well-informed business decisions.
In conclusion, people and organizations produce a lot of crap. Be smart and take advantage of it!
Crap photo by Paul. Data photo by J. Kleyn. Both licensed under CC.

You're hitting on a topic that is interesting to many business professionals. What surprises me is seeing that topic quantified; the numbers seem unimaginable. Just last night, Andrew McAfee (http://andrewmcafee.org/about) was talking about Enterprise 2.0 and how organizations have so much data from so many people that they have trouble capturing it and using that knowledge to better the organization. Thanks for sharing some of the detail around the topic.
Thanks Trish! Data management is becoming an increasingly large part of an organization's focus. It could also be a great source of value and competitive advantage. I just wrote a humorous spin-off post based on this assumption.
Thanks again Trish!