Turning Data into Information, Part 1: Start with the Basics: What, How and Why

When you’re starting a project that involves collecting data (user accounts, etc.) that you might want to analyze (and let’s face it – what project doesn’t?), how do you start? Here are some of the approaches we’ve seen.

  • Focus on the data model first and foremost. Think about what you need today, and what you might want later, and try to get it all into the project.
  • Start with just the minimal data requirements. Let the application framework build your database for you. Add data columns if/when needed. Build your reports and intelligence ad-hoc as you go.
  • Ignore the needs for analysis – focus only on what data the project itself requires, and figure out later what analysis you can do with it.

Whichever approach is taken, the analysis that comes out of it is usually both inefficient and unintuitive. Data people are brought in. Developers optimize tables, indexes, column types and foreign key relationships.

But what is your project really about?

Most of the time, your project starts here – with the customer. In the classic detective model, let’s call this the “what”, because it’s what we care about.

Next, we’ll typically ask ourselves what we know about the customer. We might know what product a customer buys or is interested in, and then any number of other pieces of data, like their location, industry, company size, or more. That leads us to a clearer picture of the customer (pun intended), which we often draw out as a diagram (even if we only draw it in our heads).

We might then put that into a view that includes other customers, when we’re thinking about rolling out a project to a wider audience.

We look for aspects where customers connect or overlap. Again, this might be an industry characteristic, a problem type or something else.
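To make the idea of connections concrete, here is a minimal Python sketch of the picture described above: each customer is a "what", described by a handful of attributes, and an overlap is any attribute value that two or more customers share. The customer names, attribute keys and the find_overlaps helper are all invented for illustration; they are assumptions, not part of any real project or library.

```python
from collections import defaultdict

# Hypothetical customers: every name and attribute value here is made up.
customers = {
    "Acme": {"industry": "retail", "location": "Berlin", "product": "analytics"},
    "Globex": {"industry": "retail", "location": "Austin", "product": "analytics"},
    "Initech": {"industry": "finance", "location": "Austin", "product": "reporting"},
}


def find_overlaps(customers):
    """Return the (attribute, value) pairs shared by two or more customers."""
    groups = defaultdict(set)
    for name, attributes in customers.items():
        for key, value in attributes.items():
            groups[(key, value)].add(name)
    # Keep only the pairs that actually connect customers to each other.
    return {pair: sorted(names) for pair, names in groups.items() if len(names) > 1}


overlaps = find_overlaps(customers)
for (attribute, value), names in sorted(overlaps.items()):
    print(f"{attribute}={value}: {', '.join(names)}")
```

With this sample data, the sketch reports that Acme and Globex connect through industry and product, while Globex and Initech connect through location. A real project would likely hold this in database tables rather than dictionaries; the point is only that the customer comes first and the connections are derived from what we know about them.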

This is the approach that makes the most sense in our heads. And yet time and again, we see projects optimized for any number of other factors rather than for understanding.

Fundamentally, when we start with the customer (what) and the elements that describe them (how – our connections), the combination of this information leads us to insight and understanding (why).

Most of the complex data projects we’ve been working on start with this foundation. And they lead to deeper, systemic insight. That insight leads to success. It’s quite basic, really.

Why would we make it any more complicated?