
From data to information: The NZ Budget 2015

What can we learn from the NZ Budget 2015?

Recently the New Zealand government delivered its 2015 annual budget. The Treasury put together a spreadsheet listing all departmental appropriations, along with comments on the intended use of each allocation. We wanted to see whether we could extract some insights from this government information, even though we had little idea of what we might find in the data.

Getting the Data

After downloading the spreadsheet, we spent some time analyzing what the important concepts were. These tend to be the column headings in the spreadsheet, such as Appropriation Name, Category and Function in this case. FlockData calls these “Tags”. We call each row in the spreadsheet an “Entity”.

NZ Budget - the original spreadsheet
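
To make the Tag and Entity idea concrete, here is a minimal Python sketch. It is not FlockData code: the `Entity` class, the `row_to_entity` helper and the `Comment` column are names we made up for illustration, and the sample values are placeholders rather than real budget figures. Only the three tag column names come from the spreadsheet headings mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One spreadsheet row, plus the tags pulled out of it (hypothetical model)."""
    row: dict
    tags: dict = field(default_factory=dict)


# Column headings we treat as tags; these three are named in the post.
TAG_COLUMNS = ["Appropriation Name", "Category", "Function"]


def row_to_entity(row: dict, tag_columns=TAG_COLUMNS) -> Entity:
    """Wrap a row as an Entity and copy the chosen non-empty columns out as tags."""
    tags = {col: row[col] for col in tag_columns if row.get(col)}
    return Entity(row=row, tags=tags)


# Illustrative row only; the values are placeholders, not real budget data.
sample_row = {
    "Appropriation Name": "Example appropriation",
    "Category": "Example category",
    "Function": "Education",
    "Comment": "Free-text description of the allocation",  # assumed column name
}

entity = row_to_entity(sample_row)
print(entity.tags)
```

The row itself stays intact on the entity, while the tags become the handles we can later query and chart on.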

 

After about 10 minutes of analysis, we spent a further 20 minutes producing a “Content Profile”. The Content Profile tells FlockData how to handle the incoming data. Content Profiles provide a mapping that converts two-dimensional data into multi-dimensional information. We’ll come to that later on in this post. Here’s an excerpt of the Content Profile:

A FlockData Content Profile

With the Content Profile in place and the raw data from the spreadsheet exported as a tab-delimited file, we imported the data into FlockData. Total time elapsed: 30 minutes.
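
For readers who want a feel for what a profile-driven import does, the sketch below applies a small mapping to a tab-delimited export using only the Python standard library. The `PROFILE` dictionary is our own simplification and is not FlockData's actual Content Profile format. The `Amount` column name and all sample values are assumptions made for illustration.

```python
import csv
import io

# Hypothetical profile: NOT FlockData's real Content Profile format.
PROFILE = {
    "tags": ["Appropriation Name", "Category", "Function"],
    "amount": "Amount",  # assumed name of the numeric column
}

# Placeholder tab-delimited export; the values are made up for illustration.
TSV = (
    "Appropriation Name\tCategory\tFunction\tAmount\n"
    "Example A\tCategory X\tEducation\t100\n"
    "Example B\tCategory Y\tHealth\t250\n"
)


def load(tsv_text: str, profile: dict) -> list:
    """Read tab-delimited text and split each row into tags and an amount."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    records = []
    for row in reader:
        records.append({
            "tags": {name: row[name] for name in profile["tags"]},
            "amount": float(row[profile["amount"]]),
        })
    return records


records = load(TSV, PROFILE)
for record in records:
    print(record)
```

The point of the profile is that the description of what matters lives outside the loading code, so the same loader could handle a differently shaped spreadsheet given a different mapping.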

 

Analysis

With the data in place, we’re now in a position to perform ad-hoc analytics using any tool that can communicate over HTTP, using JSON documents to request and receive data. FlockData’s visualization workbench, the FlockData Viewer (FD-VIEW), can be used to get a high-level overview.

Since we described the information on the incoming spreadsheet, FlockData can make it available to query based on the Tags we defined as being important. In this example, we’ve said we are interested in:

  • All spending in the year 2015
  • Spending on Education

We’ve asked for the chart to show the allocation by Functional Classification relative to Appropriation. These headings are described in the spreadsheet itself, to help anyone reading the document understand them. This chart is called a Chord Flare diagram. The thickness of a given segment is based on the amount of money allocated between the source and the target; in this example, from Education funding to Early Childhood Education.

NZ Budget 2015 Education Chord Flare Diagram
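
The thickness of each segment comes down to a grouped sum: add up every amount that flows between the same source and target. Here is a minimal sketch of that calculation, assuming the data has already been reduced to (source, target, amount) triples. The `link_weights` function is our own illustrative name, not part of FlockData, and the amounts are placeholders rather than real budget figures.

```python
from collections import defaultdict

# Placeholder (source, target, amount) triples; not real budget figures.
ALLOCATIONS = [
    ("Education", "Early Childhood Education", 10.0),
    ("Education", "Early Childhood Education", 5.0),
    ("Education", "Primary Education", 20.0),
]


def link_weights(allocations):
    """Sum the amounts for each (source, target) pair."""
    totals = defaultdict(float)
    for source, target, amount in allocations:
        totals[(source, target)] += amount
    return dict(totals)


weights = link_weights(ALLOCATIONS)
for (source, target), total in weights.items():
    print(f"{source} -> {target}: {total}")
```

A charting library would then scale these totals to segment widths.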

At first glance, the data is quite noisy. But because we are able to chart on any of the text contained in the spreadsheet, we can filter and refine our analysis extensively and easily. Let’s say that we’re interested only in educational spending that is related to schools. We simply tell FlockData to also restrict the results to budget items containing “School”. Now we start seeing a more focused view of the information.

Narrowing the view to School funding only
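
Conceptually, this refinement is two tag filters plus a text match. The sketch below shows the idea in plain Python over a handful of placeholder records. The field names (`year`, `function`, `name`) and the `refine` helper are assumptions for illustration and say nothing about how FlockData runs the query internally.

```python
# Placeholder records; names and values are illustrative, not real budget items.
RECORDS = [
    {"year": 2015, "function": "Education", "name": "Primary School example item"},
    {"year": 2015, "function": "Education", "name": "Tertiary example item"},
    {"year": 2015, "function": "Health", "name": "Hospital school example item"},
    {"year": 2014, "function": "Education", "name": "Secondary School example item"},
]


def refine(records, year, function, text):
    """Keep records matching the year and function whose name contains the text."""
    needle = text.lower()
    return [
        r for r in records
        if r["year"] == year
        and r["function"] == function
        and needle in r["name"].lower()  # case-insensitive substring match
    ]


school_items = refine(RECORDS, year=2015, function="Education", text="School")
for item in school_items:
    print(item["name"])
```

Each extra condition narrows the result set, which is why the filtered chart is so much less noisy than the first one.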

We can even change the presentation of the information to give us a different perspective on the same result:

NZ Budget 2015 School funding shown in a bipartite graph

We can see at a glance that Primary Education is receiving more funding than Secondary Education. Considerable funds are being spent on managing school property portfolios, while a tiny amount is spent on school furniture, equipment and improvements.

Does this tell us anything?

That’s the difference between information and insight.

Systems rely on data, but humans make decisions on information. FlockData provides a way of organising data as information.

Codifying insight gained from information into automated systems that respond to changes in data is one way that companies will leverage data to their competitive advantage. FlockData eases that journey by helping you describe what is important about information and putting that power into the hands of your systems, processes and users.

What we have done here is take some publicly available data, identify the key concepts in it and then perform a number of ad-hoc analyses of the information. This is all designed to allow rapid exploration of data to find information that supports the decision-making process of people and systems alike.