The (fixable) problem with Steve Ballmer’s USAFacts site: its lack of transparency

Those of us who have been in or around the technology space since the 1990s, when Microsoft used to bulldoze everything in front of it (for those younger than that: like Google in the 00s, and Facebook now), are completely unused to Steve Ballmer doing things that we can uncomplicatedly see as good.

But his new site, USAFacts, is one of those things. Like Bill Gates pouring money into vaccinations for developing countries, it’s a good thing to have done.

In case you haven’t read the New York Times piece about USAFacts (where the site is described as a “fascinating data trove”), the precis is that Ballmer has spent $10m hiring economists and others to put all the spending and other data that the US government produces – particularly its budget, where it spends $5.4trn and gets $5.2trn (it’s left as an exercise to the reader to figure out how one bridges that $0.2trn gap) – into a site which you can query, to find out just where your money goes, if you’re an American.

Neatly, it divides expenditure into the four “missions” of the US government, which apparently are “establish justice and ensure domestic tranquility; provide for the common defence; promote the general welfare; secure the blessings of liberty to ourselves and our posterity”.

There may be trouble ahead

The problem starts once you begin wandering into it and wondering: ok, how does that number come together? Take the “Crime and police” tab under the first mission: click through and you get some data showing how arrests, violent crime, and “public safety officer” numbers are moving. The prisoner data actually has a link to a set of slides – 291 PDFs – drawn from the Bureau of Justice Statistics and reproduced as PDFs by USAFacts, which sticks its own copyright onto them (down in the bottom right).

(Come on. That’s crazy. Repurposing open-licensed data as PDFs and sticking your own copyright note on them? Is this the 1990s?) In general, you’re left having to trust the site to have got it right. Plus: in almost every situation, we don’t know where the data has actually come from. The prison data is an exception. There’s no transparency in a site which is trying to make government transparent.

Unless I’ve missed something staringly obvious, there are no places where you click on a link and it says “we got this from the Bureau of Labor Statistics, and this from the Treasury, and this from the Department of Defense”. The methodologies (here’s one – PDF) will give you a number of the methods and sources from which all the numbers are collated, but not how they’re balanced out against each other.

This is a huge omission. I can understand that $10m only goes so far, but if you compare it to a site like Our World In Data, which has been built on a budget probably comparable to one day of USAFacts’s, you see the contrast. Not only are the datasets open and downloadable, you get pointers back to the originals.

Coins: adding up to something

Nor is it as though trying to analyse government spending is a new thing. In the UK, HM Treasury made its COINS database into open data after pressure from newspapers – well, particularly the Guardian, where we championed free data.

From COINS you could generate all sorts of visualisations of what was going on.

This was back in 2010; it’s not as though this is some groundbreaking piece of work that has only just been released to the public and might not have crossed the radar of, say, a multibillionaire who has enlisted a lot of economists and statisticians and programmers to do some work analysing a country’s budget.

Action points

In conclusion: Steve Ballmer has made a start. It’s an adequate start, and given the size of what he’s trying to contend with, it’s laudable that he’s got this far. But there are many more things that remain to be done.

Just in case he’s vanity-searching, here is my list of suggestions for how to turn USAFacts into a truly useful site that will let people explore what the US government and its states are doing with their money:

• link back to the original data. People need to be able to trust it.
• offer inflation-adjusted views of the data. It’s not hard to find inflation figures for past years.
• add trade figures. Imports and exports are relevant data for understanding your country’s performance, and some part of government definitely collects them.
• add comparative figures with other countries. It means nothing to tell people how much they’re spending on health care if they don’t know how that compares to other countries.
• find ways to let people drill down to their state, county and district, and compare them with other states, counties and districts. It’s just data and databases, after all.
• let people see how employment and other elements have changed. Show them how racial factors have changed. Show them how things have changed with different politicians. Everyone wants to apply these prisms to these data, and while you might not want this data to become partisan, the reality is that these numbers will be used anyway by people of whatever tinge to prove whatever they want. You’re already in that battle, so give people more weapons to fight it.
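To show how little the first two suggestions ask for, here is a minimal sketch in Python. It assumes each figure is stored with a note of where it came from, and that you have a price index to hand. Everything in it (the `Figure` record, the `to_real` helper, the index values, the spending numbers and the source labels) is made up for illustration; none of it is taken from USAFacts or from any real government series.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Figure:
    """One published number, kept together with where it came from."""
    year: int
    nominal_usd_bn: float  # the figure as originally reported, in $bn
    source: str            # pointer back to the original dataset


# Hypothetical price index (base 2000 = 100). Illustrative values only,
# NOT real inflation data.
PRICE_INDEX = {2000: 100.0, 2010: 125.0, 2015: 140.0}


def to_real(fig: Figure, base_year: int, index: dict = PRICE_INDEX) -> float:
    """Restate a nominal figure in the prices of base_year."""
    return fig.nominal_usd_bn * index[base_year] / index[fig.year]


# Hypothetical spending series; the source strings are placeholders.
spending = [
    Figure(2000, 200.0, "Hypothetical Agency A, Table 1"),
    Figure(2010, 300.0, "Hypothetical Agency A, Table 1"),
    Figure(2015, 350.0, "Hypothetical Agency B, annual report"),
]

for fig in spending:
    real = to_real(fig, base_year=2015)
    print(f"{fig.year}: ${fig.nominal_usd_bn:.0f}bn nominal, "
          f"${real:.0f}bn in 2015 prices (source: {fig.source})")
```

The point of the design is that the source travels with the number: whatever view you build on top, the pointer back to the original is still there to be shown.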

It’s a good start, Steve. Please don’t make us wait for version 3 for it to become a must-use. (Yes, kids, that was a Microsoft joke. Ask your parents.)
