In the tech industry, “Big Data” is the buzzword of the day. It means pretty much what it sounds like: a whole lot of data. TechAmerica’s definition gives us some more clarity: Big Data is “large volumes of high velocity, complex and variable data that require advanced techniques and technologies to enable the capture, storage, distribution, management and analysis of the information.” In other words, Big Data refers to situations where data is so voluminous, so varied, and accumulating so quickly that traditional techniques struggle to analyze and use it. That makes Big Data a challenge for modern organizations: skyrocketing volume, variety, and velocity are making it harder to extract meaning.
The challenge of Big Data is exemplified throughout the U.S. federal government, the world’s largest and most complicated organization. Consider the daunting complexity of federal spending. In FY 2011, the federal government spent over $3 trillion in the form of contracts, grants, loans, direct payments, and other expenditures.
Attached to all that spending is data. Why did Congress appropriate taxpayers’ money the way it did? How did the dozens of federal agencies obligate each appropriation amongst the thousands of federal programs? How did each program divide its funding between internal expenditures (salaries and such) and external awards (contracts, grants, loans, etc.)? Who received each award? How much was received, and when did the Treasury issue each payment? How much has each recipient received, in sum, from multiple federal awards? How have the awards been spent? Where were they spent? The government has the right data – somewhere – to answer each of these questions. Yet with countless distinct agencies or offices using thousands of separate IT systems, it’s essentially impossible to assemble government-wide answers. The needed data resides in too many different places, organized in too many different ways.
No definition of Big Data is complete without the big opportunity it brings. When advanced software tools are used to analyze Big Data, they can extract meaning: insights that are more accurate, more detailed, and more useful. A report by the McKinsey Global Institute shows that the analysis of Big Data creates value by enabling experimentation, exposing vulnerability, creating transparency, permitting customization, and automating risk assessment. The tech industry is developing the needed tools. Companies are using them to catch fraud in financial markets, prevent drug abuse, find waste in the electric industry, and much more.
Big Data offers the same opportunity for big government. If the same techniques and technologies that are transforming the private sector could be unleashed on federal spending data, we would have government-wide answers to the questions above. Knowing these answers would help the government catch fraudulent contractors and grantees, prioritize resources, and identify and cut waste.
To derive these answers from its Big Data, the U.S. government needs to make two changes. First, it must publish more spending data in one place so that the tools can get at it. Second, it must apply consistent standards to the spending data wherever possible. With common identifiers and a common markup language for federal spending data, Big Data tools could make connections and compare large datasets easily.
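To make the value of common identifiers concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two agency datasets, the field names, and the "recipient_id" values are invented and do not reflect any real federal system or the format the DATA Act would prescribe. It shows only that once two datasets share an identifier, totaling what each recipient received across agencies becomes a simple lookup rather than a manual reconciliation.

```python
from collections import defaultdict

# Hypothetical award records from two agencies. The field names and the
# "recipient_id" values are invented for illustration only.
agency_a_awards = [
    {"recipient_id": "R-001", "award_type": "contract", "amount": 250_000},
    {"recipient_id": "R-002", "award_type": "grant", "amount": 75_000},
]
agency_b_awards = [
    {"recipient_id": "R-001", "award_type": "grant", "amount": 40_000},
    {"recipient_id": "R-003", "award_type": "loan", "amount": 120_000},
]


def total_by_recipient(*datasets):
    """Sum award amounts per recipient across any number of datasets.

    This works only because every record uses the same identifier field.
    """
    totals = defaultdict(int)
    for dataset in datasets:
        for record in dataset:
            totals[record["recipient_id"]] += record["amount"]
    return dict(totals)


totals = total_by_recipient(agency_a_awards, agency_b_awards)
for recipient_id, amount in sorted(totals.items()):
    print(recipient_id, amount)
```

Without a shared identifier, the same recipient might appear under a different name or number in each system, and the sum for "R-001" could not be computed without first matching records by hand.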
The bipartisan Digital Accountability and Transparency Act (DATA Act), which passed the House of Representatives last year on a unanimous voice vote before dying in the Senate, is designed to make these two changes: publish and standardize federal spending data. The DATA Act would have required all the major categories of federal spending data to be published together and reported using common identifiers and a standard markup language.
The DATA Act’s death was not permanent. It is now awaiting reintroduction in the new 113th Congress. This should excite the tech industry, government transparency advocates, and anyone who cares about the effective use of taxpayers’ money. As the bill’s author, Representative Darrell Issa (CA-49), wrote in a recent op-ed, “journalists, academics, and citizen watchdogs will be able to build tools that ferret out fraudulent and wasteful spending and analyze the value taxpayers get for their dollar… the same [B]ig [D]ata analytic techniques that improve performance and save money on behalf of the shareholder today can be used on behalf of the taxpayer tomorrow.”
Stay tuned …