Data Coalition Celebrates New Open Data Law on Capitol Hill


On Wednesday, the Data Coalition hosted a Legislative Data Demo Day to show what’s possible when we make our laws and legislation more accessible.

Across all of our policy initiatives, the Data Coalition encourages federal and state governments to create or collect data in machine-readable structures using non-proprietary formats. This past Wednesday we explored how legal and regulatory information can be restructured to provide maximum value to both lawmakers and the public.

We need to transform laws and regulations into searchable, usable data…the Statutes at Large Modernization Act (HR 4006) does just that.

From left to right: Daniel Schuman, Joel Gurin, Christian Hoehner, Rep. Brat.

Congress is poised to transform its legislative information from outdated documents into open, searchable data. If the House and Senate finally adopt a consistent data format for all bills, amendments, passed laws, and legal compilations, then new software could bring greater transparency and more efficient lawmaking. It’s for these reasons that the Data Coalition offers an earnest endorsement of the Statutes at Large Modernization Act (SALMA, HR 4006), a bipartisan bill introduced by Reps. Dave Brat (VA-7-R) and Seth Moulton (MA-6-D).

SALMA takes a giant step toward a data-driven future by establishing a structured data format for the Statutes at Large.

While the U.S. Code organizes laws by subject matter (a process known as codification), it doesn’t contain the original text of laws that have since been amended, or annual acts like yearly federal appropriations. The Statutes at Large lists all laws sequentially, the way they were originally passed by Congress, from 1789 to the present. It is the definitive collection of U.S. statutory law.

Why does fixing the U.S. Statutes matter? Congressman Brat, an economics professor by trade, explained how a comprehensively searchable and open Statutes at Large could enable legislative staff, legal researchers, academics, and the public to examine how past generations exercised Constitutional powers at various points in history. “Access to accurate information is the lifeblood of academia…and we have a duty to make good decisions,” he reiterated.

Together with similar reforms for other legislative materials, SALMA will enable better decision making. The functional benefits include automatic redlining between bills and the laws they amend, electronic crosswalks from Congressional budgetary appropriations to the final disbursement of taxpayer funds, and cheaper, easier legal research.
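To make the redlining idea concrete, here is a minimal sketch in Python using only the standard library's difflib module. It assumes the current law and the amended text are already available as plain strings; the function name, the marker style, and the sample sentences are illustrative assumptions, not anything specified by SALMA or used by Congress.

```python
import difflib


def redline(current_law: str, amended_law: str) -> str:
    """Return a word-level redline of two texts.

    Deleted words are wrapped in [-...-] and inserted words in {+...+}.
    The marker style is an arbitrary choice for this sketch.
    """
    old_words = current_law.split()
    new_words = amended_law.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged words are copied through as-is.
            pieces.extend(old_words[i1:i2])
        if tag in ("delete", "replace"):
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


# Hypothetical sample text, invented for illustration only.
current = "The Secretary shall submit a report every 2 years."
amended = "The Secretary shall submit a report every year."
print(redline(current, amended))
```

A comparison like this is only as reliable as the text fed into it, which is why a consistent, structured format for both bills and enacted laws matters.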

For a tangible example of why this matters, consider how the lack of an easily searchable Code of Federal Regulations (CFR) has made research harder at the George Mason Mercatus Center’s RegData project. Its researchers have been challenged by the messiness of the existing CFR as they have worked to develop a better methodology for accurately quantifying the total volume and impact of U.S. regulations. They originally worked with the CFR as published in machine-readable formats, but had to abandon the error-riddled XML publications and go back to screen scraping the PDF files. Reforms like SALMA, which would verify that legal information is properly structured, are extremely attractive to them as consumers of these bulk data sets.

SALMA represents a meaningful reform that enables our nation to better understand our own laws. We are grateful for the leadership of Representatives Brat and Moulton on this front.

The Legislative Data Demo Day showed strong consensus for moving to structured data, while multiple tech demos confirmed the right tech tools exist.

The Data Coalition was joined by a panel of partner organizations that found ready cooperation around the need to open legislative data. Representing these organizations were Joel Gurin of the Center for Open Data Enterprise, Daniel Castro of the Center for Data Innovation, and Daniel Schuman of the Congressional Data Coalition.

Some fun analogies emerged when we addressed the thorny issue of providing publicly available data vs. publicly accessible data (the difference between merely providing access to messy data and assuring the data is properly structured and therefore usable). Castro compared the idea of open data to an attempt to compare the cost of burritos across DC restaurants: all the information is technically available on menus, but compiling it is a herculean task. Schuman said, “the difference between open data and accessible data is the difference between a lightning bug and lightning.” Succinctly put!

Speaking of lightning, the data demo day also included a lightning round of six tech demonstrations highlighting the range of possibilities with open and accessible data.

First we learned about the benefits for citizens as we transform documents into data. Marci Harris discussed how her challenges as a former Hill staffer led her to create the online platform PopVox to deliver targeted policy input from constituents to legislators (see her presentation). And the OpenGov Foundation demonstrated their open-source Madison Project, which allows for direct public collaboration on policy drafting (see slides). Neither tool would be possible without reliable, structured access to bills.

Representing the world of legislative and legal professionals, we heard from FastCase, a platform providing legal research tools to 800,000 lawyers. Phil Rosenthal demonstrated how software can enable powerful visualizations of U.S. case law that show “the law change right before your eyes” (see slide 8). With SALMA’s reforms, case law could then be closely mapped to our U.S. Statutes!

Xcential’s CEO, Grant Vergottini, detailed how their company has been deeply involved in the U.S. House Modernization project, installing modern XML-based software to transform how bills are actually drafted and amended in real time (see screenshots of their powerful solutions). Implementing such solutions heavily depends on there being standard structures for legislative information.

And to wrap up, the House Clerk’s Kirsten Gullickson discussed the longstanding work of the XML in Congress initiative. She made the fundamental point that Congress has been open to the public since 1789. And the U.S. House has a proud legacy of modernization projects responsive to the constantly evolving way in which the public consumes information (see this technology timeline).

What’s the ultimate vision? A legal and regulatory system that is searchable and logical.

By expressing laws and regulations as data, we’ll ultimately be able to automate compliance. This could reduce the need for layers of lawyers between businesses and the laws they must follow. Once laws and regulations are expressed as lines of code, that data will enable greater searchability and the development of automated interpretation tools.

This greater accessibility means quicker interpretation of, and more rapid compliance with, our laws so that they can ultimately have their intended effect.

Highlighting the current problem, GSA’s Dave Zvenyach commented on how expressing laws and regulations as data takes lawyers and coders willing to work together to make sense of convoluted regulatory information. He showed how the 18F team’s work on the eRegulations project is restructuring how Consumer Financial Protection Bureau and Bureau of Alcohol, Tobacco, and Firearms regulations are presented. The intuitive, easy-to-navigate portal allows users to effortlessly find legal definitions and see how regulations are interlinked. However, he noted that this work would be a whole lot easier if regulations were better structured from the start.

To get there, governments need to treat legal and regulatory information as structured data and express that information in machine-readable formats.
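As a rough illustration of what machine-readable law makes possible, the Python sketch below searches a tiny XML collection of statutes for a keyword and reports which law and section matched. The element names, attributes, and sample acts are hypothetical, invented for this example; they do not reflect the schema that SALMA would establish or any format Congress publishes today.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the <statutes>/<law>/<section> structure,
# the ids, and the text are all invented for illustration.
SAMPLE = """
<statutes>
  <law id="example-act-a" enacted="1900-01-01">
    <section num="1">The Secretary shall publish an annual report.</section>
    <section num="2">This Act takes effect on enactment.</section>
  </law>
  <law id="example-act-b" enacted="1950-01-01">
    <section num="1">Each agency shall submit a report to Congress.</section>
  </law>
</statutes>
"""


def find_sections(xml_text: str, keyword: str) -> list[tuple[str, str]]:
    """Return (law id, section number) pairs whose text contains keyword."""
    root = ET.fromstring(xml_text.strip())
    hits = []
    for law in root.iter("law"):
        for section in law.iter("section"):
            text = section.text or ""
            # Case-insensitive substring match on the section text.
            if keyword.lower() in text.lower():
                hits.append((law.get("id"), section.get("num")))
    return hits


print(find_sections(SAMPLE, "report"))
```

Because every section carries its own identifier, a search result points to an exact provision rather than a page in a PDF, which is the practical difference between structured data and a scanned document.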

Reforms like SALMA, and platforms like those we saw demoed this week, are bringing us closer to this vision of truly open law.