Data Coalition Submits Ideas to Inform the New Federal Data Strategy
Last week, the Data Coalition responded to the newly released Federal Data Strategy, which we summarized in a blog post two weeks ago.
The Federal Data Strategy is an outgrowth of the President’s Management Agenda, specifically Cross-Agency Priority Goal #2 – Leveraging Data as a Strategic Asset, which is co-led by the Office of Management and Budget (OMB), the Office of Science and Technology Policy (OSTP), the Department of Commerce (DOC), and the Small Business Administration (SBA). Administration officials within these agencies called for public feedback and are currently working through the responses. We expect to see more detailed plans between October and January 2019 (see page 11 of the recent action plan).
Commentary on Draft Principles:
The Federal Data Strategy proposes 10 principles spread across three categories: Stewardship, Quality, and Continuous Improvement.
Overall, we emphasized the benefits of ensuring that federal data assets are in open and machine-readable formats that impose uniform and semantic structure on data, thus mitigating organizational uncertainties and expediting user development. We also discussed the importance of pursuing data standardization projects that identify common data elements across organizations and enforce standards.
For Stewardship, it is important to connect data owners and end users so that data is presented in useful ways and data quality can be continuously improved.
With regard to Quality, it is important to establish policies to ensure that core ‘operational’ and ‘programmatic’ data assets are accurate, consistent, and controlled. We note simply that any data standards or quality thresholds should be enforced at the point of ingestion. As a starting place for data strategy principles, we recommend incorporating the open data principles identified by the CIO Council’s Project Open Data.
Finally, for Continuous Improvement, we recommend that data be made available in open formats for bulk download by default. This allows for maximum stakeholder engagement from the beginning.
Six Proposed Open Data Use Cases:
We also propose the following six use cases for the Administration to work on:
Use Case 1: Fix Public Company Filings (Access, Use, and Augmentation)
The Securities and Exchange Commission (SEC) requires public companies to file financial statements in standardized XBRL format, but the standard has complications. Currently, the format allows for too much custom tagging, inhibiting the goals of comparability and transparency. The Administration should work with the SEC and the Financial Accounting Standards Board (FASB) to ensure that the U.S. Generally Accepted Accounting Principles (US GAAP) taxonomy enforces FASB rules as the true reference for all elements in the taxonomy, thus eliminating unnecessary tags, reducing overall complexity, and minimizing the creation of extension data elements. This will ultimately improve comparability and data quality.
Use Case 2: Documents to Data in Management Memorandum (Decision-Making and Accountability)
Congress has already taken on the challenge of adopting a data standard for laws and mandates via the United States Legislative Markup (USLM), which provides a framework for how the Administration can transform federal documents into open data. The Administration should publish federal management guidance in integrated, machine-readable data formats instead of documents. This will allow agencies to better understand how policies integrate with each other and thus work to comply more readily, and allow the public and Congress to better understand the specific factors guiding and constraining agency programs and leadership.
Use Case 3: Entity Identification Working Group (Enterprise Data Governance)
Currently, the federal government uses a variety of different codes to identify companies, nonprofits, and other non-federal entities, which makes matching data sets across federal agencies a time-consuming and expensive undertaking. Adoption of the Legal Entity Identifier (LEI) as the default identification code for legal entities will enable agencies to aggregate, compare, and match data sets critical to their regulatory and programmatic missions.
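To illustrate why a single identifier eases matching, here is a minimal sketch of how an agency system might check an LEI before accepting a record. It relies on the fact that an LEI (ISO 17442) is 20 alphanumeric characters whose last two digits are ISO 7064 MOD 97-10 check digits. The function names and the sample prefix are hypothetical, chosen only for illustration, and a passing checksum shows only that the code is well formed, not that it is registered to a real entity.

```python
import string

# Characters permitted in an LEI: digits and uppercase Latin letters.
ALLOWED = set(string.digits + string.ascii_uppercase)


def _to_digits(text: str) -> str:
    """Map each character to its numeric form: 0-9 stay, A=10 ... Z=35."""
    return "".join(str(int(ch, 36)) for ch in text)


def lei_check_digits(prefix18: str) -> str:
    """Compute the two MOD 97-10 check digits for an 18-character prefix."""
    remainder = int(_to_digits(prefix18 + "00")) % 97
    return f"{98 - remainder:02d}"


def is_valid_lei(lei: str) -> bool:
    """Return True if the string has LEI shape and a correct checksum."""
    lei = lei.strip().upper()
    if len(lei) != 20 or not set(lei) <= ALLOWED:
        return False
    if not lei[18:].isdigit():
        return False
    # A well-formed LEI, read as one large integer, leaves remainder 1.
    return int(_to_digits(lei)) % 97 == 1


if __name__ == "__main__":
    # Hypothetical prefix, not a real registered entity.
    prefix = "5493001234567ABCDE"
    candidate = prefix + lei_check_digits(prefix)
    print(candidate, is_valid_lei(candidate))
```

A check like this could run wherever agencies receive entity data, so that malformed identifiers are rejected before data sets are joined.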
Use Case 4: Mission Support or Operational Data Standards Coordination (Decision-Making and Accountability)
Treasury and the Office of Management and Budget (OMB) have spent over four years working to establish and integrate the DATA Act Information Model Schema (DAIMS), which links budget, accounting, procurement, and financial assistance datasets – operational data – that were previously segmented across federal agencies. The Administration should utilize the DAIMS for modernizing the annual budget process, agency financial reporting, and agency performance reporting, thus allowing for easy use of data to compare, justify, and plan budget goals and agency spending.
Use Case 5: Mission or Programmatic Data Standards Coordination (Enterprise Data Governance; Decision-Making and Accountability; Access, Use, and Augmentation)
To build a common approach to multi-agency programmatic data sharing, the Departments of Homeland Security and Health and Human Services created the National Information Exchange Model (NIEM), which maintains a data dictionary of common fields allowing agencies to create formats using those fields. The Administration should consider endorsing NIEM as the government-wide default for programmatic data standardization and publication projects. This will afford agencies the easier path of reusing common data fields of the NIEM Core, rather than building their own data exchanges and reconciliation processes.
Use Case 6: Establish a Standard Business Reporting Task Force to Standardize Regulatory Compliance (Enterprise Data Governance; Access, Use, and Augmentation)
Standard Business Reporting (SBR), which has been fully implemented in Australia, demonstrates that regulatory agencies can reduce the compliance burden on the private sector by replacing duplicative forms with standardized data, governed by common data standards across multiple regimes. The Administration should convene a task force representing all major U.S. regulatory agencies to create a roadmap for standardizing the data fields and formats that they use to collect information from the private sector. While the full implementation of a U.S. SBR program would require a multi-year effort, the creation of an exploratory task force would put the policy barriers and necessary investments into scope.
Read the Data Coalition’s full submission here.
Other Organizations’ Feedback Echoes an Open Data Approach
While the responses have not yet been made public in a central portal, we have gathered a few of the key submissions.
The Bipartisan Policy Center (BPC) has issued two separate comment letters. The first letter, on behalf of the former leadership of the Commission on Evidence-Based Policymaking, summarizes the Commission’s recommendations. The second letter summarizes recommendations made by the BPC-coordinated Federal Data Working Group, which the Data Coalition works with. Here we have joined a call to clarify the guidance from the 2013 open data Executive Order and its accompanying OMB memorandum (M-13-13) (e.g., define “data asset” and renew the focus on data inventories), leverage NIEM to develop data standards, look into harmonizing entity identifiers across agencies, explore preemptive implementation of the Foundations for Evidence-Based Policymaking Act (H.R. 4174), which includes the OPEN Government Data Act, and define terminology for types of public sector data (i.e., similar to our comment’s demarcation between operational and programmatic data).
The Center for Data Innovation (CDI) think tank also provided feedback that calls for the Administration to support the passage of the OPEN Government Data Act as “the single most effective step” the Administration could take to achieve the goals of the Federal Data Strategy. Additionally, CDI calls for improvements to data.gov’s metadata, for OMB to establish an “Open Data Review Board” for incorporating public input in prioritizing open data projects, and for the Administration to establish “data trusts” to facilitate sharing of non-public data. Lastly, CDI urges the Administration to consider how the Internet of Things (IoT) revolution and Artificial Intelligence (AI) should be included in the conversation.
The data standards organization XBRL US recommends that the Administration “require a single data standard for all financial data reporting…to establish a single data collection process,” adopt the Legal Entity Identifier for all entities reporting to the federal government, and use automated validation rules to ensure data quality at the point of submission.
The new State CDO Network sent a letter emphasizing the important role of State and local governments. They wrote, “[States are] in the unique position of creating and stewarding data based on federal requirements,” while calling for a formal plan to leverage administrative data to address fraud, waste, and abuse.
The Preservation of Electronic Government Information (PEGI) Project calls for an advisory board to make recommendations on data management and stewardship while echoing our call to utilize the open government data principles and also incorporate the FAIR lifecycle data management principles. PEGI also calls for scalable and automated processes for maximizing the release of non-sensitive data on data.gov.
Lastly, the American Medical Informatics Association (AMIA) identifies the publication and the harmonization of data dictionaries across agencies as two fundamental activities. They also call for collecting and creating information in ways that support “downstream information processing and dissemination,” for establishing a framework to help agencies implement a “portfolio approach” to data asset management, and for the Administration to extend the concept of “data as an asset” to information produced by federal grant recipients and contractors.
The Data Coalition will be working with these groups and others to align the Administration’s efforts to establish a pragmatic, sustainable, and effective Federal Data Strategy.