The SEC collects public information from thousands of corporations and publishes it on its EDGAR website. But almost all of the information the SEC collects is expressed as PDF or plain-text documents, not structured data.
This is a problem because the disconnected and static nature of the SEC’s information portfolio locks valuable data away from those who want to use it—investors, markets, tech companies, and even the regulators’ own staff. Without expensive, one-off analytics projects, it is impossible to perform sophisticated data analysis, automate fraud detection, or track changes in a particular company’s disclosures over time.
Open Data at the SEC
In 2009, the SEC began collecting corporate financial statements in the eXtensible Business Reporting Language (XBRL) data format. XBRL is a freely licensable variant of the eXtensible Markup Language (XML) built specifically to handle financial statements and similar types of business information. In XBRL, each line item and each number in a financial statement has a unique electronic tag, so financial statements expressed in XBRL can be read automatically by software.
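To make "automatically read by software" concrete, here is a minimal Python sketch. It assumes a heavily simplified, made-up fragment written in the style of an XBRL filing: the `ex` namespace, the tag names, the context labels, and the dollar values are all hypothetical, and a real filing would also define its contexts and units and draw its tags from a published taxonomy. The point is only that once every number carries a tag, a few lines of code can pull the numbers out and compare them across periods.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up fragment in the style of an XBRL instance document.
# The "ex" namespace, tag names, context labels, and values are hypothetical.
SAMPLE = """\
<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:ex="http://example.com/hypothetical-taxonomy">
  <ex:Revenues contextRef="FY2014" unitRef="USD" decimals="0">1200000</ex:Revenues>
  <ex:NetIncomeLoss contextRef="FY2014" unitRef="USD" decimals="0">150000</ex:NetIncomeLoss>
  <ex:Revenues contextRef="FY2015" unitRef="USD" decimals="0">1500000</ex:Revenues>
  <ex:NetIncomeLoss contextRef="FY2015" unitRef="USD" decimals="0">-40000</ex:NetIncomeLoss>
</xbrl>
"""


def read_facts(xml_text):
    """Return {(tag, context): value} for every tagged number in the fragment."""
    root = ET.fromstring(xml_text)
    facts = {}
    for element in root:
        context = element.get("contextRef")
        if context is None:
            continue  # not a tagged fact
        # ElementTree renders qualified names as "{namespace}LocalName".
        tag = element.tag.split("}", 1)[-1]
        facts[(tag, context)] = int(element.text)
    return facts


facts = read_facts(SAMPLE)

# Track one line item across two reporting periods.
change = facts[("Revenues", "FY2015")] - facts[("Revenues", "FY2014")]
print(f"Revenue change: {change}")
```

Doing the same comparison against a PDF or plain-text filing would mean locating the right table and parsing its layout by hand, which is exactly the one-off analytics work that tagging is meant to remove.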
The SEC’s adoption of XBRL was one of the U.S. government’s biggest and earliest open data efforts. In fact, it happened so early that the term “open data” wasn’t even in common use yet.
Dual filing disaster
Regrettably, when the SEC started its open data program, it never stopped collecting financial statements as plain-text documents as well, so every company has to submit its financial statements twice—once as documents and again as data. As a result, investors have not embraced the data version and companies see it as an unnecessary addition to their compliance requirements. Meanwhile, the SEC never got around to modernizing the rest of the hundreds of forms it collects from public companies.
So far, XBRL has been disappointing. SEC staff continue to focus on the old-fashioned document version and ignore the data version, and the companies know that. As a result, the quality of XBRL data has suffered. It was so poor, especially at first, that investors and the information companies serving them have been reluctant to stop using the old-school documents and give XBRL data a try.
Companies haven’t yet seen their capital costs fall, as they might have if analysts were able to follow a wider range of companies. Members of Congress have even introduced legislation that would require the SEC to stop collecting data from most companies and revert to documents.
Beyond the financial statements, the SEC also collects insider stock trading information using XML and recently proposed to start collecting a few executive compensation details as open data. But with those minor exceptions, almost the whole universe of SEC corporate information is document-based rather than open data.
Fixing the SEC’s open data and disclosure system
The SEC has invited government, industry, and the public to submit comments on how the agency can improve the effectiveness of its corporate disclosure system. Our Coalition submitted a sixteen-page formal comment letter on October 29, 2015. You can read the comment letter embedded below or on SEC.gov here.
Our prescription to fix the SEC’s disclosure system can be summarized in two words: structured data.
Structured data would make regulatory filings more transparent, useful, and efficient for everyone who generates, collects, and uses the information they contain. The benefit to investors, regulators, markets, and filers would be immense, because structured information is easily distilled into actionable insights. We believe the agency must replace its current document-based system with one built on standardized data fields and formats.
If the SEC makes a plan to adopt consistent data fields and formats in place of the outdated document-based forms, the Missed Connection won’t be missed anymore. Open data and corporate disclosure will come back together. And they’re a perfect match.
A proposal currently before Congress, the Financial Transparency Act, would ensure that the SEC and the seven other major U.S. financial regulatory agencies make the information they already collect from industry available online as open data: electronically searchable, downloadable in bulk, and free of license restrictions. Should it pass, the Financial Transparency Act will legally require the SEC to take the steps outlined in the comment letter below.