Connecting big business (regulation) to Big Data: Columbia report shows the need for action
The Columbia report is a call to action by both the SEC and Congress. The Data Transparency Coalition is going to pursue that action in 2013.
In 2009, the SEC adopted a requirement for public companies to file each financial statement in the eXtensible Business Reporting Language (XBRL) alongside the regular plain-text version. The requirement was slowly phased in over four years, starting with the largest companies and eventually covering all public companies. The XBRL format imposes a data structure on the financial statements and their notes and footnotes by assigning electronic tags to each item and defining how the items relate to one another.
Judging by potential impact, this is the most ambitious data transparency program ever undertaken by the U.S. government. The XBRL reporting requirement transformed all of the public financial statements in the world’s largest capital market from cumbersome text, which must be manually transcribed to allow quantitative analysis by investors and regulators, into an open, standardized, machine-readable format.
In theory, replacing unstructured text with structured data should, by now, have triggered revolutions and disruptions all over the financial industry. The SEC’s XBRL reporting requirement should, by now, have opened up corporate financial statements in the United States to Big Data platforms and applications.
- Investors and analysts serving them should, by now, have started using powerful new software tools to compare and analyze the newly-structured financial statements – and to mash financial figures together with other data sources. They should be making better decisions, evaluating a broader universe of companies, and democratizing the financial industry.
- Aggregators like Bloomberg and Google Finance should, by now, have started saving money and improving accuracy by ingesting corporate financial data directly from the SEC’s structured XBRL feed instead of manually entering the numbers into their own systems (or paying someone else to do that).
- The SEC should, by now, have incorporated structured corporate financial data into its own review processes, instead of relying on manual reviews of the financial statements in Forms 10-K and 10-Q.
- Other federal agencies should, by now, have started automatically checking the financial performance of companies as reported to the SEC before bestowing contracts or loan guarantees (among many other possible uses).
None of these things is happening on a large scale – yet. The Columbia report explains why. The Columbia report also hints at what the SEC and Congress can, and should, do about it.
- Investors are demanding structured data – not unstructured text – to track companies’ financial performance. The Columbia authors “have no doubt that [investors’] analysis of companies will continue to be based off increasing amounts of data that are structured and delivered to users in an interactive [structured] format” (p. i). “[T]here is clear demand for timely, structured, machine-readable data including information in financial reports, and … this need can be met via XBRL as long as the XBRL-tagged data can reduce the total processing costs of acquiring and proofing the data, and that the data are easily integrated (mapped) into current processes” (p. 20).
- Nonetheless, most investors are not making any use of the structured-data financial statements that public companies are now submitting to the SEC. Fewer than ten percent of the Columbia study’s non-scientific sample of investors said they were using XBRL data downloaded directly from the SEC or from XBRL US (p. 61). Instead, most investors were getting their corporate financial information from aggregators like Bloomberg and Google Finance – some free, some not. Moreover, aggregators told Columbia that they were not using XBRL data either. Aggregators were mostly still electronically scraping the old-fashioned plain-text financial statements (which are still being filed alongside the new structured-data financial statements) and manually verifying the numbers – or paying others to do that “labor-intensive” work for them. (pp 26-27.)
- Two problems explain why most investors have not begun to use structured-data financial statements. First, they don’t yet trust the data. “XBRL-tagged SEC data are generally perceived by investors as unreliable,” say the Columbia authors, both because of errors in numbers and categorization and because of companies’ use of unnecessary extensions, hindering comparability (p. 28). Columbia’s review of the quality of structured-data financial statements filed with the SEC (conducted two years ago) revealed that fully 73% of filings had data quality errors (p. 32). Moreover, investors reported “a large number of seemingly unnecessary company-specific tags” (p. 21). Investors surveyed by Columbia were “especially hesitant about using the data until they are comfortable that the XBRL data matches the [plain-text] data in SEC filings” (p. 21). Aggregators, too, were holding off until accuracy and comparability improved.
- Second, investors don’t yet have a wide range of software tools to compare and analyze structured-data financial statements. “End users are also looking for easy-to-use XBRL consumption and analysis tools that do not require programming or query language knowledge. In general, these users are not willing or able to incur the significant disruption to their workflow that they perceived would be required to incorporate XBRL data without state-of-the-art consumption and analytics tools.” (p. 24)
- If these two problems were fixed, investors could make enthusiastic and productive use of structured-data financial statements. “[T]he potential for interactive data to democratize financial information and transform transparency remains stronger than ever, and many participants, including most investors and analysts, wish that the data were useful today,” say the Columbia authors (p. 4). For instance, “virtually all investors” frequently use information that is available only in the footnotes of corporate financial statements to make their decisions – information that is now submitted and published in XBRL as part of companies’ structured-data filings (p. 48.) “With respect to the detailed-tagged footnote data, in particular, several investors and analysts have communicated to us that they view XBRL data as potentially an excellent solution to manually collecting the data they need” (p. 31).
- Even if most investors aren’t directly using structured-data financial statements, there will be indirect benefits to investors and the markets if the SEC starts using such data for its own reviews. The study reported that “the SEC has begun to review the data to identify filer-wide, as well as individual company filing and financial reporting issues. XBRL data could significantly enhance the efficiency of the Division of Corporate Finance’s review of filings and facilitate a “red-flag” ex-ante approach to regulatory oversight.” (p. 25) “Representatives from the FASB and the SEC have both stated on the record that, in their opinions, the amount of time that it takes them to conduct their respective analyses has been reduced significantly by their use of the XBRL-tagged data (p. 26).” Even imperfectly implemented, the XBRL mandate could indirectly benefit investors and the markets by improving the SEC’s review and enforcement processes.
The SEC’s XBRL reporting requirement could deliver transformative data transparency. But it has not. So far its impact has been incremental, not transformative.
To be sure, the problems identified by the Columbia study are problems of execution, not shortcomings of XBRL itself or of the concept of structured data. Investors and the analysts serving them “would like to have the U.S. regulatory filings tagged in a structured (e.g., XBRL) format that would meet their information requirements” (p. 5). For the SEC to eliminate the XBRL reporting requirement entirely – as some filers seem to hope that it will – would be a backward move and a tragic mistake.
How can the SEC fix these problems of reliability and analysis and deliver transformative transparency? The Columbia report suggests four answers:
- First, insist on accuracy and quality! The SEC does not require companies to amend their filings to correct tagging errors and unnecessary extensions. The Columbia report suggests strongly that it should. The Columbia authors fault “the reticence (or inability) of regulators and filers to ensure that the interactive filings data are accurate and correctly-tagged from day one of their release to the public and forward (or, to communicate to the market for this information that they were not insisting on this and why)” (p. 37). It is “critical” to reduce errors and extensions, either through “greater regulatory oversight and potentially requiring the audit of this data” or through third-party quality checks (pp. 42-43). The SEC’s own interests should motivate it to insist on accuracy once it becomes “serious about using the data in its Corporate Finance function and even for enforcement, as it should” (p. 43) (emphasis added). The need to improve quality might require the SEC and the Financial Accounting Standards Board to consider simplifying the underlying XBRL taxonomy (pp. i, 14, 43).
- Second, communicate that structured data is not a supplemental feature of a regulatory filing. Rather, it is the filing! The Columbia authors explain that “the reliability of the data has been compromised by the way filers have approached their XBRL filings … [perceiving] XBRL-tagging [as] an additional task in the financial reporting documentation process rather than as a part of the internal data systems” (p. 29). The SEC framed its XBRL reporting rule as a requirement to “create an XBRL-tagged reproduction of the paper or HTML presentations of their filings” (p. 37), rather than “making individual data points available for the end user to utilize or present as they required” (p. 39). Since filers think structured-data financial statements are “incremental to their existing [plain-text] filings, they do not perceive any user need” (p. 35) – and take few pains to ensure that investors using their structured data filings get an accurate picture of their finances. “We believe this presentation-centric step hindered or diverted what should have been an important evolution from a paper presentation-centric view of financial reporting information to a far more transparent and effective data-centric one” (p. 37). One way to correct this situation would be to move to a data format that is both human-readable and machine-readable, combining the plain text and structured-data tags in a single filing. Inline XBRL would do exactly that, and in fact the SEC is considering adopting this format (n. 48).
- Third, encourage the development of software tools that make structured-data financial statements come alive! This is something of a chicken-and-egg problem. More software tools will be created as investors demand them. But effective, lightweight, cheap XBRL analysis tools are already on offer – notably Calcbench.
- Fourth, expand the mandate! The Columbia report is clear that investors want more regulatory information tagged and structured, not less (p. 28):
i. The data that are required by the SEC to be XBRL-tagged are all relevant in varying degrees to some subset of the investor/analyst population, but more data are required than currently mandated—e.g., earnings release, MD&A, etc.
ii. If anything, users require more, not less, types of machine-readable data to be made available, because a significant amount of information they require are not from SEC filings or financial statements.
iii. The primary focus on data in the SEC filings of annual and quarterly financial statements seriously limits the perceived ongoing usefulness and relevance of the data.
Over and over, the report points out that the SEC’s current mandate for structured data is limited to the financial statements and accompanying notes (pp. 14, 18, 21, 24, 34-35, 42). Everything else that companies must file with the SEC under the U.S. securities laws is still submitted only in plain text. These other materials – earnings releases, corporate actions, executive compensation disclosures, proxy statements, officer and director lists, management discussions – could be valuable if tagged. But they are not. Investors “view access to the full array of footnote, management discussion and analysis (MD&A), and earnings release numerical data as the main reason to consider adapting their workflow to incorporate XBRL-tagged filings” (p. 21). But this demand is “pent-up” because such items are not – yet – included in the SEC’s mandate (p. 24).
If the SEC is unwilling to act, Congress could insist. Our Coalition will call for the reintroduction, this year, of the Financial Industry Transparency Act. That bipartisan proposal, first introduced in 2010 by Reps. Darrell Issa (R-CA), Edolphus Towns (D-NY), and Spencer Bachus (R-AL), would require these steps as a matter of law.