Links
Comment on page

SEC Filings

Financial statements, press releases and fiscal calendars for US public companies; fund manager investment holdings

Overview

This product includes a subset of the US Securities and Exchange Commission (SEC) filings by corporations, funds, and individuals data from EDGAR (SEC's Electronic Data Gathering, Analysis, and Retrieval system). The data includes both parsed and raw text from submitted filings as well as raw text from attached exhibits.
Example topics covered:
  • Fiscal calendars
  • Press releases
  • Earnings
  • Annual and quarterly reports
  • Major company events
  • Quarterly fund holding reports

Key attributes

Geographic Coverage
United States
Entity Level
Company, Report
Time Granularity
Quarterly
Update Frequency
Depending on source; see table below
History
Since January 1, 2019

Description

All Cybersyn products follow the EAV (entity, attributes, value) model with a unified schema. Entities are tangible objects (e.g. geography, company). Entities may have characteristics (i.e. descriptors of the entity) in an index table and values (i.e. statistics, measure) in a timeseries table. Refer to Cybersyn Concepts for more details.
SEC Filings data revolves around a Company entity (identified by a CIK identifier) and a Report entity (identified by an ADSH identifier).
The Company entity represents organizations submitting 8-K, 10-K, and 10-Q reports to the SEC. Cybersyn is building out a reference spine, referred to as our Company Index, to support joining companies, subsidiaries, and their brands together in a hierarchy. The COMPANY_INDEX table, known as a security master in finance, aggregates commonly used company identifiers (i.e. CIKs, EINs, and LEIs) into a single company_id. The COMPANY_SECURITY_RELATIONSHIPS table maps OpenFIGI and PermID securities (i.e. securities with multiple "levels" such as OpenFIGI FIGI ID and OpenFIGI Share Class ID) to the Company. TheCOMPANY_CHARACTERISTICS table includes categorical characteristics of a Company (e.g. industry, address, previous names). A characteristic may be temporal with start and end dates indicating the range for which the data is valid. The SEC_CIK_INDEX table contains the distinct Company entities associated with SEC filings.
OpenFIGI IDs are unique identifiers for active and inactive securities across all global assets including equities, bonds, futures, and options. The Financial Instrument Global Identifier (i.e. FIGI) is an established global standard and used to facilitate easy mapping across data sources. Learn more and view OpenFIGI documentation here. The TOP_LEVEL_OPENFIGI_ID is the OpenFIGI ID of the highest, most widely encompassing level for a particular security. The hierarchy follows the order "share class" (global) > "composite" (country-level) > "FIGI" (exchange-level). A list of over 2M securities and their associated TOP_LEVEL_OPENFIGI_ID and respective TOP_LEVEL_OPENFIGI_ID_TYPE are provided in the OPENFIGI_SECURITY_INDEX table.
PermIDs are unique and persistent identifiers for active and inactive securities across global asset classes. Refinitiv manages PermIDs as a way to standardize mapping across various data sources. PermIDs cover organizations, instruments, and quotes. Cybersyn uses the instrument-level PermIDs (PERMID_SECURITY_ID) to map securities and organization-level PermIDs (PERMID_COMPANY_ID) to map companies. A list of over 15K securities are provided in the PERMID_SECURITY_INDEX table; over 65K company PermIDs are available in the COMPANY_INDEX table.
The Report entity represents individual filings with the SEC. Company reports can be 10-Ks (annual), 10-Qs (quarterly), or 8-Ks (major events and press releases). These filings are self-reported details on a company’s business, the risks it faces, and the operating and financial results for the fiscal year or quarter, as applicable. They are found in the SEC_REPORT_INDEX and can be joined to the SEC_CIK_INDEX via the CIK identifier. Attributes (the parsed XBRL line items from financial statements) associated with each report can be found in the SEC_REPORT_ATTRIBUTES table. A report’s full text can be found in the SEC_REPORT_TEXT_ATTRIBUTES table, again, is joinable to the SEC_REPORT_INDEX via CIK and ADSH identifiers. Each report in the SEC_REPORT_TEXT_ATTRIBUTES table also includes an SEC_DOCUEMNT_ID, which is a Cybersyn-created ID to uniquely identify report exhibits.
For fund holding reports (13Fs), each entity is an individual filing with the SEC in the SEC_HOLDING_FILING_INDEX. Attributes associated with each report can be found in the SEC_HOLDING_FILING_ATTRIBUTES table. Attributes include the securities’ names, market value, number of shares, and OpenFIGI IDs.
Each company’s reporting calendars can be found in the SEC_FISCAL_CALENDARS table. A fiscal calendar is a financial calendar year mapped to the company's quarterly and annual reporting period.
Additional report types, including those on private companies, are planned. Find bugs or want to request specific report types from EDGAR? Email us at [email protected].

Data Dictionary

Data Sources & Release Frequency

As with all Public Domain datasets, Cybersyn aims to release data on Snowflake Marketplace as soon as the underlying source releases new data. We check periodically for changes to the underlying source and, upon detecting a change, propagate the data to Snowflake Marketplace immediately. See our release process for more details.
Tables Names
Source
Source Schedule
SEC_CIK_INDEX SEC_FISCAL_CALENDARS SEC_REPORT_ATTRIBUTES SEC_REPORT_INDEX SEC_REPORT_TEXT_ATTRIBUTES SEC_FISCAL_CALENDARS SEC_HOLDING_FILING_ATTRIBUTES SEC_HOLDING_FILING_INDEX
SEC
Monthly - within the first week
OPENFIGI_SECURITY_INDEX
OpenFIGI
Ad hoc
COMPANY_INDEX COMPANY_CHARACTERISTICS COMPANY_SECURITY_RELATIONSHIPS PERMID_SECURITY_INDEX
OpenFIGI
PermID
Ad hoc
GEOGRAPHY_INDEX
Data Commons is an aggregator of government data sources. Release calendars vary by underlying source.
The US Census Bureau publishes datasets about the US people and it’s economy, release schedules vary by dataset.

Notes & Methodology

Covered quarters

In the sec_report_attributes table, the covered_qtrs denotes how many periods the value covers. For example, a year-to-date measure for revenue from Q1-Q3 would be 3. A point-in-time measure that is “as of” a specific date has a 0 for covered_qtrs.

Text formatting

In the sec_report_text_attributes table, the raw text is pulled from the originally-filed reports. The raw text is stripped of HTML formatting and is presented as a single block of text. Note that this text includes both text-based sections of the reports (such as Management Discussions & Analysis) as well as non-formatted text of values from financial statements.

Geography mapping

Cybersyn mapped portions of companies’ addresses (city, state, country) to a unique geo_id that corresponds to that location. This is a feature of all Cybersyn datasets to allow for easy comparisons across datasets that use geographic identifiers.

Examples & Sample Queries

Aggregate company identifiers and metadata about companies

Search for commonly-used corporate identifiers such as CIK and EIN as well as stock-specific identifiers like ticker symbol and OpenFIGI ID
SELECT
idx.company_id,
idx.company_name,
idx.cik,
idx.ein,
idx.lei,
openfigi.openfigi_share_class_id,
openfigi.primary_ticker,
idx.permid_company_id
FROM cybersyn.company_index AS idx
JOIN cybersyn.company_characteristics AS char
ON (idx.company_id = char.company_id)
JOIN cybersyn.company_security_relationships AS rship
ON (idx.company_id = rship.company_id)
JOIN cybersyn.openfigi_security_index AS openfigi
ON ARRAY_CONTAINS(rship.security_id::VARIANT, openfigi.openfigi_share_class_id)
WHERE char.relationship_type = 'sic_description'
AND char.value = 'Air transportation, scheduled';
Compare financial metrics across companies
Compare revenue numbers from for 2022 for different airline companies.
SELECT i.cik, i.company_name, r.period_start_date, r.period_end_date, r.measure_description, TO_NUMERIC(r.value) AS value
FROM cybersyn.sec_cik_index AS i
JOIN cybersyn.sec_report_attributes AS r ON (r.cik = i.cik)
WHERE i.sic_code_description = 'AIR TRANSPORTATION, SCHEDULED'
AND r.statement = 'Income Statement'
AND r.period_end_date = '2022-12-31'
AND r.covered_qtrs = 4
AND r.metadata IS NULL
AND r.measure_description IN ('Total operating revenues', 'Total operating revenue');
View historical trends for companies
Measure Chipotle’s store count growth over time
SELECT i.cik, i.company_name, r.period_end_date, r.measure_description, MAX(TO_NUMBER(r.value)) AS value
FROM cybersyn.sec_cik_index AS i
JOIN cybersyn.sec_report_attributes AS r ON (r.cik = i.cik)
WHERE company_name = 'CHIPOTLE MEXICAN GRILL INC'
AND r.measure_description = 'Number of restaurants'
GROUP BY i.cik, i.company_name, i.cik, r.period_end_date, r.measure_description;
Pull fiscal calendars for public companies
Pull Walmart’s fiscal calendar does not align with the calendar year. Pull their quarter start and end dates
SELECT company_name, fiscal_year, fiscal_period, period_start_date, period_end_date
FROM cybersyn.sec_fiscal_calendars
WHERE company_name = 'WALMART INC.'
ORDER BY period_end_date;
Track portfolios of fund managers
See Berkshire Hathaway’s most recent public holdings disclosure and join associated tickers under which each holding may be traded globally on all exchanges
WITH latest_filing AS (
SELECT adsh
FROM cybersyn.sec_holding_filing_index
WHERE filing_manager_name = 'Berkshire Hathaway Inc'
ORDER BY filing_date DESC
LIMIT 1
)
SELECT att.*,
securities.global_tickers,
securities.permid_quote_id
FROM cybersyn.sec_holding_filing_attributes AS att
LEFT JOIN cybersyn.permid_security_index AS securities
ON att.permid_security_id = securities.permid_security_id
WHERE att.adsh IN (SELECT * FROM latest_filing)
ORDER BY att.market_value DESC;

Releases & Changelog

10/8/23 - Added PERMID_SECURITY_INDEX table to expand security coverage
  • Added one new table to our dataset, PERMID_SECURITY_INDEX, which includes security identifiers from Refinitiv’s PermID database. These are persistent identifiers for active and inactive securities across global asset classes. The table includes over 15K PermIDs for various securities.
  • This data can be merged to the 13F filings data (SEC_HOLDING_FILING_ATTRIBUTES) and can be mapped back to companies using the COMPANY_SECURITY_RELATIONSHIPS table.
9/17/23 - Added COMPANY_INDEX, COMPANY_CHARACTERISTICS, and COMPANY_SECURITY_RELATIONSHIPS tables; added PermIds
Added three new tables:
  • The COMPANY_INDEX table aggregates commonly used company identifiers (i.e. CIKs, EINs, and LEIs) into a single a single company_id, which can be used across Cybersyn’s datasets as a unique identifier for corporate entities.
  • The COMPANY_SECURITY_RELATIONSHIPS table maps OpenFIGI and PermID securities (i.e. securities with multiple "levels" such as OpenFIGI FIGI ID and OpenFIGI Share Class ID) to the Company.
  • TheCOMPANY_CHARACTERISTICS table includes categorical characteristics of a Company (e.g. industry, address, previous names). A characteristic may be temporal with start and end dates indicating the range for which the data is valid.
Added PermId securities published by Refinitiv.
9/7/23 - Added 13F filings to include data on quarterly investment fund managers’ holdings; added a securities index table based on OpenFIGI data
Expanded our dataset to include individual filings from 13F fund holding reports, which disclose the equity holdings of institutional investment managers. Added three new tables, SEC_HOLDING_FILING_INDEX, SEC_HOLDING_FILING_ATTRIBUTES and OPENFIGI_SECURITY_INDEX.
SEC_HOLDING_FILING_INDEX table contains metadata from individual 13F filings including filing date and filing organization. SEC_HOLDING_FILING_ATTRIBUTES table includes securities' names, market value, number of shares held, and OpenFIGI IDs, which facilitate easier mapping and analysis to outside data sources. Table OPENFIGI_SECURITY_INDEX contains an index of over 2M securities listed on OpenFIGI and can be joined with table SEC_HOLDING_FILING_ATTRIBUTES using TOP_LEVEL_OPENFIGI_ID - the unique identifier for each security in the two tables.
8/13/23 - Added 8-K filings and exhibits for 10-Qs and 10-Ks
Expanded our coverage of SEC documents to include the full text of 8-K filings and associated exhibits. 8-K filings include company press releases, earnings releases, and other major corporate events.
Added the full text of exhibits for 10-K and 10-Q filings. Exhibit types include lists of subsidiaries, merger agreements, and material changes in financial conditions. Exhibits are denoted in the variable and variable_name columns (e.g. 10-K EX-21 Filing Text).
Added the sec_document_id column. This field is a combination of the ADSH (accession number) and the document type (e.g. 10-K). This serves as a unique identifier for each individual component that makes up a filing in cases when one or more exhibits are included in a filing.
6/14/23 Added full text 10-Qs and 10-Ks
Added the full text of 10-K/Q filings. These are contained in the sec_report_text_attributes table.

Disclaimers

The data in this dataset is sourced here. Links to provider terms and disclaimers are provided where appropriate.
Perm ID: Terms
Cybersyn is not endorsed by or affiliated with any of these providers. Contact [email protected] for questions.
Last modified 1mo ago