Comment on page
SEC Filings
Financial statements, press releases and fiscal calendars for US public companies; fund manager investment holdings
This product includes a subset of the US Securities and Exchange Commission (SEC) filings by corporations, funds, and individuals data from EDGAR (SEC's Electronic Data Gathering, Analysis, and Retrieval system). The data includes both parsed and raw text from submitted filings as well as raw text from attached exhibits.
Example topics covered:
- Fiscal calendars
- Press releases
- Earnings
- Annual and quarterly reports
- Major company events
- Quarterly fund holding reports
Geographic Coverage | United States |
Entity Level | Company, Report |
Time Granularity | Quarterly |
Update Frequency | |
History | Since January 1, 2019 |
All Cybersyn products follow the EAV (entity, attributes, value) model with a unified schema. Entities are tangible objects (e.g. geography, company). Entities may have characteristics (i.e. descriptors of the entity) in an index table and values (i.e. statistics, measure) in a timeseries table. Refer to Cybersyn Concepts for more details.
SEC Filings data revolves around a Company entity (identified by a CIK identifier) and a Report entity (identified by an ADSH identifier).
The Company entity represents organizations submitting 8-K, 10-K, and 10-Q reports to the SEC. Cybersyn is building out a reference spine, referred to as our Company Index, to support joining companies, subsidiaries, and their brands together in a hierarchy. The
COMPANY_INDEX
table, known as a security master in finance, aggregates commonly used company identifiers (i.e. CIKs, EINs, and LEIs) into a single company_id
. The COMPANY_SECURITY_RELATIONSHIPS
table maps OpenFIGI and PermID securities (i.e. securities with multiple "levels" such as OpenFIGI FIGI ID and OpenFIGI Share Class ID) to the Company. TheCOMPANY_CHARACTERISTICS
table includes categorical characteristics of a Company (e.g. industry, address, previous names). A characteristic may be temporal with start and end dates indicating the range for which the data is valid. The SEC_CIK_INDEX
table contains the distinct Company entities associated with SEC filings.OpenFIGI IDs are unique identifiers for active and inactive securities across all global assets including equities, bonds, futures, and options. The Financial Instrument Global Identifier (i.e. FIGI) is an established global standard and used to facilitate easy mapping across data sources. Learn more and view OpenFIGI documentation here. The
TOP_LEVEL_OPENFIGI_ID
is the OpenFIGI ID of the highest, most widely encompassing level for a particular security. The hierarchy follows the order "share class" (global) > "composite" (country-level) > "FIGI" (exchange-level). A list of over 2M securities and their associated TOP_LEVEL_OPENFIGI_ID
and respective TOP_LEVEL_OPENFIGI_ID_TYPE
are provided in the OPENFIGI_SECURITY_INDEX
table.PermIDs are unique and persistent identifiers for active and inactive securities across global asset classes. Refinitiv manages PermIDs as a way to standardize mapping across various data sources. PermIDs cover organizations, instruments, and quotes. Cybersyn uses the instrument-level PermIDs (PERMID_SECURITY_ID) to map securities and organization-level PermIDs (PERMID_COMPANY_ID) to map companies. A list of over 15K securities are provided in the PERMID_SECURITY_INDEX table; over 65K company PermIDs are available in the COMPANY_INDEX table.
The Report entity represents individual filings with the SEC. Company reports can be 10-Ks (annual), 10-Qs (quarterly), or 8-Ks (major events and press releases). These filings are self-reported details on a company’s business, the risks it faces, and the operating and financial results for the fiscal year or quarter, as applicable. They are found in the
SEC_REPORT_INDEX
and can be joined to the SEC_CIK_INDEX
via the CIK identifier. Attributes (the parsed XBRL line items from financial statements) associated with each report can be found in the SEC_REPORT_ATTRIBUTES
table. A report’s full text can be found in the SEC_REPORT_TEXT_ATTRIBUTES
table, again, is joinable to the SEC_REPORT_INDEX
via CIK and ADSH identifiers. Each report in the SEC_REPORT_TEXT_ATTRIBUTES
table also includes an SEC_DOCUEMNT_ID
, which is a Cybersyn-created ID to uniquely identify report exhibits. For fund holding reports (13Fs), each entity is an individual filing with the SEC in the
SEC_HOLDING_FILING_INDEX
. Attributes associated with each report can be found in the SEC_HOLDING_FILING_ATTRIBUTES
table. Attributes include the securities’ names, market value, number of shares, and OpenFIGI IDs.Each company’s reporting calendars can be found in the
SEC_FISCAL_CALENDARS
table. A fiscal calendar is a financial calendar year mapped to the company's quarterly and annual reporting period.Additional report types, including those on private companies, are planned. Find bugs or want to request specific report types from EDGAR? Email us at [email protected].
As with all Public Domain datasets, Cybersyn aims to release data on Snowflake Marketplace as soon as the underlying source releases new data. We check periodically for changes to the underlying source and, upon detecting a change, propagate the data to Snowflake Marketplace immediately. See our release process for more details.
Tables Names | Source | Source Schedule |
---|---|---|
SEC_CIK_INDEX
SEC_FISCAL_CALENDARS
SEC_REPORT_ATTRIBUTES
SEC_REPORT_INDEX
SEC_REPORT_TEXT_ATTRIBUTES
SEC_FISCAL_CALENDARS
SEC_HOLDING_FILING_ATTRIBUTES
SEC_HOLDING_FILING_INDEX | Monthly - within the first week | |
OPENFIGI_SECURITY_INDEX | Ad hoc | |
COMPANY_INDEX
COMPANY_CHARACTERISTICS
COMPANY_SECURITY_RELATIONSHIPS
PERMID_SECURITY_INDEX | Ad hoc | |
GEOGRAPHY_INDEX | Data Commons is an aggregator of government data sources. Release calendars vary by underlying source. The US Census Bureau publishes datasets about the US people and it’s economy, release schedules vary by dataset. |
In the
sec_report_attributes
table, the covered_qtrs
denotes how many periods the value covers. For example, a year-to-date measure for revenue from Q1-Q3 would be 3. A point-in-time measure that is “as of” a specific date has a 0 for covered_qtrs
.In the
sec_report_text_attributes
table, the raw text is pulled from the originally-filed reports. The raw text is stripped of HTML formatting and is presented as a single block of text. Note that this text includes both text-based sections of the reports (such as Management Discussions & Analysis) as well as non-formatted text of values from financial statements.Cybersyn mapped portions of companies’ addresses (city, state, country) to a unique
geo_id
that corresponds to that location. This is a feature of all Cybersyn datasets to allow for easy comparisons across datasets that use geographic identifiers.Search for commonly-used corporate identifiers such as CIK and EIN as well as stock-specific identifiers like ticker symbol and OpenFIGI ID
SELECT
idx.company_id,
idx.company_name,
idx.cik,
idx.ein,
idx.lei,
openfigi.openfigi_share_class_id,
openfigi.primary_ticker,
idx.permid_company_id
FROM cybersyn.company_index AS idx
JOIN cybersyn.company_characteristics AS char
ON (idx.company_id = char.company_id)
JOIN cybersyn.company_security_relationships AS rship
ON (idx.company_id = rship.company_id)
JOIN cybersyn.openfigi_security_index AS openfigi
ON ARRAY_CONTAINS(rship.security_id::VARIANT, openfigi.openfigi_share_class_id)
WHERE char.relationship_type = 'sic_description'
AND char.value = 'Air transportation, scheduled';
Compare financial metrics across companies
Compare revenue numbers from for 2022 for different airline companies.
SELECT i.cik, i.company_name, r.period_start_date, r.period_end_date, r.measure_description, TO_NUMERIC(r.value) AS value
FROM cybersyn.sec_cik_index AS i
JOIN cybersyn.sec_report_attributes AS r ON (r.cik = i.cik)
WHERE i.sic_code_description = 'AIR TRANSPORTATION, SCHEDULED'
AND r.statement = 'Income Statement'
AND r.period_end_date = '2022-12-31'
AND r.covered_qtrs = 4
AND r.metadata IS NULL
AND r.measure_description IN ('Total operating revenues', 'Total operating revenue');
View historical trends for companies
Measure Chipotle’s store count growth over time
SELECT i.cik, i.company_name, r.period_end_date, r.measure_description, MAX(TO_NUMBER(r.value)) AS value
FROM cybersyn.sec_cik_index AS i
JOIN cybersyn.sec_report_attributes AS r ON (r.cik = i.cik)
WHERE company_name = 'CHIPOTLE MEXICAN GRILL INC'
AND r.measure_description = 'Number of restaurants'
GROUP BY i.cik, i.company_name, i.cik, r.period_end_date, r.measure_description;
Pull fiscal calendars for public companies
Pull Walmart’s fiscal calendar does not align with the calendar year. Pull their quarter start and end dates
SELECT company_name, fiscal_year, fiscal_period, period_start_date, period_end_date
FROM cybersyn.sec_fiscal_calendars
WHERE company_name = 'WALMART INC.'
ORDER BY period_end_date;
Track portfolios of fund managers
See Berkshire Hathaway’s most recent public holdings disclosure and join associated tickers under which each holding may be traded globally on all exchanges
WITH latest_filing AS (
SELECT adsh
FROM cybersyn.sec_holding_filing_index
WHERE filing_manager_name = 'Berkshire Hathaway Inc'
ORDER BY filing_date DESC
LIMIT 1
)
SELECT att.*,
securities.global_tickers,
securities.permid_quote_id
FROM cybersyn.sec_holding_filing_attributes AS att
LEFT JOIN cybersyn.permid_security_index AS securities
ON att.permid_security_id = securities.permid_security_id
WHERE att.adsh IN (SELECT * FROM latest_filing)
ORDER BY att.market_value DESC;
- Added one new table to our dataset,
PERMID_SECURITY_INDEX
, which includes security identifiers from Refinitiv’s PermID database. These are persistent identifiers for active and inactive securities across global asset classes. The table includes over 15K PermIDs for various securities. - This data can be merged to the 13F filings data (
SEC_HOLDING_FILING_ATTRIBUTES
) and can be mapped back to companies using theCOMPANY_SECURITY_RELATIONSHIPS
table.
9/17/23 - Added
COMPANY_INDEX
, COMPANY_CHARACTERISTICS
, and COMPANY_SECURITY_RELATIONSHIPS
tables; added PermIdsAdded three new tables:
- The
COMPANY_INDEX
table aggregates commonly used company identifiers (i.e. CIKs, EINs, and LEIs) into a single a singlecompany_id
, which can be used across Cybersyn’s datasets as a unique identifier for corporate entities. - The
COMPANY_SECURITY_RELATIONSHIPS
table maps OpenFIGI and PermID securities (i.e. securities with multiple "levels" such as OpenFIGI FIGI ID and OpenFIGI Share Class ID) to the Company. - The
COMPANY_CHARACTERISTICS
table includes categorical characteristics of a Company (e.g. industry, address, previous names). A characteristic may be temporal with start and end dates indicating the range for which the data is valid.
9/7/23 - Added 13F filings to include data on quarterly investment fund managers’ holdings; added a securities index table based on OpenFIGI data
Expanded our dataset to include individual filings from 13F fund holding reports, which disclose the equity holdings of institutional investment managers. Added three new tables,
SEC_HOLDING_FILING_INDEX
, SEC_HOLDING_FILING_ATTRIBUTES
and OPENFIGI_SECURITY_INDEX
.SEC_HOLDING_FILING_INDEX
table contains metadata from individual 13F filings including filing date and filing organization. SEC_HOLDING_FILING_ATTRIBUTES
table includes securities' names, market value, number of shares held, and OpenFIGI IDs, which facilitate easier mapping and analysis to outside data sources. Table OPENFIGI_SECURITY_INDEX
contains an index of over 2M securities listed on OpenFIGI and can be joined with table SEC_HOLDING_FILING_ATTRIBUTES
using TOP_LEVEL_OPENFIGI_ID
- the unique identifier for each security in the two tables.Expanded our coverage of SEC documents to include the full text of 8-K filings and associated exhibits. 8-K filings include company press releases, earnings releases, and other major corporate events.
Added the full text of exhibits for 10-K and 10-Q filings. Exhibit types include lists of subsidiaries, merger agreements, and material changes in financial conditions. Exhibits are denoted in the
variable
and variable_name
columns (e.g. 10-K EX-21 Filing Text
).Added the
sec_document_id
column. This field is a combination of the ADSH (accession number) and the document type (e.g. 10-K). This serves as a unique identifier for each individual component that makes up a filing in cases when one or more exhibits are included in a filing.The data in this dataset is sourced here. Links to provider terms and disclaimers are provided where appropriate.
Cybersyn is not endorsed by or affiliated with any of these providers. Contact [email protected] for questions.
Last modified 1mo ago