Near real-time SEC filings and earnings transcripts, point-in-time history, and more enterprise-ready public domain data from 60+ sources
Overview
Cybersyn’s Enterprise product includes premium public datasets, such as near real-time SEC filings and earnings transcripts. The Enterprise tier is a good fit for production use cases and includes technical support, external derivative usage, point-in-time history (where applicable), and backwards compatibility.
With the Enterprise offering, you will find all public data Cybersyn makes available on Snowflake Marketplace in a single product. All new public data released by Cybersyn in the future will be added.
Premium datasets:
Company Event Transcripts: including transcripts for earnings, annual investor days, etc.
US Securities and Exchange Commission (SEC):
Full SEC EDGAR history of 10K, 10Q, 8K, 13F, N-PORT, Form 3, 4, 5, 144, and additional filings (see complete list here) updated in near real-time
Parsed quarterly and annual revenue segments
Parsed 10-Ks, 10-Qs, 8-Ks, 20-Fs into document sections for easier input into LLMs
Other public domain example sources:
US Bureau of Labor Statistics (BLS): CPI, employment, labor force statistics
Bank for International Settlements (BIS): Foreign exchange rates
US Bureau of Economic Analysis (BEA): GDP, wages and salaries, personal income
US Federal Reserve: Interest rates, commercial paper issuance rates and volumes, outstanding monthly consumer credit
Federal Deposit Insurance Corporation (FDIC): Deposits, insurance, and complaints by financial institution
National Address Database / Overture Maps Foundation: US addresses and points of interest
Data Products Included
Documentation and data dictionaries for each underlying table and source included in the Enterprise product can be found here:
All Cybersyn products follow the EAV (entity, attributes, value) model with a unified schema. Entities are tangible objects (e.g. geography, company) that Cybersyn provides data on. All timeseries' dates and values that refer to the entity are included in a timeseries table. Descriptors of the timeseries are included in an attributes table. Data is joinable across all Cybersyn products that have a GEO_ID. Refer to Cybersyn Concepts for more details.
Find the performance of a company's revenue segments over time
View Target's quarterly category mix over time.
WITH target_category_segments AS (SELECT company_name, business_segment,date,valueAS segment_valueFROM cybersyn.sec_metrics_timeseriesWHERE cik ='0000027419'AND measure ='REVENUE'AND frequency ='QUARTERLY' AND business_segment IN ('HARDLINES','HOME FURNISHINGS AND DECOR','APPAREL AND ACCESSORIES','BEAUTY AND HOUSEHOLD ESSENTIALS','FOOD AND BEVERAGE')
), quarterly_performance AS (SELECT company_name,date,SUM(segment_value) AS quarterly_total_valueFROM target_category_segmentsGROUP BY1,2)SELECT*, segment_value / quarterly_total_value AS category_mixFROM target_category_segmentsJOIN quarterly_performanceUSING (company_name, date)ORDER BYdate, business_segment;
Find the performance of Darden Restaurants' brands over time.
SELECT*FROM cybersyn.sec_metrics_timeseriesWHERE cik ='0000940944'AND frequency ='QUARTERLY'AND business_segment IN ('OLIVE GARDEN','LONGHORN STEAKHOUSE','FINE DINING','ALL OTHER');
View values as of a specific date and time
Use Cybersyn’s point-in-time history table to look back at what FRED data was published as of October 31, 2023.
Pull all historical revisions to a particular data point
Check the history of reported nonfarm employees in the United States for a given date. Pull historical estimates that may have been revised in subsequent monthly releases.
SELECT*FROM cybersyn.financial_fred_timeseries_historyWHERE variable_name ='All Employees, Total Nonfarm, Seasonally adjusted, Monthly, Persons'ANDdate='2023-09-30'ORDER BY _effective_start_timestamp;
Get timestamps for all updates to Cybersyn’s data
Select all distinct times any new data was inserted or any existing data was modified in the FX rate table. This will create a log of all timestamps where Cybersyn ran a job that made any changes to the underlying data in the table.
SELECT DISTINCT _effective_start_timestampFROM cybersyn.fx_rates_timeseries_historyORDER BY _effective_start_timestamp;
FDIC deposit exposure
Determine which banks have the highest percentage of uninsured deposits.
WITH big_banks AS (SELECT id_rssdFROM cybersyn.financial_institution_timeseriesWHERE variable ='ASSET'ANDdate='2022-12-31'ANDvalue> 1E10)SELECTname,1-valueAS pct_uninsured, ent.is_activeFROM cybersyn.financial_institution_timeseries AS tsINNER JOIN cybersyn.financial_institution_attributes AS att ON (ts.variable = att.variable)INNER JOIN cybersyn.financial_institution_entities AS ent ON (ts.id_rssd = ent.id_rssd)INNER JOIN big_banks ON (big_banks.id_rssd = ts.id_rssd)WHERE ts.date ='2022-12-31'AND att.variable_name ='% Insured (Estimated)'AND att.frequency ='Quarterly'ORDER BY pct_uninsured DESC;
Track house prices and gross income inflow in a specific city
Compare Phoenix house prices vs gross income inflow.
WITH county_map AS (SELECT geo_id, geo_name, related_geo_id, related_geo_nameFROM cybersyn.geography_relationshipsWHERE geo_name ='Phoenix-Mesa-Scottsdale, AZ Metro Area'AND related_level ='County'), gross_income_data AS (SELECT geo_id, geo_name AS msa,date,SUM(value) AS gross_income_inflowFROM cybersyn.irs_origin_destination_migration_timeseries AS tsJOIN county_map ON (county_map.related_geo_id = ts.to_geo_id)WHERE ts.variable_name ='Adjusted Gross Income'GROUP BY geo_id, msa, date), home_price_data AS (SELECT LAST_DAY(date, 'year') AS end_date, AVG(value) AS home_price_indexFROM cybersyn.fhfa_house_price_timeseries AS tsJOIN cybersyn.fhfa_house_price_attributes AS attON (ts.variable = att.variable)WHERE geo_id IN (SELECT geo_id FROM county_map)AND att.index_type ='purchase-only'AND att.seasonally_adjusted = TRUEGROUP BY end_date)SELECT msa, gid.date, gross_income_inflow, home_price_indexFROM gross_income_data AS gidJOIN home_price_data AS hpi ON (gid.date = hpi.end_date)ORDER BYdate;
Releases & Changelog
See "Release & Changelog" section on the respective product pages listed under "Data Products Included" above. The Enterprise product is updated with all public domain releases. Any releases that are specific only to the Enterprise product are below:
6/26/24: Enterprise - Added company event transcripts, incl. earnings transcripts.
Cybersyn now provides transcripts of hosted company events (e.g., earnings calls, updates/briefings, Annual General Meetings, and more) in JSON-format. The events cover thousands of public companies globally. Transcripts are usually made available within 3 hours of the event conclusion.
Data is available in the COMPANY_EVENT_TRANSCRIPT_ATTRIBUTES table.
6/25/24: Enterprise - Added parsed 20-F items into HTML, text, and JSON format.
Expanded the SEC_CORPORATE_REPORT_ITEM_ATTRIBUTES table to include parsed 20-Fs. 20-Fs are annual reports submitted by non-US public companies to the SEC, similar to a 10-K filing by US companies. Parsing 20-Fs into items optimizes SEC filings for LLM use cases.
10-K/Q sections are now parsed into plain text, HTML and JSON structure supporting LLM use cases. The data is available in the below table:
sec_corporate_report_item_attributes
Also updated SEC XBRL methodology to improve data accuracy.
Form 4 and Form 144 data is available in the following tables:
sec_form144_securities_info_index
sec_form144_securities_sold_index
sec_form144_securities_to_be_sold_index
sec_form4_reporting_owners_index
sec_form4_securities_index\
4/3/24: Enterprise - Added parsed segment revenues from the SEC
Parsed/cleaned quarterly & annual segment revenues from 10-Ks and 10-Qs are now available in SEC_METRICS_TIMESERIES. This table translates the raw XBRL data from the SEC into a clean version of SEC revenue segments over time.
Disclaimers
The data in this dataset is sourced on the individual source pages linked here. Links to provider terms and disclaimers are included where appropriate on the individual source pages.
Cybersyn is not endorsed or affiliated with any of these providers. Contact support@cybersyn.com for questions.