Changelog
November 2024
11/7/24: All Listings - Deprecated Datasets.
The following datasets were deprecated:
- CENTRAL_BANK_REPORT_INDEX
- DOMAIN_CHARACTERISTICS
- DOMAIN_INDEX
- DOMAIN_INDEX_HISTORY
- FINANCIAL_FRED_ATTRIBUTES
- FINANCIAL_FRED_TIMESERIES
- FINANCIAL_FRED_VARIABLE_SERIES_ID_CROSSWALK
- FOOD_AGRICULTURE_ORGANIZATION_ATTRIBUTES
- FOOD_AGRICULTURE_ORGANIZATION_ATTRIBUTES_HISTORY
- FOOD_AGRICULTURE_ORGANIZATION_TIMESERIES
- FOOD_AGRICULTURE_ORGANIZATION_TIMESERIES_HISTORY
- NEWYORKFED_CONSUMER_ATTRIBUTES
- NEWYORKFED_CONSUMER_ATTRIBUTES_HISTORY
- NEWYORKFED_CONSUMER_TIMESERIES
- NEWYORKFED_CONSUMER_TIMESERIES_HISTORY
September 2024
9/26/24: All Listings - Added the Cybersyn Data Dictionary table to all listings.
cybersyn_data_dictionary
is a data dictionary for all Cybersyn Foundations data tables, including column definitions, data types, and more.
9/22/24: Finance & Economics - Added a NAICS Code reference table.
naics_code_index
is an index table of NAICS codes with industry definitions and classification year. NAICS is the North American Industry Classification System, which is a standard used by Federal statistical agencies in classifying entity establishments for the purpose of collecting, analyzing, and publishing data related to the U.S. business economy.
9/17/24: All Listings - Added the Cybersyn Variable Catalog table to all listings.
Available in cybersyn_variable_catalog
and online here, this table is a catalog of all Cybersyn variables, including metadata about the variables (e.g., unit, frequency, measure, applicable geographies and entities, and more).
The purpose of this table is to allow users to easily search across all Cybersyn tables to identify the variables needed for analysis.
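As a rough illustration of the kind of search this catalog is meant to support, the sketch below filters a small in-memory stand-in for cybersyn_variable_catalog by keyword and frequency. Only the table name comes from this changelog. The column names (variable, variable_name, table_name, frequency, unit) and the sample rows are hypothetical placeholders, not the actual schema or contents. The SQL is written for SQLite so the example runs locally.

```python
import sqlite3

# Hypothetical stand-in for cybersyn_variable_catalog; the real schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cybersyn_variable_catalog (
        variable TEXT,
        variable_name TEXT,
        table_name TEXT,
        frequency TEXT,
        unit TEXT
    )
    """
)
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO cybersyn_variable_catalog VALUES (?, ?, ?, ?, ?)",
    [
        ("var_a", "Initial unemployment claims", "example_labor_timeseries", "Weekly", "Count"),
        ("var_b", "Continued unemployment claims", "example_labor_timeseries", "Weekly", "Count"),
        ("var_c", "Retail sales index", "example_uk_timeseries", "Monthly", "Index"),
    ],
)


def find_variables(keyword, frequency=None):
    """Return (variable, variable_name, table_name) rows whose name contains keyword."""
    query = (
        "SELECT variable, variable_name, table_name "
        "FROM cybersyn_variable_catalog "
        "WHERE LOWER(variable_name) LIKE ?"
    )
    params = [f"%{keyword.lower()}%"]
    if frequency is not None:
        query += " AND frequency = ?"
        params.append(frequency)
    query += " ORDER BY variable"
    return conn.execute(query, params).fetchall()


matches = find_variables("unemployment", frequency="Weekly")
for row in matches:
    print(row)
```

Each returned row names the table that holds the variable, which is the lookup step the catalog is intended to shorten.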
9/13/24: Cybersyn Foundations - Added point-in-time history tables for specific SEC tables.
Added point-in-time history tables for the following tables:
- sec_broker_dealer_index_history
- sec_cik_index_history
- sec_investment_advisers_index_history
- sec_investment_company_index_history
- sec_swap_dealer_index_history
August 2024
8/23/24: Cybersyn Foundations - Added index of SEC-registered security-based swap dealers and participants.
Available in sec_swap_dealer_index
, this is a list of security-based swap dealers (SBSDs) and major security-based swap participants (MSBSPs) who have filed applications for registration as SBSDs or MSBSPs to the Division of Trading and Markets of the SEC and have not withdrawn their registration or de-registered (Form SBSE).
A security-based swap is a financial contract between two parties that involves the exchange of cash flows or value based on the performance of an underlying security, like a stock or bond. A Security-Based Swap Dealer is an entity that engages in the regular business of buying, selling, or dealing in security-based swaps for its own account or on behalf of clients. A Major Security-Based Swap Participant is an entity that maintains a substantial position in security-based swaps, holds counterparty exposure that could significantly impact the US financial markets, or is highly leveraged relative to its capital.
8/23/24: Cybersyn Foundations - Added an index of investment advisers who are registered with the SEC or who are filing reports as Exempt Reporting Advisers.
sec_investment_advisers_index
originates from Form ADV, which is submitted to the US Securities & Exchange Commission (SEC) by any individual or firm that provides advice about securities to clients for compensation. Investment advisers are required to register with the SEC or state securities regulators, depending on the size of their business.
The form provides information about the adviser's business, ownership structure, conflicts of interest, and disciplinary actions. It also provides other narrative disclosures. Form ADV must be updated at least annually, or when certain significant changes occur, to keep regulators and clients informed of any changes in the adviser's business practices.
8/20/24: US Patents, LLM Foundations - Added raw text of patent applications and grants and a map of patent filings and their associated applications.
Expanded the existing US Patent Grants listing to include patent applications. The uspto_patent_text_attributes
table provides the content of the patent applications and grants in raw text and JSON structure; it also provides the subsections of a patent broken down into smaller chunks, which is optimized for LLM use cases. The uspto_patent_relationships
table is a map of patent filings to associated applications, joinable to the USPTO Patent Index.
8/2/24: Cybersyn Foundations - Added monthly fund-level statistics covered within an NPORT filing to disclose the fund's financial status.
The NPORT filing is a regulatory filing required by the SEC that investment companies, such as mutual funds, must submit on a monthly basis. This report includes detailed information about the fund's portfolio holdings, borrowings, investments, cash positions, and more, providing transparency into the fund's financial status and risk profile.
Each row represents a distinct fund statistic by ADSH (SEC unique document identifier). These tables, sec_nport_timeseries
and sec_nport_attributes
, are joinable to the sec_nport_filing_index
for fund information.
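A minimal sketch of such a join, run against an in-memory SQLite database so that it is self-contained. Only the table names come from this changelog. The columns (adsh, fund_name, variable, date, value), the choice of adsh as the join key, and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, heavily simplified schemas; the real tables may differ.
    CREATE TABLE sec_nport_filing_index (adsh TEXT PRIMARY KEY, fund_name TEXT);
    CREATE TABLE sec_nport_timeseries (adsh TEXT, variable TEXT, date TEXT, value REAL);

    -- Invented sample rows, for illustration only.
    INSERT INTO sec_nport_filing_index VALUES
        ('adsh-0001', 'Example Fund A'),
        ('adsh-0002', 'Example Fund B');
    INSERT INTO sec_nport_timeseries VALUES
        ('adsh-0001', 'total_assets', '2024-06-30', 100.0),
        ('adsh-0001', 'total_liabilities', '2024-06-30', 20.0),
        ('adsh-0002', 'total_assets', '2024-06-30', 50.0);
    """
)

# Attach fund information to each fund statistic via the assumed adsh key.
rows = conn.execute(
    """
    SELECT idx.fund_name, ts.variable, ts.date, ts.value
    FROM sec_nport_timeseries AS ts
    JOIN sec_nport_filing_index AS idx ON idx.adsh = ts.adsh
    ORDER BY idx.fund_name, ts.variable
    """
).fetchall()

for row in rows:
    print(row)
```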
8/2/24: Cybersyn Foundations - Added index of all actively-registered investment company series and classes that have been issued IDs by the SEC.
Found in sec_investment_company_index
, this data includes all actively-registered investment companies. This report includes entities that have received IDs but have not yet begun selling shares to the public, those that have ceased operation but have not yet been re-classified as inactive by the registrant, and those that have ceased operation but have not yet terminated their registration with the SEC.
The CIK is a unique identifier assigned by the SEC to all entities that file disclosures (e.g., investment companies, mutual funds, and other registrants). Investment companies can be structured as series companies, meaning they operate multiple portfolios or series under a single corporate umbrella. Each series represents a distinct portfolio within the investment company, and each series has its own set of financial statements and investment objectives. Within each series, there can be multiple classes of shares. These classes may differ in terms of fee structures, minimum investment amounts, and other characteristics.
A CIK represents the investment company as a whole. It encompasses all the series and classes managed by that company. Series IDs are unique identifiers for each portfolio or series within an investment company. All series under an investment company share the same CIK, but have different Series IDs. Class IDs are unique to each share class within a series.
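The CIK, Series ID, and Class ID hierarchy described above can be pictured as a small nested structure. This is only a sketch: the class names, field names, and identifier values are invented placeholders and do not reflect the layout of sec_investment_company_index.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ShareClass:
    class_id: str  # unique to one share class within a series


@dataclass
class Series:
    series_id: str  # unique to one portfolio within the investment company
    classes: List[ShareClass] = field(default_factory=list)


@dataclass
class InvestmentCompany:
    cik: str  # shared by every series and class under the company
    series: List[Series] = field(default_factory=list)

    def flatten(self) -> Iterator[Tuple[str, str, str]]:
        """Yield one (cik, series_id, class_id) tuple per share class."""
        for s in self.series:
            for c in s.classes:
                yield (self.cik, s.series_id, c.class_id)


# Placeholder identifiers, invented for illustration only.
company = InvestmentCompany(
    cik="CIK-EXAMPLE",
    series=[
        Series("SERIES-1", [ShareClass("CLASS-1A"), ShareClass("CLASS-1B")]),
        Series("SERIES-2", [ShareClass("CLASS-2A")]),
    ],
)

rows = list(company.flatten())
for row in rows:
    print(row)
```

Flattening yields one row per share class, and every row carries the same CIK while the Series ID and Class ID vary.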
8/2/24: Cybersyn Foundations - Added SEC broker-dealer registration and amendment filings.
sec_broker_dealer_index
is a list of broker-dealer registration and amendment filings with the SEC, including company information mapped to the geography index.
A broker-dealer is an individual or firm that is in the business of buying and selling securities on behalf of its clients (as a broker) and for its own account (as a dealer). Broker-dealers are heavily regulated in the US to protect investors and ensure fair and transparent markets.
Form BD is the Uniform Application for Broker-Dealer Registration, which must be filed with the SEC and one or more self-regulatory organizations (SROs, e.g. FINRA) under the Securities Exchange Act of 1934. It provides information about the broker-dealer's business.
Broker-dealers must file Form BD when they initially apply for registration. Amendments must be filed whenever the information becomes inaccurate or incomplete. There is no set frequency for filing Form BD.
July 2024
7/30/24: Global Government - Added monthly US domestic flight segment data from the US Department of Transportation.
The US Department of Transportation (DOT) provides monthly data on domestic flight segments within the United States, categorized by World Area Codes (WAC). It provides details on the types of flights, aircraft, and segments. This dataset is useful for analyzing domestic air traffic patterns, evaluating airline market share and performance, and conducting assessments within the aviation sector. The data is available in the following tables:
us_department_of_transportation_attributes
us_department_of_transportation_timeseries
airport_index
aircraft_index
aircraft_carrier_index
7/1/24: US Insurance & Healthcare - Added Form 5500 Schedule SB from the US Department of Labor.
Expanded the US Department of Labor data to include information found in Form 5500 Schedule SB. The new tables, us_department_of_labor_form_5500_schedule_sb_attributes
and us_department_of_labor_form_5500_schedule_sb_timeseries
, monitor the financial health and funding adequacy of single-employer defined benefit pension plans.
7/1/24: Weather & Environment - Added electricity consumption, sales, and capacity data from the EIA.
The EIA is part of the US Department of Energy and serves as a primary source of data on energy production, consumption, and sales in the US. Our EIA data covers the following: (1) Electricity price, sales, revenue and number of customers by sector, and (2) State-specific electricity generation, consumption, and capacity data. The data is available in the following tables:
eia_energy_attributes
eia_energy_timeseries
June 2024
6/26/24: Cybersyn Foundations - Added company event transcripts, incl. earnings transcripts.
Cybersyn now provides transcripts of hosted company events (e.g., earnings calls, updates/briefings, Annual General Meetings, and more) in JSON format. The events cover thousands of public companies globally. Transcripts are usually made available within 3 hours of the event conclusion. Data is available in the company_event_transcript_attributes
table.
6/25/24: Cybersyn Foundations - Added 20-F items parsed into HTML, text, and JSON formats.
Expanded the sec_corporate_report_item_attributes
table to include parsed 20-Fs. 20-Fs are annual reports submitted by non-US public companies to the SEC, similar to a 10-K filing by US companies. Parsing 20-Fs into items optimizes SEC filings for LLM use cases.
6/17/24: US Insurance & Healthcare - Added Form 5500 Schedule H from the US Department of Labor.
Expanded the US Department of Labor data to include information found in Form 5500 Schedule H. The new tables, us_department_of_labor_form_5500_schedule_h_attributes
and us_department_of_labor_form_5500_schedule_h_timeseries
, provide detailed financial data of large pension and welfare benefit plans, including assets, liabilities, income, expenses, and other financial information of the plan.
6/4/24: US Insurance & Healthcare - Added Form 5500 Schedule C service provider data from the US Department of Labor.
Expanded the US Department of Labor data to include information found in Form 5500 Schedule C:
- Part I, Item 1: Persons Receiving only Eligible Indirect Compensation.
us_department_of_labor_form_5500_schedule_c_part1_item1_index
Reports information on service providers who received only eligible indirect compensation and not direct compensation. It requires details on the type of services provided and the nature of the eligible indirect compensation.
- Part I, Item 2: Persons Receiving Direct or Indirect Compensation.
us_department_of_labor_form_5500_schedule_c_part1_item2_index
Provides details on service providers who received $5,000 or more in direct or indirect compensation. Includes identification information, service codes, and amounts received.
- Part I, Item 3: Indirect Compensation over $1,000.
us_department_of_labor_form_5500_schedule_c_part1_item3_index
Requires reporting for service providers receiving significant indirect compensation (excluding eligible compensation) and detailed information on each source of $1,000+. Details must be provided if a formula was used to determine the compensation.
- Part II: Service Providers failing to provide information.
us_department_of_labor_form_5500_schedule_c_part2_index
Lists service providers who failed or refused to provide the required information for Schedule C. It includes the service provider's identification and efforts made by the plan to obtain the information.
- Part III: Terminated Service Providers.
us_department_of_labor_form_5500_schedule_c_part3_index
Reports on service providers terminated during the plan year. Includes their identification, the nature of services provided, and the reasons for termination.
May 2024
5/28/24: Finance & Economics, Global Government - Added the US Economic Census survey conducted by the US Census Bureau.
The Economic Census is a survey conducted by the US Census Bureau every 5 years that provides detailed economic data on US businesses, such as number of establishments, types of goods and services provided, employment figures, payroll expenses, and operational costs, across different industries and geographic regions. The data is available in the following tables:
us_economic_census_attributes
us_economic_census_timeseries
5/28/24: Global Government - Added state and Continuum of Care (CoC)-level homelessness estimates from the US Department of Housing & Urban Development.
The US Department of Housing and Urban Development (HUD) is a federal agency responsible for national policies on America's housing needs. They conduct the Annual Homeless Assessment Report (AHAR) and Point-in-Time (PIT) count and Housing Inventory Count (HIC) surveys in January of each year. These reports provide a detailed view on the number of homeless individuals, chronically homeless persons, homeless veterans, and homeless children and youth.
The data is available in the following tables:
housing_urban_development_attributes
housing_urban_development_timeseries
5/21/24: Global Government, US Crime - Added national and state crime estimates in the US from the FBI.
The FBI, or Federal Bureau of Investigation, is the principal federal law enforcement agency in the US, tasked with investigating and enforcing federal laws. Data includes crime totals across the nation and its 50 states from 1979. The offense category is provided for reference.
The data is available in the following tables:
fbi_crime_attributes
fbi_crime_timeseries
5/16/24: Cybersyn Foundations - Added additional SEC filings (S-1, S-4, SC-13D, SC-14D, 20-F, 40-F, 6-K, DEFM14A, PREM14A, SC-TO)
See here for table details.
5/15/24: Cybersyn Foundations - Added parsed 10K/Qs and updated SEC XBRL methodology
10-K/Q sections are now parsed into plain text, HTML and JSON structure supporting LLM use cases. The data is available in the below table:
sec_corporate_report_item_attributes
Also updated SEC XBRL methodology to improve data accuracy.
April 2024
4/22/24: Weather & Environment - Added global emissions data from Climate Watch
Climate Watch's historical emissions dataset provides a record of past greenhouse gas emissions across various sectors and gases, categorized by both geographic and sector-specific divisions. The data also presents a range of future emission scenarios, each predicated on distinct assumptions regarding economic growth, technological advancements, and policy implementations.
The data is available in the following tables:
climate_watch_attributes
climate_watch_timeseries
4/17/24: Cybersyn Foundations - Added additional SEC filings (Form 3 & Form 5)
Form 3 and Form 5 data is available in the following tables:
sec_insider_trading_reporting_owners_index
sec_insider_trading_securities_index
4/11/24: Cybersyn Foundations - Added additional SEC filings (Form 4 & Form 144)
Form 4 and Form 144 data is available in the following tables:
sec_form144_securities_info_index
sec_form144_securities_sold_index
sec_form144_securities_to_be_sold_index
sec_form4_reporting_owners_index
sec_form4_securities_index
4/3/24: Cybersyn Foundations - Added parsed segment revenues from the SEC
Parsed/cleaned quarterly & annual segment revenues from 10-Ks and 10-Qs are now available in SEC_METRICS_TIMESERIES
. This table translates the raw XBRL data from the SEC into a clean version of SEC revenue segments over time.
March 2024
3/7/24: Global Government - Added regional GDP, retail sales index, and safety data from the UK Office for National Statistics
Added data from the UK Office for National Statistics in support of the Women In Data Safety Hackathon. Datasets added include regional GDP by quarter, retail sales index, prisoner statistics, and domestic abuse statistics.
Data is available in the following tables:
united_kingdom_timeseries
united_kingdom_attributes
3/6/24: Cybersyn AI Utilities - Added OpenAI & Anthropic external functions; added phone number parsing functions; renamed product
Added functions that allow users to call out to OpenAI & Anthropic and clean/process phone numbers directly in Snowflake SQL. Updated the product name to "Cybersyn AI Utilities".
February 2024
2/26/24: US Housing & US Real Estate - Added the Building Permit Survey and construction spending from the US Census Bureau
Added new data from the US Census Bureau. An indicator of future construction activity, the Building Permits Survey (BPS) provides the number and valuation of new housing units authorized by building permits at the national, state, and local levels. Construction Spending data quantifies actual expenditures on construction projects, categorized into residential, non-residential, and public sectors.
The data is available in the following tables:
us_real_estate_attributes
us_real_estate_timeseries
2/21/24: Finance & Economics - Added daily Nasdaq stock prices & trading volumes; added economic and development indicators from the OECD
Launched daily trading volumes, open/close and high/low prices of all US equities and ETFs executed on the Nasdaq. Cybersyn releases volume and pricing data, inclusive of pre-market and after hours activity, from the previous trading day at 6:00am ET. Data is sourced from Databento, a market data provider that connects directly with Nasdaq.
The data is available in the following table: stock_price_timeseries
.
Added global economic and development indicators by country from the OECD. Added the OECD's Social Expenditure Database which tracks key indicators of social policy spending across countries. The data is available in the following tables:
oecd_attributes
oecd_timeseries
2/15/24: GitHub Events - Added language
field with the programming language of the repo
Added language
to the github_events
table, which provides the programming language (e.g. Python, SQL) of the repository.
2/15/24: Global Government - Added global economic & growth indicators from the World Bank
Expanded coverage of the World Bank to include global and regional growth projections from the Global Economic Prospects report, international debt statistics by country (e.g. average interest on new debt commitments, public sector debt, external assets in debt instruments), global economic indicators by country (e.g. inflation, exchange rates, GDP, retail sales), and world development indicators by country (e.g. access to electricity, bank account ownership, life expectancy). The data is available in the following tables:
world_bank_attributes
world_bank_timeseries
2/15/24: Finance & Economics - Added Survey of Consumer Expectations and Household Debt & Credit reports from the New York Fed
Added quarterly household debt & credit data and monthly consumer expectations for inflation, household finances, and the labor market. The data is available in the following tables:
newyorkfed_consumer_attributes
newyorkfed_consumer_timeseries
2/8/24: GitHub Events - Added last_seen
field with repo name last seen date
Added a new field to the github_repos
table. last_seen
includes the last date a repo_name
was used. This field allows you to identify the latest repo_name
for a particular repo_id
.
The field was added as a workaround for a known issue with the source data. In the GitHub events dataset, a repository name should appear as a combination of the repository organization and name. In 2018, some repository names were incomplete. For example, rust-lang/ appeared without its full repository name, rust-lang/rust, for ~3 months. Additionally, in 2019, some repository names appeared with duplicate organizations (e.g. nolimits4web/nolimits4web/swiper). In both cases, GitHub implemented a go-forward fix.
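A minimal sketch of how last_seen can be used to pick the latest repo_name for each repo_id. The rows below are hypothetical: the repository names echo the examples above, but the repo_id values and dates are invented for illustration and the real github_repos table has more columns.

```python
from datetime import date

# Hypothetical rows shaped like (repo_id, repo_name, last_seen); values are illustrative.
github_repos = [
    (1, "rust-lang/", date(2018, 3, 1)),
    (1, "rust-lang/rust", date(2024, 2, 1)),
    (2, "nolimits4web/nolimits4web/swiper", date(2019, 5, 1)),
    (2, "nolimits4web/swiper", date(2024, 1, 15)),
]


def latest_repo_names(rows):
    """Map each repo_id to the repo_name with the most recent last_seen date."""
    latest = {}
    for repo_id, repo_name, last_seen in rows:
        current = latest.get(repo_id)
        if current is None or last_seen > current[1]:
            latest[repo_id] = (repo_name, last_seen)
    return {repo_id: name for repo_id, (name, _) in latest.items()}


names = latest_repo_names(github_repos)
print(names)
```

Keeping only the most recently seen name per repo_id drops the incomplete and duplicated-organization variants, provided the corrected name was seen later than the malformed one.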
2/7/24: Global Government - Removed legacy US Treasury revenue collections tables
Removed legacy tables: us_treasury_revenue_collections_attributes
and us_treasury_revenue_collections_timeseries
.
Users still have access to revenue collections data in the us_treasury_attributes
and us_treasury_timeseries
tables that were released on 12/14/23.
January 2024
1/31/24: Weather & Environment - Removed legacy tables with data from Our World in Data and CDR.fyi
Removed the following legacy tables from Our World in Data and CDR.fyi:
carbon_intensity_attributes
carbon_intensity_timeseries
carbon_credit_purchase_attributes
carbon_credit_purchase_timeseries
Users still have access to the carbon intensity data in the our_world_in_data_attributes
and our_world_in_data_timeseries
tables that were released on 12/15/23.
1/27/24: Global Government - Added global Environmental, Social, Governance (ESG) statistics from the World Bank
Added annual ESG statistics on global sustainability performance for 200+ countries and territories. Example country-level variables include CO2 emissions, natural resource depletion, energy use, government effectiveness, life expectancy, population with access to safely managed drinking water, and renewable electricity output.
Added governance indicators for 200+ countries and territories. The Worldwide Governance Indicators (WGI) report on six broad dimensions of governance: voice and accountability, political stability and absence of violence, government effectiveness, regulatory quality, rule of law, and control of corruption.
The data is available in the following tables:
world_bank_timeseries
world_bank_attributes
1/26/24: LLM Training, Tech & Innovation - Added open catalog of scholarly entities and how they are connected from OpenAlex
Added OpenAlex's catalog on scholarly entities and how they are connected to each other. Entities are defined as scholarly works (e.g. journal articles, books, theses), authors, sources, affiliated organizations, topics covered, publishers, and funders. This data is derived from a wide range of sources, offering an extensive overview of academic research and its contributors.
The data is available in the following tables:
openalex_authors_index
openalex_concepts_index
openalex_funders_index
openalex_institutions_index
openalex_publishers_index
openalex_sources_index
openalex_works_index
1/19/24: Global Government - Added weekly national unemployment insurance claims from the US Department of Labor
Expanded US Department of Labor coverage to include national weekly figures on initial and continuing unemployment claims filed during the previous week. Initial claims represent the number of individuals who have filed for unemployment benefits for the first time during a given week. Continuing claims indicate the number of individuals who are still receiving unemployment benefits in subsequent weeks.
The data is available in the following tables:
us_department_of_labor_unemployment_insurance_claims_attributes
us_department_of_labor_unemployment_insurance_claims_timeseries
1/9/24: Weather & Environment - Added global agrifood emissions from the Food and Agriculture Organization (FAO)
Added greenhouse gas (GHG) emissions related to agriculture, forestry, and other land use activities. Example variables include: 1) Total emissions generated from agrifood systems 2) Emissions associated with crop processes including rice cultivation and burning of crop residues 3) Emissions from livestock 4) Emissions from land use including forests, fires, and drained organic soils 5) Intensity of emissions by agricultural commodity.
The data is available in the following tables:
food_agriculture_organization_attributes
food_agriculture_organization_timeseries
1/8/24: Global Government - Added US Consumer Credit, Industrial Production, Financial Accounts, and Commercial Paper from the Federal Reserve
Added the following timeseries from the Federal Reserve to the federal_reserve_timeseries
and federal_reserve_attributes
tables:
- Outstanding monthly revolving and non-revolving US consumer credit including select terms (e.g. interest rates on new car loans, personal loans, credit card plans) and major holders (e.g. credit unions, finance companies, depository institutions) - Fed Report Number G.19
- Monthly industrial production and capacity utilization rates covering manufacturing, mining, and electric and gas utilities. The production index measures real output and the capacity index estimates sustainable potential output. - Fed Report Number G.17
- Quarterly assets and liabilities by sector and financial instruments. Quarterly balance sheets, including net worth, for households, nonprofits and non-financial businesses. - Fed Report Number Z.1
- Daily commercial paper issuance rates and volumes - Derived from data supplied by The Depository Trust & Clearing Corporation
- Monthly owned and managed outstanding receivables at financial services companies. - Fed Report Number G.20
1/5/24: Global Government - Added weekly unemployment insurance (i.e. jobless) claims by state from the US Department of Labor
Added weekly unemployment insurance claims (i.e. jobless claims) which count the number of people filing to receive unemployment insurance benefits by state. The claims are broken into two categories - initial (number of people filing for the first time) and continued (number of people filing for ongoing unemployment benefits). The data is available in the following tables:
us_department_of_labor_unemployment_insurance_claims_attributes
describes the unemployment insurance variables tracked by the US Department of Labor
us_department_of_labor_unemployment_insurance_claims_timeseries
provides historical values for each US Department of Labor reported attribute
1/3/24: US Housing & US Real Estate - Added monthly housing prices and weekly mortgage interest rates from Freddie Mac
Added Freddie Mac’s House Price Index (HPI) which provides a monthly measure of typical price inflation for single family homes within the US using data from loans purchased by Freddie Mac or Fannie Mae.
Added Freddie Mac's Primary Mortgage Market Survey (PMMS) which provides a weekly report on the national average of 30/15-year fixed rate mortgages as well as adjustable rate mortgages, tracking borrowing costs for US homebuyers and refinancers.
The data is available in the following tables:
freddie_mac_housing_attributes
details the Freddie Mac House Price Index and mortgage interest rate variables covered.
freddie_mac_housing_timeseries
provides historical values for each reported attribute
December 2023
12/20/23: Finance & Economics - Added international economic indicators from the International Monetary Fund (IMF)
Added the below IMF indicators:
- Primary Commodity Price System - prices of commodities by country
- Balance of Payments - countries’ economic transactions with the rest of the world e.g. trade balances, capital flows, and official reserves
- Fiscal Monitor - detailed government finances e.g. revenues, expenditures, transactions in financial assets/liabilities
- International Financial Statistics - macroeconomic and financial indicators e.g. exchange rates, international liquidity, interest rates, population
- Coordinated Portfolio Investment Survey - cross-border holdings of equity and debt securities
- Africa Regional Economic Outlook - economic indicators for Africa
- Asia and Pacific Regional Economic Outlook - economic indicators for the Asia and Pacific region
The data is available in the following tables:
international_monetary_fund_attributes
: Human-readable variable names describing the financial statistic being measured
international_monetary_fund_timeseries
: Historical values for the tracked financial variables
12/19/23: Canadian Government - Added month-end assets and liabilities for the Bank of Canada from Statistics Canada
Expanded coverage of Statistics Canada, Canada's national statistical office, with month-end assets and liabilities for the Bank of Canada and Chartered Banks. Added month-end currency outside bank/chartered bank deposits for the Bank of Canada.
Select assets include: Government of Canada direct and guaranteed securities (e.g., Treasury Bills, Government of Canada bonds, mortgage bonds), provincial money market securities, provincial bonds, commercial paper, and corporate bonds.
Select liabilities include: notes in circulation, Canadian dollar deposits, securities sold under repurchase agreements, derivatives, foreign currency liabilities, and equity.
The data can be found in the following tables:
canada_statcan_attributes
includes a unique variable ID and a human-readable variable name. Each field represents a characteristic of the variable, including its unit of measurement, whether it's seasonally adjusted, frequency of measurement, its Statistics Canada report, and other related information
canada_statcan_timeseries
provides historical values for each variable collected by Statistics Canada by GEO_ID
. Each row represents a distinct timeseries, date, and value by geographic entity.
12/19/23: Finance & Economics - Added Eurozone residential property prices & valuations, CPI indices from the European Central Bank
Added annual, quarterly, and monthly residential property prices and valuations by residential property type (e.g. houses, flats) across countries in the European Union (EU). Includes details on the source of the data, seasonal adjustment of the variable, and measurement type (e.g. average, maximum or minimum valuation, estimates of the over/undervaluation).
Added annual, quarterly, and monthly consumer prices indices by EU countries for a wide range of consumer goods (e.g. clothing, baby food, books, cigarettes).
Data is available in the following tables:
european_central_bank_timeseries
: Historical records for the tracked real estate and financial variables by European country or country group.
european_central_bank_attributes
: Human-readable variable names and details describing residential property and CPI statistics.
12/18/23: Global Government - Added industrial development statistics on the manufacturing sector from the United Nations Industrial Development Organization (UNIDO)
Sourced from the INDSTAT 2 (2016-2023), the data provides statistics on industrial performance and trends by country for the manufacturing sector. The following variables over time, by country are included:
- Number of establishments
- Number of employees
- Wages and salaries
- Output
- Value added
- Gross fixed capital formation
- Number of female employees
- Index of industrial production
Data is available in the following tables:
united_nations_industrial_development_organization_attributes
details each manufacturing indicator tracked by ISIC economic activity and INDSTAT yearly version.
united_nations_industrial_development_organization_timeseries
provides historical values on an annual basis for each UNIDO variable tracked.
12/18/23: Weather & Environment - Added global emissions by country and industry from the European Commission's Emissions Database for Global Atmospheric Research (EDGAR)
Added global emissions of carbon dioxide (CO2), methane (CH4), nitrous oxide (N2O), and various fluorinated gases (F-gases). F-gases are man-made greenhouse gases emitted from everyday products (e.g. air conditioners, refrigerators) and industrial applications. Both biogenic (emissions that come from natural sources like soils/plants) and fossil fuel sources are included.
Data is available in the following tables:
- european_commission_edgar_attributes details each emissions variable tracked by greenhouse gas type and Intergovernmental Panel on Climate Change (IPCC) industry.
- european_commission_edgar_timeseries provides historical values on a monthly or annual basis for each tracked variable.
12/15/23: Weather & Environment - Added annual temperature change, emissions, and energy consumption by country/region, sector, and source from OWID
Added Our World in Data (OWID) metrics covering annual temperature change, emissions, and energy consumption by country/state, sector, and source to the our_world_in_data_attributes and our_world_in_data_timeseries tables.
Variables added cover:
- Global warming
- Carbon Dioxide Emissions
- Carbon Intensity
- Emissions from Aviation
- Energy Consumption
- Fugitive Emissions
- Greenhouse Gas Emissions
- Methane Emissions
- Nitrous Oxide Emissions
12/14/23: Tech & Tech & Innovation - Added web domain redirect relationships and domain characteristics
Added two new tables, domain_redirect_relationships and domain_characteristics:
- domain_redirect_relationships includes the original web domain and the domain that it redirects to, as well as the start and end dates that the redirect relationship was observed. At the time of this release, the table covers ~50K domains of the 300M+ domains Cybersyn tracks.
- domain_characteristics details which web domains are active/inactive based on the HTTP response status of the domain and whether the domain is the primary landing page or redirects to another domain. At the time of this release, the table covers 800K+ domains that Cybersyn tracks.
12/14/23: Global Government - Added HQM corporate bond yields, U.S. Treasury savings bonds, TreasuryDirect securities, and interest rates for U.S. Treasury securities
Added high quality market corporate bond yields, U.S. Treasury savings bonds (issues, redemptions, maturities), securities issued in TreasuryDirect, and average interest rates on U.S. Treasury securities from the U.S. Treasury.
- The US_TREASURY_ATTRIBUTES table includes details on each variable.
- The US_TREASURY_TIMESERIES table includes historical values for each variable.
12/6/23: Global Government - Added global labor statistics from the International Labour Organization (ILO)
Added global labor data from the International Labour Organization (ILO). Variables include employment, unemployment, population, potential labor force, labor force participation, rate of labor under-utilization, working poverty rate, and annual growth rate of output per worker. Data can be cut by sex, age, occupation, status in employment, economic activity (e.g. education, utilities, construction), rural/urban areas, and economic class.
- The INTERNATIONAL_LABOUR_ORGANIZATION_ATTRIBUTES table includes descriptions of each labor statistic.
- The INTERNATIONAL_LABOUR_ORGANIZATION_TIMESERIES table includes historical values for each ILO variable tracked.
12/5/23: Finance & Economics - Added interest rates posted by major chartered banks in Canada for selected products; Added secured overnight financing rates (SOFR) and US Treasury bill rates
Added interest rates for selected products posted by the six major chartered banks in Canada to the FINANCIAL_FRED_ATTRIBUTES, FINANCIAL_FRED_TIMESERIES, and FINANCIAL_FRED_VARIABLE_SERIES_ID_CROSSWALK tables. The typical rate is calculated based on the statistical mode of the rates published by the six banks.
The posted rates cover:
- Prime lending rate
- Conventional mortgages
- Guaranteed investment certificates
- Personal daily interest savings
- Non-checkable savings deposits
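As a rough illustration of the "typical rate" described above, the sketch below takes the statistical mode of six posted rates. The sample rates and the function name `typical_rate` are hypothetical, and the tie-breaking rule (lowest of the tied rates) is an assumption, since the changelog does not say how ties are handled.

```python
from statistics import multimode


def typical_rate(posted_rates):
    """Return the most common posted rate (the statistical mode).

    Assumption: when several rates are equally common, the lowest is used.
    The source does not describe tie handling.
    """
    if not posted_rates:
        raise ValueError("need at least one posted rate")
    return min(multimode(posted_rates))


# Hypothetical prime lending rates posted by six banks, in percent.
rates = [7.20, 7.20, 7.20, 7.25, 7.20, 7.30]
print(typical_rate(rates))
```

With four of the six banks posting the same rate, that shared rate is reported as the typical rate.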
Added secured overnight financing rates (SOFR) and US Treasury bill rates to the FINANCIAL_FRED_ATTRIBUTES, FINANCIAL_FRED_TIMESERIES, and FINANCIAL_FRED_VARIABLE_SERIES_ID_CROSSWALK tables.
Series added (FRED ID):
- Secured Overnight Financing Rate (SOFR)
- 30-Day Average Secured Overnight Financing Rate (SOFR30DAYAVG)
- 90-Day Average Secured Overnight Financing Rate (SOFR90DAYAVG)
- 180-Day Average Secured Overnight Financing Rate (SOFR180DAYAVG)
- Secured Overnight Financing Rate Index (SOFRINDEX)
- 3-Month Treasury Bill Minus Federal Funds Rate (TB3SMFFM)
- 6-Month Treasury Bill Minus Federal Funds Rate (TB6SMFFM)
12/5/23: Canadian Government - Added interest rates posted by major chartered banks in Canada for selected products
Added interest rates for selected products posted by the six major chartered banks in Canada to the CANADA_STATCAN_ATTRIBUTES and CANADA_STATCAN_TIMESERIES tables. The typical rate is calculated based on the statistical mode of the rates published by the six banks.
The posted rates cover:
- Prime lending rate
- Conventional mortgages
- Guaranteed investment certificates
- Personal daily interest savings
- Non-checkable savings deposits
12/4/23: Finance & Economics - Added Monthly State Retail Sales (MSRS) from the US Census Bureau
Added monthly state-level retail sales by NAICS code to the FINANCIAL_FRED_ATTRIBUTES, FINANCIAL_FRED_TIMESERIES, and FINANCIAL_FRED_VARIABLE_SERIES_ID_CROSSWALK tables. The US Census Bureau publishes year-over-year percent changes for each state, beginning in January 2019, using a composite model of Monthly Retail Trade Survey (MRTS) data, administrative data, and third-party data.
Year-over-year percent change estimates for the following industries are now available at the state level:
- Total Retail Sales Excluding Nonstore Retailers
- Building Material and Garden Equipment and Supplies Dealers
- Clothing and Clothing Accessories Stores
- Electronics and Appliance Stores
- Food and Beverage Stores
- Furniture and Home Furnishings Stores
- Gasoline Stations
- Health and Personal Care Stores
- Motor Vehicle and Parts Dealers
- Sporting Goods, Hobby, Musical Instrument and Book Stores
- Miscellaneous Stores
November 2023
11/30/23: Global Government - Added global trade, tariff, and import relationship data from the World Trade Organization (WTO)
Added global trade flows, imposed tariffs, and trade interactions between countries from the World Trade Organization (WTO). The data details export and import figures for goods and services across different countries and regions, tariff rates and structures that WTO member countries apply to imports from other nations, trade dependencies between countries, and the balance of trade between specific pairs of nations.
- The world_trade_organization_attributes table details the global trade, tariff, and import relationship statistics tracked by the World Trade Organization (WTO).
- The world_trade_organization_timeseries table provides timeseries values by date for the reported trade statistics by country, country group, and global region (as defined by the World Trade Organization).
11/30/23: Global Government - Added global health indicators from the World Health Organization (WHO)
Added 1,100+ health-related indicators for 194 members of the World Health Organization (WHO) and their associated country groups and global regions. Example metrics include alcohol consumption among adolescents and adults, tobacco control policies, abortion rates, accessibility of dementia care services, and adolescent fertility rates. Environmental health indicators including air pollution's impact on mortality rates and disability-adjusted life years (DALYs) as well as deaths attributable to the environment are also included.
- The world_health_organization_attributes table details the health statistics tracked by the World Health Organization (WHO).
- The world_health_organization_timeseries table provides timeseries values by date for the reported health indicators by country, country group, or global region (as defined by global organizations like UNICEF, the United Nations, the World Bank, and the World Health Organization).
11/30/23: Global Government, Finance & Economics - Added country groups and regions from the WHO, WTO, and UN to geography tables
Added additional country groups and geography types to the geography_index from the World Health Organization (WHO), World Trade Organization (WTO), and United Nations (UN). The member countries of the added geographies are mapped in the geography_relationships table. Select new geographic regions include:
- BRICS members
- World Trade Organization (WTO) members
- Association of Southeast Asian Nations (ASEAN)
- UNICEF regions
- United Nations regions
- United Nations Sustainable Development Goal (SDG) regions
- World Bank regions
- World Health Organization (WHO) regions and income regions
- World Bank regions and income groups
11/28/23: Global Government - Added US agricultural export sales data from the USDA
Added US Export Sales Reporting (ESR) data on weekly export sales activity for 40+ US agricultural commodities sold abroad from the US Department of Agriculture's (USDA) Foreign Agricultural Service (FAS).
- us_department_of_agriculture_commodities_attributes includes the export of commodities in addition to the existing production, supply, and distribution variables.
- us_department_of_agriculture_commodities_timeseries provides the reported metrics for each commodity by GEO_ID.
11/22/23: Finance & Economics - Added data from the Bank for International Settlements (BIS) on global banking conditions, property prices, and consumer price indicators
Added data from the Bank for International Settlements (BIS) on global banking conditions, property prices, and consumer price indicators
The Bank for International Settlements (BIS) is an international financial institution owned by 60+ central banks that represent countries accounting for ~95% of global GDP. As part of its mission to support international monetary and financial cooperation, the BIS acts as a bank for central banks across the world.
Added the below BIS data to two new tables: bank_for_international_settlements_attributes, which describes the metrics tracked by BIS, and bank_for_international_settlements_timeseries, which provides the values of those metrics:
- Residential property prices
- Central bank policy rates
- Consumer price indicators
- Assets and liabilities of internationally active banks, including credit to the non-financial sector as well as the geographical and currency composition of a bank’s balance sheet
11/20/23: Finance & Economics - Added 10 timeseries from FRED covering mortgage rates and additional CPI measures
Added 10 additional timeseries from FRED to the financial_fred_timeseries and financial_fred_attributes tables:
- MORTGAGE15US: 15-Year Fixed Rate Mortgage Average in the United States
- MORTGAGE30US: 30-Year Fixed Rate Mortgage Average in the United States
- CUSR0000SEFV: Consumer Price Index for All Urban Consumers: Food Away from Home in U.S. City Average, Seasonally Adjusted
- CUUR0000SEFV: Consumer Price Index for All Urban Consumers: Food Away from Home in U.S. City Average, Not Seasonally Adjusted
- CUSR0000SETG01: Consumer Price Index for All Urban Consumers: Airline Fares in U.S. City Average, Seasonally Adjusted
- CUUR0000SETG01: Consumer Price Index for All Urban Consumers: Airline Fares in U.S. City Average, Not Seasonally Adjusted
- CUSR0000SS62031: Consumer Price Index for All Urban Consumers: Admission to Movies, Theaters, and Concerts in U.S. City Average, Seasonally Adjusted
- CUUR0000SS62031: Consumer Price Index for All Urban Consumers: Admission to Movies, Theaters, and Concerts in U.S. City Average, Not Seasonally Adjusted
- CUSR0000SEHB: Consumer Price Index for All Urban Consumers: Lodging Away from Home in U.S. City Average, Seasonally Adjusted
- CUUR0000SEHB: Consumer Price Index for All Urban Consumers: Lodging Away from Home in U.S. City Average, Not Seasonally Adjusted
11/20/23: Global Government - Expanded American Community Survey (ACS) history for 1,400+ population variables since 2005 for ~500K geographies
Added historical data from the American Community Survey (ACS) to the american_community_survey_attributes and american_community_survey_timeseries tables for over 1,400 population variables dating back to 2005 at the following geographic entity levels: country, states, counties, cities, zip codes, core-based statistical areas (CBSAs), census tracts, and census block groups. Example population variable additions include age, race, income, employment status, immigration status, and household status. Data is as up-to-date as the latest ACS publication.
11/16/23: Weather & Environment - Added 6 tables covering disaster declaration and National Flood Insurance Program (NFIP) insurance data from FEMA
Added Federal Emergency Management Agency (FEMA) data on federally-declared disasters in the United States, disaster recovery public programs, and the National Flood Insurance Program (NFIP) insurance policies and claims. Six new tables were included in the release:
- fema_disaster_declaration_index: Details for each federally-declared disaster (e.g. name, type, date, public assistance funding amounts). Table is unique by disaster_id.
- fema_disaster_declaration_areas_index: Geographic entities (e.g. counties, cities) impacted by each federally-declared disaster. A disaster declaration can include multiple geographic locations.
- fema_mission_assignment_index: Work orders issued by FEMA since 2012 to other government agencies, supporting emergency response activation across the US (e.g. requesting transportation support from the US Department of Transportation during a hurricane).
- fema_national_flood_insurance_program_claim_index: Details on National Flood Insurance Program (NFIP) claims including features of the insured property, information on the flood event precipitating the claim, the cost of the damage, and subsequent insurance payout amounts.
- fema_national_flood_insurance_program_policy_index: Details on National Flood Insurance Program (NFIP) policies including features of the insured property, building and contents insurance coverage, deductibles, rates, and policy durations.
- fema_region_index: Human-readable names and details for each FEMA Region.
11/13/23: Canadian Government - Added 6 timeseries from Statistics Canada covering Canadian debt securities and real estate development
Added 6 new timeseries to the canada_statcan_timeseries and canada_statcan_attributes tables from 3 of Statistics Canada’s underlying sources:
- Bank of Canada: Government of Canada debt gross new issues, retirements and net new issues, and par values
- Annual Survey of Service Industries: Real estate agents, brokers, and appraisers summary statistics (e.g. salaries, wages, operating revenue)
- Canada Mortgage and Housing Corporation:
  - Absorptions and unabsorbed inventory, newly completed dwellings, by type of dwelling unit in select census metropolitan areas
  - Absorptions and unabsorbed inventory, newly completed dwellings, by type of dwelling unit in census agglomerations of 50K+
  - Housing starts, under construction and completions in select census metropolitan areas
  - Housing starts, under construction and completions in census agglomerations of 50K+
11/7/23: Cybersyn Public Domain Pro - Added point-in-time history for 45 tables
Added point-in-time history tables for the following Cybersyn datasets.
- American Community Survey
- Canada Statcan
- Carbon Credit Purchases
- Carbon Intensity
- Company Index and Characteristics
- Data Commons
- Domain Index
- FHFA
- Geography Characteristics
- Home Mortgage Disclosures
- IMEI
- IRS
- NOAA Weather Stations and Metrics
- OpenFIGI Security Index
- PermID Security Index
- Points of Interest (POIs)
- Urban Crime
- US Addresses & POI
- USDA
- US Treasury
- USPS Address Changes
11/6/23: Global Government - Added trade-related datasets from the US Dept Commerce's International Trade Administration
Added 5 new tables containing trade-related datasets from the US Department of Commerce's International Trade Administration (ITA). The ITA provides data on trade events, trade leads, export business service providers, ITA export assistance centers, and export restricted entities.
- international_trade_administration_business_service_providers_index provides a directory of US and foreign-based businesses providing services that many small- and medium-sized exporters require (e.g., legal, tax, consulting, market research, export management, travel facilitation services).
- international_trade_administration_trade_events_index provides trade events (including conferences, webinars, trade shows, workshops, and more) for US businesses interested in selling their products and services overseas.
- international_trade_administration_export_assistance_centers_index provides a directory of all of the International Trade Administration's (ITA) domestic and international export assistance centers and the areas they service.
- international_trade_administration_export_screened_entities_index lists parties from the Consolidated Screening List (CSL) for which the United States government maintains restrictions on certain exports, reexports, or transfers of items. These may be sanctioned individuals from the Department of Commerce, Department of State, Treasury, or Office of Foreign Assets Control.
- international_trade_administration_trade_leads_index provides contract opportunities for US businesses selling their products and services overseas.
11/3/23: Global Government - Added calendar_index table
Added the calendar_index table, which compiles common calendars into a single table. Each calendar type has a unique calendar_id, which allows users to select which calendar type they want to use. Individual periods within each calendar type include period start and end dates.
The calendar_index currently includes regular calendar periods (days, weeks, months, quarters, and years) and 4-5-4 retail calendar periods (4-5-4 retail months, quarters, and years).
The 4-5-4 retail calendar is a standardized accounting and reporting calendar system used by many retailers, where each fiscal quarter spans 13 weeks divided into retail months of 4, 5, and 4 weeks, aiming to align with seasonal variations and facilitate more accurate financial comparisons.
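To make the 4-5-4 structure concrete, the sketch below generates retail month boundaries for one 52-week fiscal year, assuming the common convention in which each 13-week quarter is split into months of 4, 5, and 4 weeks. The function name `retail_months` and the fiscal year start date are hypothetical, and 53-week years are not handled; this is not a description of how calendar_index itself is built.

```python
from datetime import date, timedelta

# Four quarters, each split into months of 4, 5, and 4 weeks (52 weeks total).
WEEKS_PER_MONTH = [4, 5, 4] * 4


def retail_months(fiscal_year_start):
    """Return (month_number, start_date, end_date) tuples for a 52-week year.

    Assumption: the caller supplies the fiscal year start date, since
    retailers choose it themselves; 53-week years are ignored.
    """
    periods = []
    start = fiscal_year_start
    for month_number, weeks in enumerate(WEEKS_PER_MONTH, start=1):
        end = start + timedelta(weeks=weeks) - timedelta(days=1)
        periods.append((month_number, start, end))
        start = end + timedelta(days=1)
    return periods


months = retail_months(date(2023, 1, 29))  # hypothetical fiscal year start
for number, start, end in months[:3]:
    print(number, start, end)
```

The first three periods cover 28, 35, and 28 days, which together make up one 91-day retail quarter.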
11/3/23: Global Government - Added global agricultural commodity production and distribution data from the USDA
Added two tables sourced from the US Department of Agriculture's (USDA) Foreign Agricultural Service (FAS) which provides production, supply, and distribution data on agricultural commodities for both the United States and other producing and consuming countries since 1960.
- us_department_of_agriculture_commodities_attributes describes the production, supply, and distribution metrics tracked for each commodity by the USDA.
- us_department_of_agriculture_commodities_timeseries provides the reported metrics for each commodity and country.
October 2023
10/25/23: Finance & Economics - Added detailed branch-level data and Summary of Deposits (SOD) data from the FDIC
The Summary of Deposits (SOD) data from the FDIC is an annual survey, capturing branch-level deposits as of June 30 for all FDIC-insured institutions, including U.S. branches of foreign banks.
- fdic_branch_locations_index: provides details on FDIC-insured bank branches, including branch-specific location information as well as institution-level regulatory and insurance data.
- fdic_summary_of_deposits_attributes: describes the deposit types tracked by the Summary of Deposits (SOD) survey.
- fdic_summary_of_deposits_timeseries: provides the results of the annual Summary of Deposits (SOD) survey going back to 1994 for banks’ branch-level deposits.
10/19/23: Global Government - Added US Federal Government Revenue Collections from the US Treasury Fiscal Data
The US Treasury provides a daily overview of net federal revenue collections from income tax deposits, customs duties, fees for government services, fines, and loan repayments. These collections undergo electronic and/or non-electronic processing, involving various channels such as mail, internet, banking, and over-the-counter transactions, all of which are comprehensively incorporated within this dataset.
The us_treasury_revenue_collections_timeseries table provides daily net collections amounts broken down by tax category and processing channel. The us_treasury_revenue_collections_attributes table details each collection method reported by the US Treasury.
10/18/23: US Insurance & Healthcare Provider Foundation - Added Form 5500 Schedule A Part 1 insurance data from the US Department of Labor.
Expanded the US Department of Labor data to include information found on Form 5500 Schedule A Part 1. The new table, us_department_of_labor_form_5500_broker_index, provides commission and fee amounts received by a broker for an insurance policy. Additional information about the brokers in the table includes their address, classification as an insurance broker, as well as notes pertaining to the compensation disbursed to them.
The us_department_of_labor_form_5500_broker_index table can be joined to insurance carrier and policy information, and to individual Form 5500 filings, using insurance_policy_id and ack_id.
10/16/23: Finance & Economics - Added exchange rates from the Bank for International Settlements (BIS)
Due to the discontinuation of certain currency conversion pairs by the European Central Bank (ECB), our primary source for daily FX rates, we have sourced a number of these pairs from an alternative source, the Bank for International Settlements (BIS), for ongoing history. The Bank for International Settlements will be used to get data for the following currency conversion pairs after September 27, 2023: USD:AED, USD:ARS, USD:CLP, USD:COP, USD:DZD, USD:MAD, USD:PEN, USD:QAR, USD:SAR, USD:TWD, USD:UAH.
Full history from the Bank for International Settlements was added for the following currency pairs: USD:ALL, USD:AUD, USD:BAM, USD:BHD, USD:BND, USD:EUR, USD:GBP, USD:IRR, USD:ISK, USD:KWD, USD:KZT, USD:LKR, USD:MKD, USD:MUR, USD:NPR, USD:NZD, USD:OMR, USD:RSD, USD:RUB, USD:TND, USD:TTD, USD:UYU, USD:VEF, USD:XDR.
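The sourcing rule described above can be sketched as a small lookup. The pair lists and the September 27, 2023 cutoff come from this entry; the function name `fx_source`, the "ECB"/"BIS" return labels, and the fallback of treating every unlisted pair as ECB-sourced are assumptions made for illustration.

```python
from datetime import date

# Pairs sourced from BIS only after the ECB discontinued them.
BIS_AFTER_CUTOFF = {
    "USD:AED", "USD:ARS", "USD:CLP", "USD:COP", "USD:DZD", "USD:MAD",
    "USD:PEN", "USD:QAR", "USD:SAR", "USD:TWD", "USD:UAH",
}
# Pairs for which full BIS history was added.
BIS_FULL_HISTORY = {
    "USD:ALL", "USD:AUD", "USD:BAM", "USD:BHD", "USD:BND", "USD:EUR",
    "USD:GBP", "USD:IRR", "USD:ISK", "USD:KWD", "USD:KZT", "USD:LKR",
    "USD:MKD", "USD:MUR", "USD:NPR", "USD:NZD", "USD:OMR", "USD:RSD",
    "USD:RUB", "USD:TND", "USD:TTD", "USD:UYU", "USD:VEF", "USD:XDR",
}
ECB_CUTOFF = date(2023, 9, 27)


def fx_source(pair, as_of):
    """Return the assumed source label for a currency pair on a given date.

    Assumption: any pair not listed above stays with the ECB, the
    primary source named in the changelog.
    """
    if pair in BIS_FULL_HISTORY:
        return "BIS"
    if pair in BIS_AFTER_CUTOFF and as_of > ECB_CUTOFF:
        return "BIS"
    return "ECB"


print(fx_source("USD:AED", date(2023, 9, 28)))
```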
10/11/23: Global Government - Added population variables to the American Community Survey tables.
Expanded the american_community_survey_attributes and american_community_survey_timeseries tables to include additional population variables related to income, age, and educational attainment.
New series include Household Income in the Past 12 Months (Inflation-Adjusted), Educational Attainment for the Population 25 Years and Over, and Age of Householder By Household Income in the Past 12 Months (Inflation-Adjusted). These series are available by multiple breakdowns (e.g. income, age, gender).
10/11/23: SEC Filings, Global Government - Added company_index table
The company_id is a unique identifier assigned by Cybersyn to each company and is joinable to the company_index, which provides company names and other helpful identifiers such as CIK, LEI, PermID, and more. Note that when the company_id is null, the row represents data for all companies.
10/09/23: US Housing & US Real Estate - Added U.S. Census Regions & Divisions
Expanded the geography_index table to include U.S. Census Regions and Divisions.
Expanded the geography_hierarchy table to include the relationships between U.S. Census regions and U.S. Census divisions; U.S. Census regions and U.S. states; and U.S. Census divisions and U.S. states.
Census regions include the United States Northeast, Midwest, etc. and census divisions include the United States Middle Atlantic, East North Central, etc.
10/8/23: SEC Filings - Added permid_security_index table to expand security coverage
- Added one new table to our dataset, permid_security_index, which includes security identifiers from Refinitiv’s PermID database. These are persistent identifiers for active and inactive securities across global asset classes. The table includes over 15K PermIDs for various securities.
- This data can be merged to the 13F filings data (sec_holding_filing_attributes) and can be mapped back to companies using the company_security_relationships table.
10/06/23: Tech & Tech & Innovation - Added a repository of web domains and included GitHub Events & US Patents tables
Cleaned and aggregated over 300M domains into a single source, the new domain_index table, to track websites globally.
Added GitHub Events and US Patents tables to the product, rebranded product from "IMEI Type Allocation Codes" to "Tech & Innovation."
10/2/23: Global Government - Added population data from the American Community Survey
Expanded our population dataset to include annual estimates from the American Community Survey (ACS) for 2021 and 2022 at multiple geographic levels in the United States.
September 2023
9/18/23: Finance & Economics - Added 6 timeseries from FRED covering core CPI and industry and commodity-specific PPI data
Added 6 new timeseries from FRED to the financial_fred_timeseries and financial_fred_attributes tables:
- PPIFIS: Producer Price Index by Commodity: Final Demand
- PCU4841214841212: Producer Price Index by Industry: General Freight Trucking, Long-Distance Truckload
- PCU4841224841221: Producer Price Index by Industry: General Freight Trucking, Long-Distance Less Than Truckload
- PCU3313133131: Producer Price Index by Industry: Alumina and Aluminum Production and Processing
- WPU101707: Producer Price Index by Commodity: Metals & Metal Products: Cold Rolled Steel Sheet and Strip
- CPILFESL: (CORE CPI) Consumer Price Index for All Urban Consumers: All Items Less Food & Energy in U.S. City Average
9/17/23: SEC Filings, LLM Training - Added company_index, company_characteristics, and company_security_relationship tables; added PermIDs
Added three new tables:
- The company_index table aggregates commonly used company identifiers (i.e. CIKs, EINs, and LEIs) into a single company_id, which can be used across Cybersyn’s datasets as a unique identifier for corporate entities.
- The company_security_relationship table maps OpenFIGI and PermID securities (i.e. securities with multiple "levels" such as OpenFIGI FIGI ID and OpenFIGI Share Class ID) to the Company.
- The company_characteristics table includes categorical characteristics of a Company (e.g. industry, address, previous names). A characteristic may be temporal with start and end dates indicating the range for which the data is valid.
Added PermID securities published by Refinitiv.
9/15/23: Weather & Environment - Added daily weather data from 80K+ weather stations across 180 countries; updated product name
New tables added:
- noaa_weather_station_index contains metadata on the weather stations from the Global Historical Climatology Network daily (GHCNd) database, including mappings to Cybersyn’s geo_id at the country-, state- and zip-level (where applicable).
- noaa_weather_metrics_attributes includes the daily weather variables tracked globally and their measurement details.
- noaa_weather_metrics_timeseries provides the details of the daily global weather variables recorded at each weather station.
Updated product name from "Emissions & Environment Essentials" to "Weather & Environmental Essentials".
9/15/23: US Insurance & Healthcare Provider Foundation - Added healthcare provider emails; combined telephone and telephone_extension into one field; changed telephone to array to accommodate numerous values
- Added email field to the nppes_provider_addresses table with provider emails per address.
- Combined telephone and telephone_extension from the nppes_provider_addresses table into a single field, telephone, and removed the telephone_extension field.
- Aggregated all values for telephone, fax, and email that are associated with the same NPI and address into arrays in one row. Rows in the nppes_provider_addresses table are now uniquely defined by NPI and full address.
9/15/23: US Points of Interest & Addresses, Global Government - Added FIPS 10-4 country codes and state abbreviations
Expanded the geography_characteristics table to include mappings of FIPS 10-4 country codes and U.S. state abbreviations to country and state-level geo_ids, respectively.
9/10/23: US Insurance & Healthcare Provider Foundation - Added table to relate taxonomy classifications to practitioners’ license numbers
Added the nppes_provider_taxonomy_and_license_numbers table that relates taxonomy classifications to practitioners’ license numbers. This table provides users the ability to filter for license numbers based on practitioners’ primary taxonomy.
9/7/23: SEC Filings - Added 13F filings to include data on quarterly investment fund managers’ holdings; added a securities index table based on OpenFIGI data
Expanded our dataset to include individual filings from 13F fund holding reports, which disclose the equity holdings of institutional investment managers. Added three new tables: sec_holding_filing_index, sec_holding_filing_attributes, and openfigi_security_index.
The sec_holding_filing_index table contains metadata from individual 13F filings, including filing date and filing organization. The sec_holding_filing_attributes table includes securities' names, market value, number of shares held, and OpenFIGI IDs, which make it easier to map to and analyze against outside data sources. The openfigi_security_index table contains an index of over 2M securities listed on OpenFIGI and can be joined with the sec_holding_filing_attributes table using top_level_openfigi_id, the unique identifier for each security in the two tables.
9/6/23: US Insurance & Healthcare Provider Foundation - Added contact information and broker payments to Form 5500 data
- Added new fields to us_department_of_labor_form_5500_filing_index with Form 5500 contact information including name and phone numbers: ADMIN_SIGNED_NAME, SPONSOR_SIGNED_NAME, DIRECT_FILING_ENTITY_SIGNED_NAME, ADMIN_PHONE_NUM, and SPONSOR_DIRECT_FILING_ENTITY_PHONE_NUM.
- Added new fields to us_department_of_labor_form_5500_policy_index with payments to agents and brokers: COMMISSIONS_PAID_TO_BROKER and FEES_PAID_TO_BROKER.
- Removed ~10k rows from us_department_of_labor_form_5500_policy_index that had NULL values for every field, as these filings reported no insurance policy data from Form 5500 Schedule A.
August 2023
8/31/23: US Insurance & Healthcare Provider Foundation - Added deactivated NPI numbers to NPPES data
Added a new table, nppes_npi_index, that contains information on when NPIs were first issued, deactivated, or reactivated, dating back to 2005. This table also includes a boolean flag to indicate if an NPI is currently active.
Note that while all NPIs appear in the nppes_npi_index table, only actively registered NPIs as well as NPIs deactivated after August 1, 2023 appear in the nppes_practitioner_attributes and nppes_organization_attributes tables. This means the dataset does not include attribute-level data (names, type of providers, specialization) on providers with NPIs deactivated before August 2023.
8/27/23: US Points of Interest & Addresses, US Housing & US Real Estate - Added points of interest data from Overture Maps Foundation
Added the point_of_interest_index table, which includes names and categories for points of interest in the US. Each POI is uniquely identified by a poi_id.
To tie POIs to addresses, we added a new column, address_id, to the us_addresses table to uniquely identify each individual address. This column allows users to join addresses to POIs using the new point_of_interest_addresses_relationships table, with poi_id and address_id as the join keys for the point_of_interest_index table and us_addresses table, respectively.
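The join path described above can be tried out locally with an in-memory SQLite database. The table names and the poi_id and address_id keys come from this entry; the other columns (`poi_name`, `street`) and the sample rows are hypothetical, and the real tables live in the Cybersyn listings rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Minimal stand-ins for the real tables; non-key columns are hypothetical.
    CREATE TABLE point_of_interest_index (poi_id TEXT, poi_name TEXT);
    CREATE TABLE us_addresses (address_id TEXT, street TEXT);
    CREATE TABLE point_of_interest_addresses_relationships (poi_id TEXT, address_id TEXT);
    INSERT INTO point_of_interest_index VALUES ('poi_1', 'Example Cafe');
    INSERT INTO us_addresses VALUES ('addr_1', '1 Example St');
    INSERT INTO point_of_interest_addresses_relationships VALUES ('poi_1', 'addr_1');
""")

# The relationships table bridges POIs (poi_id) and addresses (address_id).
JOIN_SQL = """
    SELECT poi.poi_name, addr.street
    FROM point_of_interest_index AS poi
    JOIN point_of_interest_addresses_relationships AS rel
        ON rel.poi_id = poi.poi_id
    JOIN us_addresses AS addr
        ON addr.address_id = rel.address_id
"""
joined = conn.execute(JOIN_SQL).fetchall()
print(joined)
```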
8/27/23: US Points of Interest & Addresses, US Housing & US Real Estate - Added 7.2M new addresses, removed 49.8M duplicate addresses, deleted 1.2M addresses with Null street value
Added 7.2M new addresses covering points of interest from Overture Maps Foundation to the us_addresses table.
Removed 49.8M addresses that were duplicative aside from minor variability in coordinates. Removed 1.2M rows from the us_addresses table where the street value contained a string with value Null.
8/27/23: US Points of Interest & Addresses, US Housing & US Real Estate - Added country-level geospatial boundaries to the geography_characteristics table
Added country-level geospatial boundaries to the geography_characteristics table with data from Overture Maps Foundation.
8/13/23: SEC Filings - Added 8-K filings and exhibits for 10-Qs and 10-Ks
Expanded our coverage of SEC documents to include the full text of 8-K filings and associated exhibits. 8-K filings include company press releases, earnings releases, and other major corporate events.
Added the full text of exhibits for 10-K and 10-Q filings. Exhibit types include lists of subsidiaries, merger agreements, and material changes in financial conditions. Exhibits are denoted in the variable
and variable_name
columns (e.g. 10-K EX-21 Filing Text
).
Added the sec_document_id
column. This field is a combination of the ADSH (accession number) and the document type (e.g. 10-K). This serves as a unique identifier for each individual component that makes up a filing in cases when one or more exhibits are included in a filing.
8/11/23: Global Government, US Housing & US Real Estate, US Addresses & POI & Geographic Areas - Added geospatial boundaries data for territories in the US and Canada
The Census Bureau and Statistics Canada publish geospatial boundaries data for their territories at multiple geographic levels. We added a table geography_characteristics
with the boundary coordinates from the most recent releases in both WKT and GeoJSON formats. The table is joinable at different levels using Cybersyn's geo_id
. This geo_id
is compatible with all Cybersyn listings that have geographic identifiers. Currently, the geographic levels covered include:
- State (US and Canada)
- County (US only)
- Census Tract (US only)
- ZIP Code (US only)
- Dissemination Area and Aggregate Dissemination Area (Canada only)
- Census Division and Census Subdivision (Canada only)
- Census Agglomeration and Census Agglomeration Part (Canada only)
- Census Metropolitan Division and Census Metropolitan Division Part (Canada only)
8/10/23: Finance & Economics - Added crosswalk to FRED series IDs & 107 new series from GDP, Employment Situation, Housing Starts, and Residential Construction reports
Added a new table, financial_fred_variable_series_id_crosswalk
, that enables a join between Cybersyn’s variable and FRED’s unique series ID.
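A minimal sketch of how such a crosswalk might be used, run against an in-memory SQLite database. The table names come from this changelog; the variable, series_id, date, and value column names are assumptions, and every row is a placeholder rather than real FRED data.

```python
import sqlite3

# Hypothetical schemas: column names are assumed, rows are placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE financial_fred_timeseries (variable TEXT, date TEXT, value REAL);
CREATE TABLE financial_fred_variable_series_id_crosswalk (variable TEXT, series_id TEXT);
INSERT INTO financial_fred_timeseries VALUES ('example_variable_a', '2023-01-01', 1.0);
INSERT INTO financial_fred_timeseries VALUES ('example_variable_b', '2023-01-01', 2.0);
INSERT INTO financial_fred_variable_series_id_crosswalk VALUES ('example_variable_a', 'SERIES_A');
INSERT INTO financial_fred_variable_series_id_crosswalk VALUES ('example_variable_b', 'SERIES_B');
""")


def values_for_series(series_id):
    """Look up time series rows by series ID via the crosswalk."""
    return conn.execute(
        """
        SELECT ts.date, ts.value
        FROM financial_fred_timeseries AS ts
        JOIN financial_fred_variable_series_id_crosswalk AS xw
          ON xw.variable = ts.variable
        WHERE xw.series_id = ?
        ORDER BY ts.date
        """,
        (series_id,),
    ).fetchall()


print(values_for_series("SERIES_A"))
```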
Added new series from the following four reports:
- Gross Domestic Product (data produced by the US Bureau of Economic Analysis)
- Employment Situation (US Bureau of Labor Statistics)
- New Residential Construction (US Department of Housing and Urban Development)
- Quarterly Starts and Completions by Purpose and Design (US Department of Housing and Urban Development)
8/7/23: Global Government - Added text-based US government contracts data from SAM.gov
The US government publishes contract opportunities and proposals to do business with the federal government via the System for Award Management (sam.gov) for contracts and awards with a value of at least $25,000. The data goes back to January 2002 and includes metadata providing descriptions of government contracts and the corresponding awards granted for those contracts.
July 2023
7/31/23: Finance & Economics - Added release_name
and release_source
for better discoverability
Two columns were added to financial_fred_attributes
to provide better categorization and discoverability:
- release_name: The collection, group of data, or report from which a time series originates. This column can be used as a filter to find related series.
- release_source: The organization (e.g. FDIC, Federal Reserve) that FRED collects the data from.
7/28/23: Canadian Government - Removed the age_group, measure, and labour_force_statistic columns
The following legacy columns from the July 16, 2023 release were removed:
- age_group
- measure
- labour_force_statistic
7/25/23: Finance & Economics - Added 124 time series from FRED covering a variety of economic data
Added 124 additional time series from FRED to the financial_fred_timeseries
and financial_fred_attributes
tables.
7/16/23: Canadian Government - Added prices, output, and household income; schema changes
- New datasets:
  - Household income and finances: Household income and consumption, household credit liabilities, and household savings rates
  - Prices and output: Core consumer price index, gross domestic product by industry group, and new housing price indices
- Updated datasets:
  - StatCan archived a number of retail trade series and published new series. Old series are marked “Archived…” in the report field. The latest series have been included to replace these.
- Schema changes:
  - The following columns in the canada_statcan_attributes view were updated. The deprecated columns will be removed on 7/28/2023.
    - age_group will be folded into demographic_group, which applies more broadly to age ranges, income groups, and household makeups (e.g., 35 to 44 years, lowest income quintile, elderly persons not in an economic family).
    - measure will be renamed report. The report column displays the StatCan dataset from which the data originates (e.g. Consumer Price Index, New Housing Price Index).
    - labour_force_statistic will be renamed statistic. The new statistic column will provide the label for the specific economic metric that is being reported (e.g. Median After Tax Income, Number of families).
7/3/23: US Insurance & Healthcare Provider Foundation - Added Form 5500 insurer information
Added information from US Department of Labor Form 5500 filings about company benefit providers and the insurance/benefit plans they offer. New tables include us_department_of_labor_form_5500_filing_index
and us_department_of_labor_form_5500_policy_index
.
June 2023
6/14/23: SEC Filings - Added full text 10-Qs and 10-Ks
Added the full text of 10-K/Q filings. These are contained in the sec_report_text_attributes
table.
6/13/23: Finance & Economics - Added Bureau of Labor Statistics datasets
Added the Consumer Price Index (CPI), Average Prices (AP), Job Openings and Labor Turnover Survey (JOLTS), State and Metro Area Employment, Hours, & Earnings (SAE), and Local Area Unemployment Statistics (LAUS) datasets from the Bureau of Labor Statistics.
6/1/23: Global Government - Added 3,000 new US zip codes from USPS and US Census
- Using USPS address change data, we added 3,000 zip codes (mostly PO Box) to the dc_geo_index.
- Using both the USPS address change and US Census Bureau data, we increased the coverage in the geography_relationships table with 6,500 new zip and city relationships. We now map 86% of zip codes to a city.
May 2023
5/26/23: Finance & Economics - Added 5 global central bank policy rates
Added central bank policy rates from Brazil, Canada, England, Mexico, and Japan to fred_timeseries
and fred_attributes
.
5/19/23: Global Government - Updated product name from Cybersyn Data Commons to Cybersyn Global Government
Rebranded Cybersyn Data Commons as Cybersyn Global Government. We updated the naming conventions for schemas, tables, and column names to make them consistent across all of Cybersyn’s existing and future data products. Cybersyn will continue to support and update your older version of the Data Commons tables.
5/19/23: US Addresses & POI & Geographic Areas, GitHub Events - Added source data from National Address Database (NAD)
Added the National Address Database (NAD) as a source to increase our US address coverage:
- Increased the coverage from 140 million addresses to more than 188 million.
- There is now at least one address in more than 85% of zip codes, up from 74% previously.
- Increased the portion of cities that are mapped to distinct IDs joinable to our other data sets from 24% to over 77%.