US Insurance & Healthcare

Healthcare provider and insurance-related data including NPIs from the NPPES and benefits plans of all large US employers via Form 5500 filings


This product includes data on actively registered US healthcare providers and on the benefits plans (e.g. healthcare, medical, life insurance) of all large US employers. The dataset is well suited to serve as a spine for any healthcare provider analysis because each provider carries a unique NPI (National Provider Identifier) code that is used across HIPAA-covered entities.

Example topics covered:

  • Healthcare provider names, emails, and phone numbers

  • Healthcare provider mailing addresses and business/practice locations

  • Healthcare provider license numbers

  • Healthcare provider specialties

  • NPI issuance, deactivation and reactivation dates

  • Point-in-time data for deactivated NPIs

  • NPI to NUCC taxonomy mapping

  • Insurance providers of specific companies

  • Insurance carrier market penetration

Data Sources, Attributes, Sample Queries

A detailed description of the data is available by source. Source pages include key attributes (e.g. geographic coverage, time granularity, history, entity level), release frequency, notes & methodologies, and sample queries.

All Cybersyn products follow the EAV (entity-attribute-value) model with a unified schema. Entities are tangible objects (e.g. geography, company) that Cybersyn provides data on. The dates and values of every timeseries that refers to an entity are stored in a timeseries table. Descriptors of each timeseries are stored in an attributes table. Data is joinable across all Cybersyn products that have a GEO_ID. Refer to Cybersyn Concepts for more details.
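As a rough illustration of the timeseries/attributes split described above, the following Python sketch attaches descriptors to timeseries rows. Apart from GEO_ID, every column name (VARIABLE, DATE, VALUE, VARIABLE_NAME, UNIT) and every value is a made-up placeholder, not taken from the actual Cybersyn schema.

```python
# Hypothetical timeseries table: one row per entity, variable, and date.
# All values below are placeholders for illustration only.
timeseries = [
    {"GEO_ID": "geo-example-1", "VARIABLE": "example_variable", "DATE": "2023-01-01", "VALUE": 1.0},
    {"GEO_ID": "geo-example-1", "VARIABLE": "example_variable", "DATE": "2023-02-01", "VALUE": 2.0},
    {"GEO_ID": "geo-example-2", "VARIABLE": "example_variable", "DATE": "2023-01-01", "VALUE": 3.0},
]

# Hypothetical attributes table: one descriptor record per variable.
attributes = {
    "example_variable": {"VARIABLE_NAME": "Example variable", "UNIT": "Count"},
}


def describe(rows, attrs):
    """Attach each timeseries row's descriptors by looking up its VARIABLE."""
    return [{**row, **attrs[row["VARIABLE"]]} for row in rows]


described = describe(timeseries, attributes)
for row in described:
    print(row["GEO_ID"], row["DATE"], row["VARIABLE_NAME"], row["VALUE"], row["UNIT"])
```

Keeping descriptors in a separate table means that a new variable adds rows rather than columns, which is what allows a unified schema across products.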

As with all Public Domain datasets, Cybersyn aims to release data on Snowflake Marketplace as soon as the underlying source releases new data. We check periodically for changes to the underlying source and, upon detecting a change, propagate the data to Snowflake Marketplace immediately. See our release process for more details.

Releases & Changelog

10/18/23 - Added Form 5500 Schedule A Part 1 insurance data from the US Department of Labor

Expanded the US Department of Labor data to include information found on Form 5500 Schedule A Part 1. The new table, us_department_of_labor_form_5500_broker_index, provides commission and fee amounts received by a broker for an insurance policy. Additional information about the brokers in the table includes their address, classification as an insurance broker, as well as notes pertaining to the compensation disbursed to them.

The us_department_of_labor_form_5500_broker_index table can be joined to insurance carrier and policy information, and to individual Form 5500 filings, using INSURANCE_POLICY_ID and ACK_ID.
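A minimal sketch of such a join, run against an in-memory SQLite database from Python so that it is self-contained. The three table names and the two join keys come from this page. The other columns (SPONSOR_NAME, CARRIER_NAME, BROKER_NAME, BROKER_COMMISSION), the sample rows, and the choice of which key links which pair of tables are assumptions made for illustration, and the syntax on Snowflake may differ slightly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table names and join keys follow this page; all other columns are hypothetical.
conn.executescript("""
CREATE TABLE us_department_of_labor_form_5500_filing_index (
    ACK_ID TEXT, SPONSOR_NAME TEXT);
CREATE TABLE us_department_of_labor_form_5500_policy_index (
    ACK_ID TEXT, INSURANCE_POLICY_ID TEXT, CARRIER_NAME TEXT);
CREATE TABLE us_department_of_labor_form_5500_broker_index (
    ACK_ID TEXT, INSURANCE_POLICY_ID TEXT, BROKER_NAME TEXT, BROKER_COMMISSION REAL);

-- Placeholder rows for illustration only.
INSERT INTO us_department_of_labor_form_5500_filing_index
    VALUES ('ACK-EXAMPLE-1', 'Example Sponsor');
INSERT INTO us_department_of_labor_form_5500_policy_index
    VALUES ('ACK-EXAMPLE-1', 'POLICY-EXAMPLE-1', 'Example Carrier');
INSERT INTO us_department_of_labor_form_5500_broker_index
    VALUES ('ACK-EXAMPLE-1', 'POLICY-EXAMPLE-1', 'Example Broker A', 10.0),
           ('ACK-EXAMPLE-1', 'POLICY-EXAMPLE-1', 'Example Broker B', 5.0);
""")

# Assumed join logic: broker -> policy on both keys, broker -> filing on ACK_ID.
query = """
SELECT f.SPONSOR_NAME, p.CARRIER_NAME, b.BROKER_NAME, b.BROKER_COMMISSION
FROM us_department_of_labor_form_5500_broker_index AS b
JOIN us_department_of_labor_form_5500_policy_index AS p
  ON b.ACK_ID = p.ACK_ID
 AND b.INSURANCE_POLICY_ID = p.INSURANCE_POLICY_ID
JOIN us_department_of_labor_form_5500_filing_index AS f
  ON b.ACK_ID = f.ACK_ID
ORDER BY b.BROKER_NAME
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A policy can list more than one broker, so the result has one row per broker rather than one row per policy.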

9/15/23 - Added healthcare provider emails; combined TELEPHONE and TELEPHONE_EXTENSION into one field; changed TELEPHONE to array to accommodate numerous values
  • Added EMAIL field to the nppes_provider_addresses table with provider emails per address.

  • Combined TELEPHONE and TELEPHONE_EXTENSION from the nppes_provider_addresses table into a single field, TELEPHONE, and removed the TELEPHONE_EXTENSION field.

  • Aggregated all values for TELEPHONE, FAX, and EMAIL that are associated with the same NPI and address into arrays in one row. Rows in the nppes_provider_addresses table are now uniquely defined by NPI and full address.
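The aggregation described in the last bullet can be pictured with the Python sketch below. It is not Cybersyn's actual pipeline. The single ADDRESS column stands in for the full address, all sample values are placeholders, and dropping duplicate values within an array is an assumption.

```python
from collections import defaultdict

# Hypothetical pre-aggregation rows: one contact value per row.
raw_rows = [
    {"NPI": "0000000001", "ADDRESS": "1 Example St", "TELEPHONE": "555-0100", "FAX": None, "EMAIL": "a@example.com"},
    {"NPI": "0000000001", "ADDRESS": "1 Example St", "TELEPHONE": "555-0101", "FAX": "555-0199", "EMAIL": "a@example.com"},
    {"NPI": "0000000001", "ADDRESS": "2 Example Ave", "TELEPHONE": "555-0102", "FAX": None, "EMAIL": None},
]

CONTACT_FIELDS = ("TELEPHONE", "FAX", "EMAIL")


def aggregate_contacts(rows):
    """Collapse rows to one per (NPI, ADDRESS), collecting contact values into lists."""
    grouped = defaultdict(lambda: {field: [] for field in CONTACT_FIELDS})
    for row in rows:
        key = (row["NPI"], row["ADDRESS"])
        for field in CONTACT_FIELDS:
            value = row[field]
            # Skip missing values; dropping duplicates is an assumption.
            if value is not None and value not in grouped[key][field]:
                grouped[key][field].append(value)
    return [
        {"NPI": npi, "ADDRESS": address, **contacts}
        for (npi, address), contacts in grouped.items()
    ]


aggregated = aggregate_contacts(raw_rows)
for row in aggregated:
    print(row)
```

After aggregation, each (NPI, address) pair appears exactly once, which matches the uniqueness rule stated above.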

9/10/23 - Added table to relate taxonomy classifications to practitioners' license numbers

Added the NPPES_PROVIDER_TAXONOMY_AND_LICENSE_NUMBERS table that relates taxonomy classifications to practitioners' license numbers. This table provides users the ability to filter for license numbers based on practitioners' primary taxonomy.

9/6/23 - Added contact information and broker payments to Form 5500 data
  • Added new fields to us_department_of_labor_form_5500_filing_index with Form 5500 contact information including name and phone numbers: ADMIN_SIGNED_NAME, SPONSOR_SIGNED_NAME, DIRECT_FILING_ENTITY_SIGNED_NAME, ADMIN_PHONE_NUM and SPONSOR_DIRECT_FILING_ENTITY_PHONE_NUM.

  • Added new fields to us_department_of_labor_form_5500_policy_index with payments to agents and brokers: COMMISSIONS_PAID_TO_BROKER and FEES_PAID_TO_BROKER.

  • Removed ~10k rows from us_department_of_labor_form_5500_policy_index that had NULL values in every field because the corresponding filings reported no insurance policy data on Form 5500 Schedule A.

8/31/23 - Added deactivated NPI numbers to NPPES data

Added a new table, nppes_npi_index, that contains information on when NPIs were first issued, deactivated, or reactivated - dating back to 2005. This table also includes a boolean flag to indicate if an NPI is currently active.

Note that while all NPIs appear in the nppes_npi_index table, only actively registered NPIs as well as NPIs deactivated after August 1, 2023 appear in the nppes_practitioner_attributes and nppes_organization_attributes tables. This means the dataset does not include attribute-level data (names, type of providers, specialization) on providers with NPIs deactivated before August 2023.
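One practical consequence is that a lookup from nppes_npi_index into the attribute tables can come back empty. The Python sketch below shows how to spot such NPIs. The IS_ACTIVE field name and all sample NPIs are hypothetical, and the set stands in for the NPIs present in either attribute table.

```python
# Hypothetical rows from nppes_npi_index; IS_ACTIVE is an assumed flag name.
npi_index = [
    {"NPI": "0000000001", "IS_ACTIVE": True},
    {"NPI": "0000000002", "IS_ACTIVE": False},  # e.g. deactivated before August 2023
    {"NPI": "0000000003", "IS_ACTIVE": False},  # e.g. deactivated after August 1, 2023
]

# Stand-in for the NPIs found in nppes_practitioner_attributes or
# nppes_organization_attributes.
attribute_npis = {"0000000001", "0000000003"}


def npis_without_attributes(index_rows, npis_with_attributes):
    """Return NPIs that appear in the index but have no attribute-level data."""
    return [row["NPI"] for row in index_rows if row["NPI"] not in npis_with_attributes]


missing = npis_without_attributes(npi_index, attribute_npis)
print(missing)
```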

7/3/23 - Added Form 5500 insurer information

Added information from US Department of Labor Form 5500 filings about company benefit providers and the insurance/benefit plans they offer. New tables include us_department_of_labor_form_5500_filing_index and us_department_of_labor_form_5500_policy_index.


Sources for the data in this dataset are listed on the individual source pages. Links to provider terms and disclaimers are provided where appropriate.

Cybersyn is not endorsed by or affiliated with any of these providers. Contact for questions.

Copyright © 2024 Cybersyn