US Census Bureau

Overview

The US Census Bureau publishes data about the American people and economy.

The agency's American Community Survey (ACS) is an ongoing survey that provides the most up-to-date social, economic, housing, and demographic statistics available. Both 1-year and 5-year estimates are published annually for various geographic levels, including state, city, zip code, and census block group. Unlike the Decennial Census, the American Community Survey is published every year and is sent to only a subset of US addresses (~3.5 million).

Example topics covered:

  • Income by zip code

  • Population growth by city

  • Homeownership rates

  • Monthly rent amount

  • Educational attainment

  • Poverty rates

  • Commuting/journey to work

  • Fertility

  • Internet access

Metrics from the US Decennial Census, also published by the US Census Bureau, are included in the Data Commons tables detailed here.

An indicator of future construction activity, the US Census Bureau's Building Permits Survey provides the number and valuation of new housing units authorized by building permits at the national, state, and local levels. Additionally, Construction Spending data quantifies actual expenditures on construction projects, categorized into residential, non-residential, and public sectors.

Example topics covered:

  • Number of new housing units authorized by building permit, by state & structure type

  • Total valuation of building permits by state & structure type

  • Total dollar value of construction work done each month on new structures or improvements to existing structures for private and public sectors

Key Attributes

As with all Public Domain datasets, Cybersyn aims to release data on Snowflake Marketplace as soon as the underlying source releases new data. We check periodically for changes to the underlying source and, upon detecting a change, propagate the data to Snowflake Marketplace immediately. See our release process for more details.

Notes

The AMERICAN_COMMUNITY_SURVEY_ATTRIBUTES table describes each ingested ACS variable:

  • The American Community Survey population variables cover many subjects (from top-line population estimates to social, housing, economic, and demographic topics).

  • The variable and variable name are both unique fields. The variable is the ACS alphanumeric code for a population table. Its characters encode, in order, the type of table, the subject of the table, the table number within a subject, race, and an identifier for Puerto Rico geographies. See here for more details. All of this information has been parsed into the variable name and other fields in the attributes table.

  • The series level columns in the attributes table describe characteristics of each variable. For example, ACS series B01001 describes Sex by Age. ACS data is hierarchical in nature.

    • B01001_001 refers to total population.

    • B01001_002 introduces the first characteristic for this series: sex. This variable represents the male population. "Male" is found in the series level 1 column.

    • B01001_003 introduces the second characteristic for this series: age. This variable represents the male population aged under 5 years. "Male" is found in the series level 1 column; "Under 5 years" is found in the series level 2 column.

    • B01001_004 represents the male population aged 5 to 9 years. "Male" is found in the series level 1 column; "5 to 9 years" is found in the series level 2 column.

    • This pattern continues.

    • It is important to note that each series has its own hierarchical structure. This means that sex will not always be the characteristic of series level 1 for a particular series id, and age will not always be the characteristic of series level 2. When choosing which variables to work with, it is recommended to evaluate a set of variables by series id before narrowing down, if necessary.
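
The code structure described above can be sketched as a small parser. This is only an illustration under stated assumptions: it assumes codes shaped like the `B01001_003` examples in this section (one table-type letter, two subject digits, three table-number digits, an optional race letter A through I, an optional `PR` marker, then an underscore and a three-digit line). The names `AcsVariable` and `parse_acs_variable` are hypothetical and are not part of any Cybersyn or Census API. The attributes table already exposes this information as columns, so a parser like this is only useful for ad hoc checks.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: <type><subject><table number>[race][PR]_<line>
_ACS_VARIABLE = re.compile(
    r"^(?P<table_type>[A-Z])"      # type of table, e.g. "B"
    r"(?P<subject>\d{2})"          # subject of the table, e.g. "01"
    r"(?P<table_number>\d{3})"     # table number within the subject
    r"(?P<race>[A-I])?"            # optional race iteration letter
    r"(?P<puerto_rico>PR)?"        # optional Puerto Rico marker
    r"_(?P<line>\d{3})$"           # line within the table, e.g. "003"
)


class AcsVariable(NamedTuple):
    table_type: str
    subject: str
    table_number: str
    race: Optional[str]
    puerto_rico: bool
    line: str


def parse_acs_variable(code: str) -> AcsVariable:
    """Split an ACS variable code into its components (hypothetical helper)."""
    match = _ACS_VARIABLE.match(code)
    if match is None:
        raise ValueError(f"unrecognized ACS variable code: {code!r}")
    return AcsVariable(
        table_type=match["table_type"],
        subject=match["subject"],
        table_number=match["table_number"],
        race=match["race"],  # None when no race letter is present
        puerto_rico=match["puerto_rico"] is not None,
        line=match["line"],
    )


print(parse_acs_variable("B01001_003"))
```

If the codes in your copy of the data carry extra suffixes, strip them before parsing; the pattern above deliberately rejects anything it does not recognize.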

ACS Measurement Type: ACS provides both population estimates and margins of error (MOE). The MOEs are provided at a 90% confidence level and represent the precision of the ACS estimate. From these MOEs, users can calculate the lower and upper bounds of a range expected to contain the true value of a population estimate 90% of the time.
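
The bounds calculation described above amounts to subtracting the MOE from the estimate and adding it to the estimate. A minimal sketch follows; the function name `moe_bounds` and the sample numbers are illustrative assumptions, not values from the dataset.

```python
from typing import Tuple


def moe_bounds(estimate: float, moe: float) -> Tuple[float, float]:
    """Return the (lower, upper) bounds of the 90% range: estimate -/+ MOE."""
    if moe < 0:
        raise ValueError("margin of error must be non-negative")
    return estimate - moe, estimate + moe


# Hypothetical estimate of 12,500 people with a margin of error of 300.
lower, upper = moe_bounds(estimate=12_500, moe=300)
print(f"90% range: {lower} to {upper}")
```

In practice the estimate and its MOE are separate rows for the same geography and date, so they need to be paired up before a calculation like this can be applied.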

ACS Measurement Period: The American Community Survey provides both 1-year and 5-year estimates annually.

  • 1-year estimates are based on 12 months of collected data (e.g., data for 2022 ACS 1-year estimates was collected between January 1, 2022 and December 31, 2022). 1-year estimates are only released for geographies with populations of 65,000+. This data is available from 2005 onwards, depending on the variable.

  • 5-year estimates are based on 60 months of collected data (e.g., data for 2022 ACS 5-year estimates was collected between January 1, 2018 and December 31, 2022). This data is available from 2009 onwards, depending on the variable.
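
To make the two collection windows concrete, the sketch below derives the start and end dates from an estimate year and a measurement period, following the 2022 examples above. The helper names are hypothetical; the `'1YR'` and `'5YR'` labels match the `measurement_period` values used in the sample queries in this document.

```python
from datetime import date
from typing import Tuple

# Number of calendar years of collected data behind each measurement period.
_PERIOD_YEARS = {"1YR": 1, "5YR": 5}

# 1-year estimates are only released for geographies of this size or larger.
ONE_YEAR_MIN_POPULATION = 65_000


def collection_window(estimate_year: int, measurement_period: str) -> Tuple[date, date]:
    """Return (first day, last day) of data collection for an ACS estimate."""
    years = _PERIOD_YEARS[measurement_period]
    return date(estimate_year - years + 1, 1, 1), date(estimate_year, 12, 31)


def has_one_year_estimates(population: int) -> bool:
    """True if a geography is large enough to receive 1-year estimates."""
    return population >= ONE_YEAR_MIN_POPULATION


print(collection_window(2022, "1YR"))
print(collection_window(2022, "5YR"))
```

One consequence worth keeping in mind: consecutive 5-year estimates share four of their five collection years, so year-over-year comparisons of 5-year data reflect overlapping samples.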

ACS Data Release: Data is released at different times based on the measurement period (1-year vs. 5-year estimates) and the population size of the geography. See here for the 2022 data release schedule.

The AMERICAN_COMMUNITY_SURVEY_TIMESERIES table records the history of each variable by geographic entity:

  • This table can be joined to the GEOGRAPHY_INDEX table to easily search based on a geography type ("level") or specific geography (e.g., state of Pennsylvania).

ACS Source Data Limitations:

  • Due to the COVID-19 pandemic, the Census Bureau did not release its standard 2020 ACS 1-year estimates. It developed experimental estimates instead, which are currently not available.

  • We currently collect ACS data for the following geographic entities: country, state, core-based statistical area (CBSA), county, city, zip code, census tract, and census block group. We currently only collect a subset of the commonly used variables available for census tract and census block group. Please reach out to support@cybersyn.com if there are additional ACS series or geographies that you need to access.

Building Permits Survey (BPS): This data provides two different measurements for each variable: (1) estimates with imputation and (2) reported only. Estimates with imputation include reported data for respondents and imputed data for nonrespondents. Reported only data excludes imputed data for nonrespondents.

Slightly less than half of the permit-issuing places in the United States are surveyed monthly. The remainder of places are surveyed annually. Monthly statistics are obtained by directly cumulating the data for all places that are requested to report monthly. Annual statistics are obtained by accumulating data for all places, both monthly and annual reporters.

The Building Permits Survey covers all "permit-issuing places," which are jurisdictions that issue building or zoning permits. Zoning permits are used only for areas that do not require building permits but require zoning permits. Areas for which no authorization is required to construct a new privately-owned housing unit are not included in the survey. For more information on the sample design and reliability of the data, see here.

EAV Model: All Cybersyn products follow the EAV (entity, attributes, value) model with a unified schema. Entities are tangible objects (e.g. geography, company) that Cybersyn provides data on. All timeseries' dates and values that refer to the entity are included in a timeseries table. Descriptors of the timeseries are included in an attributes table. Data is joinable across all Cybersyn products that have a GEO_ID. Refer to Cybersyn Concepts for more details.
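
The EAV layout can be illustrated with a toy in-memory database. This sketch uses Python's built-in `sqlite3` module rather than Snowflake. The `example_timeseries` and `example_attributes` tables, the `geoId/42` identifier, and every value in them are made-up placeholders; only the column names (`geo_id`, `variable`, `date`, `value`, `variable_name`, `unit`, `geo_name`, `level`) are taken from the sample queries in this document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Entity table: one row per geography.
    CREATE TABLE geography_index (geo_id TEXT PRIMARY KEY, geo_name TEXT, level TEXT);
    -- Attributes table: one row describing each variable.
    CREATE TABLE example_attributes (variable TEXT PRIMARY KEY, variable_name TEXT, unit TEXT);
    -- Timeseries table: one row per entity, variable, and date.
    CREATE TABLE example_timeseries (geo_id TEXT, variable TEXT, date TEXT, value REAL);

    -- Placeholder rows for illustration only; not real data.
    INSERT INTO geography_index VALUES ('geoId/42', 'Pennsylvania', 'State');
    INSERT INTO example_attributes VALUES ('EXAMPLE_VAR', 'Example metric', 'Count');
    INSERT INTO example_timeseries VALUES ('geoId/42', 'EXAMPLE_VAR', '2021-12-31', 100.0);
    INSERT INTO example_timeseries VALUES ('geoId/42', 'EXAMPLE_VAR', '2022-12-31', 110.0);
    """
)

# Values come from the timeseries table, descriptors from the attributes
# table, and the readable geography name from the shared geography index.
rows = conn.execute(
    """
    SELECT geo.geo_name, att.variable_name, ts.date, ts.value, att.unit
    FROM example_timeseries AS ts
    JOIN example_attributes AS att ON ts.variable = att.variable
    JOIN geography_index AS geo ON ts.geo_id = geo.geo_id
    WHERE geo.level = 'State'
    ORDER BY ts.date
    """
).fetchall()

for row in rows:
    print(row)
```

Because every timeseries table shares the same key columns, the same two joins apply regardless of which product a variable comes from.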

Tables & Sources

Census Bureau data is also used in Cybersyn's core geography tables.

Cybersyn Products

ACS tables above are available in the following Cybersyn data products:

Examples & Sample Queries

Getting started by ACS series ID

SELECT
    geo.geo_name AS state,
    att.variable_name,
    att.series_level_1,
    att.series_level_2,
    ts.date,
    ts.value,
    att.unit
FROM cybersyn.american_community_survey_timeseries ts
JOIN cybersyn.american_community_survey_attributes att
ON ts.variable = att.variable
JOIN cybersyn.geography_index geo
ON ts.geo_id = geo.geo_id
WHERE att.series_id = 'B01001'
  AND att.measurement_type = 'Estimate'
  AND att.measurement_period = '1YR'
  AND att.series_level_2 IS NOT NULL -- Filtering to the age-specific variables
  AND geo.level = 'State' -- Filtering to state-level data
ORDER BY ts.date, geo.geo_name, att.variable;

Find the ACS variables covering disability status at the state level

SELECT DISTINCT att.*
FROM marketplace_cybersyn.cybersyn.american_community_survey_attributes att
JOIN marketplace_cybersyn.cybersyn.american_community_survey_timeseries ts
ON att.variable = ts.variable
JOIN marketplace_cybersyn.cybersyn.geography_index geo
ON ts.geo_id = geo.geo_id
WHERE att.census_subject = 'Disability Status'
  AND att.measurement_type = 'Estimate'
  AND geo.level = 'State'
ORDER BY 1,2;

Total Population by Census Block Group

SELECT
    geo.geo_name AS block_group,
    att.variable_name,
    ts.date,
    ts.value,
    att.unit
FROM cybersyn.american_community_survey_timeseries ts
JOIN cybersyn.american_community_survey_attributes att
ON ts.variable = att.variable
JOIN cybersyn.geography_index geo
ON ts.geo_id = geo.geo_id
WHERE att.series_id = 'B01003'
  AND att.measurement_period = '5YR'
  AND att.measurement_type = 'Estimate'
  AND geo.level = 'CensusBlockGroup'
  AND ts.date >= '2018-01-01'
  AND ts.value > 0
ORDER BY ts.date, geo.geo_name;

Evaluate population by sex for a geographic entity level

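Find the population by sex for a block group in Manhattan in the most recent published data.

Rank zip codes by population growth

Find the 10 zip codes with the fastest year-over-year population growth in each year since 2012, limited to zip codes with more than 25,000 residents.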

WITH zip_stats AS (
    SELECT
        YEAR(ts.date) AS year,
        ts.geo_id AS zip,
        rship.related_geo_name AS state,
        ts.value AS population,
        LAG(ts.value, 1) OVER (PARTITION BY zip ORDER BY year ASC) AS prev_year_population,
        population / prev_year_population - 1 AS pct_growth,
        population - prev_year_population AS absolute_change
    FROM cybersyn.american_community_survey_timeseries AS ts
    JOIN cybersyn.american_community_survey_attributes AS att
        ON ts.variable = att.variable
    JOIN cybersyn.geography_index AS geo
        ON ts.geo_id = geo.geo_id
    JOIN cybersyn.geography_relationships AS rship
        ON ts.geo_id = rship.geo_id AND rship.related_level = 'State'
    WHERE
        att.series_type = 'Total Population'
        AND att.measurement_type = 'Estimate'
        AND att.measurement_period = '5YR'
        AND geo.level = 'CensusZipCodeTabulationArea'
        AND ts.value > 25000
)

SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY year ORDER BY pct_growth DESC NULLS LAST) AS annual_rank
FROM zip_stats
WHERE year >= 2012
QUALIFY ROW_NUMBER() OVER (PARTITION BY year ORDER BY pct_growth DESC NULLS LAST) <= 10
ORDER BY year, annual_rank;
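
The window logic in the query above (previous value per zip code, percent growth, top N per year) can be traced on a toy dataset in plain Python. This is a sketch of the arithmetic only, not a replacement for the query: the function name `top_growth` and the sample rows are hypothetical, and unlike the SQL it omits rows that have no previous value instead of ranking them last.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def top_growth(
    rows: Iterable[Tuple[str, int, float]], top_n: int = 10
) -> Dict[int, List[Tuple[str, float]]]:
    """rows are (zip_code, year, population); returns the top growers per year."""
    # Group populations by zip code, mirroring PARTITION BY zip.
    by_zip: Dict[str, Dict[int, float]] = defaultdict(dict)
    for zip_code, year, population in rows:
        by_zip[zip_code][year] = population

    growth_by_year: Dict[int, List[Tuple[str, float]]] = defaultdict(list)
    for zip_code, series in by_zip.items():
        years = sorted(series)
        # Pair each year with the previous available one, mirroring LAG(value, 1).
        for prev_year, year in zip(years, years[1:]):
            pct_growth = series[year] / series[prev_year] - 1
            growth_by_year[year].append((zip_code, pct_growth))

    # Keep the top_n fastest growers per year, mirroring the QUALIFY clause.
    return {
        year: sorted(items, key=lambda item: item[1], reverse=True)[:top_n]
        for year, items in sorted(growth_by_year.items())
    }


# Made-up populations for two placeholder zip codes.
sample = [("A", 2021, 30_000), ("A", 2022, 33_000), ("B", 2021, 40_000), ("B", 2022, 42_000)]
print(top_growth(sample, top_n=1))
```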

Disclaimers

The data in this product is sourced from the US Census Bureau. This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.

Cybersyn is not endorsed by or affiliated with any of these providers. Contact support@cybersyn.com for questions.

Copyright © 2024 Cybersyn