National Address Database, OpenAddresses, Overture Maps Foundation

Overview

The National Address Database (NAD) is a US Department of Transportation initiative to create a single, public-domain database of address information, focusing on accuracy and nationwide coverage to support various governmental and emergency response needs.

Overture Maps Foundation is an open data project steered by Amazon, Meta, Microsoft, and TomTom that aggregates map data from multiple sources and shares locations of points of interest. It aims to provide a comprehensive, open-source, global mapping platform that includes address data among various other geographical and location-based information, enhancing the detail and usability of open mapping data.

OpenAddresses is an open source project that compiles free and open global address data, aggregating crowdsourced and publicly available datasets to create a comprehensive, unified resource.

Example topics covered:

  • Points of interest

  • Business locations

  • Street names

  • House numbers

  • Postal codes

  • Longitude and latitude coordinates

Key Attributes

Geographic Coverage

United States

Entity Level

Address, Point of Interest

Release Frequency

NAD: Quarterly, exact timing varies OpenAddresses: Weekly, Sunday ~3:30pm ET Overture Maps: Roughly twice a quarter though a regular release schedule has not been established

As with all Public Domain datasets, Cybersyn aims to release data on Snowflake Marketplace as soon as the underlying source releases new data. We check periodically for changes to the underlying source and, upon detecting a change, propagate the data to Snowflake Marketplace immediately. See our release process for more details.

Notes

Each address record includes latitude and longitude coordinates for geolocation. These coordinates can be combined with the geospatial boundaries included in the geography_characteristics table. Geospatial boundaries are provided as GeoJSON and WKT polygons and are represented as coordinates. These geospatial boundaries are also referred to as โ€shapefilesโ€, โ€œgeographic boundaries,โ€ โ€œbounding coordinates,โ€ and โ€œgeographic area coordinates.โ€

Each POI includes the name, location, category or type of place or business, and a POI_ID unique identifier. POIs can be mapped back to addresses using the relationships table.

Address Normalization: In the us_addresses table, Cybersyn normalizes the street names, city, state, and zip codes for each address line in the dataset using our geography_index table to create consistency across city names (e.g., โ€œSaint Paul, MNโ€ vs. โ€œSt. Paul, MNโ€) and to verify accuracy. Street abbreviations are also standardized in the data (e.g., โ€œRdโ€ -> โ€œRoadโ€).

Zip codes are determined using the address coordinates in combination with geospatial data from the US Census Bureau and are validated using data from the US Postal Service (USPS).

Point of Interest to Address Mapping: Note that more than one point of interest can map to a single address. For example, a fast food restaurant might share a location with a gas station or numerous doctors might have their own practices at a single address.

EAV Model: All Cybersyn products follow the EAV (entity, attributes, value) model with a unified schema. Entities are tangible objects (e.g. geography, company) that Cybersyn provides data on. Index tables contain all entities of a certain type. Timeseries tables contain all timeseries' dates and values that refer to an entity type. Additional tables, such as the relationships table and attributes table, are used to describe the entities and timeseries. Data is joinable across all Cybersyn products that have a GEO_ID. Refer to Cybersyn Concepts for more details

Tables & Sources

TableSource(s)

US_ADDRESSES

POINT_OF_INTEREST_INDEX POINT_OF_INTEREST_ADDRESSES_RELATIONSHIPS

Cybersyn Products

Tables above are available in the following Cybersyn data products:

Examples & Sample Queries

Find the nearest competitor to a given merchant

Find the closest Loweโ€™s to any given Home Depot location

WITH joined_data AS (
    SELECT poi.poi_id, poi.poi_name, addr.longitude, addr.latitude,
           addr.number, addr.street_directional_prefix, addr.street,
           addr.street_type, addr.street_directional_suffix,
           addr.unit, addr.city, addr.state, addr.zip
    FROM cybersyn.point_of_interest_index AS poi
    JOIN cybersyn.point_of_interest_addresses_relationships AS map
        ON (poi.poi_id = map.poi_id)
    JOIN cybersyn.us_addresses AS addr
        ON (map.address_id = addr.address_id)
)
SELECT *,
       ST_DISTANCE(
           ST_MAKEPOINT(home_depot.longitude, home_depot.latitude),
           ST_MAKEPOINT(lowes.longitude, lowes.latitude)
       ) / 1609 AS distance_miles
FROM joined_data AS home_depot
JOIN joined_data AS lowes
WHERE home_depot.poi_name = 'The Home Depot'
  AND lowes.poi_name = 'Lowe''s Home Improvement'
QUALIFY ROW_NUMBER() OVER (PARTITION BY home_depot.poi_id ORDER BY distance_miles NULLS LAST) = 1;

Query all POIs of a specific type (e.g., coffee shop) within a given ZIP code

Generate a list of all coffee shops in a given ZIP code along with their addresses

SELECT
    poi.poi_name,
    poi.category_main,
    poi.category_alternate,
    addr.number,
    addr.street,
    addr.street_directional_prefix,
    addr.street,
    addr.street_type,
    addr.street_directional_suffix,
    addr.unit,
    addr.city,
    addr.state,
    addr.zip
FROM cybersyn.point_of_interest_index AS poi
JOIN cybersyn.point_of_interest_addresses_relationships AS map
    ON (poi.poi_id = map.poi_id)
JOIN cybersyn.us_addresses AS addr
    ON (map.address_id = addr.address_id)
WHERE addr.zip = '10003'
  AND poi.category_main = 'Coffee Shop';

Find addresses within a zip code to send direct mail campaign to

Query US addresses by zip code to find relevant addresses in your target area

SELECT NUMBER, STREET, STREET_TYPE, CITY, STATE, ZIP
FROM CYBERSYN.US_ADDRESS
WHERE ZIP = '02114'
LIMIT 5;

Reverse geocoding

Query addresses near longitude and latitude coordinates to find nearby addresses

SELECT LONGITUDE, LATITUDE, NUMBER, STREET, STREET_TYPE, CITY, STATE, ZIP
FROM CYBERSYN.US_ADDRESS
WHERE LONGITUDE BETWEEN  -74.5 AND -74
    AND LATITUDE BETWEEN 40.0 AND 40.5
LIMIT 5;

Use geographic boundaries to filter addresses

Query addresses in the largest zip code within a US state (e.g., Florida) by total tabulation area

WITH zip_areas AS (
    SELECT
        geo.geo_id,
        geo.geo_name AS zip,
        states.related_geo_name AS state,
        countries.related_geo_name AS country,
        ST_AREA(TRY_TO_GEOGRAPHY(value)) AS area
    FROM cybersyn.geography_index AS geo
    JOIN cybersyn.geography_relationships AS states
        ON (geo.geo_id = states.geo_id AND states.related_level = 'State')
    JOIN cybersyn.geography_relationships AS countries
        ON (geo.geo_id = countries.geo_id AND countries.related_level = 'Country')
    JOIN cybersyn.geography_characteristics AS chars
        ON (geo.geo_id = chars.geo_id AND chars.relationship_type = 'coordinates_geojson')
    WHERE geo.level = 'CensusZipCodeTabulationArea'
),

zip_area_ranks AS (
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY country, state ORDER BY area DESC, geo_id) AS zip_area_rank
    FROM zip_areas
)

SELECT addr.number, addr.street, addr.street_type, addr.city, addr.state, addr.zip, areas.country
FROM cybersyn.us_addresses AS addr
JOIN zip_area_ranks AS areas
    ON (addr.id_zip = areas.geo_id)
WHERE addr.state = 'FL' AND areas.country = 'United States' AND areas.zip_area_rank = 1
LIMIT 10;

Errata & Future Improvements

We note known issues and planned future improvements. If you would like to submit a bug report or feature request, email us at support@cybersyn.com.

  • The addresses for a small fraction of locations are incorrectly parsed. In particular, addresses of non-standard format such street intersections may be parsed incorrectly. The STATE, CITY, ZIP, and coordinates for these addresses are generally correct, but the STREET, NUMBER and UNIT may contain errors in these cases.

Disclaimers

The data in this product is sourced from the following:

  • National Address Database (NAD) published by the US Department of Transportation

  • Overture Maps Foundation

  • OpenAddresses. Copyright (c) 2023 OpenAddresses All rights reserved.

Cybersyn is not endorsed by or affiliated with any of these providers. Contact support@cybersyn.com for questions.

Last updated

Copyright ยฉ 2024 Cybersyn