Crime counts & classifications for major cities at zip code level
Police department crime data from New York, Los Angeles, San Francisco, Houston, Chicago, and Seattle. This dataset has been reformatted to cover date of occurrence, offense classification/description, and estimated zip code location for crimes.
All Cybersyn products follow the EAV (entity, attribute, value) model with a unified schema. Entities are tangible objects (e.g. geography, company). Entities may have characteristics (i.e. descriptors of the entity) in an index table and values (i.e. statistics, measures) in a timeseries table. Data is joinable across all Cybersyn products that have a GEO_ID. Refer to Cybersyn Concepts for more details.
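Because values are stored in long (EAV) form, each measure appears as a value of variable_name rather than as its own column. A minimal sketch for discovering which measures are available, assuming the timeseries table name and the variable_name column used in the example queries on this page:

```sql
-- Sketch: list the distinct measures stored in the crime timeseries table.
-- Assumes the table and column names used in the use-case queries on this page.
SELECT DISTINCT variable_name
FROM cybersyn.urban_crime_timeseries
ORDER BY variable_name;
```

The exact strings returned here are what the filters on variable_name in the use cases need to match.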
As with all Public Domain datasets, Cybersyn aims to release data on Snowflake Marketplace as soon as the underlying source releases new data. We check periodically for changes to the underlying source and, upon detecting a change, propagate the data to Snowflake Marketplace immediately. See our release process for more details.
Cybersyn has normalized the data with the following changes:
- Each jurisdiction currently uses a different offense classification system: Houston uses NIBRS (the new national standard), Chicago uses IUCR, and NYC uses the NY State Penal Code. As a result, different cities have different offense codes for similar crimes. Cybersyn mapped these granular crime codes to broad "offense categories" using Chicago's IUCR system. Most cities are expected to transition to NIBRS in the near future. See the ‘reporting_system’ column for the code system used.
- Jurisdictions that use NIBRS may log crimes at the offense level starting in 2021. For these jurisdictions, an incident may have multiple rows, one for each offense reported. Jurisdictions using older classification systems have only one row per incident, classified by the most serious offense recorded. See the ‘reporting_level’ column for the level at which each incident was reported.
- When unavailable in the source data, zip codes are mapped based on incident lat/long or reported address location.
- Chicago, San Francisco, and Seattle data is updated daily; Los Angeles data is updated weekly; Houston data is updated monthly; NYC data is updated quarterly.
Use Case: Historical crime incidents in a location
Time series of crime incidents in a specific zip code
SELECT ts.date,
       ts.variable_name,
       ts.value
FROM cybersyn.urban_crime_timeseries AS ts
JOIN cybersyn.geography_index AS geo
    ON (ts.geo_id = geo.geo_id)
WHERE geo.geo_name = '60620' --zip code of interest
  AND ts.variable_name = 'Daily count of incidents, all incidents'
ORDER BY ts.date;
Use Case: Locations with the highest level of crime
List of zip codes with highest levels of specific crimes (e.g., theft) in 2020
SELECT geo_id,
       YEAR(date)::STRING AS year,
       SUM(value) AS annual_incidents
FROM cybersyn.urban_crime_timeseries
WHERE YEAR(date) = 2020
  AND variable_name = 'Daily count of incidents, theft'
GROUP BY geo_id, year
ORDER BY annual_incidents DESC;
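To label the ranking with zip codes rather than GEO_IDs, the aggregate can be joined to the geography index, as in the first use case. This is a sketch under two assumptions: that geo_name holds the zip code for these rows (as the filter in the first query suggests), and that a cutoff of 10 rows is wanted (the LIMIT value is arbitrary):

```sql
-- Sketch: top 10 zip codes by theft incidents in 2020, labeled by zip code.
-- Assumes geo.geo_name holds the zip code for these GEO_IDs.
SELECT ts.geo_id,
       geo.geo_name AS zip_code,
       SUM(ts.value) AS annual_incidents
FROM cybersyn.urban_crime_timeseries AS ts
JOIN cybersyn.geography_index AS geo
    ON (ts.geo_id = geo.geo_id)
WHERE YEAR(ts.date) = 2020
  AND ts.variable_name = 'Daily count of incidents, theft'
GROUP BY ts.geo_id, geo.geo_name
ORDER BY annual_incidents DESC
LIMIT 10; -- arbitrary cutoff for illustration
```

Grouping by both geo_id and geo_name keeps one row per entity even if two entities were to share a name.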
There are no updates at this time.