Tech & Innovation Essentials
Technology-focused reference and activity data, such as web domains, patents, and GitHub repos.
This product includes technology-focused reference and activity data centered around various new innovations in the tech space.
Example topics covered include:
- A repository of over 300M web domains
- GitHub repository events and stars
- Patent applications and inventor information
Geographic Coverage | Global |
Entity Level | IMEI, Domain, GitHub repository, Contributor, Patent |
Update Frequency |
Cybersyn has cleaned and aggregated over 300M domains into a single source to track websites globally. Each domain is standardized by stripping any protocol and subdomains (e.g., cybersyn.com), and the table includes helpful reference columns such as the “core” domain (cybersyn) and the public suffix domain (com).
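The standardization described above can be sketched in a few lines of Python. This is only an illustration, not Cybersyn's actual pipeline: the function name `normalize_domain` and the `domain` key are hypothetical, and the sketch assumes a single-label public suffix such as `com` or `ai`. Multi-label suffixes such as `co.uk` would need the Public Suffix List, which is not handled here.

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> dict:
    """Strip protocol, port, path, and subdomains from a URL or hostname.

    Hypothetical helper: assumes the public suffix is the last label only.
    """
    raw = raw.strip().lower()
    # urlsplit only recognizes the host when a scheme or leading // is present
    if "://" not in raw:
        raw = "//" + raw
    host = urlsplit(raw).hostname or ""
    labels = [part for part in host.split(".") if part]
    if len(labels) < 2:
        raise ValueError(f"cannot find a registrable domain in {raw!r}")
    core, suffix = labels[-2], labels[-1]
    return {
        "domain": f"{core}.{suffix}",    # hypothetical key name
        "core_domain": core,             # column name taken from the sample query
        "public_suffix_domain": suffix,  # column name taken from the sample query
    }


if __name__ == "__main__":
    print(normalize_domain("https://docs.cybersyn.com/some/page"))
```

Normalizing your own URL lists this way before joining against domain_index keeps the join keys in the same shape as the cleaned data.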
The GitHub Archive dataset offers access to public GitHub activity, presenting a look into open-source developers' contributions to repositories. See more details here.
The USPTO patent data includes patent grants in the US with publications dating back to January 1976. See more details here.
IMEI Type Allocation Code (TAC) data is sourced from the Open Source Mobile Communications (OSMOCOM) project, which maintains a database mapping each TAC to brand and model names. The data covers approximately 6,000 unique models. It includes links to GSMArena for both brand and model when available.
As with all Public Domain datasets, Cybersyn aims to release data on Snowflake Marketplace as soon as the underlying source releases new data. We check periodically for changes to the underlying source and, upon detecting a change, propagate the data to Snowflake Marketplace immediately. See our release process for more details.
Table Names | Source | Source Schedule |
---|---|---|
domain_index | Daily at 7am ET | |
github_events, github_repos, github_stars | Daily at 11pm ET | |
uspto_patent_index, uspto_contributor_index, uspto_patent_contributor_relationships | Weekly - Tuesday | |
IMEI_tac_device | OSMOCOM updates the data when they receive updates from their community. |
Cybersyn builds Streamlit demos to visualize the data available in this product and provide a jumping-off point.
Pull lists of websites
Screen for websites that are registered using the “.ai” suffix domain
SELECT domain_id, core_domain, public_suffix_domain
FROM cybersyn.domain_index
WHERE public_suffix_domain = 'ai'
LIMIT 500;
Use Case: TAC of the Apple iPhone 13
Look up the TACs for the iPhone 13 so you can count how many of these devices are connected to your network
SELECT tac, brand_name, model_name
FROM cybersyn.tac_device
WHERE brand_name = 'Apple' AND model_name = 'iPhone 13';
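To match devices on your network against the TACs returned by a query like the one above, you first need the TAC of each device. A minimal Python sketch follows, assuming your network logs hold standard 15-digit IMEIs, where the TAC is the first 8 digits and the last digit is a Luhn check digit. The helper names `luhn_valid`, `tac_from_imei`, and `count_by_tac` are hypothetical and are not part of the Cybersyn product.

```python
from collections import Counter


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def tac_from_imei(imei: str) -> str:
    """Extract the 8-digit TAC from a 15-digit IMEI, ignoring separators."""
    digits = "".join(ch for ch in imei if ch.isdigit())
    if len(digits) != 15 or not luhn_valid(digits):
        raise ValueError(f"not a valid 15-digit IMEI: {imei!r}")
    return digits[:8]


def count_by_tac(imeis) -> Counter:
    """Count devices per TAC, skipping values that are not valid IMEIs."""
    counts = Counter()
    for imei in imeis:
        try:
            counts[tac_from_imei(imei)] += 1
        except ValueError:
            continue
    return counts


if __name__ == "__main__":
    # Sample IMEI used purely for illustration
    print(count_by_tac(["490154203237518", "49-015420-323751-8", "not-an-imei"]))
```

The resulting per-TAC counts can then be filtered to the TAC values the query returns for the model you care about.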
Cleaned and aggregated over 300M domains into a single source, the new domain_index table, to track the list of websites globally.
Added GitHub Archive and US Patent Grants tables to the product, and rebranded the product from "IMEI Type Allocation Codes" to "Tech & Innovation Essentials."
The data in this dataset is sourced here. Links to provider licenses, terms and disclaimers are provided where appropriate:
Cybersyn is not endorsed by or affiliated with any of these providers. Contact [email protected] for questions.