Web Domains
Overview
Domain Project is the world's largest public internet domains dataset.
ICANN's Centralized Zone Data Service provides zone files for top-level domains (TLDs) via an online portal.
Majestic Million provides rankings for the top million domains with the most referring subnets.
Example topics covered:
- 300M+ cleaned web domains in a standardized format
- Redirect domains and the start/end date for which the redirect was observed
- Active/inactive status based on the HTTP response
Key Attributes
Geographic Coverage | Global
Entity Level | Domain
Notes
Cybersyn has cleaned and aggregated over 300M domains into a single source that tracks websites globally. Each domain is standardized by stripping away any protocol and subdomains (e.g., cybersyn.com), and helpful reference columns are included, such as the “core” domain (cybersyn) and the public suffix domain (com). For a subset of these domains, Cybersyn provides information on redirects, including the redirect domain and the start/end dates over which the redirect relationship was observed. Also included are details on whether a domain is active or inactive, based on the HTTP response status, and whether a domain is a primary landing page or a redirect.
Cybersyn periodically sends GET requests to domains to determine the response status code received and any redirect destinations.
Cybersyn Products
Tables above are available in the following Cybersyn data products:
Sample Queries
Pull a list of websites with a specific domain
Screen for websites that are registered using the “.ai” suffix domain
SELECT domain_id, core_domain, public_suffix_domain
FROM cybersyn.domain_index
WHERE public_suffix_domain = 'ai'
LIMIT 500;
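To see how domains are distributed across suffixes before filtering on one, the same table can be aggregated. This is a minimal sketch that uses only the columns shown in the query above; it assumes cybersyn.domain_index holds one row per domain, which this page does not confirm.

```sql
-- Count domains per public suffix and list the 20 most common suffixes.
-- Assumption: domain_index has one row per domain_id.
SELECT public_suffix_domain,
       COUNT(*) AS domain_count
FROM cybersyn.domain_index
GROUP BY public_suffix_domain
ORDER BY domain_count DESC
LIMIT 20;
```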
Pull a list of active websites with a specific domain
Select only domains that use the “.ai” top-level domain and for which Cybersyn's most recent HTTP response check was successful
SELECT domain_id
FROM cybersyn.domain_characteristics
WHERE domain_id ILIKE '%.ai'
AND relationship_type = 'successful_http_response_status'
AND value = 'true'
AND relationship_end_date IS NULL;
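The two sample queries can likely be combined to return the reference columns for active domains only. The sketch below assumes that domain_id is the shared key between cybersyn.domain_index and cybersyn.domain_characteristics; both tables expose a column of that name, but the join relationship is not stated on this page. The aliases di and dc are illustrative.

```sql
-- Active ".ai" domains with their core domain and public suffix.
-- Assumption: domain_id identifies the same domain in both tables.
SELECT di.domain_id,
       di.core_domain,
       di.public_suffix_domain
FROM cybersyn.domain_index AS di
JOIN cybersyn.domain_characteristics AS dc
  ON dc.domain_id = di.domain_id
WHERE di.public_suffix_domain = 'ai'
  AND dc.relationship_type = 'successful_http_response_status'
  AND dc.value = 'true'
  AND dc.relationship_end_date IS NULL  -- keep only the currently observed status
LIMIT 500;
```

Filtering on public_suffix_domain rather than a pattern match on domain_id keeps the suffix condition consistent with the first query.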
Disclaimers
The data in this product is sourced from the following:
- Domain Project: License; Copyright (c) 2020-2021, Bohdan Turkynewych. All rights reserved.
- ICANN
- Majestic Million: License
Cybersyn is not endorsed by or affiliated with any of these providers. Contact support@cybersyn.com for questions.