Web Domains

Overview

Domain Project is the world's largest public internet domains dataset.

ICANN's Centralized Zone Data Service provides zone files for top-level domains (TLDs) via an online portal.

Majestic Million provides rankings for the top million domains with the most referring subnets.

Example topics covered:

  • 300M+ cleaned web domains in a standardized format
  • Redirect domains and the start/end date for which the redirect was observed
  • Active/inactive status based on the HTTP response

Key Attributes

Geographic Coverage: Global
Entity Level: Domain

Notes

Cybersyn has cleaned and aggregated over 300M domains into a single source that tracks websites globally. Each domain is standardized by stripping the protocol and any subdomains (e.g., cybersyn.com) and is accompanied by reference columns such as the “core” domain (cybersyn) and the public suffix domain (com). For a subset of these domains, Cybersyn provides redirect information, including the redirect domain and the start/end dates over which the redirect relationship was observed. The data also indicates whether a domain is active or inactive, based on its HTTP response status, and whether it is a primary landing page or a redirect.
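
The standardization described above can be illustrated with a minimal Python sketch. It is not Cybersyn's actual pipeline: the function name `normalize_domain`, the output keys, and the tiny `KNOWN_SUFFIXES` set are all hypothetical, and a real implementation would most likely consult the full Public Suffix List rather than a hand-written set.

```python
from urllib.parse import urlsplit

# Hypothetical, tiny stand-in for the Public Suffix List (illustration only).
KNOWN_SUFFIXES = {"com", "ai", "org", "co.uk"}


def normalize_domain(url):
    """Strip the protocol and subdomains; return the domain, core, and suffix."""
    # urlsplit only recognizes the host when a scheme or leading "//" is present.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    host = (urlsplit(url).hostname or "").rstrip(".")
    labels = host.split(".")
    # Try the longest candidate suffix first, always leaving one label in front.
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in KNOWN_SUFFIXES:
            core = labels[i - 1]
            return {
                "domain": f"{core}.{suffix}",
                "core_domain": core,
                "public_suffix_domain": suffix,
            }
    return None  # suffix not in the illustrative set


print(normalize_domain("https://www.cybersyn.com/about"))
```

Under these assumptions, `https://www.cybersyn.com/about` reduces to the domain `cybersyn.com` with core domain `cybersyn` and public suffix `com`, matching the example in the notes above.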

Cybersyn periodically issues GET requests to each domain to record the response status code received and any redirect destination.
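
A check of this kind could look roughly like the Python sketch below. The function names, the rule that a 2xx status counts as "successful", and the set of status codes treated as redirects are assumptions made for illustration; the source does not state how Cybersyn classifies responses or how often it runs the checks.

```python
import http.client
from urllib.parse import urlsplit

# Assumption: these status codes are treated as redirects.
REDIRECT_CODES = (301, 302, 303, 307, 308)


def summarize_response(status, location=None):
    """Classify one observed response (the 2xx/3xx rules are assumptions)."""
    is_redirect = status in REDIRECT_CODES and bool(location)
    # A relative Location header (e.g. "/login") has no host, so this is None.
    redirect_domain = urlsplit(location).hostname if is_redirect else None
    return {
        "status": status,
        "successful": 200 <= status < 300,
        "redirect_domain": redirect_domain,
    }


def check_domain(domain, timeout=10.0):
    """Issue one GET request without following redirects, then summarize it."""
    conn = http.client.HTTPSConnection(domain, timeout=timeout)
    try:
        conn.request("GET", "/")
        resp = conn.getresponse()
        return summarize_response(resp.status, resp.getheader("Location"))
    finally:
        conn.close()
```

`http.client` is used here because it does not follow redirects automatically, so the first status code and its `Location` header can be observed directly. Calling `check_domain("cybersyn.com")` performs a live request, so its result depends on the network and on the site at the time of the call.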

Cybersyn Products

Tables above are available in the following Cybersyn data products:

Sample Queries

Pull a list of websites with a specific domain

Screen for websites that are registered using the “.ai” suffix domain

SELECT domain_id, core_domain, public_suffix_domain
FROM cybersyn.domain_index
WHERE public_suffix_domain = 'ai'
LIMIT 500;

Pull a list of active websites with a specific domain

Select only domains that use the “.ai” top-level domain and whose most recent HTTP response check by Cybersyn was successful

SELECT domain_id
FROM cybersyn.domain_characteristics
WHERE domain_id ILIKE '%.ai'
AND relationship_type = 'successful_http_response_status'
AND value = 'true'
AND relationship_end_date IS NULL;

Disclaimers

The data in this product is sourced from the following:

  • Domain Project: License; Copyright (c) 2020-2021, Bohdan Turkynewych. All rights reserved.
  • ICANN
  • Majestic Million: License

Cybersyn is not endorsed by or affiliated with any of these providers. Contact support@cybersyn.com for questions.