Comment on page
URL Parsing Functions
Toolbox of utilities for parsing parts of domains and URLs based on user-defined functions (UDFs)
Cybersyn URL Parsing Functions is a Snowflake Native App that includes a library of functions to help clean and process URLs directly in Snowflake SQL. The user-defined functions (UDFs) allow users to easily extract website parts from URLs within Snowflake queries, which is particularly useful for data cleaning and analysis tasks related to clickstream and web traffic data.
Each UDF takes a single parameter, URL, which should be a string.
Function | Text |
---|---|
GETDOMAIN( url ) | Returns the domain with the suffix or top level domain without any URL parameters or subdomains. |
GETDOMAINONLY( url ) | Returns only the core domain text without any subdomain, parameters, suffix or top level domain. |
GETPUBLICSUFFIXDOMAINONLY( url ) | Returns only the public suffix domain. Public suffixes include suffixes under which users or companies can register domains. |
GETSUBDOMAIN( url ) | Returns the domain in addition to the subdomain. The subdomain “www” is excluded from results. |
GETSUBDOMAINONLY( url ) | Returns only the subdomain portion of the domain. |
Internally, Cybersyn uses these functions to extract portions of URLs to more accurately and consistently parse and match domains across datasets. Use these functions to clean your own data or to easily join your internal data to Cybersyn’s web data.
Parse the domain details for the following url: https://chat.openai.com/?model=gpt-4-browsing
-- GETDOMAIN(https://chat.openai.com/?model=gpt-4-browsing) - openai.com
-- GETDOMAINONLY(https://chat.openai.com/?model=gpt-4-browsing) - openai
-- GETPUBLICSUFFIXDOMAINONLY(https://chat.openai.com/?model=gpt-4-browsing) - com
-- GETSUBDOMAIN(https://chat.openai.com/?model=gpt-4-browsing) - chat.openai.com
-- GETSUBDOMAINONLY(https://chat.openai.com/?model=gpt-4-browsing) - chat
No updates at this time.
Cybersyn is not endorsed by or affiliated with any of these developers. Contact [email protected] for questions.
Last modified 1mo ago