AI Utilities

Toolbox of functions for calling OpenAI and Anthropic models including ChatGPT and Claude in Snowflake SQL

Overview

Cybersyn AI Utilities is a toolbox of functions for calling OpenAI and Anthropic models such as ChatGPT and Claude in Snowflake SQL. Functions to clean and process URLs and phone numbers within Snowflake queries are also included.

AI UDFs

The user is required to have an API key for the OpenAI or Anthropic functions

Utility UDFs

Unlike the AI functions, the phone number and URL parsing functions do not call external AI models.

Internally, Cybersyn uses these user defined functions to call out to AI tools and more accurately and consistently parse and match data across tables.

OpenAI Functions

The OpenAI functions allow you to call OpenAI models (e.g. ChatGPT) from Snowflake.

FunctionDescription

EVALUATE_OPENAI_PROMPT(model, system_prompt, user_prompt)

Returns string response from OpenAI system_prompt: contents of the system message user_prompt: contents of the user message

model: OpenAI model used, full list of models can be found here

Example

SELECT CybersynFunctionsAppAI.cybersyn.evaluate_openai_prompt(
    'gpt-3.5-turbo',
    'You are financial market expert',
    'What is the name of the company with symbol SNOW. Just the name in json format.'
);

Returns:

{   "company_name": "Snowflake Inc." }

Set Up Instructions

  1. Create network rule to allow for external access:

CREATE OR REPLACE NETWORK RULE DB.SCHEMA.OPENAI_NETWORK_RULE
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('api.openai.com');
  1. Create secret with OpenAI API key:

CREATE OR REPLACE SECRET DB.SCHEMA.OPENAI_API_KEY
  TYPE = GENERIC_STRING
  SECRET_STRING = 'sk-';
  1. Create external access integration:

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION OPENAI_EXTERNAL_ACCESS_INTEGRATION
  ALLOWED_NETWORK_RULES = (DB.SCHEMA.OPENAI_NETWORK_RULE)
  ALLOWED_AUTHENTICATION_SECRETS = (DB.SCHEMA.OPENAI_API_KEY)
  ENABLED = true;
  1. Grant access for the application to use the api key and external integration (replace CybersynFunctionsAppAI with the installed app name):

GRANT USAGE ON DATABASE DB TO APPLICATION CybersynFunctionsAppAI;
GRANT USAGE ON SCHEMA DB.SCHEMA TO APPLICATION CybersynFunctionsAppAI;
GRANT USAGE ON INTEGRATION OPENAI_EXTERNAL_ACCESS_INTEGRATION TO APPLICATION CybersynFunctionsAppAI;
GRANT READ ON SECRET DB.SCHEMA.OPENAI_API_KEY TO APPLICATION CybersynFunctionsAppAI;
  1. Call stored procedure to enable native app to have external access to OpenAI API:

CALL CybersynFunctionsAppAI.cybersyn.init_openai(
    api_key => 'DB.SCHEMA.OPENAI_API_KEY', 
    external_integration => 'OPENAI_EXTERNAL_ACCESS_INTEGRATION');

Anthropic Functions

The Anthropic functions allow you to call Anthropic AI models (e.g. Claude) from Snowflake.

FunctionDescription

EVALUATE_ANTHROPIC_PROMPT(model, max_tokens, system_prompt, user_prompt)

Returns string response from Anthropic

max_tokens: number representing maximum tokens to return

system_prompt: contents of the system message

user_prompt: contents of the user message

model: Anthropic model used, full list of models can be found here

Example

SELECT CybersynFunctionsAppAI.cybersyn.evaluate_anthropic_prompt(
    'claude-3-sonnet-20240229',
    100,
    'You are financial market expert',
    'What is the name of the company with symbol SNOW. Just the name in json format.'
);

Returns:

{"name": "Snowflake Inc."}

Set Up Instructions

  1. Create network rule to allow for external access:

CREATE OR REPLACE NETWORK RULE DB.SCHEMA.ANTHROPIC_NETWORK_RULE
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('api.anthropic.com');
  1. Create secret with Anthropic API key:

CREATE OR REPLACE SECRET DB.SCHEMA.ANTHROPIC_API_KEY
  TYPE = GENERIC_STRING
  SECRET_STRING = 'sk-';
  1. Create external access integration

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION ANTHROPIC_EXTERNAL_ACCESS_INTEGRATION
  ALLOWED_NETWORK_RULES = (DB.SCHEMA.ANTHROPIC_NETWORK_RULE)
  ALLOWED_AUTHENTICATION_SECRETS = (DB.SCHEMA.ANTHROPIC_API_KEY)
  ENABLED = true;
  1. Grant access for the application to use the api key and external integration (replace CybersynFunctionsAppAI with the installed app name):

GRANT USAGE ON DATABASE DB TO APPLICATION CybersynFunctionsAppAI;
GRANT USAGE ON SCHEMA DB.SCHEMA TO APPLICATION CybersynFunctionsAppAI;
GRANT USAGE ON INTEGRATION ANTHROPIC_EXTERNAL_ACCESS_INTEGRATION TO APPLICATION CybersynFunctionsAppAI;
GRANT READ ON SECRET DB.SCHEMA.ANTHROPIC_API_KEY TO APPLICATION CybersynFunctionsAppAI;
  1. Call stored procedure to enable native app to have external access to Anthropic API

CALL CybersynFunctionsAppAI.cybersyn.init_anthropic(
    api_key => 'DB.SCHEMA.ANTHROPIC_API_KEY', 
    external_integration => 'ANTHROPIC_EXTERNAL_ACCESS_INTEGRATION');

Token Counter Functions

Functions that call external API to find number of tokens in the string. This can be used to estimate cost of the prompt.

FunctionDescription

TOKEN_COUNT(prompt, model)

Returns number of tokens in the prompt

prompt: contents of the message

model: is OpenAI model to use, list of models can be found here

Example

SELECT CybersynFunctionsAppAI.cybersyn.token_count(
    'You are financial market expert. What is the name of the company with symbol SNOW. Just the name in json format.',
    'gpt-4'
);

Returns:

25

Set Up Instructions

  1. Create network rule to allow for external access:

CREATE OR REPLACE NETWORK RULE DB.SCHEMA.TOKEN_NETWORK_RULE
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('openaipublic.blob.core.windows.net');
  1. Create external access intergration

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION TOKEN_EXTERNAL_ACCESS_INTEGRATION
  ALLOWED_NETWORK_RULES = (DB.SCHEMA.OPENAI_NETWORK_RULE)
  ENABLED = true;

3. Grant access for the application to use external integration(replace CybersynFunctionsAppAI with the installed app name):

GRANT USAGE ON DATABASE DB TO APPLICATION CybersynFunctionsAppAI;
GRANT USAGE ON SCHEMA DB.SCHEMA TO APPLICATION CybersynFunctionsAppAI;
GRANT USAGE ON INTEGRATION TOKEN_EXTERNAL_ACCESS_INTEGRATION TO APPLICATION CybersynFunctionsAppAI;

4. Call stored procedure to enable native app to have external access to OpenAI API:

CALL CybersynFunctionsAppAI.cybersyn.init_token_count(
    external_integration => 'TOKEN_EXTERNAL_ACCESS_INTEGRATION');

Phone Number Parsing Functions

Clean and process phone numbers directly in Snowflake SQL. The user-defined functions (UDFs) allow users to easily extract phone number from a string directly within Snowflake queries, which can be particularly useful for data cleaning and analysis tasks related to contact information.

The UDFs are all callable in Snowflake SQL and take two parameters:

  • String with a phone number

  • Country code (optional parameter, default value is set to 'US')

FunctionDescription

GETPHONENUMBER(phone, country)

Returns phone number in a national format if it is possible to extract, otherwise result is null

Examples

SELECT CybersynFunctionsAppAI.CYBERSYN.GETPHONENUMBER('My phone number is 619-582-9502'); -- returns (619) 582-9502

SELECT CybersynFunctionsAppAI.CYBERSYN.GETPHONENUMBER('My phone number is 619-582-950'); -- returns null

SELECT CybersynFunctionsAppAI.CYBERSYN.GETPHONENUMBER('020-8366-9179', 'GB'); -- returns 020 8366 9179

URL Parsing Functions

Clean and process URLs directly in Snowflake SQL. The user-defined functions (UDFs) allow users to easily extract website parts from URLs within Snowflake queries, which is particularly useful for data cleaning and analysis tasks related to clickstream and web traffic data.

Each UDF takes a single parameter, URL, which should be a string.

FunctionDescription

GETDOMAIN(url)

Returns the domain with the suffix or top level domain without any URL parameters or subdomains.

GETDOMAINONLY(url)

Returns only the core domain text without any subdomain, parameters, suffix or top level domain.

GETPUBLICSUFFIXDOMAINONLY(url)

Returns only the public suffix domain. Public suffixes include suffixes under which users or companies can register domains.

GETSUBDOMAIN(url)

Returns the domain in addition to the subdomain. The subdomain β€œwww” is excluded from results.

GETSUBDOMAINONLY(url)

Returns only the subdomain portion of the domain.

Internally, Cybersyn uses these functions to extract portions of URLs to more accurately and consistently parse and match domains across datasets. Use these functions to clean your own data or to easily join your internal data to Cybersyn’s web data.

Examples

Parse the domain details for the following url: https://chat.openai.com/?model=gpt-4-browsing

SELECT CybersynFunctionsAppAI.CYBERSYN.GETDOMAIN('https://chat.openai.com/?model=gpt-4-browsing');  -- returns openai.com

SELECT CybersynFunctionsAppAICYBERSYN.GETDOMAINONLY('https://chat.openai.com/?model=gpt-4-browsing'); -- returns openai

SELECT CybersynFunctionsAppAI.CYBERSYN.GETPUBLICSUFFIXDOMAINONLY('https://chat.openai.com/?model=gpt-4-browsing'); -- returns com

SELECT CybersynFunctionsAppAI.CYBERSYN.GETSUBDOMAIN('https://chat.openai.com/?model=gpt-4-browsing'); -- returns chat.openai.com

SELECT CybersynFunctionsAppAI.CYBERSYN.GETSUBDOMAINONLY('https://chat.openai.com/?model=gpt-4-browsing'); -- returns chat

Releases & Changelog

3/6/24: Added OpenAI & Anthropic external functions; added phone number parsing functions; renamed product

Added additional functions that allow users to call out to OpenAI & Anthropic and clean/process phone numbers directly in Snowflake SQL. Update product name to "Cybersyn AI Utilities"

Disclaimer

Your use of OpenAI is subject in all respects to the terms and conditions of OpenAI set forth here. Your use of Anthropic is subject in all respects to the terms and conditions of Anthropic set forth here and here. The URL function is built off of a library licensed from here.

Cybersyn does not have any logging or data collection mechanism as part of your usage of our function.

Cybersyn is not responsible for any use you may make of these services or any result derived therefrom. Contact support@cybersyn.com for questions.

Last updated

Copyright Β© 2024 Cybersyn