FAQ
Find answers to frequently asked questions (FAQ)
General
Are you a data broker?
Strictly speaking, no. We do not aim to redistribute others’ proprietary datasets standalone. Any dataset that we make available as a listing on Snowflake Marketplace meets at least some of the following criteria:
- Is transformed to match our centralized schema
- Has normalized entities that match our index of that entity
- Is combined with other datasets
- Is extrapolated or modeled to be representative of the real world
I have a dataset I would like to monetize. Do you consult with companies on monetization?
It depends on the dataset and your business model.
If your dataset is valuable on its own and you intend to monetize it directly, you likely do not need us: you can simply list your dataset on the Snowflake Marketplace. We do not provide consulting services on this subject, although the Snowflake Marketplace team is fantastic!
If your dataset would be more valuable if combined with other data assets or requires significant transformation to make it commercially viable, then we might be the right fit for a partnership. Please Contact Us.
What terms apply?
Most of our listings (both free and paid) follow our standard terms of service. Some paid products have custom terms, which are included in the specific product listing when you purchase via the Snowflake Marketplace.
Cybersyn Data Products & Updates
How do you determine which public data products are free and which are paid?
We aim to make all public domain data, typically from government releases, free of charge for internal use.
We structure that data to make it compatible across products with a common geo_id. Often, government releases serve as good benchmarks for external, proprietary data (for example, a proprietary real-time measure of inflation should align to the monthly Bureau of Labor Statistics Inflation release).
For production use cases, the Cybersyn Foundations paid product includes technical support, external derivative usage, point-in-time history, and backwards compatibility in addition to enterprise-only public datasets.
How often do datasets update and how do I receive updates?
Each dataset (i.e. listing on the Snowflake Marketplace) updates at a different frequency, largely driven by the release time of the underlying data generating process. For instance, public domain datasets from government agencies update as frequently as the government releases new data. You can find release schedules from underlying sources on a dataset's documentation page in the "Data Sources and Release Frequency" section.
All updates across Cybersyn datasets are tracked in our changelog. These updates are summarized in Cybersyn Release notes sent bi-weekly via email. By default, the email used to mount a dataset in Snowflake Marketplace receives the release notes.
How do you measure data quality? How will I know if there are data quality issues?
For public domain data our intention is to pass through data, as accurately and quickly as possible, based on the government releases.
For proprietary data, our responsibility is to ensure that the data is correct, in addition to simply being timely. Judging correctness is ultimately subjective, and data quality is often controlled by upstream data generators.
Our aim is to alert users to known issues. By default, the email used to mount a dataset in Snowflake Marketplace receives updates for known data issues.
If you see issues with data, please email us at snowflake-public-data@snowflake.com.
Queries on your dataset are slow; how do I fix this?
We aim to optimize our datasets for the queries we expect users to run most often. Our data products are intended for broad use cases, so we may not have optimized for your specific needs. We commonly optimize performance by clustering tables on the fields that we anticipate will be used most frequently, but there may be use cases that we do not anticipate. See the Snowflake documentation here for more information.
You can also improve query performance by utilizing other optimization methods, specifically Query Acceleration Service (QAS) and Search Optimization. You will need to copy the data from the Cybersyn shared schema (the schema created when you mounted the listing) into your own schema and enable these accelerations.
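As a rough illustration of the copy-and-enable step, the sketch below builds the Snowflake SQL statements you might run. It is a minimal sketch, not a definitive procedure: the function name and the database, schema, table, and warehouse names are all hypothetical placeholders, and you would substitute the names from your own account and mounted listing.

```python
def build_acceleration_statements(shared_table, local_table, warehouse):
    """Return Snowflake SQL statements that copy a shared table and enable accelerations.

    All three arguments are hypothetical, fully qualified names supplied by the caller.
    """
    return [
        # Copy the shared (read-only) table into a schema you own.
        f"CREATE TABLE {local_table} AS SELECT * FROM {shared_table};",
        # Search Optimization is enabled per table.
        f"ALTER TABLE {local_table} ADD SEARCH OPTIMIZATION;",
        # Query Acceleration Service is enabled per warehouse.
        f"ALTER WAREHOUSE {warehouse} SET ENABLE_QUERY_ACCELERATION = TRUE;",
    ]


statements = build_acceleration_statements(
    shared_table="CYBERSYN_SHARE.CYBERSYN.EXAMPLE_TABLE",  # hypothetical name
    local_table="MY_DB.MY_SCHEMA.EXAMPLE_TABLE",  # hypothetical name
    warehouse="MY_WH",  # hypothetical name
)
for statement in statements:
    print(statement)
```

Note that a table created this way is a snapshot: it does not pick up later updates to the shared listing on its own, so you would need to refresh the copy on whatever schedule suits your use case.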
Finally, it is worth using Snowflake’s query profiler to understand bottlenecks. Typically, we can help if a query on one of our datasets experiences a skewed join. If the majority of your query time is spent on table scans, you should increase the size of your warehouse.
If you have a query you are seeing poor performance on, please email us at snowflake-public-data@snowflake.com.
Will you start charging me for a pipeline I built?
We commit not to charge customers for any public domain dataset that has already been made available by Cybersyn for free. Our public datasets serve as lead-gen for our proprietary datasets as well as for benchmarking purposes.
Can I re-distribute your data or use your data in my product?
By default, our terms of service (the contract you agree to when mounting a listing) do not allow for data redistribution. Our data is intended for internal use. If you have a redistribution use case, please Contact Us.
How do you determine which public datasets will be added to the Snowflake Marketplace?
We prioritize datasets based on customer feedback. In very broad strokes, we are most interested in datasets that can be used across industries, are economy-oriented, and could eventually correspond to proprietary versions.
Please Contact Us if there is a dataset you would like to have added to our pipeline.
I found a bug. Can you fix it?
Yes! Most likely. Please Contact Us or email us at snowflake-public-data@snowflake.com. We would appreciate it if you could include details or a reproducible example of the bug.
Snowflake Marketplace
What is the Snowflake Marketplace?
A marketplace to discover and access third-party data and services directly in the Snowflake Data Cloud. Data consumers securely access live and governed shared data sets directly from their Snowflake account, and receive automatic updates in real time.
Are you only available on Snowflake?
As of today, yes.
What forms of payment are accepted for your products?
The Snowflake Marketplace accepts payment through the following methods:
- Marketplace Capacity Drawdown (using your organization’s Capacity commitment with Snowflake)
- ACH payment
- Wire transfer
- SWIFT transfer
- Credit card
To find out more on paying for listings, see here.
Can I pay with Snowflake credits?
Yes, your organization can use Marketplace Capacity Drawdown (see Snowflake documentation here) to purchase our products. Depending on when the Snowflake agreement was executed, it may require an amendment to your organization’s service agreement to use this option for payment.