Snowflake Destination

Sync data from RudderStack to Snowflake.

Snowflake is a popular cloud-based data warehouse known for its speed, scalability, and reliability.

info
Refer to the Warehouse Schema guide for how events are mapped to Snowflake tables.
success
Find the open source code for this destination in the GitHub repository.

Setting user permissions in Snowflake

To enable RudderStack access, make sure you have ACCOUNTADMIN or an account that has the MANAGE GRANTS privilege.

The following sections illustrate how to set up a virtual warehouse, a database, a role, and an user in Snowflake:

Creating a virtual warehouse

In your Snowflake console, create a X-Small warehouse:

Create a virtual warehouse
info
Set your data warehouse size as per your data volume requirements.

Alternatively, create a new warehouse by running the following SQL commands:

CREATE WAREHOUSE "RUDDER_WAREHOUSE"
  WITH WAREHOUSE_SIZE = 'XSMALL'
    WAREHOUSE_TYPE = 'STANDARD'
    AUTO_SUSPEND = 600
    AUTO_RESUME = TRUE;
info
Set AUTO_SUSPEND to ~10 mins and enable AUTO_RESUME to avoid any extra costs.

Creating a database

warning
Create a new database to avoid conflicts with your existing data; RudderStack creates its own tables while storing your events.

The following image demonstrates the Create Database option in Snowflake.

Create a database

Alternatively, you can create a new database by running the following SQL command:

CREATE DATABASE "RUDDER_EVENTS";

Creating a role for RudderStack

Run the following SQL commands to create a new role with the required permissions to load your data into the newly created warehouse:

  1. Create a new role called RUDDER:
CREATE ROLE "RUDDER";
  1. Grant access to the warehouse RUDDER_WAREHOUSE:
GRANT USAGE ON WAREHOUSE "RUDDER_WAREHOUSE" TO ROLE "RUDDER";
  1. Grant access to the database RUDDER_EVENTS:
GRANT USAGE ON DATABASE "RUDDER_EVENTS" TO ROLE "RUDDER";
GRANT CREATE SCHEMA ON DATABASE "RUDDER_EVENTS" TO ROLE "RUDDER";
GRANT ALL ON ALL SCHEMAS IN DATABASE "RUDDER_EVENTS" TO ROLE "RUDDER";
info
Alternatively, you can also create up a custom role and specify it in the dashboard settings while setting up the Snowflake destination in RudderStack. Note that this role must have the necessary permissions for RudderStack to load the data into your warehouse.

Creating a user

Finally, create a user to connect RudderStack to the newly created warehouse using the following SQL query:

CREATE USER "RUDDER_USER"
  MUST_CHANGE_PASSWORD = FALSE
  DEFAULT_ROLE = "RUDDER"
  PASSWORD = "<your_password>";
GRANT ROLE "RUDDER" TO USER "RUDDER_USER";

Configuring Snowflake destination in RudderStack

To start sending data to Snowflake, you first need to add it as a destination in RudderStack and connect it to a data source. Follow these steps to configure Snowflake as a destination in RudderStack:

  1. From your RudderStack dashboard, configure the data source. Then, select Snowflake from the list of destinations.
  2. Assign a name to your destination and click Continue.

Connection settings

  • Account: Enter the account ID of your Snowflake warehouse. This account ID is part of the Snowflake URL.

The following examples illustrate the slight differences in the account ID for various cloud providers:

Account ID exampleCorresponding Snowflake URLSnowflake cloud provider
qya56091.us-east-1

qya56091.us-east-2.aws
https://qya56091.us-east-1.snowflakecomputing.com

https://qya56091.us-east-2.aws.snowflakecomputing.com
AWS
rx18795.east-us-2.azurehttps://rx18795.east-us-2.azure.snowflakecomputing.comMicrosoft Azure
ah76025.us-central1.gcphttps://ah76025.us-central1.gcp.snowflakecomputing.comGoogle Cloud Platform
warning

In case of AWS, .aws is present in the account locator of some region’s accounts and must be included in the Account field above.

For more information on account locator formats depending on your region or cloud provider, refer to the Snowflake documentation.

warning
Make sure your role has the necessary permissions for RudderStack to load the data into the warehouse.
  • Use Key Pair Authentication: Turn on the toggle to authenticate the user via a key pair.
info

For enhanced security, RudderStack recommends using the key pair authentication over the basic authentication mechanism(username and password).

See the following sections in the Snowflake documentation to generate and use the key pair:

If Use Key Pair Authentication is turned on, enter the following settings:

  • Private Key: Specify the private key. Make sure to include the delimiters.
  • Private Key Passphrase: Specify the password you set while encrypting the private key. Leave this field blank if your private key is not encrypted.
danger
The user authentication will fail if your private key is encrypted and you do not specify the passphrase.
  • Password: If Use Key Pair Authentication is toggled off, then enter the password for the user specified in the User field.
  • Namespace: Enter the schema name for the warehouse where RudderStack will create all tables. If not specified, RudderStack sets the namespace to the source name by default.
warning
You cannot change the namespace later.
  • Sync Frequency: Specify how often RudderStack should sync the data to your Snowflake warehouse.
  • Sync Starting At: This optional setting lets you specify the particular time of the day (in UTC) when you want RudderStack to sync the data to the warehouse.
  • Exclude Window: This optional setting lets you set a time window when RudderStack will not sync the data to the warehouse.

Advanced settings

RudderStack provides the following advanced settings:

  • Warehouse Append: This setting is turned on by default - RudderStack appends your incoming Event Stream data to the existing data in your warehouse. Turning it off causes RudderStack to merge your incoming data into your warehouse to ensure 100% non-duplicate data.
info

The append operation helps to achieve faster data syncs while reducing warehouse costs. However, note that it may increase the number of duplicates in the warehouse, especially if the existing data is older than two weeks. A common scenario where duplication might occur is when the SDKs retry sending events in case of failures.

A merge strategy ensures deduplication but can lead to longer sync times and increased warehouse costs.

  • Skip User Table: Toggle on this setting to send events exclusively to the identifies table and skip the users table, eliminating the need for a merge operation on the users table.
  • Skip Tracks Table: Toggle on this setting to skip sending events to the tracks table.
  • JSON Columns: Lets you ingest semi-structured event data not defined by a fixed schema. You can specify the required JSON column paths in this setting in dot notation, separated by commas. This option applies to all incoming track events for this destination. See JSON Column Support for more information.

Configuring the object storage

RudderStack lets you configure the following object storage settings during set up:

  • Use RudderStack-managed Object Storage: Enable this setting to use RudderStack-managed buckets for object storage.
warning
This option is applicable only for RudderStack-hosted data planes. For self-hosted data planes, you will have to specify your own object storage configuration settings.
  • Choose your Cloud: Select the cloud provider for your Snowflake instance. Refer to the following settings depending on your cloud provider:

See How RudderStack stores data in an object storage platform for more information.

Configuring cloud storage integration with Snowflake

Storage integration setting can be used to run the COPY command. This section lists the steps to configure the Storage Integration setting specified in the Configuring the object storage section above.

IPs to be allowlisted

To enable network access to RudderStack, allowlist the following RudderStack IPs depending on your region and RudderStack Cloud plan:

Plan
Region
US
EU
Free, Starter, and Growth
  • 3.216.35.97
  • 18.214.35.254
  • 23.20.96.9
  • 34.198.90.241
  • 34.211.241.254
  • 52.38.160.231
  • 54.147.40.62
  • 3.123.104.182
  • 3.125.132.33
  • 18.198.90.215
  • 18.196.167.201
Enterprise
  • 3.216.35.97
  • 34.198.90.241
  • 44.236.60.231
  • 54.147.40.62
  • 100.20.239.77
  • 3.66.99.198
  • 3.64.201.167
  • 3.123.104.182
  • 3.125.132.33
info
All the outbound traffic is routed through these RudderStack IPs.

FAQ

While configuring the Snowflake destination, what should I enter in the Account field?

While configuring Snowflake as a destination in RudderStack, you need to enter your Snowflake connection credentials which include the Account field, as shown below:

Snowflake account ID example

The Account field corresponds to the account ID of your Snowflake warehouse and is a part of the Snowflake URL.

The following examples illustrate the slight differences in the Snowflake account ID for various cloud providers:

Account ID exampleCorresponding Snowflake URLSnowflake cloud provider
qya56091.us-east-1

qya56091.us-east-2.aws
https://qya56091.us-east-1.snowflakecomputing.com

https://qya56091.us-east-2.aws.snowflakecomputing.com
AWS
rx18795.east-us-2.azurehttps://rx18795.east-us-2.azure.snowflakecomputing.comMicrosoft Azure
ah76025.us-central1.gcphttps://ah76025.us-central1.gcp.snowflakecomputing.comGoogle Cloud Platform
warning

In case of AWS, .aws is present in the account locator of some region’s accounts and must be included in the Account field above.

For more information on account locator formats depending on your region or cloud provider, refer to the Snowflake documentation.

info
For a more comprehensive FAQ list, refer to the Warehouse FAQ guide.

Questions? Contact us by email or on Slack