Product

RudderStack Cloud Extract Makes Cloud-to-Warehouse Pipelines Easy

Eric Dodds
Growth at RudderStack

When it comes to your customer data, there tend to be very common silos for many businesses. One is around the event data from different customer touchpoints, which frequently resides in your customer data platform and is streamed to different tools, or sits in individual analytics tools (product, marketing, etc.).

Another common silo is your customer master data, frequently stored in Salesforce. Another could be around your paid advertising campaigns with data residing in Google Ads, Facebook Ads, and LinkedIn Ads. It only gets more complicated and siloed the more tools you use. Bringing all of these different types of customer data into one place is incredibly difficult, but if you can do it, your analysis can be deeper and drive more meaningful insights.

To help make this challenging technical problem simple, we are launching RudderStack Cloud Extract, including integrations with popular cloud tools like Salesforce, Zendesk, and many more (even Google Sheets). Cloud Extract enables you to access and integrate data from your product, sales, marketing, support, and finance teams’ cloud tools (and databases/data lakes) to expand the types of analysis your teams can do and make the insights your teams derive more specific, accurate, and actionable.

In this post, we detail Cloud Extract. We explain how Cloud Extract works, some of the benefits of using it, and how to set it up in RudderStack.

How Cloud Extract Works

Cloud Extract allows you to collect raw data from different cloud tools, including Marketo, Facebook Ads, Google Ads, Google Analytics, Google Search Console, HubSpot, LinkedIn Ads, and many more. You can also pull data from databases and data lakes like Postgres, S3, and others.

The raw datasets from Cloud Extract can be routed to supported data warehouses (Snowflake, Google BigQuery, Amazon Redshift, ClickHouse, and PostgreSQL) for analysis.

Cloud Extract architecture

Benefits of Using Cloud Extract

RudderStack Event Stream has always made it easy to aggregate customer event data from all of your digital touchpoints into your data warehouse. Cloud Extract extends this functionality to your third-party tools, sending their customer data into the same warehouse. This lets you combine data from different customer teams like product, sales, marketing, or support, so you can surface answers to more complicated, nuanced business questions. Then, with RudderStack Warehouse Actions, you can feed these business insights back into your pipelines, enriching customer events with data from in-warehouse analysis for activation in your downstream customer tools.

As an example, let's say you want to perform a deeper analysis on churn and understand the relationship between the number and type of customer support tickets and product usage. Normally, these are siloed data sets. With RudderStack, though, you can use Event Stream to collect product usage behavior and Cloud Extract to pull support tickets from Zendesk, then combine both data sets in your warehouse for analysis.
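To make the idea of combining the two data sets concrete, here is a minimal sketch in Python. It is not RudderStack code: the rows, field names (`user_id`, `event`, `type`), and the `combine` helper are all hypothetical stand-ins for warehouse tables filled by Event Stream and Cloud Extract. In practice you would likely express the same join as a query in your warehouse.

```python
from collections import Counter

# Hypothetical rows standing in for warehouse tables populated by
# Event Stream (product usage) and Cloud Extract (Zendesk tickets).
usage_events = [
    {"user_id": "u1", "event": "report_created"},
    {"user_id": "u1", "event": "report_shared"},
    {"user_id": "u2", "event": "report_created"},
]
support_tickets = [
    {"user_id": "u2", "type": "bug"},
    {"user_id": "u2", "type": "billing"},
    {"user_id": "u3", "type": "bug"},
]


def combine(usage, tickets):
    """Return one row per user with counts of usage events and tickets."""
    events_per_user = Counter(row["user_id"] for row in usage)
    tickets_per_user = Counter(row["user_id"] for row in tickets)
    # Include users that appear in either data set (a full outer join).
    users = sorted(set(events_per_user) | set(tickets_per_user))
    return [
        {
            "user_id": user,
            "events": events_per_user[user],    # Counter gives 0 if absent
            "tickets": tickets_per_user[user],
        }
        for user in users
    ]


combined = combine(usage_events, support_tickets)
```

Keeping users that appear in only one data set matters here, since a user with tickets but no recent usage may be exactly the one you want to look at.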

Let's say you identify some leading indicators of churn and use queries on your warehouse to build a "likely to churn" cohort. Using Warehouse Actions, you can pull that cohort of users back through RudderStack as identify calls to update user profiles in all of your downstream tools, giving teams the ability to see and take action on users who are more likely to churn.
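As a rough illustration of that flow, the sketch below picks a cohort with a simple rule and shapes each member as an identify-style payload. The thresholds, the `likely_to_churn` trait name, and the sample rows are assumptions made for this example, and the payload shape (`userId` plus `traits`) is shown for illustration only. Warehouse Actions handles this step inside RudderStack, so you would not normally write this code yourself.

```python
# Hypothetical per-user summary, as might come from a warehouse query.
user_summary = [
    {"user_id": "u1", "events": 40, "tickets": 0},
    {"user_id": "u2", "events": 3, "tickets": 5},
    {"user_id": "u3", "events": 1, "tickets": 2},
]


def likely_to_churn(row, max_events=5, min_tickets=2):
    """Assumed rule of thumb: low usage combined with several tickets."""
    return row["events"] <= max_events and row["tickets"] >= min_tickets


def to_identify_payload(row):
    """Shape a cohort member as an identify-style payload (illustrative)."""
    return {
        "type": "identify",
        "userId": row["user_id"],
        "traits": {"likely_to_churn": True},  # hypothetical trait name
    }


cohort = [row for row in user_summary if likely_to_churn(row)]
payloads = [to_identify_payload(row) for row in cohort]
```

A real cohort definition would come from your own analysis of leading indicators; the fixed thresholds above only show where that logic would sit.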

Setting up Cloud Extract in RudderStack

Cloud Extract integrations are easy to set up and maintain. Just use the ready-made connectors to connect to any data source, and your data will start flowing through RudderStack.

Let’s use a real-world example to walk through the setup. We will set up Salesforce as a Cloud Extract source and send the Lead object data to Snowflake.

  • Log in to your RudderStack dashboard.
  • Click on Sources on the left panel of your dashboard. Select Salesforce, and then click on Next.

Picking Sources

  • Name your source and click on Next.

Naming the source

  • Next, you will be required to authenticate your Salesforce account. To do so, click on Connect with Salesforce. After granting the necessary permissions, your account should be successfully connected and visible on the dashboard. Then, click on Next.

Authenticating the Salesforce Account

Note: If you have previously logged in to your Salesforce account, clicking on the Connect with Salesforce option will automatically connect that account to RudderStack. To connect a different account, log out of your Salesforce account first, then log in to the account you want to connect.
  • In the next window, select the Run Frequency. This configuration controls how often RudderStack will pull data from your Salesforce integration. Then, click on Next.

Deciding Run Frequency

  • Next, choose the Salesforce data you want to pull through RudderStack and click on Next. You can choose to import selected Salesforce resources or choose all of them. In this example, we want to import the Lead object data, so we select Lead as shown.

Choosing the Data

You have successfully configured Salesforce as a source in your RudderStack pipeline. RudderStack will start ingesting data at the specified frequency.
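The choices made in the steps above (source type, name, run frequency, and selected resources) can be summarized in a small sketch. The `SourceSettings` class, the source name, and the one-hour frequency are hypothetical and exist only for this example; the actual configuration lives in the RudderStack dashboard. The `next_runs` helper simply shows what "ingesting data at the specified frequency" means in terms of sync times.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class SourceSettings:
    """Hypothetical record of the choices made in the setup steps."""
    source_type: str
    name: str
    run_frequency: timedelta
    resources: list = field(default_factory=list)

    def next_runs(self, first_run, count):
        """List `count` sync times, spaced by run_frequency."""
        return [first_run + i * self.run_frequency for i in range(count)]


settings = SourceSettings(
    source_type="Salesforce",
    name="salesforce-leads",            # hypothetical source name
    run_frequency=timedelta(hours=1),   # assumed value; choose yours in the UI
    resources=["Lead"],                 # the object selected in this walkthrough
)

runs = settings.next_runs(datetime(2021, 1, 1, 0, 0), 3)
```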

  • Connect this source to your data warehouse by clicking on Connect Destinations or Add Destinations, as shown:

Connecting to the Warehouse

In this example, we will be importing the Lead data into Snowflake. Before setting up Snowflake as a destination, we need to create a database and a warehouse in Snowflake. We have also created an S3 bucket that acts as a staging area for the data flowing into Snowflake.

Importing Lead Data
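The Snowflake prerequisites mentioned above (a database and a warehouse) could be prepared with statements like the ones this sketch generates. The names `RUDDER_EVENTS` and `RUDDER_WAREHOUSE` are placeholders chosen for this example, and the sketch only builds the SQL text; running it, along with any roles, grants, or staging bucket setup your account needs, is left to your usual Snowflake tooling.

```python
import re

# Simple unquoted identifiers only; anything else is rejected.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def snowflake_setup_statements(database, warehouse):
    """Build CREATE statements for the warehouse and database."""
    for name in (database, warehouse):
        if not IDENTIFIER.fullmatch(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    return [
        f"CREATE WAREHOUSE IF NOT EXISTS {warehouse};",
        f"CREATE DATABASE IF NOT EXISTS {database};",
    ]


# Placeholder names for illustration only.
statements = snowflake_setup_statements("RUDDER_EVENTS", "RUDDER_WAREHOUSE")
```

Validating the names before formatting them into SQL keeps the helper from producing statements with unintended content.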

  • Next, name the destination and connect it to your cloud source by adding all the required credentials in the Connection Credentials section.

Naming Destinations

Cloud Extract

Next, you can choose to add a Transformation. You can choose no transformation, an existing transformation, or create a new one (in this example, we do not need any transformation).

Adding a Transformation

That’s it! You have successfully added Snowflake as a destination for your Salesforce source. Your data will sync according to the schedule you defined. You can also trigger a sync manually by clicking on Sync Now.

Syncing

Once the sync is completed, you can go to your Snowflake dashboard to verify that the new Lead table is present and has been populated with data from Salesforce:

Snowflake Dashboard
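If you prefer to check from code rather than the dashboard, a row count is usually enough to confirm the table exists and has data. The sketch below assumes a Python DB-API style connection and a table named `lead`; the exact schema and table name that end up in your warehouse may differ, so treat both as placeholders. An in-memory SQLite database stands in for Snowflake here so the example runs anywhere.

```python
import re
import sqlite3


def count_rows(connection, table):
    """Return the number of rows in `table` using a DB-API connection."""
    # Allow only simple identifiers, since the name is formatted into SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = connection.cursor()
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    return cursor.fetchone()[0]


# Stand-in for the warehouse: a throwaway SQLite database with a
# hypothetical `lead` table and two sample rows.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE lead (id TEXT, email TEXT)")
demo.executemany(
    "INSERT INTO lead VALUES (?, ?)",
    [("1", "a@example.com"), ("2", "b@example.com")],
)

lead_rows = count_rows(demo, "lead")
```

A count of zero after a completed sync would suggest checking the selected resources and the destination credentials.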

To explore the different Cloud Extract sources and learn more, visit our documentation.

One of our clients, Proposify, has the following to say:

At Proposify, we get a lot of traffic from organic search sources. Being able to have insight into GSC data to monitor relevant search trends, keyword rankings, and landing page performance is crucial to inform everything from content, SEO, and inbound marketing. RudderStack’s Cloud Extract lets us seamlessly integrate this data into our Redshift warehouse and data modeling workflows for a complete view of our acquisition efforts. It’s a powerful turnkey solution! - Max Werner, Data Operations Manager at Proposify

Sign up for Free and Start Sending Data

Test out our event stream, ELT, and reverse-ETL pipelines. Use our HTTP source to send data in less than 5 minutes, or install one of our 12 SDKs in your website or app. Get started.

About the author
Eric Dodds
Eric leads growth at RudderStack and has a long history of helping companies architect customer data stacks to use their data to grow.