RudderStack Cloud Extract Makes Cloud-to-Warehouse Pipelines Easy

By Eric Dodds/January 25, 2021

Customer data tends to end up in a few common silos at most businesses. One is event data from different customer touchpoints, which frequently resides in your customer data platform and is streamed to different tools, or sits in individual analytics tools (product, marketing, etc.).

Another common silo is your customer master data, frequently stored in Salesforce. Another could be around your paid advertising campaigns with data residing in Google Ads, Facebook Ads, and LinkedIn Ads. It only gets more complicated and siloed the more tools you use. Bringing all of these different types of customer data into one place is incredibly difficult, but, if you can do it, your analysis can be deeper and drive more meaningful insights.

To help make this challenging technical problem simple, we are launching RudderStack Cloud Extract, including integrations with popular cloud tools like Salesforce, Zendesk, and many more (even Google Sheets). Cloud Extract enables you to access and integrate data from your product, sales, marketing, support, and finance teams’ cloud tools (and databases/data lakes) to expand the types of analysis your teams can do and make the insights your teams derive more specific, accurate, and actionable.

In this post, we explain how Cloud Extract works, some of the benefits of using it, and how to set it up in RudderStack.

How Cloud Extract Works

Cloud Extract allows you to collect raw data from different cloud tools, including Marketo, Facebook Ads, Google Ads, Google Analytics, Google Search Console, HubSpot, LinkedIn Ads, and many more. You can also pull data from databases and data lakes like Postgres, S3, and others.

The raw datasets from Cloud Extract can be routed to supported data warehouses (Snowflake, Google BigQuery, Amazon Redshift, ClickHouse, and PostgreSQL) for analysis.

Figure: Cloud Extract Architecture

Benefits of Using Cloud Extract

RudderStack Event Stream has always made it easy to aggregate customer event data from all of your digital touchpoints into your data warehouse. Cloud Extract extends this functionality, making it easy to send customer data from all your different third-party tools into your data warehouse. This enables you to easily combine data from different customer teams like product, sales, marketing, or support, so you can surface answers to more complicated, nuanced business questions. Then, with RudderStack Warehouse Actions, you can easily feed these business insights into your pipelines, enriching customer events with data from in-warehouse analysis for activation in your downstream customer tools.

As an example, let’s say you want to perform a deeper analysis on churn and understand the relationship between number and type of customer support tickets and product usage. Normally, these are siloed data sets. With RudderStack, though, you can use Event Stream to collect product usage behavior and Cloud Extract to pull support tickets from Zendesk, then combine both data sets in your warehouse for analysis.
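To make this concrete, here is a minimal sketch of the kind of join this enables. It uses Python's built-in sqlite3 module as a stand-in for the warehouse. The table and column names (product_events, support_tickets, user_id) are hypothetical and are not the schemas RudderStack actually creates.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; all table and column
# names below are hypothetical and chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_events (user_id TEXT, event TEXT);
    CREATE TABLE support_tickets (user_id TEXT, ticket_type TEXT);
    INSERT INTO product_events VALUES
        ('u1', 'login'), ('u1', 'export'), ('u1', 'login'),
        ('u2', 'login');
    INSERT INTO support_tickets VALUES
        ('u2', 'bug'), ('u2', 'billing'), ('u2', 'bug'),
        ('u1', 'question');
""")


def usage_vs_tickets(conn):
    """Return (user_id, event_count, ticket_count) rows, one per user."""
    # Each side is aggregated before the join so that a user with many
    # events and many tickets is not double counted.
    query = """
        SELECT e.user_id, e.event_count, COALESCE(t.ticket_count, 0)
        FROM (SELECT user_id, COUNT(*) AS event_count
              FROM product_events GROUP BY user_id) AS e
        LEFT JOIN (SELECT user_id, COUNT(*) AS ticket_count
                   FROM support_tickets GROUP BY user_id) AS t
          ON t.user_id = e.user_id
        ORDER BY e.user_id
    """
    return conn.execute(query).fetchall()


rows = usage_vs_tickets(conn)
for user_id, event_count, ticket_count in rows:
    print(user_id, event_count, ticket_count)
```

In a real warehouse the same pattern would run as SQL against the tables that Event Stream and Cloud Extract load, with whatever names and keys your setup produces.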

Let’s say you identify some leading indicators of churn and use queries on your warehouse to build a “likely to churn” cohort. Using Warehouse Actions, you can pull that cohort of users back through RudderStack as identify calls to update user profiles in all of your downstream tools, giving teams the ability to see and take action on users who have a higher likelihood to churn.
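As a rough illustration of what "pulling a cohort back as identify calls" means, the sketch below maps cohort rows to identify-style payloads. Warehouse Actions performs this mapping for you through configuration, so this is not how the feature is implemented. The trait names (likely_to_churn, churn_score) and the row layout are assumptions made for the example.

```python
import json


def cohort_to_identify_calls(cohort_rows):
    """Turn cohort rows into identify-style payloads, one per user."""
    calls = []
    for row in cohort_rows:
        calls.append({
            "type": "identify",
            "userId": row["user_id"],
            # Hypothetical traits; use whatever your downstream tools expect.
            "traits": {
                "likely_to_churn": True,
                "churn_score": row["churn_score"],
            },
        })
    return calls


# Hypothetical result of a "likely to churn" query on the warehouse.
cohort = [
    {"user_id": "u2", "churn_score": 0.87},
    {"user_id": "u7", "churn_score": 0.74},
]

calls = cohort_to_identify_calls(cohort)
print(json.dumps(calls[0], indent=2))
```

Each payload carries the user's ID plus the traits to update, which is what lets downstream tools flag the same users without anyone re-running the analysis.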

Setting up Cloud Extract in RudderStack

Cloud Extract integrations are easy to set up and maintain. Just use the ready-made connectors to connect to any data source, and your data will start flowing through RudderStack.

Let’s use a real-world example to walk through setup. We will set up Salesforce as a Cloud Extract source and send the Lead object data to Snowflake.

  1. Log in to your RudderStack dashboard.
  2. Click on Sources on the left panel of your dashboard. Select Salesforce, and then click on Next.

Figure: Picking Sources

  3. Name your source and click on Next.

Figure: Naming the Source

  4. Next, you will be required to authenticate your Salesforce account. To do so, click on Connect with Salesforce. After granting the necessary permissions, your account should be successfully connected and visible on the dashboard. Then, click on Next.

Figure: Authenticating the Salesforce Account

Note: If you have already logged into your Salesforce account previously, clicking on the Connect with Salesforce option will automatically connect that account to RudderStack. To connect a different account, you will have to log out of your Salesforce account, then log in to the account you want to connect.

  5. In the next window, select the Run Frequency. This configuration controls how often RudderStack will pull data from your Salesforce integration. Then, click on Next.

Figure: Deciding Run Frequency

  6. Next, choose the Salesforce data you want to pull through RudderStack and click on Next. You can choose to import selected Salesforce resources or choose all of them. In this example, we want to import the Lead object data, so we select Lead as shown.

Figure: Choosing the Data

You have successfully configured Salesforce as a source in your RudderStack pipeline. RudderStack will start ingesting data at the specified frequency.

  7. Connect this source to your data warehouse by clicking on Connect Destinations or Add Destinations, as shown:

Figure: Connecting to the Warehouse

In this example, we will be importing the Lead data into Snowflake. Before setting up Snowflake as a destination, we need to create a database and a warehouse in Snowflake. We have also created an S3 bucket which acts as a staging area for the data flowing into Snowflake.

Figure: Importing Lead Data

  8. Next, name the destination and connect it to your Cloud Source by adding all the required credentials in the Connection Credentials section.

Figure: Naming Destinations

Figure: Cloud Extract

  9. Next, you can choose to add a Transformation. You can choose no transformation, an existing transformation, or create a new one (in this example we do not need any transformation).

Figure: Adding a Transformation

That’s it! You have successfully added Snowflake as a destination for your Salesforce source. Your data will sync according to the schedule you defined. You can also trigger a sync manually by clicking on Sync Now.

Figure: Syncing

Once the sync is completed, you can go to your Snowflake Dashboard to verify that the new Lead table is present and has been populated with data from Salesforce:

Figure: Snowflake Dashboard
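If you would rather check the sync from a script than from the dashboard, a row count is usually enough. The sketch below works with any Python DB-API cursor. It is demonstrated here against an in-memory SQLite database as a stand-in, since the exact table name and schema in your Snowflake database depend on your setup; the name lead and its columns are assumptions.

```python
import sqlite3


def table_row_count(cursor, table_name):
    """Return the number of rows in table_name using a DB-API cursor."""
    # Table names cannot be bound as query parameters, so validate the
    # identifier before interpolating it into the SQL string.
    if not table_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table_name!r}")
    cursor.execute(f"SELECT COUNT(*) FROM {table_name}")
    return cursor.fetchone()[0]


# SQLite stand-in for the warehouse; the table name and columns are
# hypothetical and only mimic a synced Lead table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lead (id TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO lead VALUES (?, ?)",
    [("001", "a@example.com"), ("002", "b@example.com")],
)

count = table_row_count(conn.cursor(), "lead")
print(f"lead rows: {count}")
```

A count of zero after a completed sync is a signal to recheck the selected resources and the destination credentials.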

To explore the different Cloud Extract sources and learn more, visit our documentation.

One of our clients, Proposify, has the following to say:

At Proposify, we get a lot of traffic from organic search sources. Being able to have insight into GSC data to monitor relevant search trends, keyword rankings, and landing page performance is crucial to inform everything from content, SEO, and inbound marketing. RudderStack’s Cloud Extract lets us seamlessly integrate this data into our Redshift warehouse and data modeling workflows for a complete view of our acquisition efforts. It’s a powerful turnkey solution! - Max Werner, Data Operations Manager at Proposify

Try RudderStack Today

Start building a smarter customer data pipeline. Use all your customer data. Answer more difficult questions. Send insights to your whole customer data stack. Sign up for RudderStack Cloud Free today.


Eric Dodds
Eric leads our Customer Success team and has a long history of helping companies architect customer data stacks and use their data to grow.
