Feature Launch: Databricks Reverse ETL source

If you’re using Databricks, you can now easily sync data from your data lakehouse to your entire stack with our Databricks Reverse ETL (rETL) source. This new source is a big unlock for teams building models or audiences in Databricks who want to push that data to marketing and product tools.
The integration supports Databricks Unity Catalog and works with RudderStack’s Visual Data Mapper and Mirror Mode features.
Top use cases for the Databricks Reverse ETL source
"With RudderStack's Databricks Reverse ETL source, I can set up micro pipelines to process and sync the most valuable data in our lakehouse to tools used by our product and marketing teams in near real-time. This saves us weeks of custom integration work to set up higher-frequency pipelines and build out upserting functionality."
—Adam Silver, Senior Analytics Engineer at Rec Room

- Easily push AI/ML outputs to marketing tools to power better customer experiences
- Quickly respond to marketing requests for new data points
- Enrich lead and user records with comprehensive data
- Efficiently update millions of customer records from your warehouse
Key integration features
- Models: Use custom SQL queries to build your models. Run these queries on your lakehouse through RudderStack and send the results to any downstream business tool (see the sketch after this list).
- Audiences: The Audiences feature gives non-technical users a robust interface for building basic audiences, so data teams can set up self-serve workflows for business teams.
- Visual Data Mapper (VDM): Use an intuitive UI to map your Databricks column names to fields in downstream SaaS tools.
- Mirror Mode: Mirror Mode keeps the data sent to downstream tools an exact replica of your source data in Databricks. It inserts, updates, and deletes records in the destination as your Databricks data changes, so your downstream tools stay current with your models.
- Unity Catalog Support: RudderStack supports Databricks Unity Catalog, meaning no changes to your existing modeling or workflows. Simply connect Databricks as a source and start sending data.
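If you want to prototype the SQL behind a model before wiring it into RudderStack, a minimal sketch using the databricks-sql-connector Python package might look like the following. The hostname, HTTP path, access token, and table names are placeholders, and the query itself is only an illustrative example of the kind of model output you might sync.

```python
# Minimal sketch: prototype a Reverse ETL model query against Databricks
# before configuring it as a model in RudderStack. All connection details
# and table names below are placeholders -- substitute your own values.
from databricks import sql  # pip install databricks-sql-connector

MODEL_QUERY = """
    SELECT
        user_id,
        email,
        churn_risk_score,               -- e.g. an ML output for marketing tools
        lifetime_value
    FROM main.analytics.user_scores     -- Unity Catalog: catalog.schema.table
    WHERE updated_at >= current_date() - INTERVAL 7 DAYS
"""

with sql.connect(
    server_hostname="your-workspace.cloud.databricks.com",  # placeholder
    http_path="/sql/1.0/warehouses/abc123",                  # placeholder
    access_token="dapi-...",                                 # placeholder
) as connection:
    with connection.cursor() as cursor:
        cursor.execute(MODEL_QUERY)
        for row in cursor.fetchmany(10):  # spot-check a sample of the output
            print(row)
```

Once the query returns the fields you expect, you can use the same SQL in a RudderStack model and map its columns to destination fields with the Visual Data Mapper.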
Get started
You can start sending data from Databricks to any of RudderStack’s 200+ destination integrations in a few simple steps. Log in to your RudderStack account, create a new source, select “Databricks,” and enter your Databricks credentials.

Next, add a destination, select the Databricks table you want to sync, configure the sync settings and schedule, and data will begin flowing from Databricks to the destination. Check out the Databricks Source documentation for more details.
Send data from Databricks to your entire stack
Sign up for RudderStack and start sending data from your lakehouse to your marketing and product tools today.
Published: September 29, 2023
