Feature launch: Append and table skipping for warehouse FinOps
In the past three years, burgeoning cloud costs have collided with cost-cutting pressures driven by macroeconomic uncertainty. Cloud data warehouse spend has come under increasing scrutiny. But customer data is more important than ever for companies fighting to win in this fiercely competitive environment.
Today we’re introducing two new features to help you turn your customer data into competitive advantage while optimizing warehouse spend. All RudderStack customers can now reduce compute time and costs by ingesting events via append. You can also now skip the users and track tables, the most compute-intensive tables created by RudderStack, when your requirements permit.
These features are perfect for CFOs, data leaders, and data engineers looking to reduce the rising expenses of cloud data computing. They make it easier for you to collect event data and use it to fuel impactful use cases without incurring significant cost increases for every sync.
Control costs for maximum warehouse value
Understanding your entire customer journey and unlocking use cases for attribution, churn reduction, and personalization requires collecting behavioral event data across your digital ecosystem. Our Event Stream product makes this easy. However, streaming data often accounts for a substantial portion of warehouse data and, consequently, of warehouse costs. We recognized this as an acute pain point through conversations with multiple customers. FinOps is a prevalent concern, so we decided to build a comprehensive solution.
The core optimization moves ingestion from a merge operation to an append operation. Until today, our warehouse syncs have defaulted to merge operations, which require scanning the entire table on every sync. Our new approach uses an append operation and moves the de-duplication work that merge performed from the database to the server. The server stores message IDs for the last 14 days and removes duplicate events from each batch before appending it to the warehouse. This reduces sync time while ensuring no duplicates within the 14-day window. You can run de-duplication operations in your warehouse to remove any older duplicates. Going forward, append will be the default for every new warehouse, and you’ll have the option to switch to merge if you prefer.
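To make the idea concrete, here is a minimal sketch of server-side de-duplication over a 14-day window of message IDs. It is an illustration under stated assumptions, not RudderStack's implementation: the `AppendDeduplicator` class, the `message_id` field name, and the in-memory dictionary are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

DEDUP_WINDOW = timedelta(days=14)  # window length taken from the text above


class AppendDeduplicator:
    """Hypothetical server-side filter: drops events whose message ID was
    already seen inside the window, so the rest can be plainly appended."""

    def __init__(self, window=DEDUP_WINDOW):
        self.window = window
        self.seen = {}  # message_id -> time the ID was first seen

    def _expire(self, now):
        # Forget message IDs that have aged out of the window.
        cutoff = now - self.window
        self.seen = {mid: ts for mid, ts in self.seen.items() if ts >= cutoff}

    def filter_batch(self, events, now):
        # Return only the events that are safe to append without a merge.
        self._expire(now)
        fresh = []
        for event in events:
            mid = event["message_id"]  # assumed field name
            if mid in self.seen:
                continue  # duplicate within the window: drop it
            self.seen[mid] = now
            fresh.append(event)
        return fresh


dedup = AppendDeduplicator()
day0 = datetime(2024, 1, 1, tzinfo=timezone.utc)

# "a" appears twice in the same batch: only the first copy survives.
batch1 = dedup.filter_batch(
    [{"message_id": "a"}, {"message_id": "b"}, {"message_id": "a"}], day0
)
# One day later "a" is still inside the window, so it is dropped again.
batch2 = dedup.filter_batch(
    [{"message_id": "a"}, {"message_id": "c"}], day0 + timedelta(days=1)
)
# Twenty days later "a" has aged out, so this late duplicate is appended.
batch3 = dedup.filter_batch([{"message_id": "a"}], day0 + timedelta(days=20))
```

The third batch shows the trade-off: a duplicate that arrives after the window has passed is appended, which is why an occasional clean-up in the warehouse may still be useful.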
To help you further optimize your warehouse costs, you can now also skip the users and track tables via a simple config in your warehouse settings. These are the most compute-intensive tables created by RudderStack and are typically included with every sync. However, if you don’t use these tables, or if you perform transformations or use dbt to stitch tables downstream, you can now choose to omit them from your syncs.
Customers already using these features – like Mattermost – have reduced warehouse operation time by over 50%
Badri Veeraragavan
Director of Product, RudderStack
These enhancements significantly reduce operation time and warehouse costs. While there may be a small increase in duplicate data (around 0.5%), your overall savings will far outweigh this minor trade-off.
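If the remaining duplicates matter for a particular table, a periodic clean-up in the warehouse can remove them. The sketch below uses an in-memory SQLite database as a stand-in for a warehouse. The table name `tracks` and the columns `id` (message ID), `event`, and `received_at` are assumptions for illustration, and `rowid` is SQLite-specific, so a real warehouse would need the equivalent query in its own SQL dialect.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (id TEXT, event TEXT, received_at TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?)",
    [
        ("m1", "signup", "2024-01-01T00:00:00Z"),
        ("m2", "login", "2024-01-02T00:00:00Z"),
        ("m1", "signup", "2024-01-20T00:00:00Z"),  # late duplicate of m1
    ],
)

# Keep the first-inserted row per message ID and delete every later copy.
conn.execute(
    """
    DELETE FROM tracks
    WHERE rowid NOT IN (SELECT MIN(rowid) FROM tracks GROUP BY id)
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT id, received_at FROM tracks ORDER BY id"
).fetchall()
```

Running a clean-up like this on a schedule, rather than on every sync, keeps the cost of de-duplication separate from the cost of ingestion.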
Reducing your warehouse costs is often complex. These features simplify two key optimizations, making it easy for you to deliver quick and substantial FinOps wins.
How it works
To get started, simply adjust your warehouse configuration settings in the RudderStack dashboard. Our team is available to guide you through the process and ensure a smooth transition. Click through the demo below to check it out.
Get started
At RudderStack, we're committed to helping you turn your customer data into competitive advantage while optimizing your warehouse spend. If you'd like a more in-depth walkthrough of these features or have any questions, please don't hesitate to reach out for a demo.
Stay tuned for more exciting features for warehouse optimization, like custom partitioning, which will further help you maximize the use of your warehouse without incurring significant cost increases.