Customer Data Infrastructure (CDI) is a typical example of a Data-Intensive Application. Martin Kleppmann’s book Designing Data-Intensive Applications does an amazing job of explaining what a data-intensive application is. CDI, at its core, is an infrastructure for capturing, processing, and routing streams of events from applications. Routing in Customer Data Infrastructure Routing might not be the most common
Dealing with event data is dirty work at times. Developers may transmit events with errors after a code change. Errors can also be introduced when the data engineering team changes the data warehouse schema. Such schema changes may cause data type conflicts. How
This blog presents an approach for routing data to RudderStack using Amazon Kinesis and AWS Lambda Functions. Introduction Many organizations today make use of streaming event data from their applications and websites. For collecting the data streams, they use tools like Amazon Kinesis. But how can these businesses turn the data streams into actionable insights?
In our previous blog, The Tale of Identity Graph and Identity Resolution, we described the problem of identity resolution. We used a concrete example of a user visiting an eCommerce site from multiple websites. Specifically, we showed how the app events can be associated with multiple identities and how these identities can be tied together using
We are pleased to announce a new major release for RudderStack. Make sure you check out our GitHub repository for our RudderStack v0.1.9 release. RudderStack v0.1.9 includes some new features as well as critical bug fixes. Overall, it focuses on enhancing existing features for increased control and performance. Integrations This release
In our previous article on Game Analytics for Mobile, we showed how to build an open-source analytics solution using RudderStack. As highlighted in the article, understanding users is crucial to every analysis. Hence, it is essential to tie the events or activities to the individual users generating those events. Analytics platforms help collect this data. Unfortunately, this is
RudderStack is an open-source platform for collecting and routing your customer event data (commonly known as customer data infrastructure or CDI platform). RudderStack is enterprise-ready, with a special focus on data privacy and security. This blog talks more about RudderStack and RudderStack Transformations that allow you to customize your customer data platforms. We started building
It is 2020, and discussions around data ownership and privacy have finally crossed the Atlantic and reached the American boardroom, thanks to CCPA coming into force. Companies, both big and small, are – or should be – taking stock of all their data-sharing activities, and the first thing usually discussed is the use of data analytics.
Introduction Personally Identifiable Information (PII) is information that may be used to identify and track an individual. GDPR requires software companies to encrypt any PII and ensure that they protect the users’ identity from any misuse. As a result, in a post-GDPR world, all organizations need to detect and mask/obfuscate/delete PII data flowing through