Get a walkthrough of the key Profiles concepts within a guided terminal session.
Profiles Tutorial (pb tutorial) is currently available only for Snowflake and Google BigQuery warehouses.
Also, note that you can use it only with the Profiles Builder (PB) CLI tool. If you are developing your Profiles projects within the web-based editor (IDE), then see IDE project migration for the steps to develop locally.
Profiles Tutorial (pb tutorial command) is a guided interactive tutorial within the Profiles CLI.
This tutorial walks you through the key Profiles concepts and how they work. You will also build a basic Profiles project with an ID Stitcher model and a handful of features, ultimately producing an ID Graph in your warehouse and a view of the customers along with some customer features/attributes.
Overview
The Profiles Tutorial has multiple interactive components and most of the interaction takes place within a terminal session. The goal of this tutorial is to familiarize you with the Profiles product - this includes details on the YAML configuration, how data unification works, what the outputs look like after a run, and how to troubleshoot and build a solid ID graph around a defined entity.
As a part of the tutorial, you will build a demo Profiles project configuration YAML which you will edit directly following the directions provided in the terminal. You will also seed your warehouse with some sample data that is sent to a target schema within your warehouse. You can query this data directly as well as the materialized tables produced by Profiles within the tutorial session.
With the help of this tutorial, you can build your own Profiles project and extend it further to unify your data around a defined entity, build a customer 360 (C360) view of this entity, and much more.
Sample data
The sample data used for this tutorial is based on a fictional business called Secure Solutions, LLC.
This fictional business sells security IoT devices and a security management subscription service. They have a number of Shopify stores, a subscription management service, and one physical store where customers can buy security equipment and check out at a kiosk.
This fictional business decided to use Profiles to more quickly and easily unify different data sources related to their customers as well as help produce a customer 360 table that they can activate in downstream tools.
Prerequisites
Python environment (v3.9.0 to v3.11.10).
Profiles v0.20.0 or above installed locally within the above Python environment.
pip3 install profiles-rudderstack
profiles-mlcorelib library (v0.7.0 or above) installed within the above Python environment, that is, the same environment where the profiles-rudderstack library is installed.
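The prerequisites above can be checked with a short script before you start a session. This is a minimal sketch, not part of the Profiles tooling: the helper names are hypothetical, it assumes the version of the profiles-rudderstack pip package matches the Profiles version stated above, and its version parsing is simplified (only the leading numeric components are compared).

```python
import sys
from importlib import metadata

# Ranges and minimums copied from the prerequisites listed above.
SUPPORTED_PYTHON = ((3, 9, 0), (3, 11, 10))
MINIMUM_PACKAGES = {
    "profiles-rudderstack": (0, 20, 0),
    "profiles-mlcorelib": (0, 7, 0),
}


def python_supported(version=None):
    """Return True if the (major, minor, micro) tuple is within the supported range."""
    current = tuple(version if version is not None else sys.version_info[:3])
    lowest, highest = SUPPORTED_PYTHON
    return lowest <= current <= highest


def parse_version(text):
    """Simplified parser: keep leading numeric components, pad to three parts."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_package(name, minimum):
    """Describe whether a package is installed and meets the minimum version."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return f"{name}: not installed"
    status = "ok" if parse_version(installed) >= minimum else "below minimum"
    return f"{name}: {installed} ({status})"


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    for package, minimum in MINIMUM_PACKAGES.items():
        print(check_package(package, minimum))
```

Run it with the same interpreter you use for pb so that it inspects the correct environment.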
If you are developing a Profiles project in the RudderStack dashboard using the web-based editor (IDE), then follow these steps:
Navigate to your Profiles project in the RudderStack dashboard.
Make sure that your latest IDE session is saved to ensure that you have the latest changes. If your project is connected to a remote repo, you can commit those changes as well.
Download the project:
- If in a remote repo, navigate there and download locally.
- If the project is not connected to a remote repo, then navigate to the project’s settings and click Download this project.
Move the project configuration to the desired local directory.
Navigate to the root of that directory within your terminal.
Before performing the runs locally, verify that you have set up a warehouse connection. See Step 2: Create warehouse connection to create a local file containing your warehouse credentials.
You can successfully run PB and utilize the Profiles Copilot feature after completing the above steps.
Workflow
Run the below command in your terminal to start a tutorial session:
pb tutorial
Note that the session takes between 30 minutes and one hour to complete. Once the tutorial is complete, the session automatically ends within your terminal.
Tutorial components
Introduction to the tutorial and how to interact with it.
Introduction to the fictional business to provide context for the sample data.
Profiles project creation.
Profiles runs: Create an ID Graph.
ID graph QA - Phase 1
Fix the over-stitching issues that occur in the first run.
ID graph QA - Phase 2: This includes more detailed analysis and sample queries to run to debug some users who got merged.
Run again to produce a healthy ID graph.
Feature creation: You will build a handful of features on the user entity defined in the tutorial.
Final run to produce a feature view, which is a view within the warehouse where each record is a unique user and each column is a defined feature or attribute on that user.
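To make the over-stitching steps in the list above more concrete, here is a toy sketch of the general idea behind stitching identifiers into clusters. It is only an illustration under stated assumptions: it is not how Profiles implements its ID Stitcher, the stitch function is hypothetical, and all identifiers (including the shared kiosk address) are made up for this example.

```python
from collections import defaultdict


def stitch(edges):
    """Group identifiers into clusters; identifiers seen together are merged."""
    parent = {}

    def find(node):
        # Follow parent links to the cluster root, shortening the path as we go.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for left, right in edges:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical identifier pairs: two users who both used a shared kiosk address.
edges = [
    ("anon_1", "alice@example.com"),
    ("anon_2", "bob@example.com"),
    ("anon_1", "kiosk@store.example"),
    ("anon_2", "kiosk@store.example"),
]

# With the shared identifier, both users collapse into one cluster.
over_stitched = stitch(edges)

# Excluding the shared identifier keeps the two users separate.
filtered = stitch([pair for pair in edges if "kiosk@store.example" not in pair])

if __name__ == "__main__":
    print(len(over_stitched), "cluster(s) before filtering")
    print(len(filtered), "cluster(s) after filtering")
```

In this toy data, one shared identifier is enough to merge two otherwise unrelated users, which is the kind of problem the QA phases of the tutorial help you find and fix.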
Get started with your own project
Once you exit the tutorial session and want to begin a new Profiles project with your own data, follow this Quickstart guide to get started.
FAQ
Is this tutorial powered by an LLM?
No, this tutorial is based on a simple static script meant as a step-by-step guide with some simple built-in validations.
What is the connection built between Profiles and my warehouse?
One of the first steps to creating a Profiles project is to create a site configuration (connection) YAML file that your Profiles project configuration references to run queries within your warehouse. This YAML file is created at the beginning of this tutorial.
You will enter your warehouse credentials within the session terminal and the tutorial then generates the siteconfig.yaml file locally within your user home directory in a hidden folder called .pb.
Can I use the same connection file while creating my own Profiles project?
Yes, the siteconfig.yaml file is stored locally in a .pb directory within your user home directory (for example, /Users/<user_name>). The tutorial creates a connection block within the siteconfig.yaml file. You can then use this as a reference to create a new connection for developing your own project in your desired warehouse account, database, and schema locations.
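If you want to confirm where the file ended up, the location described above can be resolved with a few lines of Python. This is a minimal sketch: the siteconfig_path helper is hypothetical, and it only assumes the .pb folder and siteconfig.yaml file name mentioned in this FAQ.

```python
from pathlib import Path


def siteconfig_path(home=None):
    """Return the expected siteconfig.yaml path under the given home directory."""
    base = Path(home) if home is not None else Path.home()
    return base / ".pb" / "siteconfig.yaml"


if __name__ == "__main__":
    path = siteconfig_path()
    # Report the location only; the file holds credentials, so do not print its contents.
    print(path, "exists" if path.is_file() else "not found")
```

The script deliberately reports only the path and whether the file exists, because the file contains your warehouse credentials.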
Does the tutorial session seed my warehouse with data?
Yes, a part of the tutorial will seed your target schema (specified in the connection setup) with sample data for Secure Solutions LLC. It will add the below three tables:
1. PAGES
2. TRACKS
3. IDENTIFIES
These tables serve as the source data for the Profiles runs configured during the tutorial session.
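After seeding, you may want to confirm that the three tables exist and contain rows. The sketch below only generates row-count queries that you can paste into your warehouse console. The row_count_queries helper and the MY_DB.TUTORIAL_SCHEMA name are hypothetical placeholders, and identifier quoting and case sensitivity can differ between Snowflake and BigQuery, so adjust the output as needed.

```python
# Table names as listed in the FAQ answer above.
SEED_TABLES = ("PAGES", "TRACKS", "IDENTIFIES")


def row_count_queries(schema, tables=SEED_TABLES):
    """Return one COUNT(*) query per seeded table in the given schema."""
    return [
        f"SELECT COUNT(*) AS row_count FROM {schema}.{table};"
        for table in tables
    ]


if __name__ == "__main__":
    # Replace the placeholder with the target schema from your connection setup.
    for query in row_count_queries("MY_DB.TUTORIAL_SCHEMA"):
        print(query)
```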
Does the tutorial output any tables in my warehouse?
Yes, the source data (see above) is used as an input into Profiles’ semantic models configured in the tutorial session.
The materialized tables and views will output to the same target schema configured in the site configuration/connection setup step.