
Profiles Tutorial

Get a walkthrough of the key Profiles concepts within a guided terminal session.

Note

Profiles Tutorial (pb tutorial) is currently available only for Snowflake and Google BigQuery warehouses.

Also, note that you can use it only with the Profiles Builder (PB) CLI tool. If you are developing your Profiles projects within the web-based editor (IDE), then see IDE project migration for the steps to develop locally.

Profiles Tutorial (the pb tutorial command) is a guided, interactive tutorial within the Profiles CLI.

This tutorial walks you through the key Profiles concepts and how they work. You will also build a basic Profiles project with an ID Stitcher model and a handful of features, ultimately producing an ID Graph in your warehouse and a view of the customers along with some customer features/attributes.

Overview

The Profiles Tutorial has multiple interactive components, and most of the interaction takes place within a terminal session. The goal of this tutorial is to familiarize you with the Profiles product, including the YAML configuration, how data unification works, what the outputs look like after a run, and how to troubleshoot and build a solid ID graph around a defined entity.

As a part of the tutorial, you will build a demo Profiles project configuration in YAML, editing it directly by following the directions provided in the terminal. You will also seed a target schema within your warehouse with some sample data. Within the tutorial session, you can directly query both this sample data and the materialized tables produced by Profiles.

With the help of this tutorial, you can build your own Profiles project and extend it further to unify your data around a defined entity, build a customer 360 (C360) view of this entity, and much more.

Sample data

The sample data used for this tutorial is based on a fictional business called Secure Solutions, LLC.

This fictional business sells security IoT devices and a security management subscription service. They have a number of Shopify stores, a subscription management service, and one physical store where customers can buy security equipment and check out at a kiosk.

This fictional business decided to use Profiles to more quickly and easily unify different data sources related to their customers as well as help produce a customer 360 table that they can activate in downstream tools.

Prerequisites

  • Python environment (v3.9.0 to v3.11.10).
  • Profiles v0.20.0 or above installed locally within the above Python environment.
    pip3 install profiles-rudderstack
  • profiles-mlcorelib library (v0.7.0 or above) installed within the same Python environment as the profiles-rudderstack library. Quote the version specifier so that the shell does not treat > as an output redirect.
    pip install "profiles-mlcorelib>=0.7.0"
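To confirm that an environment matches the prerequisites above, a small check like the following can help. This is a minimal sketch, not part of the Profiles tooling: the helper names python_version_ok and installed_version are hypothetical, and the script only reports the installed package versions, so you still compare them against the minimums listed above yourself.

```python
import sys
from importlib import metadata

# Supported interpreter range, taken from the prerequisites above.
MIN_PYTHON = (3, 9, 0)
MAX_PYTHON = (3, 11, 10)

# Distribution names as they appear in the pip commands above.
REQUIRED_PACKAGES = ("profiles-rudderstack", "profiles-mlcorelib")


def python_version_ok(version):
    """Return True if a (major, minor, micro) tuple is within the supported range."""
    return MIN_PYTHON <= tuple(version[:3]) <= MAX_PYTHON


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = tuple(sys.version_info[:3])
    status = "OK" if python_version_ok(current) else "outside the supported range"
    print("Python", ".".join(map(str, current)), "->", status)
    for package in REQUIRED_PACKAGES:
        print(package, "->", installed_version(package) or "not installed")
```

Run the script with the interpreter of the environment in which you plan to run pb, since the result depends on which environment is active.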

IDE project migration

If you are developing a Profiles project in the RudderStack dashboard using the web-based editor (IDE), then follow these steps:

  1. Navigate to your Profiles project in the RudderStack dashboard.
  2. Make sure that your latest IDE session is saved to ensure that you have the latest changes. If your project is connected to a remote repo, you can commit those changes as well.
  3. Download the project:
     • If the project is connected to a remote repo, navigate to the repo and download the project locally.
     • If the project is not connected to a remote repo, navigate to the project’s settings and click Download this project.
  4. Move the project configuration to the desired local directory.
  5. Navigate to the root of that directory within your terminal.
  6. Before performing the runs locally, verify that you have set up a warehouse connection. See Step 2: Create warehouse connection to create a local file containing your warehouse credentials.

You can successfully run PB and utilize the Profiles Copilot feature after completing the above steps.

Workflow

Run the below command in your terminal to start a tutorial session:

pb tutorial

Note that the session takes between 30 minutes and one hour to complete. Once the tutorial is complete, the session automatically ends within your terminal.

Tutorial components

  • Introduction to the tutorial and how to interact with it.
  • Introduction to the fictional business to provide context to the sample data.
  • Profiles project creation.
  • Profiles runs: Create an ID Graph.
  • ID graph QA - Phase 1
  • Fix the over-stitching issues that happen in the first run.
  • ID graph QA - Phase 2: This includes a more detailed analysis and sample queries that you run to debug users who were incorrectly merged.
  • Run again to produce a healthy ID graph.
  • Feature creation: You will build a handful of features on the user entity defined in the tutorial.
  • Final run to produce a feature view, which is a view within the warehouse where each record is a unique user and each column is a defined feature or attribute on that user.
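To make the over-stitching issue mentioned in the list above more concrete, here is a toy sketch of identity stitching as connected components over pairs of identifiers. It is an illustration under stated assumptions only, not how Profiles implements its ID Stitcher: the stitch function, the identifier values, and the shared "kiosk-device" ID are all hypothetical.

```python
def stitch(edges):
    """Group identifiers into clusters: IDs linked directly or transitively share a cluster."""
    parent = {}

    def find(node):
        # Register unseen identifiers as their own cluster root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    # Each edge says "these two identifiers were seen together".
    for left, right in edges:
        parent[find(left)] = find(right)

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Two distinct (hypothetical) users, each linked to their own email.
clean_edges = [
    ("anon-1", "alice@example.com"),
    ("anon-2", "bob@example.com"),
]

# A hypothetical shared identifier seen with both users links them together.
noisy_edges = clean_edges + [
    ("anon-1", "kiosk-device"),
    ("anon-2", "kiosk-device"),
]

if __name__ == "__main__":
    print("clean:", stitch(clean_edges))
    print("noisy:", stitch(noisy_edges))
    # Excluding the shared identifier separates the users again.
    filtered = [edge for edge in noisy_edges if "kiosk-device" not in edge]
    print("filtered:", stitch(filtered))
```

In this sketch, a single identifier shared by unrelated users is enough to collapse them into one cluster, which is the kind of merge the QA phases ask you to look for.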

Get started with your own project

Once you exit the tutorial session and want to begin a new Profiles project with your own data, follow this Quickstart guide to get started.

FAQ

Is this tutorial powered by an LLM?

No, this tutorial is based on a simple static script meant as a step-by-step guide with some simple built-in validations.

What is the connection built between Profiles and my warehouse?

One of the first steps to creating a Profiles project is to create a site configuration (connection) YAML file that your Profiles project configuration references to run queries within your warehouse. This YAML file is created at the beginning of this tutorial.

You will enter your warehouse credentials within the session terminal and the tutorial then generates the siteconfig.yaml file locally within your user home directory in a hidden folder called .pb.
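If you want to check where that file ends up, the location described above can be resolved as follows. This is a minimal sketch based only on the description in this answer (a .pb folder in the user home directory); the siteconfig_path helper is hypothetical and not part of the Profiles CLI.

```python
from pathlib import Path


def siteconfig_path(home=None):
    """Return the expected location of siteconfig.yaml under the given home directory."""
    base = Path(home) if home is not None else Path.home()
    return base / ".pb" / "siteconfig.yaml"


if __name__ == "__main__":
    path = siteconfig_path()
    print(path, "exists" if path.is_file() else "not found")
```

Because the file holds warehouse credentials, avoid printing its contents or committing it to a repository.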

Can I use the same connection file while creating my own Profiles project?

Yes. The siteconfig.yaml file is stored locally in a .pb directory within your home directory (for example, /Users/<user_name> on macOS). The tutorial creates a connection block within the siteconfig.yaml file. You can use this block as a reference to create a new connection for developing your own project in your desired warehouse account, database, and schema.

Does the tutorial session seed my warehouse with data?

Yes, a part of the tutorial seeds your target schema (specified in the connection setup) with sample data for Secure Solutions, LLC. It adds the following three tables:

  1. PAGES
  2. TRACKS
  3. IDENTIFIES

These tables serve as the source data for the runs of the Profiles configuration built during the tutorial session.
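To confirm that the seed step populated these tables, you could run a row count against each one in your warehouse console. The sketch below only builds the query strings. The row_count_queries helper and the MY_TARGET_SCHEMA name are hypothetical, so substitute the target schema you entered during the connection setup.

```python
# Table names as listed in the answer above.
SEEDED_TABLES = ("PAGES", "TRACKS", "IDENTIFIES")


def row_count_queries(schema):
    """Build one row-count query per seeded table for the given target schema."""
    return [
        f"SELECT COUNT(*) AS row_count FROM {schema}.{table};"
        for table in SEEDED_TABLES
    ]


if __name__ == "__main__":
    # MY_TARGET_SCHEMA is a placeholder, not a name created by the tutorial.
    for query in row_count_queries("MY_TARGET_SCHEMA"):
        print(query)
```

Depending on your warehouse and session defaults, you may need to qualify the schema further, for example with a database or project name.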

Does the tutorial output any tables in my warehouse?

Yes. The seeded source data (see above) is used as input to the Profiles semantic models configured in the tutorial session.

The materialized tables and views that result from the runs are written to the same target schema that you configured in the site configuration (connection) setup step.


