YAML schema reference for defining a Data Graph with the Rudder CLI — entities, events, and relationships.
Available Plans: Growth, Enterprise
This reference documents the YAML schema for defining a Data Graph with the Rudder CLI. Use it alongside the CLI to author, version-control, and sync data graph definitions as code.
File structure
A data graph YAML file has the following top-level structure:
| Field | Type | Description |
|---|---|---|
| spec.id (Required) | String | Unique ID for the data graph. Used as its stable identifier across syncs. |
| spec.account_id (Required) | String | The ID of the warehouse account the data graph reads from. |
| spec.models (Required) | List | List of entity and event models that make up the data graph. See Models for more information. |
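As a rough sketch of how these fields fit together, the skeleton below is assembled from the complete example later on this page. The version, kind, and metadata.name keys are taken from that example rather than from the field list above, and the angle-bracket values are placeholders.

```yaml
version: "rudder/v1"
kind: "data-graph"
metadata:
  name: "<data-graph-name>"             # placeholder
spec:
  id: "<data-graph-id>"                 # stable identifier across syncs
  account_id: "<warehouse-account-id>"  # warehouse account the graph reads from
  models:                               # entity and event models
    - id: "customers"
      display_name: "Customers"
      type: "entity"
      table: "ECOMMERCE_DB.E_MART.DIM_CUSTOMERS"
      primary_id: "CUSTOMER_KEY"
```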
Models
The spec.models list contains all the entities and events the data graph exposes to the Audience Builder. Each model points at a warehouse table and optionally declares relationships to other models.
Model fields
| Field | Type | Description |
|---|---|---|
| id (Required) | String | Unique ID for the model within this data graph. Used as the target of relationships (see Relationships). |
| display_name (Required) | String | Name shown in the Audience Builder UI (for example, Customers, Sales). |
| type (Required) | String | Either entity (dimension-style table) or event (timestamped fact table). |
| table (Required) | String | Fully qualified warehouse table name, for example, ECOMMERCE_DB.E_MART.DIM_CUSTOMERS. |
| description (Optional) | String | Human-readable description of the model. Shown as a tooltip in the Audience Builder. |
| primary_id (Conditional) | String | Column that uniquely identifies a row in the table. Required when type: entity; optional for events. |
| timestamp (Conditional) | String | Column holding the event timestamp. Required when type: event; optional for entities. Used for time-window filtering in the Audience Builder. |
| relationships (Optional) | List | List of relationships this model has to other models. See Relationships for more information. |
| columns (Optional) | List | Per-column overrides that give warehouse columns a marketer-friendly alias (display_name) and an optional description, surfaced in the Audience Builder. See Column metadata. |
Entity: A dimension-like table representing a business object (Customers, Products, Stores). Use type: entity and set primary_id.
Event: A fact-like table where each row represents something that happened at a point in time (Sales, Customer Interactions, Loyalty Points). Use type: event and set timestamp. Events can be filtered with a time window in the Audience Builder.
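To make the distinction concrete, here is a hedged sketch of one entity and one event model side by side. The table and column names are borrowed from the e-commerce example on this page; substitute your own warehouse objects.

```yaml
models:
  # Entity: one row per business object, identified by primary_id.
  - id: "customers"
    display_name: "Customers"
    type: "entity"
    table: "ECOMMERCE_DB.E_MART.DIM_CUSTOMERS"
    primary_id: "CUSTOMER_KEY"

  # Event: one row per occurrence; timestamp names the column used
  # for time-window filtering in the Audience Builder.
  - id: "sales"
    display_name: "Sales"
    type: "event"
    table: "ECOMMERCE_DB.E_MART.FACT_SALES"
    timestamp: "CREATED_AT"
```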
Relationships
Relationships connect two models so marketers can filter one model using conditions on related records (for example, “customers with 3 or more orders”). Relationships are declared on the source model under its relationships list.
Relationship fields
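| Field | Type | Description |
|---|---|---|
| id (Required) | String | Unique ID for the relationship within the source model. |
| display_name (Required) | String | Name shown in the Audience Builder UI (for example, Has Sales, Belongs To Account). |
| cardinality | String | Cardinality of the relationship. The examples on this page use one-to-one, one-to-many, and many-to-one. The value must be valid for the source and target model types (see Validation rules). |
| target (Required) | String | Reference to the target model in the form #data-graph-model:<model-id>. |
| source_join_key (Required) | String | Column on the source model used in the join. |
| target_join_key (Required) | String | Column on the target model used in the join. |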
Target reference format
Relationship targets use the #data-graph-model:<model-id> reference format, where <model-id> is the id of another model in the same data graph. For example:
```yaml
target: "#data-graph-model:sales"
```
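The sketch below shows how the reference resolves: the part after #data-graph-model: should match the id of another model in the same file. The models and join keys are taken from the e-commerce example on this page.

```yaml
models:
  - id: "customers"
    display_name: "Customers"
    type: "entity"
    table: "ECOMMERCE_DB.E_MART.DIM_CUSTOMERS"
    primary_id: "CUSTOMER_KEY"
    relationships:
      - id: "customer-has-sales"
        display_name: "Has Sales"
        cardinality: "one-to-many"
        target: "#data-graph-model:sales"   # points at the model with id "sales"
        source_join_key: "CUSTOMER_KEY"     # column on customers (source)
        target_join_key: "CUSTOMER_KEY"     # column on sales (target)

  - id: "sales"                             # referenced by the target above
    display_name: "Sales"
    type: "event"
    table: "ECOMMERCE_DB.E_MART.FACT_SALES"
    timestamp: "CREATED_AT"
```

The quotes around the reference matter: in YAML, an unquoted # preceded by whitespace starts a comment, so an unquoted reference would be read as an empty value.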
Column metadata
By default, the Audience Builder shows the raw warehouse column names (for example, EMAIL_ADDRESS or CREATED_TS). Use the optional columns block on a model to give specific columns a marketer-friendly alias (display_name) and an optional description. Both surface when building audiences and expressions, making the underlying warehouse columns easier to read and choose.
The columns block is sparse — list only the columns you want to override. Columns you don’t list keep their raw warehouse names.
```yaml
models:
  - id: "customers"
    display_name: "Customers"
    type: "entity"
    table: "ECOMMERCE_DB.E_MART.DIM_CUSTOMERS"
    primary_id: "CUSTOMER_KEY"
    columns:
      - name: "EMAIL_ADDRESS"        # Warehouse column name (must match the table).
        display_name: "Email"        # Friendly name shown in the Audience Builder.
        description: "Primary contact email"
      - name: "CUSTOMER_KEY"
        display_name: "Customer ID"  # Alias only, no description.
      - name: "LOYALTY_NOTES"
        description: "Free-form loyalty notes"  # Description only, no alias.
```
Column fields
| Field | Type | Description |
|---|---|---|
| name (Required) | String | Warehouse column name. Must match a column in the model’s table. |
| display_name (Conditional) | String | Friendly name shown in the Audience Builder instead of the raw column name. Required unless description is set. Maximum 255 characters. Should be unique within the model, compared case-insensitively. |
| description (Conditional) | String | Human-readable note shown alongside the column in the Audience Builder. Required unless display_name is set. Maximum 255 characters. |
Note that:
- Each columns entry must set at least one of display_name or description.
- To clear one field while keeping the other, omit the field you want to clear from the entry.
- To remove all metadata for a column, drop its entry. The next apply clears it, because the columns block is the source of truth.
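As an illustration of these rules, suppose an earlier version of a model set both fields on EMAIL_ADDRESS and an alias on LOYALTY_TIER. The sketch below is one possible follow-up revision of that model's columns block. The column names come from the examples on this page, and the effects described in the comments are simply the rules above applied to them.

```yaml
columns:
  # description omitted: the next apply clears it and keeps the alias.
  - name: "EMAIL_ADDRESS"
    display_name: "Email"
  # No entry for LOYALTY_TIER: the next apply removes its alias, so the
  # Audience Builder falls back to the raw warehouse column name.
```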
Complete example
The following example defines a small e-commerce data graph with two entities (Customers, Accounts), one event (Sales), and the relationships between them:
```yaml
version: "rudder/v1"
kind: "data-graph"
metadata:
  name: "ecommerce-data-graph"
spec:
  id: "ecommerce-data-graph"
  # RudderStack generates this ID when you connect a warehouse to your RudderStack workspace.
  account_id: "<warehouse-account-id>"
  models:
    # --- Customers (entity) ---
    - id: "customers"
      display_name: "Customers"
      type: "entity"
      table: "ECOMMERCE_DB.E_MART.DIM_CUSTOMERS"
      description: "Customers with demographics and loyalty info"
      primary_id: "CUSTOMER_KEY"
      columns:
        - name: "EMAIL_ADDRESS"
          display_name: "Email"
          description: "Primary contact email"
        - name: "LOYALTY_TIER"
          display_name: "Loyalty Tier"
      relationships:
        - id: "customer-has-sales"
          display_name: "Has Sales"
          cardinality: "one-to-many"
          target: "#data-graph-model:sales"
          source_join_key: "CUSTOMER_KEY"
          target_join_key: "CUSTOMER_KEY"
        - id: "customer-belongs-to-account"
          display_name: "Belongs To Account"
          cardinality: "many-to-one"
          target: "#data-graph-model:accounts"
          source_join_key: "ACCOUNT_KEY"
          target_join_key: "ACCOUNT_KEY"

    # --- Accounts (entity) ---
    - id: "accounts"
      display_name: "Accounts"
      type: "entity"
      table: "ECOMMERCE_DB.E_MART.DIM_ACCOUNTS"
      description: "Customer account records for individual, household, and corporate grouping"
      primary_id: "ACCOUNT_KEY"

    # --- Sales (event) ---
    - id: "sales"
      display_name: "Sales"
      type: "event"
      table: "ECOMMERCE_DB.E_MART.FACT_SALES"
      description: "Sales transactions with amounts, status, and store/channel links"
      timestamp: "CREATED_AT"
```
Validate the data graph
Validate your data graph YAML before syncing it to your workspace:
rudder-cli validate -l data-graph.yaml
This command returns validation errors and warnings if the YAML is invalid.
Validation rules
Relationship cardinality must be valid for the source and target model types
A-to-B and B-to-A relationships are distinct and both allowed
spec.yaml
```yaml
version: rudder/v1
kind: data-graph
metadata:
  name: my-data-graph
spec:
  id: my-data-graph
  account_id: wh-account-123
  models:
    - id: user
      display_name: User
      type: entity
      table: db.schema.users
      primary_id: user_id
      relationships:
        - id: user-to-account
          display_name: User to Account
          cardinality: one-to-one
          target: "#data-graph-model:account"
          source_join_key: account_id
          target_join_key: account_id
    - id: account
      display_name: Account
      type: entity
      table: db.schema.accounts
      primary_id: account_id
      relationships:
        - id: account-to-user
          display_name: Account to User
          cardinality: one-to-one
          target: "#data-graph-model:user"
          source_join_key: account_id
          target_join_key: account_id
```
Duplicate source-target model pairs are not allowed
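A hedged illustration of what this rule appears to reject, assuming it applies to two relationships on the same source model that point at the same target model. The second relationship and its REFUND_CUSTOMER_KEY join column are hypothetical and exist only to show the duplicate pair; the rest is borrowed from the e-commerce example on this page.

```yaml
models:
  - id: "customers"
    display_name: "Customers"
    type: "entity"
    table: "ECOMMERCE_DB.E_MART.DIM_CUSTOMERS"
    primary_id: "CUSTOMER_KEY"
    relationships:
      - id: "customer-has-sales"
        display_name: "Has Sales"
        cardinality: "one-to-many"
        target: "#data-graph-model:sales"
        source_join_key: "CUSTOMER_KEY"
        target_join_key: "CUSTOMER_KEY"
      # Hypothetical duplicate: same source (customers), same target (sales).
      - id: "customer-has-refunds"
        display_name: "Has Refunds"
        cardinality: "one-to-many"
        target: "#data-graph-model:sales"
        source_join_key: "CUSTOMER_KEY"
        target_join_key: "REFUND_CUSTOMER_KEY"   # hypothetical column

  - id: "sales"
    display_name: "Sales"
    type: "event"
    table: "ECOMMERCE_DB.E_MART.FACT_SALES"
    timestamp: "CREATED_AT"
```

Under that assumption, validation would be expected to flag the second relationship, whereas a relationship declared on sales that targets customers would still be allowed as a distinct B-to-A pair.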