YAML Reference for Defining a Data Catalog Project

Define a data catalog project using YAML files containing definitions of the catalog resources.
Available Plans
  • free
  • starter
  • growth
  • enterprise

This guide provides an overview of how to define a data catalog project using YAML files that contain the definitions of your data catalog resources.

Overview

In the context of the rudder-cli tool, a data catalog project typically consists of a root directory that contains all the project files. Within this root directory, each YAML file can contain definitions for resources of a particular type, for example, events, properties, and tracking plans.

The location and naming of these YAML files are flexible: you can store them anywhere within the project’s root directory or its subdirectories.

You can also group some resources of the same type in the same file, allowing structures that can best serve your project’s requirements. For example, you could have:

  • An events.yaml file in the project’s root directory that defines multiple events.
  • Another file, subdirectory/user-events.yaml, that defines additional events.

Note: The rudder-cli tool processes all valid YAML files within the project structure to recognize the defined resources.
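
As an illustration of the layout described above, the following sketch shows two separate files that both define events. The fields are explained in the Events section of this guide. The metadata.name values and the user_signed_up event are hypothetical, and the version string is copied from the Events example.

A possible events.yaml in the project’s root directory:

```yaml
version: "rudder/0.1"
kind: "events"
metadata:
  name: "core_events"          # hypothetical name
spec:
  events:
    - id: "product_viewed"
      display_name: "Product Viewed"
      event_type: "track"
    - id: "page"
      event_type: "page"
```

A possible subdirectory/user-events.yaml:

```yaml
version: "rudder/0.1"
kind: "events"
metadata:
  name: "user_events"          # hypothetical name
spec:
  events:
    - id: "user_signed_up"     # hypothetical event; IDs stay unique across both files
      display_name: "User Signed Up"
      event_type: "track"
```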

The following sections detail the specific YAML formats and parameter definitions for each data catalog resource type.

Events

You can define one or more events in the YAML file by setting kind to events.

The spec parameter of the YAML file has the following structure:

| Property | Type | Description |
| --- | --- | --- |
| events (Required) | Array of event definitions | An array of event definitions grouped together in the same file. |

Event definition

The structure of an event definition depends on the event type. All definitions share the common properties listed in the table below:

| Property | Type | Description |
| --- | --- | --- |
| id (Required) | String | Unique identifier for the event within the project. This parameter must be unique across all events. |
| event_type (Required) | String | Event type. Acceptable values are track, identify, page, screen, and group. |
| description | String | Event description. |

Track events (event_type: track) additionally require the following property:

| Property | Type | Description |
| --- | --- | --- |
| display_name (Required) | String | The track event name. This parameter corresponds to the event property of the RudderStack track event. |

Example

version: "rudder/0.1"
kind: "events"
metadata:
  name: "myeventgroup"
spec:
  events:
    - id: "product_viewed"
      display_name: "Product Viewed"
      event_type: "track"
      description: "This event is triggered every time a user views a product."
    - id: "added_to_cart"
      display_name: "Added To Cart"
      event_type: "track"
      description: "This event is triggered every time the user adds a product to their cart."
    - id: "identify"
      event_type: "identify"
      description: "An event that identifies the user."
    - id: "page"
      event_type: "page"

Properties

You can define one or more properties in the YAML file by setting kind to properties.

The spec parameter of the YAML file has the following structure:

| Property | Type | Description |
| --- | --- | --- |
| properties (Required) | Array of property definitions | An array of property definitions grouped together in the same file. |

Property definition

A property definition has the following structure:

| Property | Type | Description |
| --- | --- | --- |
| id (Required) | String | Unique identifier for the property within the project. This parameter must be unique across all properties in all the YAML files within the project. |
| display_name (Required) | String | This parameter corresponds to the field inside an event’s properties or traits JSON. |
| type (Required) | String | Property type. Acceptable values are: string, integer, number, object, array, boolean, and null. |
| description | String | Property description. |
| propConfig | Property config object | Additional validation rules for the property’s values. |

Property config

| Property | Type | Description |
| --- | --- | --- |
| minLength | Integer | Minimum length of the property’s string value. |
| maxLength | Integer | Maximum length of the property’s string value. |
| pattern | String | Regular expression that the property’s string values must match. |
| enum | Array of strings | List of all valid values for the property. |

Example

version: "rudder/v0.1"
kind: "properties"
metadata:
  name: "my_properties"
spec:
  properties:
    - id: "write_key"
      display_name: "Write Key"
      type: "string"
      description: "KSUID identifier for the source embedded in the SDKs."
      propConfig:
        minLength: 24
        maxLength: 48
    - id: "source_type"
      display_name: "Source Type"
      type: "string"
      description: "The source type."
      propConfig:
        enum:
        - "web"
        - "server"
        - "mobile"
        - "iot"
    - id: "source_name"
      display_name: "Source name"
      description: "Name of the source."
      type: "string"
      propConfig:
        minLength: 2
        maxLength: 255
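
The example above does not exercise the pattern rule. As a minimal sketch, the following hypothetical property (country_code and the geo_properties file name are not part of the original example) combines pattern with length limits. This guide does not state which regular expression syntax rudder-cli accepts, so the expression below sticks to constructs that most engines share. The version string is copied from the example above.

```yaml
version: "rudder/v0.1"
kind: "properties"
metadata:
  name: "geo_properties"            # hypothetical name
spec:
  properties:
    - id: "country_code"            # hypothetical property
      display_name: "country_code"
      type: "string"
      description: "Two-letter uppercase country code."
      propConfig:
        minLength: 2
        maxLength: 2
        pattern: "^[A-Z]{2}$"       # exactly two uppercase ASCII letters
```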

Tracking plans

You can define a tracking plan in the YAML file by setting kind to tp.

The spec parameter of the YAML file has the following structure:

| Property | Type | Description |
| --- | --- | --- |
| id (Required) | String | Unique identifier for the tracking plan within the project. This parameter must be unique across all the tracking plans in the project. |
| display_name (Required) | String | A readable short name for the tracking plan. |
| description | String | Tracking plan description. |
| rules (Required) | Array of rules definitions | Contains the list of events in the tracking plan along with the rules for their expected properties. |

Rules definition

| Property | Type | Description |
| --- | --- | --- |
| type (Required) | String | The rule type. The only acceptable value currently is event_rule. |
| id (Required) | String | Rule ID. |
| event (Required) | Rule event definition object | Event definition associated with the rule along with the validation rules for the tracking plan. |
| properties (Required) | Array of rule property definitions | List of properties associated with the rule’s event along with the validation rules for the tracking plan. |

Rule event definition

| Property | Type | Description |
| --- | --- | --- |
| $ref (Required) | String | Reference to an existing event definition. See Reference catalog resources for more information on how to work with references. |
| allow_unplanned | Boolean | Validation rule that checks if the event can have properties other than those defined in the rule’s properties section. Default value: false |
| identity_section (Required for non-track events) | String | Defines in which field of the corresponding RudderStack event payload the rule’s properties should be included. Acceptable values are: properties, traits, and context.traits. |

Rule property definition

| Property | Type | Description |
| --- | --- | --- |
| $ref (Required) | String | Reference to an existing property definition. See Reference catalog resources for more information on how to work with references. |
| required | Boolean | Validation rule that determines whether the property should always be present in the RudderStack event. Default value: false |

Example

version: "rudder/0.1"
kind: "tp"
metadata:
  name: "first_tracking_plan"
spec:
  id: "first_tracking_plan"
  display_name: "First Tracking Plan"
  description: "First tracking plan for the application."
  rules:
    - type: "event_rule"
      id: "rule_01"
      event:
        $ref: "#/events/web_events/source_created"
      properties:
        - $ref: "#/properties/additional_props/write_key"
          required: true
        - $ref: "#/properties/additional_props/source_type"
          required: false
        - $ref: "#/properties/additional_props/source_name"
          required: false
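
The example above references a single event and does not set identity_section, which the rule event definition requires for non-track events. The fragment below is a sketch of one additional entry under spec.rules for an identify event. It assumes the project also contains the events file from the Events example (metadata.name: myeventgroup) and the properties file from the Properties example (metadata.name: my_properties). The rule ID is hypothetical.

```yaml
# One additional entry under spec.rules of a tracking plan
- type: "event_rule"
  id: "rule_02"                       # hypothetical rule ID
  event:
    $ref: "#/events/myeventgroup/identify"
    allow_unplanned: true             # allow properties beyond those listed below
    identity_section: "traits"        # the rule's properties are expected under traits
  properties:
    - $ref: "#/properties/my_properties/source_name"
      required: true
```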

Reference catalog resources

Definitions in a YAML file can refer to definitions in other files by using resource reference ($ref) strings. This is useful when defining resources like tracking plans, which need to be associated with events and properties defined in other files.

References must always start with a # character, followed by a path that uses / as the delimiter. The path points to a unique resource within one of the project’s files and is expected to have the following components, in order:

  • kind value of the target resource’s file, for example, events or properties.
  • metadata.name value of the target resource’s file.
  • id value of the target resource.

For example, you can refer to an event with id set to example_id, defined in a file with metadata.name set to examples, as follows:

- $ref: "#/events/examples/example_id"
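
To make the three path components concrete, the sketch below shows a hypothetical target file that this reference would match. The comments mark where each component comes from; the event’s display_name and event_type are invented for illustration.

```yaml
version: "rudder/0.1"
kind: "events"              # first component: events
metadata:
  name: "examples"          # second component: examples
spec:
  events:
    - id: "example_id"      # third component: example_id
      display_name: "Example Event"
      event_type: "track"
```

Read left to right, the reference "#/events/examples/example_id" narrows from the resource kind to the file’s metadata.name and finally to the individual resource id.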
