danger

You are viewing documentation for an older version.

Cohorts

Create core customer segments in your warehouse and use them for targeted campaigns.

A cohort is a subset of entity instances that meet a specified set of characteristics, behaviors, or attributes. An entity is a digital representation of a class of distinct real-world objects for which you can create a profile. For example, if you have user as an entity, you can define cohorts like known users, new users, or North American users.

Using RudderStack Profiles, you can create cohorts for your entities and use them to run targeted campaigns and analysis on specific user segments.

Define cohorts

Profiles lets you define a cohort as a model in your profiles.yaml file by setting the model_type field to entity_cohort. There are two kinds of cohorts:

  • Default cohort: When you define an entity, a default cohort <entity>/all (user/all for the user entity) is created automatically. It contains the set of all instances of that entity. Any other cohort you define for that entity is derived from it.
  • Derived cohort: When you define a cohort based on a pre-existing cohort (base cohort), it becomes a derived cohort. A derived cohort inherits the features of the base cohort. You can filter out the member instances of the base cohort based on a set of characteristics, behaviors, or attributes for the derived cohort. You must specify the base cohort in the derived cohort’s definition using the extends field.

For example, known_users is a cohort derived from the base cohort user/all (set of all users), whereas known_mobile_users is derived from its base cohort known_users.

When you run a Profiles project that includes cohorts, the output of each cohort is stored in a table or view with the same name as the cohort.
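
The known_users and known_mobile_users chain described above could be sketched roughly as follows. This is an illustration under assumptions, not a definitive configuration: the entity var is_mobile_user is hypothetical, and id_type_email_count is assumed to be defined on the user entity, so replace both with vars that exist in your project.

```yaml
models:
  # Derived from the default cohort of the user entity.
  - name: known_users
    model_type: entity_cohort
    model_spec:
      extends: user/all
      materialization:
        output_type: table
      filter_pipeline:
        - type: exclude
          # Assumption: id_type_email_count is an entity var defined on user.
          value: "{{ user.Var('id_type_email_count') }} = 0"
  # Derived from known_users, so it starts from that cohort's members and features.
  - name: known_mobile_users
    model_type: entity_cohort
    model_spec:
      extends: models/known_users
      materialization:
        output_type: view
      filter_pipeline:
        - type: include
          # Hypothetical entity var, shown only for illustration.
          value: "{{ user.Var('is_mobile_user') }} = 1"
```

With a definition like this, a project run would be expected to produce a known_users table and a known_mobile_users view.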

Sample cohort

You can apply filters using the include/exclude clauses to specify a boolean expression over any of the entity vars defined on the base cohort or its ancestors. Some features are relevant only to specific cohorts. For example, an SSN feature may apply only to American users.

Example 1: The following profiles.yaml snippet defines a cohort knownUsUsers that includes users from the US who have a linked email address.

models:
  - name: knownUsUsers
    model_type: entity_cohort
    model_spec:
      extends: user/all
      materialization:
        output_type: table
      filter_pipeline:
        - type: exclude
          # exclude users which don't have any linked email.
          value: "{{ user.Var('id_type_email_count') }} = 0"
        - type: include
          # include users with country US.
          value: "{{ user.Var('country') }} = 'US'"

Here, the extends keyword specifies the base cohort user/all. You can also specify the path of a custom-defined base cohort, if applicable.

Example 2: The following snippet derives the us_credit_card_users cohort from the knownUsUsers base cohort. It keeps only the known US users who possess a credit card. The extends field specifies the path of the base cohort, which is models/knownUsUsers.

  - name: us_credit_card_users
    model_type: entity_cohort
    model_spec:
      extends: models/knownUsUsers
      materialization:
        output_type: view
      filter_pipeline:
        - type: include
          value: "{{ user.Var('has_credit_card') }} = 1"

Associate features with cohort

You can also scope a var_group to a cohort instead of an entire entity. The resulting 360-degree view of the cohort then combines only the features relevant to it.

To do so, associate features with a cohort by specifying the entity_cohort key and passing the cohort’s path to it within a var_group, as shown:

var_groups:
  - name: known_us_users_vars
    entity_cohort: models/knownUsUsers
    vars:
      - entity_var:
          name: has_credit_card
          from: inputs/rsIdentifies
          select: first_value(has_credit_card)
          where: has_credit_card is not null
          default: false
  - name: user_vars
    entity_key: user
    vars:
      - entity_var:
          name: max_timestamp
          select: max(timestamp)
          from: inputs/rsIdentifies

To apply features to the entire user entity instead, set entity_key in the var_group, as the user_vars group does.

info
In a var_group, you can use either entity_key or entity_cohort but not both. Setting entity_key as user is equivalent to setting entity_cohort as user/all.
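
To illustrate the equivalence noted above, the user_vars group from the earlier example could also be written with entity_cohort pointing at the default cohort. This is a minimal sketch that reuses the names from that example; it assumes the inputs/rsIdentifies input exists in your project.

```yaml
var_groups:
  - name: user_vars
    # Per the note above, this is equivalent to `entity_key: user`.
    # Use one key or the other in a var_group, never both.
    entity_cohort: user/all
    vars:
      - entity_var:
          name: max_timestamp
          select: max(timestamp)
          from: inputs/rsIdentifies
```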

Feature view of cohort

You can establish a holistic 360 feature view of a cohort within its definition. This view consolidates all the features associated with the specified identifiers, providing a complete overview of the cohort.

The following example shows how to define a feature view for the knownUsUsers cohort:

models:
  - name: knownUsUsers
    model_type: entity_cohort
    model_spec:
      extends: user/all
      materialization:
        output_type: table
      filter_pipeline:
        - type: exclude
          # exclude users which don't have any linked email.
          value: "{{ user.Var('id_type_email_count') }} = 0"
        - type: include
          # include users with country US.
          value: "{{ user.Var('country') }} = 'US'"
      # to define a 360 feature view of knownUsUsers cohort [optional]
      feature_views:
        # view with entity's `main_id` as identifier
        name: known_us_users_feature_view
        using_ids:
          - id: email
            # view with `email` as identifier
            name: us_users_with_email

Here, the known_us_users_feature_view view contains all the features of the knownUsUsers cohort and uses main_id as the identifier. A second view, us_users_with_email, contains the same features but uses email as the identifier (specified in the using_ids field).

Use cohorts

Once you have defined cohorts in your profiles.yaml file, you can run your project in either of the following ways:

Profiles CLI

Run your Profiles CLI project using the pb run command to generate output tables.

Profiles UI

To view cohorts in the RudderStack dashboard, you can make your Profiles CLI project available in a Git repository and import it in the RudderStack dashboard. See Import Profiles Project from Git for more information.

Once imported, you can run your project by navigating to the History tab and clicking Run. After a successful run of the project, you can view the output for cohorts in the Entities tab of the project.

info
Contact the Profiles support team in RudderStack’s Community Slack if you are unable to see the Entities tab.

You can further activate your cohort data by syncing it to downstream destinations. See Activations for more information.

