Capture Webform Data and Avoid Script Blockers

Why do you need this?

It’s not uncommon for users on your website to run browser extensions or other third-party tools that prevent client-side JavaScript from firing. In these scenarios, client-side form data sent via RudderStack Track and Identify calls could be blocked (along with any other client-side data you’re collecting). This is especially challenging with modern JAMStack websites that are deployed as static documents or when using third-party embedded forms from tools like Marketo or HubSpot.

As data engineers, part of our job is to ensure delivery of key data under any condition, so when it comes to things like lead forms on a marketing site, problems could cause major pain for marketing and sales.

Data engineers using RudderStack leverage a simple process to ensure that key events make it through even if client-side data is blocked. In this example, we’ll walk through our own use of Basin, a form endpoint, and RudderStack webhooks.

When the form is submitted to Basin, the data is sent as encoded form fields in a standard HTTP GET or POST request made by the browser itself, so it does not depend on client-side JavaScript and is not susceptible to client-side blockers. Once received, Basin immediately forwards the entire contents of the form, along with some additional metadata, to a webhook source in RudderStack. RudderStack can then route those submissions to a handful of downstream tools like Customer.io, Salesforce, Slack and Snowflake.
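To make the encoding step concrete, here is a minimal sketch of how form fields are serialized for a standard form submission. The field names and values are made up for illustration, and `encodeFormBody` is a hypothetical helper, not part of Basin or RudderStack.

```javascript
// Sketch: serialize form fields the way a browser does for a standard
// (non-JavaScript) form submission. Field names and values are invented.
function encodeFormBody(fields) {
  // URLSearchParams applies application/x-www-form-urlencoded rules
  return new URLSearchParams(fields).toString();
}

const body = encodeFormBody({ email: 'jane@example.com', firstName: 'Jane' });
console.log(body); // email=jane%40example.com&firstName=Jane
```

The browser builds this body natively when the form is submitted, which is why script blockers do not interfere with it.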

It’s good to have redundancy for key data

Client-side tracking works really well for most visitors to your website, so some might argue that an entire additional pipeline is unnecessary. For key data, though, redundancy is good. For many companies, marketing leads are the lifeblood of the business. Also, the ease of implementation and the cost of the tooling make it a no-brainer to ensure no leads slip through the cracks.

Here’s a step-by-step guide to creating this data flow.

Step 1: Create Your Webhook Source & Destinations

Check out the RudderStack Documentation for specifics, but it’s as simple as creating a new source in RudderStack and selecting the Webhook option under Event Streams.

After you give it a name, you can find the specific URL for your webhook on the settings tab:

The Dataplane URL can be found at the top of your main Connections page.

Once you have the URL for your webhook, record it. You will need to enter it in the Basin setup in Step 3.

When connecting downstream destinations for your webhook, it’s important to consider what types of payloads you will be receiving from your various sources. Webhook sources have no filters, so the data you receive may not be in the right format for your destinations when it arrives. User Transformations are a great tool for filtering out unnecessary events and modifying payloads into the correct format for each specific destination.
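As a rough sketch of the filtering idea, the transformation below drops submissions that have no email address and normalizes the ones that do. The `email` field name and the rule that downstream tools need it are assumptions for illustration. In RudderStack you would declare the function with `export`; it is written as a plain function here so it can be run locally.

```javascript
// Sketch of a filtering User Transformation (assumed field name: email).
// In RudderStack, declare this with `export`.
function transformEvent(event, metadata) {
  const props = event.properties || {};
  // Returning nothing drops the event, so it never reaches the destination
  if (!props.email) return;
  // Trim and lowercase so downstream tools see a consistent value
  props.email = String(props.email).trim().toLowerCase();
  event.properties = props;
  return event;
}
```

For example, an event whose `properties.email` is `' Jane@Example.com '` would be passed along with the value `'jane@example.com'`, while an event with no email would be dropped.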

Step 2: Set Up Your Basin Form

You can follow the steps in the Basin Documentation to create an account and create your first endpoint. Your endpoint will produce a URL that you will want to paste into the action attribute of your website’s HTML form. The form itself is a standard HTML form: give each input a name attribute, since those names become the field names in the submitted data.

Step 3: Create Basin Webhook

With your Basin endpoint created and configured to receive your submitted form data, the next step is to configure Basin to forward the submitted form data to your new RudderStack Webhook source. From the top menu, select the Integrations option:

Scroll to the bottom of the screen and paste the URL for your RudderStack Webhook source from Step 1 into the URL field. Select JSON as the Payload format:


Click the Save Changes button and you are ready to start receiving events.
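If you want to check the webhook source before wiring up a real form, one option is to send it a hand-built JSON request. This is only a sketch: `WEBHOOK_URL` is a placeholder for the URL you recorded in Step 1, the field names are invented, and the payload is not meant to match the exact shape Basin sends.

```javascript
// Sketch: build a test POST that mimics a JSON webhook delivery.
// WEBHOOK_URL is a placeholder; use the URL recorded in Step 1.
const WEBHOOK_URL = 'https://example.com/your-rudderstack-webhook-url';

function buildTestRequest(fields) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(fields),
  };
}

// Requires a runtime with a global fetch (for example Node.js 18+).
async function sendTestSubmission(fields) {
  const response = await fetch(WEBHOOK_URL, buildTestRequest(fields));
  return response.status;
}
```

Calling `sendTestSubmission({ email: 'jane@example.com' })` resolves to the HTTP status code of the response, and the event should then appear in the source's live events if the request was accepted.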


Step 4: User Transformations

Now that you are ready to start receiving events via your Webhook, you may find it useful to create a User Transformation to improve the quality of the payload. Common changes include switching the event type from a Track call (all webhook source events are Track calls) to an Identify call, and renaming the event, since all webhook events are passed with the same generic event name.

Change the Payload to an Identify Call

You can call the metadata function on the inbound event to read the source ID for the webhook, then filter or transform the event based on it.

JAVASCRIPT
export async function transformEvent(event, metadata) {
  // Replace 'Your Source ID' with the ID of your webhook source
  // (you can look it up with the Data Governance API)
  if (metadata(event).sourceId === 'Your Source ID') {
    event.type = 'identify';
    // Map the submitted form fields to Identify traits
    let traits = {
      email: event.properties.email,
      firstName: event.properties.firstName,
      lastName: event.properties.lastName,
      description: event.properties.message
    }
    // Only add optional fields when the form supplied them
    if (event.properties.company) { traits.company = event.properties.company }
    if (event.properties.jobTitle) { traits.title = event.properties.jobTitle }
    // leadSources is a helper defined elsewhere (not shown here) that
    // returns extra traits for the given form ID
    traits = Object.assign(traits, leadSources(event.properties.form_id))
    event.context = { traits: traits };
    delete event.properties;
    return event;
  }
  // Drop events from any other source
  return;
}
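The transformation above calls a `leadSources` helper whose definition is not shown. A hypothetical version could be a simple lookup from form ID to lead-source traits. The form IDs and trait names below are invented for illustration and are not taken from our actual setup.

```javascript
// Hypothetical helper: maps a form ID to lead-source traits.
// The form IDs and trait names below are invented for illustration.
const LEAD_SOURCES = {
  'contact-form': { leadSource: 'Website', leadSourceDetail: 'Contact Us' },
  'demo-form': { leadSource: 'Website', leadSourceDetail: 'Demo Request' },
};

function leadSources(formId) {
  // Unknown form IDs return an empty object so Object.assign adds nothing
  return Object.prototype.hasOwnProperty.call(LEAD_SOURCES, formId)
    ? LEAD_SOURCES[formId]
    : {};
}
```

Returning an empty object for unknown IDs keeps the `Object.assign` call in the transformation safe even when a new form is added before the mapping is updated.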

Update the Event.Name to the Webhook Source

By default, all events received through webhook sources are track calls and all have the same name, “WEBHOOK_SOURCE_EVENT”. This is fine if you only have one webhook source, but if you have more than one, all of the events will be inserted into the same webhook_source_event table in your data warehouse. This can be confusing, especially if the sources don’t have differentiating payloads.

We could use a solution like the one we used for the Identify call above and just rename the event based on a hard-coded mapping. Instead, we decided to create a RudderStack Transformation Library that calls the Data Governance API, so we can use the source ID in our event metadata to look up the source name and rename the event with it.

In this example, we created a library function to call the Data Governance API itself. Before you get started, we would encourage you to check out our Data Governance API Docs.

Calling the Data Governance API:

JAVASCRIPT
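export async function getSourceName(encodedWorkspaceToken, sourceId) {
  // Fetch the workspace configuration, which lists every source.
  // 'Og==' is the Base64 encoding of ':', the separator that ends the
  // username part of a Basic auth credential.
  const resp = await fetch(
    'https://api.rudderlabs.com/workspaceConfig',
    { headers: { Authorization: `Basic ${encodedWorkspaceToken}Og==` } }
  )
  // Find the source whose ID matches and return its display name
  const sourceDisplayName = resp.sources.find(source => source.id === sourceId).name
  return sourceDisplayName
}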

With our library created, we can now create our transformation or add it to an existing one.

You will need to retrieve the workspace token for your instance of RudderStack, which can be found on the main Connections screen. It will also need to be Base64-encoded. You could do this with a second library or, since the token remains static in this case, you can encode it once using a site like https://www.base64encode.net/.
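If you would rather encode the token on your own machine than on a website, a minimal sketch using Node.js is shown below. `encodeToken` is a hypothetical helper and `'abc'` stands in for a real token. It is meant to be run locally, not inside a transformation.

```javascript
// Sketch: Base64-encode a workspace token locally with Node.js.
// 'abc' is a stand-in for a real token.
function encodeToken(token) {
  return Buffer.from(token, 'utf8').toString('base64');
}

console.log(encodeToken('abc')); // YWJj
```

Note that the library above appends `Og==` (the encoding of `:`) to whatever you pass in. That concatenation only equals the encoding of `token:` when the encoded token has no `=` padding, so if your encoded token ends in `=`, it may be safer to encode `token:` as a whole and drop the `Og==` suffix.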

JAVASCRIPT
import { getSourceName } from 'getSourceDisplayName'
import { base64 } from 'base64'

export async function transformEvent(event, metadata) {
  // Enter your workspace token below. It is Base64-encoded by the base64
  // library call further down; if you paste a pre-encoded token instead,
  // pass WORKSPACE_TOKEN to getSourceName directly.
  const WORKSPACE_TOKEN = 'YOUR WORKSPACE TOKEN'
  // Get the ID of the source whose display name you want
  const SOURCE_ID = metadata(event).sourceId
  const SOURCE_NAME = await getSourceName(base64(WORKSPACE_TOKEN), SOURCE_ID)
  // Rename the event to the source's display name
  event.event = SOURCE_NAME
  return event;
}

You can see that in a few steps we have retrieved the Source Name for the Webhook sample created above. This could also be passed as a property if you did want to send all webhook events to the same table in your warehouse but needed a way to differentiate them.
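A minimal sketch of that property-based variant is below. It assumes the source name has already been looked up, and `source_name` is an assumed property name rather than anything RudderStack requires.

```javascript
// Sketch: keep the generic event name and attach the source name as a
// property instead. `source_name` is an assumed property name.
function tagWithSourceName(event, sourceName) {
  event.properties = Object.assign({}, event.properties, { source_name: sourceName });
  return event;
}
```

Inside the transformation, you would call `tagWithSourceName(event, SOURCE_NAME)` in place of the `event.event = SOURCE_NAME` assignment, which leaves every event in the same warehouse table but gives you a column to filter on.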


Alternatives to the Webhook Solution

Yes, there are other ways to solve the problem of client-side scripts being blocked. The most straightforward is to host your own data plane and call the SDKs from behind your own firewall.

The real takeaway here is that RudderStack is focused on delivering flexible products that support engineers. To learn more about how RudderStack can support your data stack, check out our video library, or sign up for free to test drive the product today.