How to load data from Intercom to PostgreSQL
Extract your data from Intercom
Intercom exposes a rich REST API for interacting with its services. There are two main reasons to use the Intercom REST API:
- To pull data from it and use it for analytic purposes
- To push data into it, enriching the information it holds about customers and enhancing its service
In this post, we’ll focus mainly on pulling data from the API and using it to enrich our data warehouse with data generated by our interactions with our customers.
The main entities of the Intercom API are the following:
- Users – the primary objects you interact with in Intercom.
- Leads – represent logged-out users of your application.
- Companies – allow you to represent commercial organizations using your product.
- Tags – A tag allows you to label your users and companies and list them using that tag.
- Segments – A segment is a group of your users defined by the rules that you set.
- Notes – Notes allow you to annotate and comment on your users.
- Events – Events are how you can submit user activity to Intercom.
- Counts – You can get counts of users and companies filtered by certain criteria.
- Conversations – Conversations are how you can communicate with users in Intercom.
At this point, it has to be noted that not all of the above entities can be pulled from the Intercom API. For example, Events can only be pushed into Intercom; it’s not possible to extract them again. So if you plan to also use Intercom for tracking the behavior of your users, keep that in mind: contrary to services like Mixpanel, it’s not possible to pull user events out of the system.
A good strategy for ensuring that your user activity is pushed to all the different services you need, and that you always have access to the data, is to use a service like Segment.
Intercom exposes a RESTful web API, which means that we interact with its resources through HTTP verbs like POST and GET by using an HTTP client. Intercom also offers a number of SDKs built around an HTTP client for several popular languages.
Pull Data from the Intercom REST API
A typical use case for pulling data out of Intercom is to fetch all your users together with all the conversations you have had with each one. You can then load this information into your data warehouse and enrich your analytics with the interactions you have had with your customers. To do that, you first need to get all your users. With cURL, you can do that in the following way:
SH
curl https://api.intercom.io/users \
  -u pi3243fa:da39a3ee5e6b4b0d3255bfef95601890afd80709 \
  -H 'Accept: application/json'
A typical result will look like this:
JSON
{"type": "user.list","total_count": 105,"users": [{"type": "user","id": "530370b477ad7120001d",...},...],"pages": {"next": "https://api.intercom.io/users?per_page=50&page=2","page": 1,"per_page": 50,"total_pages": 3}}
Note the pages object in the response: results are paginated, so you need to follow the next URL until every page has been fetched. Next, we can extract a full list of the conversations that have taken place on Intercom by doing the following:
SH
$ curl 'https://api.intercom.io/conversations?type=admin&admin_id=25&open=true' \
  -u pi3243fa:da39a3ee5e6b4b0d3255bfef95601890afd80709 \
  -H 'Accept: application/json'
and a typical result will look like the following:
JSON
{"type": "conversation.list","conversations": [{"type": "conversation","id": "147","created_at": 1400850973,"updated_at": 1400857494,"user": {"type": "user","id": "536e564f316c83104c000020"},"assignee": {"type": "admin","id": "25"},"conversation_message": {"type": "conversation_message","subject": "","body": "<p>Hi Alice,</p>nn<p>We noticed you using our Product, do you have any questions?</p> n<p>- Jane</p>","author": {"type": "admin","id": "25"},"attachments": [{"name": "signature","url": "http://someurl.com/signature.jpg"}]}}]}
As we can see, each conversation contains a user object with an id, which lets us associate each conversation with the user we extracted earlier. Of course, in order to do that in our data warehouse repository, we need to map the above structures to the data model the repository follows, respecting both the schema and the data types.
Then we can write a pipeline that extracts the data, transforms it into the model of our repository, and loads it by following the instructions below. Of course, if something changes in the Intercom API, the pipeline will break, and we will have to maintain it.
Use Webhooks to Push Events from Intercom to PostgreSQL
Intercom also supports webhooks: you register for certain events, and the system pushes a message to your endpoint whenever those events are triggered. So, for example, you might define a webhook that pushes a message every time a new conversation takes place with one of your customers.
By doing this, it is possible to create a near real-time streaming load process for your data warehouse. To implement something like that, though, you need to take into account the limitations of both ends while ensuring the delivery semantics that your use case requires from the data management infrastructure you will build.
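One common pattern is to land each incoming webhook payload in a staging table as raw JSON and transform it later. Here is a minimal sketch of such a table; the table and column names are illustrative assumptions, not part of the Intercom API:
SQL
-- Hypothetical staging table for raw Intercom webhook payloads.
CREATE TABLE intercom_webhook_events (
    id          BIGSERIAL PRIMARY KEY,
    topic       TEXT NOT NULL,                      -- event topic the webhook fired for
    payload     JSONB NOT NULL,                     -- raw notification body as received
    received_at TIMESTAMPTZ NOT NULL DEFAULT now()  -- when our endpoint stored it
);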
For more information, you can check the webhooks section on the Intercom API documentation.
Data Preparation for Intercom to PostgreSQL
To populate a PostgreSQL database instance with data, you first need a well-defined data model or schema that describes the data. As a relational database, PostgreSQL organizes data around tables.
Each table is a collection of columns with a predefined data type, such as INTEGER or VARCHAR. PostgreSQL, like any other SQL database, supports a wide range of data types.
A typical strategy for loading data from Intercom to PostgreSQL is to create a schema where you map each API endpoint to a table. Each key inside the Intercom API endpoint response should be mapped to a column of that table, and you should ensure the right conversion to a PostgreSQL-compatible data type.
For example, if an endpoint from Intercom returns a value as a string, you should convert it into a VARCHAR with a predefined maximum size or into a TEXT data type. Tables can then be created on your database using the CREATE TABLE SQL statement, as shown below.
Of course, since the types of data returned by the Intercom API might change, you will need to adapt your database tables accordingly; there’s no such thing as automatic data typecasting.
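As an illustration, a table for the users endpoint might look like the following sketch. The table name and column set are assumptions based on the sample response shown earlier, not an official schema:
SQL
-- Hypothetical table for Intercom users; columns are derived from the
-- fields visible in the sample API response above.
CREATE TABLE intercom_users (
    id         VARCHAR(64) PRIMARY KEY,  -- e.g. '530370b477ad7120001d'
    type       VARCHAR(32),
    email      VARCHAR(255),             -- assumed field, not shown in the truncated sample
    created_at TIMESTAMPTZ,
    updated_at TIMESTAMPTZ
);

-- If a field's type changes upstream, the table has to be altered by hand:
ALTER TABLE intercom_users ALTER COLUMN id TYPE TEXT;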
After you have a complete and well-defined data model or schema for PostgreSQL, you can move forward and start loading your data into the database.
Load data from Intercom to PostgreSQL
Once you have defined your schema and you have created your tables with the proper data types, you can start loading data into your database.
The most straightforward way to insert data into a PostgreSQL database is by creating and executing INSERT statements. With INSERT statements, you add data row by row directly to a table. It is the most basic way of adding data to a table, but it doesn’t scale well with larger datasets.
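For example, inserting a single user into the hypothetical intercom_users table from the previous section (the values here are illustrative):
SQL
-- Insert one user row; the values are illustrative, not real API output.
INSERT INTO intercom_users (id, type, email, created_at, updated_at)
VALUES ('530370b477ad7120001d', 'user', 'alice@example.com',
        to_timestamp(1400850973), to_timestamp(1400857494));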
The preferred way to add larger datasets to a PostgreSQL database is the COPY command. COPY reads data from a file on a file system accessible to the Postgres instance; in this way, much larger datasets can be inserted into the database in less time.
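For instance, assuming you have flattened the users into a CSV file (the file path and column list below are illustrative):
SQL
-- Bulk-load users from a CSV file readable by the Postgres server process.
COPY intercom_users (id, type, email, created_at, updated_at)
FROM '/tmp/intercom_users.csv'
WITH (FORMAT csv, HEADER true);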
You should also consult the documentation of PostgreSQL on how to populate a database with data. It includes a number of very useful best practices on how to optimize the process of loading data into your Postgres database.
COPY requires physical access to a file system in order to load data.
Nowadays, with cloud-based, fully managed databases, getting direct access to a file system is not always possible. If this is the case and you cannot use a COPY statement, another option is to use PREPARE together with INSERT to end up with optimized and more performant INSERT queries.
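A minimal sketch of this approach, reusing the hypothetical table from above:
SQL
-- Prepare the statement once, then execute it for each row to insert.
PREPARE insert_user (VARCHAR, VARCHAR, VARCHAR, TIMESTAMPTZ, TIMESTAMPTZ) AS
    INSERT INTO intercom_users (id, type, email, created_at, updated_at)
    VALUES ($1, $2, $3, $4, $5);

EXECUTE insert_user ('530370b477ad7120001d', 'user', 'alice@example.com',
                     to_timestamp(1400850973), to_timestamp(1400857494));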
Updating your Intercom data on PostgreSQL
As you generate more data on Intercom, you will need to update your older data in PostgreSQL. This includes new records as well as older records that, for any reason, have been updated on Intercom.
You will need to periodically check Intercom for new data and repeat the process described previously, updating your currently available data if needed. Updating an already existing row in a PostgreSQL table is achieved with UPDATE statements.
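A common way to combine the insert and update steps is an upsert with INSERT ... ON CONFLICT, available since PostgreSQL 9.5; here is a sketch against the hypothetical intercom_users table:
SQL
-- Insert a user, or update the existing row if the id is already present.
INSERT INTO intercom_users (id, type, email, created_at, updated_at)
VALUES ('530370b477ad7120001d', 'user', 'alice@example.com',
        to_timestamp(1400850973), to_timestamp(1400857494))
ON CONFLICT (id) DO UPDATE
SET email      = EXCLUDED.email,
    updated_at = EXCLUDED.updated_at;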
Another issue that you need to take care of is the identification and removal of duplicate records in your database. Whether because Intercom does not have a mechanism to identify new and updated records or because of errors in your data pipelines, duplicate records might be introduced into your database.
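As one possible cleanup step, assuming duplicates land in a staging table without a uniqueness constraint (intercom_users_staging here is a hypothetical name), you can keep a single row per user id using PostgreSQL’s ctid system column:
SQL
-- Remove duplicate rows from a staging table, keeping one row per user id.
DELETE FROM intercom_users_staging a
USING intercom_users_staging b
WHERE a.id = b.id
  AND a.ctid > b.ctid;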
In general, ensuring the quality of the data inserted into your database is a big and difficult problem, and PostgreSQL features like transactions can help tremendously, although they do not solve it in the general case.
The best way to load data from Intercom to PostgreSQL
So far, we have just scratched the surface of what you can do with PostgreSQL and how to load data into it. Things can get even more complicated if you want to integrate data coming from different sources.
Are you striving to achieve results right now?
Instead of writing, hosting, and maintaining a flexible data infrastructure, use RudderStack, which can handle everything automatically for you.
With one click, RudderStack integrates with sources and services, creates analytics-ready data, and syncs your Intercom data to PostgreSQL right away.