How to load data from MailChimp to MS SQL Server
Extract your MailChimp’s Data
First of all, Mailchimp updated its API to v3 recently, so ensure that anything you do will be with this version as the previous are all deprecated although still supported. For more information, visit MailChimp API v3.0 documentation. MailChimp was always a promoter of APIs and encouraged integration with other systems. It has a rich API that exposes a large number of endpoints for interacting with the resources of the applications. More specifically, there are endpoints for the following resources:
- Automations – offers functionality related to automated tasks that we define on MailChimp
- Batch operations – for managing batch processes on our MailChimp account
- Campaign folders – helps you organize your campaigns into groups
- Campaigns – for managing your campaigns
- Conversations – helps you track threads of emails with specific users
- File manager files – for managing your static assets like images
- File manager folders – for creating folders to organize your assets
- Lists – CRUD operations on lists of users
- Reports – for accessing reports with statistics on your campaigns
- Template folders – operations on creating folders for organizing your templates for your emails
- Templates – operations on templates for your emails
Something interesting to note about the MailChimp API is that we don’t see a root-level resource for users anywhere. This makes sense of course, if we take into consideration that MailChimp is all about mail campaigns so users or to put in the correct context, subscribers, are not a stand-alone resource but instead they exist only inside the lists we manage.
It is important to understand that every service perceives the world from a different perspective which of course it’s relevant to the value it offers, so even if what we care about is information about our users, it makes complete sense for someone like MailChimp to organize everything around lists and campaigns.
If we check the model of another service, like Intercom, for example, we’ll notice that the user is at the top resources, which again makes sense because Intercom is all about one-to-one communication with users.
MailChimp and any other service that you might be using have figured out (hopefully) the optimal model for its operations. Still, when we fetch data from them, we usually want to answer questions or do things that are not part of the context that these services operate in, something that makes these models sub-optimal for your analytics needs.
For this reason, we should always keep in mind that when we work with data coming from external services we need to remodel it and bring it to the right form for our needs.
Interacting with the MailChimp REST API can be done by using tools like CURL or Postman or by using http clients for your favorite language or framework. A few suggestions:
- Apache HttpClient for Java
- Spray-client for Scala
- Hyper for Rust
- Ruby rest-client
- Python http-client
The MailChimp REST API supports OAuth 2.0 authentication. More information can be found in the Authorised apps section of the API documentation. After you successfully authenticate with the REST API, you have to start interacting with its resources and start fetching data in order to load them on your data warehouse.
Extract your user data using the MailChimp API
So let’s assume that we want to get all the information we got on MailChimp for our users in order to enrich our user records inside our data warehouse. To do that, we should do the following.
First, we must fetch all the lists that we have created on MailChimp. We can do this by performing a GET request to the appropriate endpoint.
JAVASCRIPT
curl --request GET--url 'https://usX.api.mailchimp.com/3.0/lists'--user 'anystring:apikey'—include
We should get back a response like the following:
JAVASCRIPT
{"lists": [{"id": "57afe96172","name": "Freddie's Jokes","contact": {"company": "MailChimp","address1": "675 Ponce De Leon Ave NE","address2": "Suite 5000","city": "Atlanta","state": "GA","zip": "30308","country": "US","phone": ""},
Using the ID we get from the response for each list, we iterate through all the lists and make requests to the appropriate end-points to get the members for the list
JAVASCRIPT
curl --request GET--url 'https://usX.api.mailchimp.com/3.0/lists/57afe96172/members'--user 'anystring:apikey'—include
and we should get a response back like the following:
JAVASCRIPT
{"members": [{"id": "f777bbffab8d1ceca8b757df63c47cb8","email_address": "urist.mcvankab+1@freddiesjokes.co","unique_email_id": "882e9bec19","email_type": "html","status": "subscribed","status_if_new": "","merge_fields": {"FNAME": "","LNAME": ""},"interests": {"9143cf3bd1": true,"3a2a927344": false,"f9c8f5f0ff": false,"f231b09abc": true,"bd6e66465f": false},
Keep in mind that a user might appear in more than one list, which means that you also must reduplicate your results based on this fact. After we get the results we need and collect all the user-related information from various other resources we need to map it to the model of our Data Warehouse repository before we do the actual loading.
MailChimp Data Preparation for Microsoft SQL Server
As in every relational database, SQL Server requires a well-defined database schema before we start populating with data. Data is organized in schemas, which are distinct namespaces where database objects belong to.
The most common database objects are, of course, tables that have a number of columns, with each one having a declared data type. MS SQL Server supports a large number of different data types, which gives us great flexibility in expressing the data that we have and at the same time optimizing our data warehouse.
When working with data from web services, where data is usually serialized in JSON, it is important to correctly map the data to the right data types. As changing the data types in the future is a process that might cost in downtime of your database, it is important to spend enough time thinking about the proper data type assignments.
For example, dates in JSON are just strings, but when storing data objects in a database, we can enhance analytics with great capabilities by transforming the raw string data into an appropriate date type. A typical strategy for loading data from MailChimp to an SQL Server database is to create a schema where you will map each API endpoint to a table. Each key inside the MailChimp API endpoint response should be mapped to a column of that table, and you should ensure the right conversion to an SQL Server compatible data type.
Of course you must ensure that as the data types from the MailChimp API might change, you will adapt your database tables accordingly. There’s no such thing as automatic data type casting. After you have a complete and well-defined data model or schema for Microsoft SQL Server, you can move forward and start loading your data into the database.
Load data from MailChimp to MS SQL Server
As a feature-rich and mature product, MS SQL Server offers a large and diverse set of methods for loading data into a database. One way of importing data into your database is by using the SQL Server Import and Export Wizard. With it and through a visual interface you will be able to bulk load data using a number of data sources that are supported.
You can import data from another SQL Server, from an Oracle database, from Flat Files, from an Access Data Source, PostgreSQL, MySQL, and finally Azure Blob Storage. Especially if you are using a managed version of MS SQL Server on Azure, you should definitely consider utilizing the Azure Blob Storage connection.
In this way, you will be loading data as Blobs on Azure, and your MS SQL Server database will sync with it through the Import and Export Wizard.
Another way for importing bulk data into an SQL Server, both on Azure and on-premises, is by using the bcp utility. This is a command-line tool that is built specifically for bulk loading and unloading of data by an MS SQL database.
Finally and for compatibility reasons, especially if you are managing databases from different vendors, you can BULK INSERT SQL statements.
In a similar way, and as it happens with the rest of the databases, you can also use the standard INSERT statements, where you will be adding data row-by-row directly to a table. It is the most basic and straightforward way of adding data into a table, but it doesn’t scale very well with larger data sets.
So for bulk datasets, you better consider one of the previous methods.
Updating your MailChimp data on MS SQL Server
As you will be generating more data on MailChimp, you will need to update your older data on an MS SQL Server database. This includes new records together with updates to older records that for any reason have been updated on MailChimp.
You will need to periodically check MailChimp for new data and repeat the process that has been described previously, while updating your currently available data if needed. Updating an already existing row on a SQL Server table is achieved by creating UPDATE statements.
Another issue that you need to take care of is the identification and removal of any duplicate records on your database. Either because MailChimp does not have a mechanism to identify new and updated records or because of errors on your data pipelines, duplicate records might be introduced to your database.
In general, ensuring the quality of the data that is inserted in your database is a big and difficult issue, and MS SQL Server features like TRANSACTIONS can help tremendously, although they do not solve the problem in the general case.
The best way to load data from MailChimp to MS SQL Server
So far, we just scraped the surface of what you can do with MS SQL Server and how to load data into it. Things can get even more complicated if you want to integrate data coming from different sources.
Are you striving to achieve results right now?
Instead of writing, hosting, and maintaining a flexible data infrastructure, use RudderStack, which can handle everything automatically for you.
RudderStack, with one click, integrates with sources or services, creates analytics-ready data, and syncs your MailChimp to MS SQL Server right away.