How to load data from Braintree to SQL Data Warehouse

Extract your data from Braintree

Braintree, as is common with payment gateways, exposes an API that can be used to integrate a product with payment services. Access to this API happens through a number of clients, or SDKs, that Braintree offers:

Instead of a public REST API, Braintree provides client libraries in seven languages to ease integration with its gateway. This choice is deliberate: Braintree believes that in this way it can guarantee:

  1. Better security
  2. Better platform support
  3. Backward compatibility

The languages they target with their SDKs cover the majority of frameworks and needs. For example, with the Java SDK they can also support the rest of the JVM languages, like Scala and Clojure.

Braintree API Authentication

To authenticate against the Braintree API, whether to perform transactions or to pull data, the following credentials are required:

  1. Public key: a user-specific public identifier.
  2. Private key: a user-specific secure identifier that should not be shared.
  3. Merchant ID: a unique identifier for the gateway account.
  4. Environment: Sandbox (for testing) or Production.

For more information on how to retrieve the above information, you can check the credentials documentation.
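As a sketch, these four credentials map directly onto the configuration object that the Braintree server-side SDKs expect. The field names below mirror the Node.js SDK's gateway configuration and are an assumption; verify them against the SDK documentation for your language:

```javascript
// Assemble the credentials required by the Braintree SDKs.
// Field names follow the Node.js SDK's gateway configuration
// (treat the exact shape as an assumption for other languages).
function braintreeConfig({ environment, merchantId, publicKey, privateKey }) {
  const fields = { environment, merchantId, publicKey, privateKey };
  for (const [name, value] of Object.entries(fields)) {
    if (!value) {
      throw new Error(`Missing Braintree credential: ${name}`);
    }
  }
  // 'sandbox' for testing, 'production' for live payments
  if (environment !== 'sandbox' && environment !== 'production') {
    throw new Error(`Unknown environment: ${environment}`);
  }
  return { environment, merchantId, publicKey, privateKey };
}

// Example with placeholder values:
const config = braintreeConfig({
  environment: 'sandbox',
  merchantId: 'your_merchant_id',
  publicKey: 'your_public_key',
  privateKey: 'your_private_key',
});
// With the real SDK you would then create a gateway from this config,
// e.g. new braintree.BraintreeGateway(config) in Node.js.
```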

Braintree API Rate Limiting

For a system that handles payments, rate limiting doesn’t really make sense: you wouldn’t want payments to fail just because too many customers happen to be trying to pay you at once. For this reason, Braintree has implemented sophisticated algorithms to ensure that if one of its users misbehaves for any reason, the others are not affected. So Braintree operates outside the conventional practice of fixed rate limits. Nevertheless, you should always respect the service you are interacting with and make sure that the code you write does not abuse it.
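One polite way to do that, even against an API without hard limits, is to retry failed calls with an exponential backoff instead of hammering the service. A minimal sketch (the delay values are arbitrary defaults, not Braintree recommendations):

```javascript
// Exponential backoff: delays double with each retry, capped at maxMs.
function backoffDelay(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Retry a flaky async operation a bounded number of times,
// sleeping between attempts according to backoffDelay.
async function withRetries(fn, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxAttempts - 1) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
    }
  }
}
```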

Endpoints and Available Resources

The Braintree API exposes a number of resources through the available SDKs. With these, you can interact with the service and perform anything that is part of the functionality of the Braintree platform.

  • Add-ons: returns a collection of all the add-ons that are available.
  • Address: through this resource, you can create and manage addresses for your customers. There’s a limit of 50 addresses per customer, and a customer ID is always required for the operations associated with this resource.
  • Client Token: this resource is available for creating tokens that will authenticate your client to the Braintree platform.
  • Credit Card: deprecated.
  • Credit Card Verification: returns information related to the verification of credit cards.
  • Customer: your customer, with all the information needed in Braintree to perform payments.
  • Discount: access to all the discounts that you have created on the Braintree platform.
  • Merchant Account: information about merchants on the platform.
  • Payment Methods: objects that represent your customers’ payment methods.
  • Plan: information about the different plans that you have created on the Braintree platform.
  • Settlement Batch Summary: displays the total sales and credits for each batch on a particular date.
  • Subscription: all the subscriptions that have been created on behalf of your customers inside the Braintree platform.
  • Transaction: this functionality is specific to Marketplace.

All the above resources are manipulated through the SDKs that Braintree maintains. In most cases, the full range of CRUD operations is supported, unless an operation doesn’t make sense or raises security concerns. In general, you can interact with everything that is available on the platform. Through the same SDKs, we can fetch information that we can then store locally to perform our analytics. For example, let’s assume that we want to get a list of all the customers we have, with all their associated data. To do that, we first need to perform a search query on the Braintree API, for example in Java:

JAVA
import com.braintreegateway.*;

// 'gateway' is an already configured BraintreeGateway instance
CustomerSearchRequest request = new CustomerSearchRequest()
    .id().is("the_customer_id");
ResourceCollection<Customer> collection = gateway.customer().search(request);
for (Customer customer : collection) {
    System.out.println(customer.getFirstName());
}

With the above query, we search for all the entries that belong to a customer with the given ID. Braintree has a very rich search mechanism that allows you to perform complex queries on your data; for example, you might search based on dates and get only the new customers back. Each customer object that is returned will contain the following fields.

The above fields will be the columns of the Customer table that we will create for storing the Customer data.

Paging is transparently managed by the SDK and the Braintree API, so you won’t have to worry about how to iterate over a large number of records. When you get your results, you receive an Iterator object that iterates over all the results lazily, keeping resource consumption low.
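The lazy-iteration behavior can be pictured with a generator that only materializes one page of results at a time. Here `fetchPage` is a stand-in for whatever the SDK does under the hood, shown with an in-memory "API" for illustration:

```javascript
// Sketch of lazy pagination: a page is fetched only when the consumer
// actually advances the iterator, so memory use stays proportional to
// one page rather than the full result set.
function* lazyResults(fetchPage) {
  let page = 0;
  while (true) {
    const batch = fetchPage(page); // stand-in for an SDK/API call
    if (batch.length === 0) return; // empty page => no more results
    yield* batch;
    page += 1;
  }
}

// Example with an in-memory "API" of two non-empty pages:
const pages = [['alice', 'bob'], ['carol'], []];
const names = [...lazyResults((p) => pages[p] ?? [])];
// names is ['alice', 'bob', 'carol']
```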

What is important to notice is that the above data come encapsulated in the structures that each SDK exposes. So if you need the data in JSON format, for example, converting the result objects into JSON objects is something you have to take care of yourself.
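A minimal sketch of that conversion step: pick the fields you plan to load from each SDK object and serialize them yourself. The field names below are illustrative, not the SDK's exact property names; use whatever columns your Customer table defines:

```javascript
// Convert an SDK customer object into a flat JSON record containing
// only the columns we plan to load into the warehouse.
// Field names are illustrative placeholders.
function customerToJson(customer) {
  return JSON.stringify({
    id: customer.id,
    firstName: customer.firstName,
    lastName: customer.lastName,
    email: customer.email,
    createdAt: customer.createdAt,
  });
}

const record = customerToJson({
  id: 'cust_123',
  firstName: 'Jane',
  lastName: 'Doe',
  email: 'jane@example.com',
  createdAt: '2020-01-01T00:00:00Z',
  internalSdkState: {}, // dropped on purpose: not a warehouse column
});
```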

Load Data from Braintree to SQL Data Warehouse

SQL Data Warehouse supports numerous options for loading data, such as:

  • PolyBase
  • Azure Data Factory
  • BCP command-line utility
  • SQL Server Integration Services

As we are interested in loading data from online services by using their exposed HTTP APIs, we will not consider the BCP command-line utility or SQL Server Integration Services in this guide. Instead, we’ll consider the case of loading our data as Azure Storage Blobs and then using PolyBase to load them into SQL Data Warehouse.

Accessing these services happens through HTTP APIs; as we see again, APIs play an important role in both the extraction and the loading of data into our data warehouse. You can access these APIs using a tool like cURL or Postman, or use the libraries provided by Microsoft for your favorite language. Before you actually upload any data, you have to create a container, which is conceptually similar to an Amazon AWS bucket. Creating a container is a straightforward operation, and you can do it by following the instructions found in Microsoft’s Blob storage documentation. As an example, the following code creates a container in Node.js.

JAVASCRIPT
// Requires the azure-storage package; createBlobService() reads the
// AZURE_STORAGE_CONNECTION_STRING environment variable by default.
var azure = require('azure-storage');
var blobSvc = azure.createBlobService();

blobSvc.createContainerIfNotExists('mycontainer', { publicAccessLevel: 'blob' }, function(error, result, response) {
  if (!error) {
    // Container exists and allows
    // anonymous read access to blob
    // content and metadata within this container
  }
});

After the creation of the container, you can start uploading data to it, again using the SDK of your choice, in a similar fashion:

JAVASCRIPT
blobSvc.createBlockBlobFromLocalFile('mycontainer', 'myblob', 'test.txt', function(error, result, response) {
  if (!error) {
    // file uploaded
  }
});

When you are done putting your data into Azure Blobs, you are ready to load it into SQL Data Warehouse using PolyBase. To do that, you should follow the directions in the Load with PolyBase documentation. In summary, the required steps are the following:

  1. Create a database master key
  2. Create a database scoped credential
  3. Create an external file format
  4. Create an external data source

PolyBase’s ability to transparently parallelize loads from Azure Blob Storage will make it the fastest tool for loading data. After configuring PolyBase, you can load data directly into your SQL Data Warehouse by simply creating an external table that points to your data in storage and then mapping that data to a new table within SQL Data Warehouse.
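A hedged sketch of what those steps look like in T-SQL. All object names, the password, the storage key, the account and container names, and the column list are placeholders; adapt them from the PolyBase documentation and your own export format:

```sql
-- 1. Master key to protect the scoped credential (placeholder password)
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<StrongPassword!1>';

-- 2. Credential holding the Azure Storage account key (placeholder secret)
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH IDENTITY = 'user', SECRET = '<azure_storage_account_key>';

-- 3. File format matching the exported blobs (CSV assumed here)
CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ','));

-- 4. Data source pointing at the container that holds the blobs
CREATE EXTERNAL DATA SOURCE BraintreeBlobs
WITH (TYPE = HADOOP,
      LOCATION = 'wasbs://mycontainer@myaccount.blob.core.windows.net',
      CREDENTIAL = AzureStorageCredential);

-- External table over the exported customer data (illustrative columns)
CREATE EXTERNAL TABLE dbo.CustomerExternal (
    id NVARCHAR(64),
    first_name NVARCHAR(255),
    last_name NVARCHAR(255),
    email NVARCHAR(255)
)
WITH (LOCATION = '/customers/',
      DATA_SOURCE = BraintreeBlobs,
      FILE_FORMAT = CsvFormat);

-- Map the external data into a regular warehouse table with CTAS
CREATE TABLE dbo.Customer
WITH (DISTRIBUTION = ROUND_ROBIN)
AS SELECT * FROM dbo.CustomerExternal;
```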

Of course, you will need to establish a recurring process that extracts any newly created data from your service, loads it in the form of Azure Blobs, and initiates the PolyBase process to import the data into SQL Data Warehouse again. One way of doing this is by using the Azure Data Factory service. If you would like to follow this path, you can read the documentation on how to move data to and from Azure SQL Data Warehouse using Azure Data Factory.
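The recurring extraction can be driven by a watermark: remember the timestamp of the last successful run and only ask Braintree for records created after it. A sketch under the assumption that you persist the watermark somewhere durable (the in-memory filter here stands in for a server-side `createdAt` range in the Braintree search API):

```javascript
// Incremental extraction driven by a stored watermark timestamp:
// each run covers (lastWatermark, now], then advances the watermark
// only after the load succeeds.
function nextExtractionWindow(lastWatermark, now) {
  return { since: lastWatermark, until: now };
}

function extractNewRecords(records, window) {
  // In practice this filter happens server-side via the Braintree
  // search API (a createdAt range); shown in-memory for clarity.
  return records.filter(
    (r) => r.createdAt > window.since && r.createdAt <= window.until
  );
}

const extractionWindow = nextExtractionWindow('2020-01-01', '2020-02-01');
const fresh = extractNewRecords(
  [
    { id: 'a', createdAt: '2019-12-31' }, // already loaded in a past run
    { id: 'b', createdAt: '2020-01-15' }, // new since the last run
  ],
  extractionWindow
);
// fresh contains only record 'b'
```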

The best way to load data from Braintree to SQL Data Warehouse and possible alternatives

So far we have just scratched the surface of what can be done with Microsoft Azure SQL Data Warehouse and how to load data into it. How to proceed depends heavily on the data you want to load, which service it is coming from, and the requirements of your use case. Things can get even more complicated if you want to integrate data coming from different sources. A possible alternative, instead of writing, hosting, and maintaining a flexible data infrastructure, is to use a product like RudderStack that can handle this kind of problem automatically for you.

RudderStack integrates with multiple sources and services like databases, CRMs, email campaigns, analytics tools, and more.

Sign Up For Free And Start Sending Data
Test out our event stream, ELT, and reverse-ETL pipelines. Use our HTTP source to send data in less than 5 minutes, or install one of our 12 SDKs in your website or app.