How to load data from BaseCRM to Google BigQuery
Access your data on Base CRM
The first step in loading any Base CRM data to any kind of data warehouse solution is to access them and start extracting them.
As previously mentioned, using Base’s rich Core API you can get access to data from 25 resources including, among others, the following:
- Account: The Account API provides read-only access to your account details.
- Calls: The Calls API delivers a simple interface for managing calls.
- Contacts: The Contacts API provides a simple interface to manage your contacts. A contact represents an individual or an organization.
- Deals: The Deals API delivers a simple interface for managing deals.
- Leads: The Leads API provides a simple interface to manage leads. A lead represents an individual or an organization that expresses interest in your goods or services.
- Orders: Through the Orders API you can manage your orders.
- Pipelines: The Pipelines API provides a read-only interface to your sales pipeline definition.
- Products: The Products API offers an interface for managing the Product Catalog. The catalog lists products that are available in your account.
- Users: Using the User’s API you can interact with your account’s users. You can retrieve a single user as well as list of all users associated with your account.
In addition to the above, the things that you have to keep in mind when dealing with the BaseCRM API, are:
- Rate limits. According to the documentation, you can make up to 36,000 requests per hour (10 requests/ip/second).
- Authentication. You can authenticate on Base CRM using OAuth.
- Pagination. API endpoints that return a collection of items are always paginated. The number of results to display can vary with a maximum value of 100.
Transform and prepare your Base CRM data for Google BigQuery
After you have accessed your data on Base CRM, you will have to transform it based on two main factors,
- The limitations of the database that the data will be loaded onto
- The type of analysis that you plan to perform
Each system has specific limitations on the data types and data structures that it supports. If for example, you want to push data in a BigQuery, then you can send nested data like JSON directly.
Also, you have to choose the right data types. Again, depending on the system that you will send the data to and the data types that the API exposes to you, you will have to make the right choices. These choices are important because they can limit the expressivity of your queries and limit your analysts on what they can do directly out of the database.
Also, you have to consider that the reports you’ll get from Base CRM are like CSV files in terms of their structure and you need to somehow identify what and how to map to a table into your database.
Load data from Base CRM to BigQuery
If you want to load any data from Base CRM to BigQuery, you have to use one of the following supported data sources.
- Google Cloud Storage
- Sent data directly to BigQuery with a POST request
- Google Cloud Datastore Backup
- Streaming insert
- App Engine log files
- Cloud Storage logs
From the above list of sources, 5 and 6 are not applicable in our case.
For Google Cloud Storage, you first must load some data into it, there are a few options on how to do this, for example, you can use the console directly as it is described here and do not forget to follow the best practices.
Another option is to post data through the JSON API, as we see again APIs play an important role in both the extraction but also the loading of data in our data warehouse. In its simplest case, it’s just a matter of one HTTP POST request using a tool like CURL or Postman.
After you have loaded you own data into Google Cloud Storage, you have to create a Load Job for BigQuery to actually load the data to it, this Job should point to the source data in Cloud Storage that have to be imported, this happens by providing source URIs that point to the appropriate objects.
The best way to load data from Base CRM to Google BigQuery
So far we just scraped the surface of what you can do with BigQuery and how you can load data into it. Things can get even more complicated if you want to integrate data coming from different sources.
Are you striving to achieve results right now?
Instead of writing, hosting, and maintaining a flexible data infrastructure use RudderStack that can handle everything automatically for you.
RudderStack with one click integrates with sources or services, creates analytics-ready data, and syncs your Base CRM to BigQuery right away.