How to load data from Xero to MS SQL Server
Extract your Xero’s data
Xero has an excellent API, or to be more precise a number of APIs, and encourages developers to build applications that can be sold on their add-on marketplace. The APIs that they expose are the following:
- Xero Core (Accounting) API – exposes accounting and related functions of the main Xero application and can be used for a variety of purposes such as creating transactions like invoices and credit notes, right through to extracting accounting data via our reports endpoint.
- Xero Payroll API – exposes payroll-related functions of Payroll in Xero and can be used for a variety of purposes such as syncing employee details, importing timesheets, etc.
- Files API – provides access to the files, folders, and the association of files within a Xero organization.
- Fixed Assets API – which is under review, this feature is not yet available but users can vote for it to become publicly available.
- Xero Practice Manager API – a recently released product built on the WorkflowMax product. Which is an API for managing workflows
In this post, we’ll focus on the Xero Core (Accounting) API. Which exposes the core accounting functionalities of the Xero product. The Xero API is a RESTful web service and uses the OAuth (v1.0a) protocol to authenticate 3rd party applications. As a RESTful API, interacting with it can be achieved by using tools like CURL or Postman by using http clients for your favorite language or framework. A few suggestions:
- Apache HttpClient for Java
- Spray-client for Scala
- Hyper for Rust
- Ruby rest-client
- Python http-client
As a product and consequently an API that has to deal with sensitive data, Xero API takes really good care of security. For this reason, there are a number of different applications that can be developed and integrated with it, where the main difference is how the application authenticates, how often the tokens expire, and in general security-related aspects.
For more about the different application types, you can consult the application types guides on their documentation.
Xero API requests limits
The Xero API has three different types of limits that enforce on the usage of their API. It’s extremely important to keep those in mind when developing against its API and a reason for many headaches when someone attempts to build an infrastructure for extracting data from it.
- Daily limit – of 1000 API calls per organization.
- Requests per minute – each OAuth access token can be used up to 60 times in any 60 second period. This rate limit is based on a rolling 60-second window.
- Request Size Limit – A single POST to the Accounting or Payroll APIs has a size limit of 5MB.
For more information about the API limitations, please consult the documentation for API limits.
Xero API Resources
The Xero API has a very rich data model of 31 resources. It is important to know that by default the response type of the API calls is of type text/xml but you can override this option and request JSON responses if preferred.
- Attachments
- Accounts
- Bank Statements
- Bank Transactions
- Bank Transfers
- Branding Themes
- Contacts
- Contact Groups
- Credit Notes
- Currencies
- Employees
- Expense Claims
- Invoices
- Invoice Reminders
- Items
- Journals
- Linked Transactions
- Manual Journals
- Organisation
- Overpayments
- Payments
- Prepayments
- Purchase Orders
- Quotes
- Receipts
- Repeating Invoices
- Reports
- Tax Rates
- Tracking Categories
- Types
- Users
Requesting data using Xero API
Let’s assume that you would like to retrieve all the invoices that you have issued through Xero and put the information in your data warehouse to perform analytics and reporting. To do that you should perform a GET request to the https://api.xero.com/api.xro/2.0/Invoices endpoint. A typical result, in XML, from performing such an action is like the following:
JAVASCRIPT
<Invoices><Invoice><Type>ACCREC</Type><Contact><ContactID>025867f1-d741-4d6b-b1af-9ac774b59ba7</ContactID><ContactStatus>ACTIVE</ContactStatus><Name>City Agency</Name><Addresses><Address><AddressType>STREET</AddressType></Address><Address><AddressType>POBOX</AddressType><AddressLine1>L4, CA House</AddressLine1><AddressLine2>14 Boulevard Quay</AddressLine2><City>Wellington</City><PostalCode>6012</PostalCode></Address></Addresses><Phones><Phone><PhoneType>DEFAULT</PhoneType></Phone><Phone><PhoneType>DDI</PhoneType></Phone><Phone><PhoneType>MOBILE</PhoneType></Phone><Phone><PhoneType>FAX</PhoneType></Phone></Phones><UpdatedDateUTC>2009-08-15T00:18:43.473</UpdatedDateUTC><IsSupplier>false</IsSupplier><IsCustomer>true</IsCustomer></Contact><Date>2009-05-27T00:00:00</Date><DueDate>2009-06-06T00:00:00</DueDate><Status>AUTHORISED</Status><LineAmountTypes>Exclusive</LineAmountTypes><LineItems><LineItem><Description>Onsite project management </Description><Quantity>1.0000</Quantity><UnitAmount>1800.00</UnitAmount><TaxType>OUTPUT</TaxType><TaxAmount>225.00</TaxAmount><LineAmount>1800.00</LineAmount><AccountCode>200</AccountCode><Tracking><TrackingCategory><TrackingCategoryID>e2f2f732-e92a-4f3a9c4d-ee4da0182a13</TrackingCategoryID><Name>Activity/Workstream</Name><Option>Onsite consultancy</Option></TrackingCategory></Tracking><LineItemID>52208ff9-528a-4985-a9ad-b2b1d4210e38</LineItemID></LineItem></LineItems><SubTotal>1800.00</SubTotal><TotalTax>225.00</TotalTax><Total>2025.00</Total><UpdatedDateUTC>2009-08-15T00:18:43.457</UpdatedDateUTC><CurrencyCode>NZD</CurrencyCode><InvoiceID>243216c5-369e-4056-ac67-05388f86dc81</InvoiceID><InvoiceNumber>OIT00546</InvoiceNumber><Payments><Payment><Date>2009-09-01T00:00:00</Date><Amount>1000.00</Amount><PaymentID>0d666415-cf77-43fa-80c7-56775591d426</PaymentID></Payment></Payments><AmountDue>1025.00</AmountDue><AmountPaid>1000.00</AmountPaid><AmountCredited>0.00</AmountCredited></Invoice></Invoices>
It is possible to paginate your results by using the paging support of the Xero API. Which is very useful when you have to work with a large number of invoices. Also, it is possible to request from the API only the latest invoices. This is done by providing the “Modified After” parameter on the GET request to the API.
The ModifiedAfter filter is actually an HTTP header: ‘If-Modified-Since‘.
A UTC timestamp (yyyy-mm-ddThh:mm:ss) . Only invoices created or modified since this timestamp will be returned e.g. 2009-11-12T00:00:00.
Xero exposes a very rich API which offers you the opportunity to get very granular data about your accounting activities and use it for analytic and reporting purposes. This richness comes with a price though, a large number of resources that have to be handled where some of them allow fetching updates and some others not.
Xero Data Preparation for Microsoft SQL Server
As in every relational database, SQL Server requires a well-defined database schema before we start populating with data. Data is organized in schemas, which are distinct namespaces where database objects belong to.
The most common database objects are, of course, tables that have a number of columns, with each one having a declared data type. MS SQL Server supports a large number of different data types, which gives us great flexibility in expressing the data that we have and at the same time optimizing our data warehouse.
When working with data coming from web services, where data is usually serialized in JSON, it is important to correctly map the data to the right data types. As changing the data types in the future is a process that might cost in downtime of your database, it is important to spend enough time thinking about the proper data type assignments.
For example, dates in JSON are just strings, but when storing date objects in a database, we can enhance analytics with great capabilities by transforming the raw string data into an appropriate date type. A typical strategy for loading data through Xero to an SQL Server database is to create a schema where you will map each API endpoint to a table. Each key inside the Xero API endpoint response should be mapped to a column of that table, and you should ensure the right conversion to an SQL Server compatible data type.
Of course, you will need to ensure that as the data types from the Xero API might change, you will adapt your database tables accordingly. There’s no such thing as automatic data type casting. After you have a complete and well-defined data model or schema for Microsoft SQL Server, you can move forward and start loading your data into the database.
Updating your Xero data on MS SQL Server
As you will be generating more data on Xero, you will need to update your older data on an MS SQL Server database. This includes new records together with updates to older records that for any reason have been updated on Xero.
You will need to periodically check Xero for new data and repeat the process that has been described previously while updating your currently available data if needed. Updating an already existing row on a SQL Server table is achieved by creating UPDATE statements.
Another issue that you need to take care of is the identification and removal of any duplicate records on your database. Either because Xero does not have a mechanism to identify new and updated records or because of errors on your data pipelines, duplicate records might be introduced to your database.
In general, ensuring the quality of the data that is inserted in your database is a big and difficult issue, and MS SQL Server features like TRANSACTIONS can help tremendously, although they do not solve the problem in the general case.
The best way to load data from Xero to MS SQL Server
So far we just scraped the surface of what you can do with MS SQL Server and how to load data into it. Things can get even more complicated if you want to integrate data coming from different sources.
Are you striving to achieve results right now?
Instead of writing, hosting, and maintaining a flexible data infrastructure use RudderStack that can handle everything automatically for you.
RudderStack, with one click, integrates with sources or services, creates analytics-ready data, and syncs your Xero to MS SQL Server right away.