Building a Geocoder in Alteryx

There are so many options when it comes to geocoding in Alteryx, you can build a workflow to use Google APIs to look up your postcode and feedback the answer for you. Lots of blogs around relating to that, but actually building your own using open source data that doesn’t require an internet connection every time you would like to use it, now that is a macro for your suite of tools. Buckle up my friends, lets go.

Firstly you need some data, of course this is being written in the UK, so I am going to use UK postcodes. The Office of National Statistics is an epic treasure trove of data. Here is a link to the Postcode Directory for August 2020. This dataset is updated every 3 months, so you will need to keep that in mind if you want to keep on top of newer postcodes but the process we are building is going to support that.

ONS Postcode Directory – August 2020

Before we carry on let me explain the process we are going to undertake. We are going to create three small workflows. One is to build a Calgary database (1) which we will then use within the Macro (2) we are going to build and finally we will use the Macro in a workflow (3).

Step 1: Building a Calgary Database

Workflow required for building a Calgary database (Runtime ~ 2 mins)

The steps are connect to the csv data set, which can be found in the “DATA” folder from the ONS data you downloaded. (Downloads\ONSPD_AUG_2020_UK\Data)

Create a new field, which removes the spaces from the pcd (postcode column in the ONS data) so that you can have a field which will support more matches when joined, as users tend to type postcodes differently and always get it wrong this is the best way to deal with matching on postcodes.

Then we select the fields from the ONS dataset that we want to use in our database. It is always useful to have the lookups available within your geocoding tool, so I have included the Output Area, Lower Super Output Area, etc. However the Latitude and Longitude fields are a must for this tool, as well as the new field we have created to help us match against human inputted data. (strippedpc)

We then add the Calgary Loader tool, which will go ahead and index and store all our required data in a smaller compact and quickly accessible database. Make sure to save this somewhere accessible by all.

Step 2: Geocoder Macro

Geocoder Macro (Runtime 0.5 secs for 1 record)

A macro input required with just a single column and row with an example postcode in it. You can use your own, or perhaps your office building postcode. I called this column postcode, and set up the question as such for the user to understand what is required from them in order for the macro to work.

Then as this is the user side, we need to remove spaces from any postcode they enter, as we have already mentioned it is likely they haven’t entered it as the standard requires. so the best way to get a match is to remove all spaces and match without them with the strippedpc column you made in your Calgary database.

Now its the best step, adding in the database, this isn’t your normal input tool. You actually need a Calgary Join tool. Connect your user entered data to the input anchor of this tool then set up as follows:

Calgary Join – connect to your Calgary Database

Here you are joining the data from your user, which could be a list of 1 or a list of 1 million records, to your calgary database you have created. When you have your configuration options up, you are selecting your user created strippedpc and then the index field of strippedpc with the query type of Exact value. This provides a quick way of finding your matches.

Then we join back onto our original data using the stripped pc to make the join.

From our matched data we want to use the Latitude and Longitude data and create a point for it, so that we are officially geocoding.

Then we have two Macro Outputs. One for the matched data from the join, then one for the unmatched so that the user has full control of all the data they have entered at the start of the process and know which postcodes are “not valid”.

Interface Properties

Before you save your Macro, don’t forget to give it a helpful name, some metadata (a description) and an icon, so that users can pick it off a toolbar knowing fully how to use it well.

Step 3: Testing your Macro

Testing your Geocoder

Finally put in a postcode within a text input and then attach your macro with a couple of browse tools. Run to test and view your results in your Geocoded results, as you can see my postcode test was Buckingham Palace.

Successful job, we have just created a geocoder, but one that you can keep up to date too.

Testing 1 record takes 0.5 seconds, 26,472 records takes 1.2 seconds or 2,467,046 in 30.9 seconds which is a great test of speed. Thanks for following. *o*

4 thoughts on “Building a Geocoder in Alteryx

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s