Non Overlapping Geography within Alteryx – How to do it like a Pro!

Part 2 – Rebuilding Geography where Trade Areas are split across Rivers

Welcome back to Part 2 of Non Overlapping Geography within Alteryx. In Part 1 we built non-overlapping trade areas and removed the parts of the polygons that spanned rivers and the coastline.

That left us with the following predicament: a store trade area that spans a river. If we are assigning the nearest population to a store, this is misleading, because the population on the far bank cannot reach the store across the river. Modelling with these population numbers would introduce large errors around these areas, so refinement is key.

So how do we deal with this? First we need to identify all the regions that now consist of more than one part after being cut by the coastline. We can do this with the Spatial Info tool: select the spatial field we are interested in and tick the Number of Parts checkbox. This outputs the number of parts within each spatial object, which makes it easy to pick out those with more than one.
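Outside Alteryx, the same split can be sketched in a few lines of Python. This is only an illustration of the idea, not what the Spatial Info tool does internally: it assumes each region is held as a list of parts, each part being a list of (x, y) vertices, and the store names are made up.

```python
def split_by_part_count(regions):
    """Separate regions with one part from regions with several parts.

    `regions` maps a store id to a list of parts; each part is a list of
    (x, y) vertex tuples. This data layout is an assumption for the sketch.
    """
    single, multi = {}, {}
    for store_id, parts in regions.items():
        if len(parts) == 1:
            single[store_id] = parts   # goes straight to the end of the process
        else:
            multi[store_id] = parts    # needs rebuilding
    return single, multi


# Hypothetical example: one intact region and one cut in two by a river.
regions = {
    "store_a": [[(0, 0), (2, 0), (2, 2), (0, 2)]],
    "store_b": [
        [(3, 0), (5, 0), (5, 2), (3, 2)],
        [(6, 0), (7, 0), (7, 2), (6, 2)],
    ],
}
single, multi = split_by_part_count(regions)
```

Here `single` holds the regions that can bypass the rest of the workflow and `multi` holds the ones that need further work.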

Regions that only have a single polygon are fed straight to the end of the process. Regions with more than one part need several extra steps. First, we remove the small error polygons left over from the cookie cut: we calculate the area of each part and drop those that are too small to matter. Then we create a flag that tells us whether the store sits inside a given region part or not. As regions are broken into at least two parts around the coastline, it is important to know which part is the main region for the store and which are additional parts. This is done within the Formula tool using ST_Within(Point, Polygon).
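As a rough sketch of these two steps in plain Python: the area filter uses the shoelace formula and the store flag uses a ray-casting point-in-polygon test. Both are stand-ins for what Alteryx does with its own area calculation and ST_Within, and the `min_area` threshold and coordinates are assumed values for illustration only.

```python
def polygon_area(ring):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def point_in_polygon(point, ring):
    """Ray-casting test: True if the point lies inside the polygon."""
    px, py = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def flag_parts(store_point, parts, min_area):
    """Drop sliver parts, then flag whether each remaining part holds the store."""
    flagged = []
    for ring in parts:
        if polygon_area(ring) < min_area:
            continue  # tiny cookie-cut error polygon, ignored
        flagged.append({"ring": ring,
                        "has_store": point_in_polygon(store_point, ring)})
    return flagged


# Hypothetical region: a main part, a sliver, and a part across the river.
parts = [
    [(0, 0), (4, 0), (4, 4), (0, 4)],
    [(5, 0), (5.1, 0), (5, 0.1)],
    [(6, 0), (8, 0), (8, 2), (6, 2)],
]
flagged = flag_parts((1, 1), parts, min_area=0.01)
```

The sliver is dropped, the first part is flagged as the main store area, and the part across the river is flagged as an additional part.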

Next is the clever bit. We filter on the store-within flag. The region parts that contain the store become the root of the new polygons, and the parts that were split away from their original regions are fed in as the universe within the Spatial Match tool. Using Touches or Intersects, we can now join a split-off part to a main store area that it sits next to. What we see here is the areas on the right side of the river being joined up with a region on the same side.
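To make the matching step concrete, here is a simplified Python sketch. It is not a true Touches or Intersects test: it only treats two polygons as neighbours when they share an identical boundary edge, which assumes the shared boundaries use exactly the same vertices. All ids and coordinates are hypothetical.

```python
def edges(ring):
    """Set of undirected edges of a polygon; vertices must be (x, y) tuples."""
    n = len(ring)
    return {frozenset((ring[i], ring[(i + 1) % n])) for i in range(n)}


def shares_edge(ring_a, ring_b):
    """Simplified adjacency: True if the polygons have an edge in common."""
    return bool(edges(ring_a) & edges(ring_b))


def candidate_pairs(main_parts, split_parts):
    """List every (split part, main store region) pair that sits side by side."""
    pairs = []
    for split_id, split_ring in split_parts.items():
        for store_id, main_ring in main_parts.items():
            if shares_edge(main_ring, split_ring):
                pairs.append((split_id, store_id))
    return pairs


# Hypothetical layout: part "x" sits between main regions "a" and "b".
main_parts = {
    "a": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "b": [(4, 0), (6, 0), (6, 2), (4, 2)],
    "c": [(10, 10), (11, 10), (11, 11)],
}
split_parts = {"x": [(2, 0), (4, 0), (4, 2), (2, 2)]}
pairs = candidate_pairs(main_parts, split_parts)
```

In this example part "x" matches both "a" and "b", which is exactly the situation where one split-off part has more than one possible home.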

As you can see here, the Grimsby region now has a section highlighted to be attached, which was originally part of a region on the other side of the river.

Where a part can only be added to one main store region, we go ahead and combine the two, then feed the result to the end. However, it would be no fun if it was that easy. A spatial match can return many combinations that are all potential matches. So how do we decide which main store region should gain the additional region parts? What if there are two, three or more additional parts?

To find the right region to join to, we combine the region parts for each candidate combination, take the centroid of the combined shape, and calculate how far it is from the store. The pairing with the shortest distance wins, and the combined region goes on to join the rest of the stores in the final list.

The image above shows how the centroid of the new, larger polygon shifts when an extra part is joined on. The distance between the original store and the centroid of the new, larger polygon determines which store gains the extra area: the smaller the distance, the better. The two areas above are competing for the central part, and the store on the left-hand side wins it because the centroid of its new polygon is closer to the store. The polygon on the right is then reverted to its original state, without the additional part.
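The tie-break described above can be sketched in Python as follows. This is a minimal illustration under a few assumptions: the combined centroid is taken as the area-weighted average of the part centroids (valid when the parts do not overlap), distances are straight-line in the coordinate units, and the store ids and coordinates are invented.

```python
import math


def area_and_centroid(ring):
    """Area and centroid of a simple polygon (standard shoelace-based formula)."""
    a = cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        a += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    a /= 2.0  # signed area; the sign cancels in the centroid
    return abs(a), (cx / (6.0 * a), cy / (6.0 * a))


def combined_centroid(rings):
    """Area-weighted centroid of several non-overlapping parts."""
    total = sx = sy = 0.0
    for ring in rings:
        area, (cx, cy) = area_and_centroid(ring)
        total += area
        sx += area * cx
        sy += area * cy
    return sx / total, sy / total


def pick_winner(split_ring, candidates):
    """Return the store whose combined centroid ends up closest to the store.

    `candidates` maps store id -> (store_point, main_ring).
    """
    best_id, best_dist = None, math.inf
    for store_id, (store_point, main_ring) in candidates.items():
        cx, cy = combined_centroid([main_ring, split_ring])
        dist = math.hypot(cx - store_point[0], cy - store_point[1])
        if dist < best_dist:
            best_id, best_dist = store_id, dist
    return best_id


# Hypothetical contest: stores "a" and "b" both border the same split part.
split_ring = [(2, 0), (4, 0), (4, 2), (2, 2)]
candidates = {
    "a": ((1.5, 1.0), [(0, 0), (2, 0), (2, 2), (0, 2)]),
    "b": ((5.8, 1.0), [(4, 0), (6, 0), (6, 2), (4, 2)]),
}
winner = pick_winner(split_ring, candidates)
```

Store "a" wins here because its store point sits nearer the middle of the enlarged polygon, while store "b" would keep its original polygon unchanged.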

Now we have a final store list of non-overlapping boundaries that don’t span over the river.

We have seen lots of examples of Hull, so here is a before and after image of Edinburgh, Scotland.

After combining all the feeds and checking that we have exactly one record for each store, we can run analysis such as calculating the number of people or households within each area for use in regression models. As this is a macro, we can add it to a wider process so that when new stores open everything is recalculated dynamically as part of a feed within Alteryx. It takes all of 50 seconds to update these figures.

In this post I have shown some handy techniques for reassigning polygons that have been split by a river. Where a closer store initially lost that trade area, the area can now be reassigned and counted where it counts. My modelling has improved in accuracy purely through this process, so it is worth giving it a shot.

Data Sources to support you on your journey through Part 2:

Population Weighted Centroids

Create a union between these and join them to the population numbers you want to use, using the OA Code. Unfortunately, England, Scotland and Ireland each have a different source, so the datasets need to be combined to get a fuller picture. Once that is done, however, you have a great dataset that you can use again and again.
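In Alteryx this is a Union followed by a Join. As a rough equivalent in Python, assuming each centroid source is a list of records with an `oa_code` field and the population figures are keyed by the same code (the field names and placeholder codes below are made up for illustration):

```python
def union_and_join(centroid_sources, population_by_oa):
    """Stack the centroid sources, then attach population by OA code."""
    combined = [row for source in centroid_sources for row in source]
    joined = []
    for row in combined:
        population = population_by_oa.get(row["oa_code"])
        if population is not None:  # inner join: keep matched records only
            joined.append({**row, "population": population})
    return joined


# Placeholder records; real OA codes and coordinates come from the sources.
source_one = [{"oa_code": "OA_1", "x": 1.0, "y": 2.0}]
source_two = [{"oa_code": "OA_2", "x": 3.0, "y": 4.0},
              {"oa_code": "OA_3", "x": 5.0, "y": 6.0}]
population_by_oa = {"OA_1": 120, "OA_2": 95}

centroids = union_and_join([source_one, source_two], population_by_oa)
```

Records without a matching population figure are dropped in this sketch, so it is worth checking the unmatched codes before relying on the result.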

There are many places you can source statistical datasets, such as postcode-level population stats, CACI, Experian, etc. These will of course give you finer granularity and better results when doing spatial matching like in this example.
