Location Optimisation (part 2)

In Part 1, we introduced the Location Optimiser macro and how the p-median based approach for solving location-allocation problems is shown in the Alteryx sample workflows (trying to minimise the overall weighted distance that demand points have to travel to a set of potential facilities). In this part, we’ll give some examples to bring it to life a bit, and discuss some of the different approaches to simply minimising distance.

The approach that may be needed will depend on how the location optimisation problem is framed. The minimise impedance problem was discussed in part 1 of the blog in the form of the p-median problem. This approach is designed to minimise the sum of weighted costs between demand and facilities. Another approach is to maximise coverage in some way to capture the most demand from a fixed number of facilities within a set distance or time. The result of this approach is that the most demand (people, patients, fires, crimes etc) within the distance cut-off will be selected (a set cut off means that some of the demand may be excluded if there is insufficient demand near to a facility). In both of these, you need to know how many facilities that you want to add. In cases where you don’t know this, the problem could be framed as minimising the number of facilities. That is, we are trying to maximise the coverage whilst minimising the facilities. For a bit more reading on these, see the useful overview in the ArcGIS network analyst help.

In practice, we may be asked questions that require us to re-frame them into one of these location optimisation approaches. Let’s imagine that we were launching a new brand into the UK. We might want to see “how many locations would we need for 80% of the population to live within 30 miles of a store?”. This is really a minimise-facilities problem, but with an acknowledgement that 20% of the population are going to live further away than 30 miles. Whilst we are trying to minimise the facilities, we are actually trying to maximise the population counts at each location by choosing those that are most densely populated.  

Of course, we want to use the location optimiser macro (which is what the blog is all about after all!), but you’ll remember that one of the inputs for this is the number of facilities we want to add to the network. This is exactly what we want to find! In actual fact, we can use a combination of an iterative macro to estimate the number of facilities, and then a location optimiser macro to find the best (the most population) combination of n facilities. This saves us running multiple configurations of the location optimiser which takes some time, whilst also making sure that we get the best overall combination of facilities.

Input Data

First, we need some demand points and some supply points. The former is just small-area population counts (for example Output Areas or LSOAs from the census which are displayed as centroids). For the latter, we need to create some potential facilities, as we don’t actually have any locations in mind. We can do this with a 10km grid around the UK coastline, and then extract the centroids with the spatial info or formula tool).

Estimating the number of facilities with an iterative macro

We want to use this macro to, run by run, select the facilities with the highest population within 30 miles, calculate the running percentage of overall population and then see how many facilities to run the location optimiser for. The challenge is that in Alteryx, iterative macros which have to run spatial processes can be (painfully) slow. However, we can design our macro to run entirely using joins and numeric calculations, doing any spatial work outside the iterating itself.

First, we create a large long-thin matrix table containing the supply and demand ID pairs (one demand point can be in many supply point’s 30-mile radius), the demand for each demand point, and the total demand for the supply point. In our example, there are almost 14 million pairs.  

Supply-Demand Pairs with Demand and Total Demand at the supply point

On the first run, we want to choose the supply point that has the highest total demand and use this as our first facility. Here, 12.5 million people live within 30 miles of the supply point, spread across almost 39k demand points.  

Next we need to burn down, or remove any of these demand points from the looping data. This is so that once a demand point is allocated, it can’t be used again. We do this by joining all of the supply-demand pairs to the demand points that are used, and only taking forward the unmatched pairs.

If we were to map the remaining demand points, we will see a large hole around London, the most densely populated part of the UK:

Note the hole covering London

The final thing is to re-create the input table of supply-demand pairs, and re-calculate each of the supply point’s total demand for the next iteration. In doing so, the order of the most densely populated supply points is changed. Many of the supply points having a high proportion of demand that was removed after the first iteration are now the least densely populated and are no longer viable to use as a facility. Accordingly, in iteration 2, the supply point at the next most densely populated part of the UK is chosen, and so on. Here is the macro in entirety:

If we calculate the running percent of population coverage for each iteration, this gives us a similar table to before, and we can map the facilities to use in order to reach coverage of 80% of the population within 30 miles. In this case, its 15 facilities.

This all runs in about 8 minutes, much quicker than the compute heavy location optimiser would be for assessing multiple facilities. Here are the final locations selected:

Final solution by iterating.

This method is really a brute force approach with cumulative population density above all else, in a single order. In fact, a better solution (higher population overall) could be reached by a different combination of facilities which have less overlap. The iterative approach does not allow this. So…

The location optimiser

With the output from the iterative macro – 15 – we set this as the number of facilities to find. Inside the location optimiser macro, we use the spatial match tool to find which demand points are within the 30 mile trade area. As the macro randomly chooses which 15 facilities to be part of the solution, we may have situations where a demand point is within 30 miles of more than 1 facilities within the solution. We can remove these duplicated demand points with the unique tool (it doesn’t really matter which facility they are assigned to in this case, since we are assessing the random selection of 15 facilities as a whole).

This assessment or scoring is done simply by summing up the population within the solution, and we want to change our macro settings to maximise the population who live within 30 miles of the set of facilities (a higher score).      

Then we can calculate the % of the population that the best 15 facilities has within 30 miles. The table below compares this to the iterative approach, and as you can see, whilst its pretty marginal, the location optimiser does find a better overall solution.

MethodPopulationUK Total Population% of population
Iterative Macro51,123,29163,182,17880.9%
Location Optimiser51,231,28463,182,17881.1%
Best Selected 15 Locations

As might be expected, the result is basically a map of the major cities and their surroundings, or where the highest combined population density is. This took about 15 minutes to run. Incidentally, a solution with 14 facilities almost reaches the cut off – 79.1% (again, marginally higher the the equivalent using just an iterative macro).

You’ll notice that the two approaches actually lead to quite similar results, both in terms of where they find and the number of locations needed. However, using the location optimiser finds a slightly better solution (81.1% vs 80.9%) using the same number of facilities. This is because the location optimiser finds the best overall solution, whilst the iterative approach is cumulative and has only 1 possible outcome – there is no flexibility in the combination of facilities that it will choose. For minimise facility problems, a combination of the two may work well to reduce computation time, first running an iterative approach to reach the likely number of facilities, and then running a location optimiser macro.

Obviously in reality, constraints of existing locations, capacity, closeness to road links and costs (and many more) play a much more significant role than this simple example, and using vehicle journey times would likely yield a different result as well. Nevertheless, this hopefully gives an idea of how location optimisation is a really powerful technique that can be used to solve many typical location-allocation problems in a few different ways.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s