2

I have for example, 1000 customers located in Europe with different latitude and longitude. I want to find the minimal number of facilities that can serve all customers, subject to the constraint that each customer must be served within 24hr delivery (here I use a maximum allowed transportation distance from a facility to a customer as the constraint for ensuring 24hr delivery service (distance is straight line between two locations, calculated based on Euclidean distance/straight line).

So, with each warehouse that can only serve the customers within certain distance e.g. 600 km, what is the algorithms that can help me to find the minimal number of facilities needed to service all customers, and their respective latitude and longitude. An example is shown in the attached pic below.

example of finding minimal warehouses and their locaitons

example of finding minimal warehouses and their locaitons

Jack
  • 1,339
  • 1
  • 12
  • 31
  • Be more formal! What are your distances? A valid metric or not? Which one (euclidean, haversine...)? (for euclidean-distances this sounds feasible by using SOCP solvers; decide on k: check if there is a solution; if not, increase k; should be polynomial-time) The last note might also be conflicting with the other objectives! – sascha Jan 11 '18 at 18:58
  • distance is calculated based on coordiantes of two locations, using simple straight line distance. – Jack Jan 11 '18 at 19:07
  • 1
    Simple straight line between latitude/longitudes... This sounds wrong (but that's not my area). – sascha Jan 11 '18 at 19:11
  • it's based on Euclidean distance (straight-line distance between two points ) – Jack Jan 11 '18 at 19:13
  • But why sphere-based coords then? The shortest path then is not covered by euclidean distance. Do you know what you are doing? It might be important to articulate your information more formally! – sascha Jan 11 '18 at 19:19
  • for the sake of simplicity, I use straight-line distance * a factor to reflect the actual distance). The more important thing is to find minimal number of warehouse locations to cover all demand. This is "maximum coverage facility location problem". I want to know if there is a good and simple algorithms to help me find the number and locations of warehouses. – Jack Jan 11 '18 at 19:28
  • 1
    Sounds like a use case for Voronoi: https://en.wikipedia.org/wiki/Voronoi_diagram – gdir Jan 11 '18 at 19:50
  • Have a look in the book Computational Geometry (Mark de Berg et. al.). Chapter 7: The Post Office Problem. Link: http://www.springer.com/de/book/9783540779735#otherversion=9783642096815 – gdir Jan 11 '18 at 19:58
  • A genetic algorithm can quickly find good solutions to these types of optimization problems. You could probably also use a variant of simulated annealing. There are spatial partitioning algorithms like median-cut and k-d trees that could be applied. (Or copy someone else! Legend says that Netflix chose its DVD-by-mail distribution points by finding where Amazon distribution points were, on the theory that Amazon had already done a good coverage-for-cost analysis.) – Adrian McCarthy Jan 12 '18 at 00:16

2 Answers2

5

This falls in the category of facility location problems. There is quite a rich literature about these problems. The p-center problem is close to what you want.

Some notes:

  1. Besides solving a formal mathematical optimization model, often heuristics (and meta-heuristics) are used.
  2. The distances are a rough approximation of real travel time. That also means approximate solutions are probably good enough.
  3. Besides just finding the minimum number of facilities needed to service all customers, we can refine the locations by minimizing the distances.
  4. A math programming model for the pure "minimize number of facilities" can be formulated as a Mixed Integer Quadratically Constrained problem (MIQCP). This can be solved with standard solvers (e.g. Cplex and Gurobi). Below is an example I cobbled together:

enter image description here

With 1000 random customer locations, I can find a proven optimal solution:

    ----     57 VARIABLE n.L                   =        4.000  number of open facilties

    ----     57 VARIABLE isopen.L  use facility

    facility1 1.000,    facility2 1.000,    facility3 1.000,    facility4 1.000


    ----     60 PARAMETER locations  

                        x           y

    facility1      26.707      31.796
    facility2      68.739      68.980
    facility3      28.044      67.880
    facility4      76.921      34.929

See here for more details.

Basically we solve two models:

  1. Model 1 finds the number of warehouses needed (minimize number subject to maximum distance constraint)
  2. Model 2 finds the optimal placement of the warehouses (minimize total distance)

After solving model 1 we see (for a 50 customer random problem): enter image description here We need three warehouses. Although no link exceeds the maximum distance constraint, this is not an optimal placement. After solving model 2 we see: enter image description here This now optimally places the three warehouses by minimizing the sum of length of the links. To be precise I minimized the sum of the squared lengths. Getting rid of the square root allowed me to use a quadratic solver.

Both models are of the convex Mixed Integer Quadratically Constrained Problem type (MIQCP). I used a readily available solver to solve these models.

Erwin Kalvelagen
  • 15,677
  • 2
  • 14
  • 39
  • Thanks Erwin. If I understand correctly, the basic steps are: 1). find the minimal number of facilities required to cover all customers, and their respective locations. Then assign customers to their nearest facility to form the different clusters. 2) fine tune the position of facility of each cluster to try to minimise the total transportation distance. I am not very good at mathematical representation. Not sure if it is ok to explain the algorithms via Python codes or general pseudo-codes. – Jack Jan 12 '18 at 14:08
  • See [here](http://examples.gurobi.com/facility-location/) for an example with Python code. – Erwin Kalvelagen Jan 12 '18 at 14:17
  • This example is about the optimal tradeoff between delivery cost and the cost of building new facilities, which is quite different from my case - minimize facilities to serve all customers with distance constraint. – Jack Jan 12 '18 at 14:37
  • The modeling is actually quite close. It is another example of a facility location problem. – Erwin Kalvelagen Jan 12 '18 at 14:39
  • 1
    because I don't use Cplex and Gurobi, I still have no idea how to implement your method. What is missing yet most critical is the algorithm to calculate the initial coordinates of each facility. If possible I want to understand the whole calculation procedures so that I could replicate using python or netlogo programming. – Jack Jan 12 '18 at 14:53
  • In my example the locations are calculated by the MIQCP solver. Writing such a solver yourself likely out of the question. If you want to program everything yourself, it may be better to implement a (meta-)heuristic. These are typically easier to implement. No guarantee of optimality however. If you google for "p-center problem" (which is close to your problem but not exactly the same) you will get a lot of hits. – Erwin Kalvelagen Jan 12 '18 at 15:15
3

Python codes with Gurobi as the solver:

from gurobipy import *
import numpy as np
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt



customer_num=15
dc_num=10
minxy=0
maxxy=10
M=maxxy**2
max_dist=3
service_level=0.7
covered_customers=math.ceil(customer_num*service_level)
n=0
customer = np.random.uniform(minxy,maxxy,[customer_num,2])


#Model 1 : Minimize number of warehouses

m = Model()

###Variable
dc={}
x={}
y={}
assign={}

for j in range(dc_num):
    dc[j] = m.addVar(lb=0,ub=1,vtype=GRB.BINARY, name="DC%d" % j)
    x[j]= m.addVar(lb=0, ub=maxxy, vtype=GRB.CONTINUOUS, name="x%d" % j)
    y[j] = m.addVar(lb=0, ub=maxxy, vtype=GRB.CONTINUOUS, name="y%d" % j)

for i in range(len(customer)):
    for j in range(len(dc)):
        assign[(i,j)] = m.addVar(lb=0,ub=1,vtype=GRB.BINARY, name="Cu%d from DC%d" % (i,j))

###Constraint
for i in range(len(customer)):
    for j in range(len(dc)):
        m.addConstr(((customer[i][0] - x[j])*(customer[i][0] - x[j]) +\
                              (customer[i][1] - y[j])*(customer[i][1] - \
                              y[j])) <= max_dist*max_dist + M*(1-assign[(i,j)]))

for i in range(len(customer)):
    m.addConstr(quicksum(assign[(i,j)] for j in range(len(dc))) <= 1)

for i in range(len(customer)):
    for j in range(len(dc)):
        m.addConstr(assign[(i, j)] <= dc[j])

for j in range(dc_num-1):
    m.addConstr(dc[j] >= dc[j+1])

m.addConstr(quicksum(assign[(i,j)] for i in range(len(customer)) for j in range(len(dc))) >= covered_customers)

#sum n
for j in dc:
    n=n+dc[j]

m.setObjective(n,GRB.MINIMIZE)

m.optimize()

print('\nOptimal Solution is: %g' % m.objVal)
for v in m.getVars():
    print('%s %g' % (v.varName, v.x))
#     # print(v)


# #Model 2: Optimal location of warehouses

optimal_n=int(m.objVal)
m2 = Model()   #create Model 2

# m_new = Model()

###Variable
dc={}
x={}
y={}
assign={}
d={}

for j in range(optimal_n):
    x[j]= m2.addVar(lb=0, ub=maxxy, vtype=GRB.CONTINUOUS, name="x%d" % j)
    y[j] = m2.addVar(lb=0, ub=maxxy, vtype=GRB.CONTINUOUS, name="y%d" % j)

for i in range(len(customer)):
    for j in range(optimal_n):
        assign[(i,j)] = m2.addVar(lb=0,ub=1,vtype=GRB.BINARY, name="Cu%d from DC%d" % (i,j))

for i in range(len(customer)):
    for j in range(optimal_n):
        d[(i,j)] = m2.addVar(lb=0,ub=max_dist*max_dist,vtype=GRB.CONTINUOUS, name="d%d,%d" % (i,j))

###Constraint
for i in range(len(customer)):
    for j in range(optimal_n):
        m2.addConstr(((customer[i][0] - x[j])*(customer[i][0] - x[j]) +\
                              (customer[i][1] - y[j])*(customer[i][1] - \
                              y[j])) - M*(1-assign[(i,j)]) <= d[(i,j)])
        m2.addConstr(d[(i,j)] <= max_dist*max_dist)

for i in range(len(customer)):
    m2.addConstr(quicksum(assign[(i,j)] for j in range(optimal_n)) <= 1)

m2.addConstr(quicksum(assign[(i,j)] for i in range(len(customer)) for j in range(optimal_n)) >= covered_customers)

L=0
L = quicksum(d[(i,j)] for i in range(len(customer)) for j in range(optimal_n))

m2.setObjective(L,GRB.MINIMIZE)

m2.optimize()


#########Print Optimization Result
print('\nOptimal Solution is: %g' % m2.objVal)

dc_x=[]
dc_y=[]
i_list=[]
j_list=[]
g_list=[]
d_list=[]
omit_i_list=[]
for v in m2.getVars():
    print('%s %g' % (v.varName, v.x))
    if v.varName.startswith("x"):
        dc_x.append(v.x)
    if v.varName.startswith("y"):
        dc_y.append(v.x)
    if v.varName.startswith("Cu") and v.x == 1:
        print([int(s) for s in re.findall("\d+", v.varName)])
        temp=[int(s) for s in re.findall("\d+", v.varName)]
        i_list.append(temp[0])
        j_list.append(temp[1])
        g_list.append(temp[1]+len(customer))  #new id mapping to j_list
    if v.varName.startswith("Cu") and v.x == 0:
        temp=[int(s) for s in re.findall("\d+", v.varName)]
        omit_i_list.append(temp[0])
    if v.varName.startswith("d") and v.x > 0.00001:
        d_list.append(v.x)


#########Draw Netword
# prepare data
dc_cor=list(zip(dc_x,dc_y))
dc_list=[]
for i,k in enumerate(dc_cor):
    temp=len(customer)+i
    dc_list.append(temp)

df=pd.DataFrame({'Customer':i_list,'DC':j_list,'DC_drawID':g_list,'Sqr_distance':d_list})
df['Sqrt_distance']=np.sqrt(df['Sqr_distance'])
print(df)

dc_customer=[]
for i in dc_list:
    dc_customer.append(df[df['DC_drawID'] == i]['Customer'].tolist())
print('\n', dc_customer)

#draw
G = nx.DiGraph()

d_node=[]
e = []
node = []
o_node = []
for c, k in enumerate(dc_list):
    G.add_node(k, pos=(dc_cor[c][0], dc_cor[c][1]))
    d_node.append(c)
    v = dc_customer[c]
    for n, i in enumerate(v):
        G.add_node(i, pos=(customer[i][0], customer[i][1]))
        u = (k, v[n])
        e.append(u)
        node.append(i)
        G.add_edge(k, v[n])
for m,x in enumerate(omit_i_list):
    G.add_node(x, pos=(customer[x][0], customer[x][1]))
    o_node.append(x)
nx.draw_networkx_nodes(G, dc_cor, nodelist=d_node, with_labels=True, width=2, style='dashed', font_color='w', font_size=10, font_family='sans-serif', node_shape='^',
node_size=400)
nx.draw_networkx_nodes(G, customer, nodelist=o_node, with_labels=True, width=2, style='dashed', font_color='w', font_size=10, font_family='sans-serif', node_color='purple',
node_size=400)
nx.draw(G, nx.get_node_attributes(G, 'pos'), nodelist=node, edgelist=e, with_labels=True,
        width=2, style='dashed', font_color='w', font_size=10, font_family='sans-serif', node_color='purple')


# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('Optimization_Result.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
writer.save()

plt.axis('on')
plt.show()
Jack
  • 1,339
  • 1
  • 12
  • 31
  • can you please provide explanations on what each variable mean? – azal Dec 16 '20 at 15:59
  • dc={} -- distribution center x={} -- x coordinate y={} -- y coordinate assign={} -- assignment of customers to each DC – Jack Dec 17 '20 at 07:39
  • thanks for your prompt reply. Will it work if the coordinates are in lat lon as well? – azal Dec 17 '20 at 09:05
  • thanks again for your speedy reply. In what unit is the maxdist defined? Assuming we have x,y coords -as in this case- and assuming we have lat,lon? – azal Dec 17 '20 at 14:52