How could osmnx be used in an HPC cluster?

Question

I want to run this Python script on an HPC cluster running Ubuntu Server to speed up its execution.

(It currently takes about 4 minutes to finish on a single server, without a cluster.)

Attached Python code:

import sys
import networkx as nx
import osmnx as ox
import geopandas as gpd
from shapely.geometry import Point


ox.config(use_cache=False)


origlatTmp = float(sys.argv[1])
origlngTmp = float(sys.argv[2])
destlatTmp = float(sys.argv[3]) 
destlngTmp = float(sys.argv[4])


def fromAtoBpoints(origlat, origlng, destlat, destlng):
   
    G = ox.load_graphml('/var/www/html/jalisco.graphml')

    #This .graphml file was generated with:

    #import osmnx as ox

    #ox.config(use_cache=True, log_console=True)

    #G = ox.graph_from_place('Jalisco,Mexico', network_type = 'drive', simplify=False)
    #G = ox.add_edge_speeds(G)
    #G = ox.add_edge_travel_times(G)
    #ox.save_graphml(G, '/var/www/html/jalisco.graphml')
    #print("Done!")

    #Jalisco is a state of Mexico and the territorial extension of Jalisco is 78,588 km²

    # Build shapely points as (x=lng, y=lat) for the origin and the destination
    points_list = [Point(origlng, origlat), Point(destlng, destlat)]

    points = gpd.GeoSeries(points_list, crs='epsg:4326')

    points_proj = points.to_crs(G.graph['crs'])

    nearest_nodes = [ox.distance.nearest_nodes(G, pt.x, pt.y) for pt in points_proj]

    route = nx.shortest_path(G, nearest_nodes[0], nearest_nodes[1], weight='length')

    time = nx.shortest_path_length(G, nearest_nodes[0], nearest_nodes[1], weight='travel_time')
    #print("Time:",time/60,"min")

    distance = nx.shortest_path_length(G, nearest_nodes[0], nearest_nodes[1], weight='length')
    #print("Distance:",distance/1000,"km")

    # Format each node along the route as a "{lat:...,lng:...}" string
    resultArray = []

    for a in route:
        resultArray.append("{lat:"+ str(G.nodes[a]['y'])+",lng:"+ str(G.nodes[a]['x'])+"}")

    return (distance/1000),"@@@",resultArray

print(fromAtoBpoints(origlatTmp,origlngTmp,destlatTmp,destlngTmp))

This code takes a long time to run (about 4 minutes, as mentioned above). It has to work offline, and the solution that has been proposed to reduce the response time is an HPC cluster.
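Before choosing between a cluster and multiprocessing, it may help to measure where the 4 minutes actually go. The following is a minimal sketch using only the standard library. `load_graph`, `find_nearest`, and `solve_route` are hypothetical stand-ins for the `ox.load_graphml`, `ox.distance.nearest_nodes`, and `nx.shortest_path` calls in the script above; they are not OSMnx functions, and the toy data is invented for illustration.

```python
import time
from contextlib import contextmanager

timings = {}  # stage name -> elapsed wall-clock seconds


@contextmanager
def timed(stage):
    """Record the time spent inside the with-block under the key `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# Hypothetical stand-ins for the three stages of the script above.
def load_graph():
    return {"A": {"B": 1.0}, "B": {}}


def find_nearest(graph):
    return "A", "B"


def solve_route(graph, orig, dest):
    return [orig, dest]


def run_once():
    with timed("load"):
        graph = load_graph()
    with timed("nearest_nodes"):
        orig, dest = find_nearest(graph)
    with timed("routing"):
        route = solve_route(graph, orig, dest)
    return route


route = run_once()
for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.6f} s")
```

If the load stage turns out to dominate, a long-running process that loads the graph once and then answers many queries would avoid paying that cost on every request. If routing dominates, parallelism across queries is the more relevant option.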

Is HPC necessary? Can you just solve the shortest paths in parallel with multiprocessing? See https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.distance.shortest_path — gboeing, Oct 16 '21 at 20:40
I am trying to get a short response time (from 4 minutes down to seconds) and thought an HPC implementation could meet this need. I tested the code I posted above on a server with 74 GB of DDR3 RAM and two Intel Xeon X5570 CPUs. Do you think enabling multiprocessing will improve the response time? — Juan Martin Gonzalez Razo, Oct 17 '21 at 20:31
Your code is CPU-bound. Whether you address that with OSMnx's built-in multiprocessing, an HPC, or dumping your graph to cuGraph for GPU path solving is up to you and what infrastructure you have available. Any would work. I prefer multiprocessing for a decent speed up or cuGraph for a major speed up. I have an HPC available, but the other two options are much more convenient and usually perfectly sufficient. — gboeing, Oct 17 '21 at 23:28
After thinking about it, watching how the code behaves, and reading some of the answers you have posted on GitHub, I now see that my problem is not the processing as such but the amount of RAM on my server. The cluster was proposed by a colleague, but I see now that it is not the best option. How much RAM would you recommend for a good response time when loading whole states or a whole country? I currently have 74 GB of DDR3 RAM. — Juan Martin Gonzalez Razo, Oct 18 '21 at 00:27
Unfortunately that answer is entirely dependent on the specifics of your study area. See https://stackoverflow.com/questions/69511305/how-to-simply-compute-the-travel-time-from-one-point-to-an-other-without-a-plo/69576295#69576295 — gboeing, Oct 18 '21 at 03:08
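The multiprocessing option discussed in the comments can be illustrated without OSMnx. The sketch below is a standard-library stand-in, not OSMnx's implementation: `TOY_GRAPH`, `dijkstra`, `solve_pair`, and `solve_many` are hypothetical names, and the graph is invented toy data. It shows the general pattern of distributing many origin/destination pairs across worker processes. Note that this speeds up a batch of queries; a single route query is not split across processes. The OSMnx function linked above is documented as accepting lists of origins and destinations together with a `cpus` argument, which serves the same purpose.

```python
import heapq
from multiprocessing import Pool

# Toy weighted directed graph standing in for the road network (invented data).
TOY_GRAPH = {
    "A": {"B": 4.0, "C": 1.0},
    "B": {"D": 1.0},
    "C": {"B": 2.0, "D": 5.0},
    "D": {},
}


def dijkstra(graph, source, target):
    """Return (total_weight, path), or (inf, []) if target is unreachable."""
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for nbr, weight in graph[node].items():
            cand = dist + weight
            if cand < best.get(nbr, float("inf")):
                best[nbr] = cand
                prev[nbr] = node
                heapq.heappush(heap, (cand, nbr))
    if target not in best:
        return float("inf"), []
    # Walk the predecessor links back from target to source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return best[target], path[::-1]


def solve_pair(pair):
    """Solve one (origin, destination) pair on the module-level graph."""
    orig, dest = pair
    return dijkstra(TOY_GRAPH, orig, dest)


def solve_many(pairs, cpus=1):
    """Solve all pairs serially (cpus=1) or in a pool of worker processes."""
    if cpus == 1:
        return [solve_pair(p) for p in pairs]
    with Pool(cpus) as pool:
        return pool.map(solve_pair, pairs)


results = solve_many([("A", "D"), ("C", "B"), ("D", "A")])
print(results)
```

To use more than one process, call for example `solve_many(pairs, cpus=4)`. On platforms that start workers by spawning a fresh interpreter, that call should sit under an `if __name__ == "__main__":` guard.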
