Shortest path

investigating street network

So far, I have inspected VAW locations to understand their distribution in the urban environment. Imported, cleaned, sorted and plotted with basemaps. Proved the postive spatial correlation and detected which parts of the city have the stronger relationships with the surrounding units.

This project now comes to its last part: investigating the help centre accessibility by answering to the question ‘Where is the closest help centre from my place of living?’.

Detecting the areas with low help centre accessibility can represent an important element in VAW alleviation.

To meet the goal, I have to retieve the buildings, find the nearest help centre and calculate the shortest path.

Required modules*:

  • OSMnx for retrieving buildings and network from Open Street Map and calculating shortest-paths
  • Shapely for getting the nearest point

*exclude description of modules imported in the previous steps

# Import modules
import geopandas as gpd
import matplotlib.pyplot as plt
import folium
from folium import plugins

# for network and non analysis
import osmnx as ox
from shapely.ops import nearest_points

# avoid messages
import warnings
warnings.filterwarnings('ignore')

Get buildings footprint from OSM

OSMnx provides the possibility to obtain the building representation from OpenStreetMap by simply entering key-value pairs for tags and name of the place.

tags = {'building': True}
buildings = ox.geometries_from_place("Mexico City, Mexico", tags)

Display the outputs:

buildings.plot(markersize=0.2,
               figsize=(12,12))

plt.axis('off') 
plt.show()

png

It is evident that several buildings are not present on the map. Consequently, the resulting help centre accessibility would be partial and unreliable. In light of this, I prefer to use AGEB geographies. Note that, despite being partial, the number of buildings is high:

len(buildings)
116401

Thus, using these data will be computationally demanding, especially when calculating the nearest center.

Import AGEB geographies

# Read the data
gdfa = gpd.read_file('data/VAW_AGEB_clean_lag.gjson')

Like buildings, AGEBs are polygons. I have to choose a point of the geometry from which calculate the distance. In order to get more accurate results, I have to reproject the data to the local UTM.

utm source

For Mexico City, it is UTM14N.

gdfa = gdfa.to_crs(epsg=6369) #reproject to CDMX UTM

And find the centroid:

gdfa['centroid'] = gdfa.centroid
gdfa.head(2)
CVEGEO POBTOT PRESOE15 P5_HLI POB_AFRO GRAPROES PDESOCUP PSINDER VPH_SINLTC VPH_SINCINT VAW_COUNT VAW_per_1000 VAW_per_1000_LAG geometry centroid
0 0900200010148 2513.0 85.0 16.0 70.0 13.26 39.0 498.0 4.0 90.0 7 2.785515 2.238242 POLYGON ((478281.827 2156260.877, 478307.065 2... POINT (477965.009 2155945.484)
1 0900200010190 8920.0 602.0 113.0 218.0 11.55 134.0 2011.0 76.0 600.0 34 3.811659 3.507377 POLYGON ((480069.413 2156118.108, 480081.995 2... POINT (479795.113 2155570.427)

Now import the Help Centre locations.

Get Help Centre Locations from CDMX Open Data Portal

The data set is licensed under the open data licence (CC BY 4.0).

# Read the data
gdf = gpd.read_file('data/servicios_de_atencion_a_violencia_contra_las_mujeres/servicios_de_atencion_a_violencia_contra_las_mujeres.shp')
gdf = gdf[gdf['_id'] != 28]
gdf.head(3)
_id id id_ente id_sede nmbr___ calle nmr_xtr nmr_ntr colonia alcaldi ... mdds_d_ apr____ infrm__ prfls_s vsts_d_ a_______ at______ i______ geopont geometry
0 1 0 FGJ FGJ-005 Coordinacion Territorial Az-1 y Az-3 Av de las Culturas y eje 5 Norte S/N S/N San Martin Xochinahuac Azcapotzalco ... None None None None None None 19.5105948,-99.1915796 POINT (-99.19158 19.51059)
1 2 1 FGJ FGJ-007 Coordinación Territorial BJ-3 y Bj-4 Avenida Cuauhtémoc esquina Obrero Mundial S/N S/N Navarte Benito Juárez ... None None None None None None 19.4018736,-99.1556199 POINT (-99.15562 19.40187)
2 3 2 FGJ FGJ-010 Coordinación Territorial Coy-2 Zompantitla Esq. Tecualiapan S/N S/N Romero de Terreros Coyoacán ... None None None None None None 19.3444204,-99.1666016 POINT (-99.16660 19.34442)

3 rows × 43 columns

Reproject to Mexico City UTM:

gdf = gdf.to_crs(epsg=6369)

Find Nearest Point

My query is one-to-many, namely from the centroid to all possible help centres in the city. So, I need to combine all the help centre locations in a sigle list.

list_points = gdf["geometry"].unary_union

And now look for the match. Use shapely - nearest_points to find the closest geometry. Note that the output will refer to the liner distance. Given the number of points, the following operation will take some time. Retrieve the name of the centre too for future map displaying.

%%time
points = []
link_value = []

for i in range(len(gdfa)):
    nearest_geoms = nearest_points(gdfa.centroid[i], list_points)
    points.append(nearest_geoms[1])
    points_data = gdf.loc[gdf["geometry"] == points[i]]
    link_value.append(points_data['nmbr___'].values[0])
    i += 1
CPU times: user 48.1 s, sys: 0 ns, total: 48.1 s
Wall time: 48.1 s

Attach outputs to the AGEB geoDataFrame:

gdfa['closest_centre'] =  gpd.GeoSeries(points, crs="EPSG:6369") #ensure geom format
gdfa['Name_centre'] = link_value
gdfa.info()
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 2279 entries, 0 to 2278
Data columns (total 17 columns):
 #   Column            Non-Null Count  Dtype   
---  ------            --------------  -----   
 0   CVEGEO            2279 non-null   object  
 1   POBTOT            2279 non-null   float64 
 2   PRESOE15          2279 non-null   float64 
 3   P5_HLI            2279 non-null   float64 
 4   POB_AFRO          2279 non-null   float64 
 5   GRAPROES          2279 non-null   float64 
 6   PDESOCUP          2279 non-null   float64 
 7   PSINDER           2279 non-null   float64 
 8   VPH_SINLTC        2279 non-null   float64 
 9   VPH_SINCINT       2279 non-null   float64 
 10  VAW_COUNT         2279 non-null   int64   
 11  VAW_per_1000      2279 non-null   float64 
 12  VAW_per_1000_LAG  2279 non-null   float64 
 13  geometry          2279 non-null   geometry
 14  centroid          2279 non-null   geometry
 15  closest_centre    2279 non-null   geometry
 16  Name_centre       2279 non-null   object  
dtypes: float64(11), geometry(3), int64(1), object(2)
memory usage: 302.8+ KB

At this point, compute the distance in meters using geopands distance and convert to km for better understanding.

gdfa['distance'] = gdfa.centroid.distance(gdfa.closest_centre)
gdfa['dist_km'] = gdfa['distance'] /1000
gdfa['dist_km'].describe()
count    2279.000000
mean        1.169192
std         0.843243
min         0.007343
25%         0.601612
50%         0.985613
75%         1.500619
max         6.715206
Name: dist_km, dtype: float64

On average, the distance from AGEB central points to the closest help centre is approximately 1.5 km, which is not bad, and the maximum distance from a women in danger to a safe place is less than 7 Km. Display distance on a choroplet map and compare it with the spatial lag map.

f, axs = plt.subplots(1, 2, figsize=(15, 12))
ax1, ax2 = axs

gdfa.plot(
    column='VAW_per_1000_LAG',
    cmap='RdYlGn_r',
    scheme='quantiles',
    k=5,
    edgecolor='white',
    linewidth=0.,
    alpha=0.75,
    legend=True,
    ax=ax1
)
ax1.set_axis_off()
ax1.set_title("Rates VAW - LAG")


gdfa.plot(
    column='dist_km',
    cmap='RdYlGn_r',
    scheme='quantiles',
    k=5,
    edgecolor='white',
    linewidth=0.,
    alpha=0.75,
    legend=True,
    ax=ax2
)
ax2.set_axis_off()
ax2.set_title("Distance to Help Centres (Km)")


plt.show()

png

As in many cities around the world, AGEBs on the outskirts are those with less access to services. The distance values we have found are widely acceptable, but do they correspond to reality?

Get street network with OSMnx

As for neighbourhoods, OSMnx allows to download street network data, project and plot the networks.

It takes the name of the place, and returns the network within its boundary. For future calculation, I prefer to retrieve the driving network type.

G = ox.graph_from_place('Mexico City, Mexico', network_type='drive')

As for the distance between AGEBs, the same story applies for the network. With ox.project_graph, the graph will be automatically projected to the local UTM zone.

G_proj = ox.project_graph(G)

Since I want to compute distance and travel time from a point A to a point B, I need to impute missing speeds and travel times:

G_proj = ox.speed.add_edge_speeds(G_proj)
G_proj = ox.speed.add_edge_travel_times(G_proj)

Point A, Point B

I could pick any row, no matter which. I opt for the one with the greatest distance

gdfa.loc[gdfa["dist_km"] == max(gdfa.dist_km)]
CVEGEO POBTOT PRESOE15 P5_HLI POB_AFRO GRAPROES PDESOCUP PSINDER VPH_SINLTC VPH_SINCINT VAW_COUNT VAW_per_1000 VAW_per_1000_LAG geometry centroid closest_centre Name_centre distance dist_km
421 0901200261710 785.0 14.0 22.0 38.0 9.95 6.0 328.0 6.0 90.0 6 7.643312 2.410339 POLYGON ((479531.480 2125207.628, 479656.893 2... POINT (479491.354 2124303.978) POINT (478033.049 2130858.925) Hospital General Ajusco Medio 6715.206217 6.715206

To compute distance in OSM network, I need to find the closest nodes (of the network) to my points, gdfa.centroid and gdfa.closest_centre.

#specify index of the selected points
i = 421

orig_node = ox.distance.nearest_nodes(G_proj, X=gdfa.centroid[i].x, Y=gdfa.centroid[i].y)
dest_node = ox.distance.nearest_nodes(G_proj, X=gdfa.closest_centre[i].x, Y=gdfa.closest_centre[i].y)

Once nodes are found, calculate the routes using two different parameters: length and travel time.

route1 = ox.shortest_path(G_proj, orig_node, dest_node, weight="length")
route2 = ox.shortest_path(G_proj, orig_node, dest_node, weight="travel_time")

Compare the routes:

# import datetime to convert seconds
import datetime

route1_length = int(sum(ox.utils_graph.get_route_edge_attributes(G_proj, route1, "length")))
route2_length = int(sum(ox.utils_graph.get_route_edge_attributes(G_proj, route2, "length")))

route1_time = int(sum(ox.utils_graph.get_route_edge_attributes(G_proj, route1, "travel_time")))
route2_time = int(sum(ox.utils_graph.get_route_edge_attributes(G_proj, route2, "travel_time")))


print("Route 1 is", route1_length/1000, "km and takes", datetime.timedelta(seconds =route1_time), "to get there by car.")
print("Route 2 is", route2_length/1000, "km and takes", datetime.timedelta(seconds =route2_time), "to get there by car.")
Route 1 is 10.645 km and takes 0:18:43 to get there by car.
Route 2 is 14.138 km and takes 0:17:46 to get there by car.

Route 2 is longer but smoother, resulting in shorter travel time.

Take a look at the distances. These values are rather out of the linear distance range, for which we had 7 km as maximum value. This tell us that linear distance is good to have a general insight on data, but not suitable for accurate analysis.

Plotting routes on Networks

Plot the shorter route (for travel time) in green and the longer in red.

# plot static map
fig, ax = ox.plot_graph_routes(
    G_proj, routes=[route1, route2], route_colors=["r", "g"], route_linewidth=6, node_size=0, 
    bgcolor='w', edge_linewidth=0.2)

png

Great! It is easy to understand the best route. However, it is impossible to say which direction to take. We can solve this by plotting the route in folium.

First, define a function that returns the shortest route for travel time.

def shortest_time_route(rt1,rt2,r1,r2):
    if rt1 <= rt2:
        return r1
    else:
        return r2

Store the shortest route in ‘route’ variable.

route = shortest_time_route(route1_time,route2_time,route1,route2)

The route is nothing but the sequence of OSM nodes. Let’s take a look to one node:

G_proj.nodes[orig_node]
{'y': 2124263.8605110194,
 'x': 479429.4467834827,
 'street_count': 3,
 'lon': -99.1956934,
 'lat': 19.211704}

Inside every node, we find both the reprojected coordinates, and in the “original” CRS, namely EPSG:4326. Use this last to build the folium map. First, create a list of the route nodes.

path = []

for nds in route:
    lon = G_proj.nodes[nds]['lon'] 
    lat = G_proj.nodes[nds]['lat'] 
    path.append([lat, lon])
    nds += 1

Note that latitude and longitude have different order in non-reproject network!

# epsg:3426
G.nodes[orig_node]
{'y': 19.211704, 'x': -99.1956934, 'street_count': 3}

Create the map:


start_location=[G_proj.nodes[orig_node]['lat'], G_proj.nodes[orig_node]['lon']]
direction_m = folium.Map(location = start_location, zoom_start = 12)

#Add orig marker
folium.Marker(
  location=[G_proj.nodes[orig_node]['lat'], G_proj.nodes[orig_node]['lon']],
  tooltip='Origin',
).add_to(direction_m)

#add dest marker
folium.Marker(
  location=[G_proj.nodes[dest_node]['lat'], G_proj.nodes[dest_node]['lon']],
  popup= gdfa['Name_centre'].values[i],
  tooltip='Destination',
    icon=folium.Icon(color='red')
).add_to(direction_m)

folium.plugins.MiniMap().add_to(direction_m)

folium.plugins.AntPath(path).add_to(direction_m)


#direction_m.save("output/shortest_path.html") #save to a file
#direction_m

Fullscreen Map

Click on destination icon to get the name of the help centre.

The snake flow tells the direction.

Next