I am trying to geocode multiple addresses (batch geocoding) from a CSV file. I made the attempt below using Mapbox geocoding, but the output contains only the addresses, not the geographic coordinates (latitude, longitude). I have searched through existing answers 1, 2, 3, but most of them rely on JavaScript or on online geocoding tools and software. I want to achieve this with Python.

Address column in sample_addresses.csv file:

1/18 MCLAUGHLIN STREET COLAC 3250
2/18 MCLAUGHLIN STREET COLAC 3250
3/18 MCLAUGHLIN STREET COLAC 3250
18 MCLAUGHLIN STREET COLAC 3250
18 MCLAUGHLIN STREET COLAC 3250
107 MAIN STREET ELLIMINYT 3250
105 MAIN STREET ELLIMINYT 3250
1/426 MURRAY STREET COLAC 3250
2/426 MURRAY STREET COLAC 3250
3/426 MURRAY STREET COLAC 3250
426 MURRAY STREET COLAC 3250
426 MURRAY STREET COLAC 3250
164 MURRAY STREET COLAC 3250
162 MURRAY STREET COLAC 3250
1/27 SKENE STREET COLAC 3250

Mapbox Geocoding

from mapbox import Geocoder
import pandas as pd
import json

geocoder = Geocoder(access_token="pk.-------------------------------")

# response = geocoder.forward('Colac, Victoria 3250, Australia')


def load_dataset():
    """Load data from CSV."""
    citiDF = pd.read_csv("sample_addresses.csv").head(5)
    return citiDF


def geocode_address(address):
    """Geocode street address into lat/long."""
    response = geocoder.forward(address)
    coords = str(response.json()['features'][0]['center'])
    coords = coords.replace(']', '')
    coords = coords.replace('[', '')
    return coords


def geocode_dataframe(row):
    """Geocode start and end address."""
    citiDF = geocode_address(row['ADD_EZI_ADDRESS'])

    print(row)


citiDF = load_dataset()
citiDF.apply(geocode_dataframe, axis=1)
citiDF.to_csv('geocoded_results.csv')

Output: the code prints only the addresses, without the geographic coordinates (latitude, longitude):

ADD_EZI_ADDRESS    1/18 MCLAUGHLIN STREET COLAC 3250
Name: 0, dtype: object
ADD_EZI_ADDRESS    2/18 MCLAUGHLIN STREET COLAC 3250
Name: 1, dtype: object
ADD_EZI_ADDRESS    3/18 MCLAUGHLIN STREET COLAC 3250
Name: 2, dtype: object
ADD_EZI_ADDRESS    18 MCLAUGHLIN STREET COLAC 3250
Name: 3, dtype: object
ADD_EZI_ADDRESS    18 MCLAUGHLIN STREET COLAC 3250
Name: 4, dtype: object
Case Msee
  • Attach the output after printing `response`. – oreopot Jun 09 '20 at 07:06
  • The output after printing is already attached to the question, at the end. – Case Msee Jun 09 '20 at 12:14
  • It is simpler to use a batch geocoding tool, for example: https://geocode.xyz/972701319580567,share?export=GeoCluster You may also use a bash script to geocode a CSV file: #!/bin/bash while IFS='' read -r line || [[ -n "$line" ]]; do echo $line,`curl -X POST -d locate="$line" -d geoit="csv" https://geocode.xyz`; done < "$1" – Ervin Ruci Jun 15 '20 at 17:49

1 Answer

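The function passed to apply in the question never returns or stores the result of geocode_address, so the coordinates are discarded. I used a list comprehension to build a new column instead, and it works well. I imagine using apply or map may be faster.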

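import argparse

import pandas as pd
from mapbox import Geocoder

geocoder = Geocoder(access_token="pk.-------------------------------")

def load_data(filename):
    """Load the addresses from a CSV file."""
    return pd.read_csv(filename)

def geocode_address(address):
    """Return the first match as a "longitude, latitude" string."""
    response = geocoder.forward(address)
    # Mapbox returns 'center' as [longitude, latitude].
    coords = str(response.json()["features"][0]["center"])
    return coords.replace("]", "").replace("[", "")

def geocode_df(df):
    """Add a 'coordinates' column (swap "address" for your column, e.g. "ADD_EZI_ADDRESS")."""
    df["coordinates"] = [geocode_address(i) for i in df["address"]]
    return df

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("file", help="CSV file with an 'address' column")
    args = parser.parse_args()
    data = load_data(args.file)
    geocode_df(data)
    data.to_csv(f"GEOD--{args.file}", index=False)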
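
If you would rather have separate latitude and longitude columns than a single "longitude, latitude" string, something along these lines might work. This is only a sketch: extract_lon_lat, add_lat_lon and the fetch parameter are names I made up for illustration, and fake_fetch is an offline stand-in with placeholder coordinates so the example runs without a token. It also returns empty values when the response has no features, instead of raising an IndexError on features[0].

```python
import pandas as pd


def extract_lon_lat(payload):
    """Return (longitude, latitude) from a geocoding response dict.

    Falls back to (None, None) when the response contains no features.
    """
    features = payload.get("features") or []
    if not features:
        return (None, None)
    lon, lat = features[0]["center"]  # Mapbox order is [longitude, latitude]
    return (lon, lat)


def add_lat_lon(df, address_col, fetch):
    """Return a copy of df with 'longitude' and 'latitude' columns added.

    fetch is a hypothetical callable: address string -> response dict.
    """
    pairs = [extract_lon_lat(fetch(addr)) for addr in df[address_col]]
    out = df.copy()
    out["longitude"] = [p[0] for p in pairs]
    out["latitude"] = [p[1] for p in pairs]
    return out


def fake_fetch(address):
    """Offline stand-in for the network call (placeholder coordinates)."""
    if "NOWHERE" in address:
        return {"features": []}
    return {"features": [{"center": [143.58, -38.34]}]}


demo = pd.DataFrame(
    {"ADD_EZI_ADDRESS": ["18 MCLAUGHLIN STREET COLAC 3250", "1 NOWHERE ROAD"]}
)
result = add_lat_lon(demo, "ADD_EZI_ADDRESS", fake_fetch)
print(result)
```

With the real client, you would presumably pass something like lambda a: geocoder.forward(a).json() as fetch, using the geocoder object from the question.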
JDots