Background: I obtained a list of points as a CSV from data extracted from Google Maps, cleaned it in pandas, and exported it as a JSON file (using the records orientation for the export).

Issue: The coordinates are strings, which makes sense because they were originally embedded in a URL.

Example: https://www.google.com/maps/search/{coordinates}

I used the replace function to strip out the surrounding text so that only the coordinates remain. Is there a way to make the values in my Location column numeric and put them in a list?

Example Mock-up data of what my exported JSON file looks like:

[
{
      "Bin":"Yes",
      "Location":"##.##,-###.##"
   },

I was trying to clean my data to look like the example below

Example of a GeoJSON file I was trying to model

[
{
    location: [41.8781, -87.6298],
    city: "Chicago"
  },

Goal: I am trying to make a custom map for my own use in Mapbox.

Example mock-up of what my DataFrame looks like:

    Bin         Location
0   Yes         ##.##,-###.##
1   Yes         ##.##,-###.##

Input: df.dtypes

Output:
Bin          object
Location     object
dtype: object

Thank you for the help.

1 Answer

You'll need to store the numbers in location as separate columns (I'm assuming these are lat/long coordinates) so that pandas treats them as numbers. Ideally, you should change your JSON cleaning code so that each record looks like this before you read it into a DataFrame:

{
    lat: 41.8781,
    long: -87.6298,
    city: "Chicago"
}

However, you can also solve this problem once the data is in a DataFrame:

import pandas as pd

json_data = [
    {"location": [41.8781, -87.6298], "city": "chicago"},
    {"location": [44.8141, 20.1234], "city": "somewhere"}
]

df = pd.DataFrame.from_records(json_data)

print(df)
    location            city
0   [41.8781, -87.6298] chicago
1   [44.8141, 20.1234]  somewhere

print(df.dtypes)
location    object
city        object
dtype: object

Applying our transformation:

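# Expand each [lat, long] list into two float columns; index=df.index keeps the rows aligned
df[["lat", "long"]] = pd.DataFrame(df["location"].tolist(), columns=["lat", "long"], index=df.index)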

print(df)
    location            city      lat       long
0   [41.8781, -87.6298] chicago   41.8781   -87.6298
1   [44.8141, 20.1234]  somewhere 44.8141   20.1234

print(df.dtypes)
location     object
city         object
lat         float64
long        float64
dtype: object

What we just did was tell pandas that our "location" column actually has 2 values in it and that they should be in separate columns. We expand this and add it back to the original dataframe.
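The same idea applies when the column holds strings rather than lists, as in the question's Location column. The following is only a minimal sketch, assuming every value is a "lat,long" string containing exactly one comma; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data shaped like the DataFrame in the question
df = pd.DataFrame({
    "Bin": ["Yes", "Yes"],
    "Location": ["41.8781,-87.6298", "44.8141,20.1234"],
})

# Split each string on the comma into two columns, then cast both to float
coords = df["Location"].str.split(",", expand=True).astype(float)

# Column 0 is the part before the comma, column 1 the part after it
df["lat"] = coords[0]
df["long"] = coords[1]

print(df.dtypes)
```

If some rows might be malformed, the astype(float) call would raise an error, so in that case it may be safer to convert each column with pd.to_numeric instead.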

If, for whatever reason, pandas doesn't parse your lat/long columns as floats automatically (for example, because the values are still strings), you can use pd.to_numeric to convert object columns to integer/float dtypes.

df["lat"] = pd.to_numeric(df["lat"])
df["long"] = pd.to_numeric(df["long"])

print(df)
              location       city      lat     long
0  [41.8781, -87.6298]    chicago  41.8781 -87.6298
1   [44.8141, 20.1234]  somewhere  44.8141  20.1234

print(df.dtypes)
location     object
city         object
lat         float64
long        float64
dtype: object
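
If the end goal is a JSON file in which each record has a location: [lat, long] pair, one possible approach is to rebuild the pair just before exporting. This is a sketch under the assumption that the DataFrame already has float lat and long columns. The combined column has object dtype in pandas, but the values inside each list are still floats, so they are written to JSON as numbers rather than strings.

```python
import json

import pandas as pd

# Assumed starting point: float lat/long columns, as produced above
df = pd.DataFrame({
    "city": ["chicago", "somewhere"],
    "lat": [41.8781, 44.8141],
    "long": [-87.6298, 20.1234],
})

# Build one [lat, long] list per row (the column dtype is object, the values are floats)
df["location"] = df[["lat", "long"]].to_numpy().tolist()

# Keep only the fields wanted in the export and serialize them as a list of records
records = df[["city", "location"]].to_dict(orient="records")
json_text = json.dumps(records, indent=2)

print(json_text)
```
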
Cameron Riddell
  • I am not reading the JSON file into my Dataframe. I started off with a CSV file then cleaned the data and exported it to a JSON file, but thank you for your response this is really helpful. I am going to give this another try with your method. I appreciate your in-depth response its really good! – Juan M Guevara Martinez Sep 17 '20 at 18:55
  • Question is there anyway to combine lat and long and keeping the float64 format? I tried list(zip(df.lat, df.long)), but my combined zip made it an object again. – Juan M Guevara Martinez Sep 17 '20 at 23:42
  • Unfortunately there is not. When you zip those numbers together, you end up making them a column of tuples. Since the values in the column are tuples, pandas can not determine that they are floats since all it can "see" is the tuple and not the numbers inside of it. Out of curiosity, why do you want them together? – Cameron Riddell Sep 17 '20 at 23:45
  • I see, thanks for the follow up. I wanted them together so that when I call the coordinates in my javascript file I can just call it as: L.marker( city.location ) (using leaflet) – Juan M Guevara Martinez Sep 18 '20 at 17:59
  • Ah, I've never worked with leaflet. But would it be possible to do something like: `L.marker({"lat": city.lat, "lon": city.lon})` based on [this documentation](https://leafletjs.com/reference-1.7.1.html#latlng). Alternatively, you could try zipping those arrays in javascript via [this stackoverflow answer](https://stackoverflow.com/questions/22015684/how-do-i-zip-two-arrays-in-javascript) – Cameron Riddell Sep 18 '20 at 21:11