I'm relatively new to Python, so apologies in advance if these questions are too basic.

I have a CSV file with the following columns: 'CarNumber', 'DateTime', 'GPS', 'Speed'. The GPS column contains values in the form 'Latitude : Longitude'.

I want to:

1) Load the CSV file

2) Split the GPS column into Latitude and Longitude columns

3) Apply the haversine formula to calculate the distance between two points with known latitude and longitude. So far I've come up with the following function:

from math import sin, cos, sqrt, atan2, radians

def distRad(glat1, glng1, glat2, glng2):
    # approximate radius of earth in km
    R = 6371.0
    # convert the coordinates from degrees to radians
    lat1 = radians(glat1)
    lng1 = radians(glng1)
    lat2 = radians(glat2)
    lng2 = radians(glng2)
    dlng = lng2 - lng1
    dlat = lat2 - lat1
    # haversine formula: a is the squared half-chord length, c the central angle
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlng / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    return R * c

4) Write the results to a new CSV file with the columns 'CarNumber', 'DateTime', 'Latitude', 'Longitude', 'Distance'

I know this might sound simple and trivial, but I still need guidance.

Portion of my CSV file:

CarNumber;DateTime;GPS;Speed
230;04.06.2019 0:00:12;87,96978 : 159,588606;20

Thank you!

Adren
  • What's the question? Does your solution work? Is it deficient? How is it deficient? Do you suspect any part of it being the culprit? Why? – wwii Sep 19 '19 at 02:18
  • I don't know how to load the CSV file and perform the following steps – Adren Sep 19 '19 at 02:19
  • Can you provide the portion of your CSV? You can directly paste it in the question as text. – moys Sep 19 '19 at 02:19
  • Welcome to SO. Please take the [tour] and take the time to read [ask] and the other links found on that page. This isn't a discussion forum or tutorial service. There is a [`csv`](https://docs.python.org/3/library/csv.html) module and the examples in the docs should get you started. – wwii Sep 19 '19 at 02:21
  • SH-SF, here's my CSV: CarNumber;DateTime;GPS;Speed 230;04.06.2019 0:02:43;40,969915 : 151,588611;7,72 – Adren Sep 19 '19 at 02:22
  • Can you share how lat long is stored in the column ? – dper Sep 19 '19 at 02:42
  • What are `glat1, glng1, glat2, glng2` in your function? I ask because you have only one lat & long value in each row. However, your function needs 4 variables. – moys Sep 19 '19 at 02:45
  • SH-SF, glat1 glng1 are coordinates of current point and glat2 glng2 of previous one – Adren Sep 19 '19 at 02:47
  • user375916, 57,966779 : 101,655177 – Adren Sep 19 '19 at 02:48

2 Answers

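  1. Load the CSV into a DataFrame. The sample in the question is semicolon-separated and uses decimal commas:

    import pandas as pd
    df = pd.read_csv('data.csv', sep=';', decimal=',')
    
  2. Assuming lat and long are separated by ':', split the GPS column and convert both parts to floats:

    gps = df['GPS'].str.replace(',', '.', regex=False).str.split(':', n=1, expand=True)
    df['lat'] = gps[0].str.strip().astype(float)
    df['long'] = gps[1].str.strip().astype(float)
    
  3. Apply the haversine formula using the haversine package in Python. It takes two (lat, long) tuples and returns kilometres by default; here each row is paired with the previous one:

    from haversine import haversine
    points = list(zip(df['lat'], df['long']))
    df['Distance'] = [0.0] + [haversine(p, q) for p, q in zip(points, points[1:])]
    
  4. To write the result to a CSV file you can use:

    df.to_csv('check.csv', sep=',', encoding='utf-8', index=False)
    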

Please note that these are just pointers, I did not test the code but this should get you started.
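The steps above can be put together into one short script. This is only a sketch under a few assumptions: the sample rows are made up in the question's format (using the Lyon and Paris coordinates quoted in the comments), rows are assumed to be in chronological order already, and the distance is computed with a small vectorised NumPy helper (`haversine_km`, a hypothetical name) instead of the haversine package. The first point of each car has no predecessor, so its distance is set to 0.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample in the question's format (semicolons, decimal commas).
SAMPLE = """CarNumber;DateTime;GPS;Speed
230;04.06.2019 0:00:12;45,7597 : 4,8422;20
230;04.06.2019 0:02:43;48,8567 : 2,3508;7,72
231;04.06.2019 0:01:00;45,7597 : 4,8422;15
"""


def haversine_km(lat1, lng1, lat2, lng2, radius_km=6371.0):
    """Vectorised haversine distance in km; all inputs are in degrees."""
    lat1, lng1, lat2, lng2 = map(np.radians, (lat1, lng1, lat2, lng2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(a))


# Replace io.StringIO(SAMPLE) with the path of the real file.
df = pd.read_csv(io.StringIO(SAMPLE), sep=";", decimal=",")

# "lat : lng" -> two float columns (decimal comma replaced by a dot first).
gps = df["GPS"].str.replace(",", ".", regex=False).str.split(":", n=1, expand=True)
df["Latitude"] = gps[0].str.strip().astype(float)
df["Longitude"] = gps[1].str.strip().astype(float)

# Pair every row with the previous row of the same car
# (assumes the rows are already sorted by time).
prev = df.groupby("CarNumber")[["Latitude", "Longitude"]].shift()

# Rows without a previous point produce NaN, which becomes a distance of 0.
df["Distance"] = haversine_km(
    prev["Latitude"], prev["Longitude"], df["Latitude"], df["Longitude"]
).fillna(0.0)

out = df[["CarNumber", "DateTime", "Latitude", "Longitude", "Distance"]]

# With no path, to_csv returns the CSV text; pass a file name to write to disk.
csv_text = out.to_csv(sep=";", index=False)
```

For the two rows of car 230 this should give roughly 392 km, in line with the Lyon to Paris figure quoted in the comments. If the rows are not in time order, they would need to be sorted per car before the `shift()`.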

Edit:

For iterating through your CSV dataset, you can do the following:

import csv

def read_csv(filepath, has_header=False, delimiter=';'):
    # the with block closes the file automatically, so no explicit close() is needed
    with open(filepath, 'r', newline='') as file:
        reader = csv.reader(file, delimiter=delimiter)
        data = list(reader)
    header = None
    if has_header and data:
        # the first row holds the column names
        header = data[0]
        data = data[1:]
    return data, header

codes_dict = {}
data, header = read_csv("data/your_csv.csv", has_header=True)

# iterate and create the map having lat long in codes_dict
for row in data:
    # ...
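One way such a loop could look, using only the standard library, is sketched below. It is not necessarily what the answer had in mind: the helper names (`parse_gps`, `haversine_km`, `add_distances`) and the sample rows are hypothetical, and rows are assumed to be in chronological order. A dictionary remembers the last point seen for each car, so every row is compared with that car's previous position.

```python
import csv
import io
from math import asin, cos, radians, sin, sqrt

# Hypothetical sample in the question's format (semicolons, decimal commas).
SAMPLE = """CarNumber;DateTime;GPS;Speed
230;04.06.2019 0:00:12;45,7597 : 4,8422;20
230;04.06.2019 0:02:43;48,8567 : 2,3508;7,72
"""


def parse_gps(text):
    """Turn '45,7597 : 4,8422' into (45.7597, 4.8422)."""
    lat, lng = (float(part.strip().replace(",", ".")) for part in text.split(":"))
    return lat, lng


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lng) points in degrees."""
    lat1, lng1, lat2, lng2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))


def add_distances(rows):
    """Yield output rows with the distance to the same car's previous point."""
    last_seen = {}  # CarNumber -> (lat, lng) of that car's previous row
    for row in rows:
        point = parse_gps(row["GPS"])
        previous = last_seen.get(row["CarNumber"])
        # The first point of a car has nothing to compare with.
        distance = haversine_km(previous, point) if previous is not None else 0.0
        last_seen[row["CarNumber"]] = point
        yield [row["CarNumber"], row["DateTime"], point[0], point[1], distance]


# Replace io.StringIO(SAMPLE) with open("data/your_csv.csv", newline="").
reader = csv.DictReader(io.StringIO(SAMPLE), delimiter=";")
results = list(add_distances(reader))

# Swap the buffer for open("result.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=";")
writer.writerow(["CarNumber", "DateTime", "Latitude", "Longitude", "Distance"])
writer.writerows(results)
```

`csv.DictReader` is used here so that columns can be addressed by name rather than by position.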
Ayrat
dper
  • Steps 1 and 2 are working just fine. I installed the haversine package, but now it gives me an error on df['Distance']=haversine(). Do I need to specify my lat and long columns? – Adren Sep 19 '19 at 03:08
  • @MDoskarin Like I said, this is just a pseudo code, for step 3 you need to provide lat long to the formula or else it wouldn't understand the parameters. Read this for the same : https://pypi.org/project/haversine/ – dper Sep 19 '19 at 03:16
  • Ah, I see: lyon = (45.7597, 4.8422) # (lat, lon); paris = (48.8567, 2.3508); haversine(lyon, paris) >> 392.2172595594006 # in kilometers. How would it work in a case where I need to calculate the distance between two points corresponding to the nearest time? – Adren Sep 19 '19 at 03:20
  • you need to create a dictionary of (lat long) by iterating through your dataframe and then calculate the distance. – dper Sep 19 '19 at 03:34
  • @MDoskarin I have edited the code to make it more understandable, I can't provide you the exact answer as would have to look at the data as well. This should be sufficient for you to understand the dynamics of how you need to proceed with the problem. – dper Sep 19 '19 at 03:41
  • @MDoskarin Please upvote and mark as accepted if you think this helps you move forward. Thanks – dper Sep 19 '19 at 03:42
  • @MDoskarin No issues, one more suggestion, I am not sure about why are you choosing haversine distance, but there are others as well, I prefer vincenty's distance over haversine. Here is a discussion to it if it helps , https://stackoverflow.com/questions/19412462/getting-distance-between-two-points-based-on-latitude-longitude/43211266#43211266 – dper Sep 19 '19 at 03:56
  • I'll certainly look into Vincenty's method as soon as I'll figure out how to iterate through column, thank you! – Adren Sep 19 '19 at 04:07
  • iterating is pretty straightforward, google it and you will find many solutions to it – dper Sep 19 '19 at 04:21

I would simply recommend using the pandas library and its read_csv() function.

import pandas as pd
# the sample in the question is semicolon-separated
df = pd.read_csv('nameofcsv.csv', sep=';')
print(df)

# calculations here

df.to_csv('savename.csv', index=False)
Colton Neary