I have a homework assignment where I've been provided with a file about 10,000 lines in length. Each line has three elements: a species name, a latitude and a longitude. I need to write a function which returns the number of animals found within a specific distance of a given location, taking four parameters: the file name, the distance, and the latitude and longitude of the location.

In an ideal world, I'd be able to go into the shell, and call the function with the file name, any distance and any longitude and latitude, and have the number of animals within the distance be calculated.

I've successfully managed to import the file, and I've been given samples of code to help calculate distance and also help convert the file to a list. Here's the code I've written so far:

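import math

def LineToList(Line):
    # Strip the trailing newline, then split on tabs into a list of fields
    Line = Line.rstrip()
    return Line.split("\t")

def LocationCount(filename, distance, Lat1, Lon1):
    # Read every line into a list of [species, latitude, longitude] fields;
    # the distance check and the counting are not implemented yet
    Rows = []
    FIn = open(filename, "r")
    for Line in FIn:
        Rows.append(LineToList(Line))
    FIn.close()
    return Rows

def CalculateDistance(Lat1, Lon1, Lat2, Lon2):
    # Haversine distance in kilometres between two points given in degrees
    Lat1 = float(Lat1)
    Lon1 = float(Lon1)
    Lat2 = float(Lat2)
    Lon2 = float(Lon2)

    # 0.017453293 converts degrees to radians (pi / 180)
    nDLat = (Lat1 - Lat2) * 0.017453293
    nDLon = (Lon1 - Lon2) * 0.017453293

    Lat1 = Lat1 * 0.017453293
    Lat2 = Lat2 * 0.017453293

    nA = (math.sin(nDLat/2) ** 2) + math.cos(Lat1) * math.cos(Lat2) * (math.sin(nDLon/2) ** 2)
    nC = 2 * math.atan2(math.sqrt(nA), math.sqrt(1 - nA))
    # 6372.797 is the Earth's radius in kilometres
    nD = 6372.797 * nC

    return nD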
  • Welcome to SO! This is a bit beside the point, but I strongly recommend `snake_case` in Python. `CamelCase` is for class names only by convention. Also, be sure to clean up your indentation; this is invalid Python. – ggorlen Mar 27 '19 at 18:15
  • I'm not 100% sure what you're asking, are you having trouble accessing the data in the list returned by `LocationCount()`? If so the first line will be `mylist[0]` or the first element in the `i`th line will be `mylist[i][0]`. If that's not it could you be a bit more specific in what you need to do? – Hoog Mar 27 '19 at 18:22
  • I'd recommend checking the pandas library. You can load your data into a dataframe (think of a table), and then apply your CalculateDistance function to the whole table, and add the distance as a column. After that, you can filter for all the lines where the distance is within the input distance. An alternative approach would be iterating through your list one line at a time and applying the distance function. – Baris Tasdelen Mar 27 '19 at 18:27
  • Sorry, I think this is the process I need to follow: load the file into Python, split the file into lines, calculate the distance for each line and, if that distance is less than the distance specified in the shell, add one to a counter. Finally, print the counter. I'm having trouble figuring out how to calculate the distance, since I can't figure out how to access the last 2 numbers in each line, the longitude and latitude. – biffalo_ Mar 27 '19 at 18:29
  • Seems there are a few things here to point out. First, the question of splitting a file into lines, and then splitting those lines into lists. Both of these can be accomplished with str.split(), but might be managed more easily with a csv reader. It might save a lot of work to start with pd.read_csv(), then do as @Baris Tasdelen mentioned. Second, as latitude and longitude are in degrees, you will need some kind of transformation to calculate distance. You can do this manually, as described at https://stackoverflow.com/questions/19412462/getting-distance-between-two-points-based-on-latitude-longitude. – bart cubrich Mar 27 '19 at 19:15
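
For reference, here is a minimal sketch of the csv reader idea from the comments above, using only the standard library. It assumes the file is tab-separated with exactly three fields per line. The in-memory `data` object is a stand-in for the real file, and the names `data` and `rows` are made up for illustration.

```python
import csv
import io

# Stand-in for the real file (assumption: one tab-separated record per line).
# With a real file you would use: open(filename, newline="")
data = io.StringIO("Myotis nattereri\t54.07663633\t-1.006446707\n")

rows = []
for species, lat, lon in csv.reader(data, delimiter="\t"):
    # Convert the two coordinate strings to floats as each row is read
    rows.append((species, float(lat), float(lon)))

# The latitude and longitude of the first record are at index 1 and 2
first_lat = rows[0][1]
first_lon = rows[0][2]
print(first_lat, first_lon)
```

Once each row is a tuple like this, the "last two values" of line `i` are simply `rows[i][1]` and `rows[i][2]`.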

1 Answer

To split a line into parts you can use str.split(). For example, to split a line on spaces into 3 parts, you could use _, lat, lon = line.strip().split(' ') (the underscore is just a convention to indicate that you don't intend to use the first part).
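
As a small illustration of that unpacking, here is a sketch that uses the sample record quoted in the comments below, assuming the fields really are separated by single tab characters. The separator matters: a species name such as "Myotis nattereri" contains a space, so only a split on the tab gives exactly three fields.

```python
# Sample record, assuming tab characters between the three fields
line = "Myotis nattereri\t54.07663633\t-1.006446707\n"

# Splitting on a tab yields exactly three fields, so unpacking works
species, lat, lon = line.strip().split("\t")
lat, lon = float(lat), float(lon)
print(species, lat, lon)

# Splitting on a space cuts inside the species name instead and does
# not yield three fields, so three-way unpacking would raise ValueError
parts_on_space = line.strip().split(" ")
print(len(parts_on_space))
```

If the separator may vary between files, it is probably worth printing one split line first to check how many parts come back.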

Here is a more complete example. I formatted the code according to Python's style conventions (see the PEP 8 style guide).

import math

def count_locations(filename, max_distance, source_lat, source_lon):
    counter = 0

    with open(filename) as f:
        for line in f:
            try:
                # try to split into 3 parts
                _, lat, lon = line.strip().split(' ')
            except ValueError:
                # cannot be split into 3 parts, so we skip this line
                continue

            try:
                # try to convert
                lat = float(lat)
                lon = float(lon)
            except ValueError:
                # cannot be converted to float, so we skip this line
                continue

            d = calculate_distance(source_lat, source_lon, lat, lon)
            if d <= max_distance:
                counter += 1

    return counter

def calculate_distance(lat_1, lon_1, lat_2, lon_2):
    n_d_lat = (lat_1 - lat_2) * 0.017453293
    n_d_lon = (lon_1 - lon_2) * 0.017453293

    lat_1 = lat_1 * 0.017453293
    lat_2 = lat_2 * 0.017453293

    n_A = (
            math.sin(n_d_lat / 2) ** 2
            + math.cos(lat_1) * math.cos(lat_2) * math.sin(n_d_lon / 2) ** 2
    )
    n_C = 2 * math.atan2(math.sqrt(n_A), math.sqrt(1 - n_A))
    n_D = 6372.797 * n_C

    return n_D
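
One way to check the formula on its own is to feed it inputs whose answer is known in advance. The sketch below is a compact restatement of the same haversine calculation, not a replacement for the function above. The name `haversine_km` and the test coordinates are made up for illustration, and `math.radians` stands in for the 0.017453293 factor.

```python
import math

EARTH_RADIUS_KM = 6372.797  # same radius as in the code above

def haversine_km(lat_1, lon_1, lat_2, lon_2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi_1, phi_2 = math.radians(lat_1), math.radians(lat_2)
    d_phi = math.radians(lat_2 - lat_1)
    d_lambda = math.radians(lon_2 - lon_1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi_1) * math.cos(phi_2) * math.sin(d_lambda / 2) ** 2
    )
    return EARTH_RADIUS_KM * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

# A point compared with itself should be at distance zero
same_point = haversine_km(54.0, -1.0, 54.0, -1.0)

# One degree of latitude should come out at roughly 111 km
one_degree_north = haversine_km(54.0, -1.0, 55.0, -1.0)

print(same_point, one_degree_north)
```

Note that the result is in kilometres, so the max_distance passed to count_locations() needs to be in kilometres as well.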

Does this work for you?

Ralf
  • Thanks so much for trying it out. The function runs without error messages, but every calculation comes out as zero. Here's a sample of what the file looks like: Myotis nattereri 54.07663633 -1.006446707. It's separated by tabs, so I replaced the blank space in the string.split with a /t. – biffalo_ Mar 27 '19 at 19:12
  • @biffalo_ are you sure that your formula is correct? Have you tried to print the different steps inside `calculate_distance()` to see where it goes wrong? – Ralf Mar 27 '19 at 20:31
  • For tab did you use \t (not /t) in the split function? – Baris Tasdelen Mar 28 '19 at 01:48
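
One possible explanation for a count of zero, sketched below under the assumption that the file is tab-separated: if the separator passed to split() never occurs in the line, the three-way unpacking raises ValueError, and the try/except in the answer skips the line silently. The helper `count_parsed_and_skipped` and the second sample row are made up for illustration.

```python
def count_parsed_and_skipped(lines, separator):
    """Return (parsed, skipped) for lines split into three fields on separator."""
    parsed = skipped = 0
    for line in lines:
        try:
            _, lat, lon = line.strip().split(separator)
            float(lat)
            float(lon)
        except ValueError:
            # Wrong number of fields, or a coordinate that is not a number
            skipped += 1
            continue
        parsed += 1
    return parsed, skipped

# First row is the sample from the comments; second row is illustrative
sample_lines = [
    "Myotis nattereri\t54.07663633\t-1.006446707\n",
    "Myotis nattereri\t54.1\t-1.2\n",
]

# "/t" is the two characters '/' and 't', which never occur in the data
with_slash_t = count_parsed_and_skipped(sample_lines, "/t")

# "\t" is a single tab character
with_tab = count_parsed_and_skipped(sample_lines, "\t")

print(with_slash_t, with_tab)
```

If every line ends up in the skipped count, the separator is the first thing to check, before looking at the distance formula.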