I have two CSV files containing coordinate information for points. One file (let's call it File 1) has a list of eleven unique IDs, and each ID corresponds to a lat/lon coordinate. The other file (let's call it File 2) has around 300 lat/lon points, and each point matches one of those eleven unique IDs. What I want to do is check whether a point in File 2 has the same ID as a point in File 1, and if so, calculate the distance between those two points.
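To make the goal concrete, here is a minimal sketch of the result I am after, on made-up data. The `lat`/`lon` column names, the toy values, and the `haversine_km` helper are assumptions for illustration only; it uses a plain haversine formula on a spherical Earth rather than the `.distance` method from my attempts below.

```python
import numpy as np
import pandas as pd


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, assuming a spherical Earth (radius 6371 km)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))


# Hypothetical stand-ins for File 1 and File 2 (not my real data).
res_df = pd.DataFrame(
    {"Residence_ID": [1, 2], "lat": [0.0, 10.0], "lon": [0.0, 20.0]}
)
will_df = pd.DataFrame(
    {"Residence_ID": [1, 2, 2], "lat": [0.0, 10.0, 11.0], "lon": [1.0, 20.0, 20.0]}
)

# Pair every File 2 point with the File 1 point that shares its ID.
pairs = will_df.merge(res_df, on="Residence_ID", suffixes=("_will", "_res"))

# One distance per matched pair, computed column-wise.
pairs["dist_km"] = haversine_km(
    pairs["lat_will"], pairs["lon_will"], pairs["lat_res"], pairs["lon_res"]
)
print(pairs[["Residence_ID", "dist_km"]])
```

Each row of `pairs` is one File 2 point together with its File 1 counterpart, so the table should have as many rows as there are matching File 2 points.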
I have tried to do so using the following code:
for index, row in res_df:
    for windex, wrow in will_df:
        if res_df['Residence_ID'] == will_df['Residence_ID']:
            print(res_df.distance(will_df))
However, when I try this, I get the following error:
ValueError                                Traceback (most recent call last)
<ipython-input-73-b4268cef3e71> in <module>()
----> 1 for index, row in res_df:
      2     for windex, wrow in will_df:
      3         if res_df['Residence_ID'] == will_df['Residence_ID']:
      4             print(res_df.distance(will_df))
      5
ValueError: too many values to unpack (expected 2)
I also tried using iterrows at one point, but that did not fix my problem.
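For reference, here is a small sketch of what each iteration style yields, on a made-up two-row frame (`df` and its values are hypothetical). Iterating a DataFrame directly yields column labels, so unpacking a label such as `'Residence_ID'` into two names fails with "too many values to unpack"; `iterrows()` yields `(index, row)` pairs and `items()` yields `(column label, column)` pairs.

```python
import pandas as pd

# Hypothetical frame standing in for res_df.
df = pd.DataFrame({"Residence_ID": [1, 2], "lat": [40.0, 41.0]})

# Iterating the DataFrame itself walks over the column labels.
labels = list(df)

# iterrows() yields (index, row) pairs; each row is a Series,
# so a single cell is read with row["column_name"].
rows = [(idx, row["Residence_ID"]) for idx, row in df.iterrows()]

# items() yields (column label, column Series) pairs, i.e. columns, not rows.
cols = [name for name, col in df.items()]

print(labels)
print(rows)
print(cols)
```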
Additionally, I wanted to try to count the number of matching records, and I ran into a problem here too:
counter = 0
for id1 in res_df['Residence ID']:
    for id2 in will_df['Residence_ID']:
        if id1 == id2:
            print("match")
            counter += 1
print(counter)
When I run the code above, my counter returns a value of 52; however, this doesn't make sense, because all of my 300 records in File 2 match with some record in File 1. So, I think I am missing some fundamental logic here.
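The count I expect can be sketched without nested loops using `Series.isin`. The frames below are made-up stand-ins, and the column name `Residence_ID` in both is an assumption. One thing this makes easy to check: if the ID columns had different dtypes (for example strings in one file and integers in the other), fewer rows would match than expected.

```python
import pandas as pd

# Hypothetical stand-ins for the two files (not my real data).
res_df = pd.DataFrame({"Residence_ID": [1, 2, 3]})
will_df = pd.DataFrame({"Residence_ID": [1, 1, 2, 3, 3, 4]})

# True for every File 2 row whose ID appears anywhere in File 1.
has_match = will_df["Residence_ID"].isin(res_df["Residence_ID"])

# Summing booleans counts the True values.
match_count = int(has_match.sum())
print(match_count)

# Comparing dtypes is a quick sanity check before matching IDs.
same_dtype = res_df["Residence_ID"].dtype == will_df["Residence_ID"].dtype
print(same_dtype)
```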
EDIT:
I also just tried:
for index, row in res_df.items():
    for windex, wrow in will_df.items():
        if res_df['Residence_ID'] == will_df['Residence_ID']:
            print(res_df.distance(will_df))
and the error message is:
ValueError                                Traceback (most recent call last)
<ipython-input-80-65e9f78ad2a5> in <module>()
      1 for index, row in res_df.items():
      2     for windex, wrow in will_df.items():
----> 3         if res_df['Residence_ID'] == will_df['Residence_ID']:
      4             print(res_df.distance(will_df))

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in __nonzero__(self)
   1553             "The truth value of a {0} is ambiguous. "
   1554             "Use a.empty, a.bool(), a.item(), a.any() or a.all().".format(
-> 1555                 self.__class__.__name__
   1556             )
   1557         )

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Not sure what this means.
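As far as I can tell, the message appears whenever a whole Series of booleans is used where a single True/False is expected, as in an `if`. A minimal sketch with two made-up Series (`a` and `b` are hypothetical, not my data) reproduces it:

```python
import pandas as pd

a = pd.Series([1, 2, 3])
b = pd.Series([1, 5, 3])

# == compares element by element and returns a Series of booleans,
# not a single bool.
comparison = a == b

try:
    if comparison:  # a Series has no single truth value
        pass
    raised = False
except ValueError as err:
    raised = True
    print(err)

# Reducing the Series to one bool makes the intent explicit.
all_equal = bool(comparison.all())  # every element matches?
any_equal = bool(comparison.any())  # at least one element matches?
print(all_equal, any_equal)
```

So the `if` in my loop seems to be comparing two entire columns at once rather than the two IDs of the current pair of rows.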