I am using geograpy to extract locations from URLs. My DataFrame has a URL column. I am trying to run a pre-built geograpy function on each URL and store the resulting locations in a new column of the DataFrame. So far I have tried (based on other questions):

hits['place'] = geograpy.get_place_context(url=hits.urls)

# and

hits['place'] = hits.apply(geograpy.get_place_context(url=hits.urls), axis=1)

# and

def getPlace(frame):
    urls = frame['urls']
    print(urls)
    frame['place'] = geograpy.get_place_context(url=urls)
    return frame

getPlace(hits)

Along with a few others. I keep getting

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I take this to mean that the function is being handed the whole urls column rather than a single URL, and it cannot run on a column. That part doesn't really matter.

How can I run a function for every row in a dataframe and create a new column?

I expect my places to be a 'memory type object' I can reference later. I have gotten part of this to work with:

for url in urls:
    place = (geograpy.get_place_context(url=url))
    region = place.country_regions

However, later in the code, the iteration causes it to fall apart.

Sam Dean

2 Answers

The pandas apply function does exactly what you want; you just didn't pass the right argument. As the documentation shows, you need to pass the function itself, not the result of a function call.

So, just pass geograpy.get_place_context to apply on the urls column, like this:

hits['place'] = hits['urls'].apply(geograpy.get_place_context)
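To see what "pass the function, not the call" means without needing geograpy or a network connection, here is a minimal sketch. `FakePlace` and `fake_get_place_context` are hypothetical stand-ins I made up for illustration; they are not part of geograpy, and the `country_regions` value is invented. The only assumption about the real function is that it can be called with a single URL string.

```python
import pandas as pd


class FakePlace:
    """Hypothetical stand-in for the object geograpy returns."""

    def __init__(self, url):
        self.url = url
        # Made-up attribute value, only here to have something to read back.
        self.country_regions = {"host": url.split("/")[2]}


def fake_get_place_context(url=None):
    # Hypothetical replacement for geograpy.get_place_context(url=...).
    return FakePlace(url)


hits = pd.DataFrame({"urls": ["http://a.example/x", "http://b.example/y"]})

# Pass the function object; apply calls it once per URL string.
hits["place"] = hits["urls"].apply(fake_get_place_context)

# The stored objects can be read back later, one row at a time.
hits["region"] = hits["place"].apply(lambda place: place.country_regions)
```

If you prefer to pass the URL by keyword, `hits["urls"].apply(lambda u: fake_get_place_context(url=u))` should behave the same way.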
  • The problem is that I have to pass the URL to the ```get_place_context``` function like so ```hits['place'] = hits.assign(geograpy.get_place_context(url=hits['url']))``` It does not register that URL – Sam Dean Dec 05 '19 at 18:25
  • You don't have to pass the url manually. Apply passes the values in the column to your function. – Maya Gershovitz Bar Dec 06 '19 at 18:44

You should use .apply() over the urls column like:

hits['place'] = hits['urls'].apply(geograpy.get_place_context)

This answer helped me understand the distinction between the different vectorization methods and their usage. I hope you find it useful too.

Edit: Since only one column is used to create another, .apply() over that column should work fine for you. .apply() is defined over a DataFrame as well as a Series. The axis argument belongs to the DataFrame version only; on a Series, the function is simply called with each value.
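A small sketch of the two forms side by side may help. `url_length` is a hypothetical per-URL function standing in for the geograpy call, and the `source` column is invented just so the DataFrame has more than one column.

```python
import pandas as pd


def url_length(url):
    # Hypothetical per-URL function, used in place of geograpy here.
    return len(url)


hits = pd.DataFrame({
    "urls": ["http://a.example/x", "http://b.example/yy"],
    "source": ["feed", "crawl"],
})

# Series.apply: the function receives one value from the column.
from_series = hits["urls"].apply(url_length)

# DataFrame.apply with axis=1: the function receives one row (a Series),
# so the column has to be picked out of the row by name.
from_rows = hits.apply(lambda row: url_length(row["urls"]), axis=1)

hits["url_len"] = from_series
```

Both give the same values here; the Series form is the shorter one when only a single column feeds the new column.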

S.Au.Ra.B.H