I am using apply on a pandas dataframe to call a function that queries a REST API. Each request returns multiple entities, so each row I pass in should produce multiple rows. The function appears to run correctly, but because the for loop produces more than one result per row, I only get the last one.
What I want to do is return multiple rows for each row I pass to the function.
This is the function I'm using:
def get_entity_rec(row):
    try:
        documents = row.content
        textcon = row.content[0:2000]
        doclang = [textcon]
        outputs = []
        result = client.recognize_entities(documents = doclang)[0]
        entitylength = len(result)
        for entity in result.entities:
            row['text'] = entity.text
            row['category'] = entity.category
            row['subcategory'] = entity.subcategory
        return row
    except Exception as err:
        print("Encountered exception. {}".format(err))
And my code where I apply it:
apandas3 = apandas2.apply(get_entity_rec, axis=1)
I get (what I think is) the last result, like this:
path | text | category | subcategory |
---|---|---|---|
path of file | i am text | i am the category returned | i am the subcategory returned |
I want to return a dataframe with the original columns repeated for each "entity" returned by the function, like this:
path | text | category | subcategory |
---|---|---|---|
path of file | i am text | i am the category returned | i am the subcategory returned |
path of file | i am text | i am the first category returned | i am the first subcategory returned |
path of file | i am text | i am the second category returned | i am the second subcategory returned |
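For illustration, here is a minimal sketch of one way the desired shape could be produced. It does not use apply: it collects one record per entity and builds a new dataframe from the records. `FakeClient`, `get_entity_records`, and the sample data are hypothetical stand-ins so the example runs without the real API; the only assumption about the real client is what the function above already relies on (`recognize_entities(documents=...)[0].entities`, each entity having `text`, `category` and `subcategory`).

```python
import pandas as pd
from types import SimpleNamespace


class FakeClient:
    """Hypothetical stub: treats every whitespace-separated word as an entity."""

    def recognize_entities(self, documents):
        entities = [
            SimpleNamespace(text=word, category="Word", subcategory=None)
            for word in documents[0].split()
        ]
        return [SimpleNamespace(entities=entities)]


client = FakeClient()


def get_entity_records(row):
    """Return a list of dicts, one per entity found in row['content']."""
    try:
        result = client.recognize_entities(documents=[row["content"][:2000]])[0]
    except Exception as err:
        print("Encountered exception. {}".format(err))
        return []  # no entities for this row
    return [
        {"text": e.text, "category": e.category, "subcategory": e.subcategory}
        for e in result.entities
    ]


# Hypothetical sample input with the same 'content' column as in the question.
apandas2 = pd.DataFrame(
    {"path": ["a.txt", "b.txt"], "content": ["hello world", "foo"]}
)

# Repeat the original columns once per entity by merging the two dicts.
records = []
for _, row in apandas2.iterrows():
    for entity in get_entity_records(row):
        records.append({**row.to_dict(), **entity})

apandas3 = pd.DataFrame(records)
print(apandas3)
```

In this sketch, a row whose request fails or returns no entities simply contributes no rows to `apandas3`; whether that is acceptable depends on the use case.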