
I have a dataframe (testdata) with one column and n rows, each containing a string. I want to send every string to an API and store the result in the next column. The API accepts lists to analyze (like ["Test1"],["Test2"]) but has a limit of 20 elements per call, so I wanted to write a function that sends the string (doc) from each row/cell to the API and stores the result in a new column via the apply function.

testdata
def API(doc):
    result = APIclient.analyze(doc, language="en")
    return result
API(["Test1"],["Test2"])
#testdata["Response"] = testdata.apply(API, axis = 1)

Right now, the function works when applied to lists, as here (API(["Test1"],["Test2"])), but when I apply it to every row of my column I get the typical error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). Does anyone know how to solve this? Thanks!

wjandrea
    This code doesn't work: `TypeError: API() takes 1 positional argument but 2 were given`. Please make a [mre] including working code as well as some example data and desired output. That'll require making some equivalent of `APIclient.analyze` for testing purposes. See [How to make good reproducible pandas examples](/q/20109391/4518341) for specifics. – wjandrea Apr 07 '22 at 15:53
    Oh wait, I just realized you're doing `testdata.apply`, not `testdata[column].apply`, which means `doc` is a row (`Series`), not a string. – wjandrea Apr 07 '22 at 16:02

1 Answer


Rows and cells are not interchangeable. A row is a vector (Series), while a cell is a scalar (usually). It looks like the error is happening because APIclient.analyze() expects a string, not a Series.

Simply replace testdata.apply with either:

  • testdata[column].apply - by column name
  • testdata.iloc[:, 0].apply - by column number (0)
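
To illustrate the difference, here is a minimal sketch. `APIclient.analyze` is not shown in the question, so `fake_analyze` below is a hypothetical stand-in that only accepts a string, and the column name `"Text"` is likewise an assumption made for the example.

```python
import pandas as pd

# Hypothetical stand-in for APIclient.analyze (the real client is not shown).
# It only accepts a single string, which is what the answer assumes.
def fake_analyze(doc, language="en"):
    if not isinstance(doc, str):
        raise TypeError(f"expected str, got {type(doc).__name__}")
    return f"analyzed({doc})"

def API(doc):
    return fake_analyze(doc, language="en")

# Assumed example data: one column of strings, named "Text" for illustration.
testdata = pd.DataFrame({"Text": ["Test1", "Test2", "Test3"]})

# DataFrame.apply with axis=1 hands the function a whole row (a Series).
row_types = testdata.apply(lambda row: type(row).__name__, axis=1).tolist()

# Series.apply hands the function one cell value (here a str).
cell_types = testdata["Text"].apply(lambda cell: type(cell).__name__).tolist()

# Applying the function to the column passes plain strings to the API wrapper.
testdata["Response"] = testdata["Text"].apply(API)
print(testdata)
```

Here `row_types` contains `"Series"` for every row, while `cell_types` contains `"str"`, which is why the column-wise call works with a function that expects a single string.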
wjandrea