I have a DataFrame (df) containing some references for my projects, like this:

           root_path   Credentials
Project1   path/to/    cred1.json
Project2   path/to/    cred2.json
Project3   path/to/    cred3.json

I need a third column, Client, to store the result of connecting to the BigQuery API, which is an object. The argument to pass to the function is the string formed by concatenating df['root_path'] and df['Credentials'] for each row. To do this, I tried these two approaches:

  1. Using apply()

    df['Client'] = df.apply(lambda x : bigquery.Client.from_service_account_json((df['root_path'] + df['Credentials']).values), axis = 1)

  2. Using map()

    df['Client'] = map(lambda root,bq : bigquery.Client.from_service_account_json(root + bq), df['root_path'], df['Credentials'])

In the end, the second approach gave me the result I wanted. Could someone explain to me why the second works and the first does not? As far as I understand, it is because the first passes the whole series of paths to the function on every call, while the second calls the function row by row.

Thanks in advance.
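To make the difference visible without real credentials, here is a minimal sketch. `fake_client` and `received` are hypothetical names introduced only for illustration: `fake_client` stands in for `bigquery.Client.from_service_account_json` and records what it was called with. The `map()` result is wrapped in `list()` here so that it is a concrete sequence.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "root_path": ["path/to/"] * 3,
        "Credentials": ["cred1.json", "cred2.json", "cred3.json"],
    },
    index=["Project1", "Project2", "Project3"],
)

received = []  # every argument the stand-in function was called with


def fake_client(path):
    """Hypothetical stand-in for bigquery.Client.from_service_account_json."""
    received.append(path)
    return isinstance(path, str)  # True only when a single path string arrives


# Approach 1: the lambda ignores x and reads the whole columns from df,
# so each call receives an array holding all three paths.
first = df.apply(
    lambda x: fake_client((df["root_path"] + df["Credentials"]).values),
    axis=1,
)

# Approach 2: map() walks both columns in parallel, one string pair per call.
second = list(
    map(lambda root, bq: fake_client(root + bq), df["root_path"], df["Credentials"])
)

print(first.tolist())   # did approach 1 pass a single string per call?
print(second)           # did approach 2 pass a single string per call?
```

In this sketch the first three entries of `received` each contain all three paths, while the last three are individual strings such as `path/to/cred1.json`.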

Parfait
Julio
  • #1 looks kind of suspicious to me since you aren't using the lambda variable `x`. – 0x5453 Jul 15 '20 at 20:38
  • Change `df` inside the `lambda` function to `x` and do not call [`.values`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.values.html) (deprecated btw for newer [`to_numpy()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_numpy.html)). – Parfait Jul 15 '20 at 20:59
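The change suggested in the comments can be sketched as follows. This assumes the same sample data as in the question; `fake_client` is a hypothetical stand-in for `bigquery.Client.from_service_account_json`, used so the snippet runs without credentials. With `axis=1`, `x` is a single row, so `x['root_path'] + x['Credentials']` is one string.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "root_path": ["path/to/"] * 3,
        "Credentials": ["cred1.json", "cred2.json", "cred3.json"],
    },
    index=["Project1", "Project2", "Project3"],
)


def fake_client(path):
    """Hypothetical stand-in for bigquery.Client.from_service_account_json."""
    return f"client<{path}>"


# Use the row variable x instead of df, and drop .values:
# each call now receives the path string for one row only.
df["Client"] = df.apply(
    lambda x: fake_client(x["root_path"] + x["Credentials"]),
    axis=1,
)

print(df["Client"].tolist())
```

In real use, `fake_client` would be replaced by `bigquery.Client.from_service_account_json`, and the Client column would then hold one client object per project.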

0 Answers