Map dataframe function without lambda

Question

I have the following function:

def summarize(text, percentage=.6):
    import nltk  # needed for sent_tokenize below
    sentences = nltk.sent_tokenize(text)
    sentences = sentences[:int(percentage*len(sentences))]
    summary = ''.join([str(sentence) for sentence in sentences])
    return summary

I want to map it over the rows of a dataframe. It works well when I use the following code:

df['summary'] = df['text'].map(summarize)

However, when I try to change the percentage argument in this call with df['summary'] = df['text'].map(summarize(percentage=.8)), I get an error saying that another argument, text, is required. Of course, this can be resolved with a lambda function as follows:

df['summary'] = df['text'].map(lambda x: summarize(x, percentage=.8))

But I do not want to use a lambda in the call. Is there another way to do this? For example, using kwargs inside the function to refer to the text column in the dataframe? Thank you.

jezrael · Accepted Answer · 2023-01-05T12:10:16.867

A possible solution is to use Series.apply instead of map. apply accepts extra parameters as named arguments, so no lambda is needed. Passing the keyword to map fails:

df['summary'] = df['text'].map(summarize, percentage=.8)

TypeError: map() got an unexpected keyword argument 'percentage'

With apply, the keyword is forwarded to the function:

df['summary'] = df['text'].apply(summarize, percentage=.8)
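For a version that can be tried without NLTK, here is a minimal sketch. It assumes a simplified stand-in for summarize that splits on ". " instead of calling nltk.sent_tokenize, and a made-up one-row dataframe. The column names summary_apply and summary_map are only illustrative. It also shows functools.partial as another lambda-free option if you would rather keep map. That option is an addition here, not part of the answer above.

```python
from functools import partial

import pandas as pd


def summarize(text, percentage=.6):
    # Stand-in for the question's function: a naive split on ". " replaces
    # nltk.sent_tokenize so the sketch runs without NLTK data.
    sentences = text.split(". ")
    sentences = sentences[:int(percentage * len(sentences))]
    return ". ".join(sentences)


# Hypothetical sample data: one row containing five short "sentences".
df = pd.DataFrame({"text": ["One. Two. Three. Four. Five"]})

# Series.apply forwards extra keyword arguments to the function.
df["summary_apply"] = df["text"].apply(summarize, percentage=.8)

# functools.partial pre-binds percentage, so map gets a one-argument callable.
df["summary_map"] = df["text"].map(partial(summarize, percentage=.8))
```

Both columns should hold the same value, since each call ends up invoking summarize(text, percentage=.8) on every row.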

edited Jan 05 '23 at 12:10

answered Jan 05 '23 at 12:04

jezrael
