How do I apply a function over a column?

Question

I have created a function I would like to apply over a given dataframe column. Is there an apply function so that I can create a new column and apply my created function? Example code:

dat = pd.DataFrame({'title': ['cat', 'dog', 'lion','turtle']})

Manual method that works:

print(calc_similarity(chosen_article, str(dat['title'][1]), model_word2vec))
print(calc_similarity(chosen_article, str(dat['title'][2]), model_word2vec))

Attempt to apply over dataframe column:

dat['similarity'] = calc_similarity(chosen_article, str(dat['title']), model_word2vec)

The issue I have been running into is that the function outputs the same result over the entirety of the newly created column.

I have tried apply() as follows:

dat['similarity'] = dat['title'].apply(lambda x: calc_similarity(chosen_article, str(x), model_word2vec))

and

dat['similarity'] = dat['title'].astype(str).apply(lambda x: calc_similarity(chosen_article, x, model_word2vec))

Both result in a ZeroDivisionError, which I don't understand, since I am not passing empty strings.

Function being used:

def calc_similarity(input1, input2, vectors):
    s1words = set(vocab_check(vectors, input1.split()))
    s2words = set(vocab_check(vectors, input2.split()))
    
    output = vectors.n_similarity(s1words, s2words)
    
    return output

Does your function take a vector as input? Does it return a vector? If it returns a scalar, it is normal to get a single value. — mozway, Jul 05 '22 at 13:48
`dat['similarity'] = dat['title'].astype(str).apply(lambda x: calc_similarity(chosen_article, x, model_word2vec))` — mozway, Jul 05 '22 at 13:49
@mozway The function does take a vector input. I have it defined as `def calc_similarity(input1, input2, vectors): s1words = set(vocab_check(vectors, input1.split())) s2words = set(vocab_check(vectors, input2.split())) output = vectors.n_similarity(s1words, s2words) return output`. Would that affect how I would use .apply()? I am getting a KeyError with the proposed solution. — Zachqwerty, Jul 05 '22 at 14:02
Does this answer your question? [How can I use the apply() function for a single column?](https://stackoverflow.com/questions/34962104/how-can-i-use-the-apply-function-for-a-single-column) — Yaakov Bressler, Jul 05 '22 at 14:03
Please [edit](https://stackoverflow.com/posts/72870529/edit) the question to include these details. — mozway, Jul 05 '22 at 14:03
@YaakovBressler I believe the apply() function is the solution, but I am having trouble implementing it in my code. I don't quite understand why it has not been working. — Zachqwerty, Jul 05 '22 at 14:06

score 1 · Accepted Answer · answered Jul 05 '22 at 20:59

It sounds like you are having difficulty applying a function while passing additional keyword arguments. Here's how you can do that:

# By default, apply passes each value of the column as the function's
# first positional argument (here, input1).
# Any extra keyword arguments given to apply are forwarded to the function.
dat['similarity'] = dat['title'].apply(
    calc_similarity,
    input2=chosen_article,
    vectors=model_word2vec
)
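
On the ZeroDivisionError: one possible cause, assuming `vocab_check` drops words that are not in the model's vocabulary, is that a title ends up with no words left, so the similarity call receives an empty set. The sketch below guards against that case. `ToyVectors`, this `vocab_check`, and `calc_similarity_safe` are hypothetical stand-ins written for illustration. They are not the gensim API or your actual helper, and the overlap score is only a placeholder.

```python
import math

import pandas as pd


class ToyVectors:
    """Hypothetical stand-in for a word-vector model (not the gensim API)."""

    def __init__(self, vocab):
        self.vocab = set(vocab)

    def n_similarity(self, words1, words2):
        # Imitates the failure mode described in the question:
        # an empty word collection raises ZeroDivisionError.
        if not words1 or not words2:
            raise ZeroDivisionError("At least one of the passed lists is empty.")
        # Placeholder score: overlap of the two word sets (Jaccard index).
        return len(words1 & words2) / len(words1 | words2)


def vocab_check(vectors, words):
    # Assumed behaviour of the question's helper: keep in-vocabulary words only.
    return [w for w in words if w in vectors.vocab]


def calc_similarity_safe(input1, input2, vectors):
    s1words = set(vocab_check(vectors, input1.split()))
    s2words = set(vocab_check(vectors, input2.split()))
    # Guard: if either side has no known words, return NaN instead of raising.
    if not s1words or not s2words:
        return math.nan
    return vectors.n_similarity(s1words, s2words)


dat = pd.DataFrame({'title': ['cat', 'dog', 'lion', 'turtle']})
chosen_article = 'cat dog'
model = ToyVectors(['cat', 'dog', 'lion'])  # 'turtle' is out of vocabulary

# Each title becomes input1; the keyword arguments are forwarded unchanged.
dat['similarity'] = dat['title'].astype(str).apply(
    calc_similarity_safe,
    input2=chosen_article,
    vectors=model,
)
print(dat)
```

Rows whose titles contain no known words get NaN, which you can then inspect with `dat[dat['similarity'].isna()]` to check whether this is what happens in your data.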
