I am trying to count the number of words in every row of a dataframe column. The words are separated by commas. The name of the column is Items.
I tried achieving this by looping over every word of the dataframe row using apply and lambda. However, I am not sure how to count the number of words:
# Import pandas library
import pandas as pd
# Initialize the data as a dictionary of lists
data = {'Company': ['Nike', 'Levi', 'Dell'],
        'Items': ['Shoes, Shorts, Socks', 'Jeans, Jackets', 'Laptops']}
# Create the pandas DataFrame from the dictionary
df = pd.DataFrame(data)
df['ind_words'] = df.Items.apply(lambda x: ' '.join([word for word in x.split(",")]))
df['lengths'] = df['ind_words'].count()
# print dataframe.
print(df.head())
Doing this resulted in:
   Company                 Items           ind_words  lengths
0     Nike  Shoes, Shorts, Socks  Shoes Shorts Socks        3
1     Levi        Jeans, Jackets       Jeans Jackets        3
2     Dell               Laptops             Laptops        3
The column lengths is wrong. I understand why the function count() is wrong here, but I don't know what function to use.
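To make the problem concrete, here is a minimal sketch of what I believe is happening (the variable name total is only for illustration): count() on a column returns a single number, the count of non-null entries, and assigning that number to a new column repeats it in every row.

```python
import pandas as pd

df = pd.DataFrame({'Items': ['Shoes, Shorts, Socks', 'Jeans, Jackets', 'Laptops']})

# Series.count() returns one scalar: the number of non-null entries in the column.
total = df['Items'].count()
print(total)  # 3

# Assigning a scalar to a column broadcasts it to every row.
df['lengths'] = total
print(df['lengths'].tolist())  # [3, 3, 3]
```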
Here is the ideal output:
   Company                 Items  length
0     Nike  Shoes, Shorts, Socks       3
1     Levi        Jeans, Jackets       2
2     Dell               Laptops       1
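One possible way to get this output, offered as a sketch rather than a definitive answer, is to split each string on commas with the .str accessor and take the length of the resulting list. It assumes every row holds a non-empty string with items separated by single commas; an empty string would still count as 1, and a missing value would give NaN.

```python
import pandas as pd

data = {'Company': ['Nike', 'Levi', 'Dell'],
        'Items': ['Shoes, Shorts, Socks', 'Jeans, Jackets', 'Laptops']}
df = pd.DataFrame(data)

# Split each row's string on commas, then count the pieces in each resulting list.
df['length'] = df['Items'].str.split(',').str.len()

# An alternative under the same assumptions: count the commas and add one.
# df['length'] = df['Items'].str.count(',') + 1

print(df)
```

Both variants work on the whole column at once, so no apply or lambda is needed.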