Python Pandas: count the number of words in a data frame

Question

There is a large data frame named dataframe1. For example (just a few rows):

 date                  text                             name
 1      I like you hair, do you like it              screen1
 2      beautiful sun and wind                       screen2
 3      today is happy, I want to got school         screen3
 4      good movie                                   screen4
 5      thanks god                                   screen1
 6      you are my son and I love you                screen2
 7      the company  is good                         screen1
 8      no one can help me, only you                 screen2
 9      the book is good and I read it everyday      screen3
 10     water is the source of love                  screen4
 11     I like you hair, do you like it              screen1
 12     my love man is leaving                       screen2

I want to calculate the number of words in each name's text (for example, all of screen1's text in dataframe1) using the function noun_count(str). The function noun_count(str) is already finished and works.

I want to extract all the text that shares the same name in the data frame and calculate its noun count. Please don't focus on the function noun_count(str); I have already finished it.

My code:

import pandas as pd
import numpy as np

screen_name_unique = list(set(dataframe1['name']))
for name in screen_name_unique:
    dataframe_text = dataframe1[dataframe1.name == name]
    count = noun_count(dataframe_text['text'])


def noun_count(str):
    words_len = len(str)
    return words_len

I found that this is wrong and don't know how to fix it. For example, how do I extract all of one name's text as a single string and pass it to the function noun_count(str)? Please give me a hand, thanks!

What is wrong? How do you know noun_count isn't wrong? Try printing the count, because right now you don't return anything. – Merlin, Jul 11 '16 at 03:15
Thanks for the comments, but please don't focus on the noun_count() function. I just want to extract all the text for each name and calculate the number of noun words. I have no idea what the next step is after I extract the text for each name. – tktktk0711, Jul 11 '16 at 03:18
@Merlin, in the function noun_count(str), the parameter str is a string. – tktktk0711, Jul 11 '16 at 03:26
Unless you include that function, it will be hard to figure out what is not working. – Merlin, Jul 11 '16 at 03:29
@Merlin, thanks for your comment. I have just given a simple noun_count function. Please don't focus on this function; the point is that I need to get each name's text and turn it into a string. – tktktk0711, Jul 11 '16 at 03:49
Google "word count with pandas": it returns many SO posts. This is a duplicate. – Merlin, Jul 11 '16 at 03:55

tktktk0711 · Accepted Answer · 2017-01-20T03:51:41.143

I have solved it by using the apply() function to do the counting:

import pandas as pd
import numpy as np

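def noun_count(str):
    # Returns the length of the string passed in
    words_len = len(str)
    return words_len


screen_name_unique = list(set(dataframe1['name']))
for name in screen_name_unique:
    # Select the rows that belong to this name
    dataframe_text = dataframe1[dataframe1.name == name]
    # Apply noun_count to each text in the selection
    counts = dataframe_text['text'].apply(noun_count)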
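
The question asks for each name's text as a single string. One possible way to get that is to group by name and join the texts before calling the function. This is only a sketch: the small dataframe1 below is made-up sample data in the question's layout, and noun_count is a placeholder standing in for the real, finished function.

```python
import pandas as pd

# Hypothetical sample data in the same layout as the question's dataframe1
dataframe1 = pd.DataFrame({
    "date": [1, 2, 3, 4, 5],
    "text": [
        "I like you hair, do you like it",
        "beautiful sun and wind",
        "today is happy, I want to got school",
        "good movie",
        "thanks god",
    ],
    "name": ["screen1", "screen2", "screen3", "screen4", "screen1"],
})


def noun_count(text):
    # Placeholder for the asker's real function; it only needs to accept a string
    return len(text)


# Join all texts that share a name into one string per name
text_per_name = dataframe1.groupby("name")["text"].agg(" ".join)

# Call the function once per name, on the combined string
counts_per_name = text_per_name.apply(noun_count)
print(counts_per_name)
```

Grouping avoids the explicit loop over unique names, and the function receives a plain string rather than a pandas Series.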

edited Jan 20 '17 at 03:51

answered Jul 12 '16 at 08:43

tktktk0711


`len(str)` will calculate the number of characters, not the number of words. – pnv Sep 29 '17 at 05:56
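
To illustrate the point in this comment, here is a minimal sketch of counting words instead of characters by splitting each text on whitespace. The function name word_count, the column name n_words, and the sample rows are assumptions made for illustration; this counts all words, not nouns specifically.

```python
import pandas as pd

# Hypothetical sample rows in the question's layout
dataframe1 = pd.DataFrame({
    "text": ["good movie", "thanks god", "the company  is good"],
    "name": ["screen4", "screen1", "screen1"],
})


def word_count(text):
    # str.split() with no argument splits on any run of whitespace,
    # so the double space in "the company  is good" does not add a word
    return len(text.split())


# Word count per row, then the total per name
dataframe1["n_words"] = dataframe1["text"].apply(word_count)
words_per_name = dataframe1.groupby("name")["n_words"].sum()
print(words_per_name)
```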
