
I have a large data frame named dataframe1. For example (just a few rows):

 date   text                                       name
 1      I like you hair, do you like it            screen1
 2      beautiful sun and wind                     screen2
 3      today is happy, I want to got school       screen3
 4      good movie                                 screen4
 5      thanks god                                 screen1
 6      you are my son and I love you              screen2
 7      the company  is good                       screen1
 8      no one can help me, only you               screen2
 9      the book is good and I read it everyday    screen3
 10     water is the source of love                screen4
 11     I like you hair, do you like it            screen1
 12     my love man is leaving                     screen2

I want to count the nouns in each name's text (for example, all of screen1's text in dataframe1) using the function noun_count(str). The noun_count(str) function itself is finished and works.

I want to extract all the text that shares the same name in the data frame and calculate its noun count. Please don't focus on the function noun_count(str); I have already finished it.

My code:

import pandas as pd
import numpy as np

screen_name_unique = list(set(dataframe1['name']))
for name in screen_name_unique:
   dataframe_text = dataframe1[dataframe1.name == name]
   count = noun_count(dataframe['text'])



def noun_count(str):
    words_len = len(str)
    return words_len

I found that this is wrong and don't know how to fix it. For example, how do I extract all of one name's text as a single string and pass it to noun_count(str)? Please give me a hand, thanks!

tktktk0711
  • If you need more information about this, please tell me. – tktktk0711 Jul 11 '16 at 02:31
  • What is wrong? How do you know noun_count isn't wrong? Try printing the count, because right now you don't return anything. – Merlin Jul 11 '16 at 03:15
  • Thanks for the comments, but please don't focus on the noun_count() function. I just want to extract all the text for each name and count the nouns in it. After extracting the text for each name, I have no idea what the next step is. – tktktk0711 Jul 11 '16 at 03:18
  • @Merlin, the parameter str of the function noun_count(str) is a string. – tktktk0711 Jul 11 '16 at 03:26
  • unless you include that function, it will be hard to figure out what is not working. – Merlin Jul 11 '16 at 03:29
  • @Merlin, thanks for your comment. I only gave a simplified noun_count function. Please don't focus on this function; the point is that I need to get each name's text and turn it into a single string. – tktktk0711 Jul 11 '16 at 03:49
  • google: word count with pandas -- it returns many SO posts. this is duplicate. – Merlin Jul 11 '16 at 03:55

1 Answer


I have solved it by using the apply() function to run noun_count on each row's text:

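import pandas as pd

# Define the function before the loop that calls it.
def noun_count(text):
    # Simplified placeholder: returns the length of the string
    return len(text)

# dataframe1 is assumed to exist already, with 'name' and 'text' columns.
for name in dataframe1['name'].unique():
    # Select the rows that belong to this name
    dataframe_text = dataframe1[dataframe1.name == name]
    # Apply noun_count to each row's text and keep the result
    counts = dataframe_text['text'].apply(noun_count)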
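
The loop above gives one count per row. To get what the question asked for (one string per name, passed to noun_count once), a minimal sketch with groupby could look like the following. It assumes a small sample of the data shown in the question, and the noun_count here is only a hypothetical stand-in that counts whitespace-separated words, not the real noun counter.

```python
import pandas as pd


def noun_count(text):
    # Hypothetical stand-in for the real function:
    # counts whitespace-separated words, not actual nouns.
    return len(text.split())


# A few rows copied from the sample in the question.
dataframe1 = pd.DataFrame({
    "date": [1, 2, 3, 4, 5],
    "text": [
        "I like you hair, do you like it",
        "beautiful sun and wind",
        "today is happy, I want to got school",
        "good movie",
        "thanks god",
    ],
    "name": ["screen1", "screen2", "screen3", "screen4", "screen1"],
})

# Join all text belonging to the same name into a single string.
text_per_name = dataframe1.groupby("name")["text"].agg(" ".join)

# Call noun_count once per name on the joined string.
counts_per_name = text_per_name.apply(noun_count)
print(counts_per_name)
```

This avoids the explicit loop over unique names. If a per-row count is enough, dataframe1.groupby("name")["text"].apply(...) on the row-level counts followed by a sum would be another option.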
tktktk0711