I have a data frame (shown as an image in the original post) with two columns. One has company names and the other has a string of text corresponding to each company name.

I want to run the following code on each of the texts, one text per call.

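# classifier and word_feats are assumed to be defined earlier in the script
def nlp_func(text):
    neg = 0
    pos = 0
    sentence = text.lower()
    # assumption: words are obtained by splitting the text on whitespace
    words = sentence.split()
    for word in words:
        classResult = classifier.classify(word_feats(word))
        if classResult == 'neg':
            neg = neg + 1
        if classResult == 'pos':
            pos = pos + 1
    if pos > neg:
        print('Positive: ' + str(float(pos) / len(words)))
    else:
        print('Negative: ' + str(float(neg) / len(words)))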

Instead of printing the result I want to store it in another dataframe which would look like

company_names     value
3M Company         pos
ANSYS              neg

I am new to both Python and pandas, so I can't figure out exactly how to do it. I need help in two places.

First: How do I pass the text corresponding to each entry in company_names as an argument to the function nlp_func?

Second: How do I create another dataframe and store the value each time the function is called?

The Zach Man
rini saha
  • I'm not going to type up a full answer at the moment, but these should be helpful to you: https://stackoverflow.com/a/16476974/7916348 shows how to get individual rows of a dataframe, and https://stackoverflow.com/a/17496530/7916348 shows how to get a dataframe from a list of rows. – The Zach Man Apr 25 '19 at 18:03

1 Answer

Let your dataframe be something like:

import pandas as pd

df = pd.DataFrame({"Company":["A", "B"],
                   "Sentence":["Something here", "Something there"]})

that is:

  Company         Sentence
0       A   Something here
1       B  Something there

then your function should be:

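def nlp_func(df):
    res = {}
    for r in df.itertuples():
        # DO YOUR NLP ANALYSIS on r.Sentence here;
        # it must set the counters pos and neg
        # code
        # code
        # code
        if pos > neg:
            res[r.Company] = "Pos"
        else:
            res[r.Company] = "Neg"
    # list(...) is used because dict.items() is a view on Python 3,
    # which older pandas versions do not accept
    return pd.DataFrame(list(res.items()), columns=["Company", "Value"])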

Note that:

  1. you iterate over rows in the dataframe by for r in df.itertuples():
  2. you get the value in column Company by r.Company
  3. you get the value in column Sentence by r.Sentence
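
As an alternative to looping inside the function, the per-text function from the question can return a label instead of printing it, and then be applied to the Sentence column with Series.apply. The following is only a minimal sketch under assumptions: classify_word is a hypothetical stand-in for classifier.classify(word_feats(word)), and its toy word lists exist only so that the example runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for classifier.classify(word_feats(word));
# the toy word lists are used only so that the sketch runs on its own.
POSITIVE_WORDS = {"good", "great", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "terrible"}

def classify_word(word):
    if word in POSITIVE_WORDS:
        return "pos"
    if word in NEGATIVE_WORDS:
        return "neg"
    return "neutral"

def nlp_func(text):
    """Return 'pos' or 'neg' for one text instead of printing it."""
    words = text.lower().split()
    labels = [classify_word(w) for w in words]
    pos = labels.count("pos")
    neg = labels.count("neg")
    # Ties fall into the 'neg' branch, as in the question's if/else.
    return "pos" if pos > neg else "neg"

df = pd.DataFrame({"Company": ["A", "B"],
                   "Sentence": ["Great results and good growth",
                                "Poor quarter with terrible sales"]})

# Build the result frame: one label per company.
result = pd.DataFrame({"Company": df["Company"],
                       "Value": df["Sentence"].apply(nlp_func)})
print(result)
```

With this shape, nlp_func stays a plain function of one string, so it can be tested on a single text before it is run over the whole column.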
sentence
  • It will work except I am getting one small error again. When I call the function it says 'DataFrame constructor not properly called!' and points to the last line of the function 'return pd.DataFrame(res.items(), columns=["Company", "Value"])'. – rini saha Apr 26 '19 at 07:29
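
The error in this comment is commonly seen on Python 3 with older pandas releases: dict.items() returns a view object rather than a list, and the DataFrame constructor of that era rejected it. Below is a minimal sketch of two possible workarounds, with a made-up res dictionary standing in for the one built inside nlp_func. Newer pandas versions may accept the view directly.

```python
import pandas as pd

# Made-up results standing in for the `res` dict built inside nlp_func.
res = {"A": "Pos", "B": "Neg"}

# Option 1: materialise the (key, value) pairs as a list first.
out1 = pd.DataFrame(list(res.items()), columns=["Company", "Value"])

# Option 2: build each column explicitly from the keys and the values.
out2 = pd.DataFrame({"Company": list(res.keys()),
                     "Value": list(res.values())})

print(out1)
```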