0

I have a case study to work on, in which there are several customer reviews available and I have to do the following

 

  1. predict their sentiment (positive, negative, neutral) based on their reviews

  2. display a wordcloud of not the frequently occurring words, but the words which are the pain-points of the customer and of what is the customer happy about.

    e.g. If many customers are happy about leather-strap of the watch, then the wordcloud should display 'leather-strap' in the wordcloud of positive sentiments. 

    & If many customers are complaining/unhappy about the dial-size, then the wordcloud should display 'dial-size' in the wordcloud of negative sentiments.

Point 1 can be achieved more or less using VADER.

But I am not sure how to achieve point 2, as it is not the usual wordcloud of frequently occurring words.

Can you please help me on how can I achieve the second task?

Ismael Padilla
  • 5,246
  • 4
  • 23
  • 35
  • Does this answer your question? [How to create a word cloud from a corpus in Python?](https://stackoverflow.com/questions/16645799/how-to-create-a-word-cloud-from-a-corpus-in-python) – Trenton McKinney Jul 01 '20 at 05:46
  • A good blog, but it doesn't help my task. I want to generate wordcloud of customer pain-points which gets hidden under the frequently occuring words. for example, if I have 2 reviews,one says, 'The leather strap of watch is bad', the other says,'The watch dial is bad', then I need a wordcloud which shows words:'leather strap' & 'dial' but these words wont be displayed as the word 'bad' is the most frequent in both reviews. – prachi dhotre Jul 01 '20 at 05:56
  • Honestly, the way the question is written, no one will answer it. Please read the following documentation, then [edit] and rephrase the question. [Take the Tour](https://stackoverflow.com/tour) & [How to ask a good question](https://stackoverflow.com/help/how-to-ask). Always [Provide a Minimal, Reproducible Example (e.g. code, data, errors) as text](https://stackoverflow.com/help/minimal-reproducible-example) & you're expected to [try to solve the problem first](https://meta.stackoverflow.com/questions/261592/how-much-research-effort-is-expected-of-stack-overflow-users). – Trenton McKinney Jul 01 '20 at 05:57
  • You can look at [How to create a wordcloud according to frequencies in a pandas dataframe?](https://stackoverflow.com/questions/57826063/how-to-create-a-wordcloud-according-to-frequencies-in-a-pandas-dataframe) – Trenton McKinney Jul 01 '20 at 06:18

2 Answers2

0

Since no code was shared, I would break this problem into the following steps:

  1. You already found the sentiments

  2. for the group of sentences with positive sentiment, find the most important word from each sentence

  3. calculate the number of times this word is present in the positive sentiments

  4. create the word cloud based on the number of times each word is available

  5. Repeat step 2-4 for the sentences with negative sentiments

Check the following link: https://amueller.github.io/word_cloud/generated/wordcloud.WordCloud.html.

Let me know if I missed understanding your exact issue.

MrRaghav
  • 335
  • 3
  • 11
  • Thank you for your response. There is no code present in this question because I am searching for how to achieve the task, code will be the next step. Suppose I have 2 sentences in positive reviews of a watch: 1: The watch has a good dial. 2: The good thing about the product is the leather strap. Steps 2-4 will give the most frequently occurring words ,i.e 'good' ; but what I want is not the frequently occurring sentiment laden words but the words which the sentiment laden words talk about ,like here it is , 'dial' and 'leather strap'. – prachi dhotre Jul 07 '20 at 05:05
0

I wanted to add it as a comment but it was not clear enough there.

Well, I used the below code:

Anaconda command prompt:

pip install rake-nltk

Jupyter:

from rake_nltk import Rake
r = Rake()
myText = ''' The watch has a good dial. The good thing about the product is the 
leather strap '''
r.extract_keywords_from_text(myText)
r.get_ranked_phrases()

Output:

['leather strap', 'good thing', 'good dial', 'watch', 'product']

I think you can try this algorithm in your actual text and get ranked phrases. Hope this will help. Check this link for more details: https://pypi.org/project/rake-nltk/#:~:text=RAKE%20short%20for%20Rapid%20Automatic,other%20words%20in%20the%20text.

MrRaghav
  • 335
  • 3
  • 11