I am trying to do aspect-based sentiment analysis. When I extract the aspects and their opinions and store each pair as a dictionary, some of the aspect-opinion pairs appear many times. My code is:

import spacy

aspects_main = []
feature_main = []
feautures_term_main = []
txt = "great hotel jacuzzi bath!. really lovely hotel. stayed very top floor and surprised jacuzzi bath not know getting! staff friendly and helpful and included breakfast great! great location and great value money. not want leave!"
nlp = spacy.load("en_core_web_sm")
doc_main = nlp(txt)
for i, sentence in enumerate(doc_main.sents):
  aspects = []
  feature =[]
  feautures_term =[]
  sentence= str(sentence)
  doc = nlp(sentence)
  descriptive_term = ''
  target = ''
  for token in doc:
    # "(nsubj and NOUN) or NOUN" is equivalent to just checking for NOUN
    if token.pos_ == 'NOUN':
      target = token.text
    if token.pos_ == 'ADJ':
      prepend = ''
      for child in token.children:
        if child.pos_ != 'ADV':
          continue
        prepend += child.text + ' '
      descriptive_term = prepend + token.text
      
    if((target=='') or (descriptive_term=='')):
      continue
    else:
      aspects.append({'aspect': target,
        'opinion': descriptive_term})
      feautures_term.append(descriptive_term)
      feature.append(target)
  
  aspects_main.append(aspects)
  feautures_term_main.append(feautures_term)
  feature_main.append(feature)



print(aspects_main)

I want to remove the duplicates and keep one copy of each pair. I tried the following on a small example:

L=[[{'aspect': 'hotel', 'opinion': 'great'},  {'aspect': 'hotel', 'opinion': 'great'}],[]]

L=[dict(s) for s in set(frozenset(d.items()) for d in L)]
L

It gives me error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-172-b649b849dec9> in <module>()
      1 L=[[{'aspect': 'hotel', 'opinion': 'great'},  {'aspect': 'hotel', 'opinion': 'great'}],[]]
      2 
----> 3 L=[dict(s) for s in set(frozenset(d.items()) for d in L)]
      4 L

<ipython-input-172-b649b849dec9> in <genexpr>(.0)
      1 L=[[{'aspect': 'hotel', 'opinion': 'great'},  {'aspect': 'hotel', 'opinion': 'great'}],[]]
      2 
----> 3 L=[dict(s) for s in set(frozenset(d.items()) for d in L)]
      4 L

AttributeError: 'list' object has no attribute 'items'

I also tried using a loop. Here is the code:

a=[]
for i in range(len(aspects_main)):
  aa=[]
  for j in range(len(aspects_main[i])):
    aa.append(aspects_main[i][j])
  aa=set(aa)
  a.append(aa)
                

print(a)

But I got this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-182-b87e8b70dd59> in <module>()
      4   for j in range(len(aspects_main[i])):
      5     aa.append(aspects_main[i][j])
----> 6   aa=set(aa)
      7   a.append(aa)
      8 

TypeError: unhashable type: 'dict'

How can I do this?

My current output is:

[[{'aspect': 'hotel', 'opinion': 'great'}, {'aspect': 'hotel', 'opinion': 'great'}], [{'aspect': 'location', 'opinion': 'great'}, {'aspect': 'location', 'opinion': 'great'}], []]

and my expected output is:

[[{'aspect': 'hotel', 'opinion': 'great'}],[{'aspect': 'location', 'opinion': 'great'}]]
Samrat Alam
  • What does nlp() do? I assume it means 'natural language processing', but what module is this from? I don't think it's nltk. –  Jul 26 '21 at 07:45
  • @AndyKnight, sorry, I forgot to include it in this cell; I declared it in the previous one. It is nlp=spacy.load("en_core_web_sm"), and I have edited the question to add it. – Samrat Alam Jul 26 '21 at 08:14

1 Answer

The reason for your error is that you have a list within a list (L is a list of lists), so when you call d.items() for d in L, each d is a list rather than a dict, and a list has no items() method.
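
A quick check can make this visible. The sketch below reuses the L from the question and only inspects types; the names outer_types and inner_types are just illustrative:

```python
L = [[{'aspect': 'hotel', 'opinion': 'great'}, {'aspect': 'hotel', 'opinion': 'great'}], []]

# Each element of the outer list is itself a list, so it has no .items()
outer_types = [type(d).__name__ for d in L]
print(outer_types)  # ['list', 'list']

# The dicts live one level down, which is where .items() is available
inner_types = [type(d).__name__ for sub in L for d in sub]
print(inner_types)  # ['dict', 'dict']
```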

This may solve what you're trying to do:

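new_list = []
for sublist in L:  # avoid naming the variable "list", which shadows the built-in
    # frozenset(d.items()) is hashable, so a set can drop the duplicate dicts
    no_dup_l = [dict(s) for s in set(frozenset(d.items()) for d in sublist)]
    if no_dup_l:  # skip inner lists that are empty
        new_list.append(no_dup_l)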

Personally, I wouldn't try to write this as a one-liner, as it would harm readability (you already have two "for"s in your comprehension).
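
If the order of the keys or of the pairs matters, note that sets and frozensets are unordered, so a dict rebuilt from a frozenset may come back with its keys in a different order. One possible alternative, sketched below, keeps the first occurrence of each dict and never rebuilds it. The helper name dedupe_nested is hypothetical, and the sketch assumes all dict values are hashable (they are strings here):

```python
def dedupe_nested(nested):
    """Remove duplicate dicts from each inner list, keeping first occurrences."""
    result = []
    for sublist in nested:
        seen = set()
        unique = []
        for d in sublist:
            key = frozenset(d.items())  # hashable stand-in used only for lookup
            if key not in seen:
                seen.add(key)
                unique.append(d)  # keep the original dict, so key order is untouched
        if unique:  # drop inner lists that end up empty
            result.append(unique)
    return result


L = [
    [{'aspect': 'hotel', 'opinion': 'great'}, {'aspect': 'hotel', 'opinion': 'great'}],
    [{'aspect': 'location', 'opinion': 'great'}, {'aspect': 'location', 'opinion': 'great'}],
    [],
]
print(dedupe_nested(L))
# [[{'aspect': 'hotel', 'opinion': 'great'}], [{'aspect': 'location', 'opinion': 'great'}]]
```

Because the frozenset is used only as a lookup key, the order of pairs within each inner list is preserved as well.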

IrinMoon
  • Thanks, it works. But I ran into a problem: I got [[{'opinion': 'great', 'aspect': 'hotel'}]] as output, with the keys in reversed order. Can you help me find where the problem is? – Samrat Alam Jul 26 '21 at 08:11
  • Interesting, I didn't encounter that. What is your version of Python? What is in L before you run this code? frozenset(d.items()) shouldn't change the order of the items in a dict. – IrinMoon Jul 26 '21 at 08:29
  • I used L (a list of some aspect and opinion pairs) as a simple example to try to find the solution. – Samrat Alam Jul 26 '21 at 11:15