
I am trying to score words in a very simple way: each word is classified as -1, 0, or +1, and the scores are summed at the end. The classification is based on the emotional connotation of the word, so positive words like "great", "awesome", and "excellent" receive a score of +1, negative words like "bad", "ill", and "not" receive -1, and neutral words receive 0. For example, text = "I feel bad" would be run through a table, DB, or library in which words are pre-classified, and summed as I(0) + feel(0) + bad(-1) = -1.

As an example, I have gone ahead and stripped the HTML from a web page using the BeautifulSoup and urllib libraries (code below):

from urllib.request import urlopen  # Python 3; Python 2 used urllib.urlopen
from bs4 import BeautifulSoup

url = "http://www.greenovergrey.com/living-walls/what-are-living-walls.php"
html = urlopen(url).read()
soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly

# kill all script and style elements
for script in soup(["script", "style"]):
    script.extract()    # rip it out

# get text
text = soup.get_text()

# break into lines and remove leading and trailing space on each
lines = (line.strip() for line in text.splitlines())
# break multi-headlines into a line each
chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
# drop blank lines
text = '\n'.join(chunk for chunk in chunks if chunk)

print(text)

Output:

What are Living Walls? Definition of Green Wall and Vertical Garden
GREEN OVER CREY
Overview
/
What are living walls
/
Our green wall system vs. modular boxes
What are living walls
L iving walls or green walls are self sufficient vertical gardens that are attached to the exterior or interior of a building. They differ from green façades (e.g. ivy walls) in that the plants root in a structural support which is fastened to the wall itself. The plants receive water and nutrients from within the vertical support instead of from the ground.
The Green over Grey™ living wall system is different than others on the market today. It closely mimics nature and allows plants to grow to their full potential, without limitations. It is also by far the lightest.
Diversity is the key and by utilizing hundreds of different types of plants we create striking patterns and unique designs. We achieve this by utilizing the multitude of colours, textures and sizes that nature provides. Our system accommodates flowering perennials, beautiful foliage plants, ground covers and even allows for bushes, shrubs, and small trees!
Living walls are also referred to as green walls, vertical gardens or in French, mur végétal. The French botanist and artist Patrick Blanc was a pioneer by creating
the first vertical garden over 30 years ago.
Our system
consists of a frame, waterproof panels, an automatic irrigation system, special materials, lights when needed and of course plants. The frame is built in front of a pre existing wall and attached at various points; there is no damage done to the building. Waterproof panels are mounted to the frame; these are rigid and provide structural support. There is a layer of air between the building and the panels which enables the building to breath. This adds beneficial insulating properties and acts like rain-screening to protect the building envelop.
Our green walls are low maintenance thanks to an automatic irrigation system

My question is: what would be the best way to run this string through a table or library of pre-classified words, and does anyone know of any existing libraries of words pre-classified by emotion? Also, how can I quickly create a small table or DB to test with?

Thank you all in advance, Rusty

RustyShackleford

2 Answers


I don't know how to mark this question as a duplicate, but a quick Google search turned this up.

The first answer looks promising. I went to the link and it just requires some information to access the file. I assume it would be in a format that is straightforward to parse.
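If the downloaded file turns out to be plain text with one word and a polarity label per line, loading it could look roughly like this. The `word<TAB>polarity` layout, the label names, and the `load_lexicon` helper are all assumptions made for illustration; the real file's format should be checked first.

```python
import csv
import io

# Assumed mapping from polarity labels to scores (hypothetical label names).
POLARITY_SCORES = {"positive": 1, "negative": -1, "neutral": 0}

def load_lexicon(lines):
    """Build a {word: score} dict from tab-separated 'word<TAB>polarity' rows."""
    lexicon = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        word, polarity = row[0].strip().lower(), row[1].strip().lower()
        if polarity in POLARITY_SCORES:
            lexicon[word] = POLARITY_SCORES[polarity]
    return lexicon

# Stand-in for an open file; a real file object can be passed the same way.
sample = io.StringIO("great\tpositive\nbad\tnegative\nwall\tneutral\n")
lexicon = load_lexicon(sample)
print(lexicon)
```

Rows with an unrecognized label are skipped rather than guessed at, so a format mismatch shows up as an unexpectedly small dictionary.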

Community
Keith

If you need such a table, you can find a list of lexicons here: http://mpqa.cs.pitt.edu/lexicons/effect_lexicon/

You could load that list into a dictionary and perform the algorithm you describe. However, if you are looking for quick results, I recommend the TextBlob library. It is very easy to use and has a lot of features, which makes it a good starting point for a project like this.
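A minimal sketch of the dictionary approach, assuming a tiny hand-made lexicon for quick testing. The `LEXICON` entries and the `score_text` helper are illustrative names, not part of any library; a real lexicon would be loaded in place of the toy one.

```python
import re

# Hypothetical toy lexicon: +1 for positive words, -1 for negative words.
# Any word not listed is treated as neutral (0).
LEXICON = {
    "great": 1, "awesome": 1, "excellent": 1,
    "bad": -1, "ill": -1, "not": -1,
}

def score_text(text, lexicon=LEXICON):
    """Sum the per-word scores of `text`; unknown words count as 0."""
    # Lowercase and pull out runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)

print(score_text("I feel bad"))  # I(0) + feel(0) + bad(-1) = -1
```

The same function can be applied to the `text` string produced by the scraping code in the question, since it only needs a plain string.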

Rafael Almeida