I need a word cloud in which the same word can appear twice in different colors: red indicating negative associations, and green indicating positive ones.

I am already able to generate a wordcloud with repeated words using MultiDicts (see the code below):

[image: word cloud in which "home" appears twice]

However, I need one of the homes to appear in green. Is this possible with the wordcloud library? Can somebody recommend another library that supports this?


from wordcloud import WordCloud
from multidict import MultiDict

class WordClouder(object):

    def __init__(self, words, colors):
        self.words = words
        self.colors = colors

    def get_color_func(self, word, **args):
        return self.colors[word]

    def makeImage(self, path):
        wc = WordCloud(background_color="white", width=200, height=100)

        # generate word cloud
        wc.generate_from_frequencies(self.words)

        # color the word cloud
        wc.recolor(color_func=self.get_color_func)

        image = wc.to_image()
        image.save(path)


if __name__ == '__main__':

    # Ideally this would be used somewhere
    colors_valence = {
        '+': 'green',
        '-': 'red',
    }

    colors = {
        'home': 'red',
        'car': 'green',
    }

    words = MultiDict({
        'home': 2.0, # This home should be green
        'car': 20.0,
    })

    words.add('home', 10.0)  # This home should be red


    wc = WordClouder(words, colors)
    wc.makeImage('wordcloud.png')
toto_tico
  • Do you understand the code that you wrote? `def get_color_func` returns a color based on *the exact word*. So either make a dummy word `home ` and give that another color, or rewrite the entire word <-> color system to something else. – Jongware Dec 14 '17 at 18:48
  • Do you mean add a blank space to home so I have `"home "` and `"home"`? I thought about that, but in reality I will have more than two categories (positive and negative is just a simplified example to make the question easy to understand). If I have 5 categories, there will be 4 white spaces, and then the word clouds start to look weird. In any case, it would not be a very elegant solution, more like a hack. – toto_tico Dec 14 '17 at 20:52
  • Then, as I said, you need to add another way of distinguishing the variants. As it is now, the color is inextricably linked to the word. – Jongware Dec 14 '17 at 21:20
  • Well, yes, that is my question. If the wordcloud library supports this, or if somebody can recommend an alternative. I looked into the wordcloud code, and this doesn't seem to be supported. – toto_tico Dec 15 '17 at 08:42
  • I thought of adding attributes to strings, but that is [not very recommended](https://stackoverflow.com/questions/2444680/how-do-i-add-my-own-custom-attributes-to-existing-built-in-python-types-like-a); it would be another monkey patch. – toto_tico Dec 15 '17 at 08:47

1 Answer


I came up with a decent solution by inheriting from WordCloud. You can copy and paste the code below, and then append + or - to each word to indicate its color, like this:

words = {
    'home+': 2.0,
    'home-': 10.0,
    'car+': 5.0,
}

would generate this:

[image: the resulting word cloud, with one "home" in green and the other in red]

Explanation: I append a character (+ or -) to each word in the dictionary to indicate its color. In the recolor method of WordCloud, which I overrode, I strip that character so it is not displayed, but I still pass the entire word (including the + or -) to the color_func, which uses it to select the appropriate color. I commented the important bits in the code below.

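from random import Random

from wordcloud import WordCloud as WC
# recolor() below uses this helper, which lives in the wordcloud.wordcloud module
from wordcloud.wordcloud import colormap_color_func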


class WordCloud(WC):

    def recolor(self, random_state=None, color_func=None, colormap=None):
        if isinstance(random_state, int):
            random_state = Random(random_state)
        self._check_generated()

        if color_func is None:
            if colormap is None:
                color_func = self.color_func
            else:
                color_func = colormap_color_func(colormap)

        # Strip the trailing tag character so it doesn't get displayed
        # when the wordcloud image is produced, but send the full word
        # (tag included) to the color_func.
        # Note: calling recolor() a second time would strip one more character.
        self.layout_ = [
            ((word_freq[0][:-1], word_freq[1]), font_size, position, orientation,
             color_func(word=word_freq[0], font_size=font_size,
                        position=position, orientation=orientation,
                        random_state=random_state,
                        font_path=self.font_path))
            for word_freq, font_size, position, orientation, _ in self.layout_
        ]

        return self


class WordClouder(object):

    def __init__(self, words, colors):
        self.words = words
        self.colors = colors

    def get_color_func(self, word, **args):
        return self.colors[word[-1]]

    def makeImage(self, path):
        wc = WordCloud(background_color="white", width=200, height=100)

        # generate word cloud
        wc.generate_from_frequencies(self.words)

        # color the word cloud
        wc.recolor(color_func=self.get_color_func)

        image = wc.to_image()
        image.save(path)


if __name__ == '__main__':

    colors = {
        '+': '#00ff00',
        '-': '#ff0000',
    }

    words = {
        'home+': 2.0,
        'home-': 10.0,
        'car+': 5.,
    }

    wc = WordClouder(words, colors)
    wc.makeImage('wc.png')
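
The question's comments mention that there may be more than two categories. The same one-character suffix idea seems to extend to that case; below is a minimal sketch of how the tagged dictionary and the color lookup could be built. The `~` tag, the gray color, and the helper names `tag_frequencies`, `color_for` and `display_text` are made up for illustration; only `+` and `-` come from the code above.

```python
# Hypothetical category tags; only '+' and '-' appear in the answer above.
CATEGORY_COLORS = {
    '+': '#00ff00',  # positive
    '-': '#ff0000',  # negative
    '~': '#888888',  # neutral (assumed extra category)
}


def tag_frequencies(records):
    """Build a {word + tag: weight} dict from (word, tag, weight) tuples."""
    tagged = {}
    for word, tag, weight in records:
        if tag not in CATEGORY_COLORS:
            raise ValueError("unknown category tag: %r" % tag)
        tagged[word + tag] = float(weight)
    return tagged


def color_for(word, **kwargs):
    """Color function: choose the color from the trailing tag character."""
    return CATEGORY_COLORS[word[-1]]


def display_text(tagged_word):
    """Text that should be drawn: the word without its trailing tag."""
    return tagged_word[:-1]


records = [
    ('home', '+', 2),
    ('home', '-', 10),
    ('home', '~', 4),
    ('car', '+', 5),
]
words = tag_frequencies(records)
```

Here `words` could be passed to `generate_from_frequencies` and `color_for` to `recolor(color_func=...)` of the subclass above, which already strips the last character before drawing. Since every key carries exactly one tag character, the cloud never shows the extra padding that the whitespace hack would need.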
toto_tico