1

I really like the look of the Vega word clouds: https://vega.github.io/vega/examples/word-cloud/

I'm currently using the spec from the link as follows in colab:

spec = "insert spec here"

#Option one:
from altair import vega
vega.renderers.enable('colab')
vega.Vega(spec)

#Option two:
import panel as pn
from vega import Vega
pn.extension('vega')
pn.pane.Vega(spec)

What I actually want is faceted word clouds in Vega. I currently load my data as JSON from my GitHub account, which is slightly annoying, but I found no way to reference Python variables in the Vega spec. Does anyone have a hint on how to lay out the Vega word clouds in a grid, one per group in my data? My JSON has this structure: [{"text": text, "group": group}]. Drawing a word cloud from it works, but faceting by the group field does not. I know Vega-Lite can do faceting, but it does not seem to be able to draw these beautiful word clouds.

Thanks for any help!

sisyphos

2 Answers

1

You can't facet the word cloud from that example directly: it relies on a countpattern transform, which reduces the data to word counts and so drops any field you might want to facet by. Instead, you need to provide a separate data object for each word cloud you want and then concatenate the clouds.
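The "separate data object per word cloud" idea could be set up from Python along the following lines. This is only a sketch: the helper name `per_group_datasets`, the `table_<group>` naming scheme, and the example group labels are assumptions made for illustration. It relies on Vega data entries accepting a `source` plus a `filter` transform.

```python
import json


def per_group_datasets(groups, source="table"):
    """Build one derived Vega data entry per group (hypothetical helper).

    Each entry reads from `source` and keeps only the rows of one group.
    """
    datasets = []
    for group in groups:
        datasets.append({
            "name": f"table_{group}",  # naming scheme is an assumption
            "source": source,
            "transform": [
                # json.dumps quotes and escapes the label for the Vega expression
                {"type": "filter", "expr": f"datum.group === {json.dumps(group)}"}
            ],
        })
    return datasets


# Example with made-up group labels
datasets = per_group_datasets(["a", "b"])
print(json.dumps(datasets, indent=2))
```

Each resulting entry can then be appended to the spec's `data` array and referenced by name from its own group mark.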

Edit


{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "description": "A word cloud visualization depicting Vega research paper abstracts.",
  "title": "A Wordcloud",
  "width": 420,
  "height": 400,
  "padding": 0,
  "data": [
    {
      "name": "table",
      "url": "https://raw.githubusercontent.com/nyanxo/vega_facet_wordcloud/main/split.json"
    }
  ],
  "scales": [
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "table", "field": "text_split"},
      "range": ["#d5a928", "#652c90", "#939597"]
    }
  ],
  "layout": {"padding": 20, "columns": 2, "bounds": "full", "align": "all"},
  "marks": [
    {
      "name": "cell",
      "type": "group",
      "style": "cell",
      "from": {
        "facet": {"name": "facet", "data": "table", "groupby": ["group"]}
      },
      "encode": {
        "update": {"width": {"signal": "400"}, "height": {"signal": "400"}}
      },
      "marks": [
        {
          "type": "text",
          "from": {"data": "facet"},
          "encode": {
            "enter": {
              "text": {"field": "text_split"},
              "align": {"value": "center"},
              "baseline": {"value": "alphabetic"},
              "fill": {"scale": "color", "field": "text_split"}
            },
            "update": {"fillOpacity": {"value": 1}},
            "hover": {"fillOpacity": {"value": 0.5}}
          },
          "transform": [
            {
              "type": "wordcloud",
              "size": [400, 400],
              "text": {"field": "text_split"},
              "font": "Helvetica Neue, Arial",
              "fontSizeRange": [12, 56],
              "padding": 2
            }
          ]
        }
      ]
    }
  ]
}
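The spec above reads rows that already hold one word each (`text_split`) together with a `group` field. A minimal Python sketch of how such rows might be precomputed from the [{"text": ..., "group": ...}] structure in the question is shown below. The function name, the whitespace tokenization, and the `size` field name are assumptions, not taken from the actual data file.

```python
from collections import Counter


def word_counts_per_group(records):
    """Count words separately for each group (hypothetical helper).

    `records` is a list of {"text": ..., "group": ...} dicts.
    Returns one row per (group, word) pair with its count in "size".
    """
    counts = Counter()
    for record in records:
        # Naive tokenization: lowercase and split on whitespace
        for word in record["text"].lower().split():
            counts[(record["group"], word)] += 1
    return [
        {"group": group, "text_split": word, "size": n}
        for (group, word), n in counts.items()
    ]


# Example with made-up input
rows = word_counts_per_group([
    {"text": "vega vega cloud", "group": "x"},
    {"text": "cloud", "group": "y"},
])
```

Because the counting happens before the data reaches Vega, the `group` field survives and can be used in `facet`.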
Davide Bacci
  • Hey, thanks for your answer. So would it be possible to facet if I computed the word counts beforehand? – sisyphos Jul 16 '22 at 10:21
  • Yes, that would be possible. Don't forget to mark as solved if your problem is solved. – Davide Bacci Jul 16 '22 at 10:23
  • I have computed the word counts per group now, but I'm stuck on the faceting and also had to remove the transform that computes the rotation of the words: https://vega.github.io/editor/#/gist/de224a20126569d8f0df402e47ff5cdb/spec.json – sisyphos Jul 16 '22 at 10:57
  • I'll update my answer for you. If your question is answered, kindly mark as solved. – Davide Bacci Jul 16 '22 at 16:38
1

Here is a working example of a Vega spec that uses facet with your data.

For illustration only, the formula transform computes an angle field that keeps words with a size of 3 or more horizontal and gives every other word a random rotation between -45 and 45 degrees.
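To make the rule concrete, the expression used in the formula transform can be read as the following Python sketch. The function name `angle_for` is an assumption; the threshold and the list of angles are the ones from the spec's expression.

```python
import random

# Candidate rotations, as listed in the spec's formula expression
ANGLES = [-45, -30, -15, 0, 15, 30, 45]


def angle_for(size, rng=random):
    """Mirror of: datum.size >= 3 ? 0 : ANGLES[floor(random() * 7)]."""
    if size >= 3:
        return 0  # larger words stay horizontal
    # int() truncates, which equals floor for non-negative values
    return ANGLES[int(rng.random() * len(ANGLES))]
```

Note that a small word can still end up horizontal, since 0 is one of the candidate angles.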

View in Vega online editor

{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "description": "A word cloud visualization depicting Vega research paper abstracts.",
  "title": "A Wordcloud",
  "width": 400,
  "height": 400,
  "padding": 10,
  "background": "ghostwhite",

  "layout": {
    "bounds": "flush",
    "columns": 2,
    "padding": 10
  },

  "data": [
    {
      "name": "table",
      "url": "https://raw.githubusercontent.com/nyanxo/vega_facet_wordcloud/main/split.json",
      "transform": [
        {
          "type": "formula",
          "as": "angle",
          "expr": "datum.size >= 3 ? 0 : [-45, -30, -15, 0, 15, 30, 45][floor(random() * 7)]"
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "table", "field": "text_split"},
      "range": ["#d5a928", "#652c90", "#939597"]
    }
  ],

  "marks": [
   {
      "type": "group",
      "from": {
        "facet": {
          "name": "facet",
          "data": "table",
          "groupby": "group"
        }
      },

      "title": {
        "text": {"signal": "parent.group"},
        "frame": "group"
      },

      "encode": {
        "update": {
          "width": {"signal": "width"},
          "height": {"signal": "height"}
        }
      },

      "marks": [
        {
          "type": "rect",
          "encode": {
            "enter": {
              "x": {"value": 0},
              "width": {"signal": "width"},
              "y": {"value": 0},
              "height": {"signal": "height"},
              "fill": {"value": "beige"}
            }
          }
        },

        {
          "type": "text",
          "from": {"data": "facet"},
          "encode": {
            "enter": {
              "text": {"field": "text_split"},
              "align": {"value": "center"},
              "baseline": {"value": "alphabetic"},
              "fill": {"scale": "color", "field": "text_split"}
            },
            "update": {"fillOpacity": {"value": 1}},
            "hover": {"fillOpacity": {"value": 0.5}}
          },
          "transform": [
            {
              "type": "wordcloud",
              "size": {"signal": "[width, height]"},
              "text": {"field": "text_split"},
              "rotate": {"field": "datum.angle"},
              "font": "Helvetica Neue, Arial",
              "fontSize": {"field": "datum.size"},
              "fontSizeRange": [12, 28],
              "padding": 2
            }
          ]
        }
      ]
   }
  ]
}
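On the side question of referencing Python variables instead of a hosted JSON file: a Vega data entry accepts inline `values` in place of `url`, so one option is to keep the spec as a Python dict and swap the data in before rendering. The sketch below assumes the spec has been loaded as a dict; the helper name `with_inline_data` and the sample rows are hypothetical.

```python
import copy


def with_inline_data(spec, rows, data_name="table"):
    """Return a copy of `spec` whose `data_name` entry uses inline rows.

    Replaces the entry's "url" with "values"; the original dict is untouched.
    """
    spec = copy.deepcopy(spec)
    for entry in spec["data"]:
        if entry["name"] == data_name:
            entry.pop("url", None)   # drop the remote source, if any
            entry["values"] = rows   # embed the Python data directly
    return spec


# Example with a stripped-down spec and made-up rows
base_spec = {"data": [{"name": "table", "url": "https://example.com/split.json"}]}
rows = [{"group": "x", "text_split": "vega", "size": 2}]
inline_spec = with_inline_data(base_spec, rows)
```

The resulting dict can be passed to whichever renderer is already in use, for example `pn.pane.Vega(inline_spec)` from the question.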
Roy Ing