
I have a list of strings:

How many glasses are on the tab ?
What does the sign say ?
Has the pizza been baked ?
Do you think the boy on the ground has broken legs ?
Is this man crying ?
How many pickles are on the plate ?
What is the shape of the plate?
…

How can I convert it into a sunburst plot in Python?

The sunburst plot shows the distribution of questions by their first four words. The arc length is proportional to the number of questions containing the word, and the white areas are words with contributions too small to show.

[Image: example sunburst plot of questions by their first four words]

(Image source -> page 5, figure 3)

The question "How to make a sunburst plot in R or Python?" doesn't make any assumption regarding the input format, and its Python answers assume that the input has a very different format.

Franck Dernoncourt
  • Possible duplicate: [How to make a sunburst plot in R or Python?](https://stackoverflow.com/questions/12926779/how-to-make-a-sunburst-plot-in-r-or-python) – Lomtrur Mar 05 '19 at 07:29
  • @Lomtrur Thanks, the question [How to make a sunburst plot in R or Python?](https://stackoverflow.com/q/12926779/395857) doesn't make any assumption regarding the input format and the answers assume that the input has a very different format. – Franck Dernoncourt Mar 05 '19 at 07:30

3 Answers

Expanding on Jimmy Ata's answer, which pointed to the Python plotly package: you can use Plotly's sunburst charts, documented at https://plotly.com/python/sunburst-charts/.

Example from the same page:

# From https://plotly.com/python/sunburst-charts/
import plotly.express as px

data = dict(
    character=["Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"],
    parent=["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve"],
    value=[10, 14, 12, 10, 2, 6, 6, 4, 4],
)

fig = px.sunburst(
    data,
    names='character',
    parents='parent',
    values='value',
)
fig.show()
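The example above expects names and parents, whereas the question starts from a plain list of strings. One possible way to bridge the two is sketched below: count every word prefix (up to four words) and emit one row per prefix. The helper names `prefix_rows` and `plot_prefixes` are hypothetical, and using the space-joined prefix as a unique id is an assumption of this sketch, not something the Plotly page prescribes.

```python
from collections import Counter


def prefix_rows(questions, depth=4):
    """Return one row per distinct word prefix (hypothetical helper).

    Each row has a unique id (the space-joined prefix), a label (the last
    word of the prefix), a parent id (the prefix minus its last word, or ""
    for first words), and a value (the number of questions starting with
    that prefix).
    """
    counts = Counter()
    for question in questions:
        words = question.split()[:depth]
        for i in range(1, len(words) + 1):
            counts[tuple(words[:i])] += 1

    rows = dict(ids=[], labels=[], parents=[], values=[])
    for prefix, n in counts.items():
        rows["ids"].append(" ".join(prefix))
        rows["labels"].append(prefix[-1])
        rows["parents"].append(" ".join(prefix[:-1]))
        rows["values"].append(n)
    return rows


def plot_prefixes(rows):
    """Draw the rows with plotly (imported lazily so counting works without it)."""
    import plotly.express as px

    # branchvalues="total": a parent's value already includes its children.
    fig = px.sunburst(rows, ids="ids", names="labels", parents="parents",
                      values="values", branchvalues="total")
    fig.show()


questions = [
    "How many glasses are on the tab ?",
    "What does the sign say ?",
    "Has the pizza been baked ?",
    "Do you think the boy on the ground has broken legs ?",
    "Is this man crying ?",
    "How many pickles are on the plate ?",
    "What is the shape of the plate?",
]
data = prefix_rows(questions)
```

Calling `plot_prefixes(data)` should then render the chart. Ids are needed because the same word (for example "the") can appear under several different parents.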


Franck Dernoncourt

I suggest an R package, ggsunburst: https://github.com/didacs/ggsunburst

This might be a good starting point. The file data.txt contains the first four words of the questions in your example.

library(ggsunburst)
sb <- sunburst_data('data.txt', type = "lineage", sep = ' ')
sunburst(sb, node_labels = T, node_labels.min = 0)

Using the first four words of the questions from https://conversationstartersworld.com/good-questions-to-ask/:

sunburst(sb, node_labels = T, leaf_labels = F, node_labels.min = 5)
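Since the question asks about Python, a file like data.txt could be prepared there before switching to R. The sketch below assumes that the "lineage" input is simply one question per line with its first four words separated by single spaces (matching `sep = ' '` above); that layout is inferred from the call shown here, and `write_lineages` is a hypothetical helper name.

```python
from pathlib import Path


def write_lineages(questions, path="data.txt", depth=4):
    """Write the first `depth` words of each question, one question per line.

    Assumed layout: words separated by single spaces, one lineage per line.
    Returns the lines that were written.
    """
    lines = [" ".join(question.split()[:depth]) for question in questions]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

For example, `write_lineages(["Is this man crying ?"])` would write the single line `Is this man crying` to data.txt.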


didac
import pandas as pd
import plotly.express as px

questions = [
    "What is the capital of France?",
    "How do you solve this problem?",
    "When did the World War II start?",
    "Where is Mount Everest located?",
    "Why is the sky blue?",
    "What is the purpose of life?",
    "How can I make a pie chart?",
    "When is your birthday?",
    "Why do we dream?",
    "How does the sun shine?",
]
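MAX_WORDS = 6  # number of leading words (rings) to plot

# levels[i] holds the i-th word of every question, or None if the question is shorter.
levels = [[] for _ in range(MAX_WORDS)]
for question in questions:
    words = question.split()[:MAX_WORDS]
    for i in range(MAX_WORDS):
        levels[i].append(words[i] if i < len(words) else None)

# One column per word position (A to F), plus a count column G with 1 per question.
columns = dict(zip("ABCDEF", levels))
columns["G"] = [1] * len(questions)
df = pd.DataFrame(columns)
print(df)

# px.sunburst builds the hierarchy from the ordered path columns.
fig = px.sunburst(df, path=list("ABCDEF"), values="G")
fig.show()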


This repo also works, though by default it counts all n-grams rather than only prefixes: https://github.com/mrzjy/sunburst
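To make the distinction between prefixes and all n-grams concrete, here is a small illustration in plain Python. It does not use that repository's API; both function names are hypothetical and only show what the two counting schemes would enumerate for one question.

```python
def prefix_ngrams(words, max_n=4):
    """N-grams that start at the first word only (what the question asks for)."""
    return [tuple(words[:n]) for n in range(1, min(max_n, len(words)) + 1)]


def all_ngrams(words, max_n=4):
    """Every contiguous n-gram, regardless of where it starts."""
    return [
        tuple(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    ]
```

With `words = "What is the shape".split()`, `prefix_ngrams(words, 2)` contains only `("What",)` and `("What", "is")`, while `all_ngrams(words, 2)` also contains entries such as `("is", "the")` that start mid-question.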