I have two lists:

import pandas as pd

article = ["A",'B','C','D','E','F','G','H']
quote = ['','A','A','C','C','','A','B']
pd.DataFrame({'Article':article,'Quote':quote})

Each entry in Quote is the article that the corresponding Article quotes (empty if it quotes nothing). I want to build a hierarchy from these quotes, in this hierarchical format:

[Image: desired hierarchical format]

As a result, I want to show that A is quoted by B, and B is quoted by H. How do I code this?

DYZ
judy
  • Possible duplicate of [Convert Python dict into a dataframe](https://stackoverflow.com/questions/18837262/convert-python-dict-into-a-dataframe) – Manmohan_singh May 11 '18 at 03:00

1 Answer

I'm not quite sure what you mean by making this hierarchy, but you can build a graph of the quoting simply enough, mapping each quoted article to the list of articles that quote it:

In []:
article = ["A",'B','C','D','E','F','G','H']
quote = ['','A','A','C','C','','A','B']
d = {}
for a, q in zip(article, quote):
    d.setdefault(q, []).append(a)
d

Out[]:
{'': ['A', 'F'], 'A': ['B', 'C', 'G'], 'B': ['H'], 'C': ['D', 'E']}

You can visualize this with a simple recursive function:

def fn(g, n, depth=0):
    print('{: >{width}}'.format(n, width=depth*4))
    for nn in g.get(n, []):
        fn(g, nn, depth+1)

In []:
fn(d, '')

Out[]:

   A
       B
           H
       C
           D
           E
       G
   F
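If the indented tree is needed as data rather than as printed text (for example, to export it later), the same recursion can yield rows instead of printing. This is only a sketch: the names `tree_rows` and `graph` are hypothetical and not part of the code above, and the root `''` is skipped on the assumption that it is only a placeholder for "quotes nothing".

```python
def tree_rows(graph, node, depth=0):
    """Yield (depth, name) pairs in the order the recursive printer visits them."""
    if node != '':  # '' is the artificial root, so it is not reported
        yield depth, node
    for child in graph.get(node, []):
        yield from tree_rows(graph, child, depth + 1)


article = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
quote = ['', 'A', 'A', 'C', 'C', '', 'A', 'B']

# Map each quoted article to the articles that quote it.
graph = {}
for a, q in zip(article, quote):
    graph.setdefault(q, []).append(a)

rows = list(tree_rows(graph, ''))
for depth, name in rows:
    # Top-level articles have depth 1, so they get no indentation.
    print('    ' * (depth - 1) + name)
```

Each row keeps the depth as a number, so the indentation can be rebuilt later or stored in its own column.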

Or you can display as a network using networkx:

In []:
import networkx as nx
G = nx.Graph(d)
G.remove_node('')
nx.draw(G, with_labels=True)

Out[]:

[Image: networkx drawing of the quote graph]

You can create the DataFrame you are looking for:

def fn(g, n):
    q = [[n]]
    while q:
        p = q.pop()
        if p[-1] not in g:
            yield p
            continue
        for n in g[p[-1]]:
            q.append(p + [n])

In []:
pd.DataFrame(list(reversed([(a, ','.join(q)) for _, a, *q in fn(d, '')])))

Out[]:
   0    1
0  A  B,H
1  A  C,D
2  A  C,E
3  A    G
4  F     

You can then write this DataFrame to Excel with pandas' `DataFrame.to_excel`.
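As a rough sketch of the Excel step, the path rows can be built in plain Python and then written with `pandas.ExcelWriter`, which lets several tables go to different sheets of one workbook. The helper names `quote_paths` and `write_sheets`, the file name `quotes.xlsx`, and the column labels are all hypothetical, and writing `.xlsx` assumes an Excel engine such as openpyxl is installed.

```python
def quote_paths(graph, root=''):
    """Return one [article, chain] row per leaf; chain is the comma-joined descendants."""
    rows = []
    stack = [[root]]
    while stack:
        path = stack.pop()
        children = graph.get(path[-1])
        if children:
            for child in children:
                stack.append(path + [child])
        elif len(path) > 1:
            # path[0] is the root placeholder, path[1] the top-level article.
            rows.append([path[1], ','.join(path[2:])])
    rows.reverse()
    return rows


def write_sheets(sheets, filename='quotes.xlsx'):
    """Write each {sheet name: (column names, rows)} entry to its own sheet."""
    import pandas as pd  # imported here so the row-building code runs without pandas
    with pd.ExcelWriter(filename) as writer:
        for name, (columns, rows) in sheets.items():
            frame = pd.DataFrame(rows, columns=columns)
            frame.to_excel(writer, sheet_name=name, index=False)


graph = {'': ['A', 'F'], 'A': ['B', 'C', 'G'], 'B': ['H'], 'C': ['D', 'E']}
paths = quote_paths(graph)

# Uncomment to create the workbook (hypothetical sheet and column names):
# write_sheets({'Sheet1': (['Article', 'Quoted by'], paths)})
```

Passing more entries in the `sheets` dictionary puts further tables on additional sheets of the same file.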

AChampion
  • Thinking along the same line... But `DiGraph`. – DYZ May 11 '18 at 01:18
  • @AChampion This is the answer I was looking for. You are a genius. – judy May 11 '18 at 01:20
  • Typo, fixed - also added recursive text display. – AChampion May 11 '18 at 01:26
  • @AChampion Thank you very much – judy May 11 '18 at 01:33
  • @AChampion Sorry, but I have one more question. How do I move recursive function output to Excel? – judy May 11 '18 at 01:35
  • Probably easiest to create the `DataFrame` and use that to write to excel... updated (revisited the earlier code too). – AChampion May 11 '18 at 01:58
  • @AChampion i got the error ('list' object is not callable) ... *q in fn(d, '') << fn(d,'') what is that mean '' ? – judy May 11 '18 at 02:48
  • What happens with `list(fn(d, ''))`, `''` is the root of the tree, both `'A'` and `'F'` have `''` in the original `quote` variable. What version of python are you using? – AChampion May 11 '18 at 03:53
  • @AChampion I want to move the output of the second and third code you uploaded to Excel sheet1 and sheet2, but I do not know how. The Python version is 3.6.2. An error occurred when i executed your last code. – judy May 11 '18 at 04:27
  • Post the code you have and the error in the question... I can't help otherwise. – AChampion May 11 '18 at 04:54
  • @AChampion This is the first time I have posted on this site, so I posted an issue in my question. – judy May 11 '18 at 05:17