I have the following tree represented in parent/child relationships:
import pandas as pd
df = pd.DataFrame(columns=['Parent','Child'])
df['Parent']=["A","A","A","B","B","B","C","C","F","G","G"]
df['Child']=["B","C","E","D","E","F","F","G","H","H","I"]
Some of the nodes have multiple parents. This should be removed by giving these common children different IDs based on the path. This is how it should look like after (right tree):
I wrote a funciton tha is supposed to write a path for every node and add it to the name, the results are collected in the dictionary "res". After a few day of trying this seems to be a bad approach as it doeas not split the paths. Below an example for node H.
Any ideas how the tree can be transformed?
res = {}
def find_parent(child, path):
path.append(str(child))
print("Path: ", path)
parents = df.loc[df['Child'] == child, ['Parent']]['Parent'].tolist()
print("Parents: ",parents)
if not parents:
print("Path end reached!")
print("Result: ", res)
# i+1
else :
for i in range(0,len(parents)-1):
if len(parents)>1: #dann neue paths
path = [(str(child))]
new_path = 'path_{}'.format(i)
print("-->add ",parents[i])
res[new_path] = str(''.join(path)) + parents[i]
print("Result: ", res)
print()
find_parent(parents[i], path)
else:
new_path = 'path_{}'.format(i)
print("-->add ",parents[i])
res[new_path] = str(''.join(path)) + parents[i]
print("Result: ", res)
print()
find_parent(parents[0],path)
return res
Example result for node "H"
find_parent("H", [])
{'path_0': 'FB'}
It should give H_FBA, HFCA and H_GCA.