I did a search on similar topics, but the answers are too vague for my level of understanding and comprehension, and I don't think they're specific enough to my question.
Similar threads:
Tree (directed acyclic graph) implementation
Representing a DAG (directed acyclic graph)
Question:
I have formatted a text file which contains data of the following format...
Example dataset:
GO:0000109#is_a: GO:0000110#is_a: GO:0000111#is_a: GO:0000112#is_a: GO:0000113#is_a: GO:0070312#is_a: GO:0070522#is_a: GO:0070912#is_a: GO:0070913#is_a: GO:0071942#part_of: GO:0008622
GO:0000112#part_of: GO:0000442
GO:0000118#is_a: GO:0016581#is_a: GO:0034967#is_a: GO:0070210#is_a: GO:0070211#is_a: GO:0070822#is_a: GO:0070823#is_a: GO:0070824
GO:0000120#is_a: GO:0000500#is_a: GO:0005668#is_a: GO:0070860
GO:0000123#is_a: GO:0005671#is_a: GO:0043189#is_a: GO:0070461#is_a: GO:0070775#is_a: GO:0072487
GO:0000126#is_a: GO:0034732#is_a: GO:0034733
GO:0000127#part_of: GO:0034734#part_of: GO:0034735
GO:0000133#is_a: GO:0031560#is_a: GO:0031561#is_a: GO:0031562#is_a: GO:0031563#part_of: GO:0031500
GO:0000137#part_of: GO:0000136
I'm looking to construct a weighted directed DAG from this data (the above is just a snippet). The whole dataset of 106kb is here: Source
--------------------------------------------------
Taking into consideration line-by-line, the data of each line is explained as follows...
First line as an example:
GO:0000109#is_a: GO:0000110#is_a: GO:0000111#is_a: GO:0000112#is_a: GO:0000113#is_a: GO:0070312#is_a: GO:0070522#is_a: GO:0070912#is_a: GO:0070913#is_a: GO:0071942#part_of: GO:0008622
'#' is the delimeter/tokenizer for the line data.
The First term, GO:0000109 is the node name.
The subsequent terms of is_a: GO:xxxxxxx OR part_of: GO:xxxxxxx are the nodes which are connected to GO:0000109.
Some of the subsequent terms have connections too, as depicted in the dataset.
When it is is_a, the weight of the edge is 0.8.
When it is part_of, the weight of the edge is 0.6.
--------------------------------------------------
I have Google-d on how DAGs are, and I understand the concept. However, I still have no idea how to put it into code. I'm using Java.
From my understanding, a graph generally consists of nodes and arcs. Does this graph require an adjacency list to determine the direction of the connection? If so, I'm not sure how to combine the graph and adjacency list to communicate with each other.
After constructing the graph, my secondary goal is to find out the degree of each node from the root node. There is a root node in the dataset.
For illustration, I have drawn out a sample of the connection of the first line of data below:
Image Link
I hope you guys understand what I'm trying to achieve here. Thanks for looking through my problem. :)