I have a contact dataset. It lists the IDs of two groups of people and how long each pair overlapped. The contacts are paired by position: person1[i] was in contact with person2[i]. I want to visualize these contacts. Each person also has attributes such as income and activity area zone, so I would then like to compare the contacts between different groups.

I hope the final visualisation will look something like this, but with colours that vary by group (high income, low income).

[image]

Because I assume this is related to graph theory, I am using Graphs.jl to visualize the data. However, visualizing this dataset may turn out to have little to do with graph theory.

Here is an example of the dataset.

person1 = [471801, 1265801, 3938401, 10566801, ...] # length(person1) = 68
person2 = [352605801, 484811501, 179156501, 291065401, ...] # length(person2) = 68
overlaptime = [789, 321, 12314, 3256, ...] # length(overlaptime) = 68

I built the graph using the code:

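using SimpleWeightedGraphs

# df1 is the DataFrame holding the person1, person2 and overlaptime columns
sources_df1 = df1[:, :person1]
destinations_df1 = df1[:, :person2]
weights_df1 = df1[:, :overlaptime]
g_df1_test = SimpleWeightedGraph(sources_df1, destinations_df1, weights_df1)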

The graph then treats each person ID as a vertex number, so the number of vertices runs into the millions. I tried converting person1 and person2 from Int to String, but then the graph could not be built.

Is there a method or package I can use to build a network map of the contact dataset? Thank you.

Sundar R
Chao

1 Answer

SimpleWeightedGraph expects vertices to be numbered consecutively starting from 1, which is why you're creating such a large graph: it interprets your person IDs as the actual vertex numbers. I'd suggest using a MetaGraph instead and storing your IDs as the vertex names, which you can then look up more easily after calling set_indexing_prop!. See this question for more details.
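
If you would rather keep SimpleWeightedGraph, the numbering problem described above can also be worked around by remapping the person IDs to consecutive indices before building the graph. This is a different technique from the MetaGraph suggestion, and only a minimal sketch: it uses just the four rows shown in the question (the rest are elided there), and the names `ids` and `index_of` are introduced here purely for illustration.

```julia
using Graphs, SimpleWeightedGraphs

# First four rows of the question's example; the remaining rows are elided there.
person1 = [471801, 1265801, 3938401, 10566801]
person2 = [352605801, 484811501, 179156501, 291065401]
overlaptime = [789, 321, 12314, 3256]

# Collect every distinct person ID and give each one a compact index 1..n.
ids = unique(vcat(person1, person2))
index_of = Dict(id => i for (i, id) in enumerate(ids))

# Translate both ID columns into the compact vertex numbers.
sources = [index_of[p] for p in person1]
destinations = [index_of[p] for p in person2]

# The graph now has one vertex per distinct person instead of one per possible ID.
g = SimpleWeightedGraph(sources, destinations, overlaptime)

println(nv(g))   # number of vertices, equal to length(ids)
println(ids[1])  # original person ID behind vertex 1
```

Because `ids[v]` gives back the original ID of vertex `v`, per-person attributes such as income can be joined onto the vertices afterwards, for example to choose node colours when plotting.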