I can plot a CDF and CCDF without trouble when the data is in one column, but I am a little clueless about how to plot a CDF or CCDF when the data is in the format given below. The pairs in round brackets () are the node pairs. The values in square brackets [] are the occurrence values, and the number in between (e.g. 7) is the frequency. We don't have to consider the frequency, only the occurrence values.
Input data format (there are millions of rows, with many values between the square brackets []):
('4503', '656') 7 [2473.0, 35.0, 235.0, 157.0, 505.0, 45.0, 1303.0]
('2105', '674') 1 [2584.0]
('5139', '1086') 1 [1488.0]
('3690', '2034') 6 [1009.0, 1108.0, 132.0, 447.0, 157.0, 466.0]
('3867', '1982') 1 [1134.0]
I have to plot the CCDF of the data between the square brackets [] all together, not separately for each row. I don't understand how to read the data between the square brackets and plot it.
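To make the goal concrete, here is a minimal sketch of one possible approach, assuming each row contains exactly one [...] group of comma-separated numbers, as in the sample above. The helper names `read_values`, `ccdf` and `plot_ccdf` are made up for illustration, and matplotlib is only an assumed choice for the plotting step. The idea is to pull every bracketed value into one pooled list, then compute the empirical CCDF, P(X > x), over that list.

```python
import re
from collections import Counter

# Matches the text between the first '[' on a line and the next ']'.
BRACKETS = re.compile(r"\[([^\]]*)\]")

def read_values(lines):
    """Pool every number found inside square brackets, across all lines."""
    values = []
    for line in lines:
        match = BRACKETS.search(line)
        if match is None:
            continue  # skip blank or malformed rows
        tokens = match.group(1).split(",")
        values.extend(float(tok) for tok in tokens if tok.strip())
    return values

def ccdf(values):
    """Return the sorted unique values xs and P(X > x) for each x in xs."""
    counts = Counter(values)
    total = len(values)
    xs = sorted(counts)
    probs = []
    remaining = total
    for x in xs:
        remaining -= counts[x]  # observations strictly greater than x
        probs.append(remaining / total)
    return xs, probs

def plot_ccdf(xs, probs):
    """Draw the CCDF as a step curve (assumes matplotlib is installed)."""
    import matplotlib.pyplot as plt
    plt.step(xs, probs, where="post")
    plt.xlabel("occurrence value")
    plt.ylabel("P(X > x)")
    plt.show()

# The five sample rows from the question.
sample = [
    "('4503', '656') 7 [2473.0, 35.0, 235.0, 157.0, 505.0, 45.0, 1303.0]",
    "('2105', '674') 1 [2584.0]",
    "('5139', '1086') 1 [1488.0]",
    "('3690', '2034') 6 [1009.0, 1108.0, 132.0, 447.0, 157.0, 466.0]",
    "('3867', '1982') 1 [1134.0]",
]
xs, probs = ccdf(read_values(sample))
```

Since `read_values` only iterates over lines, an open file object could be passed in place of `sample`, so the whole file is never held in memory as text. Calling `plot_ccdf(xs, probs)` would then draw the curve. The CDF, if needed, is 1 minus each CCDF value.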