I have the following data in a file. I want to extract the time and the size from relevant lines and plot a time-series graph.
03/12 20:23:26.11: 04:23:26 L9 <Mx Acc Magnum All XDV:00111A0000000117 00D3001200870172 01FF6000F01CFE81 3D26000000000300
03/12 20:23:26.11: 04:23:26 L9 <Mx Acc MID 0x1500 Len 26 XDV:00111A0000000117 00D3001200870172 01FF6000F01CFE81 3D26000000000300
03/12 20:23:26.11: 04:23:26 L8 <Mx JK31 (Mx) JSP:17.37.6.99: Size = 166, Data: 00345C4101003031 E463EF0113108701 5A01FF6008F01CFE 81AB170000000003 EF01131087015A01 FF6008F01CFE81AB 170000000003EF01 131087015B01FF60 00F01CFE81701B00 00000003EF011310 87015B01FF6000F0 1CFE81701B000000 0003EF0113108701 5C01FF2000F01CFE 81CB240000000003 EF01131087015C01 57CC00F01CFE81CB 240000000003EF01 131087015D01FF20 00F01CFE815B2900 00000003EF011310 87015D01FF2000F0 1CFE815B29000000 0003EF0113108701 5E01FF6000F01CFE 819D280000000003 EF01131087015E01 FF6000F01CFE819D 0003
03/15 20:23:26.11: 04:23:26 L8 <Kx JK31 (Kx) JSP:15.33.2.93: Size = 163, Data: 00647741000030EF 01131087015A01FF 6008F01CFE81AB17 0000000003EF0113 1087015A01FF6008 F01CFE81AB170000 000003EF01131087 015B01FF6000F01C FE81701B00000000 03EF01131087015B 01FF6000F01CFE81 701B0000000003EF 01131087015C01FF 2000F01CFE81CB24 0000000003EF0113 1087015C01FF2000 F01CFE81CB240000 000003EF01131087 015D01FF2000F01C FE815B2900000000 03EF01131087015D 01FF2000F01CFE81 5B290000000003EF 01131087015E01FF 6000F01CFE819D28 0000000003EF0113 1087015E01FF6000 F01CFE819D280000 A6220000000003
03/15 20:23:26.11: 04:23:26 L9 <Kx JK31 (Kx) JSP:10.22.1.53:Size = 163, Data: 009D1141000030EF 01131087015A01FF 6008F01CFE81AB17 0000000003EF0113 1087015A01FF6008 F01CFE81AB170000 000003EF01131087 015B01FF6000F01C FE81701B00000000 03EF01131087015B 01FF6000F01CFE81 701B0000000003EF 01131087015C01FF 2000F01CFE81CB24 0000000003EF0113 1087015C01FF2000 F01CFE81CB240000 000003EF01131087 015D01FF2000F01C FE815B2900000000 03EF01131087015D 01FF2000F01CFE81 5B290000000003EF 01131087015E01FF 6000F01CFE819D28 0000000003EF0113 1087015E01FF6000 F01CFE819D280000 A6220000000003
I have the following program to do it.
import re
from dateutil import parser
import matplotlib.pyplot as plt

match_list = ["L8 <Mx JK31 (Mx)", "L9 <Mx JK31 (Mx)"] ## put all match strings in this list
with open("test.txt") as fin:
    print(' : {}'.format(fin.name))
    time_data = {} ## save data in dictionaries, with string keys and lists as values
    size_data = {}
    for line in fin:
        for match in match_list:
            if match in line:
                if match not in time_data:
                    time_data[match] = [] ## initialize empty list the first time this key is encountered
                    size_data[match] = []
                fields = line.strip().split() ## keep `line` intact for the next match test
                time_str = fields[2]
                t = parser.parse(time_str)
                time_data[match].append(t)
                ## the field position of the size varies with spacing, so search for "Size = N"
                size = int(re.search(r"Size\s*=\s*(\d+)", line).group(1))
                size_data[match].append(size)
for match in match_list:
    if match not in time_data:
        continue ## no line in the file matched this key
    plt.figure() ## create a new figure for each data set
    plt.plot(time_data[match], size_data[match])
plt.show() ## simultaneously show all plots
I am using two dictionaries above, time_data and size_data. Each dictionary uses the elements of match_list as its keys. The values are lists: datetime objects in time_data and integer sizes in size_data.
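To make the layout concrete, here is a minimal sketch of what the two dictionaries might hold after two matching lines have been read. The date and the second size are made-up placeholders (as far as I know, dateutil fills in the current date when the string contains only a time); only the shape matters.

```python
from datetime import datetime

key = "L8 <Mx JK31 (Mx)"

# Hypothetical contents: the date and the second size are placeholders.
time_data = {key: [datetime(2024, 3, 12, 4, 23, 26),
                   datetime(2024, 3, 12, 4, 23, 26)]}
size_data = {key: [166, 163]}

# The two lists are parallel: entry i of each one comes from the same log line.
pairs = list(zip(time_data[key], size_data[key]))
```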
The above was done so that it would be easy to plot using matplotlib. Now I want to do the following.
As in the sample data above, the same key L8 <Mx JK31 (Mx) can have two values with the same time (04:23:26).
I want to modify the data structure (i.e. the lists inside my dictionaries) so that the size values (the values in the lists inside size_data) are summed up for every minute.
Suppose there are 5 values as below:
04:23:26 56
04:23:26 60
04:23:43 70
04:23:46 80
04:23:56 90
I want the above to be replaced with 04:23:00 and 356. How do I go about doing this?
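One possible approach, given as a rough sketch rather than a definitive answer: truncate each datetime to the start of its minute and accumulate the sizes per truncated value. The helper name `sum_per_minute` and the date used below are assumptions made for illustration; they are not part of the program above.

```python
from collections import defaultdict
from datetime import datetime

def sum_per_minute(times, sizes):
    """Return (minute_starts, totals) with sizes summed per calendar minute."""
    totals = defaultdict(int)
    for t, size in zip(times, sizes):
        # Truncate to the start of the minute, e.g. 04:23:26 -> 04:23:00.
        totals[t.replace(second=0, microsecond=0)] += size
    minutes = sorted(totals)
    return minutes, [totals[m] for m in minutes]

# The five example values from the question; the date is an arbitrary placeholder.
day = (2024, 3, 12)
times = [datetime(*day, 4, 23, s) for s in (26, 26, 43, 46, 56)]
sizes = [56, 60, 70, 80, 90]

minutes, totals = sum_per_minute(times, sizes)
print(minutes[0].time(), totals[0])  # 04:23:00 356
```

With the dictionaries from the program, this could presumably be applied per key before plotting, for example `time_data[match], size_data[match] = sum_per_minute(time_data[match], size_data[match])`. Resampling with pandas would be another option if that dependency is acceptable.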