0

I am working on some data extraction in a text file. I have been using MATLAB for a while now, but it seems kinda more stressful. I started using python for the extraction. Now I have a pretty complex question and I don't even have any idea about how to do it.

Here's what I've done so far. I have a log file that looks likes this:

2017-12-21T23:59:19.120Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:19.120Z 'D|Beat:2143|B->'
2017-12-21T23:59:19.120Z 'D|Beat:2113|sndB:0x5caa'
2017-12-21T23:59:19.175Z 'I|PSnd:  61|snd[3D]:FFFF m:0x5caa e:0'
2017-12-21T23:59:19.175Z 'I|PSnd: 233|sD[3D]m:0x5caa e:0'
2017-12-21T23:59:19.175Z 'D|Beat:1259|WDTimeout: 300'
2017-12-21T23:59:19.175Z 'D|Beat:1282|sd:0x5caa: e:0'
2017-12-21T23:59:19.175Z 'D|Beat:1302|sprts'
2017-12-21T23:59:19.175Z 'D|LgPl:  68|BSP:getSize:19'
2017-12-21T23:59:19.175Z 'D|Beat:5503|GetPckt:0x4e5e'
2017-12-21T23:59:19.175Z 'D|Beat:7140|Prtns->'
2017-12-21T23:59:19.175Z 'D|Beat:2008|sevt:72'
2017-12-21T23:59:19.175Z 'I|Beat:2021|SndQ:1'
2017-12-21T23:59:19.175Z 'D|Beat:1805|snd:0x4e5e'
2017-12-21T23:59:19.175Z 'I|PSnd:  61|snd[B0]:FFFF m:0x4e5e e:0'
2017-12-21T23:59:19.175Z 'I|PSnd: 233|sD[B0]m:0x4e5e e:0'
2017-12-21T23:59:19.175Z 'D|Beat:1866|sd:0x4e5e:0'
2017-12-21T23:59:19.175Z 'D|Beat:1192|drop:2402 q:43'
2017-12-21T23:59:19.301Z 'D|Beat:1220|Rcv<-RP, s:2402'
2017-12-21T23:59:19.301Z 'D|LgPl:  68|BSP:getSize:19'
2017-12-21T23:59:19.301Z 'I|Beat:1243|RcvQ:1'
2017-12-21T23:59:19.301Z 'D|Beat:1245|FrMsg:0x4cc0 QMsg:0x3ba4'
2017-12-21T23:59:19.301Z 'D|Beat:8934|AAltB->B1302'
2017-12-21T23:59:19.416Z 'D|Beat:1192|drop:2402 q:50'
2017-12-21T23:59:19.416Z 'D|Beat:10392|RStp'
2017-12-21T23:59:19.437Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:19.489Z 'D|Beat:6502|slt:2'
2017-12-21T23:59:19.489Z 'D|Beat:10341|RStrt'
2017-12-21T23:59:19.489Z 'D|Beat:4713|prtTS:2'
2017-12-21T23:59:19.489Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:19.552Z 'D|Beat:1192|drop:2402 q:36'
2017-12-21T23:59:19.820Z 'D|Beat:1192|drop:2402 q:48'
2017-12-21T23:59:19.820Z 'D|Beat:10747|PLife:67'
2017-12-21T23:59:19.820Z 'D|Beat:4906|nojump'
2017-12-21T23:59:19.820Z 'D|Beat:10392|RStp'
2017-12-21T23:59:19.820Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:19.873Z 'D|Beat:6502|slt:3'
2017-12-21T23:59:20.266Z 'D|Beat:6502|slt:4'
2017-12-21T23:59:20.266Z 'D|Beat:10341|RStrt'
2017-12-21T23:59:20.266Z 'D|Beat:4713|prtTS:4'
2017-12-21T23:59:20.266Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:20.318Z 'D|Beat:1192|drop:2301 q:49'
2017-12-21T23:59:20.339Z 'D|Beat:1358|drop:2301 q:49'
2017-12-21T23:59:20.339Z 'D|Beat:1220|Rcv<-RP, s:2402'
2017-12-21T23:59:20.339Z 'D|LgPl:  68|BSP:getSize:19'
2017-12-21T23:59:20.339Z 'I|Beat:1243|RcvQ:1'
2017-12-21T23:59:20.339Z 'D|Beat:1245|FrMsg:0x4192 QMsg:0x4cc0'
2017-12-21T23:59:20.339Z 'D|Beat:1192|drop:2402 q:48'
2017-12-21T23:59:20.454Z 'D|Beat:1192|drop:2402 q:51'
2017-12-21T23:59:20.579Z 'D|Beat:1192|drop:2402 q:48'
2017-12-21T23:59:20.610Z 'D|Beat:10747|PLife:68'
2017-12-21T23:59:20.610Z 'D|Beat:4906|nojump'
2017-12-21T23:59:20.610Z 'D|Beat:10392|RStp'
2017-12-21T23:59:20.610Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:20.632Z 'D|Beat:6502|slt:5'
2017-12-21T23:59:21.045Z 'D|Beat:6502|slt:6'
2017-12-21T23:59:21.045Z 'D|Beat:10341|RStrt'
2017-12-21T23:59:21.045Z 'D|Beat:4713|prtTS:6'
2017-12-21T23:59:21.045Z 'D|Beat: 971|RStrtD'

Now I need to extract out any line that contains RStrtD followed by another line having RStpD and then find the difference in the time between them, for every case of this in the text file, and then add the times together. I extracted using the code below:

print" trying out something spectacular"

def get_line(file_name, find_word1, find_word2):

    lines = []
    for line in file_name.strip().split('\n'):
        if find_word1 in line:
            lines.append(line)
        elif find_word2 in line:
            lines.append(line)
        else:
            pass
    return lines

def get_all_lines(f_name, find_word1, find_word2):

    f_content = open(f_name, 'r').read()
    return get_line(f_content,find_word1, find_word2)

def get_files_in (in_file, find_word1,find_word2, out_file):

    filtererd_lines = get_all_lines(in_file, find_word1, find_word2)
    joinliens = '\n'.join(filtererd_lines)
    open(out_file, 'w').write(joinliens)

#fix= "mm", "cts"
get_files_in("./sss1.txt", "RStrtD", "RStpD", "./result1.txt")

After running this, I received the following output:

2017-12-21T23:59:43.561Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:44.419Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:44.715Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:46.730Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:47.062Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:48.273Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:48.625Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:49.487Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:49.783Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:51.789Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:52.122Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:53.334Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:53.680Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:54.529Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:54.835Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:56.840Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:57.182Z 'D|Beat: 997|RStpD'

This is good, but I now need to subtract the times from each other on each line and then take the sum of all of these differences. I really don't know how iI can go about this. I'm not yet familier with time vectors in python.

mufty__py
  • 127
  • 1
  • 12
  • You should use [`datetime.strptime`](https://docs.python.org/3/library/datetime.html#datetime.datetime.strptime) to parse the times into `datetime` objects. – Patrick Haugh Feb 07 '18 at 02:40
  • But the time striptime is for subtracting two time, what if i need to subtract time from the second row from the first row, and 4th from 3rd and so on...then add them alll together. – mufty__py Feb 07 '18 at 03:05
  • Your first step should be to get a list of `datetime` objects. Then use that list to generate the pairs of sequential times ([see here](https://stackoverflow.com/questions/6822725/rolling-or-sliding-window-iterator)). Then you can use subtraction to produce `timedelta` objects. – Patrick Haugh Feb 07 '18 at 03:07

1 Answers1

0

for doing math with time, I would recommend arrow. Its documentation is comprehensive and you can easily subtract time by just time2-time1 which will give you a time delta.

Salmaan P
  • 817
  • 1
  • 12
  • 32