I have an S19 file that looks something like this:

S0030000FC
S30D0003C0000F0000000000000020
S3FD00000000782EFF1FB58E00003D2B00003D2B00003D2B00003D2B00003D2B00003D
S3ED000000F83D2B00003D2B00003D2B00003D2B00003D2B00003D2B00003D2B00003D
S31500000400FFFFFFFFFFFFFFFFFFFFFFFF7EF9FFFF7D
S3FD0000041010B5DFF828000468012147F22C10C4F20300016047F22010C4F2030000
S70500008EB4B8

I want to split each line into fields: the first two characters, the next two characters, and so on, with the last two characters of each line also separated. The result should look like this:

S0, 03, 0000, FC
S3, 0D, 0003C000, 0F00000000000000, 20
S3, FD, 00000000, 782EFF1FB58E00003D2B00003D2B00003D2B00003D2B00003D2B0000, 3D
S3, ED, 000000F8, 3D2B00003D2B00003D2B00003D2B00003D2B00003D2B00003D2B0000, 3D
S3, 15, 00000400, FFFFFFFFFFFFFFFFFFFFFFFF7EF9FFFF, 7D
S3, FD, 00000410, 10B5DFF828000468012147F22C10C4F20300016047F22010C4F20300, 00
S7, 05, 00008EB4, B8

How can I do this in Python? I have something like this:

 #!/usr/bin/python
 import string,os,sys,re,fileinput
 print "hi"
 inputfile = "k60.S19"
 outputfile = "k60_out.S19"

 # open the source file and read it
 fh = file(inputfile, 'r')
 subject = fh.read()
 fh.close()

 # create the pattern object. Note the "r". In case you're unfamiliar with Python
 # this is to set the string as raw so we don't have to escape our escape characters

 pattern2 = re.compile(r'S3')
 pattern3 = re.compile(r'S7')
 pattern1 = re.compile(r'S0')
 # do the replace
 result1 = pattern1.sub("S0, ", subject)
 result2 = pattern2.sub("S3, ", subject)
 result3 = pattern3.sub("S7, ", subject)

 # write the file
 f_out = file(outputfile, 'w')

 f_out.write(result1)
 f_out.write(result2)
 f_out.write(result3)
 f_out.close()

 #EoF

But it does not work the way I want. Can someone help me come up with a proper regular expression for this?

Mehdi

3 Answers

Try the bincopy package; it may be what you need.

bincopy - Interpret strings as packed binary data

Mangling of various file formats that conveys binary information (Motorola S-Record, Intel HEX and binary files).

import bincopy
f = bincopy.BinFile()
f.add_srec_file("path/to/your/s19/file.s19")
f.as_binary()  # returns the file contents as binary data (it does not print)

Or you can read the file yourself with open() and slice each line:

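# Address width in hex characters for each record type.
ADDR_LEN = {"S0": 4, "S1": 4, "S2": 6, "S3": 8, "S5": 4, "S7": 8, "S8": 6, "S9": 4}

with open("path/to/your/s19/file.s19") as s19:
    for line in s19:
        line = line.strip()  # drop the newline so the last slice is the checksum
        if not line:
            continue
        rec_type = line[0:2]
        count = line[2:4]
        addr_end = 4 + ADDR_LEN.get(rec_type, 8)
        address = line[4:addr_end]
        data = line[addr_end:-2]
        crc = line[-2:]
        # leave out the data field when it is empty (e.g. S0 and S7 above)
        print(", ".join(f for f in (rec_type, count, address, data, crc) if f))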
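As a side note, the final two-character field is a checksum: the ones' complement of the low byte of the sum of the count, address, and data bytes. A minimal sketch of verifying it is below; the function name `srec_checksum_ok` is made up for illustration, and the code assumes Python 3 and lines that contain only hex digits after the two-character record type.

```python
def srec_checksum_ok(line):
    """Return True if the trailing checksum byte matches the record contents.

    Hypothetical helper; assumes everything after the type is valid hex.
    """
    line = line.strip()
    # Everything after the 2-character type: count, address, data, checksum.
    payload = bytes.fromhex(line[2:])
    if len(payload) < 2:
        return False
    count, body, checksum = payload[0], payload[1:-1], payload[-1]
    # The byte count covers address + data + checksum.
    if count != len(body) + 1:
        return False
    # Ones' complement of the low byte of the sum.
    return (~(count + sum(body))) & 0xFF == checksum


print(srec_checksum_ok("S0030000FC"))
print(srec_checksum_ok("S70500008EB4B8"))
```

Note that the long `S3FD` and `S3ED` lines in the question are shorter than their count field says, so they appear to have been truncated for posting and would not pass this check.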

Hope it helps. See also: Motorola S-record file format.

gyun

You can do it using a callback function as replacement with re.sub:

#!/usr/bin/python
import re

data = r'''S0030000FC
S30D0003C0000F0000000000000020
S3FD00000000782EFF1FB58E00003D2B00003D2B00003D2B00003D2B00003D2B00003D
S3ED000000F83D2B00003D2B00003D2B00003D2B00003D2B00003D2B00003D2B00003D
S31500000400FFFFFFFFFFFFFFFFFFFFFFFF7EF9FFFF7D
S3FD0000041010B5DFF828000468012147F22C10C4F20300016047F22010C4F2030000
S70500008EB4B8'''

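# Groups: type, count, address (4 or 8 characters, greedy), data.
# The lookahead leaves the final two characters (the checksum) unmatched.
pattern = re.compile(r'^(..)(..)((?:.{4}){1,2})(.*)(?=..)', re.M)

def repl(m):
    # Rebuild the match from its non-empty groups, each followed by ', '.
    repstr = ''
    for g in m.groups():
        if g:
            repstr += g + ', '
    return repstr

print(re.sub(pattern, repl, data))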

However, as Mark Setchell notes, there is probably a nicer way to do it with slicing.
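One caveat with the pattern above: `(?:.{4}){1,2}` guesses the address width greedily, so a record with a 16-bit address and a non-empty data field (an S1 record, for example) would get the first four data characters folded into the address. A possible variant, sketched below, fixes the address width from the record type instead. The names `ADDR_DIGITS`, `srec_pattern` and `split_record` are illustrative only, and the width table is my reading of the S-record format, not something from the question.

```python
import re

HEX = '[0-9A-Fa-f]'
# Assumed address width in hex digits, keyed by the record type digit.
ADDR_DIGITS = {'0': 4, '1': 4, '2': 6, '3': 8, '5': 4, '7': 8, '8': 6, '9': 4}


def srec_pattern(digit):
    """Build a regex for one record type with a fixed-width address group."""
    return re.compile(
        r'^(S%s)(%s{2})(%s{%d})(%s*)(%s{2})$'
        % (digit, HEX, HEX, ADDR_DIGITS[digit], HEX, HEX))


PATTERNS = {d: srec_pattern(d) for d in ADDR_DIGITS}


def split_record(line):
    """Return the non-empty fields of one S-record line as a list."""
    line = line.strip()
    pattern = PATTERNS.get(line[1:2])
    m = pattern.match(line) if pattern else None
    if m is None:
        raise ValueError('not a recognised S-record: %r' % line)
    return [g for g in m.groups() if g]


print(', '.join(split_record('S30D0003C0000F0000000000000020')))
```

Compiling one pattern per record type keeps each regex simple, at the cost of a small lookup before matching.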

Casimir et Hippolyte

I know you are thinking Python and regexes, but this is the kind of job awk was made for, and the following may help you work out how to do it using slicing:

awk '{r=length($0);print substr($0,1,2),substr($0,3,2),substr($0,5,8),substr($0,13,r-14),substr($0,r-1)}' OFS=, k60.s19

That says "get the length of the line in variable r, then print the first two characters, the next two characters, the next 8 characters and so on... and use a comma as the field separator".

EDITED

Here are a few more hints to get you started...

If you want to avoid printing line 1, you can do

awk 'FNR==1{next}  ...rest of awk script above ... ' 

If you want to only process lines longer than 40 characters, you can do

awk 'length($0)>40 {print}' yourfile

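If you only want to process lines where the second field is "xx" (for example in the comma-separated output of the command above, which needs -F, so that awk splits on commas), you can do

awk -F, '$2 ~ "xx" {print}' yourfile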
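For readers staying in Python, the awk hints above could be approximated roughly as follows. This is only a sketch: `filter_records` and its parameters are hypothetical names, it works on the raw (unsplit) lines, and the count-field test uses exact equality, whereas awk's `~` performs a regular-expression match.

```python
def filter_records(lines, skip_first=False, min_length=0, count_field=None):
    """Yield stripped lines that pass the selected filters (illustrative helper)."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if skip_first and number == 1:
            continue  # similar to awk 'FNR==1{next}'
        if len(line) <= min_length:
            continue  # similar to awk 'length($0)>N'
        if count_field is not None and line[2:4] != count_field:
            continue  # keep only records whose second field matches
        yield line


sample = [
    "S0030000FC",
    "S30D0003C0000F0000000000000020",
    "S31500000400FFFFFFFFFFFFFFFFFFFFFFFF7EF9FFFF7D",
    "S70500008EB4B8",
]
for record in filter_records(sample, skip_first=True, count_field="15"):
    print(record)
```

Because it is a generator, the same function can be fed an open file object directly instead of a list.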
Mark Setchell
  • thanks @Mark, what is that substr($0,13,r-14) for? I think that substr($0,r-1) means take the end of line+2 characters? – Mehdi Jun 13 '14 at 21:37
  • That's correct. The 13 means the 13th character from the start of the line and r-14 means to take r-14 characters to allow for the first 2, the gap, the next 2 and the block of 8 characters. – Mark Setchell Jun 13 '14 at 21:44
  • awk is a powerful command! What if you want to do replacement or expression-matching deletion: does awk work with sed? I think I have to pipe the two? For example, if I want to get rid of the first line and last line, and also lines where the middle string matches 'FD'? – Mehdi Jun 14 '14 at 00:16
  • I have added some extra hints and tips to my answer for you. Hope it helps! – Mark Setchell Jun 14 '14 at 10:15