4

I would like to modify a file containing a hex dump. There are 33 lines which contain strings like this:

0000000000000000b00b8000c7600795
0001906da451000000008fac0b000000

I would like to put two spaces after every two characters, like this:

00 00 00 00 00 00 00 00 b0 0b 80 00 c7 60 07 95

So far I've written this script, which runs, but it puts two spaces after every single character. I can't see which parameter I can pass to .join() to make it every two characters:

import os

os.rename( 'hex_dump.txt', 'hex_dump.old' )

destination = open( 'hex_dump.txt', "w" )
source = open( 'hex_dump.old', "r" )
for line in source:
    if len(line) > 2:
        destination.write("  ".join(line))
source.close()
destination.close()
Tomerikoo
Rescor

4 Answers

2

Say you have a file hex_dump.txt with the following contents:

0000000000000000b00b8000c7600795
0001906da451000000008fac0b000000

You could use str.join:

#!/usr/bin/python3.9

import os

os.rename('hex_dump.txt', 'hex_dump.old')

with open('hex_dump.txt', 'w') as dest, open('hex_dump.old', 'r') as src:
    for line in src:
        line = line.rstrip('\n')  # drop the newline so it is not treated as part of a pair
        if len(line) > 2:
            dest.write(' '.join(line[i:i + 2] for i in range(0, len(line), 2)) + '\n')

hex_dump.txt after running above:

00 00 00 00 00 00 00 00 b0 0b 80 00 c7 60 07 95
00 01 90 6d a4 51 00 00 00 00 8f ac 0b 00 00 00
Sash Sinha
  • Thanks for your answer. I had tried this before I made this post: line[i:i+2] for i in range(0, len(line), 2) – Rescor Dec 21 '21 at 20:07
1

You can chunk your string into substrings of size 2 and rejoin the resultant substrings on a space:

def chunks(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

def split_string(s):
    return " ".join(chunks(s, n=2))
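A minimal sketch of how these helpers might be applied to the lines of the dump. It assumes each line ends in a newline that should be stripped before chunking; the `space_lines` helper and the `sample` list are hypothetical names used only for illustration.

```python
def chunks(seq, n):
    """Yield successive n-sized slices of seq."""
    for i in range(0, len(seq), n):
        yield seq[i:i + n]

def split_string(s):
    """Join the 2-character chunks of s with a single space."""
    return " ".join(chunks(s, n=2))

def space_lines(lines):
    # Hypothetical helper: strip the newline first so it does not
    # end up inside the last chunk, and skip empty lines.
    return [split_string(line.rstrip("\n")) for line in lines if line.strip()]

# Sample lines taken from the question, with newlines as read from a file.
sample = [
    "0000000000000000b00b8000c7600795\n",
    "0001906da451000000008fac0b000000\n",
]

for spaced in space_lines(sample):
    print(spaced)
```

Stripping the newline before chunking avoids a trailing space at the end of each output line.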
erip
0

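You could read your file as binary input and then write each two-character pair (one byte of the dump) separated by a space into your destination file:

with open('hex_dump.old', 'rb') as f1:
    with open('hex_dump.txt', 'wb') as f2:
        for raw in f1:
            raw = raw.rstrip(b'\r\n')  # keep line endings out of the pairs
            # slice the line into two-byte pairs and join them with a space
            pairs = [raw[i:i + 2] for i in range(0, len(raw), 2)]
            f2.write(b' '.join(pairs) + b'\n')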
Felix
0

Answer:

You can use join on a list comprehension that appends a space after every character at an odd index.

"".join([e + " " if i % 2 else e for i, e in enumerate("0001906da451000000008fac0b000000")])

Replace the string "0001906da451000000008fac0b000000" with your variable line.
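As a quick check of this expression, here is a small sketch, assuming `line` holds one line of the dump without its newline. Because the last character sits at an odd index, the result ends with a trailing space, which `rstrip()` removes; the `spaced` and `trimmed` names are only illustrative.

```python
line = "0001906da451000000008fac0b000000"

# Append a space after every character at an odd index (1, 3, 5, ...).
spaced = "".join([e + " " if i % 2 else e for i, e in enumerate(line)])

# The final character (index 31) is at an odd index, so a space follows it.
trimmed = spaced.rstrip()

print(repr(spaced))
print(repr(trimmed))
```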


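Time analysis:

import time

line = "0001906da451000000008fac0b000000" * 1000000

t0 = time.perf_counter()
"".join([e + " " if i % 2 else e for i, e in enumerate(line)])
print('"".join([e + " " if i % 2 else e for i, e in enumerate(line)])     ' + str(time.perf_counter() - t0) + " s")

t0 = time.perf_counter()  # reset the timer so the second figure is not cumulative
' '.join(line[i:i + 2] for i in range(0, len(line), 2))
print('" ".join(line[i:i + 2] for i in range(0, len(line), 2))     ' + str(time.perf_counter() - t0) + " s")

The result (measured before the timer reset was added, so the second figure includes the time of the first):

"".join([e + " " if i % 2 else e for i, e in enumerate(line)])     3.010514974594116 s
" ".join(line[i:i + 2] for i in range(0, len(line), 2))     5.175166130065918 s (cumulative)

Conclusion:

Subtracting the first figure from the second, the slicing join itself took roughly 5.18 - 3.01 = 2.16 s, against about 3.01 s for my method, so on this run the slicing approach was the faster of the two.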

Vincent Bénet