Python put space every two characters

Question

I would like to modify a file containing a hex dump. There are 33 lines which contain strings like this:

0000000000000000b00b8000c7600795
0001906da451000000008fac0b000000

I would like to put two spaces after every two characters, like this:

00 00 00 00 00 00 00 00 b0 0b 80 00 c7 60 07 95

So far I've written this script, which runs, but it puts two spaces after every single character. I can't see which parameter of .join() would make it act on every two characters:

import os

os.rename( 'hex_dump.txt', 'hex_dump.old' )

destination = open( 'hex_dump.txt', "w" )
source = open( 'hex_dump.old', "r" )
for line in source:
    if len(line) > 2:
        destination.write("  ".join(line))
source.close()
destination.close()

score 2 · Accepted Answer · answered Dec 21 '21 at 12:58

Say you have a file hex_dump.txt with the following contents:

0000000000000000b00b8000c7600795
0001906da451000000008fac0b000000

You could use str.join:

#!/usr/bin/python3.9

import os

os.rename('hex_dump.txt', 'hex_dump.old')

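with open('hex_dump.txt', 'w') as dest, open('hex_dump.old', 'r') as src:
    for line in src:
        if len(line) > 2:
            # strip the newline first, otherwise it becomes a chunk of its own
            # and leaves a trailing space at the end of the line
            line = line.rstrip('\n')
            dest.write(' '.join(line[i:i + 2] for i in range(0, len(line), 2)) + '\n')

hex_dump.txt after running the script above:

00 00 00 00 00 00 00 00 b0 0b 80 00 c7 60 07 95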
00 01 90 6d a4 51 00 00 00 00 8f ac 0b 00 00 00
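
As an alternative sketch (not part of the script above): because each line is a string of hex digits, it can also be parsed with bytes.fromhex and written back with bytes.hex, which accepts a separator argument in Python 3.8 and later. The helper name space_pairs is made up for illustration. This assumes every line holds an even number of valid hex digits; otherwise bytes.fromhex raises ValueError. The output is always lower case.

```python
def space_pairs(hex_line):
    # Hypothetical helper: parse the hex digits into bytes, then
    # format them again with a space between each byte (Python 3.8+).
    return bytes.fromhex(hex_line.strip()).hex(" ")

print(space_pairs("0000000000000000b00b8000c7600795\n"))
# 00 00 00 00 00 00 00 00 b0 0b 80 00 c7 60 07 95
```

Unlike plain slicing, this also validates the input, which may or may not be wanted for a dump that can contain other text.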

Thanks for your answer. I had tried this before I made this post: line[i:i+2] for i in range(0, len(line), 2) – Rescor Dec 21 '21 at 20:07

score 1 · Answer 2 · answered Dec 21 '21 at 13:03

You can chunk your string into substrings of size 2 and rejoin the resultant substrings on a space:

def chunks(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

def split_string(s):
    return " ".join(chunks(s, n=2))
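
A usage sketch of the two helpers above, applied to one line of the dump. Stripping the newline before chunking is an assumption about how the line arrives from the file; without it the newline would end up as a chunk of its own.

```python
def chunks(lst, n):
    # Yield successive slices of length n (the last one may be shorter).
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

def split_string(s):
    return " ".join(chunks(s, n=2))

raw = "0000000000000000b00b8000c7600795\n"  # a line as read from the file
spaced = split_string(raw.rstrip("\n"))
print(spaced)
# 00 00 00 00 00 00 00 00 b0 0b 80 00 c7 60 07 95
```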

erip

score 0 · Answer 3 · answered Dec 21 '21 at 12:57

You could read your file as binary input and then write each pair of characters (one byte of the dump) separated by a space into your destination file:

with open('hex_dump.old', 'rb') as f1:
    with open('hex_dump.txt', 'wb') as f2:
        for raw_line in f1:
            data = raw_line.rstrip(b'\r\n')  # keep line breaks out of the pairs
            pairs = [data[i:i + 2] for i in range(0, len(data), 2)]
            f2.write(b' '.join(pairs) + b'\n')

score 0 · Answer 4 · Vincent Bénet · 2021-12-21T13:07:34.513

You can use join on a list comprehension that appends a space according to the parity of each index.

"".join([e + " " if i % 2 else e for i, e in enumerate("0001906da451000000008fac0b000000")])

Replace the string "0001906da451000000008fac0b000000" with your variable line.
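
A quick check of what the expression above produces for one line. Because the last character sits at an odd index, a space is appended after it as well, so the result ends with a trailing space; calling rstrip() is one possible way to drop it (an addition here, not part of the answer's expression).

```python
line = "0001906da451000000008fac0b000000"

# A space is appended after every character at an odd index (1, 3, 5, ...).
spaced = "".join([e + " " if i % 2 else e for i, e in enumerate(line)])

print(repr(spaced))           # ends with a trailing space
print(repr(spaced.rstrip()))  # trailing space removed
```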

Time analysis:

import time

line = "0001906da451000000008fac0b000000" * 1000000
t0 = time.time()
"".join([e + " " if i % 2 else e for i, e in enumerate(line)])
print('"".join([e + " " if i % 2 else e for i, e in enumerate(line)])' + str(time.time() - t0) + " s")
# t0 is not reset here, so the next figure is cumulative:
# it also includes the time taken by the first statement
' '.join(line[i:i + 2] for i in range(0, len(line), 2))
print('" ".join(line[i:i + 2] for i in range(0, len(line), 2))' + str(time.time() - t0) + " s")

The result:

"".join([e + " " if i % 2 else e for i, e in enumerate(line)])     3.010514974594116 s
" ".join(line[i:i + 2] for i in range(0, len(line), 2))     5.175166130065918 s

Conclusion:

My method is the fastest!
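
The measurement above takes both readings from the same t0, so the second figure includes the time of the first statement. A fairer comparison times each expression on its own, for example with the standard-library timeit module. The following is only a sketch: the function names by_parity and by_slices, the input size, and the repeat counts are arbitrary choices, and no result is claimed here.

```python
import timeit

# Smaller input than in the answer above; the size is an arbitrary choice.
line = "0001906da451000000008fac0b000000" * 1000

def by_parity(s):
    # Append a space after every character at an odd index.
    return "".join([e + " " if i % 2 else e for i, e in enumerate(s)])

def by_slices(s):
    # Join two-character slices with a single space.
    return " ".join(s[i:i + 2] for i in range(0, len(s), 2))

for func in (by_parity, by_slices):
    # Each function is timed separately; the best of 5 repeats is reported.
    best = min(timeit.repeat(lambda: func(line), number=20, repeat=5))
    print(f"{func.__name__}: {best:.4f} s for 20 calls")
```

The two functions differ only in that by_parity leaves a trailing space, so their outputs match after rstrip().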

edited Dec 21 '21 at 13:07

answered Dec 21 '21 at 13:00

Vincent Bénet

This is not a rigorous way to benchmark methods, so your conclusion is not a good one. Using `%timeit`, I can see that the second method is 20% faster on avg across 7 runs. – erip Dec 21 '21 at 17:31
Maybe, but it's not the point of my question. – Rescor Dec 21 '21 at 20:14
@Rescor replace in your script `if len(line) > 2: destination.write(" ".join(line))` by the script in my answer... – Vincent Bénet Dec 21 '21 at 23:12
@erip I will have a look at it, thx! – Vincent Bénet Dec 21 '21 at 23:12
