
I have a text file (*.txt) whose content is "01110011". I want to replace it pair by pair, from left to right, so that '00' ==> a, '01' ==> b, '10' ==> c, '11' ==> d and the content becomes 'bdad'. Following this post, I used the code below, but unfortunately the replacement isn't directional (I mean it doesn't go from left to right). Could you help me, please?

# Read in the file
with open('file.txt', 'r') as file :
  filedata = file.read()

# Replace the target string
filedata = filedata.replace('00', 'a')
filedata = filedata.replace('01', 'b')
filedata = filedata.replace('10', 'c')
filedata = filedata.replace('11', 'd')

# Write the file out again
with open('file.txt', 'w') as file:
  file.write(filedata)
Aaron

4 Answers


Just build a new string, substituting only the 2-character substrings that start at even indices:

repl = {
    '00': 'a',
    '01': 'b',
    '10': 'c',
    '11': 'd',
}


filedata = ''.join(repl[filedata[i:i+2]] for i in range(0, len(filedata), 2))
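
For comparison, here is a minimal self-contained sketch of the same idea next to the chained `replace` calls from the question. The function names `encode_pairs` and `chained_replace` are made up for illustration, and stripping surrounding whitespace is an assumption (a text file often ends with a newline, which would otherwise cause a `KeyError`). The input `'1001'` shows where the two approaches diverge: the chained version matches the `'00'` in the middle, which straddles two pairs.

```python
REPL = {'00': 'a', '01': 'b', '10': 'c', '11': 'd'}


def encode_pairs(bits):
    """Map each non-overlapping 2-character chunk, left to right."""
    bits = bits.strip()  # assumption: ignore a trailing newline from the file
    if len(bits) % 2:
        raise ValueError('input length must be even')
    return ''.join(REPL[bits[i:i + 2]] for i in range(0, len(bits), 2))


def chained_replace(bits):
    """The approach from the question, kept only for comparison."""
    for pair, letter in REPL.items():
        bits = bits.replace(pair, letter)
    return bits


print(encode_pairs('01110011'))  # bdad
print(encode_pairs('1001'))      # cb
print(chained_replace('1001'))   # 1a1
```

Reading the file into `filedata` and writing the result back can stay exactly as in the question.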
user2390182

Omitting the file handling, you could do this:

mystring = '01110011'
mymap = {'00': 'a', '01': 'b', '10': 'c', '11': 'd'}
newstring = ''
while len(mystring) >= 2:
    newstring += mymap[mystring[:2]]
    mystring = mystring[2:]
print(newstring)
  • This is much less efficient than @schwobaseggl's answer. – Bill Aug 06 '21 at 12:25
  • No it isn't. I timed it. List comprehensions look compact but are not necessarily more efficient than writing your own loop. If you want to contact me by chat I'd be more than happy to show you my proof –  Aug 06 '21 at 12:37
  • Are you sure? Usually string operations like `newstring += mymap[mystring[:2]]` and `mystring = mystring[2:]` cause memory to be reallocated. – Bill Aug 06 '21 at 12:40
  • As I said, I'm more than happy to show you my proof. Posting code in comments is frowned upon and my proof isn't a 'solution' so, as I said, we can do this over chat –  Aug 06 '21 at 12:42
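
The efficiency question raised in these comments can be checked directly with `timeit`. The following is only a rough sketch: the function names `with_join` and `with_slicing` and the sample size are arbitrary choices, and the numbers will vary by machine and Python version, so it makes no claim about which version wins.

```python
import timeit

MAP = {'00': 'a', '01': 'b', '10': 'c', '11': 'd'}


def with_join(s):
    # Generator expression over even indices, joined once at the end.
    return ''.join(MAP[s[i:i + 2]] for i in range(0, len(s), 2))


def with_slicing(s):
    # Loop that consumes the input two characters at a time.
    out = ''
    while len(s) >= 2:
        out += MAP[s[:2]]
        s = s[2:]
    return out


sample = '01110011' * 1000  # arbitrary test input, 8000 characters

# Both versions must agree before timing them is meaningful.
assert with_join(sample) == with_slicing(sample)

for func in (with_join, with_slicing):
    seconds = timeit.timeit(lambda: func(sample), number=20)
    print(f'{func.__name__}: {seconds:.4f} s for 20 runs')
```

Trying several input lengths is worthwhile, since the cost of re-slicing the remaining string grows with its length.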

This would help:

with open('file.txt', 'r') as file :
  filedata = file.read()

new_string = ""
for i in range(0, len(filedata), 2):
    pair = filedata[i:i+2]  # current 2-character chunk
    if pair == "00":
        new_string += "a"
    elif pair == "01":
        new_string += "b"
    elif pair == "10":
        new_string += "c"
    elif pair == "11":
        new_string += "d"

print(new_string)

Here's a NumPy solution.

import numpy as np

# Read in the file as raw bytes (binary mode is the safe choice for np.fromfile)
with open('file.txt', 'rb') as file:
    byte_array = np.fromfile(file, dtype='uint8').reshape(-1, 2)

result = np.full((byte_array.shape[0], 1), b' ')
result[np.all(byte_array == (48, 48), axis=1)] = b'a'
result[np.all(byte_array == (48, 49), axis=1)] = b'b'
result[np.all(byte_array == (49, 48), axis=1)] = b'c'
result[np.all(byte_array == (49, 49), axis=1)] = b'd'

# Write the file out again
with open('file.txt', 'w') as file:
    file.write(result.tobytes().decode('utf-8'))

For more details on this method, see this answer.

Or, more conveniently, you could replace the middle section with:

repl = {
    (48, 48): b'a',
    (48, 49): b'b',
    (49, 48): b'c',
    (49, 49): b'd',
}
result = np.full((byte_array.shape[0], 1), b' ')
for k, v in repl.items():
    result[np.all(byte_array == k, axis=1)] = v
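
As a possible variation (a different technique from the masking above, not part of the original method), each pair can be turned into an index 0..3 and looked up in a small table. This is a sketch that works on an in-memory bytes object instead of a file; the function name `encode_bytes` is hypothetical, and it assumes the input contains only the ASCII characters '0' and '1' and has even length.

```python
import numpy as np


def encode_bytes(raw):
    """Encode ASCII '0'/'1' pairs as letters via an index lookup table."""
    # One row per pair; 48 is the ASCII code of '0', so bits holds 0s and 1s.
    bits = np.frombuffer(raw, dtype=np.uint8).reshape(-1, 2) - 48
    # Interpret each row as a 2-bit number: 00 -> 0, 01 -> 1, 10 -> 2, 11 -> 3.
    index = bits[:, 0] * 2 + bits[:, 1]
    # Position k of the table holds the letter for the pair with value k.
    table = np.frombuffer(b'abcd', dtype=np.uint8)
    return table[index].tobytes().decode('ascii')


print(encode_bytes(b'01110011'))  # bdad
```

This avoids one boolean mask per mapping entry, at the cost of relying on the input being clean.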
Bill