How to remove some bytes from a byte string?

Question

I am trying to remove two bytes (\x00\x81) from a byte string sByte.

sByte = b'\x00\x81308 921 q53 246 133 137 022 1   0 1 1  1 130 C13 330 0000000199 04002201\n'

I am expecting to have as a result the following:

sByte = b'308 921 q53 246 133 137 022 1   0 1 1  1 130 C13 330 0000000199 04002201\n'

I have tried the following:

I tried to decode sByte. After running the line of code below,
```
sByte.decode('utf-8')
```
I received a traceback: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 1: invalid start byte.
I also tried the following, but it did not work:
```
sByte.replace('\x00\x81', '')
```
I also found this question:
"json - UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 3131: invalid start byte", but it did not help me remove \x00\x81.

I am not sure how to remove or replace bytes in a byte string.

score 3 · Answer 1 · answered Aug 03 '22 at 18:21

>>> sByte = b'\x00\x81308 921 q53 246 133 137 022 1   0 1 1  1 130 C13 330 0000000199 04002201\n'
>>> sByte[2:]
b'308 921 q53 246 133 137 022 1   0 1 1  1 130 C13 330 0000000199 04002201\n'

See also https://appdividend.com/2022/07/09/python-slice-notation/

The slice sByte[2:] returns everything from the third byte (index 2) to the end, so it drops the first two bytes regardless of their values.

If you want to store the result back in the same variable, you can do this:

>>> sByte = b'\x00\x81308 921 q53 246 133 137 022 1   0 1 1  1 130 C13 330 0000000199 04002201\n'
>>> sByte = sByte[2:]
>>> sByte
b'308 921 q53 246 133 137 022 1   0 1 1  1 130 C13 330 0000000199 04002201\n'
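
Because a fixed slice drops the first two bytes whatever they are, a guarded variant may be safer when some inputs arrive without the marker. The following is only a sketch: the helper name strip_prefix and the shortened sample data are made up for illustration.

```python
PREFIX = b'\x00\x81'  # the two bytes from the question


def strip_prefix(data: bytes, prefix: bytes = PREFIX) -> bytes:
    """Return data without the leading prefix; leave it untouched otherwise."""
    if data.startswith(prefix):
        return data[len(prefix):]
    return data


print(strip_prefix(b'\x00\x81308 921 q53\n'))  # prefix present: it is sliced off
print(strip_prefix(b'308 921 q53\n'))          # prefix absent: returned unchanged
```

Slicing by len(prefix) instead of a literal 2 keeps the helper correct if the marker length ever changes.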

score 2 · Accepted Answer · answered Aug 03 '22 at 18:20

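bytes.replace doesn't work in place; it returns a modified copy of the bytes object, and both of its arguments must be bytes, not str. You can use sByte = sByte.replace(b'\x00\x81', b'') (or bytes.removeprefix, available since Python 3.9, if the bytes always occur at the start). Depending on your circumstances, you can also set the errors parameter of the decode method to 'ignore': sByte = sByte.decode(encoding='utf-8', errors='ignore'). Note that this returns a str rather than bytes and drops only the invalid 0x81 byte; 0x00 is valid UTF-8 and stays in the result.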
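
As a rough side-by-side of the options mentioned here, the sketch below runs each one on a shortened copy of the data from the question. The variable names cleaned, prefix_removed and decoded are illustrative only, and bytes.removeprefix assumes Python 3.9 or later.

```python
sByte = b'\x00\x81308 921 q53 246\n'  # shortened sample from the question

# Passing str arguments to bytes.replace raises TypeError,
# which is why the attempt in the question did not work.
try:
    sByte.replace('\x00\x81', '')
except TypeError as exc:
    print('TypeError:', exc)

# bytes arguments work; the result must be assigned, since bytes are immutable.
cleaned = sByte.replace(b'\x00\x81', b'')

# Only strips the marker when it sits at the very start (Python 3.9+).
prefix_removed = sByte.removeprefix(b'\x00\x81')

# errors='ignore' skips the invalid 0x81 byte but keeps the valid NUL character,
# and the result is a str, not bytes.
decoded = sByte.decode('utf-8', errors='ignore')

print(cleaned)
print(prefix_removed)
print(repr(decoded))
```

replace removes every occurrence anywhere in the data, while removeprefix touches only the start, so the two can differ on inputs where the same two bytes appear again later.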

isaactfa

Just as an "addition" to this answer, if OP wants to have a more "inclusive" substitute, they can always check out `re.sub`. For this specific answer it would be: `re.sub(b"\x00\x81", b'', sByte)` – Michael S. Aug 03 '22 at 18:33
@MichaelS. How does `re` behave differently in this case? – isaactfa Aug 03 '22 at 18:36
In this case, it does not, and `replace` is the better method. But if OP has other bytes that they want to remove with similar structures, then creating one `re.sub` that matches the format of all those other bytes could save them from doing a unique `replace` for each different byte. Again, in this case, your method is better. Just wanted OP to know about other options – Michael S. Aug 03 '22 at 18:40
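
To illustrate the point about one pattern covering several similar markers, here is a minimal sketch. It assumes, purely hypothetically, that the unwanted prefix is always a NUL byte followed by one byte in the range 0x80-0xff; the question only shows \x00\x81, so the range and the name MARKER are assumptions.

```python
import re

# Hypothetical family of markers: NUL followed by any single high byte,
# anchored to the start of the data.
MARKER = re.compile(rb'^\x00[\x80-\xff]')

samples = [
    b'\x00\x81308 921 q53\n',  # the marker from the question
    b'\x00\x90308 921 q53\n',  # a made-up sibling marker
    b'308 921 q53\n',          # no marker at all
]

for raw in samples:
    print(MARKER.sub(b'', raw))
```

Both the pattern and the input have to be bytes; mixing a str pattern with bytes data raises TypeError.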
