So, I have a string that looks like this:

'\xfa\x7f\xe5<\xcf\xda\xf0v\xf1\x94\xd8q7w,\x1d\x1d\x9eU\xf6O\x9fnW~\xf0\xdd\xad\xd2\xf7\x13\nA\xbb\xda\xa2rge\xb02_3\xcb\x81\x03\xcb \x1c\x86\xbb\n\x04r\xcdCKQ\x9ew\xe7\xf1{\x08'

and I want to turn it into the equivalent bytes object, with the same values byte for byte:

b'\xfa\x7f\xe5<\xcf\xda\xf0v\xf1\x94\xd8q7w,\x1d\x1d\x9eU\xf6O\x9fnW~\xf0\xdd\xad\xd2\xf7\x13\nA\xbb\xda\xa2rge\xb02_3\xcb\x81\x03\xcb \x1c\x86\xbb\n\x04r\xcdCKQ\x9ew\xe7\xf1{\x08'

but if I try to print the original string I get:

úå<ÏÚðvñ”Øq7w,žUöOŸnW~ðÝ­Ò÷
A»Ú¢rge°2_3ËË †»
rÍCKQžwçñ{

I can't seem to find a conversion that keeps the values of the original string unaltered.

Rararat
  • Strings are sequences of Unicode code points, while bytes is effectively a list of raw numbers from 0-255 (despite what Python's `repr` might suggest); they are two *very* different things. Converting between them requires a codec, and utf-8 is typically the one to use, but because the two types are fundamentally different you should not expect them to "look the same". You will need to specify exactly what you mean by "same" and "unaltered". – metatoaster Nov 25 '22 at 01:44
  • @metatoaster I see. Thanks! I was hoping for a way to receive the above string and use it in another function, but I need it as bytes. Encoding doesn't seem to work as some parts of the string can't be decoded. – Rararat Nov 25 '22 at 01:47
  • For that, you typically want to `encode` the `str` into `bytes` using the `utf-8` codec, as that's the most common method. The exception is when the `str` contains code points that each stand for the byte at the same index in an extended ASCII set; in that case you need the corresponding codec, most commonly `latin-1` (ISO-8859-1). Without knowing the exact values/APIs you are using, I am afraid this is all I can advise. – metatoaster Nov 25 '22 at 01:50
  • @Dash Thanks! I'll try using those methods. – Rararat Nov 25 '22 at 01:53
  • @metatoaster Seems latin1 works for now. I'm doing something for a cryptography class, so the objects are technically keys, but since I should try to implement this on a webpage, I wanted to convert the keys/encryptions that will probably be received as strings to the bytes objects they represent – Rararat Nov 25 '22 at 01:59
  • The `\x` way of encoding things introduces additional ambiguity (are they literal backslashes, or have they already been decoded into the corresponding characters? Who knows!). Hence, for cryptography purposes, most keys are provided as a base64-encoded string, which is encoded into bytes if it isn't already (any codec will work, as all code points used in base64 are standard ASCII) and then decoded into binary form (i.e. `bytes`) for the intended usage. Alternatively, read a hexadecimal string (e.g. `fa7fe53ccf...`): just use `bytes.fromhex`, e.g. `bytes.fromhex('fa7fe53ccf')`. – metatoaster Nov 25 '22 at 02:07
  • @metatoaster oh, I see. Then, should I convert to hex? or should I use base64? I just need one that will be able to convert any bytes Edit: I used hex, works like a charm! Thanks! – Rararat Nov 25 '22 at 02:09
  • Either will do; you are the author of this program. If you are ambitious, you can support both (provide a way for the user to toggle between the two modes). For future reference: it is definitely useful to include what you are really trying to do in the question, as that avoids this additional back and forth. Also include what you've tried but failed to get working, as background on where you got stuck. – metatoaster Nov 25 '22 at 02:14
  • @metatoaster Oh, sorry. Last time trying to explain what I was trying to do just got me comments of "That's not how it is made", but it was how I needed to do it as it was for an assignment, so I tried to explain explicitly what I needed to do – Rararat Nov 25 '22 at 02:28
  • Hex gives you 4 bits per character while base64 gives you 6 bits per character, so base64 will be more efficient if you have a choice. – Mark Ransom Nov 25 '22 at 02:40
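
The `latin-1` suggestion from the comments can be sketched as follows. This is a minimal illustration, assuming every character in the string has a code point below 256; the helper name `str_to_bytes` is hypothetical, and the sample is only a shortened prefix of the string in the question.

```python
def str_to_bytes(text: str) -> bytes:
    """Map each code point in the range 0-255 to the byte with the same value."""
    # latin-1 (ISO-8859-1) maps code points 0-255 one-to-one onto byte values.
    # A code point above 255 raises UnicodeEncodeError instead of being altered.
    return text.encode('latin-1')


# Shortened prefix of the string from the question (assumption: same pattern holds).
text = '\xfa\x7f\xe5<\xcf\xda\xf0v'
raw = str_to_bytes(text)

print(raw == b'\xfa\x7f\xe5<\xcf\xda\xf0v')  # True
print(raw.decode('latin-1') == text)          # True: the round trip is lossless

# utf-8 uses two bytes for each code point from 128 to 255,
# so it does not produce the byte-for-byte result wanted here.
print(text.encode('utf-8') == raw)            # False
```

This only works while the string really holds one character per byte; a string containing something like `'€'` would fail loudly rather than silently produce wrong bytes.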

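
For reference, the hex and base64 approaches discussed in the comments could look roughly like this. It is a sketch rather than a recommended key-handling scheme; `key` is a made-up 32-byte stand-in, not a real key, and the variable names are illustrative.

```python
import base64

# Stand-in for a 32-byte key (assumption: real keys would come from elsewhere).
key = bytes(range(32))

# bytes -> text that is safe to pass around as a plain string
hex_text = key.hex()
b64_text = base64.b64encode(key).decode('ascii')

# text -> bytes, recovering exactly the original values
print(bytes.fromhex(hex_text) == key)      # True
print(base64.b64decode(b64_text) == key)   # True

# Hex carries 4 bits per character, base64 carries 6, so base64 is shorter.
print(len(hex_text), len(b64_text))        # 64 44

# The example from the comments:
print(bytes.fromhex('fa7fe53ccf'))         # b'\xfa\x7f\xe5<\xcf'
```

Both encodings use only ASCII characters, which avoids the ambiguity of passing `\x` escapes around as text.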