I received the following data from another device:

foo = { "abc": "b'E3:DE'" }

I know the "b" prefix marks a bytes literal in Python 3. I want to convert the value into a plain string. My Python version (Python 2) reports it as the unicode type. I tried several approaches and none worked: the "b" is always there, and it is even treated as an ordinary character that upper() turns into "B".

foo = xxx.get("abc")
logger.info("1 foo type {0} against {1} isinstance(foo, unicode) {2}".format(type(foo), type(b''), isinstance(foo, unicode)))
logger.info("2 before anything {0}".format(foo))
foo1 = foo.encode("utf-8")
logger.info("3 after encode foo1 {0} type {1} upper {2}".format(foo1, type(foo1), foo1.upper()))
bar = foo.decode("utf-8")
logger.info("4 after decode bar {0} type {1} upper {2}".format(bar, type(bar), bar.upper()))

Output:

INFO|1 foo type <type 'unicode'> against <type 'str'> isinstance(foo, unicode) True
INFO|2 before anything b'E3:DE'
INFO|3 after encode foo b'E3:DE' type <type 'str'> upper B'E3:DE'
INFO|4 after decode foo b'E3:DE' type <type 'unicode'> upper B'E3:DE'

Is there a built-in function that converts this unicode value with the "b" prefix into a string without it? Or do I have to take a substring to get rid of it?

gre_gor
nathan
  • Hi, I tried the methods in the link, but cannot solve the issue. – nathan Oct 07 '22 at 03:43
  • Do you actually have `{"abc": "b'E3:DE'"}`, or do you have `{"abc": b"E3:DE"}`... There's a big difference. – BeRT2me Oct 07 '22 at 03:44
  • @BeRT2me I have the first one... – nathan Oct 07 '22 at 03:45
  • If that's true, then `b` means absolutely nothing, you just have a string that starts with `b` and has extra `'` in it as well. Take the substring. – BeRT2me Oct 07 '22 at 03:46
  • Either the "other device" is improperly handling bytes (if it also runs Python) or you are improperly reading from it. – gre_gor Oct 07 '22 at 03:53
  • @BeRT2me I see. May I ask why the type is "unicode" after my get operation. To substring, I have to convert unicode into string, lol – nathan Oct 07 '22 at 03:53
  • @gre_gor other fields are good. Eg "oxxx": "67ceac", I suspect the other device improperly handles this particular one. – nathan Oct 07 '22 at 03:57
  • Looks like the other device is converting bytes to string with `str(b)` instead of `b.decode()`. – gre_gor Oct 07 '22 at 04:09
  • The proper fix would be to fix the code on the other device. – gre_gor Oct 07 '22 at 04:14

1 Answer

You most likely have a string that literally starts with the character "b", not a bytes literal. Judging from the log output, the value is of type unicode, so I think this reproduces your situation:

x = u"b'E3DE'"
x
#u"b'E3DE'"
type(x)
#<type 'unicode'>

Since the "b" is a literal character, you need to extract the substring between b' and the closing '. One way to do this is with a regular expression:

import re
r = re.search(r"b'([^']*)'", x)
r.group(1)
#u'E3DE'

If you need a byte string (the str type in Python 2) rather than unicode, call the encode method on the result:

s = r.group(1).encode()
s
#'E3DE'
type(s)
#<type 'str'>
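
As an alternative to the regular expression, the standard library's `ast.literal_eval` can parse the text as a bytes literal. The following is only a sketch: the helper name `unwrap_bytes_repr` is hypothetical, and it assumes the payload is valid UTF-8. Values that do not look like a bytes repr, such as the other fields from the device, are returned unchanged.

```python
import ast


def unwrap_bytes_repr(text):
    """Hypothetical helper: turn text like "b'E3:DE'" into "E3:DE".

    Anything that is not a well-formed bytes literal is returned as-is.
    """
    # Only attempt parsing when the text looks like a bytes repr.
    if not text.startswith(("b'", 'b"')):
        return text
    try:
        value = ast.literal_eval(text)
        if isinstance(value, bytes):
            # Assumption: the payload is UTF-8 (ASCII is a subset of it).
            return value.decode("utf-8")
    except (ValueError, SyntaxError):
        # Malformed literal or undecodable bytes
        # (UnicodeDecodeError is a subclass of ValueError).
        pass
    return text


print(unwrap_bytes_repr("b'E3:DE'"))  # E3:DE
print(unwrap_bytes_repr("67ceac"))    # 67ceac
```

Compared with the regex, this also undoes escape sequences inside the literal, but it remains a workaround for data that was serialized incorrectly.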
Kota Mori