I am using a Raspberry Pi to extract information from SQL Server tables, and I am having trouble decoding the following Unicode data in Python: u'\U00300032\U00360031\U0030002d\U002d0039\U00310032'. It is a date value. When I assign this value to a variable I get this error:

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-9: end of string in escape sequence

I have read a lot about this topic but haven't found anything useful. What I want is to convert it to a string type. Am I missing something here?

David
  • "when I'm trying to assign to a variable". You mean you're trying to assign this literal string to a variable? Can you show us your code? – Jean-François Fabre Dec 01 '16 at 18:09
  • This answer seems to be a possible solution [“Unicode Error ”unicodeescape" codec can't decode bytes…](http://stackoverflow.com/a/1347854/1248974), replacing the `u'` with `r'` – chickity china chinese chicken Dec 01 '16 at 20:10
  • Jean, it's something similar. What I'm doing is executing a query from Python, and I receive a list of all the data I requested. One item is a date, which Python represents as in the example **u'\U00300032\U00360031\U0030002d\U002d0039\U00310032'**, and when I literally do this: **date_sql = sql_list[3], where sql_list[3] is u'\U00300032\U00360031\U0030002d\U002d0039\U00310032'**, I get the error. downshift, I think that is a good answer; the problem is that I cannot manipulate the data because the error appears first, or at least I don't know how to do it at the moment. Thanks!! – Roberto Dec 04 '16 at 03:54

1 Answer

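u'\U0010FFFF' is the largest Unicode code point, and every escape in your string (for example \U00300032) is above it, so the string cannot be written as a literal. Your representation implies the data was decoded as UTF-32LE, but Python (2.7.12, at least) would give: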

UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-3: code point not in range(0x110000)

If the values in your representation are written out as UTF-32LE bytes and those bytes are then decoded correctly as UTF-16LE, you get your date data:

>>> data = '\x32\x00\x30\x00\x31\x00\x36\x00\x2d\x00\x30\x00\x39\x00\x2d\x00\x32\x00\x31\x00'
>>> data.decode('utf-16le')
u'2016-09-21'
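
The same reinterpretation can be sketched without typing the bytes by hand. This is only an illustration, assuming the five values really are the ones shown in the question's escapes; `bogus_code_points` and `recover_text` are made-up names, not part of any database driver. It packs each value as a 4-byte little-endian integer (which is what UTF-32LE would have stored) and decodes the resulting bytes as UTF-16LE:

```python
import struct

# The five values from the \UXXXXXXXX escapes in the question, written as
# plain integers because each one is above 0x10FFFF and so cannot appear
# in a unicode literal.
bogus_code_points = [0x00300032, 0x00360031, 0x0030002D, 0x002D0039, 0x00310032]


def recover_text(code_points):
    """Hypothetical helper: rebuild the raw bytes and decode them as UTF-16LE."""
    # '<I' packs an unsigned 32-bit integer in little-endian byte order,
    # matching how UTF-32LE lays out a single code point.
    raw = b''.join(struct.pack('<I', cp) for cp in code_points)
    return raw.decode('utf-16le')


date_text = recover_text(bogus_code_points)
print(date_text)  # 2016-09-21
```

This only shows why the bytes make sense as UTF-16LE. If the driver itself is choosing the wrong encoding, the better fix is probably to get the raw bytes (or the right encoding setting) at the source rather than repairing the text afterwards.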
Mark Tolonen