I have a Python 2 application that logs via the structlog library, and downstream the logs are captured and extracted using key/value syntax. However, the extraction fails when unicode strings are involved: the u prefix is prepended to unicode strings, which breaks the parser.

Is it possible to configure KeyValueRenderer to omit the u prefix?

import structlog
structlog.configure(processors=[structlog.processors.KeyValueRenderer()])
l = structlog.get_logger()
l.error('I am ASCII')
l.error(u'I am Unicode')

Result:

event='I am ASCII'
event=u'I am Unicode'

Desired:

event='I am ASCII'
event='I am Unicode'

I know there are existing questions about changing Python's global printing behavior for Unicode strings, but I only want to change how structlog prints them.

J. Doe
  • How do you want strings with non-ASCII characters to appear? – Ignacio Vazquez-Abrams Aug 31 '17 at 21:54
  • Possible duplicate of [Suppress the u'prefix indicating unicode' in python strings](https://stackoverflow.com/questions/761361/suppress-the-uprefix-indicating-unicode-in-python-strings) – Mangohero1 Aug 31 '17 at 22:03
  • It seems likely that doing this will just break the parser differently, because it'll reconstruct the wrong string. You may want to consider instead enhancing the parser. – user2357112 Aug 31 '17 at 22:07
  • @IgnacioVazquez-Abrams in this case, the strings are all ASCII, but the API we're getting them from defaults everything to unicode strings. @user2357112 I don't see how removing the u before the string is printed will cause us issues, since we're already correctly parsing quoted strings without the u'' now. – J. Doe Aug 31 '17 at 22:40
  • `u'\u1000'` represents a very different string from `'\u1000'` (in Python 2 notation). – user2357112 Sep 01 '17 at 05:30
  • @mangoHero1 the difference here is that I'm looking specifically for a change to structlog's logging - I'm not looking for a global change to the unicode printing behavior. I've tried pulling levers like repr_native_str=True but that didn't get at what I needed – J. Doe Sep 01 '17 at 15:58
  • The docs for [KeyValueRenderer](https://structlog.readthedocs.io/en/stable/api.html#structlog.processors.KeyValueRenderer) show there is an argument for this: `repr_native_str=False`. – ekhumoro Sep 09 '17 at 16:11

1 Answer

This is what structlog.processors.UnicodeEncoder is for. It takes unicode strings and encodes them to byte strings.

Once you move to Python 3, you want structlog.processors.UnicodeDecoder instead: it decodes byte strings to unicode strings, which prevents b prefixes.

hynek