I created a module to be used as a simple container for configuration settings. Then I read somewhere about security considerations and converted the whole thing to a yml file, which is read using the Python yaml module, like this:

import yaml


def read_config_yml(filename):
    """Load a YAML config file and return its contents."""
    try:
        with open(filename, encoding='utf-8') as f:
            # safe_load builds only plain Python objects
            return yaml.safe_load(f)
    except Exception as e:
        msg = f'Missing or invalid configuration file\n({e})'
        raise RuntimeError(msg) from e

All is fine, except that I have a lot of values which hold unicode symbols, e.g.:

categories:
    a:
        color: limegreen
        icon: '\u25a3'
    b:
        color: coral
        icon: '\u25e9'
    c:
        color: limegreen
        icon: '\u25b2'

This worked fine when the values were defined in pure Python (3.9.6), but now every Unicode value comes back with an escaped backslash, such as "\\u25e9", which breaks the rest of the existing code.

I tried different combinations in the yml file ("\u25b2", '\u25b2', 'u25b2', '\\u25b2') but I simply can't get the string I need.

Why is YAML doing that? Is there a setting that can bypass this behavior? And will I have the same problem if I switch to JSON?

Cirrocumulus
  • Are you on Python 2.x or 3.x? – BlackMath Aug 12 '21 at 09:52
  • Sorry :) I'm on 3.9.6. – Cirrocumulus Aug 12 '21 at 10:00
  • YAML is Unicode already. If you want to use symbolic character sequences like `\u1234` you will need to decode them in Python. – tripleee Aug 12 '21 at 10:19
  • ...or you use JSON. JSON has no security issues of its own, and it supports both escape sequences and literal Unicode characters. – Tomalak Aug 12 '21 at 10:23
  • I didn't think of using Unicode directly, because I am not used to seeing stuff like `icon: ▣` in computer code... It looks out of place to me. Maybe it's considered common now? The above works with my existing code, BTW. – Cirrocumulus Aug 12 '21 at 10:30
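
For reference, the two suggestions in the comments (decoding the escape sequences in Python, or switching to JSON) might look roughly like the sketch below. It uses only the standard library. The helper names `decode_u_escapes` and `decode_config` are made up for illustration, and the `loaded` dict is a hand-written stand-in for what the question reports getting back from the yml file, not actual loader output.

```python
import json
import re

# Matches a literal backslash, 'u', and four hex digits (the six characters \u25a3)
_U_ESCAPE = re.compile(r'\\u([0-9a-fA-F]{4})')


def decode_u_escapes(text):
    """Replace literal \\uXXXX sequences with the characters they name (hypothetical helper)."""
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def decode_config(node):
    """Walk nested dicts and lists and decode every string value (hypothetical helper)."""
    if isinstance(node, dict):
        return {key: decode_config(value) for key, value in node.items()}
    if isinstance(node, list):
        return [decode_config(item) for item in node]
    if isinstance(node, str):
        return decode_u_escapes(node)
    return node


# Assumption: stand-in for the loaded config, with the icon as six literal characters
loaded = {'categories': {'a': {'color': 'limegreen', 'icon': r'\u25a3'}}}
config = decode_config(loaded)

# JSON decodes \uXXXX escapes inside strings by itself
from_json = json.loads(r'{"icon": "\u25a3"}')
```

After this, `config['categories']['a']['icon']` is a single character rather than six. The regex only handles four-digit `\uXXXX` escapes in the Basic Multilingual Plane; surrogate pairs and `\UXXXXXXXX` forms would need extra handling.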
