TL;DR:
I want to transform a string (representing a regex) like "\\." into "\." in a clean and resilient way (something akin to sed 's/\\\\/\\/g', though I don't know whether that breaks on edge cases).
val.decode('string-escape') is not an option since I'm using Python 3.
What I tried so far:
- variations of val.replace('\\\\', '\\')
- looked at the answers to these two questions, but couldn't get them to work in my case
- variations of val.encode().decode('unicode-escape')
- variations of
- had a look at the docs for strings but couldn't find a solution
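As a minimal sketch of the transformation described above: assuming the value really contains two backslash characters in a row, a plain `str.replace` collapses each pair into one. The sample value here is made up for illustration and is not taken from the actual YAML file.

```python
import re

# Hypothetical sample value: two literal backslashes followed by a dot.
val = r"\\."

# Collapse each pair of backslashes into a single backslash.
unescaped = val.replace("\\\\", "\\")

print(len(val), len(unescaped))  # lengths before and after the replacement
print(unescaped)                 # the unescaped pattern as plain text

# The result compiles to a regex that matches a literal dot only.
pattern = re.compile(unescaped)
```

Note that this only changes anything if the string actually holds doubled backslashes; on a string with single backslashes the replacement is a no-op.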
I am sure that I missed a relevant part, because string escaping (and unescaping) seems like a fairly common and basic problem, but I haven't found a solution yet =/
Full Story:
I have a YAML file like so

    - !Scheme
      barcode: _([ACGTacgt]+)[_.]
      lane: _L(\d\d\d)[_.]
      name: RKI
      read: _R(\d)+[_.]
      sample_name: ^(.+)(?:_.+){5}
      set: _S(\d+)[_.]
      user: _U([a-zA-Z0-9\-]+)[_.]
      validation: .*/(?:[a-zA-Z0-9\-]+_)+(?:[a-zA-Z0-9])+\.fastq.*
    ...

that describes a "Scheme" object. The 'name' key is an identifier and the remaining keys hold regexes.
I want to be able to parse an object from that YAML, so I wrote a from_yaml class method:

    scheme = Scheme()
    # Load the YAML node as a dictionary.
    # WARNING: the strings come back with backslashes escaped.
    loaded_mapping = loader.construct_mapping(node)
    # re.compile every value except 'name', which is stored as a regular string;
    # escaped sequences (like '\') should be unescaped in the process.
    for key, val in loaded_mapping.items():
        if key == 'name':
            processed_val = val
        else:
            processed_val = re.compile(val)  # backslashes in val are escaped
        scheme.__dict__[key] = processed_val

The problem is that loader.construct_mapping(node) loads the strings with backslashes escaped, so the regex is not correct anymore.
I tried several variations of val.encode().decode('unicode-escape') and val.replace('\\\\', '\\'), but had no luck with either.
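One way to check whether a loaded value truly contains doubled backslashes, or whether they only look doubled in its `repr` (which is what a REPL or debugger displays), is to compare the `repr` with the actual character count. This is only a diagnostic sketch: the sample string is the 'lane' pattern from the YAML above, written as a raw literal on the assumption that the loader returns it verbatim, and the file name is made up.

```python
import re

# Assumed sample: the 'lane' pattern as it would look if loaded verbatim.
val = r"_L(\d\d\d)[_.]"

shown = repr(val)          # backslashes appear doubled in the repr
actual = val.count("\\")   # number of backslash characters really in the string

print(shown)
print(val)
print(actual)

# If the string holds single backslashes, it compiles to the intended regex as-is.
match = re.compile(val).search("sample_L001_R1.fastq")  # hypothetical file name
lane = match.group(1)
```

If the count matches the number of backslashes in the YAML file, the string needs no unescaping before it is passed to `re.compile`.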
If anyone has an idea how to handle this I'd appreciate it very much! I am not married to this specific way of doing things and open to alternative approaches.
Kind Regards!