Let's suppose I have a pattern defined like this:

import re
pattern = re.compile(r'http\://(?P<host>\d+\.\d+\.\d+\.\d+)\:(?P<port>\d+)')

This pattern lets me parse a string and extract the host and port parameters. But what if I need the reverse action: building a string from this pattern and given host and port values? Is there an elegant way to do it, or do I have to define a separate format string for this purpose?

import re
pattern = re.compile(r'http\://(?P<host>\d+\.\d+\.\d+\.\d+)\:(?P<port>\d+)')

values = {'host': '192.168.1.5', 'port': 80}
s = some_elegant_function(pattern, values)  # <--- What is here???

print(s)  # http://192.168.1.5:80 - expected result

I've had a look at the question: Generate a String that matches a RegEx in Python

It does not answer my question, because it is about generating a set of matching strings, without passing in the parameter values that I need in my case.

Fomalhaut
  • Well, you could build an RE that replaces each RE capturing group with the appropriate format string placeholder. – MisterMiyagi Oct 26 '20 at 07:57
  • There are a lot of tools & libraries mentioned here: https://stackoverflow.com/questions/22115/using-regex-to-generate-strings-rather-than-match-them But none for Python. Maybe you could take inspiration from some of them. – rdas Oct 26 '20 at 07:58
  • What is `some_elegant_function` supposed to do for regexes where alternatives (`|`) or various repetition counts (`+`, `*`, `{`) are involved? What can make more sense is going the other way: using a format string and transforming it into a regex - you'll have to escape everything and replace the placeholders with capture groups. – Matteo Italia Oct 26 '20 at 07:59
  • This is feasible if you only use `%s`; otherwise even that can be nontrivial (say your format string contains a `%f`—now you can't have `%` replace it with a string, and you'd have to replace it with the full grammar for a fp value) to downright impossible, especially if you can pass in arbitrary objects with arbitrary `__str__` or `__repr__`; in this case, it's entirely possible that the result cannot be reliably parsed back (or at least not in the same way). Personally, I wouldn't bother, or I would start from a common, easier format string to generate both the parse and format code. – Matteo Italia Oct 26 '20 at 08:14
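
The two directions suggested in the comments can be sketched roughly as follows. This is only an illustration under strong assumptions, not a general solution: the regex is assumed to be a flat sequence of escaped literals and named groups, with no nested parentheses, alternation, repetition outside the groups, or literal braces. The helper names `regex_to_template`, `build_string`, and `template_to_regex` are made up for this example and do not exist in any library.

```python
import re
from string import Formatter

# Assumption: named groups contain no nested parentheses.
_NAMED_GROUP = re.compile(r'\(\?P<(\w+)>[^()]*\)')


def regex_to_template(pattern):
    """Turn a simple regex into a str.format template (hypothetical helper)."""
    source = getattr(pattern, 'pattern', pattern)  # accept compiled or str
    template = _NAMED_GROUP.sub(r'{\1}', source)   # (?P<host>...) -> {host}
    return re.sub(r'\\(.)', r'\1', template)       # "\:" -> ":"


def build_string(pattern, values):
    """Format the values, then check the result against the compiled pattern."""
    result = regex_to_template(pattern).format(**values)
    if not pattern.fullmatch(result):
        raise ValueError(f'{result!r} does not match {pattern.pattern!r}')
    return result


def template_to_regex(template, field_patterns):
    """Opposite direction: build a regex from a format string.

    field_patterns maps each placeholder name to the sub-regex it should
    match; format specs and conversions in the template are ignored.
    """
    parts = []
    for literal, field, _spec, _conv in Formatter().parse(template):
        parts.append(re.escape(literal))           # escape the literal text
        if field is not None:
            parts.append(f'(?P<{field}>{field_patterns[field]})')
    return re.compile(''.join(parts))


pattern = re.compile(r'http\://(?P<host>\d+\.\d+\.\d+\.\d+)\:(?P<port>\d+)')
print(build_string(pattern, {'host': '192.168.1.5', 'port': 80}))
# http://192.168.1.5:80

regex = template_to_regex(
    'http://{host}:{port}',
    {'host': r'\d+\.\d+\.\d+\.\d+', 'port': r'\d+'},
)
print(regex.fullmatch('http://192.168.1.5:80').groupdict())
# {'host': '192.168.1.5', 'port': '80'}
```

The `fullmatch` check in `build_string` is there to catch values that the original pattern would not parse back, which is the round-trip concern raised in the last comment. Starting from the format string (`template_to_regex`) avoids having to interpret regex syntax at all, at the cost of supplying a sub-pattern for every field.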

0 Answers