Given a template/formatting string "{foo}_{bar}", how can I programmatically extract the required formatting keys ["foo", "bar"]?

I have a list of dicts holding the parameters and format string for various experiments:

[
    {"parameters": {"foo": 1, "bar": 2}, "format": "{foo}_{bar}"},
    {"parameters": {"biz": 3}, "format": "{biz}_{baz}"}
]

As you can see, the second parameter set is missing key baz. So when I do something like

"{biz}_{baz}".format(**parameters), it raises a KeyError, because baz is missing.

I want to replace all missing parameters with "NR" and fill in all available parameters with their values.

The desired output is:

[
    {"parameters": {"foo": 1, "bar": 2}, "format": "{foo}_{bar}", "formatted": "1_2"},
    {"parameters": {"biz": 3}, "format": "{biz}_{baz}", "formatted": "3_NR"}
]

For context: I have 100+ strings, with no consistency between the expected parameters required for that string.

  • @jonrsharpe It feels to me that this is not an exact duplicate as its intentions are a bit different or elaborate. The title is a bit unfortunate, though. – Bram Vanroy Dec 13 '20 at 21:54
  • @BramVanroy the duplicate exactly answers the bolded first sentence – jonrsharpe Dec 13 '20 at 21:55
  • @jonrsharpe Exactly, but if you read further down, this user actually has a very specific end goal in mind. I believe that that is what this question revolves around as that is the part that would actually help them a lot more than merely that first question. I'm sure that if OP put in some effort to rewrite the post to emphasize the end goal, that this can be re-opened. – Bram Vanroy Dec 13 '20 at 22:00
  • @BramVanroy well maybe, but voting to open prior to their doing that seems premature, and maybe they asked about the bit they were actually stuck on. – jonrsharpe Dec 13 '20 at 22:05
  • Hi guys, the bolded sentence is all that I really needed help with. It seems as if the correct answer to this question is in fact the accepted answer in the linked post (https://stackoverflow.com/questions/22830226/how-to-get-the-variable-names-from-the-string-for-the-format-method). I did a Stack Overflow search before asking the question, but somehow missed that post. Thank you both for your help! – Parmandeep Chaddha Dec 13 '20 at 22:14
  • @BramVanroy so now it's reopened *despite* the dupe's answer solving the OP's problem; that's a suboptimal outcome, because the next person who finds this sees the fragile solution below unless they go into the comments or spelunk through Linked questions. It would have been good to wait for the OP to clarify through an edit if the question needed reopening. – jonrsharpe Dec 14 '20 at 09:42
  • @jonrsharpe You are right. I should have waited with the reopen vote until OP had changed their post, which in the end they deemed was not necessary. Not sure if I can still vote to close though. Sorry, my bad. – Bram Vanroy Dec 14 '20 at 12:06

1 Answer

You can gather the required parameter names from the format string, and then find any missing keys by taking the set difference between the required keys and the keys actually present in the parameters. If there are missing keys, add them with the value "NR". Finally, use .format to format the string and store the result under a "formatted" key. Note that the key extraction below assumes the format string consists solely of fields joined by "_".

ds = [
    {"parameters": {"foo": 1, "bar": 2}, "format": "{foo}_{bar}"},
    {"parameters": {"biz": 3}, "format": "{biz}_{baz}"}
]

for d in ds:
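    # No copy is needed here: when keys are missing, a new dict is built below,
    # so d["parameters"] is never modified in-place
    params = d["parameters"]
    # Assumes the format string looks exactly like "{a}_{b}_..." (fields joined by "_")
    req_keys = set(d["format"][1:-1].split("}_{"))
    missing_keys = req_keys.difference(params.keys())

    if missing_keys:
        params = {**params, **{key: "NR" for key in missing_keys}}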

    d["formatted"] = d["format"].format(**params)

print(ds)

# [{'parameters': {'foo': 1, 'bar': 2}, 'format': '{foo}_{bar}', 'formatted': '1_2'}, {'parameters': {'biz': 3}, 'format': '{biz}_{baz}', 'formatted': '3_NR'}]
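
If extracting the key names is not needed at all, an alternative is to let the mapping itself supply the placeholder. The following is only a sketch: `DefaultingDict` and `format_or_nr` are hypothetical names, not part of the answer above. It relies on `str.format_map`, which uses the given mapping directly, so a dict subclass that defines `__missing__` can return "NR" for any absent key.

```python
class DefaultingDict(dict):
    """Hypothetical helper: a dict that yields "NR" for missing keys."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when key is absent
        return "NR"


def format_or_nr(template, params):
    """Format template with params, using "NR" for any missing key."""
    # format_map does not copy the mapping, so __missing__ is consulted
    return template.format_map(DefaultingDict(params))


print(format_or_nr("{foo}_{bar}", {"foo": 1, "bar": 2}))  # 1_2
print(format_or_nr("{biz}_{baz}", {"biz": 3}))            # 3_NR
```

This does not depend on how the fields are separated, but it only suits plain field names: for a field such as `{baz.attr}`, the attribute lookup would be applied to the string "NR".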

Bram Vanroy
  • Did almost exactly this, except replaced `req_keys = set(d["format"][1:-1].split("}_{"))` with `req_keys=[var for _, var, _, _ in Formatter().parse(d['format']) if var]`. This avoids coupling everything to "}_{", in case there is a change in the future. – Parmandeep Chaddha Dec 13 '20 at 22:18
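
The variant described in this comment can be sketched roughly as follows. The function names `required_keys` and `format_with_default` are hypothetical and chosen only for illustration; `string.Formatter().parse` is the standard-library call the comment refers to.

```python
from string import Formatter


def required_keys(template):
    """Return the field names referenced by a str.format-style template."""
    # parse() yields (literal_text, field_name, format_spec, conversion) tuples;
    # field_name is None for trailing literal text, so falsy names are skipped
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def format_with_default(template, params, default="NR"):
    """Format template, substituting default for keys missing from params."""
    # Start with the default for every required key, then overlay real values
    filled = {key: default for key in required_keys(template)}
    filled.update(params)
    return template.format(**filled)


print(required_keys("{foo}_{bar}"))                    # ['foo', 'bar']
print(format_with_default("{biz}_{baz}", {"biz": 3}))  # 3_NR
```

Because the names come from the parser rather than from splitting on "}_{", this keeps working if the separator between fields changes. It assumes plain field names; for fields such as `{foo.attr}` or `{foo[0]}`, `parse` returns the full expression and the name would need further splitting.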