I need to run some tests to tune numeric parameters in a JSON file. To keep things simple, I replaced all of these values with the string 'variable', then did the following:

import json

numeric_vals = [10, 20, 30, 40]  # numeric values to substitute, in this order
with open('mypath') as my_file:
    json_str = my_file.read()
for i in numeric_vals:
    # replace only the first remaining placeholder on each pass
    json_str = json_str.replace('"variable"', str(i), 1)
c = json.loads(json_str)  # parse the result so it can be used as a dict

This works fine, but is there a more efficient way to do it? The values that need replacing sit at varying depths and may be inside lists, etc. My JSON file is 15 KB and I need to test a very large number of configurations. Each test replaces about 200 variables. I am on Python 2.7, but Python 3.5 is also an option. Thanks for your help!

EDIT:

Here is a sample of my dict. Note that the real one is far longer and deeper:

 {
"1": {
    "transition": {
        "value": "variable", # edit here
        "unite": "sec"
    },
    "sequence": [
        {
            "step": "STEP",
            "name": "abc",
            "transition": {
                "value": "variable", #edit here
                "unite": "sec"
            },
            "entity": "aa",
            "_equipement": 3,
            "_value": 0
        },
        {
            "step": "FORK",
            "BRANCHES": [
                {
                    "": {
                        "sequence": [
                            {
                                "step": "STEP",
                                "name": "klm",
                                "transition": {
                                    "value": "variable", # edit here
                                    "unite": "sec"
                                },
                                "entity": "vvv",
                                "_equipement": 10,
                                "_value": 0,
                                "conditions": [
                                    [
                                        {
                                            "name": "ddd",
                                            "type": "el",
                                            "_equipement": 7,
                                            "_value": 1
                                        }
                                    ]
                                ]
                            }
                        ]
                    }
                },
                {
                    "SP": {
                        "sequence": [
                            {
                                "step": "STEP",
                                "name": "bbb",
                                "transition": {
                                    "value": "variable", # edit here
                                    "unite": "sec"
                                },
                                "entity": "bb",
                                "_equipement": 123,
                                "_value": 0,
                                "conditions": [
                                    [
                                        {
                                            "name": "abcd",
                                            "entity": "dgf",
                                            "type": "lop",
                                            "_equipement": 7,
                                            "_valeur": 0
                                        }
                                    ]
                                ]
                            }
                        ]
                    }
                }
            ]
        }
    ]
}

}

Matina G
  • Can you show a small example of what the file looks like (not the whole file, just an example with variable depths)? Also, have you tried `re.sub`? – Hadi Farah Jan 29 '19 at 10:44
  • `re.sub` seems like a good idea. No, I haven't tried it, thanks for the tip. Sharing the dict in a moment. – Matina G Jan 29 '19 at 10:50
  • It's useful to know whether `\"variable\"` is a key, a value or something else, in order to tell whether string manipulation is more efficient than simply altering the dictionary directly after loading the JSON file. – Hadi Farah Jan 29 '19 at 10:51
  • It is a value. That is why I am replacing \"variable\" with str(value), so as to turn a string value into an integer one – Matina G Jan 29 '19 at 10:53
  • So just paste one example and I'll see what to do. Also, does your dictionary have lists in it, like lists of dicts or dicts of lists at variable depths? – Hadi Farah Jan 29 '19 at 10:54
  • I made some edits to your question, which are waiting for review; I added comments where the variable should change. I have to ask: is the key always `"value"` before `"variable"`, or not necessarily? – Hadi Farah Jan 29 '19 at 11:06
  • Thanks for the edits. Yes; however, the key 'value' may be used elsewhere. The variable that needs to be modified is the value of the 'value' key in the 'transition' dictionary – Matina G Jan 29 '19 at 11:08

2 Answers

It's generally a bad idea to do string operations on hierarchical/structured data as there may be many cases where you can break the structure. Since you're already parsing your JSON you can extend the decoder to specifically deal with your case during parsing, e.g.:

import json

numeric_vals = [10, 20, 30, 40]  # numeric values to replace in that order

SEARCH = 'variable'
REPLACE = iter(numeric_vals)  # turn into an iterator for sequential access

def replace_values(value):
    # called for every decoded JSON object; swaps matching values, never keys
    return {k: next(REPLACE) if v == SEARCH else v for k, v in value.items()}

with open('path/to/your.json') as f:
    a = json.JSONDecoder(object_hook=replace_values).decode(f.read())

This ensures you're properly parsing your JSON and that it won't replace, for example, a key that happens to be called 'variable'.

Beware, though, that it will raise a StopIteration exception if there are more "variable" values in the JSON than there are entries in numeric_vals. You can unravel the dict comprehension in replace_values and handle that case if you expect to encounter it.
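As a rough sketch of that unravelled version, something along these lines should work on both Python 2.7 and 3. The wrapper name `load_with_values` and the choice to raise a ValueError are my own assumptions, not part of the answer above. It also builds a fresh iterator on every call, so the same JSON string can be decoded again and again with different value lists:

```python
import json

SEARCH = 'variable'

def load_with_values(json_str, values, search=SEARCH):
    """Parse json_str, swapping each `search` value for the next item of `values`.

    Hypothetical helper: the name and the error handling are illustrative only.
    """
    replacements = iter(values)  # fresh iterator per call, so the function is reusable

    def replace_values(obj):
        # object_hook receives every decoded JSON object (as a dict)
        result = {}
        for key, val in obj.items():
            if val == search:
                try:
                    val = next(replacements)
                except StopIteration:
                    # assumption: fail loudly rather than leave placeholders behind
                    raise ValueError('more %r placeholders than replacement values' % search)
            result[key] = val
        return result

    return json.loads(json_str, object_hook=replace_values)
```

One thing to keep in mind: object_hook is called as each object finishes decoding, so a nested dict is processed before the dict that contains it. If a single dict holds both a placeholder and a nested dict with another placeholder, the values may not be consumed in strict left-to-right text order. In the sample from the question every placeholder sits in its own small 'transition' dict, so this should not matter there.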

zwer
  • Thanks. I am checking this out; it actually seems like a very good solution – Matina G Jan 29 '19 at 11:33
  • The best part of what you proposed is that your solution works on f.read(), that is, on a string. I was actually counting on opening the file only once and loading the string into RAM, then performing the tests (on many, many lists of numeric values), so this helps a lot in limiting disk access latency. – Matina G Jan 29 '19 at 11:42

You can get a performance improvement by taking the call to json.loads() outside of the loop; you only need to call it once, at the end:

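import json

numeric_vals = [10, 20, 30, 40]

with open('mypath') as my_file:
    json_str = my_file.read()
    for i in numeric_vals:
        # the count argument 1 replaces only the first remaining placeholder
        json_str = json_str.replace('"variable"', str(i), 1)
    c = json.loads(json_str)  # parse once, after all replacements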

Also, prefer str.replace over re.sub, as per this post.
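If you stay with plain string operations and need to fill the same file many times, one possible variation (my own sketch, not something benchmarked here) is to split the text on the placeholder once and then only join the pieces for each configuration. That avoids rescanning the string from the start for every one of the ~200 replacements. The names `split_template` and `fill_template` are hypothetical:

```python
import json

PLACEHOLDER = '"variable"'  # the quoted placeholder exactly as it appears in the file

def split_template(json_str, placeholder=PLACEHOLDER):
    """Split the raw JSON text once; reuse the returned pieces for every test."""
    return json_str.split(placeholder)

def fill_template(parts, values):
    """Interleave the pieces with the values in text order, then parse the result."""
    values = list(values)
    if len(values) != len(parts) - 1:
        raise ValueError('expected %d values, got %d' % (len(parts) - 1, len(values)))
    out = [parts[0]]
    for value, part in zip(values, parts[1:]):
        out.append(json.dumps(value))  # serialize each value as valid JSON
        out.append(part)
    return json.loads(''.join(out))
```

Usage would be `parts = split_template(json_str)` once, then `fill_template(parts, numeric_vals)` per configuration. Like the loop above, this works on the raw text, so it would also replace a key that happens to be spelled "variable".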

Óscar López
  • @zwer notice the `1` parameter at the end of the call to `replace()`. OP is replacing one element at a time in each iteration. – Óscar López Jan 29 '19 at 11:20
  • Thanks, the json.loads indentation was a typo. The 1 parameter is there because I need to replace the first variable with the first number, then the second variable (which is now the first) with the second number, etc. – Matina G Jan 29 '19 at 11:25