1

I wish to remove keys and values from one JSON dictionary based on another JSON dictionary's keys and values. In a sense, I am looking to perform a "subtraction". Let's say I have JSON dictionaries a and b:

a =  {
     "my_app":
        {
        "environment_variables":
           {
            "SOME_ENV_VAR":
                [
                "/tmp",
                "/tmp2"
                ]
           },

        "variables":
           { "my_var": "1",
             "my_other_var": "2"
           }
        }
     }

b =  {
      "my_app":
        {
        "environment_variables":
           {
            "SOME_ENV_VAR":
                [
                "/tmp"
                ]
           },

        "variables":
           { "my_var": "1" }
        }
     }

Imagine you could do a-b=c where c looks like this:

c =  {
       "my_app":
       {
        "environment_variables":
           {
            "SOME_ENV_VAR":
                [
                "/tmp2"
                ]
           },

        "variables":
           { "my_other_var": "2" }
       }
     }

How can this be done?

fredrik
  • In the inner dictionary stored under the key ``"variables"`` in your dictionary a you seem to overwrite the value of the key ``"my_var"``. The value ``"1"`` does not appear in a or b. Is that working as intended? How do you want to get the value ``"1"`` in your dictionary c? – Nras Sep 10 '14 at 12:04
  • My intention is to perform a "subtract" in a sense. So imagine I wish to do a-b. Since `"my_var"` with value `1` is not actually in `b`, I wish to leave it as-is - resulting in c still having that key and that value in its dictionary. – fredrik Sep 10 '14 at 12:07
  • You can't have two records with the same key in your dictionary. Please fix `a['variables']`. – soupault Sep 10 '14 at 12:14
  • @s0upa1t I'm not sure what you mean. Why would I not be able to have `a['variables'] = {"my_var": "1", "my_var": "2"}` and `b['variables'] = {"my_var": "2"}`? – fredrik Sep 10 '14 at 12:19
  • @fredrik To make it simple: `{'a': 1, 'a': 2} == {'a': 2}`. The last key wins. – Germano Sep 10 '14 at 12:22
  • Btw, it would be very helpful if you could state the maximum depth of your dictionaries. I mean, can `a` or `b` contain `dict(dict)`, `dict(dict(dict))` and so on? Anyway, I can give you a solution for the case as stated. – soupault Sep 10 '14 at 12:25
  • @fredrik Dictionaries don't work that way. As stated in my first comment, you overwrite the old value ``"1"`` stored under the key ``"my_var"`` with the new value ``"2"``. That is why I asked how you want to get the ``"1"`` in the c-dictionary. Following your logic, c should be completely empty. – Nras Sep 10 '14 at 12:25
  • @Germano but that's not what I'm doing. – fredrik Sep 10 '14 at 12:26
  • @s0upa1t The depth limit is in fact like in the example. – fredrik Sep 10 '14 at 12:27
  • @Germano and Nras: Oh I see now what you mean. I am updating the example code. Thanks. – fredrik Sep 10 '14 at 12:28
  • possible duplicate of [How to subtract values from dictionaries](http://stackoverflow.com/questions/17671875/how-to-subtract-values-from-dictionaries) – Andrew Sep 10 '14 at 15:03

3 Answers

1

You can loop through your dictionary using `for key in dictionary:` and you can delete keys using `del dictionary[key]`. I think that's all you need. See the documentation for dictionaries: https://docs.python.org/2/tutorial/datastructures.html#dictionaries
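As a minimal sketch of that idea for a flat dictionary: the names `d` and `to_remove` below are made up for illustration, and the loop runs over a snapshot of the keys because deleting from a dictionary while iterating over it directly raises a `RuntimeError`.

```python
# Hypothetical sample data: remove from d every key/value pair found in to_remove.
d = {"my_var": "1", "my_other_var": "2"}
to_remove = {"my_var": "1"}

# list(d) takes a snapshot of the keys, so deleting inside the loop is safe.
for key in list(d):
    if key in to_remove and d[key] == to_remove[key]:
        del d[key]

print(d)  # {'my_other_var': '2'}
```

Nested dictionaries, as in the question, would need the same check applied at each level.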

magnetometer
  • If you're using `del` you've got to be careful to delete from the right dictionary. Do not delete an item from the dictionary you are iterating through; Python won't like this. – Joe Smart Sep 10 '14 at 12:22
0

The following does what you need:

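    def subtract(a, b):
        result = {}

        for key, value in a.items():
            if key not in b:
                # Nothing to subtract for this key: keep the value as-is.
                result[key] = value
            elif b[key] != value:
                if isinstance(value, dict) and isinstance(b[key], dict):
                    # Recurse into nested dicts; drop them if nothing remains.
                    inner_dict = subtract(value, b[key])
                    if len(inner_dict) > 0:
                        result[key] = inner_dict
                elif isinstance(value, list) and isinstance(b[key], list):
                    # Keep only the items that b's list does not contain.
                    result[key] = [item for item in value if item not in b[key]]
                else:
                    result[key] = value

        return result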

It checks whether both the key and the value are present in `b`. It could `del` items, but I think it is much better to return a new dict with the desired data instead of modifying the original.

    c = subtract(a, b)
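For comparison, here is a rough sketch of the in-place alternative mentioned above, the one that uses `del` instead of building a new dict. The name `subtract_in_place` is hypothetical, the sample data is modeled on the question's `a` and `b`, and the choice to drop lists and dicts that end up empty is an assumption rather than something the question specifies.

```python
def subtract_in_place(a, b):
    """Remove from a everything that also appears in b. Mutates a."""
    # Iterate over a snapshot of the keys so that deleting from a is safe.
    for key in list(a):
        if key not in b:
            continue  # nothing to subtract for this key
        if isinstance(a[key], dict) and isinstance(b[key], dict):
            subtract_in_place(a[key], b[key])
            if not a[key]:
                del a[key]  # assumption: drop dicts that became empty
        elif isinstance(a[key], list) and isinstance(b[key], list):
            a[key] = [item for item in a[key] if item not in b[key]]
            if not a[key]:
                del a[key]  # assumption: drop lists that became empty
        elif a[key] == b[key]:
            del a[key]
    return a


# Sample data modeled on the question.
a = {"my_app": {"environment_variables": {"SOME_ENV_VAR": ["/tmp", "/tmp2"]},
                "variables": {"my_var": "1", "my_other_var": "2"}}}
b = {"my_app": {"environment_variables": {"SOME_ENV_VAR": ["/tmp"]},
                "variables": {"my_var": "1"}}}

subtract_in_place(a, b)
print(a)
```

The trade-off is that the original `a` is gone afterwards, which is why returning a new dict is usually the safer default.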

UPDATE

I have just updated the code for the latest version of the data provided in the question. It now "subtracts" list values as well.

UPDATE 2

Working example: ipython notebook

rhlobo
  • When I run your code using my dictionaries, I get `KeyError: u'variables'`. I'm trying to figure out how to modify your code to allow for this... – fredrik Sep 10 '14 at 13:25
  • It works for me on python 2.7 with the given `a` and `b` dictionaries... Can you give more info on that and on the data actually being used? – rhlobo Sep 10 '14 at 16:15
  • The funny thing is, the `KeyError` would occur if the dict does not contain that key when `b[key]` is called. As it is checked on the `if` statement, I would guess the indentation you are using is not the same as the above. Can you check it? – rhlobo Sep 10 '14 at 16:23
  • Oh, you are so right. How embarrassing. Yes, I am using a different indentation than the one posted initially. I have updated the initial code with the `"my_app"` key... – fredrik Sep 11 '14 at 08:03
  • So it works now, right? I've tested the updated data and I get `{'my_app': {'variables': {'my_other_var': '2'}}}`. – rhlobo Sep 11 '14 at 14:25
  • I can't get that to work, I'm getting `TypeError: unsupported operand type(s) for -: 'dict' and 'dict'` – fredrik Sep 12 '14 at 06:52
  • Which Python version are you using? For the sake of posterity, can you verify whether it works with no code other than the data definitions and the given function? I did it here using Python 2.7 and everything goes fine, so I suspect something else in the code might be causing this to fail. – rhlobo Sep 12 '14 at 19:26
0

The way you can do it is to:

  1. Create a copy of a, called c;
  2. Iterate over every key, value pair inside b;
  3. If c has the same top-level key, delete from it every inner key whose value matches the one in b;
  4. Remove top-level keys whose values are now empty.

You should modify the code if your case is different (no dict(dict), etc.).


    import copy

    A = {'environment_variables': {'SOME_ENV_VAR': ['/tmp']},
         'variables': {'my_var': '2', 'someones_var': '1'}}
    B = {'environment_variables': {'SOME_ENV_VAR': ['/tmp']},
         'variables': {'my_var': '2'}}

    print(A)
    print(B)
    # Deep copy, so that deleting from C's inner dicts leaves A untouched
    C = copy.deepcopy(A)

    # INFO: Suppose your max depth is as follows: "A = dict(key:dict(), ...)"
    for k0, v0 in B.items():
        # Look for similar outer keys (check if 'vars' or 'env_vars' in A)
        if k0 in C:
            # Look for similar inner (keys, values)
            for k1, v1 in v0.items():
                # If we have e.g. 'my_var' in B and in C and values are the same
                if k1 in C[k0] and v1 == C[k0][k1]:
                    del C[k0][k1]
            # Remove empty 'vars', 'env_vars'
            if not C[k0]:
                del C[k0]

    print(C)

Output:

    {'environment_variables': {'SOME_ENV_VAR': ['/tmp']},
     'variables': {'my_var': '2', 'someones_var': '1'}}

    {'environment_variables': {'SOME_ENV_VAR': ['/tmp']},
     'variables': {'my_var': '2'}}

    {'variables': {'someones_var': '1'}}
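
One detail worth noting about step 1: `dict.copy()` is shallow, so the inner dictionaries stay shared with the original, while `copy.deepcopy` duplicates them. A minimal sketch of the difference, using made-up variable names and data shaped like the example above:

```python
import copy

original_a = {"variables": {"my_var": "2", "someones_var": "1"}}
original_b = {"variables": {"my_var": "2", "someones_var": "1"}}

# Shallow copy: the inner dict is the same object as in original_a.
shallow = original_a.copy()
del shallow["variables"]["my_var"]

# Deep copy: the inner dict is duplicated, so original_b is not affected.
deep = copy.deepcopy(original_b)
del deep["variables"]["my_var"]

print(original_a)  # {'variables': {'someones_var': '1'}}
print(original_b)  # {'variables': {'my_var': '2', 'someones_var': '1'}}
```

If the source dictionary must stay intact after the subtraction, the deep copy is the safer choice.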
soupault