(Python beginner alert) I am trying to create a custom JSON document from an existing one. The scenario is:

I have a source that can send many sets of fields, but I want to cherry-pick some of them and build a subset while preserving the original JSON structure.

Original Sample

{
    "Response": {
        "rCode": "11111",
        "rDesc": "SUCCESS",
        "pData": {
            "code": "123-abc-456-xyz",
            "sData": [
                {
                    "receiptTime": "2014-03-02T00:00:00.000",
                    "sessionDate": "2014-02-28",
                    "dID": {
                        "d": {
                            "serialNo": "3432423423",
                            "dType": "11111",
                            "dTypeDesc": "123123sd"
                        },
                        "mode": "xyz"
                    },
                    "usage": {
                        "duration": "661",
                        "mOn": [
                            "2014-02-28_20:25:00",
                            "2014-02-28_22:58:00"
                        ],
                        "mOff": [
                            "2014-02-28_21:36:00",
                            "2014-03-01_03:39:00"
                        ]
                    },
                    "set": {
                        "abx": "1",
                        "ayx": "1",
                        "pal": "1"
                    },
                    "rEvents": {
                        "john": "doe",
                        "lorem": "ipsum"
                    }
                },
                {
                    "receiptTime": "2014-04-02T00:00:00.000",
                    "sessionDate": "2014-04-28",
                    "dID": {
                        "d": {
                            "serialNo": "123123",
                            "dType": "11111",
                            "dTypeDesc": "123123sd"
                        },
                        "mode": "xyz"
                    },
                    "usage": {
                        "duration": "123",
                        "mOn": [
                            "2014-04-28_20:25:00",
                            "2014-04-28_22:58:00"
                        ],
                        "mOff": [
                            "2014-04-28_21:36:00",
                            "2014-04-01_03:39:00"
                        ]
                    },
                    "set": {
                        "abx": "4",
                        "ayx": "3",
                        "pal": "1"
                    },
                    "rEvents": {
                        "john": "doe",
                        "lorem": "ipsum"
                    }
                }
            ]
        }
    }
}

Each element of the sData array has a number of fields, of which I want to keep only 24 and drop the rest. I know I could use element.pop(), but I cannot go in and delete a new incoming field every time the source publishes one. Below is the expected output:

Expected Output

{
    "Response": {
        "rCode": "11111",
        "rDesc": "SUCCESS",
        "pData": {
            "code": "123-abc-456-xyz",
            "sData": [
                {
                    "receiptTime": "2014-03-02T00:00:00.000",
                    "sessionDate": "2014-02-28",
                    "usage": {
                        "duration": "661",
                        "mOn": [
                            "2014-02-28_20:25:00",
                            "2014-02-28_22:58:00"
                        ],
                        "mOff": [
                            "2014-02-28_21:36:00",
                            "2014-03-01_03:39:00"
                        ]
                    },
                    "set": {
                        "abx": "1",
                        "ayx": "1",
                        "pal": "1"
                    }
                },
                {
                    "receiptTime": "2014-04-02T00:00:00.000",
                    "sessionDate": "2014-04-28",
                    "usage": {
                        "duration": "123",
                        "mOn": [
                            "2014-04-28_20:25:00",
                            "2014-04-28_22:58:00"
                        ],
                        "mOff": [
                            "2014-04-28_21:36:00",
                            "2014-04-01_03:39:00"
                        ]
                    },
                    "set": {
                        "abx": "4",
                        "ayx": "3",
                        "pal": "1"
                    }
                }
            ]
        }
    }
}

I used "How can I create a new JSON object form another using Python?" as a reference, but it's not working as expected. Looking forward to input and solutions from all you gurus. Thanks in advance.

karD
  • Why do you want to do this? As long as the final consumer only uses the fields it needs, what's the problem with having extra fields in there? – Tim Roberts Jul 12 '21 at 05:05
  • It's a way to limit the amount of information I pass to downstream systems. I cannot let them have access to everything, so I need to restrict it. Hope that makes sense. – karD Jul 12 '21 at 05:06
  • Remember that, once you `json.loads` this, it's just a Python dict. You can create a new `sData` list and copy over only the fields you want, and then replace `data['Response']['pData']['sData']` with your new list. You might even create a `subset` function that extracts your subset. – Tim Roberts Jul 12 '21 at 05:09

1 Answer

Kind of like this:

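import json

# Load the full payload; after this it is an ordinary Python dict.
with open("fullset.json") as f:
    data = json.load(f)

def subset(d):
    # Copy only the fields we want to keep into a new dict.
    newd = {}
    for name in ('receiptTime', 'sessionDate', 'usage', 'set'):
        newd[name] = d[name]
    return newd

# Replace sData with the trimmed records; everything else stays as it was.
data['Response']['pData']['sData'] = [subset(d) for d in data['Response']['pData']['sData']]

with open('newdata.json', 'w') as f:
    json.dump(data, f)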
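
If some records might not carry every field, a slightly more tolerant variant of the same allow-list idea could look like the sketch below. It is only an illustration: the KEEP tuple is taken from the expected output in the question, filter_response is a hypothetical helper name, and the inline sample (including brandNewField) is a trimmed, made-up stand-in for the real payload. Missing keys are skipped rather than raising KeyError, and any field the source adds later is dropped automatically because it is not on the list.

```python
import json

# Assumed allow-list, taken from the expected output in the question.
KEEP = ("receiptTime", "sessionDate", "usage", "set")

def subset(record, keep=KEEP):
    """Return a new dict holding only the allow-listed keys that are present."""
    return {name: record[name] for name in keep if name in record}

def filter_response(data, keep=KEEP):
    """Replace sData with trimmed copies (modifies data in place) and return it."""
    p_data = data["Response"]["pData"]
    p_data["sData"] = [subset(rec, keep) for rec in p_data["sData"]]
    return data

# Trimmed stand-in for the real payload; "brandNewField" is hypothetical.
raw = '''{"Response": {"rCode": "11111", "pData": {"code": "123-abc-456-xyz", "sData": [
  {"receiptTime": "2014-03-02T00:00:00.000", "sessionDate": "2014-02-28",
   "dID": {"mode": "xyz"}, "usage": {"duration": "661"}, "set": {"abx": "1"},
   "rEvents": {"john": "doe"}, "brandNewField": "ignored"}]}}}'''

result = filter_response(json.loads(raw))
print(json.dumps(result, indent=4))
```

Because the list says what to keep rather than what to remove, nothing needs to change when the source starts publishing extra fields.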
Tim Roberts