0

I have a JSON file returned from a SalesForce (SOQL) query like the following example and I want to use "json_normalize" to convert in a pandas dataFrame.

with open('SOQL_RESULT.json') as data_file: data = json.load(data_file)

df = json_normalize(data)

SOQL_RESULT.json file :

[
    {
        "attributes": {
            "type": "Opp_Product_Item__c",
            "url": "/services/data/v38.0/sobjects/Opp_Product_Item__c/a0L40000007M7ViEAK"
        },
        "Id": "a0L40000007M7ViEAK",
        "Name": "OPI-273731",
        "Opportunity__c": "0064000000LlgMZAAZ",
        "Opportunity_Solution__c": "a0M400000038XGeEAM",
        "Product__c": "01t400000013pTyAAI",
        "Product_Sales_Team__r": {
            "totalSize": 3,
            "done": true,
            "records": [
                {
                    "attributes": {
                        "type": "Product_Sales_Team__c",
                        "url": "/services/data/v38.0/sobjects/Product_Sales_Team__c/a0X40000001xZfLEAU"
                    },
                    "Id": "a0X40000001xZfLEAU",
                    "Name": "PIT-120844",
                    "Team_Role__c": "Specialist - CPS",
                    "Access__c": "Read/Write",
                    "Product__c": "a0L40000007M7ViEAK"
                },
                {
                    "attributes": {
                        "type": "Product_Sales_Team__c",
                        "url": "/services/data/v38.0/sobjects/Product_Sales_Team__c/a0X40000001xZfQEAU"
                    },
                    "Id": "a0X40000001xZfQEAU",
                    "Name": "PIT-120845",
                    "Team_Role__c": "CSE",
                    "Access__c": "Read/Write",
                    "Product__c": "a0L40000007M7ViEAK"
                },
                {
                    "attributes": {
                        "type": "Product_Sales_Team__c",
                        "url": "/services/data/v38.0/sobjects/Product_Sales_Team__c/a0X40000001xdkbEAA"
                    },
                    "Id": "a0X40000001xdkbEAA",
                    "Name": "PIT-121462",
                    "Team_Role__c": "CSE - Collaboration",
                    "Access__c": "Read",
                    "Product__c": "a0L40000007M7ViEAK"
                }
            ]
        }
    },
    {
        "attributes": {
            "type": "Opp_Product_Item__c",
            "url": "/services/data/v38.0/sobjects/Opp_Product_Item__c/a0L40000009fnUPEAY"
        },
        "Id": "a0L40000009fnUPEAY",
        "Name": "OPI-372996",
        "Opportunity__c": "0064000000QlQmsAAF",
        "Opportunity_Solution__c": "a0M40000004IyUqEAK",
        "Product__c": "01t400000013pTdAAI",
        "Product_Sales_Team__r": {
            "totalSize": 1,
            "done": true,
            "records": [
                {
                    "attributes": {
                        "type": "Product_Sales_Team__c",
                        "url": "/services/data/v38.0/sobjects/Product_Sales_Team__c/a0X40000004N1LlEAK"
                    },
                    "Id": "a0X40000004N1LlEAK",
                    "Name": "PIT-150866",
                    "Team_Role__c": "CSE",
                    "Access__c": "Read/Write",
                    "Product__c": "a0L40000009fnUPEAY"
                }
            ]
        }
    }
]

But instead of having 1 row of nested data in the Product_sales_team__.r.records column like this :

Id    Name    Opportunity_Solution__c ... Product_Sales_Team__r.records
a...  OP...   a0M...                      [{'attributes': {'type': 'Product_Sales_Team__c', …
a...  OP...   a0M...                      [{'attributes': {'type': 'Product_Sales_Team__c', …

I want to add 3 new columns (Team_Role__c, Acces__c and Product__c ) with the data under Product_Sales_Team__r. records In the final dataFrame.

I tried with :

df = json_normalize(data, 'Product_Sales_Team__r[records]', ['Id'], ['Name'], ['Opportunity__c'], 
       record_prefix='Product_Sales_Team__r_records_Name')

But It did not work and based on Google examples and the official "json_normalize" documentation I do not understand why ?

here is how the desired dataFrame should look like :

Id    Name    Opportunit...   Team_Role__c    Acces__c    Product__c
a1...  OP...   a0M...           A...           B...        P1...         
a1...  OP...   a0M...           A...           B...        P2...  
a1...  OP...   a0M...           A...           B...        P3...  
a2...  OP...   a0M...           A...           B...        P1...  
a2...  OP...   a0M...           A...           B...        P2...    
a2...  OP...   a0M...           A...           B...        P3...  
...

Any help would be appreciate!

Ken

  • Possible duplicate of [Using Pandas json\_normalize on nested Json with arrays](https://stackoverflow.com/questions/45418334/using-pandas-json-normalize-on-nested-json-with-arrays) – Jan Trienes Sep 07 '17 at 19:37
  • Have a look at the [answer](https://stackoverflow.com/a/45419878/3423035) from @pirsquared. This works for your data. – Jan Trienes Sep 07 '17 at 19:38

0 Answers0