2

I have a question similar to this one. However, I need my JSON to be partially-nested. Currently, my dataframe looks like this:

df = pd.DataFrame({'subsidary': ['company name','company name'],
                   'purchase_order_number': ['PO Num', 'PO Num'],
                   'invoice_date': ['2018-10-15', '2018-10-15'],
                   'vendor_invoice_number': ['777','777'],
                   'vendor_sku': ['SKU888', 'SKU888'],
                   'quantity': ['10', '20'],
                   'rate': ['12.00', '11.00'],
                   'amount': ['120.00', '220.00'],
                   'freight': ['5.00', '5.00'],
                   'taxes': ['0.00', '0.00']})

Using the link above and the code below:

j = (df.groupby(['subsidary','purchase_order_number','invoice_date','vendor_invoice_number'], as_index=False)
           .apply(lambda x: x[['vendor_sku','quantity','rate','amount']].to_dict('r'))
           .reset_index()
           .rename(columns={0:'item_charges'})   
           .to_json(orient='records'))

print(json.dumps(json.loads(j), indent=2, sort_keys=False))

I was able to get it to look like this:

[
  {
    "subsidary": "company name",
    "purchase_order_number": "PO Num",
    "invoice_date": "2018-10-15",
    "vendor_invoice_number": "777",
    "item_charges": [
      {
        "vendor_sku": "SKU888",
        "quantity": "10",
        "rate": "12.00",
        "amount": "120.00"
      },
      {
        "vendor_sku": "SKU888",
        "quantity": "20",
        "rate": "11.00",
        "amount": "220.00"
      }
    ]
  }
]

However, I would like it to look like this:

[
  {
    "subsidary": "Natural Partners",
    "purchase_order_number": "AZ003387-PO",
    "invoice_date": "2018-10-15",
    "vendor_invoice_number": "76947",
    "item_charges": [
      {
        "vendor_sku": "SUP002",
        "quantity": "12.00",
        "rate": "14.50",
        "amount": "174.00"
      },
      {
        "vendor_sku": "SUP004",
        "quantity": "3.00",
        "rate": "8.75",
        "amount": "26.25"
      }
    ],
    "invoice_charges": 
    {
       "freight": '5.00',
       "taxes": '0.00',
    }
  }
]

Is there a way for me to do this within python?

Thank you in advance.

Aaron England
  • 1,223
  • 1
  • 14
  • 26

1 Answers1

1

You can do it by storing each nesting before processing the next.

df = pd.DataFrame({'subsidary': ['company name','company name'],
                   'purchase_order_number': ['PO Num', 'PO Num'],
                   'invoice_date': ['2018-10-15', '2018-10-15'],
                   'vendor_invoice_number': ['777','777'],
                   'vendor_sku': ['SKU888', 'SKU888'],
                   'quantity': ['10', '20'],
                   'rate': ['12.00', '11.00'],
                   'amount': ['120.00', '220.00'],
                   'freight': ['5.00', '5.00'],
                   'taxes': ['0.00', '0.00']})

# Your original procedure
j = df.groupby(
    ['subsidary','purchase_order_number','invoice_date',
    'vendor_invoice_number', "freight", "taxes"],
     as_index=False).apply(lambda x: x[['vendor_sku','quantity','rate','amount']].to_dict('r')
                     ).reset_index().rename(columns={0:'item_charges'})

# Store the item_charges and do it again      
item_charges = j["item_charges"]
j=j.groupby(['subsidary','purchase_order_number','invoice_date',
             'vendor_invoice_number',"freight", "taxes"], as_index=False
              ).apply(lambda x: x[["freight", "taxes"]].to_dict('r')
              ).reset_index().rename(columns={0:'invoice_charges'})

# Add back the stored item_charges
j["item_charges"] = item_charges
j = j.to_json(orient='records')
print(json.dumps(json.loads(j), indent=2, sort_keys=False))

I should say that I'm not thrilled with this approach and I can't imagine that it is performant, but it was what I could think of that works. And it works -- output below:

[
  {
    "subsidary": "company name",
    "purchase_order_number": "PO Num",
    "invoice_date": "2018-10-15",
    "vendor_invoice_number": "777",
    "freight": "5.00",
    "taxes": "0.00",
    "invoice_charges": [
      {
        "freight": "5.00",
        "taxes": "0.00"
      }
    ],
    "item_charges": [
      {
        "vendor_sku": "SKU888",
        "quantity": "10",
        "rate": "12.00",
        "amount": "120.00"
      },
      {
        "vendor_sku": "SKU888",
        "quantity": "20",
        "rate": "11.00",
        "amount": "220.00"
      }
    ]
  }
]
Charles Landau
  • 4,187
  • 1
  • 8
  • 24
  • Thank you. I had to switch the code around a little bit to get in the desired order, but your code worked great. I appreciate the help! – Aaron England Nov 08 '18 at 15:07