0

I have a JSON file like below, how can I make a dataframe out of this. I want to make the main key an index and subkey as a column.

{
  "PACK": {
    "labor": "Recycle",
    "actual": 0,
    "Planned": 2,
    "max": 6
  },
  "SORT": {
    "labor": "Mix",
    "actual": 10,
    "Planned": 4,
    "max": 3
  }
}

The expected output is something like, I tried to use df.T but does not work. Any help on this is appreciated.

        actual  planned
PACK      0       2
SORT      10      4
          
KARASZI István
  • 30,900
  • 8
  • 101
  • 128
Learner
  • 335
  • 3
  • 16

2 Answers2

3

You can read your json file to dict. Then create dataframe with dict values as data and dict keys as index.

import json
import pandas as pd


with open('test.json') as f:
    data = json.load(f)

df = pd.DataFrame(data.values(), index=data.keys())
print(df)

        labor  actual  Planned  max
PACK  Recycle       0        2    6
SORT      Mix      10        4    3

The select columns with

df = df[['actual', 'planned']]
Ynjxsjmh
  • 28,441
  • 6
  • 34
  • 52
  • It's significantly faster to use `json.load(f)` instead of `json.loads(f.read())` because it doesn't need to create an intermediate string for the file contents – Andrew Mascillaro May 12 '21 at 15:04
  • @AndrewMascillaro Thank you for your suggestion. BTW, your method is really a good method. Don't know that there is already built-in function for that. – Ynjxsjmh May 12 '21 at 15:07
  • If you have a lot of columns that you want to drop, you can drop all columns except those in a specific list. See https://stackoverflow.com/questions/45846189/how-to-delete-all-columns-in-dataframe-except-certain-ones – Oxford_orange May 12 '21 at 15:11
  • @Oxford_orange Thank you for your suggest. But I think selecting the targeted columns and dropping the unwanted column have the same effect. – Ynjxsjmh May 12 '21 at 15:13
  • Yes, the alternative of only keeping columns in a list is helpful with complex and variable datasets. – Oxford_orange May 12 '21 at 15:16
2

Pandas can read JSON files in many formats. For your use case, the following option should read your data the way you want:

pd.read_json(json_file, orient="index")

More information about the orient option can be found at the official documentation.