Given the following JSON file, loaded into pandas with df = pd.read_json(file):
[
  {
    "Name": "foo",
    "Details": {
      "Vendor": "Microsoft",
      "Item": "aaa"
    }
  },
  {
    "Name": "bar",
    "Details": {
      "Vendor": "Microsoft",
      "Item": "bbb"
    }
  },
  {
    "Name": "baz",
    "Details": {
      "Vendor": "Microsoft",
      "Item": "ccc"
    }
  },
  {
    "Name": "baz2",
    "Details": {
      "Vendor": "Microsoft",
      "Item": "ccc"
    }
  },
  {
    "Name": "qux",
    "Details": {
      "Vendor": "IBM",
      "Item": "aaa"
    }
  }
]
I want to count unique values in this data: the number of unique vendors, and the number of unique vendor-item combinations. With the above JSON, there are 2 unique vendors (Microsoft and IBM) and 4 unique vendor-item combinations (baz and baz2 share the same vendor and item, so they count once).
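To make the expected numbers concrete, here is a minimal sketch in plain Python (no pandas) that computes both counts with sets. The `records` list is just the JSON above typed in by hand, and the variable names are only for illustration.

```python
# The same records as in the JSON file, written as Python literals.
records = [
    {"Name": "foo", "Details": {"Vendor": "Microsoft", "Item": "aaa"}},
    {"Name": "bar", "Details": {"Vendor": "Microsoft", "Item": "bbb"}},
    {"Name": "baz", "Details": {"Vendor": "Microsoft", "Item": "ccc"}},
    {"Name": "baz2", "Details": {"Vendor": "Microsoft", "Item": "ccc"}},
    {"Name": "qux", "Details": {"Vendor": "IBM", "Item": "aaa"}},
]

# A set keeps each distinct value once.
vendors = {r["Details"]["Vendor"] for r in records}

# Tuples are hashable, so (vendor, item) pairs can go in a set too.
pairs = {(r["Details"]["Vendor"], r["Details"]["Item"]) for r in records}

print(len(vendors))  # 2
print(len(pairs))    # 4
```

These are the two numbers I am trying to reproduce with pandas.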
I believe my current attempts have failed because the Details column of my DataFrame holds nested objects (one dict per row) instead of separate Vendor and Item columns.
df = pd.read_json(file)
print(df)
Outputs:
   Name                                 Details
0   foo  {'Vendor': 'Microsoft', 'Item': 'aaa'}
1   bar  {'Vendor': 'Microsoft', 'Item': 'bbb'}
2   baz  {'Vendor': 'Microsoft', 'Item': 'ccc'}
3  baz2  {'Vendor': 'Microsoft', 'Item': 'ccc'}
4   qux        {'Vendor': 'IBM', 'Item': 'aaa'}
I've also attempted the following:
print(df.groupby("Details").Vendor.nunique())
which results in the error:
AttributeError: 'DataFrameGroupBy' object has no attribute 'Vendor'
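The error seems consistent with Vendor not being a column of df: it only exists as a key inside each Details dict. One possible direction, shown here as a sketch rather than a confirmed solution, is to expand the dicts into real columns with pd.json_normalize and count from there. The io.StringIO wrapper is only an assumed stand-in for the real file, and the names details, flat, n_vendors and n_pairs are made up for this example.

```python
import io
import json

import pandas as pd

records = [
    {"Name": "foo", "Details": {"Vendor": "Microsoft", "Item": "aaa"}},
    {"Name": "bar", "Details": {"Vendor": "Microsoft", "Item": "bbb"}},
    {"Name": "baz", "Details": {"Vendor": "Microsoft", "Item": "ccc"}},
    {"Name": "baz2", "Details": {"Vendor": "Microsoft", "Item": "ccc"}},
    {"Name": "qux", "Details": {"Vendor": "IBM", "Item": "aaa"}},
]

# Stand-in for pd.read_json(file): read the same JSON from an in-memory buffer.
df = pd.read_json(io.StringIO(json.dumps(records)))

# Each "Details" cell is a dict; turn the list of dicts into Vendor/Item columns.
details = pd.json_normalize(df["Details"].tolist())

# Put Name back next to the expanded columns (both share the default 0..n-1 index).
flat = df[["Name"]].join(details)

# Number of distinct vendors.
n_vendors = flat["Vendor"].nunique()

# Number of distinct (Vendor, Item) combinations.
n_pairs = flat.groupby(["Vendor", "Item"]).ngroups

print(n_vendors)  # 2
print(n_pairs)    # 4
```

With the sample data this gives 2 vendors and 4 vendor-item combinations, matching the counts described above.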