After reviewing similar questions on SO, I have been unable to find a way to format a DataFrame built from a nested dictionary into the layout I need.
Being new to Pandas and moderately new to Python, I have spent the better part of two days trying and failing with various potential solutions (json_normalize, dictionary flattening, pd.concat, etc.).
I have a method which creates a DataFrame from an API call:
def make_dataframes(self):
    # removed non-related code
    self._data_frame_counts = pd.DataFrame({
        'Created': (self._data_frame_30days.count()['Created']),
        'Closed': (self._data_frame_30days.count()['Closed']),
        'Owner':
            (self._data_frame_30days['Owner'].value_counts().to_dict()),
        'Resolution':
            (self._data_frame_30days['Resolution'].value_counts().to_dict()),
        'Severity':
            (self._data_frame_30days['Severity'].value_counts().to_dict())
    })
It builds the DataFrame from a nested dictionary of Pandas value_counts() results:
{'Created': 35,
'Closed': 6,
'Owner': {'aName': 30, 'first.last': 3, 'last.first': 2},
'Resolution': {'TruePositive': 5, 'FalsePositive': 1},
'Severity': {2: 31, 3: 4}}
After execution, the resulting DataFrame looks like this:
               Created  Closed  Owner  Resolution  Severity
aName               35       6   30.0         NaN       NaN
first.last          35       6    3.0         NaN       NaN
last.first          35       6    2.0         NaN       NaN
TruePositive        35       6    NaN         5.0       NaN
FalsePositive       35       6    NaN         1.0       NaN
2                   35       6    NaN         NaN      31.0
3                   35       6    NaN         NaN       4.0
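A minimal way to reproduce this without the API call, assuming the counts are hard-coded from the dictionary shown earlier (the names `counts` and `df` are placeholders for this example only):

```python
import pandas as pd

# Hard-coded stand-in for the values the method computes from the API data.
counts = {
    'Created': 35,
    'Closed': 6,
    'Owner': {'aName': 30, 'first.last': 3, 'last.first': 2},
    'Resolution': {'TruePositive': 5, 'FalsePositive': 1},
    'Severity': {2: 31, 3: 4},
}

# The inner dictionaries become columns indexed by their keys, and the
# scalar values (Created, Closed) are repeated on every row.
df = pd.DataFrame(counts)
print(df)
```

The scalar totals end up on every row because they carry no row labels of their own, which is the part I am trying to avoid.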
I want it to look like the following, where the data is accurately aligned with each axis and rows exist for data points that are missing from the current dictionary but could appear in future runs:
               Created  Closed  Owner  Resolution  Severity
total               35       6    NaN         NaN       NaN
aName              NaN     NaN     30         NaN       NaN
first.last         NaN     NaN      3         NaN       NaN
last.first         NaN     NaN      2         NaN       NaN
anotherName        NaN     NaN    NaN         NaN       NaN
1                  NaN     NaN    NaN         NaN         0
2                  NaN     NaN    NaN         NaN        31
3                  NaN     NaN    NaN         NaN         4
second.Name        NaN     NaN    NaN         NaN       NaN
third.name         NaN     NaN    NaN         NaN       NaN
TruePositive       NaN     NaN    NaN           5       NaN
FalsePositive      NaN     NaN    NaN           1       NaN
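To make the target concrete, here is a rough sketch that fills a frame of this shape by hand. It is only an illustration of the layout I am after, not something I have managed to derive from the method above. The `expected` mapping is hypothetical (a made-up list of every label that could appear in a future run), and I am assuming an absent label should show a count of 0, as the Severity 1 row does:

```python
import pandas as pd

counts = {
    'Created': 35,
    'Closed': 6,
    'Owner': {'aName': 30, 'first.last': 3, 'last.first': 2},
    'Resolution': {'TruePositive': 5, 'FalsePositive': 1},
    'Severity': {2: 31, 3: 4},
}

# Hypothetical: every label that may show up, including ones absent today.
expected = {
    'Owner': ['aName', 'first.last', 'last.first', 'anotherName'],
    'Resolution': ['TruePositive', 'FalsePositive'],
    'Severity': [1, 2, 3],
}

# One 'total' row for the scalar counts, then one row per expected label.
# This assumes no label is shared between two categories.
index = ['total'] + [label for labels in expected.values() for label in labels]

# Start from an all-NaN frame so untouched cells stay empty.
summary = pd.DataFrame(index=index, columns=list(counts), dtype='float64')

for name, value in counts.items():
    if isinstance(value, dict):
        # Category counts: fill only this column, defaulting to 0 when absent.
        for label in expected[name]:
            summary.loc[label, name] = value.get(label, 0)
    else:
        # Scalar totals go on the dedicated 'total' row only.
        summary.loc['total', name] = value

print(summary)
```

The values come out as floats because the columns also hold NaN. What I am missing is an idiomatic way to get from the value_counts() results to this without spelling out every cell.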