
I have a dictionary that has the following structure:

mydict = {
    'username1': {'attributes': {'AccountType': ['01'],
                                 'UnitCode': ['001'],
                                 'UnitDesc': ['Marketing'],
                                 'title': ['Assistant']},
                  'extra': 'CN=000000'},
    'username2': {'attributes': {'AccountType': ['01'],
                                 'UnitCode': ['002'],
                                 'UnitDesc': ['Resources'],
                                 'title': ['Manager']},
                  'extra': 'CN=000000'},
}

I am trying to achieve this output in a DataFrame:

+-----------+-------------+----------+
|           | AccountType | UnitCode |
+-----------+-------------+----------+
| username1 |          01 |      001 |
| username2 |          01 |      002 |
+-----------+-------------+----------+

The answer provided on a similar question here got me closer to what I was after, but I need to drop the data in the 'extra' section.

df = pd.concat({k: pd.DataFrame(v).T for k, v in mydict.items()}, axis=0)

When I call df.index it looks like this:

MultiIndex([('username1', 'attributes'),
            ('username1',      'extra'),
            ('username2', 'attributes'),
            ('username2',      'extra')])

I've attempted to remove the 'extra' content from the DataFrame after conversion, and also from the dictionary before converting, with no success. I'm not sure how to stop the duplication.

3 Answers


You can drop the 'extra' rows by resetting the index and filtering on the new level_1 column:

df = df.reset_index()
df = df.loc[df['level_1'].ne('extra')]
     level_0     level_1 AccountType UnitCode     UnitDesc        title
0  username1  attributes        [01]    [001]  [Marketing]  [Assistant]
2  username2  attributes        [01]    [002]  [Resources]    [Manager]
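Alternatively, since the concat from the question already produces a MultiIndex with 'attributes'/'extra' as the second level, you can select the 'attributes' rows with `xs`, which drops that level in one step. A minimal sketch, rebuilding the question's `mydict` and unwrapping the one-element lists with `.str[0]`:

```python
import pandas as pd

# Reconstruction of the question's dictionary
mydict = {
    'username1': {'attributes': {'AccountType': ['01'], 'UnitCode': ['001'],
                                 'UnitDesc': ['Marketing'], 'title': ['Assistant']},
                  'extra': 'CN=000000'},
    'username2': {'attributes': {'AccountType': ['01'], 'UnitCode': ['002'],
                                 'UnitDesc': ['Resources'], 'title': ['Manager']},
                  'extra': 'CN=000000'},
}

# The question's concat: rows indexed by (username, 'attributes'/'extra')
df = pd.concat({k: pd.DataFrame(v).T for k, v in mydict.items()}, axis=0)

# xs selects the 'attributes' rows and removes that index level;
# .str[0] pulls the single element out of each list-valued cell
out = df.xs('attributes', level=1)[['AccountType', 'UnitCode']]
out = out.apply(lambda col: col.str[0])
print(out)
```

This avoids the `reset_index` round trip entirely and leaves the usernames as the index, matching the desired output.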
Kenan

If you can put your dictionaries into a list:

x = [{'username1': {  'attributes': { 'AccountType': ['01'], 
                            'UnitCode': ['001'], 
                            'UnitDesc': ['Marketing'], 
                            'title': ['Assistant'], }, 
            'extra': 'CN=000000'    }},
 {'username2': {  'attributes': { 'AccountType': ['01'], 
                            'UnitCode': ['002'], 
                            'UnitDesc': ['Resources'], 
                            'title': ['Manager'], }, 
            'extra': 'CN=000000'    }},
{'username3': {  'attributes': { 'AccountType': ['01'], 
                            'UnitCode': ['003'], 
                            'UnitDesc': ['Marketing_3'], 
                            'title': ['Assistant_3'], }, 
            'extra': 'CN=000000'    }},
 {'username4': {  'attributes': { 'AccountType': ['01'], 
                            'UnitCode': ['004'], 
                            'UnitDesc': ['Resources_4'], 
                            'title': ['Manager_4'], }, 
            'extra': 'CN=000000'    }}]


# Build one frame per user from its 'attributes' dict, stack them,
# then label the rows with the usernames and drop the unwanted columns
df = pd.DataFrame(x[0]['username1']['attributes'])
for i in range(1, len(x)):
    df = pd.concat([df, pd.DataFrame(x[i][list(x[i].keys())[0]]['attributes'])])
df.index = [list(x[i].keys())[0] for i in range(len(x))]
df.drop(['UnitDesc', 'title'], axis=1, inplace=True)



          AccountType UnitCode
username1          01      001
username2          01      002
username3          01      003
username4          01      004
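For reference, the loop can also be avoided: collect just the 'attributes' sub-dicts into one mapping and hand it to `DataFrame.from_dict`. A sketch using the same list shape as above, trimmed to two users for brevity:

```python
import pandas as pd

x = [
    {'username1': {'attributes': {'AccountType': ['01'], 'UnitCode': ['001'],
                                  'UnitDesc': ['Marketing'], 'title': ['Assistant']},
                   'extra': 'CN=000000'}},
    {'username2': {'attributes': {'AccountType': ['01'], 'UnitCode': ['002'],
                                  'UnitDesc': ['Resources'], 'title': ['Manager']},
                   'extra': 'CN=000000'}},
]

# One row per username, columns from the attribute keys; 'extra' never enters
records = {user: rec['attributes'] for d in x for user, rec in d.items()}
df = pd.DataFrame.from_dict(records, orient='index')[['AccountType', 'UnitCode']]
df = df.apply(lambda col: col.str[0])  # unwrap the one-element lists
print(df)
```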

...another option

# Collect the per-user records from the original dictionary
dict_list = (mydict['username1'], mydict['username2'])

# Required column names for the dataframe
attribute_list = ('AccountType', 'UnitCode')

# Required index for the dataframe
username_list = ('username1', 'username2')

# Create empty dataframe (a list, not a set, keeps the column order stable)
df = pd.DataFrame(columns=['AccountType', 'UnitCode'], index=username_list)

# Fill dataframe (df.loc avoids chained-assignment warnings)
for num, username in enumerate(username_list):
    for attribute in attribute_list:
        df.loc[username, attribute] = str(dict_list[num]['attributes'][attribute])

# Clean data: strip the "['" and "']" left over from str() on the lists
for col in df.columns:
    df[col] = df[col].map(lambda x: str(x)[:-2])
    df[col] = df[col].map(lambda x: str(x)[2:])
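The `str()` round trip and the slicing cleanup can both be skipped by taking the first element of each one-element list while filling the frame. A minimal sketch, assuming the question's `mydict` shape:

```python
import pandas as pd

# Stand-in for the question's dictionary
mydict = {
    'username1': {'attributes': {'AccountType': ['01'], 'UnitCode': ['001'],
                                 'UnitDesc': ['Marketing'], 'title': ['Assistant']},
                  'extra': 'CN=000000'},
    'username2': {'attributes': {'AccountType': ['01'], 'UnitCode': ['002'],
                                 'UnitDesc': ['Resources'], 'title': ['Manager']},
                  'extra': 'CN=000000'},
}

attribute_list = ('AccountType', 'UnitCode')

# Indexing with [0] unwraps each one-element list, so no string cleanup is needed
df = pd.DataFrame(
    {attr: [rec['attributes'][attr][0] for rec in mydict.values()]
     for attr in attribute_list},
    index=list(mydict),
)
print(df)
```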
Dean