This should do it:
portfolios = []
for _, account in df.groupby('AccountName'):
portfolio = {stock['StockName']: stock['Allocation']
for _, stock in account.iterrows()}
portfolios.append(portfolio)
First use the groupby()
function to group the rows of the dataframe by AccountName
. To access the individual rows (stocks) for each account, you use the iterrows()
method. As user @ebb-earl-co explained in the comments, the _
is there as a placeholder variable, because iterrows()
returns (index, Series
) tuples, and we only need the Series
(the rows themselves). From there, use a dict comprehension to create a dictionary mapping StockName
-> Allocation
for each stock. Finally, append that dictionary to the list of portfolios
, resulting in the expected output:
[{'ABC': 0.4, 'ABD': 0.6}, {'EFG': 0.5, 'HIJ': 0.4, 'LMN': 0.1}]
One more thing: if you decide later that you want to label each dict in the portfolios
with the account name, you could do it like this:
portfolios = []
for acct_name, account in df.groupby('AccountName'):
portfolio = {stock['StockName']: stock['Allocation']
for _, stock in account.iterrows()}
portfolios.append({acct_name: portfolio})
This will return a list of nested dicts like this:
[{'MN001': {'ABC': 0.4, 'ABD': 0.6}},
{'MN002': {'EFG': 0.5, 'HIJ': 0.4, 'LMN': 0.1}}]
Note that in this case, I used the variable acct_name
instead of assigning to _
because we actually will use the index to "label" the dicts in the portfolios
list.