Quick summary of my problem: ghGby is a dictionary whose keys each map to a DataFrame groupby. I would like to split each groupby into a list of sub-groups, one for each distinct day in its DAY column.
groupDays = {}
groupDayIndexer = []
for x in ghGby.keys():
    for y in ghGby[x].DAY.unique().keys():
        if x in groupDays:
            groupDays[x].append(ghGby[x].get_group(y))
        else:
            groupDays[x] = ghGby[x].get_group(y)
Data looks like this:
ghGby[keyA]:
day | transaction | etc.
51 | ......... | ...
51 | ......... | ...
63 | ......... | ...
63 | ......... | ...
63 | ......... | ...
94 | ......... | ...
.get_group(y) returns each day's rows as an individual object just fine, but after appending them to groupDays I only get one day per key, rather than every day like this:
print(groupDays['keyA'])
{keyA: [day51GroupBy, day63GroupBy, day94GroupBy]}
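A minimal sketch of one likely cause, assuming ghGby maps each key to a DataFrame grouped by DAY (the sample frame and the key name below are made up for illustration): the else branch stores a bare DataFrame rather than a list, and DataFrame.append, in the pandas versions that still have it, returns a new frame instead of modifying the stored one. Building a list per key keeps one frame for every day.

```python
import pandas as pd

# Hypothetical sample data, mirroring the layout shown in this post
df = pd.DataFrame({
    "household_key": [1929, 1929, 1929, 1929, 1929],
    "DAY": [4, 4, 95, 202, 207],
    "PRODUCT_ID": [1004906] * 5,
})

# Assumed shape of ghGby: key -> DataFrame grouped by DAY
ghGby = {"household_key1929": df.groupby("DAY")}

groupDays = {}
for key, gby in ghGby.items():
    # Iterating a groupby yields (day, frame) pairs in sorted day order;
    # collecting the frames in a list keeps every day, not just the first.
    groupDays[key] = [frame for _, frame in gby]

print(len(groupDays["household_key1929"]))
print(groupDays["household_key1929"][0])
```

With this layout, groupDays[key][0] is the frame for the earliest day under that key.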
More background information:
The original dataset looks like this, only with many thousands of household_keys. My objective is to access a subset of this large dataset by specifying the desired day and household key. Because these are transactions, the same key can have multiple entries on the same day.
household_key DAY PRODUCT_ID
1929 4 1004906
1929 4 1004906
1929 95 1004906
1929 202 1004906
1929 207 1004906
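Given that objective, one alternative that may be worth considering is grouping by both columns at once, so that a (household_key, DAY) pair selects the rows directly. This is only a sketch: the frame name df and the variable by_key_day are hypothetical, and the sample rows are copied from the excerpt above.

```python
import pandas as pd

# Hypothetical frame holding the excerpt shown above
df = pd.DataFrame({
    "household_key": [1929, 1929, 1929, 1929, 1929],
    "DAY": [4, 4, 95, 202, 207],
    "PRODUCT_ID": [1004906] * 5,
})

# Group on both columns; each group is identified by a (household_key, DAY) tuple
by_key_day = df.groupby(["household_key", "DAY"])

# Select all transactions for household 1929 on day 4
day4 = by_key_day.get_group((1929, 4))
print(day4)
```

This selects by the day's value rather than by its position in a list, which avoids having to know which list index a given day ended up at.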
My desired output:
print(groupDays['household_key1929'])
[ghGby['household_key1929'].get_group(day4), ghGby['household_key1929'].get_group(day52), ghGby['household_key1929'].get_group(day95), ghGby['household_key1929'].get_group(day202)]
I would like to do this so that I can access my data more easily, like this:
display(groupDays['household_key1929'][0])
household_key DAY PRODUCT_ID
1929 4 1004906
1929 4 1004906
Here I am accessing the first element of the list of days associated with household key 1929, which in this case is day 4.