1

so what i have is the following:

test_df = pd.DataFrame({"index":[1,2,3,1],"columns":[5,6,7,5],"values":[9,9,9,9]})

  index columns values
0   1     5       9
1   2     6       9
2   3     7       9
3   1     5       9

i would like the following, the index cols as my index, the columns cols as the columns and the values aggregated in their respective fields, like this:

  5    6   7
1 18  nan nan 
2 nan  9  nan
3 nan nan  9

thank you!!

EDIT: sorry i made i mistake. the value columns are also categorical, and i need their individual values.. so instead of 18 it should be something like [9:2,10:0,11:0] (assuming the possible value categoricals are 9,10,11)

roirodriguez
  • 1,685
  • 2
  • 17
  • 31
david
  • 97
  • 4

1 Answers1

0

What about?:

test_df.pivot_table(values='values', index='index', columns='columns', aggfunc='sum')

Also: This is just about reading the manual here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot_table.html. I suspect you've to read better about the 'aggfunc' param.

roirodriguez
  • 1,685
  • 2
  • 17
  • 31
  • sorry i made i mistake. the value columns are also categorical, and i need their individual values.. so instead of 18 it shoule either be [9,9] or something like [9:2,10:0,11:0] (assuming the possible value categoricals are 9,10,11) – david Jan 09 '20 at 12:21
  • Sorry, my original answer didn't work, i've just edited. Try that one. – roirodriguez Jan 09 '20 at 12:22
  • Edited again to include aggregation in response to your edition. – roirodriguez Jan 09 '20 at 12:24
  • Also, i see that this is a duplicate of https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe, take a look there. I think you should mark this to be deleted if that question answers yours, maybe a more experienced user can comment on this. – roirodriguez Jan 09 '20 at 12:25
  • i want to count the categorical value features not aggregate them, its not a duplicate of the other post – david Jan 09 '20 at 12:27
  • Ok, so the answer i gave works, i've tested it gives your desidered output. Does it work for you? – roirodriguez Jan 09 '20 at 12:28
  • hey thanks for your help. i fixed it with groupby(["index","columns"])["values].apply(lambda x: x.value_counts()).reset_index() first – david Jan 09 '20 at 12:34
  • I used the line you've just posted and that doesn't match the expected output in your question. I do not mind if you just leave this not marked as correct even if it is, but i think you may want to read https://stackoverflow.com/help/how-to-ask and some other resources out there. – roirodriguez Jan 09 '20 at 12:44
  • i made a mistake when i posted my initial question, i updated it a second after you first posted. i dont want to aggregate by summing, i want to count the unique entries in the values – david Jan 09 '20 at 12:56
  • after grouping i can then use a pivot table and turn the new "level_2" column into the columns. that would have been the solution. took me 2 lines so i am sure it could have been done better – david Jan 09 '20 at 13:11
  • OK, perfect. I think you should've edited your original question to reflect the new expected output too. Also you can answer your own question and mark it as correct (you can try to do that with this one if you wish). – roirodriguez Jan 09 '20 at 13:15
  • As a final comment, if you let me, questions here normally suppose some previous work. If this is a question you solved by yourself in 15 minutes or so, probably it doesn't need to be asked :) As always, read the manuals first and try your best. If it doesn't work then maybe you may ask here. – roirodriguez Jan 09 '20 at 13:18
  • i cant endit anymore: it doesnt even need that much: test_df.pivot_table(index = "index",columns = "columns", values = "values",aggfunc= lambda x: Counter(x)) – david Jan 10 '20 at 07:22