2

I have a pandas table with 3 columns: parent_male, parent_female, offsprings - all strings. I want to create a simple sparse crosstab table of male vs female and the offsprings as the values - how can I write an aggfunc that do so. (no real aggregation is needed) - just put an empty string in the blanks.

Dror Hilman
  • 6,837
  • 9
  • 39
  • 56

2 Answers2

6

IIUC you need pivot:

df = df.pivot(index='parent_male', columns='parent_female', values='offsprings')

If get error:

ValueError: Index contains duplicate entries, cannot reshape

use pivot_table

So final solution is:

ct = pd.pivot_table(d['male'], d['female'], d['offsprings'], aggfunc=','.join)
LW001
  • 2,452
  • 6
  • 27
  • 36
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • thanks - I had a problem with the values as strings, since the index contains duplicate entries. I found the solution below... – Dror Hilman Oct 30 '16 at 09:09
  • 1
    if have there problem you can use `aggfunc=','.join` ;) – jezrael Oct 30 '16 at 09:10
  • final solution, that works well thanks to you guys: ct = pd.crosstab(hyb['male'], hyb['female'], hyb['line'], aggfunc=','.join) – Dror Hilman Oct 30 '16 at 09:14
  • There is interesting [question](http://stackoverflow.com/q/36267745/2901002) about `crosstab` vs `pivot_table`. – jezrael Oct 30 '16 at 09:21
2

I found the answer here... Pandas Groupby Agg Function Does Not Reduce and I used the info. from the comments above...

ct = pd.crosstab(d['male'], d['female'], d['offsprings'], aggfunc=','.join)
Community
  • 1
  • 1
Dror Hilman
  • 6,837
  • 9
  • 39
  • 56