I have a pandas table with 3 columns: parent_male, parent_female, offsprings - all strings. I want to create a simple sparse crosstab table of male vs female and the offsprings as the values - how can I write an aggfunc that do so. (no real aggregation is needed) - just put an empty string in the blanks.
Asked
Active
Viewed 8,010 times
2 Answers
6
IIUC you need pivot
:
df = df.pivot(index='parent_male', columns='parent_female', values='offsprings')
If get error:
ValueError: Index contains duplicate entries, cannot reshape
use pivot_table
So final solution is:
ct = pd.pivot_table(d['male'], d['female'], d['offsprings'], aggfunc=','.join)
-
thanks - I had a problem with the values as strings, since the index contains duplicate entries. I found the solution below... – Dror Hilman Oct 30 '16 at 09:09
-
1if have there problem you can use `aggfunc=','.join` ;) – jezrael Oct 30 '16 at 09:10
-
final solution, that works well thanks to you guys: ct = pd.crosstab(hyb['male'], hyb['female'], hyb['line'], aggfunc=','.join) – Dror Hilman Oct 30 '16 at 09:14
-
There is interesting [question](http://stackoverflow.com/q/36267745/2901002) about `crosstab` vs `pivot_table`. – jezrael Oct 30 '16 at 09:21
2
I found the answer here... Pandas Groupby Agg Function Does Not Reduce and I used the info. from the comments above...
ct = pd.crosstab(d['male'], d['female'], d['offsprings'], aggfunc=','.join)

Community
- 1
- 1

Dror Hilman
- 6,837
- 9
- 39
- 56