1

I have a dataframe of Date, Buy, and Sell values and doing a pivot_table on that dataframe to get all Buy/Sell value of each date which works.

data = [('20170325', 'Buy', 400 ),
        ('20170325', 'Buy', 401 ),
        ('20170323', 'Buy', 400 ),
        ('20170324', 'Sell', 400 )]

testDf = pd.DataFrame(data, columns=['Date', 'Scenario', 'Value'])
df1 = pd.pivot_table(testDf, columns='Scenario', index='Date', values='Value', 
                     fill_value = '', aggfunc=lambda x: x.tolist() if len(x)>1 else x)
df1
#    Scenario         Buy Sell
#    Date                     
#    20170323         400     
#    20170324              400
#    20170325  [400, 401]

But when my data changes a bit, pivot_table throws error: ValueError: Function does not reduce. Not able to understand why. Here is the error with different data. Please note that Date of 20170325 is changed to 20170321.

data = [('20170321', 'Buy', 400 ),
        ('20170321', 'Buy', 401 ),
        ('20170323', 'Buy', 400 ),
        ('20170324', 'Sell', 400 )]

testDf = pd.DataFrame(data, columns=['Date', 'Scenario', 'Value'])
df1 = pd.pivot_table(testDf, columns='Scenario', index='Date', values='Value', 
                     fill_value = '',aggfunc=lambda x: x.tolist() if len(x)>1 else x)

Traceback (most recent call last): File "", line 1, in
File "ext2\vc12_win32\lib\python2.7\site-packages\pandas\tools\pivot.py", line 114, in pivot_table
File "ext2\vc12_win32\lib\python2.7\site-packages\pandas\core\groupby.py", line 729, in agg
File "ext2\vc12_win32\lib\python2.7\site-packages\pandas\core\groupby.py", line 2978, in aggregate
File "ext2\vc12_win32\lib\python2.7\site-packages\pandas\core\groupby.py", line 1227, in _python_agg_general
File "ext2\vc12_win32\lib\python2.7\site-packages\pandas\core\groupby.py", line 1733, in agg_series
File "ext2\vc12_win32\lib\python2.7\site-packages\pandas\core\groupby.py", line 1767, in _aggregate_series_pure_python
ValueError: Function does not reduce

Parfait
  • 104,375
  • 17
  • 94
  • 125
Nilanjan
  • 176
  • 8

2 Answers2

1

Consider even a groupby with pivot:

df1 = testDf.groupby(['Date', 'Scenario'])['Value'].apply(lambda x: list(x)).reset_index()\
            .pivot(index='Date', columns='Scenario', values='Value').fillna('')

# Scenario         Buy   Sell
# Date                       
# 20170323       [400]       
# 20170324              [400]
# 20170325  [400, 401]       

# Scenario         Buy   Sell
# Date                       
# 20170321  [400, 401]       
# 20170323       [400]       
# 20170324              [400]
Parfait
  • 104,375
  • 17
  • 94
  • 125
0

I got the same error with your first dataset also. I believe this is due to the duplicate values on a day.

This appears to be a feature of pandas from this SO post.

Let's try to convert your tolist() to use tuple instead.

df1 = pd.pivot_table(testDf, columns='Scenario', index='Date', values='Value', fill_value = '',aggfunc=lambda x: tuple(x))
Community
  • 1
  • 1
Scott Boston
  • 147,308
  • 15
  • 139
  • 187
  • But one idea, if use `lambda x: tuple(x) if len(x)>1 else x)` How does it work? – jezrael Apr 19 '17 at 14:36
  • I don't think it will work because of mixed types maybe and still not reducing. Nope that conditional at the end kills it also. – Scott Boston Apr 19 '17 at 14:37
  • @ScottBoston Tuple works. But wondering why list does not. Strange that I am not getting any error with my first data set. May be something to do with pandas version? I am using v0.17.1. – Nilanjan Apr 20 '17 at 05:57
  • @ScottBoston Further looking at the last comment of the post you shared, found that "This is a misfeature in DataFrame. If the aggregator returns a list for the first group, it will fail with the error you mention; if it returns a non-list (non-Series) for the first group, it will work fine." This kind of explains the behavior I am seeing. pivot_table sorts the index involved. In my first dataset first group is with date '20170323' and aggregator returns single value. Hence, it works. For my second dataset first group is with date '20170321' and aggregator returns a list. Hence the error. – Nilanjan Apr 20 '17 at 06:02