Why does pandas' assert_series_equal pass when comparing a Series of lists with a Series of tuples?

For example, this test passes:

import pandas as pd

l = pd.Series([[1], [2], [3]])      # Series of lists
t = pd.Series([(1,), (2,), (3,)])   # Series of tuples
pd.testing.assert_series_equal(l, t)  # no AssertionError is raised
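
For comparison, here is roughly the stricter check I expected. This is only a minimal sketch: the helper name `assert_series_equal_strict` is hypothetical and not part of pandas. It runs the normal comparison first and then compares the Python type of each pair of elements.

```python
import pandas as pd


def assert_series_equal_strict(left, right, **kwargs):
    """Hypothetical helper: pandas equality plus matching element types."""
    # First apply the usual pandas comparison (values, index, dtype, name).
    pd.testing.assert_series_equal(left, right, **kwargs)
    # Then compare the Python type of each element pairwise,
    # so that [1] and (1,) are no longer treated as interchangeable.
    for label, a, b in zip(left.index, left, right):
        if type(a) is not type(b):
            raise AssertionError(
                f"Element type mismatch at index {label!r}: "
                f"{type(a).__name__} != {type(b).__name__}"
            )
```

Called on `l` and `t` from the example, this raises an AssertionError because the elements are lists on one side and tuples on the other.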

I find this especially worrisome because lists and tuples are not interchangeable elsewhere in pandas: you can't aggregate a column of lists in a multi-indexed groupby if the aggregator returns a list for the first group. The same aggregation does work for a column of tuples.

Example:

>>> df = pd.DataFrame([[0, 0, 0], [1, 1, 2], [[1], [2], [3]], [(1,), (2,), (3,)]]).T
>>> df
   0  1    2     3
0  0  1  [1]  (1,)
1  0  1  [2]  (2,)
2  0  2  [3]  (3,)

>>> df.groupby([0, 1])[2].agg(sum)
ValueError: Function does not reduce

>>> df.groupby([0, 1])[3].agg(sum)
0  1
0  1    (1, 2)
   2      (3,)

See this answer for more detail

Jurgy