Let's say I have the following data, with a sparse vector column (features) and an integer column (dt_diff). I need to group the rows by cuid and sum these values:
+----------+---------------------------------------+---------+
| cuid     | features                              | dt_diff |
+----------+---------------------------------------+---------+
| 12654467 | (2013492,[1743933,2013491],[2.0,2.0]) | 4       |
| 12654467 | (1876451,[1000000,1876451],[5.0,7.0]) | 10      |
+----------+---------------------------------------+---------+
So the expected output is:
+----------+---------------------------------------+---------+
| cuid     | features                              | dt_diff |
+----------+---------------------------------------+---------+
| 12654467 | (3889943,[2743933,3889942],[7.0,9.0]) | 14      |
+----------+---------------------------------------+---------+
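
To make the intended aggregation concrete, here is a minimal plain-Python sketch of what the expected output seems to describe. It is not Spark code: each sparse vector is modelled as a (size, indices, values) tuple, and the helper name `sum_by_cuid` is hypothetical. It reads the expected row literally, so the sizes, the indices, and the values are each summed position by position, which differs from ordinary vector addition. It also assumes every vector in a group stores the same number of entries.

```python
from collections import defaultdict

# Rows as plain tuples: (cuid, (size, indices, values), dt_diff).
# This mirrors the input table; it is not a Spark DataFrame.
rows = [
    (12654467, (2013492, [1743933, 2013491], [2.0, 2.0]), 4),
    (12654467, (1876451, [1000000, 1876451], [5.0, 7.0]), 10),
]


def sum_by_cuid(rows):
    """Group rows by cuid and sum size, indices, values and dt_diff.

    Hypothetical helper: indices and values are summed position by
    position, as in the expected output of the question.
    """
    groups = defaultdict(list)
    for cuid, features, dt_diff in rows:
        groups[cuid].append((features, dt_diff))

    result = {}
    for cuid, items in groups.items():
        vectors = [features for features, _ in items]
        total_size = sum(size for size, _, _ in vectors)
        # zip(*...) pairs up the entries at the same position in each vector.
        total_indices = [sum(pos) for pos in zip(*(idx for _, idx, _ in vectors))]
        total_values = [sum(pos) for pos in zip(*(val for _, _, val in vectors))]
        total_dt_diff = sum(dt_diff for _, dt_diff in items)
        result[cuid] = ((total_size, total_indices, total_values), total_dt_diff)
    return result


aggregated = sum_by_cuid(rows)
print(aggregated)
```

For the two input rows above, this gives the same size, indices, values and dt_diff as the expected output row.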