I am trying to learn beginner data science and I have 2 datasets from which the 1st one is:
+----+-------+--------+------+------+-------+
| ID | bool | num1 | A | B | event |
+----+-------+--------+------+------+-------+
| a1 | TRUE | 123456 | 1001 | 1003 | 0 |
| a2 | FALSE | 123456 | 1006 | 1009 | 1 |
| a3 | TRUE | 144444 | 1020 | 1022 | 2 |
+----+-------+--------+------+------+-------+
and the 2nd one:
+----+--------+-------+------+----------+------+-------+------+
| ID | num1 | event | C | category | num2 | num3 | num4 |
+----+--------+-------+------+----------+------+-------+------+
| a1 | 123456 | 0 | 1002 | aa | 1.11 | -1.01 | 1.23 |
| a1 | 123456 | 0 | 1003 | bb | 3.21 | 2.92 | 4.03 |
| a2 | 144444 | 1 | 1008 | aa | 6.34 | 5.56 | 7.02 |
| a2 | 144444 | 1 | 1009 | aa | 5.65 | 3.99 | 6.32 |
+----+--------+-------+------+----------+------+-------+------+
From them I want to make the 3rd one like this where data is event column based:
+-------+----+-------+--------+-----------+------------+------------+------------+----------+----------+
| event | ID | bool | num1 | C values | count cat1 | count cat2 | count cat3 | min num2 | avg num2 |
+-------+----+-------+--------+-----------+------------+------------+------------+----------+----------+
| 0 | a1 | TRUE | 123456 | 1002:1003 | 1 | 1 | 0 | 1.11 | 2.16 |
| 1 | a2 | FALSE | 123456 | 1008:1009 | 2 | 0 | 0 | 5.65 | 5.995 |
| 2 | a3 | TRUE | 144444 | 1020 | 0 | 0 | 1 | 4.02 | 4.02 |
+-------+----+-------+--------+-----------+------------+------------+------------+----------+----------+
This is a reduced example. I have read about stacking, groupby, count based on another column, numpy.where, reshaping etc. but I've failed to combine them to achieve anything similar to what I want. What are the suggestions to solve this, starting from simple ones? Different solutions are welcome so I could try and understand them all. Using Python, Pandas.