For this pandas DataFrame (which is in reality much longer), I would like to get, for each day, the rows (b and date) where b is at its minimum and where b is at its maximum. Performance is an issue.
b date
0 1 1999-12-29 23:59:12
1 2 1999-12-29 23:59:13
2 3 1999-12-29 23:59:14
3 3 1999-12-30 23:59:12
4 1 1999-12-30 23:59:13
5 2 1999-12-30 23:59:14
6 2 1999-12-31 23:59:12
7 3 1999-12-31 23:59:13
8 1 1999-12-31 23:59:14
So I would like to get
b date
0 1 1999-12-29 23:59:12
2 3 1999-12-29 23:59:14
3 3 1999-12-30 23:59:12
4 1 1999-12-30 23:59:13
7 3 1999-12-31 23:59:13
8 1 1999-12-31 23:59:14
This is how the dataframe gets generated:
import pandas as pd

df = pd.DataFrame({
    "a": ["29.12.1999 23:59:12",
          "29.12.1999 23:59:13",
          "29.12.1999 23:59:14",
          "30.12.1999 23:59:12",
          "30.12.1999 23:59:13",
          "30.12.1999 23:59:14",
          "31.12.1999 23:59:12",
          "31.12.1999 23:59:13",
          "31.12.1999 23:59:14"],
    "b": [1, 2, 3, 3, 1, 2, 2, 3, 1],
})
# The strings are day-first, so the format is given explicitly
# rather than left to inference.
df["date"] = pd.to_datetime(df["a"], format="%d.%m.%Y %H:%M:%S")
df = df.drop(columns=["a"])
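To make the goal concrete, here is a minimal sketch of one way the desired rows could be selected. It is not necessarily the fastest option. It assumes that "day" means the calendar date of the date column, and that when several rows tie for the minimum or maximum on a day, the first one is enough. The names day, grouped, idx and result are only illustrative.

```python
import pandas as pd

# Rebuild the example frame so the sketch runs on its own.
df = pd.DataFrame({
    "b": [1, 2, 3, 3, 1, 2, 2, 3, 1],
    "date": pd.to_datetime(
        ["29.12.1999 23:59:12", "29.12.1999 23:59:13", "29.12.1999 23:59:14",
         "30.12.1999 23:59:12", "30.12.1999 23:59:13", "30.12.1999 23:59:14",
         "31.12.1999 23:59:12", "31.12.1999 23:59:13", "31.12.1999 23:59:14"],
        format="%d.%m.%Y %H:%M:%S",
    ),
})

# Assumption: a "day" is the calendar date, so the time part is truncated.
day = df["date"].dt.normalize()
grouped = df.groupby(day)["b"]

# idxmin/idxmax give the index label of the first min/max row per day.
idx = pd.concat([grouped.idxmin(), grouped.idxmax()])

# Drop duplicates (a day with a single row is both min and max),
# then restore the original row order.
result = df.loc[idx.drop_duplicates().sort_values()]
print(result)
```

On the example data this selects the rows with index 0, 2, 3, 4, 7 and 8, which matches the expected output above.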