I have data with timestamps like this:
timestamps | price |
---|---|
2021/11/8 9:00:00 | 63 |
2021/11/8 9:01:00 | 64 |
2021/11/8 9:02:00 | 65 |
2021/11/8 9:03:00 | 64 |
2021/11/13 10:02:00 | 58 |
2021/11/11 12:03:00 | 55 |
I can read these timestamps and convert them to datetime values in Python like this:
df["timestamp"] = pd.to_datetime(df['timestamp'])
I need to analyze the data for each date in a for-loop.
I think I need to do it in two steps. First, find all the dates and save them in a list (Date). Second, match each date in the list (Date) against the original data set to extract all of its prices. Does anyone know how to do these two steps?
Please note that:
1. This is a big data set, and the timestamps are not sorted. The dates do not increase at regular intervals.
2. The data has no fixed period, and I don't know the start time or the end time in advance. I could sort the timestamps first to find them, but I still wouldn't know how many dates there are, because the dates are not contiguous.
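To make the two steps concrete, here is a minimal sketch of what I have in mind, using the sample rows above. It assumes the column is named `timestamp` (as in my `to_datetime` line); the helper column `date` and the dictionary `prices_by_date` are names I made up for illustration. It does not need the data to be sorted or the date range to be known in advance.

```python
import pandas as pd

# Sample rows copied from the table above; column names are assumptions.
df = pd.DataFrame({
    "timestamp": [
        "2021/11/8 9:00:00", "2021/11/8 9:01:00", "2021/11/8 9:02:00",
        "2021/11/8 9:03:00", "2021/11/13 10:02:00", "2021/11/11 12:03:00",
    ],
    "price": [63, 64, 65, 64, 58, 55],
})
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Step 1: keep only the calendar date and collect the distinct dates.
df["date"] = df["timestamp"].dt.date  # hypothetical helper column
dates = sorted(df["date"].unique())

# Step 2: loop over the dates and pull out the prices for each one.
# groupby splits the frame once, so the full data set is not rescanned per date.
prices_by_date = {}
for date, group in df.groupby("date"):
    prices_by_date[date] = group["price"].tolist()

for date in dates:
    print(date, prices_by_date[date])
```

I am not sure this is the idiomatic way to do it; filtering with `df[df["date"] == date]` inside the loop would also work, but it rescans the whole frame for every date, which worries me for a big data set.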
Suppose I need to randomly choose 5 prices for each date and sort them by time, keeping only the date and dropping the hour, minute and second. Expected output:
timestamps | prices |
---|---|
2021/11/8 | 61 |
2021/11/8 | 63 |
2021/11/8 | 65 |
2021/11/8 | 61 |
2021/11/8 | 61 |
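For reference, this is a rough sketch of the sampling step as I understand it. The rows below are made-up illustration data (not my real data set), the column names `timestamp` and `price` are assumptions, and `random_state=0` is only there to make the run repeatable. Shuffling the rows and then taking the first 5 per date is one way to sample without replacement; a date with fewer than 5 rows simply keeps all of its rows.

```python
import pandas as pd

# Made-up illustration data: 8 rows on 11/8, 2 rows on 11/11, 1 row on 11/13.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(
        [f"2021/11/8 9:{m:02d}:00" for m in range(8)]
        + ["2021/11/13 10:02:00", "2021/11/11 12:03:00", "2021/11/11 12:04:00"]
    ),
    "price": [63, 64, 65, 64, 61, 62, 66, 60, 58, 55, 56],
})

# Shuffle all rows, then keep the first 5 rows of each calendar date.
shuffled = df.sample(frac=1, random_state=0)
picked = shuffled.groupby(shuffled["timestamp"].dt.date).head(5)

# Sort the picked rows by their full timestamp, then drop the time of day.
picked = picked.sort_values("timestamp")
result = (
    picked.assign(timestamps=picked["timestamp"].dt.date)
    .rename(columns={"price": "prices"})[["timestamps", "prices"]]
    .reset_index(drop=True)
)
print(result)
```

The dates print as `2021-11-08` rather than `2021/11/8`, since `dt.date` yields plain `datetime.date` objects; a different display format would need an extra formatting step.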