This can be achieved conveniently with pandas's groupby or filtering capabilities. For a filtering-based solution, first retrieve all unique values of the column via pandas.Series.unique(). Using those unique values, you can then filter the dataframe on the column Timestart by passing a boolean expression to .loc[].
In the following example, the filtered results are stored in a dict called data, though a nested list or a similar structure would work as well.
import pandas as pd
# Reads the csv-file
df = pd.read_csv("testing.csv")
# Find all unique values in the Timestart-column
timestart_unique = df.Timestart.unique()
# dict indexed by unique Timestart-values
data = {}
# Populates the dict by filtering
for t in timestart_unique:
    data[t] = df.loc[df.Timestart == t]
Accessing data[1800] will then give you something like:
   id     Vin    Make  Timestart  Timestop
0   1   12345  London       1800      1845
1   2   23456      NY       1800      1845
3   4  345678  London       1800      1845
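The groupby route mentioned at the top could look roughly like the sketch below. The inline CSV is made-up sample data standing in for testing.csv; in particular, the row with Timestart 1900 is invented for illustration, since the original file is not shown.

```python
import io

import pandas as pd

# Hypothetical stand-in for testing.csv; the 1900 row is an assumption.
csv_text = """id,Vin,Make,Timestart,Timestop
1,12345,London,1800,1845
2,23456,NY,1800,1845
3,34567,Paris,1900,1945
4,345678,London,1800,1845
"""
df = pd.read_csv(io.StringIO(csv_text))

# Iterating over a groupby object yields (key, sub-dataframe) pairs,
# so a dict comprehension builds the same mapping as the filtering loop.
data = {t: group for t, group in df.groupby("Timestart")}

print(data[1800])
```

Each sub-dataframe keeps the original row index, so data[1800] should match the result of the filtering approach.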