
I have a csv file as below:

[image: Weather]

I need to split the CSV file into multiple lists, grouping together the rows that have the same Timestart value. I am using the code below, but it isn't working.

from numpy import e, not_equal
import pandas as pd
import csv
data= pd.read_csv("testing.csv")
t = data.sort_values('Timestart')

for x in range(len(t.Timestart)):
   for y in range(1+len(t.Timestart)):
      if t.Timestart[x] == t.Timestart[y]:
         print('Vin with same timewindow is: ',t.Vin[y])

1 Answer


This can be conveniently achieved with pandas' groupby or filtering capabilities. For a filtering-based solution, first retrieve all unique values of the Timestart column via pandas.Series.unique(). Then, for each unique value, select the matching rows of the dataframe using the .loc[] indexer with a boolean expression.

In the following example, the filtered result is stored in a dict called data, which could just as well be a nested list or a similar structure.

import pandas as pd

# Reads the csv-file
df = pd.read_csv("testing.csv")

# Find all unique values in the Timestart column
timestart_unique = df.Timestart.unique()

# dict indexed by unique Timestart-values
data = {}  

# Populates the dict by filtering
for t in timestart_unique:
    data[t] = df.loc[df.Timestart == t]

Accessing data[1800] will then give you something like:


   id     Vin    Make  Timestart  Timestop
0   1   12345  London       1800      1845
1   2   23456      NY       1800      1845
3   4  345678  London       1800      1845
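
For comparison, here is a minimal sketch of the groupby route mentioned at the start of this answer. It is only an illustration under assumptions: the inline sample data stands in for testing.csv, the row with Timestart 1900 is made up so that there is a second group, and the name vins_by_start is hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for testing.csv;
# the row with Timestart 1900 is invented for illustration.
csv_text = """id,Vin,Make,Timestart,Timestop
1,12345,London,1800,1845
2,23456,NY,1800,1845
3,34567,Paris,1900,1945
4,345678,London,1800,1845
"""
df = pd.read_csv(io.StringIO(csv_text))

# One sub-DataFrame per unique Timestart value, keyed by that value
data = {t: group for t, group in df.groupby("Timestart")}

# Plain Python lists of the Vin values, one list per Timestart value
vins_by_start = df.groupby("Timestart")["Vin"].apply(list).to_dict()

print(data[1800])
print(vins_by_start)
```

With groupby, pandas does the splitting in a single pass, so there is no need to loop over the unique values and filter the dataframe once per value. If only the lists of Vin values are needed, as in the question, vins_by_start may be the more direct result.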
jgru