2

I've loaded a CSV file with 4000 rows using pandas. It loaded correctly: when I print the DataFrame, all 4000 rows are printed. But when I iterate through the rows using a "for" loop, it only prints the first row in the file.

This is my code:

import pandas as pd

df = pd.read_csv('EX2_EM_GMM.csv')
for sample in df:
    print(sample)

Any ideas? Thanks!

amin
Tal Geller
    Possible duplicate of [How to iterate over rows in a DataFrame in Pandas?](http://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas) – amin Dec 08 '16 at 21:53

3 Answers

4

To iterate over DataFrame rows, you can use the .iterrows() method.

for index, row in df.iterrows():
    # Process each row; row is a Series indexed by column label
    print(row)
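
As a minimal sketch of what the loop body might look like, the example below uses a small hypothetical DataFrame (the column names "x" and "y" are assumptions, not taken from the asker's CSV) and reads individual values from each row by column label:

```python
import pandas as pd

# Hypothetical stand-in for the data loaded from the CSV file
df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [10.0, 20.0, 30.0]})

totals = []
for index, row in df.iterrows():
    # row is a pandas Series, so values are looked up by column label
    totals.append(float(row["x"] + row["y"]))

print(totals)
```

Each iteration yields a pair of the row's index label and the row itself, which is why the loop unpacks two names.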
amin
2

In your case, I think the following examples provide a solution and also show the execution time of each approach. For this number of rows I would use itertuples().

itertuples() and iterrows()

import pandas as pd
import numpy as np

di = {k:np.random.randn(4000) for k in ["a", "b", "c", "d"]}
df = pd.DataFrame(di)

for row in df.itertuples():
    print(row)

# %timeit is an IPython/Jupyter magic command, so run these lines in IPython or a notebook
%timeit [row for row in df.itertuples()]

%timeit [row for row in df.iterrows()]

output sample result of execution time
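
Outside IPython, a similar comparison can be sketched with the standard-library timeit module. The repeat count of 5 is an arbitrary assumption, and the measured times will vary by machine and pandas version, so no figures are quoted here:

```python
import timeit

import numpy as np
import pandas as pd

# Same shape as the example above: 4000 rows, four float columns
df = pd.DataFrame({k: np.random.randn(4000) for k in ["a", "b", "c", "d"]})

# Time five full passes over the DataFrame with each method
t_tuples = timeit.timeit(lambda: [row for row in df.itertuples()], number=5)
t_rows = timeit.timeit(lambda: [row for row in df.iterrows()], number=5)

print("itertuples: %.4f s" % t_tuples)
print("iterrows:   %.4f s" % t_rows)

# itertuples() yields namedtuples, so fields are read as attributes
first = next(df.itertuples())
print(first.Index, first.a)
```

The attribute access on the last lines is the main practical difference from iterrows(), where each row is a Series and values are read with row["a"].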

n1tk
0

Use iterrows() to loop over the rows. Iterating over a DataFrame directly yields only its column labels, not its rows.

for sample in df.iterrows():
    # each sample is an (index, row) tuple
    print(sample)
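
To make the difference concrete, here is a small sketch on a hypothetical two-column DataFrame (the labels "a" and "b" are assumptions for illustration), comparing plain iteration with iterrows():

```python
import pandas as pd

# Hypothetical two-row frame standing in for the CSV data
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Plain iteration walks the column labels
column_labels = [label for label in df]
print(column_labels)

# iterrows() walks the rows, one (index, Series) pair per row
row_count = sum(1 for _ in df.iterrows())
print(row_count)
```

This matches what the question describes: a plain loop over the DataFrame prints the header entries rather than the 4000 data rows.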
jeff carey