1

I am reading data from a text file and then I do a sort of random walk among the rows. How would you mark a row as "read"?

This is how I'm reading the data:

import pandas as pd
data = pd.read_csv('file.txt', sep=" ", header=None)
data.columns = ["A", "B", "C", "D", "E", "F", "G"]
Martijn Pieters
albus_c

2 Answers

3

Shuffle the dataframe with numpy using the technique in this question, then iterate over the rows.

So:

import numpy
import pandas as pd

df = pd.read_csv('file.txt', sep=" ", header=None)
df.columns = ["A", "B", "C", "D", "E", "F", "G"]
# Permute the row order; each row stays intact.
df = df.iloc[numpy.random.permutation(len(df))]

for index, row in df.iterrows():
    pass  # process row here
bananafish
  • Thanks bananafish. What I'm doing is not exactly a random walk: each row describes the position of a point on a plane, and I need to group the points by finding the curve they are part of. My plan is to a) start from the first line and search for the closest point that could be part of the curve; b) flag the line corresponding to the point just found. – albus_c Feb 04 '14 at 23:47
  • Oh. In that case, maybe add another column that you flag as true/false depending on whether you've processed it? – bananafish Feb 05 '14 at 00:30
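The plan described in the comments could be sketched roughly as below. This is only an illustration under assumptions not stated in the thread: columns "A" and "B" are taken to be the x and y coordinates, the helper name walk_nearest is made up, the frame is assumed non-empty, and "could be part of the curve" is reduced to plain Euclidean distance. The flags live in a boolean array rather than in the DataFrame itself.

```python
import numpy as np
import pandas as pd


def walk_nearest(df, x_col="A", y_col="B", start=0):
    """Visit rows by repeatedly jumping to the closest unvisited point.

    Returns the positional row indices in visiting order.
    """
    xy = df[[x_col, y_col]].to_numpy(dtype=float)
    visited = np.zeros(len(df), dtype=bool)  # one "read" flag per row
    order = [start]
    visited[start] = True
    current = start
    while not visited.all():
        # Distance from the current point to every row.
        dist = np.hypot(xy[:, 0] - xy[current, 0], xy[:, 1] - xy[current, 1])
        dist[visited] = np.inf  # never pick a row that is already flagged
        current = int(np.argmin(dist))
        visited[current] = True
        order.append(current)
    return order


# Hypothetical points: three along the x axis and one far away.
points = pd.DataFrame({"A": [0.0, 5.0, 1.0, 2.0], "B": [0.0, 5.0, 0.0, 0.0]})
order = walk_nearest(points)
```

A real curve finder would probably also need a distance cutoff so that a far-away point starts a new curve instead of being appended to the current one.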
0

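To add a column: data.insert(len(data.columns), "flag", 0). This appends a "flag" column after the existing seven; the position passed to insert cannot exceed the current number of columns. The initial 0 can be changed to 1 or another value later in the code.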
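A minimal sketch of how the flag column might be used afterwards. The sample values are invented stand-ins for the file contents, and the insert position is taken as len(data.columns) here so that the column lands after the last existing one.

```python
import pandas as pd

# Small stand-in for the file contents (hypothetical values).
data = pd.DataFrame(
    [[0, 1, 2, 3, 4, 5, 6],
     [7, 8, 9, 10, 11, 12, 13],
     [14, 15, 16, 17, 18, 19, 20]],
    columns=["A", "B", "C", "D", "E", "F", "G"],
)

# Append the flag column after the last existing column; 0 means "not read yet".
data.insert(len(data.columns), "flag", 0)

# Mark the row with index label 1 as read.
data.loc[1, "flag"] = 1

# Rows that still have to be visited.
unread = data[data["flag"] == 0]
```

Filtering on the flag like this gives the remaining candidates at each step without deleting anything from the original frame.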

albus_c