I'm trying to loop over my data files in Python and keep only the rows that have a value of 5, 6, 7, 8, or 9 in a certain column.

var = [5, 6, 7, 8, 9]

import glob
import numpy as np

filname = glob.glob(''+fildir+'*')
for k in filname:
    data = np.genfromtxt(k,skip_header=6,usecols=(2,3,4,5,8,9,10,11))
    if data[:,1] not in var:
        continue

"fildir" is just the directory where all of my files are located. data[:,1] has values that range from 1 to 15 and, as I said, I just want to keep the rows with values 5-9. When I run this code I get:

 ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Any helpful hints?

Martijn Pieters
DJV

1 Answer

Your first problem is that you are trying to evaluate the boolean value of a NumPy array.

if data[:,1] not in var:

In your example, data[:,1] is the collection of all second-column values from your file, read as floats. So, disregarding your usecols= for the moment, a file that contains

1 2 3 4 5 6 7 8
9 10 11 12 13 14 15 16

will in your example produce

>>> data[:,1]
array([ 2., 10.])

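which is a whole column rather than a single value, so it is not what you want to check anyway. The error occurs because "not in" compares that array against each element of var, and each comparison yields an array of booleans instead of a single True or False. This is explained in more depth here.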
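To make the error above concrete, here is a small sketch. The column values are made up for illustration, and np.isin is my suggestion for an element-wise membership test, not something taken from the question.

```python
import numpy as np

var = [5, 6, 7, 8, 9]
col = np.array([2.0, 10.0, 5.0, 7.0])  # made-up column values for illustration

# Membership has to be tested element by element; np.isin returns a boolean mask
mask = np.isin(col, var)

# any()/all() collapse the mask to a single True/False, which is what `if` needs
has_match = mask.any()
all_match = mask.all()

# `col not in var` raises, because each comparison yields an array, not a bool
try:
    col not in var
    raised = False
except ValueError:
    raised = True
```

So a.any() or a.all() makes the error go away, but neither one filters rows; for that you need the mask itself.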

If all you want to do is store a list of all rows that have a value from the var list in their second column, I'd suggest a simple approach.

from glob import glob
import numpy as np

var = [5, 6, 7, 8, 9]

filname = glob('fildir/*')  # 'fildir' stands for your data directory

# read all the rows from every file in the folder
# use int as dtype because float is default
# I'm not an expert on numpy so I'll .tolist() the arrays
data = [np.genfromtxt(k, dtype=int, skip_header=6).tolist() for k in filname]

# flatten the list to have all the rows file agnostic
data = [x for sl in data for x in sl]

# keep only the rows whose second column is in var
# (a list comprehension gives a list on both Python 2 and Python 3)
filtered_data = [x for x in data if x[1] in var]

There is probably a more numPythonic way to do this depending on the data structure you want to keep your rows in, but this one is very simple.
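For what it's worth, a more NumPy-style version could look roughly like the sketch below. It keeps the rows in an array and selects them with a boolean mask. The helper name keep_rows and the in-memory sample file are hypothetical, used only so the example runs on its own. I'm also assuming the column of interest is index 1 of whatever array you loaded.

```python
import io
import numpy as np

var = [5, 6, 7, 8, 9]

def keep_rows(data, allowed, col=1):
    """Return only the rows of a 2-D array whose value in `col` is in `allowed`."""
    data = np.atleast_2d(data)             # a one-line file comes back 1-D
    mask = np.isin(data[:, col], allowed)  # one True/False per row
    return data[mask]

# Stand-in for one data file (made-up contents); a real path works the same way
sample = io.StringIO("1 2 3 4\n5 6 7 8\n9 10 11 12\n13 7 15 16\n")
data = np.genfromtxt(sample, dtype=int)
filtered = keep_rows(data, var)
```

With real files you would call keep_rows on the result of np.genfromtxt for each path in filname. If all files have the same number of columns, np.vstack can stack the per-file results into one array.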

jhnwsk