Selecting specific range of columns from .CSV file

Question

I have a CSV file which has 78000 columns. I am trying to select the columns 2-100, 102-200, and the last 300 columns. The rest of the columns need to be skipped.

I have used numpy.loadtxt to select a range of columns:

numpy.loadtxt(input_file_name, delimiter=",", skiprows = 1, usecols=range(1,99))

How can we select blocks of columns doing something similar, like:

numpy.loadtxt(input_file_name, delimiter=",", skiprows = 1, usecols=(range(1,99),range(101,199),range(74999,77999)))

This is a general duplicate, but I've added a numpy solution to my answer which should be useful to know. If anyone else wants to answer the question, ping me and I'll reopen it, as long as the solution isn't linked in the duplicate (i.e., a numpy solution). — cs95, Jan 19 '18 at 09:13
Thanks for accepting! You can also upvote answers if they were useful, so please consider doing so. Thanks. — cs95, Jan 19 '18 at 09:58

cs95 · Accepted Answer · 2018-01-19T11:56:52.117


Use np.r_, NumPy's index helper that concatenates several ranges or slices into a single array.

>>> np.r_[range(3), range(15, 18), range(100, 103)]

Or (using hpaulj's suggestion),

>>> np.r_[0:3, 15:18, 100:103]

array([  0,   1,   2,  15,  16,  17, 100, 101, 102])
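As a quick check of the two spellings, here is a minimal sketch (the variable names are only illustrative). It relies on slices inside np.r_ being half-open, just like range(), so 15:18 yields 15, 16, 17.

```python
import numpy as np

# Build the same index array two ways: from range objects and from slices.
from_ranges = np.r_[range(3), range(15, 18), range(100, 103)]
from_slices = np.r_[0:3, 15:18, 100:103]

# Both forms should produce identical index arrays.
assert np.array_equal(from_ranges, from_slices)
print(from_slices.tolist())
```

The slice form avoids creating intermediate range objects, but either one can be passed to usecols.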

For your code, this is how you'd call it -

import numpy as np

np.loadtxt(
  input_file_name,
  delimiter=",",
  skiprows=1,  # skip the header row
  usecols=np.r_[range(1, 99), range(101, 199), range(74999, 77999)]
)
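Note that the ranges above are copied from the question's code. If the columns in the question are counted from 1, range(1, 99) covers columns 2-99 and range(74999, 77999) covers 3000 columns, so the indices may need adjusting. The sketch below is one way to derive them; block_indices is a hypothetical helper (not part of NumPy), and the tiny in-memory CSV is made-up data standing in for the real file.

```python
import io
import numpy as np

def block_indices(n_cols, blocks, last):
    """Hypothetical helper: 0-based indices for 1-based inclusive
    column blocks, followed by the last `last` columns."""
    parts = [np.arange(start - 1, stop) for start, stop in blocks]
    parts.append(np.arange(n_cols - last, n_cols))
    return np.concatenate(parts)

# Made-up stand-in for the real file: a header row plus two rows of 12 columns.
csv_text = "\n".join([
    ",".join(f"c{i}" for i in range(1, 13)),
    ",".join(str(i) for i in range(1, 13)),
    ",".join(str(i * 10) for i in range(1, 13)),
])

# Columns 2-4, 6-7, and the last 3 columns (counted from 1).
cols = block_indices(n_cols=12, blocks=[(2, 4), (6, 7)], last=3)
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1, usecols=cols)
print(cols.tolist())
print(data.shape)
```

For the file in the question, assuming exactly 78000 columns counted from 1, block_indices(78000, [(2, 100), (102, 200)], 300) would give 498 indices to pass as usecols.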

edited Jan 19 '18 at 11:56

answered Jan 19 '18 at 09:07

cs95

`np.r_[0:3, 15:18, 100:103]` should also work. – hpaulj Jan 19 '18 at 11:56
@hpaulj Thanks, good one. Slipped my mind as I was focusing on OP's requirement with the range objects. – cs95 Jan 19 '18 at 11:57