0

Let's say I have a file 'test.csv' containing the following headers and data:

h1  c1  h2  h3  c2
1   0   2   3   1
3   0   2   1   0
0   1   2   3   3

What is the best option in python to only select and save the columns of interest and discard all others?

Assuming I'm only interested in saving the h columns, I thought of something along these lines:

f = open('test.csv')
s = save('new_test.csv', data = f, saveColumns=['h1','h2','h3'])´

n = load('new_test.csv')
print n

h1  h2  h3
1   2   3
3   2   1
0   2   3
HappyPy
  • 9,839
  • 13
  • 46
  • 68
  • Is this a numpy thing? – Jiminion Aug 08 '13 at 18:51
  • This [post](http://stackoverflow.com/questions/16503560/read-specific-columns-from-csv-file-with-python-csv?answertab=active#tab-top) accomplishes something very similar. – Alex Aug 08 '13 at 18:53
  • If no lumpy, I'd make a dictionary for each heading, and load the values as a list for the dictionary. Then at the end, just print out the valid entries. You will need a list of the dictionaries at the start to keep the column order. – Jiminion Aug 08 '13 at 18:54
  • You can pass the data to python with awk and only give it the column you want. – Rocky Pulley Aug 08 '13 at 18:55
  • but in the end, numpy = useful + so much easier – sihrc Aug 08 '13 at 19:03

3 Answers3

1
f = open("test.csv")
header = {i: x for i, x in enumerate(f.readline().split())}
columns = ('h1','h2','h3')
for l in f:
    print [x for i, x in enumerate(l.split()) if header[i] in columns]
Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
1

I found a very straightforward way of doing this:

import pandas as pd
selectColumns = ['h1','h2','h3']
table = pd.read_csv('test.csv')
tableNew = table[selectColumns]
pd.to_csv('tableNew')
HappyPy
  • 9,839
  • 13
  • 46
  • 68
0
>>> d=csv.DictReader(open("some.csv"),delimiter="\t")
>>> fields = ["h1","h2","h3"]
>>> new_rows = [[row[f] for f in fields] for row in d]
>>> d=csv.DictWriter(open("new_csv.csv","w"),fields,delimiter="\t")
>>> d.writerows(new_rows)
Joran Beasley
  • 110,522
  • 12
  • 160
  • 179