SO,

I have a CSV file with a varying number of columns which I think means that the traditional means of making headers as I have attempted below won't work...

The reason I want some headers is because it gets incredibly difficult to use csv files with 100+ columns and do things like rPol = math.exp((((-1.2359386)+float(row[48])*(row[33])+row[29]+row[13]+row[97]/row[50])))

Trying to remember the identity of each column is a nuisance, and it would be much easier if I could do something like: rPol = math.exp((((-1.2359386)+float(depolscore)*(master_num)+ripen+offset_score+full/epilic)))

import csv

reader = csv.reader(open("test.csv"), delimiter=",")
headers = {"data", "title", "here", "testing", "stackoverflow"}
csv.DictWriter(reader, headers)
reader.writeheader()
for row in reader:
    print testing

How would I go about giving specific columns a header without doing something like this:

for row in reader:
    # put the columns into variables...
    data = row[0]
    title = row[1]
    here = row[2]
    testing = row[3]
    stackoverflow = row[4]

    # Do math
    score = data * here / stackoverflow
    # Print for user sake
    print score
    # Change the testing value
    testing = testing + (score - title)

    # Put values back into the reader object?
    row[0] = data
    row[1] = title
    row[2] = here
    row[3] = testing
    row[4] = stackoverflow

Any ideas?

treddy
Dennis Sylvian
  • this post: [python enum](http://stackoverflow.com/questions/36932/how-can-i-represent-an-enum-in-python) might be helpful. – gongzhitaao Dec 19 '13 at 15:40
  • It looks like you're trying to reimplement something like `numpy` [structured arrays](http://docs.scipy.org/doc/numpy/user/basics.rec.html) or a [`pandas`](http://pandas.pydata.org) [`DataFrame`](http://pandas.pydata.org/pandas-docs/dev/dsintro.html#dataframe). If you're performing ops on named columns in tabular data, I'd recommend using `pandas`-- it'll make things you haven't even thought of easier. – DSM Dec 19 '13 at 15:43

1 Answer

You could try using a namedtuple (from the collections module). It's a subclass of tuple that can be built from any iterable and lets you access fields by name. The one gotcha to be aware of is that namedtuples, like tuples, are immutable, so you have to store the updated tuples somewhere:

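from collections import namedtuple

headers = ["data", "title", "here", "testing", "stackoverflow"]
Row = namedtuple('Row', headers)
# `reader` is the csv.reader object from the question
for raw_row in reader:
    # csv yields strings, so convert before doing math
    # (this assumes every column holds a number)
    row = Row._make(float(value) for value in raw_row)

    # Do math
    score = row.data * row.here / row.stackoverflow
    # Print for user sake
    print(score)
    # Change the testing value
    new_testing = row.testing + (score - row.title)
    new_row = row._replace(testing=new_testing)

    # Do something with new_row...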
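
Since the question mentions a varying number of columns, note that `Row._make` raises a `TypeError` when a row does not have exactly as many values as there are field names. One possible way around that, sketched below under some assumptions, is to name only the leading columns and carry any surplus values along separately. The sample data, the `named_rows` helper, and the `updated` list are made up for illustration, every cell is assumed to be numeric, and the snippet is written for Python 3.

```python
import csv
import io
from collections import namedtuple

# Hypothetical sample data; the second row has one extra trailing column.
SAMPLE = "1.0,2.0,4.0,8.0,2.0\n3.0,1.0,6.0,5.0,3.0,99.0\n"

headers = ["data", "title", "here", "testing", "stackoverflow"]
Row = namedtuple("Row", headers)


def named_rows(lines):
    """Yield (Row, extras) pairs; only the leading columns get names."""
    for raw_row in csv.reader(lines):
        values = [float(value) for value in raw_row]  # assumes numeric cells
        named, extras = values[:len(headers)], values[len(headers):]
        yield Row._make(named), extras


updated = []
for row, extras in named_rows(io.StringIO(SAMPLE)):
    score = row.data * row.here / row.stackoverflow
    new_row = row._replace(testing=row.testing + (score - row.title))
    # Tuples are immutable, so collect the replacement rows in a list.
    updated.append(list(new_row) + extras)

print(updated)
```

Rows with fewer values than `headers` would still fail in `_make`, so a real file with short rows needs padding or skipping first.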
munchybunch
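
As a side note on the `csv.DictWriter` attempt in the question: the standard library's `csv.DictReader` may be closer to what was intended, since it accepts a `fieldnames` list for files that have no header line. The sketch below is only an illustration with made-up sample data. It assumes numeric cells, uses a hypothetical `"extras"` key for surplus values, and simply skips rows that are too short. It is written for Python 3.

```python
import csv
import io

# Hypothetical sample data: a short row, an exact row, and a long row.
SAMPLE = "1.0,2.0,4.0\n1.0,2.0,4.0,8.0,2.0\n3.0,1.0,6.0,5.0,3.0,99.0\n"

headers = ["data", "title", "here", "testing", "stackoverflow"]

reader = csv.DictReader(
    io.StringIO(SAMPLE),
    fieldnames=headers,  # names are supplied here, not read from the file
    restkey="extras",    # surplus values are collected in a list under this key
)

rows = []
for row in reader:
    if row["stackoverflow"] is None:
        # Missing trailing columns are filled with None (the default restval);
        # this sketch skips such rows rather than guessing a value.
        continue
    extras = row.pop("extras", [])
    values = {name: float(text) for name, text in row.items()}
    score = values["data"] * values["here"] / values["stackoverflow"]
    # Dicts are mutable, so the value can be updated in place.
    values["testing"] += score - values["title"]
    rows.append((values, extras))

print(rows)
```

Unlike the namedtuple approach, each row here is a mutable dict, so nothing needs to be copied to change a value; the trade-off is `values["data"]` instead of `row.data`.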