How to save certain columns from csv file as variables in Python?

Question

I want to save and graph certain columns in Python using matplotlib. The column numbers will come from the command line, so I'll have to use sys.argv to obtain them. Here's what I have currently:

EDIT: I should also mention that the column numbers can vary depending on what the user chooses. For instance, they could choose just columns 1 and 2, or just column 1.

with open('./P14_data.csv', 'rb') as csvfile:
    data = csv.reader(csvfile, delimiter=';')
    cols = [index for index in sys.argv[1:]]

    #I want to extract the columns corresponding to cols

    for col in cols:
        x[col] = [rows for rows in data]
    print x

But this returns an empty list [].

As for the output, I'd like to graph each column as a one-dimensional array. So for instance, with a csv file of the form:

If a user inputs '1', I want my code to save only column one variables in an array:

data = [[1, 1, 0, 0,..]]

plt.plot(data)

I know pandas is a valid option, but I'd like to learn it this way first. Thanks!

http://stackoverflow.com/questions/16503560/read-specific-columns-from-a-csv-file-with-csv-module Check it out — Nikolay 'Alagunto' Tkachenko, May 03 '17 at 01:06
@kiran.koduru The expected output would be a graph of the column data, in other words `plt.plot(x)` followed by `plt.show()`, where x is a one-dimensional array of the column values, e.g. `[[4, 7, 3, 3]]` — Nikitau, May 03 '17 at 01:14
Could you add a subset of the input file and the expected output to your description? It would help us understand what you want. — EvensF, May 03 '17 at 05:12

score 0 · Answer 1 · 2017-05-03T01:25:45.673

You can take the first field of each row and split it on commas to get each column:

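import csv

# Python 2 code: the file is opened in binary mode and print is a statement.
with open('./P14_data.csv', 'rb') as csvfile:
    # The split(',') calls below imply the file is comma-separated, so with
    # delimiter=';' each line arrives as a single field.
    data = csv.reader(csvfile, delimiter=';')
    x = [[rows[0]] for rows in data]
    # Split that single field by hand to pull out each column.
    x1 = [[row[0].split(',')[0] for row in x]]
    x2 = [[row[0].split(',')[1] for row in x]]
    x3 = [[row[0].split(',')[2] for row in x]]

    print x1
    # [['4', '7', '3', '3']]

    print x2
    # [['9', '11', '5', '6']]

    print x3
    # [['5', '4', '2', '3']]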
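
Since the question says the chosen columns can vary, here is a minimal Python 3 sketch of one way to avoid hard-coding x1, x2, x3: keep the columns in a dict keyed by column number. The function name `read_columns` and the sample rows are hypothetical; the rows are reconstructed from the printed output above, on the assumption that the file is comma-separated and that column numbers are 1-based.

```python
import csv
import io


def read_columns(lines, wanted, delimiter=","):
    """Return {column_number: [values]} for the 1-based numbers in `wanted`."""
    columns = {n: [] for n in wanted}
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:  # skip blank lines
            continue
        for n in wanted:
            columns[n].append(float(row[n - 1]))
    return columns


# Sample rows reconstructed from the printed output above (an assumption
# about what P14_data.csv contains).
sample = io.StringIO("4,9,5\n7,11,4\n3,5,2\n3,6,3\n")

# In a script the numbers would come from the command line instead, e.g.
# wanted = [int(arg) for arg in sys.argv[1:]]
wanted = [1, 3]
columns = read_columns(sample, wanted)
print(columns[1])  # [4.0, 7.0, 3.0, 3.0]
print(columns[3])  # [5.0, 4.0, 2.0, 3.0]
```

Each list in `columns` is one-dimensional, so it could be handed to `plt.plot` directly. With a real file, `lines` would be the object returned by `open(path, newline='')`.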

edited May 03 '17 at 01:25

answered May 03 '17 at 01:20

Thanks! This is what I was looking for. Is there a way to iterate through all the columns and save them as different variables (i.e: x1, x2, x3 if there are 3 columns)? – Nikitau May 03 '17 at 01:22
@Nikitau I have updated my answer – May 03 '17 at 01:26
Sorry I keep bugging you, but I forgot to mention that the column numbers can vary. Is there any way to account for this? I'm guessing a valid way would be to first save it as a list of a list, and extract it later for plotting. – Nikitau May 03 '17 at 01:58

score 0 · Answer 2 · answered May 03 '17 at 01:26

Here is a method that saves the headings and data in separate arrays. To get each column, we take the transpose of the data array and select whichever columns we are interested in.

Here is data.csv:

index,value1,value2
0,10,20
1,12,18
2,5,6
3,9,10
4,11,8

And here is the code:

import matplotlib.pyplot as plt
import numpy as np
import csv

with open('data.csv','r') as csvfile:
    r = csv.reader(csvfile, delimiter=',')
    data = [i for i in r]

headings = data.pop(0)
# np.float was removed in NumPy 1.24; the built-in float does the same job
data = np.array([[float(j) for j in i] for i in data])

c1 = data.T[1]
c2 = data.T[2]

fig, ax = plt.subplots(1)
ax.plot(c1, label=headings[1])
ax.plot(c2, label=headings[2])
ax.legend()
plt.show()  # blocks until the window is closed when run as a script

And the plot:

Hi, this is great, but is there a placeholder I can use if there are no headings in the csv? The number of columns will vary, and there are no header names. — Nikitau, May 03 '17 at 02:14
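
For the headerless case raised in the comment, one possible approach is to generate placeholder labels from the column positions. The sketch below uses only the standard library; `load_with_labels`, the `has_header` flag and the "column N" label format are all hypothetical names chosen for illustration, and the sample data is the data.csv above with its header line removed.

```python
import csv
import io


def load_with_labels(lines, has_header, delimiter=","):
    """Return (labels, columns); labels are generated if there is no header."""
    rows = [row for row in csv.reader(lines, delimiter=delimiter) if row]
    if not rows:
        return [], []
    if has_header:
        labels = rows.pop(0)
    else:
        # Placeholder names based on position: "column 0", "column 1", ...
        labels = ["column {}".format(i) for i in range(len(rows[0]))]
    # zip(*rows) transposes the rows into columns
    columns = [[float(value) for value in col] for col in zip(*rows)]
    return labels, columns


headerless = io.StringIO("0,10,20\n1,12,18\n2,5,6\n")
labels, columns = load_with_labels(headerless, has_header=False)
print(labels)      # ['column 0', 'column 1', 'column 2']
print(columns[1])  # [10.0, 12.0, 5.0]
```

Each label and column pair could then be passed to `ax.plot(values, label=label)`, so the legend still works however many columns the file has.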

score 0 · Answer 3 · answered May 03 '17 at 01:32

You could try something like this:

#!/usr/bin/env python3

import csv

with open('./P14_data.csv', newline='') as csvfile:
    data = csv.reader(csvfile, delimiter=';')
    x = [rows for rows in data]
    transposed = list(zip(*x))
    print(transposed)

Or even simpler:

#!/usr/bin/env python3

import csv

with open('./P14_data.csv', newline='') as csvfile:
    data = csv.reader(csvfile, delimiter=';')
    transposed = list(zip(*data))
    print(transposed)

Key points:

zip()
Unpacking function argument list
The return value of csv.reader() is an iterable of iterables
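
The key points above can be seen in a small standalone sketch. The sample values are made up for illustration, with ';' as the delimiter to match the code in this answer. It also shows that a csv.reader can only be consumed once, which may be why a second comprehension over the same reader in the question comes back empty.

```python
import csv
import io

rows = [["4", "9", "5"], ["7", "11", "4"], ["3", "5", "2"]]

# zip(*rows) unpacks the list, so it is the same call as
# zip(rows[0], rows[1], rows[2]); the result groups values by column.
transposed = list(zip(*rows))
print(transposed[0])  # ('4', '7', '3')
print(transposed == list(zip(rows[0], rows[1], rows[2])))  # True

# A csv.reader is an iterator over rows: once it has been consumed,
# iterating over it again yields nothing.
reader = csv.reader(io.StringIO("4;9;5\n7;11;4\n"), delimiter=";")
first_pass = list(reader)
second_pass = list(reader)
print(first_pass)   # [['4', '9', '5'], ['7', '11', '4']]
print(second_pass)  # []
```

Reading all rows into a list once, then transposing or indexing that list, avoids the exhausted-iterator problem.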
