I want to extract certain columns from a CSV file and plot them in Python using matplotlib. The column numbers will come from the command line, so I'll have to use sys.argv to read them. Here's what I have currently:

EDIT: I should also mention that the column numbers can vary depending on what the user chooses. For instance, they could choose columns 1 and 2, or just column 1.

with open('./P14_data.csv', 'rb') as csvfile:
    data = csv.reader(csvfile, delimiter=';')
    cols = [index for index in sys.argv[1:]]

    #I want to extract the columns corresponding to cols

    for col in cols:
        x[col] = [rows for rows in data]
    print x

But this returns an empty list [].

As for the output, I'd like to graph each column as a one-dimensional array. So for instance, with a csv file of the form:

1 5 
1 3
0 2
0 3
1 1
1 3

If a user inputs '1', I want my code to save only the values from column one in an array:

data = [[1, 1, 0, 0,..]]

plt.plot(data)

I know pandas is a valid option, but I'd like to learn it this way first. Thanks!

Nikitau

3 Answers

You can read each row as a single string and split it on commas to get the individual columns:

import csv

with open('./P14_data.csv', newline='') as csvfile:
    # Assumes a comma-separated file: with delimiter=';' each row
    # comes back as one string, which is split on ',' below.
    data = csv.reader(csvfile, delimiter=';')
    included_cols = [1,2,3]  # columns of interest (not used below)
    x = [[rows[0]] for rows in data]
    x1 = [[row[0].split(',')[0] for row in x]]
    x2 = [[row[0].split(',')[1] for row in x]]
    x3 = [[row[0].split(',')[2] for row in x]]

    print(x1)
    # [['4', '7', '3', '3']]

    print(x2)
    # [['9', '11', '5', '6']]

    print(x3)
    # [['5', '4', '2', '3']]
  • Thanks! This is what I was looking for. Is there a way to iterate through all the columns and save them as different variables (i.e: x1, x2, x3 if there are 3 columns)? – Nikitau May 03 '17 at 01:22
  • @Nikitau I have updated my answer –  May 03 '17 at 01:26
  • Sorry I keep bugging you, but I forgot to mention that the column numbers can vary. Is there any way to account for this? I'm guessing a valid way would be to first save it as a list of a list, and extract it later for plotting. – Nikitau May 03 '17 at 01:58
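
For the varying-column case raised in the comments, here is a minimal sketch, not a definitive answer. It assumes the arguments are 1-based column numbers, that every field is numeric, and that the file uses ';' as the delimiter as in the question; select_columns is a hypothetical helper name. Note that csv.reader can only be iterated once, which is why looping over it a second time yields an empty list, so the rows are read into a list first.

```python
import csv
import io

def select_columns(rows, col_args):
    """Return one list of floats per requested column.

    col_args holds 1-based column numbers as strings, the way they
    would arrive from sys.argv[1:] (an assumption about the CLI).
    """
    rows = list(rows)            # csv.reader is one-pass, so materialize it once
    columns = list(zip(*rows))   # transpose: list of rows -> list of columns
    return [[float(v) for v in columns[int(arg) - 1]] for arg in col_args]

# In-memory data standing in for P14_data.csv
sample = io.StringIO("1;5\n1;3\n0;2\n0;3\n1;1\n1;3\n")
reader = csv.reader(sample, delimiter=';')
selected = select_columns(reader, ["1"])   # in a script: sys.argv[1:]
print(selected)  # [[1.0, 1.0, 0.0, 0.0, 1.0, 1.0]]
```

In a real script you would pass sys.argv[1:] instead of ["1"] and call plt.plot(column) once for each returned list.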

Here is a method that saves the headings and data in separate arrays. To get each column, we take the transpose of the data array and select whichever columns we are interested in.

Here is data.csv:

index,value1,value2
0,10,20
1,12,18
2,5,6
3,9,10
4,11,8

And here is the code:

import matplotlib.pyplot as plt
import numpy as np
import csv

with open('data.csv','r') as csvfile:
    r = csv.reader(csvfile, delimiter=',')
    data = [i for i in r]

headings = data.pop(0)
data = np.array([[float(j) for j in i] for i in data])  # np.float was removed from NumPy; use the builtin float

c1 = data.T[1]
c2 = data.T[2]

fig, ax = plt.subplots(1)
ax.plot(c1, label=headings[1])
ax.plot(c2, label=headings[2])
ax.legend()
plt.show()  # blocks until the window is closed; fig.show() can return immediately in a script

And the plot: [image not included]

Robbie
  • Hi this is great, but is there a placeholder I can use if there are no headings in the csv? The amount of columns will vary with no header names. – Nikitau May 03 '17 at 02:14
  • You can just remove the line "headings = data.pop(0)" – Robbie May 03 '17 at 03:02
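
If the file has no header row, one option is to generate placeholder labels so the legend still has something to show. A minimal sketch, where placeholder_headings and the "column N" naming are assumptions of mine and not part of the answer above:

```python
def placeholder_headings(rows, prefix="column "):
    """Build labels like 'column 0', 'column 1', ... for headerless rows."""
    width = len(rows[0]) if rows else 0   # number of fields in the first row
    return [f"{prefix}{i}" for i in range(width)]

# Rows as csv.reader would return them for a file without a header line
rows = [["0", "10", "20"], ["1", "12", "18"]]
headings = placeholder_headings(rows)
print(headings)  # ['column 0', 'column 1', 'column 2']
```

With this, headings = data.pop(0) could be replaced by headings = placeholder_headings(data), and the label=headings[1] arguments keep working.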

Well, you could try something like this:

#!/usr/bin/env python3

import csv

with open('./P14_data.csv', newline='') as csvfile:
    data = csv.reader(csvfile, delimiter=';')
    x = [rows for rows in data]
    transposed = list(zip(*x))
    print(transposed)

Or even simpler:

#!/usr/bin/env python3

import csv

with open('./P14_data.csv', newline='') as csvfile:
    data = csv.reader(csvfile, delimiter=';')
    transposed = list(zip(*data))
    print(transposed)
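
One thing to watch for with zip(*rows): it stops at the shortest row, so a single short line silently drops whole columns. The sketch below shows that behaviour on made-up in-memory rows and, as one possible workaround, itertools.zip_longest with a fill value. The sample rows are an assumption for illustration, not data from the question.

```python
from itertools import zip_longest

# Made-up rows; the last one is missing its second field
rows = [["1", "5"], ["1", "3"], ["0"]]

truncated = list(zip(*rows))
print(truncated)
# [('1', '1', '0')]  (the second column is dropped entirely)

padded = list(zip_longest(*rows, fillvalue=None))
print(padded)
# [('1', '1', '0'), ('5', '3', None)]
```

Whether padding with None is appropriate depends on how the missing values should be plotted, so treat this as a starting point.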

Key points:

EvensF