I'm new to Python, and to programming in general, and I'm trying to:

1) Read multiple (identically formatted) CSV files from a folder

2) Plot column X 'Time' vs column Y 'pH' from each of the CSV files on a single plot

3) Create a legend using the filename (without .csv) as the reference for each line of the plot.

I have been able to open a single CSV file and plot X vs Y, but have had no success iterating over the files and overlaying multiple lines on a single plot.

Any help would be greatly appreciated! I've tried a few different ways of reading the files in, and I'm showing just one of them below. I'd rather read the files in as individual pandas DataFrames, so that I can manipulate them later. For now, I'm just hoping to get some basic code working.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas import Series, DataFrame
from numpy import nan as NA
import glob

ferms = glob.glob ('Python/CSV/*.csv')
print ferms

for ferm in ferms:
    fig = plt.figure()
    ax = fig.add_subplot(1,1,1)
    ax.plot(ferms['EFT(h)'], ferms['pH1.PV [pH]'], 'k--')
    plt.xlabel('EFT(h)')
    plt.ylabel('pH')
    plt.show()

Revised code based on @Paul H's suggestion:


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas import Series, DataFrame
from numpy import nan as NA
import glob

ferms = glob.glob ('Python/CSV/*.csv')
print ferms
fig = plt.figure()
ax = fig.add_subplot(1,1,1)

for ferm in ferms:
# define the dataframe
    data = pd.read_csv(ferm)    
    ax.plot(ferms[0], ferms[3], 'k--')

plt.xlabel('EFT(h)')
plt.ylabel('pH')
plt.show()

New error:

--> 235     return array(a, dtype, copy=False, order=order)
    236
    237 def asanyarray(a, dtype=None, order=None):

ValueError: could not convert string to float: Python/CSV\20135140.csv


Just to check, I went into my csv files and deleted the headers, thinking they could have been the cause of the 'string to float' error. However, even with only numbers in my csvs, it threw the same error.

biltron
  • Please show the code you used and explain what about it is not working. – BrenBarn Jun 19 '13 at 22:41
  • You will get much better help here if you show us what you are doing. As your question is now it reads as 'please do my work for me' which tends to annoy the people who might help you ;) – tacaswell Jun 19 '13 at 23:20
  • what doesn't work when you try to overlay multiple lines? – tom10 Jun 20 '13 at 03:17
  • thanks for your comments - updated post with code from one approach I have tried (have tried many). – biltron Jun 20 '13 at 15:50

2 Answers

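It looks like it's not working because you're creating a new figure on every pass through the loop. Create the figure and axes once, before the loop, and read each file into a DataFrame inside it.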

Try this:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas import Series, DataFrame
from numpy import nan as NA
import glob

ferms = glob.glob('Python/CSV/*.csv')
print(ferms)  # the parenthesized form works in both Python 2 and 3
fig = plt.figure()
ax = fig.add_subplot(1,1,1)

for ferm in ferms:
    # define the dataframe
    data = pd.read_csv(ferm)
    ax.plot(data['EFT(h)'], data['pH1.PV [pH]'], 'k--')

plt.xlabel('EFT(h)')
plt.ylabel('pH')
plt.show()
Paul H
  • Thanks @Paul H. Getting closer I think. However now it is throwing an error about not being able to convert string to float...import pandas as pd import numpy as np import matplotlib.pyplot as plt from pandas import Series, DataFrame from numpy import nan as NA import glob ferms = glob.glob ('Python/CSV/*.csv') print ferms fig = plt.figure() ax = fig.add_subplot(1,1,1) for ferm in ferms: ax.plot(ferms[0], ferms[3], 'k--') plt.xlabel('EFT(h)') plt.ylabel('pH') plt.show() error: ValueError: could not convert string to float: Python/CSV\20135140.csv – biltron Jun 20 '13 at 19:30
  • @biltron i can't read all that code in a comment. so my next observation is that you're never actually making a dataframe. i edited my response to help you out some more. – Paul H Jun 20 '13 at 19:58
  • thanks for the dataframe tip. However, I still get the error thrown back 'could not convert string to float'. – biltron Jun 20 '13 at 22:54
  • We can't really help you without some data. Can you post some? Specifically, post the file that's throwing that error. – Paul H Jun 21 '13 at 00:39
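The error message itself hints at the cause: the string it fails to convert is the file path, which suggests the revised code is still passing elements of the `ferms` list (file names) to `ax.plot` rather than columns of the DataFrame returned by `pd.read_csv`. Here is a minimal sketch of the difference. The second path and the CSV contents are made up for illustration, since the real files were never posted:

```python
import io

import pandas as pd

# Hypothetical stand-ins for the real data (the actual files were not posted).
ferms = ['Python/CSV/20135140.csv', 'Python/CSV/20135141.csv']
csv_text = "EFT(h),pH1.PV [pH]\n0.0,7.01\n0.5,6.95\n1.0,6.90\n"

# ferms[0] is only a file name: a string, not numeric data.
first_item = ferms[0]

# The DataFrame returned by read_csv holds the numeric columns to plot.
data = pd.read_csv(io.StringIO(csv_text))
x = data['EFT(h)']
y = data['pH1.PV [pH]']

print(type(first_item).__name__, x.dtype, y.dtype)
```

So inside the loop, the arguments to `ax.plot` should come from `data`, not from `ferms`.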

I had a similar problem trying to plot data from any number of files; see my post "Traceback lines on plot of multiple files". Basically, you want to plot the data from each file inside the loop, but call plt.show() only once, after the loop that iterates through the files has finished.
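Putting the pieces together, here is a rough sketch of the three steps from the question: read every CSV, overlay the lines on one plot, and label each line with its file name. The function name `plot_ferms` and the `frames` dictionary are my own inventions for illustration. The column names are taken from the question and assumed to be present in every file:

```python
import glob
import os

import matplotlib.pyplot as plt
import pandas as pd


def plot_ferms(pattern, x_col='EFT(h)', y_col='pH1.PV [pH]'):
    """Overlay y_col vs x_col from every CSV matching pattern on one axes."""
    fig, ax = plt.subplots()  # one figure, created once, outside the loop
    frames = {}               # keep each DataFrame for later manipulation

    for path in sorted(glob.glob(pattern)):
        # File name without the directory or the '.csv' extension (legend label)
        name = os.path.splitext(os.path.basename(path))[0]
        data = pd.read_csv(path)
        frames[name] = data
        # No fixed colour is given, so each file gets the next colour in the cycle
        ax.plot(data[x_col], data[y_col], label=name)

    ax.set_xlabel(x_col)
    ax.set_ylabel('pH')
    ax.legend()
    return fig, ax, frames
```

Calling `fig, ax, frames = plot_ferms('Python/CSV/*.csv')` and then `plt.show()` once afterwards should give a single window with one line per file. Dropping the `'k--'` style is deliberate: with every line drawn black and dashed, the legend entries would be indistinguishable.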