I have 14 .csv files (one per location) that will be used to make 14 bar plots of daily rainfall. The following code is an example of what one bar plot will look like.

import numpy as np
import pandas as pd 
from datetime import datetime, time, date
import matplotlib.pyplot as plt

# Import data
dat = pd.read_csv('a.csv')
df0 = dat.loc[:, ['TimeStamp', 'RF']]

# Change time format
df0["time"] = pd.to_datetime(df0["TimeStamp"])
df0["day"] = df0['time'].map(lambda x: x.day)
df0["month"] = df0['time'].map(lambda x: x.month)
df0["year"] = df0['time'].map(lambda x: x.year)
df0.to_csv("a2.csv", na_rep="0")  # write to csv

# Combine for daily rainfall
df1 = pd.read_csv('a2.csv', encoding='latin-1',
              usecols=['day', 'month', 'year', 'RF', 'TimeStamp'])
df2 = df1.groupby(['day', 'month', 'year'], as_index=False).sum()
df2.to_csv("a3.csv", na_rep="0", header=None)  # write to csv

# parse date
# pd.datetime is gone from recent pandas releases; use the datetime class imported above
df3 = pd.read_csv("a3.csv", header=None, index_col='datetime', 
             parse_dates={'datetime': [1,2,3]}, 
             date_parser=lambda x: datetime.strptime(x, '%d %m %Y'))

def dt_parse(date_string):
    dt = datetime.strptime(date_string, '%d %m %Y')
    return dt

# sort datetime
df4 = df3.sort_index()  # DataFrame.sort() was removed from pandas; sort by the datetime index
final = df4.reset_index()

# rename columns
final.columns = ['date', 'bleh', 'rf']

final[['date','rf']].plot()

plt.suptitle('Rain 2015-2016', fontsize=20)
plt.xlabel('Date', fontsize=18)
plt.ylabel('Rain / mm', fontsize=16)
plt.savefig('a.jpg')
plt.show()

And the final plot looks like this: [image]

How can I automate this code (e.g. with a for-loop) so that I don't have to re-type it for each .csv file? It would be nice if the code also saved each figure as a .jpg named after its .csv file.

The names of the 14 files are as follows: names = ["a.csv","b.csv", "c.csv","d.csv","e.csv","f.csv"...]

Here's an example of the type of file that I'm working with: https://dl.dropboxusercontent.com/u/45095175/test.csv

JAG2024

1 Answer

First method: put all your csv files in the current working folder and use the os module to loop over them.

import os
for f in os.listdir('.'):                 # loop through all the files in your current folder
    if f.endswith('.csv'):                # find csv files
        fn, fext = os.path.splitext(f)    # split file name and extension

        dat = pd.read_csv(f)              # import data
        # Run the rest of your code here

        plt.savefig('{}.jpg'.format(fn))  # name the figure with the same file name 

Second method: if you don't want to use the os module, you can put your file names in a list like this:

files = ['a.csv', 'b.csv']

for f in files:
    fn = f.rsplit('.', 1)[0]          # strip only the last extension, so 'site.1.csv' -> 'site.1'

    dat = pd.read_csv(f)
    # Run the rest of your code here

    plt.savefig('{}.jpg'.format(fn))
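
For reference, here is a minimal sketch of what "the rest of your code" could look like once it is wrapped in a function, assuming every file has `TimeStamp` and `RF` columns like the example in the question. The helper names `plot_daily_rainfall` and `plot_all` are hypothetical, and this is not the asker's exact pipeline: it sums the rainfall per day in memory instead of writing the intermediate `a2.csv`/`a3.csv` files, draws bars with `ax.bar` (the question asks for bar plots), and uses the `Agg` backend on the assumption that the figures only need to be saved, not shown.

```python
import os

import matplotlib
matplotlib.use("Agg")  # assumption: figures are only written to disk, never shown
import matplotlib.pyplot as plt
import pandas as pd


def plot_daily_rainfall(csv_path):
    """Hypothetical helper: sum RF per calendar day and save a bar plot."""
    fn, _ = os.path.splitext(csv_path)            # 'a.csv' -> 'a'
    dat = pd.read_csv(csv_path, usecols=["TimeStamp", "RF"])

    # Truncate each timestamp to midnight so all rows of one day share a key
    dat["date"] = pd.to_datetime(dat["TimeStamp"]).dt.normalize()
    daily = dat.groupby("date")["RF"].sum().sort_index()

    fig, ax = plt.subplots()
    ax.bar(daily.index, daily.values)
    fig.suptitle("Rain 2015-2016", fontsize=20)
    ax.set_xlabel("Date", fontsize=18)
    ax.set_ylabel("Rain / mm", fontsize=16)
    fig.autofmt_xdate()                           # tilt the date labels

    out_path = "{}.jpg".format(fn)                # same name as the csv file
    fig.savefig(out_path)
    plt.close(fig)                                # free the figure before the next file
    return daily, out_path


def plot_all(folder="."):
    """Hypothetical helper: run plot_daily_rainfall on every csv in folder."""
    outputs = []
    for f in sorted(os.listdir(folder)):
        if f.endswith(".csv"):
            _, out_path = plot_daily_rainfall(os.path.join(folder, f))
            outputs.append(out_path)
    return outputs
```

Calling `plot_all('.')` would then write one .jpg next to each .csv in the current folder. Closing each figure inside the loop keeps the plots from piling up in memory or drawing on top of each other.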
Longwen Ou
  • I actually already do this. The first line in my code is: `os.chdir('/Users/me/desktop')` – JAG2024 Feb 24 '17 at 04:00
  • @JAG2024 Have you tried the rest of the code? Does it work? – Longwen Ou Feb 24 '17 at 04:05
  • Almost! I just get the error message `Traceback (most recent call last): File "run.py", line 57, in plt.savefig('{}.jpg'.format(fn)) # name the figure with the same file name NameError: name 'fn' is not defined` – JAG2024 Feb 24 '17 at 05:00
  • I copied and pasted your "First method" code and added my code where you said # Run the rest of your code here. – JAG2024 Feb 24 '17 at 05:10
  • Is there also a way to include the name of the csv file in the names of the new csv files (e.g. change `a3` in `df2.to_csv("a3.csv", na_rep="0", header=None)` to the name of the csv file used), as well as in `plt.suptitle('Rain 2015-2016', fontsize=20)`? – JAG2024 Feb 24 '17 at 06:35
  • @JAG2024 I just tried the first method on my computer and it works fine. Since `fn` is defined on line 4, you should not get `NameError: name 'fn' is not defined`. – Longwen Ou Feb 24 '17 at 14:31
  • @JAG2024 If you want to rename the new csv files, you just need to do something like this: `df2.to_csv('{}3.csv'.format(fn), na_rep="0", header=None)` – Longwen Ou Feb 24 '17 at 14:33
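
On the naming question from the comments, a small sketch of how every output name could be derived from the csv file name in one place. The helper `derived_names` is hypothetical, and the title format (file name in parentheses after the original title) is only an assumption about what is wanted.

```python
import os


def derived_names(csv_path):
    """Hypothetical helper: build all output names from one csv file name."""
    # 'a.csv' -> 'a'; basename drops any folder part of the path
    fn, _ = os.path.splitext(os.path.basename(csv_path))
    return {
        "daily_csv": "{}3.csv".format(fn),             # replaces the hard-coded 'a3.csv'
        "figure": "{}.jpg".format(fn),                 # replaces the hard-coded 'a.jpg'
        "title": "Rain 2015-2016 ({})".format(fn),     # assumed title format
    }
```

Inside the loop this could be used as `names = derived_names(f)`, followed by `df2.to_csv(names["daily_csv"], na_rep="0", header=None)`, `plt.suptitle(names["title"], fontsize=20)` and `plt.savefig(names["figure"])`.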