
I have created a multi-line graph of water temperatures throughout the year in Python using pandas:

import pandas as pd
filepath = "C:\\Users\\technician\\Desktop\\LowerBD2014.csv" 

data = pd.read_csv(filepath, header = 0, index_col = 0)

data.plot(kind = 'line', use_index = True, title="timeseries", figsize=(20,10))

Now I would like to add another line for Air Temperature. Unfortunately, the dates and times at which the data were collected don't match. I was thinking that I could work around this by importing 2 separate .csv files into the same graph, but I am unsure how to do that.

Any suggestions would be great. I can also add all of the data to one file, but I worry that the Air Temperature will not plot correctly without a secondary horizontal axis (I don't know how to do this either).

Here is the graph created using ax=ax for one of the data set plots:

https://i.stack.imgur.com/5wTeI.jpg

L. Richardson

2 Answers


Once your two CSVs are imported as two dataframes, plot the first one and assign the returned matplotlib axes object to a name (ax in the block below), then pass that axes to the second plot call.

import pandas as pd
import numpy as np

# two made-up timeseries with different periods, for demonstration plot below
#air_temp   = pd.DataFrame(np.random.randn(12), 
#                          index=pd.date_range('1/1/2016', freq='M', periods=12), 
#                          columns=['air_temp'])
#water_temp = pd.DataFrame(np.random.randn(365), 
#                          index=pd.date_range('1/1/2016', freq='D', periods=365), 
#                          columns=['water_temp'])

# the real data import would look something like this:
water_temp_filepath = "C:\\Users\\technician\\Desktop\\water_temp.csv" 
air_temp_filepath = "C:\\Users\\technician\\Desktop\\airtemp.csv" 

water_temp = pd.read_csv(water_temp_filepath, header = 0, index_col = 0,
                         parse_dates=True, infer_datetime_format=True)
air_temp = pd.read_csv(air_temp_filepath, header = 0, index_col = 0,
                       parse_dates=True, infer_datetime_format=True)

# plot both overlaid on the same axes
ax = air_temp.plot(figsize=(20,10))
water_temp.plot(ax=ax)


  • Thank you so much! This is exactly what I needed. It seems like I could set the periods to the number of data points I have for each file. Is that correct? – L. Richardson Jun 17 '16 at 17:44
  • No need to set the period, just read in the data as you did in your question, into two dataframes, one for airtemp, one for water temp. See edited answer. – Tim Spillane Jun 17 '16 at 18:22
  • After trying your suggestion, I still run into a problem. Since the 2 files have different time increments, Pandas is defaulting to the first series that was entered. Since the water temperature has more data points than air temperature, Pandas is extending the water temperature points past the air temperature, even though they end on the same date. Does this make sense? – L. Richardson Jun 21 '16 at 15:54
  • I included a link to a picture that shows the problem in my original question – L. Richardson Jun 21 '16 at 16:26
  • Try adding the arguments "parse_dates=True, infer_datetime_format=True" to the read_csv calls; pandas should then create an index of type "DatetimeIndex" rather than "index". – Tim Spillane Jun 21 '16 at 18:58
  • That's it! Thank you so much for your help. I'm still curious as to why "index" wouldn't work? – L. Richardson Jun 21 '16 at 19:19
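On the last question in the comments above, a minimal sketch of what the `parse_dates=True` argument changes. The CSV text and the column names (`timestamp`, `water_temp`) are made up for illustration and are not taken from the asker's files. Without date parsing the index holds plain strings, which pandas typically plots by row position, so the file with more rows stretches further along the x axis; with a `DatetimeIndex` each point is placed at its actual time.

```python
import io
import pandas as pd

# Made-up CSV content standing in for one of the temperature files (assumption).
csv_text = """timestamp,water_temp
2014-01-01 00:00,4.1
2014-01-01 06:00,4.3
2014-01-02 00:00,3.9
"""

# Without parse_dates the index values stay as plain strings.
as_text = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)

# With parse_dates=True the index column is converted to timestamps.
as_dates = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0,
                       parse_dates=True)

print(type(as_text.index).__name__)   # Index
print(type(as_dates.index).__name__)  # DatetimeIndex
```

Checking `type(df.index)` on both dataframes before plotting is a quick way to confirm that the two series will share a real time axis.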

As the other answer says, if the columns are the same in both CSV files, you can follow that code.

Alternatively, you can try combining the two CSV files into one and then reading that:

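import pandas as pd

# read both files as raw text
with open('first.csv', 'r') as file_a:
    file_a_data = file_a.read()

with open('second.csv', 'r') as file_b:
    file_b_data = file_b.read()

# plain concatenation: the header row of second.csv ends up in the middle
combined_data = file_a_data + file_b_data

file_path_to_final_csv = 'test.csv'
with open(file_path_to_final_csv, 'w') as combined_file:
    combined_file.write(combined_data)

# same read_csv options as in the question
data = pd.read_csv(file_path_to_final_csv, header=0, index_col=0)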
Ishaan
  • I think the only problem with combining the files is that they have 2 different time increments, so I would need to plot on 2 different x axes to show the data – L. Richardson Jun 17 '16 at 17:46
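Regarding the concern about different time increments: one possible way to get both series into a single table is to combine them on their timestamps inside pandas rather than by concatenating the files. This is a swapped-in technique using `pd.concat`, not the method from the answer above, and the readings and column names below are made up for illustration. It assumes both dataframes already have a `DatetimeIndex`.

```python
import pandas as pd

# Made-up readings (assumption): water every 6 hours, air once a day.
water_temp = pd.DataFrame(
    {"water_temp": [4.0, 4.2, 4.4, 4.6, 4.8]},
    index=pd.to_datetime(["2014-01-01 00:00", "2014-01-01 06:00",
                          "2014-01-01 12:00", "2014-01-01 18:00",
                          "2014-01-02 00:00"]),
)
air_temp = pd.DataFrame(
    {"air_temp": [0.0, 2.0]},
    index=pd.to_datetime(["2014-01-01 00:00", "2014-01-02 00:00"]),
)

# Outer join on the timestamps: times with no reading become NaN.
combined = pd.concat([water_temp, air_temp], axis=1)

# Optional: fill the gaps by interpolating linearly in time.
filled = combined.interpolate(method="time")

print(combined["air_temp"].isna().sum())                         # 3
print(filled.loc[pd.Timestamp("2014-01-01 12:00"), "air_temp"])  # 1.0
```

Calling `filled.plot(figsize=(20, 10))` would then draw both columns against one shared time axis, so a second x axis should not be needed. Whether interpolating the sparser series is appropriate depends on the data.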