Arrange two variables by two mismatching irregular time-series in R/Python?

Question

I have hourly date/time data for stream discharge, and date/time data for stream sediment concentration recorded on the hour but at irregular intervals. I'm unsure how to post data frames here, but it looks like:

Datetimedis    Discharge  Datetimesed    Sediment
6/12/15 12:00  1.1        6/12/15 18:00  1231
6/12/15 13:00  113        6/13/15 1:00   12312
6/12/15 14:00  123 21     6/13/15 8:00   12321
6/12/15 15:00  12         6/13/15 15:00  12312
6/12/15 16:00  12         6/14/15 19:00  4324
6/12/15 17:00  23         6/15/15 2:00   534523
6/12/15 18:00  123        6/15/15 9:00   52341

etc

I have ~2500 rows of data for the discharge, and ~500 columns for sediment. Is there any way to use ddply, another R package, or Python to paste each sediment value next to the discharge value that corresponds to the same time?

In this example data, I would want, for instance, the sediment value at 6/12/15 18:00 pasted next to the discharge value at that time.

The rows in between should contain NA or empty values so that I can interpolate them later.

Welcome to SO. With R you can use `dput(head())` then copy and paste the result here. That allows users to help you by reproducing your code; a small piece of the desired output is very useful as well. [Here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) you can find more information about how to make a reproducible example. — SabDeM, Jul 21 '15 at 23:47
If these values were in a CSV I could totally help but I've never used the libraries you reference. — nivix zixer, Jul 22 '15 at 00:10
When you say "~2500 columns for discharge and ~500 for sediment", what do you mean? Your example dataframe only has one discharge column and one sediment column. Or is the sediment dataframe different from the discharge dataframe? You will have to give us a small reproducible example. — mathematical.coffee, Jul 22 '15 at 00:56
Sorry, I meant rows! I'll try to make a reproducible example. — John Brandt, Jul 22 '15 at 01:00
Suspect that using the zoo-package's merge function will be helpful but a reproducible example is needed since it appears two files exist and only one has been illustrated. — IRTFM, Jul 22 '15 at 01:07
I actually figured it out with the zoo package's merge function! Pasted my code below. Of course, I spent all day trying to figure this out, only for it to be 3 lines of code. — John Brandt, Jul 22 '15 at 01:36

score 0 · Answer 1 · answered Jul 22 '15 at 01:21

Assuming you have one dataframe with the datetime and discharge (df) and another with the time and sediment concentration (df2), you could do the following with Python pandas:

create a new column with the time from Datetime in df:

df['hours'] = df.index.hour

then use the map function from pandas to map the sediment concentration in df2 to df:

df['Sediment'] = df['hours'].map(df2)
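Note that the hour of day alone does not identify a timestamp once the data spans several days, so aligning on the full timestamp may be closer to what the question asks. The following is only a minimal sketch of that idea, not the code from this answer. It assumes the two tables have been split into separate frames (here named `discharge` and `sediment`, both hypothetical names), and the sample rows are copied from the question for illustration.

```python
import pandas as pd

# Format of the timestamps in the question, e.g. "6/12/15 18:00"
fmt = "%m/%d/%y %H:%M"

# Illustrative sample rows taken from the question's table
discharge = pd.DataFrame({
    "Datetimedis": ["6/12/15 12:00", "6/12/15 13:00", "6/12/15 14:00",
                    "6/12/15 15:00", "6/12/15 16:00", "6/12/15 17:00",
                    "6/12/15 18:00"],
    "Discharge": [1.1, 113, 123, 12, 12, 23, 123],
})
sediment = pd.DataFrame({
    "Datetimesed": ["6/12/15 18:00", "6/13/15 1:00", "6/13/15 8:00"],
    "Sediment": [1231, 12312, 12321],
})

# Parse the strings and use the full timestamp as the index of each frame
discharge["Datetimedis"] = pd.to_datetime(discharge["Datetimedis"], format=fmt)
sediment["Datetimesed"] = pd.to_datetime(sediment["Datetimesed"], format=fmt)
discharge = discharge.set_index("Datetimedis")
sediment = sediment.set_index("Datetimesed")

# A left join keeps every hourly discharge row; hours without a
# sediment sample get NaN in the Sediment column
merged = discharge.join(sediment, how="left")
print(merged)
```

Using `how="outer"` instead would also keep sediment timestamps that have no discharge row, which is closer to what a default `merge` of two zoo objects does in R.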

score 0 · Answer 2 · answered Jul 22 '15 at 01:30


I figured it out using the zoo package!

For anyone who uses this as a reference in the future: I split the two date/value pairs into separate data frames, d1 and d2.

zoo1 <- read.zoo(d1, header=TRUE)
zoo2 <- read.zoo(d2, header=TRUE)
# outer join on the time index; times present in only one series get NA
zoomerge <- merge(zoo1, zoo2)

worked perfectly!


John Brandt


But you still haven't stated the problem properly. When you say 'irregular time-series', do you expect NA/NaNs everywhere you have entries for one time series (presumably Discharge) but not Sediment? Or do you want interpolation of NAs? – smci Jul 22 '15 at 01:47
See [zoo::na.approx](http://www.inside-r.org/packages/cran/zoo/docs/na.approx), spline, na.locf etc. for zoo's excellent interpolating functions. – smci Jul 22 '15 at 03:45
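For readers working in pandas rather than zoo, a rough analogue of the interpolation step mentioned in these comments might look like the sketch below. It assumes a merged series indexed by timestamp with NaN between samples; the name `sediment_series` and the values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np
import pandas as pd

# Hypothetical merged sediment series: samples at 18:00 and 22:00,
# NaN for the hourly rows in between
idx = pd.to_datetime([
    "2015-06-12 18:00", "2015-06-12 19:00", "2015-06-12 20:00",
    "2015-06-12 21:00", "2015-06-12 22:00",
])
sediment_series = pd.Series([100.0, np.nan, np.nan, np.nan, 200.0], index=idx)

# Linear interpolation weighted by the actual time gaps between rows
# (roughly comparable to zoo::na.approx)
filled = sediment_series.interpolate(method="time")

# Carry the last observation forward instead
# (roughly comparable to zoo::na.locf)
carried = sediment_series.ffill()

print(filled)
print(carried)
```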
