I have a time series, tt.txt, of daily data from 1 May 1998 to 31 October 2012, stored in a single column like this:

    v1
   296.172
   303.24
   303.891
   304.603
   304.207
   303.22
   303.137
   303.343
   304.203
   305.029
   305.099
   304.681
   304.32
   304.471
   305.022
   304.938
   304.298
   304.120

Each number in the text file is the maximum temperature in kelvin for the corresponding day. I want to arrange the data in three columns (year, jday, and the data value), laid out like this:

     year jday MAX_TEMP 
1    1959  325 11.7      
2    1959  326 15.6      
3    1959  327 14.4    
josliber

2 Answers

If you have a vector of dates, you can convert it to 'year' and 'jday' with:

v1 <- c('May 1998 05', 'October 2012 10')
v2 <- format(as.Date(v1, '%b %Y %d'), '%Y %j')
df1 <- read.table(text=v2, header=FALSE, col.names=c('year', 'jday'))
df1
#  year jday
#1 1998  125
#2 2012  284

To convert back from '%Y %j' to the 'Date' class:

df1$date <- as.Date(do.call(paste, df1[1:2]), '%Y %j')
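
As a quick check of the two conversions above, here is a minimal round-trip sketch. The dates are written in ISO format (an assumption made here, since parsing month names with '%b' depends on the current locale), and the variable `dates` is introduced only for illustration.

```r
# Hypothetical example dates in ISO format, so parsing does not depend on the locale
dates <- as.Date(c('1998-05-05', '2012-10-10'))

# Date -> year and day of year ('%j' is 1-based: 1 January is day 001)
df1 <- data.frame(year = as.integer(format(dates, '%Y')),
                  jday = as.integer(format(dates, '%j')))

# year + jday -> Date
df1$date <- as.Date(paste(df1$year, df1$jday), format = '%Y %j')

# The round trip should reproduce the input dates
stopifnot(all(df1$date == dates))
```

Storing 'year' and 'jday' as integers, rather than re-reading formatted text, keeps the columns numeric without a second parsing step.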

Update

We can read the dataset with read.table. If the start and end dates are known, create a sequence of dates with seq, format it as 'year' and 'julian day', and cbind the result with the original dataset.

dat <- read.table('tt.txt', header=TRUE)
date <- seq(as.Date('1998-05-01'), as.Date('2012-10-31'), by='day')
# dat[[1]] extracts the column as a vector, so the new column is named MAX_TEMP
dat2 <- cbind(read.table(text=format(date, '%Y %j'), 
              col.names=c('year', 'jday')), MAX_TEMP=dat[[1]])
akrun
  • @user3290596 Assuming that the date starts from '1998-05-01' and ends with '2012-10-31', we can create a sequence with `seq`, change the `format` and cbind with the initial dataset – akrun Jun 08 '15 at 14:29
  • An error occurred when creating dat2: `dat2 <- cbind(read.table(text=format(date, '%Y %j'),col.names=c('year', 'jday')),MAX_TEMP=dat[1])` gives `Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 5298, 5449`. It looks like a length mismatch. – user3290596 Jun 08 '15 at 18:43
  • @user3290596 The sequence created by `seq` has a fixed length. If your data, i.e. MAX_TEMP, does not have the same number of elements, you get that error. Without more details, it is hard to tell which day has a missing value for MAX_TEMP – akrun Jun 08 '15 at 18:47
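
One way to sidestep the length mismatch discussed in these comments is to build the date sequence from the start date and the number of rows, then compare the row count with the number of days in the stated range. The sketch below is only an illustration under stated assumptions: it uses the first 18 values shown in the question in place of tt.txt, assumes the first row really is 1998-05-01 with no gaps after it, and the names `start`, `expected_n` and `implied_end` are introduced here.

```r
# First 18 values from the question, standing in for read.table('tt.txt', header=TRUE)
dat <- data.frame(v1 = c(296.172, 303.24, 303.891, 304.603, 304.207, 303.22,
                         303.137, 303.343, 304.203, 305.029, 305.099, 304.681,
                         304.32, 304.471, 305.022, 304.938, 304.298, 304.120))

start <- as.Date('1998-05-01')  # assumed date of the first row

# One date per row, so the lengths always agree
date <- seq(start, by = 'day', length.out = nrow(dat))

dat2 <- data.frame(year = as.integer(format(date, '%Y')),
                   jday = as.integer(format(date, '%j')),
                   MAX_TEMP = dat$v1)

# Diagnostics: days in the stated range, and the date the last row maps to
expected_n <- as.integer(as.Date('2012-10-31') - start) + 1L
implied_end <- date[length(date)]
```

Here `expected_n` is 5298, the first number in the error message, so the 5449 rows reported for the file are 151 more than the stated range contains. Comparing `nrow(dat)` with `expected_n`, and `implied_end` with the intended end date, shows whether the file has extra rows, missing rows, or a different start date.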
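You can use the yday component of a POSIXlt object. It counts days since 1 January starting from 0, so add 1 to match the '%j' day of year:

as.POSIXlt("8 Jun 15", format = "%d %b %y")$yday + 1
# [1] 159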
jalapic