I have a dataframe with Julian dates in the format:

2455764.833333
2455764.875000
2455764.916667

tiree <- structure(list(date = structure(c(2L, 1L, 1L, 1L, 1L), .Label = c("", 
"2011-07-21T20:00:00"), class = "factor"), longitude = structure(c(1L, 
1L, 1L, 1L, 1L), .Label = "-6.396", class = "factor"), latitude = structure(c(1L, 
1L, 1L, 1L, 1L), .Label = "56.6283", class = "factor"), julian = structure(1:5, .Label = c("2455764.833333", 
"2455764.875000", "2455764.916667", "2455764.958333", "2455765.000000"
), class = "factor"), record_no = 1:5, temp = structure(c(1L, 
3L, 2L, 4L, 5L), .Label = c("12.414", "12.463", "12.515", "12.618", 
"12.767"), class = "factor"), depth = structure(c(1L, 1L, 1L, 
1L, 1L), .Label = "  34.00", class = "factor")), row.names = c(NA, 
5L), class = "data.frame")

An online Julian date converter handles these values correctly (for the first one above it gives 22 July 2011), but I need the time of day encoded after the decimal point as well as the date.

The origin is January 1, 4713 BC. I've read that as.Date doesn't handle BC dates. If I convert tiree$julian to numeric, it drops the digits after the decimal point.

I've tried various suggestions from this site but haven't found any that work with both the BC origin and the time element.

tiree$date2 <- as.Date(tiree$julian, origin = structure(-2440588, class = "Date"))

This comes from "Convert Julian Date to Date - R" and gives me `Error in charToDate(x) : character string is not in a standard unambiguous format` (edit: after converting to numeric as suggested, the error goes away but the output is incorrect).

Any suggestions welcomed - I think I am probably missing something obvious!

Many thanks

Bun
  • Try `tiree$date2 <- as.Date(as.numeric(tiree$julian), origin = structure(-2440588, class = "Date"))` – Ronak Shah Sep 02 '20 at 08:45
  • Thanks @Ronak Shah - it does solve the unambiguous format error, but sadly that approach still doesn't give the correct dates or the time element. – Bun Sep 02 '20 at 09:48
  • Please provide `dput(head(tiree$julian))`. – jay.sf Sep 02 '20 at 10:02
  • Have edited to add dput – Bun Sep 02 '20 at 11:32
  • @Bun Well done, it's essential to always use `dput` when sharing data here, so we can see the structure of your data. You can learn more here: [how-to-make-a-great-r-reproducible-example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) – jay.sf Sep 02 '20 at 11:43

1 Answer

You have factor data and need to convert it properly to numeric.
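
As a minimal sketch of why the factor conversion matters (the vector below is a hypothetical three-value stand-in for `dat$julian`): calling `as.numeric` directly on a factor returns its internal integer codes, while converting the levels and indexing by the factor recovers the actual values. Note too that R prints 7 significant digits by default, so a correctly converted value can look as if its decimals were dropped.

```r
## Hypothetical stand-in for dat$julian: Julian dates stored as a factor
julian <- factor(c("2455764.833333", "2455764.875000", "2455764.916667"))

## as.numeric() on a factor returns the internal integer codes, not the values
codes <- as.numeric(julian)

## Convert the levels to numeric, then index by the factor to get the values
values <- as.numeric(levels(julian))[julian]

## Equivalent alternative via character
values2 <- as.numeric(as.character(julian))

print(codes)
## Ask for more digits than the default 7 so the fractional part is visible
print(values, digits = 13)
```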

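The origin doesn't appear to have been correct either. Julian dates start at noon, not midnight, and whole days (which as.Date works with) cannot carry the time of day, so we have to convert to seconds, which as.POSIXlt uses, and count from an origin at noon (see the discussion in the comments).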

dat <- transform(dat, 
                 ## date version: prints the date only, counted from midnight
                 julian2=as.Date(as.numeric(levels(julian))[julian], 
                                 origin=structure(-2440588, class = "Date")),
                 ## date-time version: days * 86400 gives seconds since the
                 ## origin, which is noon GMT on -4713-11-24
                 julian3=as.POSIXlt(as.numeric(levels(julian))[julian]*86400, 
                                    origin=structure(-210866760000, 
                                                     class=c("POSIXct", "POSIXt"),
                                                     tzone="GMT"),
                                    tz="GMT"))

Result

dat[c("julian", "julian2", "julian3")]  ## relevant columns selected
#           julian    julian2             julian3
# 1 2455764.833333 2011-07-21 2011-07-22 07:59:59
# 2 2455764.875000 2011-07-21 2011-07-22 09:00:00
# 3 2455764.916667 2011-07-21 2011-07-22 10:00:00
# 4 2455764.958333 2011-07-21 2011-07-22 10:59:59
# 5 2455765.000000 2011-07-22 2011-07-22 12:00:00
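
The `07:59:59` and `10:59:59` values above come from fractional seconds being cut off when the times are displayed (0.833333 of a day is slightly less than 20 hours). A minimal sketch of one way around this, assuming the readings are meant to fall on whole seconds: shift by the Julian date of the Unix epoch (2440587.5) and round before converting. The names `jd`, `unix_epoch_jd`, `secs` and `dt` are illustrative only.

```r
## Julian dates from the question, already converted to numeric
jd <- c(2455764.833333, 2455764.875000, 2455764.916667,
        2455764.958333, 2455765.000000)

## JD 2440587.5 corresponds to 1970-01-01 00:00:00 UTC (the Unix epoch)
unix_epoch_jd <- 2440587.5

## Seconds since the Unix epoch, rounded to the nearest whole second
## (assumption: the data has no sub-second resolution)
secs <- round((jd - unix_epoch_jd) * 86400)

dt <- as.POSIXct(secs, origin = "1970-01-01", tz = "GMT")
out <- format(dt, "%Y-%m-%d %H:%M:%S")
print(out)
```

With rounding, the five times should come out on the hour, from 08:00:00 to 12:00:00 on 2011-07-22, which agrees with the almanac values quoted in the comments.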
jay.sf
  • Thanks @jay.sf - I can get that from the numeric as Ronak shah kindly suggested earlier - but the main thing I need is the times from after the decimal in addition to the date (as you can see there are multiple lines per date), and this approach doesn't give that. Thanks again. – Bun Sep 02 '20 at 11:47
  • @Bun I know, I just noticed: you want `as.POSIXct` rather than `as.Date`. Since `as.POSIXct` calculates in seconds, whereas `as.Date` calculates in days, we need to multiply by the number of seconds in a day, i.e. `86400`. Probably `86400 + 1` to fully match the `date` column. See edit. – jay.sf Sep 02 '20 at 11:48
  • Amazing, thank you so much @jay.sf - I've been trying to find the answer to this all morning but had a feeling it would be something simple - you've saved me hours. – Bun Sep 02 '20 at 12:04
  • I am not an R programmer, but am familiar with Julian dates. @jay.sf's answer appears wrong. According to the Multiyear Computer Interactive Almanac by the US Naval Observatory, 2455764.833333 UT = 2011 Jul 22 08:00:00.0 Fri; 2455764.875000 UT = 2011 Jul 22 09:00:00.0 Fri; 2455764.916667 UT = 2011 Jul 22 10:00:00.0 Fri; 2455764.958333 UT = 2011 Jul 22 11:00:00.0 Fri; 2455765.000000 UT = 2011 Jul 22 12:00:00.0 Fri. Julian dates begin at noon and are always in some flavor of Universal Time. – Gerard Ashton Sep 02 '20 at 19:42
  • To clarify my previous comment, Julian dates always begin at noon Universal Time. There are a few varieties of Universal Time that can differ by a few minutes, such as UTC or International Atomic Time. – Gerard Ashton Sep 02 '20 at 20:00
  • @GerardAshton Thank you for pointing that out! Consequently, the OP's premise was not correct. I'm not actually a Julian date expert, so let's work together. It was assumed that values such as `2455764.833333` are fractions of seconds since the origin at midnight of the Gregorian date `"-4713-11-24"`; might there be a mistake? – jay.sf Sep 02 '20 at 20:10
  • @jay.sf I do not see any unambiguous statements by the OP as to what the values, such as 2455764.833333, should be in some stated time zone. One definition of Julian date is the number of days (not seconds) since noon, universal time, -4713-11-24, in the Gregorian proleptic calendar. (That is, apply the rules of the Gregorian calendar from today backward as far as desired, ignoring the fact that it didn't exist before 15 October 1582). Time of day is expressed as a decimal fraction of a day, so noon is XXX.0, 6 pm is XXX.25, midnight is XXX.5, and 6 am is XXX.75. – Gerard Ashton Sep 02 '20 at 20:28
  • @GerardAshton Ah, it's the `structure(-2440588, class = "Date")` given by OP which evaluates to `"-4713-11-24"`. – jay.sf Sep 02 '20 at 20:32
  • It appears the R Date class only handles integer days, so as I said, no clear statement about what times of day the OP thinks are correct. – Gerard Ashton Sep 02 '20 at 20:48
  • @GerardAshton Yes, `"Date"` only handles whole days, therefore we have to use the `POSIX` format. I believe I've figured it out, see edit! `structure(-210866760000, class=c("POSIXct", "POSIXt"), tzone="GMT")` now evaluates to `"-4713-11-24 12:00:00 GMT"`. The remaining one-second deviations seem to be rounding issues. – jay.sf Sep 02 '20 at 21:00
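
The definition given in the comments above (noon is XXX.0, 6 pm is XXX.25, midnight is XXX.5, 6 am is XXX.75) and the noon origin can be checked with a little arithmetic. This is only an illustrative sketch; the variable names are not from the original thread.

```r
## Fractional part of a Julian date, as a fraction of a day after noon UT
frac <- c(0, 0.25, 0.5, 0.75)

## Hours elapsed since noon, then wrapped onto a 24-hour clock
hours_after_noon <- frac * 24
hour_of_day <- (12 + hours_after_noon) %% 24
print(hour_of_day)

## The POSIXct origin used in the answer, converted from seconds to days:
## the trailing .5 reflects the noon (rather than midnight) start
origin_days <- -210866760000 / 86400
print(origin_days, digits = 10)
```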