2

I've got some data which is character class:

class(solarX[,MEASDATE])
[1] "character"

Which looks like :

    head(solarX[,MEASDATE])
[1] "1/05/2015 0:00" "1/05/2015 0:30" "1/05/2015 1:00" "1/05/2015 1:30" "1/05/2015 2:00" "1/05/2015 2:30"

However, I don't want it in character format and need to be able to access the Hours and Minutes part of the date-time.

What i've tried is:

strptime(solarX[,MEASDATE], "%d/%m/%Y %H:%M"))

Which works great, it gives me:

 [1] "2015-05-01 00:00:00 AEST" "2015-05-01 00:30:00 AEST" "2015-05-01 01:00:00 AEST" "2015-05-01 01:30:00 AEST" "2015-05-01 02:00:00 AEST"
[6] "2015-05-01 02:30:00 AEST"



 class(strptime(solarX[,MEASDATE], "%d/%m/%Y %H:%M"))
[1] "POSIXlt" "POSIXt"

However, when I go to augment my original data table by doing:

solarX[, date := strptime(solarX[,MEASDATE], "%d/%m/%Y %H:%M")]

(solarX is my data table)

I get the following warning:

    Warning message:
In `[.data.table`(solarX, , `:=`(date, strptime(solarX[, MEASDATE],  :
  Supplied 11 items to be assigned to 17568 items of column 'date' (recycled leaving remainder of 1 items)

And the data table returned looks terrible:

    MEASDATE      rrp exp_kwh                     date
1: 1/05/2015 0:00 33.99299       0             0,0,0,0,0,0,
2: 1/05/2015 0:30 31.53335       0        0,30, 0,30, 0,30,
3: 1/05/2015 1:00 29.37092       0             0,0,1,1,2,2,
4: 1/05/2015 1:30 28.03197       0             1,1,1,1,1,1,
5: 1/05/2015 2:00 26.82800       0             4,4,4,4,4,4,
6: 1/05/2015 2:30 25.22149       0 115,115,115,115,115,115,

Clearly I don't want the "date" column to look like that, rather have it filled with the values i got from the original strptime function.

SymbolixAU
  • 25,502
  • 4
  • 67
  • 139
Gin_Salmon
  • 837
  • 1
  • 7
  • 19
  • `solarX[, date := strptime(MEASDATE, "%d/%m/%Y %H:%M")]` – HubertL May 11 '16 at 01:02
  • remove the `solarX` inside the `strptime` – SymbolixAU May 11 '16 at 01:03
  • yeah, what @HubertL said. – SymbolixAU May 11 '16 at 01:03
  • Heads up that `data.table` doesn't like `POSIXlt`. They consume a huge amount of memory. Why not `as.POSIXct`? – MichaelChirico May 11 '16 at 01:08
  • 1
    @MichaelChirico - I think `data.table` converts it to `POSIXct` internally (and warns you that it's done it). – SymbolixAU May 11 '16 at 01:09
  • In fact, from `?data.table`: `POSIXlt` is not supported as a column type because it uses 40 bytes to store a single datetime. They are implicitly converted to `POSIXct` type with _warning_. You may also be interested in `IDateTime` instead; it has methods to convert to and from `POSIXlt`. – MichaelChirico May 11 '16 at 01:09
  • Hey @HubertL and @Symbolix, I've tried your suggestions: `solarX[, date := strptime(MEASDATE, "%d/%m/%Y %H:%M")]` `Warning message: In `[.data.table`(solarX, , `:=`(date, strptime(MEASDATE, "%d/%m/%Y %H:%M"))) : Supplied 11 items to be assigned to 17568 items of column 'date' (recycled leaving remainder of 1 items).` Am i still doing something wrong there? – Gin_Salmon May 11 '16 at 01:10
  • 1
    There is something weird with the `MEASDATE` column wjich seems to contain only 11 elements (even 6 if you take a look at the result of `strptime(solarX[,MEASDATE], "%d/%m/%Y %H:%M"))` compared to the 17568 that are in the rest of the `data.frame`. – HubertL May 11 '16 at 01:13

1 Answers1

1

You can't use POSIXlt in data.table, use POSIXct instead:

library(data.table)
## setting up the data
solarX <- fread('MEASDATE      rrp exp_kwh  
"1/05/2015 0:00" 33.99299 0
"1/05/2015 0:30" 31.53335 0
"1/05/2015 1:00" 29.37092 0
"1/05/2015 1:30" 28.03197 0
"1/05/2015 2:00" 26.82800 0
"1/05/2015 2:30" 25.22149 0')

solarX[, date := as.POSIXct(MEASDATE, format = "%d/%m/%Y %H:%M")]

#           MEASDATE      rrp exp_kwh                date
# 1: 1/05/2015 0:00 33.99299       0 2015-05-01 00:00:00
# 2: 1/05/2015 0:30 31.53335       0 2015-05-01 00:30:00
# 3: 1/05/2015 1:00 29.37092       0 2015-05-01 01:00:00
# 4: 1/05/2015 1:30 28.03197       0 2015-05-01 01:30:00
# 5: 1/05/2015 2:00 26.82800       0 2015-05-01 02:00:00
# 6: 1/05/2015 2:30 25.22149       0 2015-05-01 02:30:00

str(solarX)
Classes ‘data.table’ and 'data.frame':  6 obs. of  4 variables:
 $ MEASDATE: chr  "1/05/2015 0:00" "1/05/2015 0:30" "1/05/2015 1:00" "1/05/2015 1:30" ...
 $ rrp    : num  34 31.5 29.4 28 26.8 ...
 $ exp_kwh: int  0 0 0 0 0 0
 $ date   : POSIXct, format: "2015-05-01 00:00:00" "2015-05-01 00:30:00" "2015-05-01 01:00:00" "2015-05-01 01:30:00" ...
 - attr(*, ".internal.selfref")=<externalptr> 
eddi
  • 49,088
  • 6
  • 104
  • 155
SymbolixAU
  • 25,502
  • 4
  • 67
  • 139
  • I can't see how you got this result. Apart from your gsub with the "_" which results in the same MEASDATE column that I have, do I need to change anything else to replicate this? – Gin_Salmon May 11 '16 at 01:25
  • @Gin_Salmon everything upto and including the `gsub` line is just me getting the data into my environment, as taken from the data you've provided. The `solarX[, date := ...]` line is the only line you **should** need – SymbolixAU May 11 '16 at 01:29