I have 4190145 observations. And I want to change my date to the end day of the month. I explain in the following.
Here is my part of data:
Time1
2015/01/15
2015/02/24
2015/07/18
2015/11/10
2016/02/20
2016/04/26
2016/08/17
I want to create new column Time2
:
Time1 Time2
2015/01/15 2015/01/31
2015/02/24 2015/02/28
2015/07/18 2015/07/31
2015/11/10 2015/11/30
2016/02/20 2016/02/29
2016/04/26 2016/04/30
2016/08/17 2016/08/31
The code is:
data[, Time2 := Time1]
day(data$Time2) <- days_in_month(data$Time1)
However, I got the error.
Error: cannot allocate vector of size N Mb
Therefore, I search my problem on Stack Overflow and find this.
I use gc()
but still not work. So I see my sessionInfo()
:
sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 (build 7601) Service Pack 1
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lubridate_1.6.0 data.table_1.10.4
loaded via a namespace (and not attached):
[1] lazyeval_0.2.0 R6_2.2.0 assertthat_0.1 magrittr_1.5 DBI_0.5-1 tools_3.3.3 dplyr_0.5.0 tibble_1.2
[9] Rcpp_0.12.9 stringi_1.1.2 stringr_1.1.0
I think it is because my computer just 32-bit. And memory.limit()
can just set size to 4000 because my RAM is 4 GB.
Moreover, I figure out that if I use data.table
to execute, I can run.
So, my problem is how to change my code from day(data$Time2) <- days_in_month(data$Time1)
to data.table
form.
Maybe like data[, day(data$Time2) := days_in_month(data$Time1)]
?
I know this is incorrect because I get
Error: Can't assign to the same column twice in the same query (duplicates detected).
Any suggestion?