I have a data frame with case IDs and timestamps.

str(Frame$Timestamp) 
POSIXct[1:3320], format: "2018-01-02 09:10:14" "2018-01-02 09:10:14" "2018-01-02 09:35:30" "2018-01-02 10:30:43" "2018-01-02 17:10:09" ...

In the console I can execute group_by(Frame, CaseID) without problems.

When I knit a .Rmd-notebook with the same command I receive the following error:

Error in grouped_df_impl(data, unname(vars), drop) : 
  Column 'Timestamp' is of unsupported POSIXlt/POSIXt
Calls: <Anonymous> ... group_by.data.frame -> grouped_df -> grouped_df_impl
Execution halted

What can I do to make it possible to use group_by() in that case?

Ben

  • Hi @BenEngbers, your question does not meet the requirements of a [minimal, complete, and verifiable example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Could you a) **give a better idea of your data** (see the link for tips) and b) **show the code chunks that come before** the `group_by()` call in your .Rmd notebook? Otherwise it will be really hard to help. **Why is that important?** This error could have resulted from any of several things rooted in a transformation of the data frame that happened before the `group_by()`. – Roman Oct 13 '18 at 11:44
  • Perhaps the order of packages you're loading differs between your session and Rmd document? – Roman Luštrik Oct 13 '18 at 11:49

1 Answer

The approach below works for me.

Output of group_by()

group_by(Frame, Timestamp)
## # A tibble: 1,000 x 2
## # Groups:   Timestamp [999]
##    Timestamp           CaseID
##    <dttm>               <int>
##  1 2018-10-01 19:09:56  18592
##  2 2018-10-16 14:20:57  49269
##  3 2018-09-30 02:37:33  66986
##  4 2018-10-11 20:16:19  22090
##  5 2018-10-20 13:16:46  11802
##  6 2018-10-05 17:05:00  70791
##  7 2018-10-14 05:54:05  32192
##  8 2018-10-13 22:44:01  92938
##  9 2018-09-28 21:40:36  86432
## 10 2018-10-14 03:53:11  90539
## # ... with 990 more rows

Contents of Test.Rmd

---
title: "Test"
author: "Roman Abashin"
date: "13 Oct 2018"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

Load library 
```{r}
library(dplyr)
```

Create data
```{r}
set.seed(1701)
Frame <- data.frame(
    Timestamp = (as.POSIXct("2018-10-10 10:10:10") + 
        sample(-1000000:1000000,1000, replace = TRUE)), 
    CaseID = sample(10000:99999, 1000, replace = FALSE))
str(Frame)
```

Group by
```{r}
group_by(Frame, Timestamp)
```
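
If the error still shows up when knitting, it may be worth checking which class the column actually has at that point. The following is only a minimal sketch in base R, not necessarily what went wrong in your notebook: the small `Frame` built here is hypothetical example data, and the POSIXlt column is assigned after creation because `data.frame()` itself converts POSIXlt to POSIXct.

```r
# Hypothetical example data (not the asker's real data).
Frame <- data.frame(CaseID = 1:3)
Frame$Timestamp <- as.POSIXlt(
  c("2018-01-02 09:10:14", "2018-01-02 09:35:30", "2018-01-02 10:30:43"),
  tz = "UTC"
)

# Inspect the class: a POSIXlt column is what the error message complains about.
class(Frame$Timestamp)

# Convert every POSIXlt column to POSIXct and leave all other columns alone.
for (col in names(Frame)) {
  if (inherits(Frame[[col]], "POSIXlt")) {
    Frame[[col]] <- as.POSIXct(Frame[[col]])
  }
}

# The column should now be POSIXct.
class(Frame$Timestamp)
```

Putting the `class()` call directly in front of the `group_by()` chunk in the .Rmd shows what knitr really sees, which can differ from the objects in the interactive session.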
  • Thanks for your answer (and the original comment). I noticed that the only real difference between your example and my notebook is the line "output: html_document". After changing "output: pdf_document" to "output: html_document", I could knit without problems. After changing back to the original (and intended) pdf, the original error was reproduced. I also saw that you used a tibble. I tried changing the data frame to a tibble to see whether that would make any difference, but it didn't help. What further information do you need? – Ben Engbers Oct 13 '18 at 14:43
  • @BenEngbers, if I comment out the library call (`# library(dplyr)`), I can reproduce your exact error message. This means that, for some reason, the library is not being loaded correctly. It would be really helpful if you could post your code. – Roman Oct 13 '18 at 14:49
  • I can't post the complete code because its internals show too many details. In the process of removing all the confidential parts, suddenly I could knit to pdf without errors! I don't know what is happening, but I believe it has to do with code that was inserted by a colleague. If I can figure out what caused the problem, I'll let you know. – Ben Engbers Oct 13 '18 at 18:44
  • 1
    The results from most of the chunks were cached and apparently some of the cached objects still used the POSIXlt structure. After deleting the cache the notebook was processed without problems. – Ben Engbers Oct 15 '18 at 09:12
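
For anyone who runs into the same thing, deleting the cache can be sketched roughly as follows. The directory name `Test_cache` is an assumption: R Markdown usually puts the cache in a folder named after the document with a `_cache` suffix, so adjust it to your own file name.

```r
# Assumed cache directory for a document called Test.Rmd (hypothetical name).
cache_dir <- "Test_cache"

# Delete the cached chunk results so every chunk is re-evaluated on the next knit.
if (dir.exists(cache_dir)) {
  unlink(cache_dir, recursive = TRUE)
}
```

Alternatively, setting the chunk option `cache = FALSE` while debugging avoids stale objects, at the cost of longer knit times.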