I am having trouble with the mutate function in dplyr. The error says:

Error: incompatible size (0), expecting 5 (the group size) or 1

There are some previous posts about this error, and I tried some of their solutions, but none worked for my case:

group-factorial-data-with-multiple-factors-error-incompatible-size-0-expe

r-dplyr-using-mutate-with-na-omit-causes-error-incompatible-size-d

grouped-operations-that-result-in-length-not-equal-to-1-or-length-of-group-in-dp

Here is what I tried:

ff <- c(seq(0,0.2,0.1),seq(0,-0.2,-0.1))
flip <- c(c(0,0,1,1,1,1),c(1,1,0,0,0,0))
df <- data.frame(ff,flip,group=gl(2,6)) 

> df
     ff flip group
1   0.0    0     1
2   0.1    0     1
3   0.2    1     1
4   0.0    1     1
5  -0.1    1     1
6  -0.2    1     1
7   0.0    1     2
8   0.1    1     2
9   0.2    0     2
10  0.0    0     2
11 -0.1    0     2
12 -0.2    0     2

I want to add new columns called c1 and c2, based on some conditions, as follows:

 dff <- df%>%
      group_by(group)%>%
      mutate(flip=as.numeric(flip),direc=ifelse(c(0,diff(ff))<0,"backward","forward"))%>%
      spread(direc,flip)%>%
      arrange(group,group)%>%
      mutate(c1=ff[head(which(forward>0),1)],c2=ff[tail(which(backward>0),1)])

Error: incompatible size (0), expecting 5 (the group size) or 1

I also tried using do() instead:

do(data.frame(., c1=ff[head(which(.$forward>0),1)],c2=ff[tail(which(.$backward>0),1)]))

Error in data.frame(., c1 = ff[head(which(.$forward > 0), 1)], c2 = ff[tail(which(.$backward > : arguments imply differing number of rows: 5, 1, 0

But when I mutate only the c1 column, everything seems to work. Why?

Alexander

2 Answers

It might be informative to step through the pipe to see what is going on.

df %>%
  group_by(group)%>%
  mutate(flip=as.numeric(flip),direc=ifelse(c(0,diff(ff))<0,"backward","forward"))%>%
  spread(direc,flip)%>%
  arrange(group,group)
# Source: local data frame [10 x 4]
# Groups: group [2]
#       ff  group backward forward
#    <dbl> <fctr>    <dbl>   <dbl>
# 1   -0.2      1        1      NA
# 2   -0.1      1        1      NA
# 3    0.0      1        1       0
# 4    0.1      1       NA       0
# 5    0.2      1       NA       1
# 6   -0.2      2        0      NA
# 7   -0.1      2        0      NA
# 8    0.0      2        0       1
# 9    0.1      2       NA       1
# 10   0.2      2       NA       0

BTW: Why arrange(group,group)? Doubling the order variable is pointless.

Looking at this output, you'll see that group 2 has no backward values greater than 0. When you run something like which(FALSE) you get integer(0), and indexing ff with integer(0) returns a zero-length vector. dplyr needs the right-hand side of each mutate assignment to have either length 1 or the same length as the number of rows in the group, which is exactly what the error message says.
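To make that concrete, here is a minimal base-R sketch using group 2's values copied by hand from the output above. The variable names idx, c2, and c2_g1 are only for illustration.

```r
# Base R only; no packages needed.
ff       <- c(-0.2, -0.1, 0.0, 0.1, 0.2)  # ff values of one group after spread()
backward <- c(0, 0, 0, NA, NA)            # group 2's backward column

idx <- which(backward > 0)   # nothing is > 0 (NAs are ignored), so integer(0)
c2  <- ff[tail(idx, 1)]      # indexing with integer(0) gives numeric(0)
length(c2)                   # 0: neither 1 nor the group size (5)

# For comparison, group 1's backward column does contain values > 0,
# so the same expression returns a single value.
backward_g1 <- c(1, 1, 1, NA, NA)
c2_g1 <- ff[tail(which(backward_g1 > 0), 1)]
length(c2_g1)                # 1
```

A length-0 result for one group is what mutate rejects, even though the other group is fine.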

Instead of your mutate, I'll show it with a slight modification: for c2, return the number of indices found by the which call:

df %>%
  group_by(group)%>%
  mutate(flip=as.numeric(flip),direc=ifelse(c(0,diff(ff))<0,"backward","forward"))%>%
  spread(direc,flip)%>%
  arrange(group,group)%>%
  mutate(
    c1 = ff[head(which(forward>0),1)],
    c2len = length(which(backward > 0))
  )
# Source: local data frame [10 x 6]
# Groups: group [2]
#       ff  group backward forward    c1 c2len
#    <dbl> <fctr>    <dbl>   <dbl> <dbl> <int>
# 1   -0.2      1        1      NA   0.2     3
# 2   -0.1      1        1      NA   0.2     3
# 3    0.0      1        1       0   0.2     3
# 4    0.1      1       NA       0   0.2     3
# 5    0.2      1       NA       1   0.2     3
# 6   -0.2      2        0      NA   0.0     0
# 7   -0.1      2        0      NA   0.0     0
# 8    0.0      2        0       1   0.0     0
# 9    0.1      2       NA       1   0.0     0
# 10   0.2      2       NA       0   0.0     0

In order to meaningfully index on ff, you need something other than integer(0) in your returns.
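One possible way to avoid the zero-length result is to fall back to NA when nothing matches. The sketch below assumes that NA is an acceptable value for c2 in groups with no backward value greater than 0, which the question does not confirm. The helper last_where is hypothetical (not part of dplyr), and the data frame wide is the spread() output above typed in by hand.

```r
library(dplyr)

# Hypothetical helper: value of x at the last position where cond is TRUE,
# or NA when cond is never TRUE (assumed fallback, not from the question).
last_where <- function(x, cond) {
  idx <- which(cond)
  if (length(idx) == 0) NA_real_ else x[idx[length(idx)]]
}

# The spread() result shown above, entered by hand
wide <- data.frame(
  ff       = rep(c(-0.2, -0.1, 0.0, 0.1, 0.2), 2),
  group    = gl(2, 5),
  backward = c(1, 1, 1, NA, NA, 0, 0, 0, NA, NA),
  forward  = c(NA, NA, 0, 0, 1, NA, NA, 1, 1, 0)
)

res <- wide %>%
  group_by(group) %>%
  mutate(
    c1 = ff[head(which(forward > 0), 1)],   # always length 1 in this data
    c2 = last_where(ff, backward > 0)       # length 1 even with no match
  )
```

Because the helper always returns exactly one value, mutate can recycle it across the group. Whether NA is the right answer for group 2 depends on what c2 is supposed to mean.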

r2evans
  • I am sorry, but it seems there is no output for c2. Am I right? – Alexander Jan 16 '17 at 08:58
  • The point is effectively alistaire's, but I wasn't assuming that that was your logic. My answer is strictly answering the question "Why", assuming you could fix the flawed logic. – r2evans Jan 16 '17 at 13:22
  • (Correction, the point was @Sotos' point.) A common omission in SO questions (hinted at in the [SO help](http://stackoverflow.com/help/mcve) under "Verifiable") is to provide a sample of the expected output. Sometimes this can be as easy as generating a static data.frame with the values you expect, or at least what they will look like. Especially with complicated input structures, it almost always benefits everybody if the question is asked with small and representative data; in which case, it is often much easier to dictate what the output should be. – r2evans Jan 16 '17 at 15:52

Just expanding on @alistaire's comment.

  1. Your specified conditions are the cause of the error, specifically tail(which(backward>0),1).
  2. The given code can be simplified by getting rid of the spread().

You can try:

dff <- df%>%
  group_by(group)%>%
  mutate(flip=as.numeric(flip),direc=ifelse(c(0,diff(ff))<0,"backward","forward"))%>%
  arrange(group)%>%
  mutate(c1=ff[head(which(direc=="forward" & flip > 0),1)])

It seems you are trying to identify, for each group, the points where the direction changes. If so, please clarify exactly how flip is related. Alternatively, if you change flip <- c(c(0,0,1,1,1,1),c(1,1,0,0,0,0)) to flip <- c(c(0,0,1,1,1,1),c(1,1,0,1,1,1)), so that flip marks the change in direction of ff, you can use:

dff <- df%>%
  group_by(group)%>%
  mutate(flip=as.numeric(flip),direc=ifelse(c(0,diff(ff))<0,"backward","forward"))%>%
  arrange(group)%>%
  mutate(c1=ff[head(which(direc=="forward" & flip > 0),1)]) %>%
  mutate(c2=ff[tail(which(direc=="backward"& flip >0),1)])

which gives:

Source: local data frame [12 x 6]
Groups: group [2]

      ff  flip  group    direc    c1    c2
   <dbl> <dbl> <fctr>    <chr> <dbl> <dbl>
1    0.0     0      1  forward   0.2  -0.2
2    0.1     0      1  forward   0.2  -0.2
3    0.2     1      1  forward   0.2  -0.2
4    0.0     1      1 backward   0.2  -0.2
5   -0.1     1      1 backward   0.2  -0.2
6   -0.2     1      1 backward   0.2  -0.2
7    0.0     1      2  forward   0.0  -0.2
8    0.1     1      2  forward   0.0  -0.2
9    0.2     0      2  forward   0.0  -0.2
10   0.0     1      2 backward   0.0  -0.2
11  -0.1     1      2 backward   0.0  -0.2
12  -0.2     1      2 backward   0.0  -0.2
Aramis7d
  • Thanks for your answer, but it seems that your solution does not work on my real df. The flip order is very important and must not be changed. When you change to `flip <- c(c(0,0,1,1,1,1),c(1,1,0,1,1,1))`, you are changing the data frame. – Alexander Jan 17 '17 at 00:11
  • Why are you changing the `flip` column? It must not be changed, and I am still getting the error since I cannot change the `flip` order in my real df. – Alexander Jan 17 '17 at 00:13
  • I only changed it as an example. You need to understand the cause of the error here: for `c2`, the condition `direc=="backward" & flip >0` is never true in group 2 of the example, so the resulting vector has length 0 rather than 1 or the number of rows in the group. – Aramis7d Jan 17 '17 at 05:49