I'm trying to write the following nested ifelse statement. My goal is to create a new column that identifies individuals who have the same value in all four columns.

example<-data.frame(replicate(4,sample(1:2,20,rep=TRUE)))

within(example,example$X <- ifelse(example$X1!=example$X2,NA,
                                 ifelse(example$X2!=example$X3,NA,
                                        ifelse(example$X3!=example$X4,NA,example$X1))))

In my case, the four columns are years, and I want to identify panel individuals. With my data, the code is the following:

within(educ_1,educ_1$X <- ifelse(educ_1$N2014!=educ_1$N2015,NA,
                                 ifelse(educ_1$N2015!=educ_1$N2016,NA,
                                   ifelse(educ_1$N2016!=educ_1$N2017,NA,educ_1$N2014))))

However, I'm getting the following error:

Error in [<-.data.table(*tmp*, , nl, value = list(educ_1 = list(PER_ID = c(9.95048326217056e-313, : Supplied 67 items to be assigned to 14191305 items of column 'educ_1'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.

One thing to note: I imported the data with the fread function, since it has about 14 million observations and I thought that would be more efficient.

Any thoughts or suggestions about the syntax or the error? What should I do? Why does it work with the simple example but not with my own data?


user213544
  • Use random seed for reproducible input example `set.seed(1); example <- data.frame(replicate(4,sample(1:2,20,rep=TRUE)))` then provide expected output. – zx8754 Jan 30 '20 at 11:56
  • Related post: https://stackoverflow.com/q/43580891/680068 – zx8754 Jan 30 '20 at 12:06

1 Answer


Using data.table syntax you could do:

setDT(example) # Shouldn't be necessary if you used fread() to import the data
example[, 
        X := uniqueN(unlist(.SD)) == 1, 
        by = 1:nrow(example),
        .SDcols = patterns("X[0-9]")]
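
Note that the call above stores a TRUE/FALSE flag in X, and grouping by every row can be slow on roughly 14 million rows. As a possible alternative, here is a minimal sketch (not tested on the real educ_1 data) that compares whole columns at once and keeps the shared value or NA, as in the question. It assumes the toy columns X1 to X4 are integer and that the installed data.table version provides fifelse(); example_df is a hypothetical name used only for the base R variant. As far as I can tell, the original error comes from writing example$X inside within(): that builds a modified copy of the whole table inside within()'s environment, which is then assigned back as a single column named after the table.

```r
library(data.table)

set.seed(1)  # reproducible toy data, as suggested in the comments
example <- as.data.table(data.frame(replicate(4, sample(1:2, 20, replace = TRUE))))

# One vectorized pass over whole columns, with no per-row grouping.
# Inside [ ] the columns are referenced by bare name, not example$X1.
# NA_integer_ matches the integer type of X1; use NA_real_ for double columns.
example[, X := fifelse(X1 == X2 & X2 == X3 & X3 == X4, X1, NA_integer_)]

# Base R variant on a plain data.frame: assign to bare X inside within()
example_df <- as.data.frame(example[, .(X1, X2, X3, X4)])
example_df <- within(example_df, X <- ifelse(X1 == X2 & X2 == X3 & X3 == X4, X1, NA))
```

For the real data the same pattern would use N2014, N2015, N2016 and N2017 in place of X1 to X4, with the NA constant chosen to match the type of those columns.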
s_baldur