First, I apologize for the clunky formatting. I am not a programmer; I am just trying to learn R and run some analyses.

I am running dcast to reshape a dataset from long to wide. This code was used previously without issues, but now about 6 months later it will not work correctly.

Example of dataset structure

    id  m  v1  v2  v3
    A   1  f   1   p
    A   2  e   2   o
    A   3  k   3   j
    A   4  l   1   o
    B   1  k   2   p
    B   2  d   3   o
    B   3  a   1   j
    B   4  l   6   o
    ...

There are 4218 unique IDs and each has one occurrence of each m. I have verified the count of m matches the unique id and that among the vector of id*m there are no duplicates.
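For reference, here is a minimal sketch of the duplicate check described above, run on toy data shaped like the example (the values are made up and only two ids are used). It assumes the key columns are literally named `id` and `m`. Both checks should come back empty when each id has exactly one row per m.

```r
library(data.table)

# Toy data mirroring the structure in the question (values are made up)
mydata <- data.table(
  id = rep(c("A", "B"), each = 4),
  m  = rep(1:4, times = 2),
  v1 = c("f", "e", "k", "l", "k", "d", "a", "l"),
  v2 = c(1, 2, 3, 1, 2, 3, 1, 6),
  v3 = c("p", "o", "j", "o", "p", "o", "j", "o")
)

# Number of rows that repeat an earlier (id, m) combination
n_dup <- sum(duplicated(mydata, by = c("id", "m")))

# All rows belonging to an (id, m) group with more than one member
dup_rows <- mydata[, if (.N > 1L) .SD, by = .(id, m)]

print(n_dup)
print(nrow(dup_rows))
```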

I am trying to get a wide dataset:

id m  m1v1 m1v2 m1v3 m2v1 m2v2 m2v3

My code is as follows:

y <- dcast(setDT(mydata), 'id' ~ 'm', value.var = c('v1', 'v2', 'v3'))

This ran without error about 6 months ago, but reloading in new data with the same structure but different variable names has resulted in the error:

"Aggregate function missing, defaulting to 'length'"

There are no duplicates (verified all rows unique with respect to id and m), so I cannot figure out why this is happening. All other answers regarding this issue are due to duplicate values.

  • I'm pretty sure that the message (not an error) means that there are duplicates. Try `mydata[, if (.N > 1L) .SD, by=.(id, m)]` to see them. – Frank Apr 05 '18 at 01:02
  • I can't find duplicates. This code returns an empty table. To check, I also pasted the values from id and m together and then checked that the unique count was equal to the number of rows. I am fairly certain there are no duplicates, which is why I am perplexed. – smd Apr 05 '18 at 04:29
  • Did you pass a function to `fun.aggregate`? Also, shouldn't it be `dcast(dat, id ~ m, value.var=c('v1','v2','v3'))`, without quoting the variables in the formula? – chinsoon12 Apr 05 '18 at 06:02
  • I added the quotes because the variables in the original dataset have a space in them. This is what RStudio was doing, but it is interesting that when I change the column names to "id" and "m" it runs correctly. Thank you. I didn't realize this would be an issue. – smd Apr 05 '18 at 20:55
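The fix reported in the last comment can be sketched as below. The original column names `subject id` and `month no` are hypothetical stand-ins for the real names containing spaces, and the values are made up. The idea is to rename the key columns to syntactic names so the formula can be written without quotes.

```r
library(data.table)

# Toy data; "subject id" and "month no" are hypothetical names with spaces
mydata <- data.table(
  `subject id` = rep(c("A", "B"), each = 4),
  `month no`   = rep(1:4, times = 2),
  v1 = c("f", "e", "k", "l", "k", "d", "a", "l"),
  v2 = c(1, 2, 3, 1, 2, 3, 1, 6),
  v3 = c("p", "o", "j", "o", "p", "o", "j", "o")
)

# Rename the key columns so the formula needs no quoting
setnames(mydata, c("subject id", "month no"), c("id", "m"))

# Unquoted formula: one row per id, one column per value variable and m
wide <- dcast(mydata, id ~ m, value.var = c("v1", "v2", "v3"))

print(names(wide))
```

With several entries in `value.var`, data.table names the new columns as value variable, underscore, then the value of m (for example `v1_1`). If renaming is not an option, wrapping the original names in backticks inside the formula is the usual R way to refer to names with spaces, though that variant is not tested here.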