I have seen this reshape2 issue several times on SO but haven't seen a solution to my particular problem.

I have a dataset like this:

head(data)
student    test    score
Adam      Exam1     80
Adam      Exam2     90
John      Exam1     70
John      Exam2     60

I am trying to cast this to a wide format that looks like this:

Student    Exam1    Exam2 ........ ExamX
Adam         80       90
John         70       60

using:

dcast(data,student~test,value.var='score')

but the data ends up looking something like this:

Student    Exam1     Exam2
Adam        0          0
John        0          1

with this error:

Aggregation function missing: defaulting to length

Any idea why it is changing all of these values to 0 or 1?

chattrat423
  • You need to provide a sequence column. Based on the example, though, `dcast(data, student~test, value.var='score')` works. Please provide an example with duplicate rows. – akrun May 26 '15 at 16:05
  • It's not an error. It's a warning to let you know that since you didn't provide a value for `fun.aggregate` (e.g., `fun.aggregate=mean`), it defaults to returning the length, which is a count of the number of rows falling into that combination of categories. I don't see `job_type` in your sample data. Did you want `dcast(data,student ~ test ,value.var='score')`? – eipi10 May 26 '15 at 16:10
  • Hi, I have a similar problem now, and I don't know how to fix it. Was the problem that `value.var` was mistyped? – Bobesh Dec 03 '15 at 16:44
  • @Bobesh: It was some time ago, but still: sometimes a simple `object <- unique(object)` works, as the problem can be caused by identical duplicate rows. – AlexDeLarge Jan 06 '17 at 15:24
  • @AlexDeLarge What is `object` in this case? – Vijay Ramesh Dec 07 '18 at 15:34
  • @VijayRamesh It works on data frames/data tables, vectors and arrays. – AlexDeLarge Dec 11 '18 at 10:58

1 Answer

Thanks to @akrun who pointed it out.

Well, there's a high chance that your data has duplicate rows that look either like this:

student    test    score
Adam      Exam1     80
Adam      Exam1     85
Adam      Exam2     90
John      Exam1     70
John      Exam2     60

Or like this:

student   class     test    score
Adam      Biology   Exam1     80
Adam      Theology  Exam1     85
Adam      Theology  Exam2     90
John      Biology   Exam1     70
John      Theology  Exam2     60

In the second case, each student/test combination is unique only once class is included, so cast it like this: dcast(data, student + class ~ test, value.var='score')
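
To make both cases concrete, here is a minimal sketch, assuming reshape2 is installed. The data frames are typed in by hand from the two tables above, and the names `data2`, `dup_keys`, `counts`, `means` and `wide2` are only illustrative. Taking the mean is just one possible choice of aggregation, not necessarily the right one for your data.

```r
library(reshape2)

# First case (hand-typed from the table above): Adam has two Exam1 rows.
data <- data.frame(
  student = c("Adam", "Adam", "Adam", "John", "John"),
  test    = c("Exam1", "Exam1", "Exam2", "Exam1", "Exam2"),
  score   = c(80, 85, 90, 70, 60)
)

# List the student/test combinations that appear more than once.
dup_keys <- data[duplicated(data[, c("student", "test")]), c("student", "test")]
print(dup_keys)

# With duplicates present and no fun.aggregate, dcast() falls back to
# length(), so the cells hold row counts rather than scores.
counts <- dcast(data, student ~ test, value.var = "score")
print(counts)

# Naming the aggregation explicitly gives scores back (here the mean
# of the duplicated rows; an assumption, pick whatever suits the data).
means <- dcast(data, student ~ test, value.var = "score", fun.aggregate = mean)
print(means)

# Second case: rows are unique only once class is part of the key.
data2 <- data.frame(
  student = c("Adam", "Adam", "Adam", "John", "John"),
  class   = c("Biology", "Theology", "Theology", "Biology", "Theology"),
  test    = c("Exam1", "Exam1", "Exam2", "Exam1", "Exam2"),
  score   = c(80, 85, 90, 70, 60)
)

# Putting class on the left-hand side keeps every cell to a single row,
# so no aggregation is needed; missing combinations come back as NA.
wide2 <- dcast(data2, student + class ~ test, value.var = "score")
print(wide2)
```

If `dup_keys` comes back empty for your real data, duplicates are probably not the cause and it is worth checking the formula and `value.var` instead.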

JelenaČuklina