I have the following dataframe:

class   outcome   count    total
A       TP        5        20
A       FP        5        20
A       TN        5        20
A       FN        5        20
B       TP        10       40
B       FP        10       40
B       TN        10       40
B       FN        10       40

Essentially, I want this in so-called wide format, i.e.:

type    TP    FP    TN    FN    total
A       5     5     5     5     20 
B       10    10    10    10    40

I can almost get there by doing:

> dcast(test,test$outcome ~ test$class)
Using class...outcome...count....total as value column: use value.var to override.
  .  A       FN        5        20  A       FP        5        20  A       TN        5        20  A       TP        5        20
1 .  A       FN        5        20  A       FP        5        20  A       TN        5        20  A       TP        5        20
   B       FN        10       40  B       FP        10       40  B       TN        10       40  B       TP        10       40
1  B       FN        10       40  B       FP        10       40  B       TN        10       40  B       TP        10       40

I now have a column for each outcome type, but the column names are missing, rows are duplicated, and the headers I want (TP, FP, TN, FN) still appear as column values.

So this is still quite far off.

Is this even possible with dcast?

Brian Tompsett - 汤莱恩
brucezepplin

2 Answers

Here's how to do that with tidyr using spread:

library(tidyr)
df1 <-read.table(text="class   outcome   count    total
A       TP        5        20
A       FP        5        20
A       TN        5        20
A       FN        5        20
B       TP        10       40
B       FP        10       40
B       TN        10       40
B       FN        10       40",header=TRUE, stringsAsFactors=FALSE)

spread(df1, outcome, count)

  class total FN FP TN TP
1     A    20  5  5  5  5
2     B    40 10 10 10 10

And here's the dcast solution:

library(reshape2)
dcast(df1, class + total ~ outcome, value.var = "count")
  class total FN FP TN TP
1     A    20  5  5  5  5
2     B    40 10 10 10 10
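
In tidyr 1.0.0 and later, spread is superseded by pivot_wider. Below is a minimal sketch of the same reshape, assuming that version of tidyr is installed. The data frame is rebuilt here so the snippet runs on its own, and the final column reordering is optional, only there to match the layout asked for in the question:

```r
library(tidyr)

# Same data as in the question
df1 <- read.table(text = "class outcome count total
A TP 5 20
A FP 5 20
A TN 5 20
A FN 5 20
B TP 10 40
B FP 10 40
B TN 10 40
B FN 10 40", header = TRUE, stringsAsFactors = FALSE)

# names_from:  column whose values become the new column headers
# values_from: column whose values fill those new columns
wide <- pivot_wider(df1, names_from = outcome, values_from = count)

# Optional: put the columns in the order shown in the question
wide <- wide[, c("class", "TP", "FP", "TN", "FN", "total")]
print(wide)
```

Columns not named in names_from or values_from (here class and total) are treated as identifiers, which is why total stays in the result as one value per class.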
Pierre Lapointe

Using reshape in base R (where df is your data frame):

reshape(df, idvar = c("class", "total"), timevar = "outcome", direction = "wide")

#  class total count.TP count.FP count.TN count.FN
#1     A    20        5        5        5        5
#5     B    40       10       10       10       10
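
If the count. prefix and the leftover row names are unwanted, they can be cleaned up afterwards. This is a rough sketch, assuming df holds the data from the question; the name wide is just an illustrative variable and the column reordering at the end is optional:

```r
# Same data as in the question
df <- read.table(text = "class outcome count total
A TP 5 20
A FP 5 20
A TN 5 20
A FN 5 20
B TP 10 40
B FP 10 40
B TN 10 40
B FN 10 40", header = TRUE, stringsAsFactors = FALSE)

wide <- reshape(df, idvar = c("class", "total"), timevar = "outcome",
                direction = "wide")

# Strip the "count." prefix that reshape() adds to the new columns
names(wide) <- sub("^count\\.", "", names(wide))

# Reset the row names (they are carried over from the long data)
rownames(wide) <- NULL

# Optional: put the columns in the order shown in the question
wide <- wide[, c("class", "TP", "FP", "TN", "FN", "total")]
print(wide)
```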
989