I am trying to count a binary character outcome by row in a large data frame:

V1      V2      V3      V4      V5  
Loss    Loss    Loss    Loss    Loss
Loss    Loss    Win     Win     Loss
Loss    Loss    Loss    Loss    Loss

Reprex:

df <- data.frame(
V1=c("Loss", "Loss", "Loss"),
V2=c("Loss", "Loss", "Loss"),
V3=c("Loss", "Win", "Loss"),
V4=c("Loss", "Win", "Loss"),
V5=c("Loss", "Loss", "Loss"))

What I need to know is the frequency of wins and losses by row. This is just a short example (a fragment of a large simulated output), but in five simulations row 1 has five losses, row 2 has three losses and two wins, and so on.

I was hoping to generate either a separate table showing the frequency of wins and losses by row or, if that won't work, add two new columns: one with the number of "Win" and one with the number of "Loss" for each row.

Each row is a different case, and each column is a replicate of that case. This appears as a data frame of factors with two levels "Loss" "Win".

Susie Derkins
mike

2 Answers

Here's a quick vectorized solution (assuming your data set is called df):

Loss <- rowSums(df == "Loss") # Count the "Loss" per row
cbind(Loss, Wins = ncol(df) - Loss) # Subtract these from the number of columns and combine
#      Loss Wins
# [1,]    5    0
# [2,]    3    2
# [3,]    5    0
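
The question also mentions adding the counts as two new columns. A minimal sketch of that option, assuming the reprex from the question; the column names `n_loss` and `n_win` are hypothetical and chosen only for illustration:

```r
# Reprex from the question
df <- data.frame(
  V1 = c("Loss", "Loss", "Loss"),
  V2 = c("Loss", "Loss", "Loss"),
  V3 = c("Loss", "Win",  "Loss"),
  V4 = c("Loss", "Win",  "Loss"),
  V5 = c("Loss", "Loss", "Loss")
)

# Compute both counts before adding any column, so the new numeric
# columns are never compared against "Loss"/"Win" themselves
outcomes <- as.matrix(df)
n_loss <- rowSums(outcomes == "Loss")
n_win  <- rowSums(outcomes == "Win")

# n_loss and n_win are hypothetical column names
df$n_loss <- n_loss
df$n_win  <- n_win
df
```

Computing the counts first matters: if the first new column were added before the second count, `ncol(df)` and the comparison would include it.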
David Arenburg
  • Thank you so much! I knew it was an easy solution, I was trying to make it much more complicated. That worked nicely! – mike Jan 08 '15 at 22:48
  • @mike: Consider accepting the answer then (and upvoting those answers which helped). That's the proper way to thank on SO. (might also flag this as obsolete after reading it...) – Deduplicator Jan 09 '15 at 12:05
  • @mike this works because `==` finds TRUE/FALSE and `rowSums` coerces those values to 1/0, respectively. Try it: `sum(TRUE, TRUE, FALSE)`. – Roman Luštrik Jan 09 '15 at 12:56
  • Neat, but won't work when NAs are present. You'd need `cbind(Loss = rowSums(df=="Loss"), Wins = rowSums(df=="Win"))` – smci May 01 '18 at 10:19
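
To illustrate the NA point from the comments, here is a rough sketch. The data frame `df_na` is a hypothetical variant of the question's data with one missing outcome, and the `Missing` column is an addition for illustration. Note that `rowSums()` propagates NA by default, so counting each outcome separately still seems to need `na.rm = TRUE`:

```r
# Hypothetical variant of the question's data with one missing outcome
df_na <- data.frame(
  V1 = c("Loss", "Loss", "Loss"),
  V2 = c("Loss", NA,     "Loss"),
  V3 = c("Loss", "Win",  "Loss")
)

is_loss <- df_na == "Loss"   # logical matrix; NA where the outcome is missing
is_win  <- df_na == "Win"

# Without na.rm = TRUE, a row containing an NA sums to NA
rowSums(is_loss)

# Count each outcome separately and skip missing values
counts <- cbind(
  Loss    = rowSums(is_loss, na.rm = TRUE),
  Wins    = rowSums(is_win,  na.rm = TRUE),
  Missing = rowSums(is.na(df_na))
)
counts
```

With this approach a missing replicate is counted neither as a win nor as a loss, which `ncol(df) - Loss` cannot guarantee.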
Another alternative with base R:

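# Count the wins in one row; every remaining entry is a loss
stats <- function(u) {
    win <- sum(u == "Win")
    data.frame(Win = win, Loss = length(u) - win)
}

# apply() returns a list of one-row data frames; stack them with rbind
Reduce(rbind, apply(df, 1, stats))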

#  Win Loss
#1   0    5
#2   2    3
#3   0    5

Or, even better, as a one-liner (though not vectorized):

t(apply(df, 1, function(u) table(factor(u, levels=c("Win","Loss")))))

#     Win Loss
#[1,]   0    5
#[2,]   2    3
#[3,]   0    5
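
A small sketch of why the one-liner wraps each row in `factor(..., levels = ...)`; the vector `row1` below is just the first row of the question's data, written out by hand for illustration:

```r
# First row of the question's data: five losses, no wins
row1 <- c("Loss", "Loss", "Loss", "Loss", "Loss")

# Plain table() only reports values that actually occur in the row
plain <- table(row1)
plain

# Fixing the levels keeps a zero count for the absent "Win",
# so every row yields a result of the same length and order
fixed <- table(factor(row1, levels = c("Win", "Loss")))
fixed
```

Without the fixed levels, rows would produce tables of different lengths, and `apply()` would then return a list rather than a matrix that can be transposed.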
Colonel Beauvel