I have the following dataset:

   subject  stim       TRBV    
   <chr>    <chr>      <chr>   
 1 HAT-19   (2) 1100-2 TRBV20-1
 2 DL (HC2) (2) 1100-2 TRBV6-1 
 3 MB (HC1) (3) BSV18  TRBV20-1
 4 HAT-19   (2) 1100-2 TRBV7-6 
 5 HAT-001  (2) 1100-2 TRBV15  
 6 HAT-001  (3) BSV18  TRBV6-2 
 7 HAT-19   (2) 1100-2 TRBV6-4 
 8 HAT-001  (3) BSV18  TRBV20-1
 9 MB (HC1) (2) 1100-2 TRBV20-1
10 HAT-001  (2) 1100-2 TRBV6-4 

I want to count how many times a given "TRBV" value occurs for a given "subject" and "stim".

For example, for subject = "HAT-19", stim = "(2) 1100-2" and TRBV = "TRBV20-1", I did the following:

x <- my_data[which(my_data$subject == "HAT-19" & my_data$stim == "(2) 1100-2" & my_data$TRBV == "TRBV20-1"), ]

y <- x$TRBV
z <- length(y)

This works, but repeating all these steps for every subject is getting really monotonous. How can I shorten them?

thelatemail
Taliman
  • This is essentially a `group by` or `aggregate` task. You want to group by `subject`, `stim` and `TRBV` (all your variables shown) and get a count. Take a look at something like this question - https://stackoverflow.com/questions/9809166/count-number-of-rows-within-each-group - and pick a solution that speaks to you. – thelatemail Jun 07 '18 at 23:19
  • `aggregate(z~.,cbind(my_data,z=1),length)` – Onyambu Jun 07 '18 at 23:25

1 Answer

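If I understand correctly, you should be able to do this with dplyr's count(), which returns one row per combination of subject, stim and TRBV, with the number of matching rows in a column named n:

library(dplyr)
count(my_data, subject, stim, TRBV)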

For more info, see https://dplyr.tidyverse.org/reference/tally.html
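
To make this concrete, here is a minimal sketch that rebuilds the ten rows shown in the question, counts every combination once, and then looks up a single combination. It assumes the columns are plain character vectors as printed. The names `counts` and `hat19` are only illustrative and are not from the original post.

```r
library(dplyr)

# The ten sample rows from the question, typed in by hand (assumed to be
# character columns, as the <chr> headers suggest).
my_data <- tibble(
  subject = c("HAT-19", "DL (HC2)", "MB (HC1)", "HAT-19", "HAT-001",
              "HAT-001", "HAT-19", "HAT-001", "MB (HC1)", "HAT-001"),
  stim    = c("(2) 1100-2", "(2) 1100-2", "(3) BSV18", "(2) 1100-2", "(2) 1100-2",
              "(3) BSV18", "(2) 1100-2", "(3) BSV18", "(2) 1100-2", "(2) 1100-2"),
  TRBV    = c("TRBV20-1", "TRBV6-1", "TRBV20-1", "TRBV7-6", "TRBV15",
              "TRBV6-2", "TRBV6-4", "TRBV20-1", "TRBV20-1", "TRBV6-4")
)

# One row per subject/stim/TRBV combination; the count is in column `n`.
counts <- count(my_data, subject, stim, TRBV)

# Look up one combination in the summary instead of subsetting the raw data
# again for every subject.
hat19 <- filter(counts,
                subject == "HAT-19",
                stim == "(2) 1100-2",
                TRBV == "TRBV20-1")
hat19$n
```

Because `counts` already holds every combination, the per-subject subsetting from the question only has to be replaced by a filter on this one summary table.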

burger