Summing # of occurrences of column values conditional to values in another column

Question

I have a large data frame (g1) consisting of columns count, cdr3, and length, with cdr3 and length being pertinent to my problem.

The column cdr3 holds strings, and the column length gives the length of each string. There are 70 unique values of length.

What's a quick and clean way to aggregate the number of strings (column cdr3 values) that are of a particular length?

What have you tried? How about making a reproducible example showing where you got stuck? Here are some tips on how to proceed with the example. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example — Roman Luštrik, Feb 04 '14 at 13:20
Please provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Also, what have you tried so far? — BrodieG, Feb 04 '14 at 13:20

TYZ · Answer 1 · 2014-02-04T20:59:33.673

0

If I understand your question correctly, each element of the cdr3 column is a string, and the length column holds the length of that string.

Say you want to count how many strings have length x.

Something like this should work:

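total = 0
i = 1
# x is the string length to count; cdr3 is a column of the data frame g1
while (i <= length(g1$cdr3)) {
    # as.character() guards against cdr3 being stored as a factor
    if (nchar(as.character(g1$cdr3[i])) == x) {
        total = total + 1
    }
    i = i + 1  # advance only after the current element has been checked
}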
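
A loop is not strictly needed here, since the length column already stores each string's length. As a minimal sketch, assuming g1 looks roughly like the toy data frame below (the values are made up for illustration, and the real g1 is not shown in the question), a vectorized comparison counts one length and table() counts all of them at once:

```r
# Hypothetical toy data standing in for g1; the real data frame is not shown
g1 <- data.frame(
  count = c(3, 1, 2, 5),
  cdr3  = c("CASS", "CASSL", "CAWS", "CASSLG"),
  stringsAsFactors = FALSE
)
g1$length <- nchar(g1$cdr3)

# Number of strings of one particular length x (x = 4 is an arbitrary choice)
x <- 4
total <- sum(g1$length == x)

# Number of strings for every distinct length in a single call
length_counts <- table(g1$length)
```

Here length_counts is indexed by the length as a character name, for example length_counts["4"]. If the count column should weight each row instead of counting rows, tapply(g1$count, g1$length, sum) may be closer to what is wanted, though that depends on what count means in the data.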

edited Feb 04 '14 at 20:59

answered Feb 04 '14 at 14:54

