I have a large data frame (g1) consisting of columns count, cdr3, and length, with cdr3 and length being pertinent to my problem.

The column cdr3 contains strings, and the column length gives the length of each string. There are 70 unique values for length.

What's a quick and clean way to aggregate the number of strings (column cdr3 values) that are of a particular length?

horseoftheyear
  • What have you tried? How about making a reproducible example showing where you got stuck? Here are some tips on how to proceed with the example. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Roman Luštrik Feb 04 '14 at 13:20
  • Please provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Also, what have you tried so far? – BrodieG Feb 04 '14 at 13:20
  • See `?table` and try `table(g1$length)`. – fotNelton Feb 04 '14 at 14:20

1 Answer

If I understand your question correctly, each element of the cdr3 column is a string, and the length column holds the length of that string.

To count how many strings have a given length x, something like this should work:

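# Count the strings in g1$cdr3 whose length equals x
total = 0
i = 1
while (i <= length(g1$cdr3)) {
    # as.character() guards against cdr3 being stored as a factor
    if (nchar(as.character(g1$cdr3[i])) == x) {
        total = total + 1
    }
    i = i + 1
}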
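
Because the length column already stores each string's length, the same count can probably be had without a loop. The following is a minimal sketch, not a tested solution for the real data: the small `g1` built here is a hypothetical stand-in (the actual data frame is not shown in the question), and `length_counts` is an illustrative name. It uses `table(g1$length)`, as suggested in the comments, to get the count for every length at once, and `sum()` on a logical comparison for a single length x.

```r
# Hypothetical stand-in for g1; the real data frame is not shown in the question.
g1 <- data.frame(
  count = c(5, 2, 7, 1),
  cdr3  = c("CASSL", "CASSLGQ", "CAWSV", "CASS"),
  stringsAsFactors = FALSE
)
g1$length <- nchar(g1$cdr3)

# Number of strings for every unique length, in one call.
length_counts <- table(g1$length)

# Number of strings of one particular length x.
x <- 5
total <- sum(g1$length == x)
```

If the counts are needed as a data frame rather than a table, `as.data.frame(length_counts)` should convert them.
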
TYZ