I would like to subset an object in R according to the suffixes of the barcodes it contains. Each barcode ends in '-n', where n is a number from 1 to 6, e.g. AAACCGTGCCCTCA-1, GAACCGTGCCCTCA-2, CATGCGTGCCCTCA-5, etc. I would like all the information associated with each barcode to be split accordingly as well. Here is the code I tried on an object, cds:
grp = sub("[A-Z]*[-]","",cds$barcodes)
group1 = cds[,grp==1]
However, when I view group1, I get:
> group1$barcode
factor(0)
7047 Levels: AAACATACCAGTTG-3 AAACATACTATGCG-4 AAACATTGAAGCCT-5 AAACATTGGCGAAG-4 AAACATTGTGAAGA-4 ... TTTGCATGGCCAAT-5
The result is empty, yet all 7047 barcodes are still listed as levels. I also don't want to replace the barcodes with the number at the end. I just want a way of telling R to select barcodes by the number they end in, so I can group them while keeping the barcodes as they are.
For example, I would like group1$barcodes to look something like this:
group1$barcode
1 AAACCGTGCCCTCA-1
2 AAACGCACACGCAT-1
3 AAACGGCTTCCGAA-1
4 AAAGACGAACCCAA-1
5 AAAGACGACTGTTT-1
6 AAAGAGACAAAGCA-1
7 AAAGATCTGGTAAA-1
8 AAAGCAGAGCAAGG-1
9 AAAGCAGATTATCC-1
10 AAAGCCTGATGACC-1
Many thanks!
Abigail
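
One possible approach, sketched on a toy data frame rather than the real cds object. The data-frame layout, the column name `barcode`, and the `n_umi` column are assumptions made for illustration only. The idea is to compute the suffix into a separate vector, use it as a logical index, and call droplevels() so that unused factor levels are not carried along. Note also that the question's code reads `cds$barcodes` but prints `group1$barcode`; if the column is actually named `barcode`, then `cds$barcodes` may return NULL, which would yield an empty selection like the factor(0) shown above.

```r
# Toy stand-in for the real object: one row per barcode.
# The layout and the column names are assumptions, not the actual cds structure.
cells <- data.frame(
  barcode = factor(c("AAACCGTGCCCTCA-1", "GAACCGTGCCCTCA-2",
                     "CATGCGTGCCCTCA-5", "AAACGCACACGCAT-1")),
  n_umi = c(120, 85, 240, 60)  # hypothetical per-barcode information
)

# Extract everything after the last "-" into a separate vector.
# The barcodes themselves are left untouched.
suffix <- sub(".*-", "", as.character(cells$barcode))

# Keep only the rows whose barcode ends in "-1";
# droplevels() removes the factor levels that no longer occur.
group1 <- droplevels(cells[suffix == "1", , drop = FALSE])

# Alternatively, split into a named list with one element per suffix.
groups <- split(cells, suffix)

print(group1$barcode)
```

If the barcodes index the columns of the object, as `cds[, grp == 1]` in the question suggests, the same logical vector would go in the column position instead, e.g. `cds[, suffix == "1"]`.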