I have a matrix whose row and column names each consist of a string followed by a number from 1 to 100.

I am using grepl to select the names that contain a specific number, regardless of the string part. Say I have:

a <- matrix(c(1:16), nrow = 4, byrow = TRUE)
colnames(a) <- c("aaa1", "bbb1", "abc11", "ccc100")
rownames(a) <- c("aaa1", "bbb1", "abc11", "ccc100")

giving matrix a

       aaa1 bbb1 abc11 ccc100
aaa1      1    2     3      4
bbb1      5    6     7      8
abc11     9   10    11     12
ccc100   13   14    15     16

I would like to select the rows and columns whose names contain the number 1 and no other number. Like this:

     aaa1 bbb1
aaa1    1    2
bbb1    5    6 

But when I use:

a[grepl("1" , rownames(a)) , grepl("1" , colnames(a))]

I get matrix a again, because "1" also matches the 1s in "11" and "100". I tried "^1", but that only matches names that start with 1, so it finds nothing. What can I do to solve this? I appreciate any help.

Adrian

1 Answer

EDIT

As updated in the question, the numbers go from 1 to 100 and we want only the rows and columns whose number is exactly 1. We can extract the entire numeric part of each row and column name and keep only the names where it equals 1.

library(stringr)
a[str_extract(rownames(a), "[0-9]+") == 1, str_extract(colnames(a), "[0-9]+") == 1]

#     aaa1 bbb1
#aaa1    1    2
#bbb1    5    6
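
If loading stringr is not an option, the same idea can be sketched in base R. The helper name `select_by_number` is hypothetical, and the sketch assumes each name contains a single run of digits; it is an illustration, not a replacement for the code above.

```r
# Rebuild the example matrix from the question
a <- matrix(1:16, nrow = 4, byrow = TRUE)
nm <- c("aaa1", "bbb1", "abc11", "ccc100")
dimnames(a) <- list(nm, nm)

# Hypothetical helper: keep rows/columns whose numeric part equals `n`.
# Assumes every name holds exactly one run of digits.
select_by_number <- function(m, n) {
  # Strip everything that is not a digit, then convert to integer
  row_num <- as.integer(gsub("[^0-9]", "", rownames(m)))
  col_num <- as.integer(gsub("[^0-9]", "", colnames(m)))
  # %in% treats NA (names without digits) as "no match";
  # drop = FALSE keeps the result a matrix even for a single match
  m[row_num %in% n, col_num %in% n, drop = FALSE]
}

res <- select_by_number(a, 1)
res
```

Comparing the extracted numbers as integers, rather than matching a pattern, makes it easy to ask for any other value, for example `select_by_number(a, 11)`.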

The same logic works with grepl: update the regex to match a letter followed by "1" at the end of the string. This assumes the character just before the number is always a letter.

a[grepl("[A-Za-z]1$", rownames(a)), grepl("[A-Za-z]1$", colnames(a))]

#     aaa1 bbb1
#aaa1    1    2
#bbb1    5    6

Original Answer

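Use "1$", which matches strings that end with "1", and then subset. This was written before the question was updated; with the updated names it would also select "abc11", since that name ends in "1" as well.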

a[grepl("1$",rownames(a)), grepl("1$",colnames(a))]

#     aaa1 bbb1
#aaa1    1    2
#bbb1    5    6

which is equivalent to

a[endsWith(rownames(a), "1"), endsWith(colnames(a), "1")]

#     aaa1 bbb1
#aaa1    1    2
#bbb1    5    6
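
As a quick check of that equivalence, the two approaches can be compared directly on the names from the updated question. This is only a sketch; note that both forms also flag "abc11", because it ends in "1" too.

```r
nm <- c("aaa1", "bbb1", "abc11", "ccc100")

by_regex  <- grepl("1$", nm)     # regex anchored at the end of the string
by_suffix <- endsWith(nm, "1")   # plain suffix test, no regex involved

# TRUE when both approaches select exactly the same names
same <- identical(by_regex, by_suffix)
nm[by_regex]
```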
Ronak Shah