I am trying to subset my data based on the values in one column: I want to keep only the rows where that column contains single-level information. Here is what my data look like.

data <- cbind(v1=c("a", "ab", "a|12|bc", "a|b", "ac","bc|2","b|bc|12"),
            v2=c(1,2,3,5,3,1,2))

> data
     v1        v2 
[1,] "a"       "1"
[2,] "ab"      "2"
[3,] "a|12|bc" "3"
[4,] "a|b"     "5"
[5,] "ac"      "3"
[6,] "bc|2"    "1"
[7,] "b|bc|12" "2"

I want to keep only the rows whose v1 value does not contain "|", like below:

> data
     v1        v2 
[1,] "a"       "1"
[2,] "ab"      "2"
[3,] "ac"      "3"

Basically, I am trying to get rid of the two-level (x|y) and three-level (x|y|z) values. Any thoughts on this?

Thanks!

amisos55

2 Answers

We can use grep to find the rows that contain |. With invert = TRUE it instead returns the row indices of the elements that do not contain |, which we then use to subset the rows of the matrix.

data[grep("|", data[,1], invert = TRUE, fixed = TRUE), ]
#   v1   v2 
#[1,] "a"  "1"
#[2,] "ab" "2"
#[3,] "ac" "3"

NOTE: fixed = TRUE is needed because otherwise the pattern is interpreted as a regular expression, in which | is the metacharacter for alternation (OR). Other options are to escape it (\\|) or to place it inside square brackets ([|]) so that it matches the literal character (when fixed = FALSE).
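
As a rough illustration of the note above, the three ways of matching a literal | can be compared side by side. This is only a sketch: the vector v1 and the keep_* names are made up for the example and simply mirror the first column of the question's data.

```r
# Hypothetical vector mirroring the first column of the question's matrix
v1 <- c("a", "ab", "a|12|bc", "a|b", "ac", "bc|2", "b|bc|12")

# Three ways to test for a literal pipe character
keep_fixed   <- !grepl("|", v1, fixed = TRUE)  # pattern taken as a plain string
keep_escaped <- !grepl("\\|", v1)              # metacharacter escaped in a regex
keep_class   <- !grepl("[|]", v1)              # pipe inside a character class

# All three should select the same elements
v1[keep_fixed]
v1[keep_escaped]
v1[keep_class]
```

Any of the three logical vectors can be used to index the rows of the matrix; fixed = TRUE is arguably the easiest to read when no regex features are needed.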

akrun

This can also be done with grepl, which returns a logical vector. I will leave it as two lines of code for clarity, but it is straightforward to turn it into a one-liner.

i <- !grepl("\\|", data[, 1])
data[i, ]
#     v1   v2 
#[1,] "a"  "1"
#[2,] "ab" "2"
#[3,] "ac" "3"
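
For reference, the one-liner mentioned above might look like the sketch below. The drop = FALSE argument is an addition that is not part of the original answer: it keeps the result a matrix even when only a single row survives the filter. The name res is chosen just for this example.

```r
# Same matrix as in the question
data <- cbind(v1 = c("a", "ab", "a|12|bc", "a|b", "ac", "bc|2", "b|bc|12"),
              v2 = c(1, 2, 3, 5, 3, 1, 2))

# One-liner: keep rows whose first column has no literal "|".
# drop = FALSE (an assumption beyond the original answer) prevents the
# result from collapsing to a plain vector if only one row matches.
res <- data[!grepl("\\|", data[, 1]), , drop = FALSE]
res
```

Without drop = FALSE the result is the same for this data, since three rows remain.
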
Rui Barradas