175

In R, I have a matrix:

     one two three four
 [1,]   1   6    11   16
 [2,]   2   7    12   17
 [3,]   3   8    11   18
 [4,]   4   9    11   19
 [5,]   5  10    15   20

I want to extract the submatrix whose rows have column three = 11. That is:

      one two three four
 [1,]   1   6    11   16
 [3,]   3   8    11   18
 [4,]   4   9    11   19

I want to do this without looping. I am new to R so this is probably very obvious but the documentation is often somewhat terse.

Jaap
peter2108
  • 5
    The basic idea in every answer is that if you have a logical vector/matrix (TRUEs and FALSEs) of the same length as some index, you will select only the cases that are TRUE. Run the codes between `[ ]` in the answers and you will see this more clearly. – Sacha Epskamp Mar 22 '11 at 14:36
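
The point made in this comment can be seen by printing the index on its own. A minimal sketch, assuming the question's matrix is stored in a variable called m (the question does not name it):

```r
# Rebuild the matrix from the question; the name `m` is an assumption.
m <- cbind(one = 1:5, two = 6:10, three = c(11, 12, 11, 11, 15), four = 16:20)

# The comparison yields one TRUE/FALSE value per row.
keep <- m[, "three"] == 11
print(keep)

# Placed in the row position, it keeps only the rows marked TRUE.
m[keep, ]
```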

6 Answers

189

This is easier to do if you convert your matrix to a data frame using as.data.frame(). In that case the answers that use m$three (or subset with a bare column name) will work; on a plain matrix they will not, because $ is invalid for atomic vectors.

To perform the operation on a matrix, you can define a column by name:

m[m[, "three"] == 11,]

Or by number:

m[m[,3] == 11,]

Note that if only one row matches, the result is an integer vector, not a matrix.
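
To illustrate that note, here is a small sketch using a made-up matrix in which exactly one row matches. The data and the variable names are assumptions for illustration only; `drop = FALSE` is the standard argument of `[` that keeps the matrix shape.

```r
# Hypothetical 5 x 4 integer matrix; column "three" holds 11..15.
m <- matrix(1:20, ncol = 4,
            dimnames = list(NULL, c("one", "two", "three", "four")))

# Only one row has three == 12, so the result collapses to a vector.
one_row <- m[m[, "three"] == 12, ]
is.matrix(one_row)

# drop = FALSE keeps the result as a 1 x 4 matrix.
kept <- m[m[, "three"] == 12, , drop = FALSE]
dim(kept)
```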

neilfws
  • 22
    if you need to keep the matrix, then do `m[m[,3] == 11,,drop=FALSE]` – Joris Meys Mar 22 '11 at 15:58
  • @neilfws What will be the solution if I want to define some values for a range of columns. for example `df <- df[!which(df$ARID3A:df$YY1 == "U"),]`, here I want to remove those rows from my df where a range of columns (ARID3A: YY1) contains the value *U*. – Newbie Jul 22 '16 at 07:53
  • 1
    How does this work if you don't want to specify the column names at all but want to work over all columns in the matrix? – user5359531 Jul 22 '16 at 22:54
  • Hey @neilfws , how can you add && statement to this one? I need to get two columns values at the same time? – Sam Al-Ghammari Oct 19 '17 at 22:33
34

I will choose a simple approach using the dplyr package.

Assuming the data frame is called data:

library(dplyr)
result <- filter(data, three == 11)
Nate
mavez DABAS
33
m <- matrix(1:20, ncol = 4) 
colnames(m) <- letters[1:4]

The following command will select the first row of the matrix above.

subset(m, m[,4] == 16)

And this will select the last three.

subset(m, m[,4] > 17)

The result will be a matrix in both cases. If you want to use column names to select columns then you would be best off converting it to a dataframe with

mf <- data.frame(m)

Then you can select with

mf[mf$d == 16, ]

Or, you could use the subset command.
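
A minimal sketch of the subset route on the data frame, assuming the same m as above (columns a to d, with d holding 16 to 20). The names first and last3 are only for illustration.

```r
m <- matrix(1:20, ncol = 4)
colnames(m) <- letters[1:4]
mf <- data.frame(m)

# Inside subset() on a data frame, columns can be named directly.
first <- subset(mf, d == 16)
last3 <- subset(mf, d > 17)
first
last3
```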

John
12

The subset function is very slow, and I personally find it useless.

I assume you have a data.frame, array, or matrix called Mat with A, B, C as column names; then all you need to do is:

  • In the case of one condition on one column, let's say column A:

    Mat[which(Mat[,'A'] == 10), ]
    

  • In the case of multiple conditions on different columns, you can build up an auxiliary index vector. Suppose the conditions are A == 10, B == 5, and C > 2; then we have:

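    # Rows where A is 10
    aux <- which(Mat[,'A'] == 10)
    # Of those, keep the rows where B is 5
    aux <- aux[which(Mat[aux,'B'] == 5)]
    # Of those, keep the rows where C exceeds 2
    aux <- aux[which(Mat[aux,'C'] > 2)]
    Mat[aux, ]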

Timing both with system.time, I found the which method to be 10x faster than the subset method.
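
As an alternative sketch (not part of the approach above), the same conditions can also be combined in a single step with the vectorised `&` operator. Note that `&&` is not suitable here, since it does not work element by element. The contents of Mat below are made up for illustration.

```r
# Hypothetical data with the column names used in this answer.
Mat <- cbind(A = c(10, 10, 3, 10),
             B = c(5, 5, 5, 1),
             C = c(1, 7, 9, 4))

# One logical vector with TRUE where all three conditions hold.
hit <- Mat[, "A"] == 10 & Mat[, "B"] == 5 & Mat[, "C"] > 2

# drop = FALSE keeps a matrix even if only one row matches.
res <- Mat[hit, , drop = FALSE]
res
```

One difference to keep in mind: which() silently drops comparisons that evaluate to NA, whereas a bare logical index returns a row of NAs for them, so wrap the index in which() if the data may contain missing values.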

apaderno
Mohamad Elmasri
8

If your data is in a data frame called m (the $ operator does not work on a plain matrix), just use:

R> m[m$three == 11, ]
juba
  • @juba What will be the solution if I want to define some values for a range of columns. for example `df <- df[!which(df$ARID3A:df$YY1 == "U"),]`, here I want to remove those rows from my df where a range of columns (ARID3A: YY1) contains the value `U` – Newbie Jul 22 '16 at 07:55
0

If the dataset is a data frame called data, then all the rows where the value of column 'pm2.5' is greater than 300 can be retrieved with:

data[data['pm2.5'] > 300, ]
UseR10085
Anvita Shukla