5

I want to identify which row matches the information in a vector. As an example, I'll use the iris dataset (in tibble format to better approximate my situation): iris %>% as_tibble(). Then I have a tibble with a single row, which came directly from the original dataset:

choice <- structure(list(Sepal.Length = 4.5, Sepal.Width = 2.3, Petal.Length = 1.3, 
    Petal.Width = 0.3, Species = structure(1L, .Label = c("setosa", 
    "versicolor", "virginica"), class = "factor")), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"))

# A tibble: 1 x 5
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
         <dbl>       <dbl>        <dbl>       <dbl> <fct>  
1          4.5         2.3          1.3         0.3 setosa

I want to identify which row matches this exactly. It may be more convenient to pass the values as a vector, depending on what the function expects; in that case, you just have to wrap choice in as.numeric().

The correct row is 42.

Érico Patto
  • Related: [Which lines of a matrix are equal to a certain vector](https://stackoverflow.com/questions/54632037/which-lines-of-a-matrix-are-equal-to-a-certain-vector/54632523#54632523) – Henrik Apr 02 '21 at 17:57

3 Answers

8

One option is Map. Map compares (==) the corresponding columns of 'iris' and 'choice'; because 'choice' has only a single row, its value is recycled along each column of 'iris'. This returns a list of logical vectors, which Reduce then combines with & into a single logical vector that is TRUE only for rows where every column matched. Finally, which returns the position of the TRUE element(s).

which(Reduce(`&`, Map(`==`, iris, choice)))
#[1] 42
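To see what each step produces, the one-liner can be unpacked as in the sketch below. It assumes 'choice' can be rebuilt as row 42 of iris (a plain data.frame rather than the tibble from the question), and the names per_column and all_match are illustrative only.

```r
# Assumption for self-containment: rebuild the one-row 'choice' from row 42.
choice <- iris[42, ]

# Step 1: Map() pairs each column of iris with the same-position column of
# choice and applies `==`; the length-1 value from choice is recycled.
per_column <- Map(`==`, iris, choice)
str(per_column[1:2])   # a named list of logical vectors, one per column

# Step 2: Reduce() folds the list with `&`, i.e. ((c1 & c2) & c3) & ...
all_match <- Reduce(`&`, per_column)

# Step 3: which() turns the logical vector into row positions.
which(all_match)
# [1] 42
```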

Another option is to replicate the row of 'choice' so that its dimensions match those of 'iris', compare the two with ==, and use rowSums to check whether the number of matching cells in each row equals the number of columns.

library(tidyr)
which(rowSums(iris == uncount(choice, nrow(iris))) == ncol(iris))
#[1] 42
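If tidyr is not available, the same idea can probably be sketched in base R by repeating the row with plain indexing. This is a variation on the approach above, not the code from the answer; it assumes 'choice' is rebuilt from row 42 of iris, and choice_rep, cell_match and hits are illustrative names.

```r
# Assumption for self-containment: rebuild the one-row 'choice' from row 42.
choice <- iris[42, ]

# Repeat the single row nrow(iris) times with base indexing (no tidyr needed).
choice_rep <- choice[rep(1, nrow(iris)), ]

# Cell-by-cell comparison gives a logical matrix with the same shape as iris.
cell_match <- iris == choice_rep
dim(cell_match)

# A row matches when all of its cells are TRUE, i.e. its row sum equals ncol.
hits <- which(rowSums(cell_match) == ncol(iris))
hits
```

The result should agree with the uncount() version; the intermediate cell_match matrix is also handy for checking which columns of a near-miss row differ.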

This can also be done in the tidyverse. Create a row number column (row_number()), then use filter with if_all to loop across the columns named in 'choice' (which excludes 'rn'), comparing each with the corresponding column of 'choice'. if_all keeps a row only when the comparison is TRUE for all of those columns (if_any would keep it when any one of them is TRUE). Finally, pull the 'rn' column as a vector.

library(dplyr)
iris %>% 
  mutate(rn = row_number()) %>%
  filter(if_all(all_of(names(choice)),
                ~ . == choice[[cur_column()]])) %>%
  pull(rn)
#[1] 42
akrun
  • Yes, that worked! But what does `Reduce` do? (Just out of curiosity...) – Érico Patto Apr 02 '21 at 17:03
  • BTW, I just haven't accepted yet because StackOverflow is complaining it's too soon. – Érico Patto Apr 02 '21 at 17:04
  • @ÉricoPatto `Map` does the columnwise comparison between the two datasets and returns a `list` (as choice has only one row, that element is recycled). Then `Reduce` checks, with `&`, whether all the corresponding elements of the `list` are TRUE – akrun Apr 02 '21 at 17:05
  • Oh, ok, I get it now... Thank you very much! – Érico Patto Apr 02 '21 at 17:06
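As a small illustration of what `Reduce` does in general, here is a toy sketch on a made-up list of logical vectors (flags, folded and steps are hypothetical names, not part of the answer):

```r
# Three logical vectors of equal length, standing in for per-column results.
flags <- list(
  a = c(TRUE, TRUE,  FALSE),
  b = c(TRUE, FALSE, FALSE),
  c = c(TRUE, TRUE,  TRUE)
)

# Reduce() folds left to right: (a & b) & c
folded <- Reduce(`&`, flags)
folded
# [1]  TRUE FALSE FALSE

# accumulate = TRUE keeps every intermediate result of the fold in a list.
steps <- Reduce(`&`, flags, accumulate = TRUE)
length(steps)
# [1] 3
```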
5

This is similar to akrun's first solution; here is the tidyverse (purrr) version of it:

library(purrr)  # provides map2() and reduce()
map2(iris, choice, `==`) %>% 
  reduce(`&`) %>%
  which()

[1] 42

Alvaro Morales
1

You can try this

> which(do.call(paste, iris) == do.call(paste, choice))
[1] 42

or

> match(data.frame(t(choice)), data.frame(t(iris)))
[1] 42
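The first of the two one-liners above works by collapsing every row into a single string and comparing strings. A minimal sketch of the intermediate values, assuming 'choice' is rebuilt from row 42 of iris (row_keys and choice_key are illustrative names):

```r
# Assumption for self-containment: rebuild the one-row 'choice' from row 42.
choice <- iris[42, ]

# do.call(paste, df) passes each column as an argument to paste(), so every
# row becomes one space-separated string.
row_keys <- do.call(paste, iris)
head(row_keys, 3)

choice_key <- do.call(paste, choice)
choice_key
# [1] "4.5 2.3 1.3 0.3 setosa"

# Rows whose string key equals the key of 'choice'.
which(row_keys == choice_key)
# [1] 42
```

Because the comparison is made on the character representation of each value, it is worth checking that this is the notion of "exact match" you want for your own data.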
ThomasIsCoding