
For instance, let's say I have a dataframe named df with a column "ID" of integers and I want to grab the subset of my dataframe in which the value in "ID" is in the vector [123,198,204,245,87,91,921].

What would the syntax for this be in R?

CodeGuy

2 Answers


I believe you want the %in% operator:

df <- data.frame(ID=1:1000, STUFF=runif(1000))
df2 <- df[df$ID %in% c(123,198,204,245,87,91,921), ]
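
As a rough sketch of some variations on the same idea (the name `wanted` and the extra data frames are made up for illustration), the lookup vector can be stored once and reused with `subset()` or negated to keep everything else:

```r
# Same toy data as above: 1000 rows with sequential IDs
df <- data.frame(ID = 1:1000, STUFF = runif(1000))

# 'wanted' is a name chosen for this example
wanted <- c(123, 198, 204, 245, 87, 91, 921)

# Bracket indexing with the logical vector built by %in%
df2 <- df[df$ID %in% wanted, ]

# base R's subset() selects the same rows
df3 <- subset(df, ID %in% wanted)

# Negate the condition to keep every row except those IDs
df_rest <- df[!(df$ID %in% wanted), ]

nrow(df2)      # 7
nrow(df_rest)  # 993
df2$ID         # 87 91 123 198 204 245 921
```

Note that the rows come back in the order they appear in df, not in the order of the lookup vector.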
ddunn801

Please let me know if it solves your problem.

First, we'll need the which function.

?which

Which indices are TRUE?

Description

Give the TRUE indices of a logical object, allowing for array indices.

i <- 1:10

which(i < 5)

[1] 1 2 3 4
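
Two details of which() may be worth a quick illustration: it skips NA values, and with arr.ind = TRUE it reports row/column positions for a matrix. This is only a small sketch; the objects `flags` and `m` are invented for the example.

```r
# which() ignores NA: only positions that are TRUE are returned
flags <- c(TRUE, NA, FALSE, TRUE)
which(flags)
# [1] 1 4

# On a matrix, which() returns linear (column-major) indices by default
m <- matrix(c(1, 6, 3, 8), nrow = 2)
which(m > 5)
# [1] 2 4

# arr.ind = TRUE returns one row per match, with "row" and "col" columns
hits <- which(m > 5, arr.ind = TRUE)
hits
```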

We'll also need the %in% operator:

?"%in%"

%in% is a more intuitive interface as a binary operator, which returns a logical vector indicating if there is a match or not for its left operand.

2 %in% 1:5

[1] TRUE

2 %in% 5:10

[1] FALSE
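
The two calls above test a single value, but %in% works element by element on its left operand. A short sketch (the vector `x` is made up for illustration) comparing it with match(), which it is built on:

```r
x <- c(2, 7, 11)

# One TRUE/FALSE per element of the left operand
x %in% 5:10
# [1] FALSE  TRUE FALSE

# match() gives the position of the first match in the table, or NA
match(x, 5:10)
# [1] NA  3 NA

# %in% never returns NA, even for a missing value on the left
NA %in% 5:10
# [1] FALSE
```

Because the result is always TRUE or FALSE, it is safe to use directly as a row index.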

PUTTING IT ALL TOGETHER

# some starting ids
id <- c(123, 204, 11, 12, 13, 15, 87, 123)

# the df constructed with the ids
df <- data.frame(id)

# the valid ids 
valid.ids <- c(123,198,204,245,87,91,921)

# positions is a logical vector that marks, for each element of df$id, whether it matches a valid id
positions <- df$id %in% valid.ids

positions

[1] TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE

# BONUS
# we can easily count how many matches we have:
sum(positions)

[1] 4

# which() returns only the indices whose value is TRUE
matched_elements_positions <- which(positions)

matched_elements_positions

[1] 1 2 7 8

# last step: select only the matching rows from our data frame
# (df has a single column, so R simplifies the result to a plain vector)
df[matched_elements_positions,]

[1] 123 204  87 123
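
As a hedged footnote to the walkthrough: the which() step is optional here, since the logical vector from %in% can index rows directly, and adding drop = FALSE keeps a one-column result as a data frame. The names `by_logical` and `by_index` below are chosen just for this sketch.

```r
id <- c(123, 204, 11, 12, 13, 15, 87, 123)
df <- data.frame(id)
valid.ids <- c(123, 198, 204, 245, 87, 91, 921)

# Index rows with the logical vector itself, no which() needed
by_logical <- df[df$id %in% valid.ids, , drop = FALSE]

# Same selection through integer positions
by_index <- df[which(df$id %in% valid.ids), , drop = FALSE]

identical(by_logical$id, by_index$id)  # TRUE

# drop = FALSE keeps the data frame; without it a single column collapses
class(by_logical)                  # "data.frame"
class(df[df$id %in% valid.ids, ])  # "numeric"
```

The two forms differ only when the logical vector contains NA, which cannot happen with %in%.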