0

I want to do something similar to an INDEX/MATCH in Excel, where the last argument in MATCH is flexible enough to return the index of a value less than or greater than the lookup value by using -1 or 1 instead of 0 for exact match. For example my data column is (100, 150, 275) and the table I want to pull from has (100, 200, 300, 400).

I want to create a column where for each of 100, 150, and 275 the output gives the largest value below each input when there is no exact match. So in this case I want the contents of the new column to be (100, 100, 200).

I also want to create a similar case for the smallest value about each input but I believe the solution will be easy to derive from the first problem. In that case the output I'm looking for would be (100, 200, 300).

I know the logic of what I want to do and can easily do it several different ways in Excel but in R I keep getting the error: "longer object length is not a multiple of shorter object length"

where everything I try keeps referring to the entire vector of what I want to lookup instead of applying the function to each individual element of the column.

  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Jul 20 '23 at 14:06

1 Answers1

0

A few possibilities, assuming:

data <- c(100, 150, 275)
brk = c(100, 200, 300, 400)

a) if the lookup table is each 100:

floor(data/100)*100
#[1] 100 100 200
ceiling(data/100)*100
#[1] 100 200 300

b) use cut:

cut(data, brk, brk[1:length(brk)-1], right = FALSE)
#[1] 100 100 200
#Levels: 100 200 300

c) use findInterval:

brk[findInterval(data, brk)]
#[1] 100 100 200

d) use dplyr::left_join (perhaps most similar to vlookup, but it expects to work with data frames instead of raw vectors)

left_join(data.frame(data), 
          data.frame(brk), join_by(closest(data >= brk)))
#  data brk
#1  100 100
#2  150 100
#3  275 200

left_join(data.frame(data), 
          data.frame(brk), join_by(closest(data <= brk)))
#  data brk
#1  100 100
#2  150 200
#3  275 300
Jon Spring
  • 55,165
  • 4
  • 35
  • 53
  • Thank you this is really helpful! I'm very inexperienced with R. Yesterday I discovered findinterval and that seems to be working. I basically arranged the lookup table in ascending order and created an index for the lookup table. Then I made 2 indices in the original data where one is the findinterval and the other is +1. when I join the tables on those indices I get the two values that I needed! I'm not entirely sure why it works because I'm seeing findinterval used for other purposes but I'm glad it is. Thanks so much! – stacktuary23 Jul 21 '23 at 15:54