0

Can I pass a custom compare function to order that, given two items, indicates which one is ranked higher?

In my specific case I have the following list.

scores <- list(
    'a' = c(1, 1, 2, 3, 4, 4),
    'b' = c(1, 2, 2, 2, 3, 4),
    'c' = c(1, 1, 2, 2, 3, 4),
    'd' = c(1, 2, 3, 3, 3, 4)
)

If we take two vectors a and b, the index of the first element i at which a[i] > b[i] or a[i] < b[i] should determine what vector comes first. In this example, scores[['d']] > scores[['a']] because scores[['d']][2] > scores[['a']][2] (note that it doesn't matter that scores[['d']][5] < scores[['a']][5]).

Comparing two of those vectors could look something like this.

compare <- function(a, b) {
    # get first element index at which vectors differ
    i <- which.max(a != b)
    if(a[i] > b[i])
        1
    else if(a[i] < b[i])
        -1
    else
        0
}

The sorted keys of scores by using this comparison function should then be d, b, a, c.

From other solutions I've found, they mess with the data before ordering or introduce S3 classes and apply comparison attributes. With the former I fail to see how to mess with my data (maybe turn it into strings? But then what about numbers above 9?), with the latter I feel uncomfortable introducing a new class into my R package only for comparing vectors. And there doesn't seem to be a sort of comparator parameter I'd want to pass to order.

Felix Jassler
  • 1,029
  • 11
  • 22
  • 1
    @akrun I'm not quite sure what you mean by using combinatorics. What I'm aiming for is something frequently found in [Python](https://stackoverflow.com/a/12749495/7669319), [cpp](https://stackoverflow.com/a/16895019/7669319), [Java](https://stackoverflow.com/a/24227670/7669319) or [Go](https://gobyexample.com/sorting-by-functions) where you're able to supply a compare function that simply compares two items. – Felix Jassler Nov 27 '21 at 19:24

3 Answers3

2

Here's an attempt. I've explained every step in the comments.

compare <- function(a, b) {
  
  # subtract vector a from vector b
  comparison <- a - b
  # get the first non-zero result
  restult <- comparison[comparison != 0][1]
  # return 1 if result == 1 and 2 if result == -1 (0 if equal)
  if(is.na(restult)) {return(0)} else if(restult == 1) {return(1)} else {return(2)}
  
}

compare_list <- function(list_) {
  
  # get combinations of all possible comparison
  comparisons <- combn(length(list_), 2)
  # compare all possibilities
  results <- apply(comparisons, 2, function(x) {
    # get the "winner"
    x[compare(list_[[x[1]]], list_[[x[2]]])]
  })
  # get frequency table (how often a vector "won" -> this is the result you want)
  fr_tab <- table(results)
  # vector that is last in comparison
  last_vector <- which(!(1:length(list_) %in% as.numeric(names(fr_tab))))
  # return the sorted results and add the last vectors name
  c(as.numeric(names(sort(fr_tab, decreasing = T))), last_vector)
  
}

If you run the function on your example, the result is

> compare_list(scores)
[1] 4 2 1 3

I haven't dealt with the case that the two vectors are identical, you haven't explained how to deal with this.

sedsiv
  • 531
  • 1
  • 3
  • 15
  • I find it interesting to see `compn` used like this! Definitely learned something new there. My only worry is that the complexity seems to be somewhere in the `O(n^2)` area – Felix Jassler Nov 28 '21 at 17:07
  • 1
    That's true, though you could implement some form of a Monte Carlo Tree Search to eliminate some comparisons beforehand. E.g. compare `d` to only `b` then `c` to only `a` and then `b` to `a`. – sedsiv Nov 28 '21 at 18:53
2

The native R way to do this is to introduce an S3 class.

There are two things you can do with the class. You can define a method for xtfrm that converts your list entries to numbers. That could be vectorized, and conceivably could be really fast.

But you were asking for a user defined compare function. This is going to be slow because R function calls are slow, and it's a little clumsy because nobody does it. But following the instructions in the xtfrm help page, here's how to do it:

scores <- list(
  'a' = c(1, 1, 2, 3, 4, 4),
  'b' = c(1, 2, 2, 2, 3, 4),
  'c' = c(1, 1, 2, 2, 3, 4),
  'd' = c(1, 2, 3, 3, 3, 4)
)

# Add a class to the list

scores <- structure(scores, class = "lexico")

# Need to keep the class when subsetting

`[.lexico` <- function(x, i, ...) structure(unclass(x)[i], class = "lexico")

# Careful here:  identical() might be too strict

`==.lexico` <- function(a, b) {identical(a, b)}

`>.lexico` <- function(a, b) {
  a <- a[[1]]
  b <- b[[1]]
  i <- which(a != b)
  length(i) > 0 && a[i[1]] > b[i[1]]
}

is.na.lexico <- function(a) FALSE

sort(scores)
#> $c
#> [1] 1 1 2 2 3 4
#> 
#> $a
#> [1] 1 1 2 3 4 4
#> 
#> $b
#> [1] 1 2 2 2 3 4
#> 
#> $d
#> [1] 1 2 3 3 3 4
#> 
#> attr(,"class")
#> [1] "lexico"

Created on 2021-11-27 by the reprex package (v2.0.1)

This is the opposite of the order you asked for, because by default sort() sorts to increasing order. If you really want d, b, a, c use sort(scores, decreasing = TRUE.

user2554330
  • 37,248
  • 4
  • 43
  • 90
  • Thank you for the detailed response! Do you have any input on how that affects package development for CRAN, ie. should I be wary of simply adding new classes to lists, should I `#' @export` those `lexico` functions, etc.? – Felix Jassler Nov 28 '21 at 11:21
1

Here's another, very simple solution:

sort(sapply(scores, function(x) as.numeric(paste(x, collapse = ""))), decreasing = T)

What it does is, it takes all the the vectors, "compresses" them into a single numerical digit and then sorts those numbers in decreasing order.

sedsiv
  • 531
  • 1
  • 3
  • 15
  • Oh, I didn't consider the `as.numeric` part when thinking about condensing the vector. Neat! Do you have an idea on how to consider negative numbers as well? (eg. `c(-3, -1, 0, 4)` is greater than `c(-3, 0, 2, 5)`) – Felix Jassler Dec 06 '21 at 10:27
  • Since the vectors are monotone increasing, just add the smallest number to all vectors (in your example add 3 to all vectors). This ensures that all numbers in the vectors are positive. – sedsiv Dec 07 '21 at 16:14