
I am trying to count the number of discordant pairs. For example:

arg1 <- c("b", "c", "a", "d")
arg2 <- c("b", "c", "d", "a")

There is 1 discordant pair in the above (the pair: "a" and "d")

But when I run:

require(asbio)
sum(ConDis.matrix(arg1,arg2)==-1,na.rm=TRUE)

The answer I receive is 5, instead of the answer I expect, which is 1.

I also tried:

require(RankAggreg)
require(DescTools)
xy <- table(arg1,arg2)
cd <- ConDisPairs(xy)
cd$D

The answer is 5 again.

What am I missing?

zx8754
nafrtiti

2 Answers


I think you are misunderstanding how ConDis.matrix works.

The pairs it refers to are pairs of observations (index positions), not pairs of values. For each pair of positions, the function checks whether the values move in the same direction in both vectors.

So your vectors do indeed contain 5 discordant pairs (treating the letters as ordered alphabetically):

  1. obs1 and obs3: in arg1, "a" is lower than "b", but in arg2, "d" is higher than "b"
  2. obs1 and obs4: in arg1, "d" is higher than "b", but in arg2, "a" is lower than "b"
  3. obs2 and obs3: in arg1, "a" is lower than "c", but in arg2, "d" is higher than "c"
  4. obs2 and obs4: in arg1, "d" is higher than "c", but in arg2, "a" is lower than "c"
  5. obs3 and obs4: in arg1, "a" is lower than "d", but in arg2, "d" is higher than "a"
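
The rule above can be sketched in base R without `asbio`. This is only an illustration of the sign-of-difference idea, not the internals of `ConDis.matrix`; the variable names (`r1`, `r2`, `idx`) are made up for the example, and it assumes the letters are ranked alphabetically and that there are no ties:

```r
arg1 <- c("b", "c", "a", "d")
arg2 <- c("b", "c", "d", "a")

# rank() orders character values alphabetically: a < b < c < d
r1 <- rank(arg1)  # 2 3 1 4
r2 <- rank(arg2)  # 2 3 4 1

# Every pair of positions (i, j) with i < j, one pair per column
idx <- combn(length(arg1), 2)

# Direction of change between the two positions in each vector
s1 <- sign(r1[idx[2, ]] - r1[idx[1, ]])
s2 <- sign(r2[idx[2, ]] - r2[idx[1, ]])

# A pair is discordant when the two directions disagree
discordant <- idx[, s1 * s2 < 0, drop = FALSE]
n_discordant <- ncol(discordant)
n_discordant  # 5
```

The columns of `discordant` are the position pairs (1,3), (1,4), (2,3), (2,4) and (3,4), matching the list above. Only positions 1 and 2 are concordant.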
Cath
  • Thanks Cath. I'm still a bit confused. What does it mean "moving in the same way"? – nafrtiti Jul 21 '17 at 06:34
  • @nafrtiti it means they are going either from low to high or from high to low in both vectors: the difference between the 2 elements has the same sign in both vectors – Cath Jul 21 '17 at 06:36

Based on @Cath's initial comment, converting the character vectors into factors seems like it might provide a workaround by mapping the text values to integers that can then be used in the function. Edit: be aware that reordering the factor levels changes the final result. I don't know enough about the discordance function to say if this is the expected behavior.

# Original Character vectors
arg1 <- c("b","c","a","d")
arg2 <-  c("b","c","d","a")

# Translate character vectors into factors
all_levels <- unique(c(arg1, arg2))  # combine first: unique()'s 2nd argument is 'incomparables'
arg1 <- factor(arg1, levels = all_levels)
arg1
[1] b c a d
Levels: b c a d

arg2 <- factor(arg2, levels = all_levels)
arg2
[1] b c d a
Levels: b c a d

# This maps each text string to a number 
as.numeric(arg1)
[1] 1 2 3 4
as.numeric(arg2)
[1] 1 2 4 3

# Use the underlying numeric data in the function
require(asbio)
sum(ConDis.matrix(as.numeric(arg1), as.numeric(arg2))==-1,na.rm=TRUE)
[1] 1

Edit: sorting the factor levels changes the final output

arg1 <- c("b","c","a","d")
arg2 <- c("b","c","d","a")

all_levels <- sort(unique(c(arg1, arg2)))  # sorted

arg1 <- factor(arg1, levels = all_levels)
arg2 <- factor(arg2, levels = all_levels)

sum(ConDis.matrix(as.numeric(arg1), as.numeric(arg2))==-1,na.rm=TRUE)
[1] 5
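
To make the dependence on the level order explicit, here is a minimal base-R sketch. `count_discordant` is a hypothetical helper, not a function from `asbio` or `DescTools`; it assumes there are no ties and that every value appears in `levels`:

```r
# Hypothetical helper: count discordant pairs under an explicit ordering.
# `levels` lists the values from lowest to highest.
count_discordant <- function(x, y, levels) {
  xi <- match(x, levels)          # position of each value in the ordering
  yi <- match(y, levels)
  idx <- combn(length(xi), 2)     # all pairs of positions (i < j)
  dx <- sign(xi[idx[2, ]] - xi[idx[1, ]])
  dy <- sign(yi[idx[2, ]] - yi[idx[1, ]])
  sum(dx * dy < 0)                # directions disagree => discordant
}

arg1 <- c("b", "c", "a", "d")
arg2 <- c("b", "c", "d", "a")

n_alpha  <- count_discordant(arg1, arg2, c("a", "b", "c", "d"))  # alphabetical
n_custom <- count_discordant(arg1, arg2, c("b", "c", "a", "d"))  # order of first appearance
```

With alphabetical levels the count is 5, and with the order of first appearance it is 1, which mirrors the two factor-based results above. Reversing an ordering does not change the count, since both sign vectors flip together.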
Damian
  • Actually, letters are already ordered (R sorts characters alphabetically), so the function works as you would expect without any need to convert to a factor – Cath Jul 21 '17 at 06:37
  • @Damian - if I could mark 2 correct answers I would mark yours as well. Thanks! – nafrtiti Jul 21 '17 at 06:40
  • @Cath - it doesn't work without sorting the factors. I tried `sum(ConDis.matrix(as.numeric(arg1), as.numeric(arg2))==-1,na.rm=TRUE)` and it gives an error message. – nafrtiti Jul 21 '17 at 06:46
  • @nafrtiti depending on what you call "sorting", maybe this sorting gives you the answer you expected, but it still doesn't do what you thought it does. Of course, if you want b>c>d>a, you need to define a specific order because it is not the usual one, but this cannot be made automatic to always give the result you want. To get what you want, you'll have to use another function, probably a custom one (something like `sum(arg1 != arg2)`, which will give 2; I have to admit I don't really understand the logic behind a result of 1). – Cath Jul 21 '17 at 06:51
  • @nafrtiti `as.numeric(arg1)` will give only NAs, letters are ordered (alphabetically) but they are still `character` and converting to `numeric` can just give `NA` – Cath Jul 21 '17 at 06:57
  • Good point about the sorting (@Cath): setting the factor levels manually would also ensure the desired sequence, e.g. `all_levels <- c('b', 'a', 'd', 'c')`, which might be tedious if there are many levels. About the error: the example with the factors didn't trigger any errors on my machine, so I'm not sure what the difference might be. I'm not clear on how to interpret the output either, but I'm glad I could help find a workaround – Damian Jul 21 '17 at 14:14
  • The logic of the result of "1" - the objects in vectors arg1 and arg2 do not have ordered values. In other words, in my example b>c>d>a is the correct order. Thanks for the answers and the clarifications, it was a tremendous help!! – nafrtiti Jul 23 '17 at 08:53
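
The `NA` behaviour mentioned in the comments can be checked in base R. A small sketch (the variable names are only for illustration): calling `as.numeric` on a plain character vector of letters yields `NA`s with a warning, whereas calling it on a factor returns the integer level codes.

```r
x <- c("b", "c", "a", "d")

# Letters cannot be parsed as numbers, so every element becomes NA
chr_as_num <- suppressWarnings(as.numeric(x))

# A factor stores integer codes; default levels are sorted: a b c d
fac_codes <- as.numeric(factor(x))

chr_as_num  # NA NA NA NA
fac_codes   # 2 3 1 4
```

This would explain why the `as.numeric()` call only behaves as intended after the vectors have been converted to factors.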