I am new to R and need to apply pairwise comparison formulas across a set of variables. The number of elements to be compared will be dynamic, but here is a hardcoded example with 4 elements, each compared against the others:

#there are 4 choices A, B, C, D - 
#they are compared against each other and comparisons are stored:
df1 <- data.frame("A" = c(80),"B" = c(20))
df2 <- data.frame("A" = c(90),"C" = c(10))
df3 <- data.frame("A" = c(95), "D" = c(5))
df4 <- data.frame("B" = c(80), "C" = c(20))
df5 <- data.frame("B" = c(90), "D" = c(10))
df6 <- data.frame("C" = c(80), "D" = c(20))

#show the different comparisons in a matrix
matrixA <- matrix(c("", df1$B[1], df2$C[1], df3$D[1],
                df1$A[1],     "", df4$C[1], df5$D[1],
                df2$A[1], df4$B[1],     "", df6$D[1],
                df3$A[1], df5$B[1], df6$C[1],    ""),
              nrow=4,ncol = 4,byrow = TRUE)
dimnames(matrixA) = list(c("A","B","C","D"),c("A","B","C","D"))

#perform calculations on the comparisons
matrixB <- matrix(
      c(1,              df1$B[1]/df1$A[1], df2$C[1]/df2$A[1], df3$D[1]/df3$A[1], 
        df1$A[1]/df1$B[1],              1, df4$C[1]/df4$B[1], df5$D[1]/df5$B[1],
        df2$A[1]/df2$C[1], df4$B[1]/df4$C[1],              1, df6$D[1]/df6$C[1],
        df3$A[1]/df3$D[1], df5$B[1]/df5$D[1], df6$C[1]/df6$D[1],         1),
              nrow = 4, ncol = 4, byrow = TRUE)
matrixB <- rbind(matrixB, colSums(matrixB)) #add the column sums as a final row
dimnames(matrixB) = list(c("A","B","C","D","Sum"),c("A","B","C","D"))

#do some more calculations that I'll use later on
dfC <- data.frame("AB" = c(matrixB["A","A"] / matrixB["A","B"], 
                        matrixB["B","A"] / matrixB["B","B"],
                        matrixB["C","A"] / matrixB["C","B"],
                        matrixB["D","A"] / matrixB["D","B"]),
              "BC" = c(matrixB["A","B"] / matrixB["A","C"],
                        matrixB["B","B"] / matrixB["B","C"],
                        matrixB["C","B"] / matrixB["C","C"],
                        matrixB["D","B"] / matrixB["D","C"]
                        ), 
              "CD" = c(matrixB["A","C"] / matrixB["A","D"],
                        matrixB["B","C"] / matrixB["B","D"],
                        matrixB["C","C"] / matrixB["C","D"],
                        matrixB["D","C"] / matrixB["D","D"]))

dfCMeans <- colMeans(dfC)

#create the normalization matrix
matrixN <- matrix(c(
  matrixB["A","A"] / matrixB["Sum","A"], matrixB["A","B"] / matrixB["Sum","B"], matrixB["A","C"] / matrixB["Sum","C"], matrixB["A","D"] / matrixB["Sum","D"],
  matrixB["B","A"] / matrixB["Sum","A"], matrixB["B","B"] / matrixB["Sum","B"], matrixB["B","C"] / matrixB["Sum","C"], matrixB["B","D"] / matrixB["Sum","D"],
  matrixB["C","A"] / matrixB["Sum","A"], matrixB["C","B"] / matrixB["Sum","B"], matrixB["C","C"] / matrixB["Sum","C"], matrixB["C","D"] / matrixB["Sum","D"],
  matrixB["D","A"] / matrixB["Sum","A"], matrixB["D","B"] / matrixB["Sum","B"],     matrixB["D","C"] / matrixB["Sum","C"], matrixB["D","D"] / matrixB["Sum","D"]
  ), nrow = 4, ncol = 4, byrow = TRUE)

Since R is so concise, it seems like there should be a much better way to do this. Is there an easier way to do these types of calculations in R?

  • Take a look at `outer()` and see if that's useful. E.g `outer(c(80, 85, 60, 50), c(20, 15, 40, 45), "/")`. I'm not entirely sure what you're trying to do. – AkselA Jul 19 '18 at 00:05
  • You still need to clarify in natural language what is intended. In particular the use of the term "normalization" is highly imprecise. It can mean so many things to various people. And using a comment like "these are things I will use later" is singularly unhelpful. If the answer below was on the mark you should hit the checkmark so people will know that it addressed your question adequately. – IRTFM Jul 19 '18 at 16:58
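
For reference, a small sketch of what the `outer()` call suggested in the first comment does: it applies a function to every combination of elements from two vectors. The vectors `x` and `y` below are made-up values for illustration, not data from the question.

```r
# outer(x, y, FUN) returns a matrix whose entry [i, j] is FUN(x[i], y[j])
x <- c(80, 90, 95)   # illustrative values only
y <- c(20, 10, 5)    # illustrative values only

ratios <- outer(x, y, "/")

dim(ratios)    # one row per element of x, one column per element of y
ratios[1, 2]   # x[1] / y[2], i.e. 80 / 10
```

Because every element of `x` is combined with every element of `y`, this fits all-against-all comparisons better than a hand-picked set of pairs.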

1 Answer

OK, I might be starting to piece together something here.

We start with a matrix like so:

A <- structure(
  c(NA, 20, 10, 5, 80, NA, 20, 10, 90, 80, NA, 20, 95, 90, 80, NA),
  .Dim = c(4, 4),
  .Dimnames = list(LETTERS[1:4], LETTERS[1:4]))

A
#    A  B  C  D
# A NA 80 90 95
# B 20 NA 80 90
# C 10 20 NA 80
# D  5 10 20 NA
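
Since the number of choices is meant to be dynamic, a matrix like this could also be assembled from the pairwise scores instead of being typed in. This is a minimal sketch under assumptions: the data frame `pairs` and its columns `first`, `second` and `score_first` are hypothetical names, standing in for the question's `df1` to `df6`.

```r
# Hypothetical long-format input: one row per pair. `score_first` is the score
# the first choice received, so the second choice received 100 - score_first.
choices <- c("A", "B", "C", "D")
pairs <- as.data.frame(t(combn(choices, 2)), stringsAsFactors = FALSE)
names(pairs) <- c("first", "second")
pairs$score_first <- c(80, 90, 95, 80, 90, 80)  # AB, AC, AD, BC, BD, CD

# Empty n x n matrix; the diagonal stays NA
A <- matrix(NA_real_, length(choices), length(choices),
            dimnames = list(choices, choices))

# A two-column character matrix indexes by (row name, column name),
# which lets us fill both triangles in two assignments
A[cbind(pairs$first, pairs$second)] <- pairs$score_first
A[cbind(pairs$second, pairs$first)] <- 100 - pairs$score_first

A
```

Adding a fifth choice would only mean extending `choices` and supplying the extra scores; nothing else in the sketch depends on there being four.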

This matrix is the result of a pairwise comparison on a vector of length 4. We know nothing about this vector, and the only thing we know about the function used in the comparison is that it is binary and non-commutative, or more precisely: f(x, y) = 100 - f(y, x), with the result lying in [0, 100].

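matrixB appears to be simply matrixA divided, element by element, by its own transpose. With A laid out as above (the transpose of the question's matrixA), that is:

$B = A^{T} \oslash A$, i.e. $B_{ij} = A_{ji} / A_{ij}$ (elementwise division, not a matrix inverse)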

or if you prefer:

B = (100 - A) / A

Potato, potahto: the two are equivalent because of the property f(x, y) = 100 - f(y, x) mentioned above.

B <- (100 - A) / A
# or, equivalently:
B <- t(A) / A

# fill in the diagonal with 1s
diag(B) <- 1

round(B, 2)
#    A    B    C    D
# A  1 0.25 0.11 0.05
# B  4 1.00 0.25 0.11
# C  9 4.00 1.00 0.25
# D 19 9.00 4.00 1.00
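
A quick way to convince yourself that the two formulas for B agree is to check the underlying property directly. This is only a sanity-check sketch; the helper names `off_diag`, `pairs_sum_to_100` and `same_result` are made up for the example.

```r
A <- matrix(c(NA, 20, 10, 5, 80, NA, 20, 10, 90, 80, NA, 20, 95, 90, 80, NA),
            nrow = 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))

# Off the diagonal, each pair of mirrored entries adds up to 100 ...
off_diag <- row(A) != col(A)
pairs_sum_to_100 <- all((A + t(A))[off_diag] == 100)

# ... so 100 - A equals t(A) there, and both formulas give the same matrix
same_result <- isTRUE(all.equal((100 - A) / A, t(A) / A))

pairs_sum_to_100
same_result
```

If the input scores did not sum to 100 for some pair, the first check would fail and the two formulas would no longer be interchangeable.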

The 'normalized' matrix as you call it seems to be simply each column divided by its sum.

B.norm <- t(t(B) / colSums(B))

round(B.norm, 3)
#       A     B     C     D
# A 0.030 0.018 0.021 0.037
# B 0.121 0.070 0.047 0.079
# C 0.273 0.281 0.187 0.177
# D 0.576 0.632 0.746 0.707
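
The `dfC` step in the question looks like each column of B divided by the column to its right. Assuming that reading is correct, it might be written without hardcoding the column names as sketched below; `C.ratio` and `C.means` are names chosen for this example only.

```r
B <- matrix(c(1, 4, 9, 19,  1/4, 1, 4, 9,  1/9, 1/4, 1, 4,  1/19, 1/9, 1/4, 1),
            nrow = 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))

n <- ncol(B)

# Columns 1..(n-1) divided elementwise by columns 2..n
C.ratio <- B[, -n] / B[, -1]

# Name each ratio after the two columns involved: "AB", "BC", "CD"
colnames(C.ratio) <- paste0(colnames(B)[-n], colnames(B)[-1])

# Column means, as in the question's dfCMeans
C.means <- colMeans(C.ratio)
```

This works on B before any sum row is appended, so the sums do not leak into the ratios.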
AkselA
  • I'm not sure if outer() does what I need to. To help clarify the question I'm starting with a matrix: A B C D A "" "20" "10" "5" B "80" "" "20" "10" C "90" "80" "" "20" D "95" "90" "80" "" Then I want to create a new matrix by multiplying corresponding elements of the matrix. For example in the new matrix A,B would be A,B / B,A of the first matrix, A,C in the new matrix would be A,C / C,A of the first matrix, etc. Hopefully that makes more sense. – Christopher Davis Jul 19 '18 at 05:11
  • @ChristopherDavis: I suggest you drop the code and spell out in words what you're trying to do. Also show in a clear form what your input data is, and what you expect the output to look like. – AkselA Jul 19 '18 at 08:48
  • @ChristopherDavis: If you want to do _every_ comparison between vectors, between matrices, between a vector and a 3-dimensional array etc., then `outer()` is a good choice. If you are doing _specific_ comparisons `outer()` might be confusing. It seems like you are looking for specific comparisons, but it's unclear to me precisely which. – AkselA Jul 19 '18 at 08:54
  • [Here is a link to an excel document](https://1drv.ms/x/s!AuSk0mi8WN0UgR1thMI-MHlcHn1u) that has input, output and the formulas I am trying to replicate in R. The inputs are the A-B, A-C, etc comparisons at the top, the output is the normalization matrix at the bottom. – Christopher Davis Jul 19 '18 at 12:33
  • Please create a [minimal, concise and self-contained example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Make it easy for us trying to answer your question, and for others who might have a question similar to yours. – AkselA Jul 19 '18 at 14:00
  • Also to try and explain better: I am starting with a number of choices; in the example above the choices are A, B, C, D. Each choice is rated against the others and each pairwise comparison adds up to 100. So in the example above, when A was rated against B, A got a score of 80 and B got 20. When A was rated against C, A received 90 and C received 10. All of those values go into a matrix (matrixA in the example code), and then their ratios are determined by dividing corresponding elements in the matrix. So A,B in the second matrix = A,B / B,A in the first matrix. The final matrix normalizes everything – Christopher Davis Jul 19 '18 at 14:28
  • Sorry this is a bit confusing, the code above is a self contained example and the link to the excel document shows the formulas in a spreadsheet if that's easier. – Christopher Davis Jul 19 '18 at 14:31
  • If I need an external source to understand your question, then it isn't self-contained. Also, I don't have access to that OneDrive document. – AkselA Jul 19 '18 at 14:41
  • [Here is another link](https://docs.google.com/spreadsheets/d/1cMzFB1M8UiTopsFnfNT0uDhvbcT5ziE5jSLULeU7MkU/edit?usp=sharing) in google drive, you shouldn't need permissions with the link. – Christopher Davis Jul 19 '18 at 15:01
  • @ChristopherDavis: I revised my answer, hopefully this is closer to what you want, but please, look at your question as someone coming from the outside with no prior knowledge and the links you supplied dead, how helpful would they find it? – AkselA Jul 19 '18 at 17:09
  • Yes, that works - thanks!! One slight correction, the final matrix should be each column divided by its sum, to do that this works: B.norm <- sweep(B,2,colSums(B),`/`) – Christopher Davis Jul 19 '18 at 18:34
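
The `sweep()` call from the last comment and the double-transpose form both divide each column by its sum. Below is a small sketch comparing them, with base R's `prop.table()` added as a third option; the names `norm_sweep`, `norm_t` and `norm_prop` are invented for the comparison.

```r
B <- matrix(c(1, 4, 9, 19,  1/4, 1, 4, 9,  1/9, 1/4, 1, 4,  1/19, 1/9, 1/4, 1),
            nrow = 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))

# Three ways to divide every column by its own sum
norm_sweep <- sweep(B, 2, colSums(B), `/`)
norm_t     <- t(t(B) / colSums(B))
norm_prop  <- prop.table(B, margin = 2)

all.equal(norm_sweep, norm_t)
all.equal(norm_sweep, norm_prop)
colSums(norm_sweep)   # each column of the normalized matrix sums to 1
```

Which one to use is mostly a matter of taste; `sweep()` states the margin explicitly, which can make the intent easier to read.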