-2

Thank you in advance!

I have a matrix in R:

         Room 1     Room 2     Room 3     Room 4     Room 5
House 1  10         30         80         20         10
House 2  20         10         80         10         30
House 3  10         10         20         60         50

I would like to add a colum with the sum of the three highest values:

         Room 1     Room 2     Room 3     Room 4     Room 5     Sum
House 1  10         30         80         20         10         130 (=80+30+20)
House 2  20         10         90         10         30         140 (=90+30+20)
House 3  10         10         20         60         10         90  (=60+20+10)

As you can see in line 3, if there are more than one values with the same height, it should just use 1 of the values.

Can anyone help? Thank you!

I hope this is a reproducible example:

m <- matrix(c(10,30,80,20,10,20,10,80,10,30,10,10,20,60,50),ncol=5,byrow=TRUE)
colnames(m) <- c("Room1","Room2","Room3","Room4","Room5")
rownames(m) <- c("House1","House2","House3")
m <- as.matrix(m)

The original matrix is a large matrix, 100500 elements, 1005 entries and 100 columns:

dput(head(matrixtopicProbabilities,1))
    structure(c(0.000634786288616166, 0.000634786288616166, 0.000634786288616166, 
    0.115742699957681, 0.000634786288616166, 0.000211595429538722, 
    0.00275074058400339, 0.000211595429538722, 0.00698264917477782, 
    0.00359712230215827, 0.000634786288616166, 0.0086754126110876, 
    0.000634786288616166, 0.00232754972492594, 0.14409648751587, 
    0.000211595429538722, 0.000211595429538722, 0.00105797714769361, 
    0.276132035548032, 0.0019043588658485, 0.00148116800677105, 0.00317393144308083, 
    0.00105797714769361, 0.00275074058400339, 0.000634786288616166, 
    0.0158696572154041, 0.00275074058400339, 0.000211595429538722, 
    0.00275074058400339, 0.000634786288616166, 0.000211595429538722, 
    0.000211595429538722, 0.000211595429538722, 0.00909860347016504, 
    0.000211595429538722, 0.00148116800677105, 0.000211595429538722, 
    0.000634786288616166, 0.00105797714769361, 0.00148116800677105, 
    0.00571307659754549, 0.00105797714769361, 0.000211595429538722, 
    0.00105797714769361, 0.00232754972492594, 0.0175624206517139, 
    0.000634786288616166, 0.00105797714769361, 0.000634786288616166, 
    0.00148116800677105, 0.000634786288616166, 0.000634786288616166, 
    0.00148116800677105, 0.000211595429538722, 0.000211595429538722, 
    0.00359712230215827, 0.00528988573846805, 0.000634786288616166, 
    0.000211595429538722, 0.000634786288616166, 0.000634786288616166, 
    0.000634786288616166, 0.0175624206517139, 0.000634786288616166, 
    0.00232754972492594, 0.000211595429538722, 0.00359712230215827, 
    0.0298349555649598, 0.000634786288616166, 0.000211595429538722, 
    0.0019043588658485, 0.000211595429538722, 0.0019043588658485, 
    0.000211595429538722, 0.000211595429538722, 0.0230639018197207, 
    0.00359712230215827, 0.00655945831570038, 0.00317393144308083, 
    0.000634786288616166, 0.00105797714769361, 0.0387219636055861, 
    0.00148116800677105, 0.00402031316123572, 0.00232754972492594, 
    0.000211595429538722, 0.0247566652560305, 0.0217943292424884, 
    0.000211595429538722, 0.0772323317816335, 0.00571307659754549, 
    0.00148116800677105, 0.000211595429538722, 0.00105797714769361, 
    0.000211595429538722, 0.000634786288616166, 0.00105797714769361, 
    0.00148116800677105, 0.000211595429538722, 0.0590351248413034
    ), .Dim = c(1L, 100L), .Dimnames = list("company1", c("V1", "V2", 
    "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10", "V11", "V12", 
    "V13", "V14", "V15", "V16", "V17", "V18", "V19", "V20", "V21", 
    "V22", "V23", "V24", "V25", "V26", "V27", "V28", "V29", "V30", 
    "V31", "V32", "V33", "V34", "V35", "V36", "V37", "V38", "V39", 
    "V40", "V41", "V42", "V43", "V44", "V45", "V46", "V47", "V48", 
    "V49", "V50", "V51", "V52", "V53", "V54", "V55", "V56", "V57", 
    "V58", "V59", "V60", "V61", "V62", "V63", "V64", "V65", "V66", 
    "V67", "V68", "V69", "V70", "V71", "V72", "V73", "V74", "V75", 
    "V76", "V77", "V78", "V79", "V80", "V81", "V82", "V83", "V84", 
    "V85", "V86", "V87", "V88", "V89", "V90", "V91", "V92", "V93", 
    "V94", "V95", "V96", "V97", "V98", "V99", "V100")))
  • 1
    Untested: `apply(m, 1, function(x) sum(sort(unique(x), decreasing = TRUE)[1:3]))` This should give `NA` if there are less than 3 unique values. – Roland Oct 25 '18 at 09:03
  • @Roland but I don't think the `unique` is needed anyway, isn't it? – bobbel Oct 25 '18 at 09:22
  • @Roland, I've tried it, in order to make it work when is less than 3 unique values it's putting `na.rm=TRUE`. `apply(m, 1, function(x) sum(sort(unique(x), decreasing = TRUE)[1:3], na.rm = TRUE))` In this case, it sum the unique values if they're just 2 or 1. – Dave Oct 25 '18 at 09:24
  • @bobbel How to propose to solve "if there are more than one values with the same height, it should just use 1 of the values"? – Roland Oct 25 '18 at 09:24
  • @Dave The required result for this case is not specified. `NA` could be appropriate. – Roland Oct 25 '18 at 09:24
  • You're right! Well, now @GreenPirate has different options, use it wisely! – Dave Oct 25 '18 at 09:26
  • Thank the three of your! Still need some blondie help. :) Some clarification: in line 3 the top 2 values are 60 and 20. The next biggest value is 10, but it occurs 3 times. The sum should be: 60 + 20 + 10, despite the fact that the value 10 occurs 3 times it should only be used once in the sum. – GreenPirate Oct 25 '18 at 09:31
  • @Roland edit: m$top3 <- apply(m, 1, function(x) sum(sort(unique(x), decreasing = TRUE)[1:3], na.rm = TRUE)) yields in "Coercing LHS to a list" so I cannot see if the results are right. Help is highly appreciated! – GreenPirate Oct 25 '18 at 11:46
  • I will gladly help. Just provide a [reproducible example](https://stackoverflow.com/a/5963610/1412059) (by editing your question). – Roland Oct 25 '18 at 11:47
  • @Roland thanks! I edited my question. – GreenPirate Oct 25 '18 at 12:25
  • @Roland any information still missing? Sorry for being a greenhorn. – GreenPirate Oct 25 '18 at 17:35
  • My code above works for both your example datasets. I cannot reproduce the problem. – Roland Oct 26 '18 at 06:07

1 Answers1

0

I would recommend splitting the problem into several parts. First create a function that computes the sum of the three highest values. Then apply the function to the matrix. Finally add the column with the sum values to the matrix. I create some sample code below using your example.

# Sample data
m <- matrix(c(10,30,80,20,10,
              20,10,80,10,30,
              10,10,20,60,50),ncol=5,byrow=TRUE)
colnames(m) <- c("Room1","Room2","Room3","Room4","Room5")
rownames(m) <- c("House1","House2","House3")
print(m)

# Function to sum the three highest values:
SumTopThree <- function(vec) {
  sortVec <- vec[order(vec, decreasing=TRUE)]
  return(sum(sortVec[1:3]))
}

# Apply the function to your matrix
Sum <-  apply(m, 1, SumTopThree)

# Add the column to the matrix
m <- cbind(m, Sum)
print(m)

Hopefully this will get you closer to the solution!

MatAff
  • 1,250
  • 14
  • 26