2

I am trying to solve the equation AX = B in R.

I have two matrices, A and B:

A = matrix(c(1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
         0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,
         0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,
         0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,
         0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,
         0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,
         0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,
         0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,
         1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0), byrow = T, nrow = 10, ncol = 16)

B = matrix(c(1900,2799,3096,3297,3782,4272,7783,10881,7259,30551), nrow = 10, ncol = 1)

My question is, how can I solve AX = B and be guaranteed a non-negative solution? The values I am solving for (X1, X2,...X15, X16) are population figures so they cannot be negative. Ideally, they would be integer values as well but one thing at a time.

Is there an easy way to do this in R?

I have found one way to do it here but it doesn't yield a positive result for all X which is what I'm after.

Community
  • 1
  • 1
  • This appears to be more of a math question than a programming question. Perhaps you should ask this on [Math](http://math.stackexchange.com/) or [CrossValidated](http://stats.stackexchange.com/)? – r2evans Jul 27 '16 at 20:26
  • @r2evans I'm concerned more with the programming than the math itself. I was hoping someone would know how to solve this problem in R. –  Jul 27 '16 at 20:32
  • 1
    I don't understand. If algebra gives you a negative value, it is negative. There is no way to force it to be positive. – Roland Jul 27 '16 at 20:36
  • For the programming side, see `?solve`. For a *"non-negative solution"*, I think a mathematician is needed to answer authoritatively, and both of the sites I suggested are frequented by people with that experience/knowledge. – r2evans Jul 27 '16 at 20:36
  • 1
    @Roland In many cases, there's an infinite number of solutions to a linear system of equations. Thus, there are some solutions with negative values and some with all positive values (which in my case is more desirable). –  Jul 27 '16 at 20:39
  • Okay. So the solution is a line or (hyper)plane and you are looking for a point on it with all positive coordinates. Solve it manually or use a CAS. You can use Ryacas if you need to interface with R. – Roland Jul 27 '16 at 20:43
  • This is done with a generalized linear model (GLM) with Poisson-distributed predictors. This wikipedia article on [Poisson regression](https://en.wikipedia.org/wiki/Poisson_regression) describes it—let me look for an R package that does this. You might consider cross-posting this to http://stats.stackexchange.com/ in case you can’t wait for me. – Ahmed Fasih Jul 27 '16 at 21:26
  • 1
    I found a solution using the `nnls` package. The solution is (1900, 0, 3297, 0, 0, 4272, 10881, 0, 3782, 0, 2799, 0, 109, 2987, 7783, 0). However, I know that roughly speaking `X1` should be ~600 and `X2` should be ~1300. Is there anyway to include this logic in R? (By possibly putting constraints on `X1` (saying it must be >500 let's say) and `X2` (saying it must be >1200)? –  Jul 27 '16 at 21:59
  • 1
    @AhmedFasih . Do not cross post on CV. Cross posting strongly discourage and is grounds for flagging a moderator and having the question removed on one or both sites. See [this link](http://meta.stackexchange.com/questions/64068/is-cross-posting-a-question-on-multiple-stack-exchange-sites-permitted-if-the-qu) for a discussion. – lmo Jul 27 '16 at 22:05
  • @ultimate8 Your comments read like you have priors. You should look into Bayesian methods. – Roland Jul 28 '16 at 06:52
  • 1
    Do you have any more insight into what the solutions should be? Is there a max value that solutions are unlikely to be greater than? – Ahmed Fasih Jul 29 '16 at 09:09
  • @AhmedFasih I took a more simplified approach than the one I was originally trying to do. The initial problem was that I knew the population at two different levels of detail (let's say C1 and C2). However, I was trying to find out the population at C1 X C2. I had an estimate of what the population was for C1 X C2 back in 2010 and assumed that distribution was roughly the same for the current census year. I then multiplied by the C1 totals and then the C2 totals until the totals were consistent at both those levels (i.e. the adjustment factor was 1). –  Jul 29 '16 at 13:54

2 Answers2

1

You will need a error function to optimize the parameters Xs. Then you can use many packages for multivariate optimization. Here I use two functions from BB package, minimizing the sum of squared errors, and the maximum absolute error.

The error is defined as:

$ error = A . X - B $

You could add box constrains with lower and upper parameters defined as constant or vectors.

set.seed(321)
p0 <- abs(runif(16))*1000  #same starting values
library(BB)
se2 <- function(x){ # sum of squared errors
  sum(1000 * (A %*% x - B)^2)
}
se2(p0)

seab <- function(x){ # max error
  max(1000 * abs(A %*% x - B))
}
seab(p0)

s1<-spg(par=p0, fn=se2,lower=0)
s2<-BBoptim(par=p0, fn=se2,lower=0)
s3<-spg(par=p0, fn=seab,lower=0)
s4<-BBoptim(par=p0, fn=seab,lower=0)

linf=c(500,1200,rep(0,length(p0)-2)) #prior inferior
s5<-spg(par=p0, fn=se2,lower=linf)
s6<-BBoptim(par=p0, fn=se2,lower=linf)

round(cbind(s1$par,s2$par,s3$par,s4$par,s5$par,s6$par),2) # Xs

round(c(se2(s1$par),se2(s2$par),seab(s3$par),seab(s4$par),se2(s5$par),se2(s6$par)),3) #functions

The result:

> round(cbind(s1$par,s2$par,s3$par,s4$par,s5$par,s6$par),2) # Xs
         [,1]    [,2]    [,3]    [,4]    [,5]    [,6]
 [1,] 1900.00 1900.00 1334.61 1898.22  700.00  700.00
 [2,]    0.00    0.00  565.39    1.78 1200.00 1200.00
 [3,] 3166.80 3166.80 3211.66 3177.49 3297.00 3297.00
 [4,]  130.20  130.20   85.34  119.51    0.00    0.00
 [5,] 3687.42 3687.42 3839.29 3966.84 3954.88 3954.87
 [6,]  584.58  584.58  432.71  305.16  317.12  317.13
 [7,] 7048.48 7048.48 7200.57 6852.74 7315.90 7315.93
 [8,] 3832.52 3832.52 3680.43 4028.26 3565.10 3565.07
 [9,] 3239.80 3239.80 3356.15 3509.46 3507.25 3507.24
[10,]  542.20  542.20  425.85  272.54  274.75  274.76
[11,] 2799.00 2799.00 2788.24 2796.45 2799.00 2799.00
[12,]    0.00    0.00   10.76    2.55    0.00    0.00
[13,] 3096.00 3096.00 3054.95 2954.85 3096.00 3096.00
[14,]    0.00    0.00   41.05  141.15    0.00    0.00
[15,] 5613.50 5613.50 5765.53 5394.95 5880.97 5880.97
[16,] 2169.50 2169.50 2017.47 2388.05 1902.03 1902.03
> 
> round(c(se2(s1$par),se2(s2$par),seab(s3$par),seab(s4$par),se2(s5$par),se2(s6$par)),3) #functions
[1] 0.000 0.000 1.161 0.000 0.000 0.000
Robert
  • 5,038
  • 1
  • 25
  • 43
1

I'm not aware of any packages that would bring about the result you desire, so I wrote a base R solution below that reduces the matrix to row echelon form. After that, we need to do a little Algebra to obtain the results.

reduceMatrixSO <- function(mat) {
    n1 <- ncol(mat); n2 <- nrow(mat)
    mymax <- 1L
    for (i in 1:(n1-1L)) {
        temp <- which(mat[,i] != 0L)
        t <- which(temp >= mymax)
        if (length(temp)>0L && length(t)>0L) {
            MyMin <- min(temp[t])
            if (!(MyMin==mymax)) {
                vec <- mat[MyMin,]
                mat[MyMin,] <- mat[mymax,]
                mat[mymax,] <- vec
            }
            Coef1 <- mat[mymax, i]
            t <- t[-1]; temp <- temp[t]
            for (j in temp) {
                Coef2 <- mat[j,i]
                mat[j,] <- mat[j,] - mat[mymax,]*Coef2/Coef1
            }
            mymax <- mymax+1L
        }
    }

    if (mymax<n2) {simpMat <- mat[-(mymax:n2),]} else {simpMat <- mat}
    lenSimp <- nrow(simpMat)
    if (is.null(lenSimp)) {lenSimp <- 0L}
    mycols <- 1:n1

    if (lenSimp>1L) {
        ## "Diagonalizing" Matrix
        for (i in 1:lenSimp) {
            if (all(simpMat[i,]==0L)) {simpMat <- simpMat[-i,]; next}
            if (simpMat[i,i]==0L) {
                t <- min(which(simpMat[i,] != 0L))
                vec <- simpMat[,i]; tempCol <- mycols[i]
                simpMat[,i] <- simpMat[,t]; mycols[i] <- mycols[t]
                simpMat[,t] <- vec; mycols[t] <- tempCol
            }
        }

        colnames(simpMat) <- c(sapply(mycols[-n1], function(r) paste(c("x",r),collapse = "")),"B")
        lenSimp <- nrow(simpMat)
        MyFree <- sapply(mycols[which((1:(n1-1L))>lenSimp)], function(r) paste(c("x",r),collapse = ""))

        for (i in 1:lenSimp) {
            temp <- which(simpMat[,i] != 0L)
            t <- which(temp != i)
            if (length(temp)>0L && length(t)>0L) {
                Coef1 <- simpMat[i,i]
                temp <- temp[t]
                for (j in temp) {
                    Coef2 <- simpMat[j,i]
                    simpMat[j,] <- simpMat[j,] - simpMat[i,]*Coef2/Coef1
                }
            }
        }
        list(ReducedMatrix = simpMat, FreeVariables = MyFree)
    } else {
        list(NULL,NULL)
    }
}

The above code is a bit rough on the eyes, but it is really quite simple and pretty fast (I've tested it on much larger matrices and it returns the correct result instantaneously). Below is the result it gives:

NewM <- cbind(A,B)

reduceMatrixSO(NewM)
$ReducedMatrix
     x1 x2 x3 x5 x7 x9 x11 x13 x15 x10 x4 x12 x8 x14 x6 x16     B
[1,]  1  0  0  0  0  0   0   0   0  -1 -1  -1 -1  -1 -1  -1 -5359
[2,]  0  1  0  0  0  0   0   0   0   1  1   1  1   1  1   1  7259
[3,]  0  0  1  0  0  0   0   0   0   0  1   0  0   0  0   0  3297
[4,]  0  0  0  1  0  0   0   0   0   0  0   0  0   0  1   0  4272
[5,]  0  0  0  0  1  0   0   0   0   0  0   0  1   0  0   0 10881
[6,]  0  0  0  0  0  1   0   0   0   1  0   0  0   0  0   0  3782
[7,]  0  0  0  0  0  0   1   0   0   0  0   1  0   0  0   0  2799
[8,]  0  0  0  0  0  0   0   1   0   0  0   0  0   1  0   0  3096
[9,]  0  0  0  0  0  0   0   0   1   0  0   0  0   0  0   1  7783

$FreeVariables
[1] "x10" "x4"  "x12" "x8"  "x14" "x6"  "x16"

This tells us that there are 6 free variables. Since the OP is looking for specific non-negative integer solutions, we can put constraints on each variable (i.e. x1 > 500 and x2 < 1200). This can be done in many ways, and while the general solution (with such constraints) doesn't contain an infinite amount of solutions, there are many (>> 10^12) solutions,and obtaining all of them would be a bit nonsensical. This is why in a Linear Algebra course, the solution is often given with inequalities on each variable (e.g 330 <= x1 <= 738, 1154 <= x2 < 1500, etc. (N.B. these are not actual solutions)). Now, we can easily obtain the desired solution by wisely setting certain values to zero. Let's try the following:

##   x10 = x4 = x12 = x8 = x14 = x6 = 0
##   x1 >= 500  &&  x2 <= 1200

x1 <- -5359 + x16  ## ==>>  -5359 + x16 >= 500  ==>>  x16 >= 5859
x2 <- 7259 - x16   ## ==>>  7259 - x16 >= 1200  ==>>  x16 <= 6059
x3 <- 3297
x5 <- 4272
x7 <- 10881
x9 <- 3782
x11 <- 2799
x13 <- 3096
x15 <- 7783 - x16  ## ==>>  x16 <= 7783

Ultimately 5859 <= x16 <= 6059

Let's try the above solution:

set.seed(5467)
x16 <- sample(5859:6059, 1)
## set variables above using the newly defined x16
X <- c(x1,x2,x3,0,x5,0,x7,0,x9,0,x11,0,x13,0,x15,x16)

all(A %*% X == B)
[1] TRUE

all(X >= 0)   ## all non-negative
[1] TRUE

all(gmp::is.whole(X))   ## all integers
[1] TRUE

Finally, we identify the variables and print the solution:

names(X) <- sapply(1:16, function(r) paste(c("x",r), collapse = "")) 

X
 x1    x2    x3    x4    x5    x6    x7    x8    x9   x10   x11   x12   x13   x14   x15   x16 
590  1310  3297     0  4272     0 10881     0  3782     0  2799     0  3096     0  1834  5949
Joseph Wood
  • 7,077
  • 2
  • 30
  • 65