
I'm doing some experimenting, and I'm certainly aware that constraining coefficients is rarely needed, but here goes.

In the following data, I have used quadprog to solve a linear model. Note that X1 is simply the intercept.

X1 <- 1
X2 <- c(4.374, 2.3708, -7.3033, 12.0803, -0.4098, 53.0631, 13.1304, 7.3617, 16.6252, 27.3394)
X3 <- c(2.6423, 2.6284, 36.9398, 15.9278, 18.3124, 54.5039, 3.764, 19.0552, 25.4906, 13.0112)
X4 <- c(4.381, 3.144, 9.506, 15.329, 21.008, 38.091, 22.399, 13.223, 17.419, 19.87)
X <- as.matrix(cbind(X1,X2,X3,X4))
Y <- as.matrix(c(37.7,27.48,24.08,25.97,16.65,73.77,45.10,53.35,61.95,71.15))
library(quadprog)
M1 <- solve.QP(t(X) %*% X, t(X) %*% Y, matrix(0, 4, 0), numeric(0))$solution

The challenge is to subject certain coefficients to constraints. I know that I should be altering the Amat and bvec parameters (according to Linear regression with constraints on the coefficients). However, I'm not sure how to set them up so that the following constraints are met.

The output reads [1] 37.3468790 1.2872473 -0.0177749 -0.5988443, where the values would be predicted fitted values of X1, X2, X3 and X4.

Constraints (subject to)…

X2 <= .899

0 <= X3 <= .500

0 <= X4 <= .334
Lindon

1 Answer


Just to clarify: You give an expected output "where the values would be predicted fitted values of X1, X2, X3 and X4". I assume you are referring to estimated (not fitted) parameter values.

It's quite straightforward to implement constrained parameters when modelling data in rstan. rstan is an R interface to Stan, which in turn is a probabilistic programming language for Bayesian statistical inference.

Here is an example based on the sample data you provide.

  1. First, let's define our model. Note that we constrain parameters beta2, beta3 and beta4 according to your requirements.

    model <- "
    data {
        int<lower=1> n;   // number of observations
        int<lower=0> k;   // number of parameters
        matrix[n, k] X;   // data
        vector[n] y;      // response
    }
    
    parameters {
        real beta1;                       // X1 unconstrained
        real<upper=0.899> beta2;          // X2 <= .899
        real<lower=0, upper=0.5> beta3;   // 0 <= X3 <= 0.5
        real<lower=0, upper=0.334> beta4; // 0 <= X4 <= 0.334
        real<lower=0> sigma;              // residual sd, must be positive
    }
    
    model {
        // Likelihood
        y ~ normal(beta1 * X[, 1] + beta2 * X[, 2] + beta3 * X[, 3] + beta4 * X[, 4], sigma);
    }"
    
  2. Next, we fit the model to your sample data.

    library(rstan)
    rstan_options(auto_write = TRUE)
    options(mc.cores = parallel::detectCores())
    df <- cbind.data.frame(Y, X)
    fit <- stan(
        model_code = model,
        data = list(
            n = nrow(df),
            k = ncol(df[, -1]),
            X = df[, -1],
            y = df[, 1]))
    
  3. Finally, let's print the parameter estimates in a tidy way.

    library(broom)
    tidy(fit, conf.int = TRUE)
    ## A tibble: 5 x 5
    #  term  estimate std.error conf.low conf.high
    #  <chr>    <dbl>     <dbl>    <dbl>     <dbl>
    #1 beta1   29.2      6.53   16.9        42.8
    #2 beta2    0.609    0.234   0.0149      0.889
    #3 beta3    0.207    0.138   0.00909     0.479
    #4 beta4    0.164    0.0954  0.00780     0.326
    #5 sigma   16.2      5.16    9.42       29.1
    

    We can also plot parameter estimates including CI.
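    A minimal sketch using rstan's built-in plot method (this assumes the `fit` object from step 2; the parameter names match the model block above):

    ```r
    # Plot posterior point estimates with credible intervals for the
    # four regression coefficients (sketch; assumes `fit` from step 2).
    plot(fit, pars = c("beta1", "beta2", "beta3", "beta4"))
    ```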

Note that parameter estimates are consistent with the imposed constraints.
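As a quick sanity check, this can also be verified programmatically from the tidy output (a sketch; assumes the `fit` object and `broom::tidy()` as above):

```r
# Extract point estimates and confirm each respects its bound
# (sketch; assumes `fit` from step 2 and broom::tidy as above).
library(broom)
est <- tidy(fit)
stopifnot(
    est$estimate[est$term == "beta2"] <= 0.899,
    est$estimate[est$term == "beta3"] >= 0,
    est$estimate[est$term == "beta3"] <= 0.5,
    est$estimate[est$term == "beta4"] >= 0,
    est$estimate[est$term == "beta4"] <= 0.334,
    est$estimate[est$term == "sigma"] > 0
)
```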

[Plot of parameter estimates with confidence intervals]

Maurits Evers
  • Super helpful. I was able to scale this approach to the full dataset and list of constraints. I am getting an error about a C++ compiler not being installed – something that I have tried to install but never seems to work. – Lindon Sep 07 '18 at 13:54