I have been tasked with programming a Dantzig selector in Python, but I was given no guidelines and have little experience with linear programming or data science. I cannot find the information I need in the manuals of LP modules or in other questions on this site.
This is the problem. I am looking for the column vector β̂. I am sorry that I have no code for this part of my program; I do not know how to approach it. I tried several approaches, but none reflected the problem correctly, so I rejected and deleted them.
$$\min_{\hat{\beta}} \|\hat{\beta}\|_{\ell_1} \quad \text{s.t.} \quad \|X^T (y - X\hat{\beta})\|_{\ell_\infty} \le \delta$$
It can be rewritten as a linear program. Here are more details that may be useful.
- β̂ is a k×1 column vector and the Dantzig selector I am looking for
- y is an n×1 column vector of observations/responses
- X is an n×k sample matrix, where k >> n
- δ is a nonnegative tuning parameter tied to the noise level
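For reference, here is one common way to write the problem as a linear program. It is a sketch, not the only possible formulation. It assumes the usual trick of splitting β̂ into two nonnegative parts, called u and v here (both names are introduced only for illustration), so that the absolute values disappear:

```latex
\begin{aligned}
\min_{u,\,v \in \mathbb{R}^k} \quad & \mathbf{1}^T (u + v) \\
\text{s.t.} \quad & X^T X (u - v) \le X^T y + \delta \mathbf{1}, \\
& -X^T X (u - v) \le \delta \mathbf{1} - X^T y, \\
& u \ge 0, \quad v \ge 0, \qquad \hat{\beta} = u - v .
\end{aligned}
```

The two inequality blocks together say that every entry of X^T(y - Xβ̂) lies between -δ and δ, which is the ℓ∞ constraint. The result has 2k variables and 2k inequality constraints in the "minimize c^T x subject to A x ≤ b, x ≥ 0" form that general-purpose LP solvers accept.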
Here is my working code so far. The data values are all just samples/placeholders. I have already prepared X, y, and some δ values. However, I cannot find the right LP function to give me β̂.
import numpy as np
import random
import math
#n = no. runs = 5
n = 5
#k = no. variables = 23
k = 23
#y = vector of observations/responses (nx1, binary decisions)
y = np.array([[1],
[0],
[0],
[1],
[0]])
#X = predictor/sample matrix (nxk)
X = np.array([[1.1, 0, 0.7, 0.8, 0.9, 0.2, 0.3, 0.5, 0.2, 0.2, 1.2, 1.1, 0.5, 0.5, 0.7, 1.2, 1.3, 0.8, 0.9, 1.7, 1.2, 1.9, 0.9],
[0.3, 0.1, 0.7, 0.4, 0.9, 0.9, 0.1, 0.8, 0.1, 0.2, 1.1, 0, 0.9, 0.4, 1.4, 1.4, 0.1, 0.5, 1.8, 1.6, 1.2, 1.8, 0.3],
[0.1, 0.1, 0.3, 0.9, 0.7, 0.8, 0, 0.7, 0.8, 0.2, 1.1, 1.1, 0.5, 0.5, 0.8, 1.5, 0.2, 0.5, 1.6, 1.5, 1.2, 1.7, 0.5],
[1.2, 0.2, 0.9, 0.8, 0.6, 0.2, 0.3, 0.5, 0.3, 0.2, 1.2, 1.1, 0.5, 0, 0.7, 1.2, 1.3, 0.8, 0.9, 1.7, 1.2, 1.9, 0.9],
[0.2, 0.1, 0.6, 0, 0.5, 1.1, 0.2, 0.5, 0.9, 0.2, 1.2, 1.1, 0.8, 1.6, 0.5, 1.3, 0.2, 0.5, 1.7, 1.2, 1.2, 1.9, 0.1]])
#estimate missing data (0) as half of each row's smallest positive value
X_row_minima = np.where(X>0,X,X.max()).min(axis=1, keepdims=True)
X = np.where(X==0, X_row_minima/2, X)
#unit length normalize X (axis=1 scales each row; use axis=0 to scale each column/predictor instead)
X = X/np.linalg.norm(X, ord=2, axis=1, keepdims=True)
#standardize y to zero mean and unit variance
y = (y - np.mean(y)) / np.std(y)
#transpose X (kxn)
Xt = np.transpose(X)
#solve d0: the largest |X^T y| entry, i.e. the smallest d for which beta = 0 is feasible
Xty = np.matmul(Xt,y)
d0 = np.max(np.abs(Xty))
#generate 100 evenly-spaced d values
d = np.linspace(0, d0, 100)
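A minimal sketch of how a general LP solver could be pointed at this problem, assuming SciPy is installed and using `scipy.optimize.linprog`. The helper name `dantzig_selector`, the split of β̂ into nonnegative parts `u` and `v`, and the seeded random demo data are all assumptions made for illustration; they are not part of my code above.

```python
import numpy as np
from scipy.optimize import linprog


def dantzig_selector(X, y, delta):
    """Hypothetical helper: min ||b||_1  s.t.  ||X^T (y - X b)||_inf <= delta."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    k = X.shape[1]
    G = X.T @ X   # k x k Gram matrix
    c = X.T @ y   # length-k vector X^T y

    # Write b = u - v with u, v >= 0; the LP variable is [u, v] (length 2k).
    cost = np.ones(2 * k)  # sum(u) + sum(v) equals ||b||_1 at the optimum

    # -delta <= c - G (u - v) <= delta, as two one-sided inequality blocks.
    A_ub = np.block([[G, -G], [-G, G]])
    b_ub = np.concatenate([c + delta, delta - c])

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:k] - res.x[k:]


# Demo on synthetic placeholder data (assumed, seeded for repeatability).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 8))
X_demo /= np.linalg.norm(X_demo, axis=0, keepdims=True)  # unit-length columns
y_demo = rng.normal(size=5)

d0_demo = np.max(np.abs(X_demo.T @ y_demo))
beta_half = dantzig_selector(X_demo, y_demo, 0.5 * d0_demo)
beta_full = dantzig_selector(X_demo, y_demo, d0_demo)

# Largest absolute correlation between the residual and the columns of X.
resid_corr = np.max(np.abs(X_demo.T @ (y_demo - X_demo @ beta_half)))
```

With X, y, and d prepared as above, the same helper could be called once per value in d, for example `[dantzig_selector(X, y, di) for di in d]`. At the largest value, d0, the all-zero vector is already feasible, so the returned vector should be zero there.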
This is my first post on this site. I apologize for the lack of details in the post compared to others.