8

I am running a polynomial regression using scikit-learn. I have a large number of variables (23 to be precise) which I am trying to regress using polynomial regression with degree 2.

Setting `interaction_only=True` keeps only the interaction terms such as X1*Y1, X2*Y2, and so on.

I want only the other terms, i.e. X1, X1^2, Y1, Y1^2, and so on.

Is there a function to get this?

2 Answers

10

There is no such function, because the transformation can easily be expressed with numpy itself.

import numpy as np
X = ...
new_X = np.hstack((X, X**2))  # columns of X followed by their squares

And analogously, if you want to add every power up to degree k:

new_X = np.hstack([X**(i+1) for i in range(k)])  # pass a list: np.hstack needs a sequence, not a generator
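To make the column layout concrete, here is a minimal sketch on a tiny made-up input. The helper name `pure_power_features` and the sample values are assumptions for illustration only; they are not part of numpy or scikit-learn.

```python
import numpy as np

def pure_power_features(X, k):
    """Stack X, X**2, ..., X**k side by side (no interaction terms).

    Hypothetical helper, not a library function.
    """
    X = np.asarray(X, dtype=float)
    # One block of columns per power, in increasing order of degree.
    return np.hstack([X ** p for p in range(1, k + 1)])

# Made-up data: 2 samples, 2 features.
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])

new_X = pure_power_features(X, 3)
print(new_X)
# [[ 1.  2.  1.  4.  1.  8.]
#  [ 3.  4.  9. 16. 27. 64.]]
```

The columns come out grouped by power: all first-degree columns, then all squares, then all cubes.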
lejlot
  • I was using the following: `poly = PolynomialFeatures(degree=2); X_new = poly.fit_transform(X)`. Since X was a list of 100-odd lists of size 23 each, X_new would be a matrix with each row corresponding to all combinations leading to degree 2. So, as per your answer, I would need to manually create this matrix by iterating through X and using np.hstack and np.vstack. Is my deduction correct? – Harshavardhan Ramanna Aug 04 '16 at 04:45
  • I'll do you one better: `np.power(x, np.arange(k))` – tiao Apr 04 '21 at 22:31
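On the question raised in the comments: as far as I can tell, no manual iteration over the rows is needed, since numpy applies the power elementwise to the whole matrix at once. The sketch below assumes made-up random data shaped like the 100 x 23 list of lists described above, and also shows a broadcasting variant in the spirit of the `np.power` comment. Note that `np.arange(k)` starts at exponent 0, so the sketch uses `np.arange(1, k + 1)` to get powers 1 through k.

```python
import numpy as np

# Made-up stand-in for the data in the comment: a list of 100 lists of 23 floats.
rng = np.random.default_rng(0)
X_list = rng.normal(size=(100, 23)).tolist()

X = np.asarray(X_list)  # shape (100, 23); no per-row loop needed
k = 2

# Variant 1: stack whole-matrix powers side by side.
stacked = np.hstack([X ** p for p in range(1, k + 1)])
print(stacked.shape)  # (100, 46)

# Variant 2: broadcasting. X[:, :, None] has shape (100, 23, 1) and the
# exponents have shape (k,), so the result has shape (100, 23, k).
exponents = np.arange(1, k + 1)
powered = np.power(X[:, :, None], exponents)

# Flattening the last two axes interleaves the columns per feature:
# x1, x1^2, x2, x2^2, ...
interleaved = powered.reshape(X.shape[0], -1)
print(interleaved.shape)  # (100, 46)
```

Both variants hold the same values; only the column order differs (grouped by power versus grouped by feature), which does not matter for a linear regression fit as long as you keep track of which column is which.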
0

I know this thread is super old, but folks like me who are just getting started can use patsy. Check out the answer discussed here -> how to the remove interaction-only columns from sklearn.preprocessing.PolynomialFeatures
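If you would rather stay inside scikit-learn, one possible approach (a sketch of my own, not taken from the linked answer) is to fit `PolynomialFeatures` as usual and then keep only the output columns whose exponent row in the fitted `powers_` attribute involves a single input feature. The sample data and variable names below are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Made-up data: 5 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

# include_bias=False drops the constant column of ones.
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)  # 3 linear + 6 degree-2 columns = 9

# powers_[i, j] is the exponent of input feature j in output column i.
# A "pure" term has exactly one non-zero exponent in its row.
pure_mask = np.count_nonzero(poly.powers_, axis=1) == 1

X_pure = X_poly[:, pure_mask]
print(X_pure.shape)  # (5, 6)
```

For degree 2 this should give the same matrix as `np.hstack((X, X**2))`, so the numpy one-liner is simpler. The mask is mostly useful when the transformer is already part of a pipeline.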