I am using scikit-learn's PolynomialFeatures to expand my X values into polynomial features.

Before the transformation, my X values are:

[[ 1 11]
 [ 2 12]
 [ 3 13]
 [ 4 14]
 [ 5 15]
 [ 6 16]
 [ 7 17]
 [ 8 18]
 [ 9 19]
 [10 20]]

Note: X has more than one column, which means there is more than one independent variable. I transform it with this code:

from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)
print(X_poly)

scikit-learn returns the following matrix, which has more columns than just the squared values:

[[  1.   1.  11.   1.  11. 121.]
 [  1.   2.  12.   4.  24. 144.]
 [  1.   3.  13.   9.  39. 169.]
 [  1.   4.  14.  16.  56. 196.]
 [  1.   5.  15.  25.  75. 225.]
 [  1.   6.  16.  36.  96. 256.]
 [  1.   7.  17.  49. 119. 289.]
 [  1.   8.  18.  64. 144. 324.]
 [  1.   9.  19.  81. 171. 361.]
 [  1.  10.  20. 100. 200. 400.]]

While searching for this issue, I found this Stack Overflow answer: https://stackoverflow.com/a/51906400/12188405

Can anyone give me a general formula or Python code that produces this matrix for any degree? In other words, I want a Python function that takes the degree as its one parameter (any non-negative integer) and returns the same matrix that scikit-learn gives.

2 Answers

I suggest reading the source code of scikit-learn's PolynomialFeatures.

It has two different modes, selected by the interaction_only parameter:

  1. interaction_only=True
    • combinations('ABCD', 2) gives AB AC AD BC BD CD
  2. interaction_only=False
    • combinations_with_replacement('ABCD', 2) gives AA AB AC AD BB BC BD CC CD DD

The first mode uses combinations from the itertools package, and the second uses combinations_with_replacement, to choose which input columns are multiplied together for each new feature.
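
The idea above can be sketched in a few lines of NumPy. This is only an illustration, not scikit-learn's actual implementation: the function name polynomial_features and its parameters are hypothetical, and it assumes a dense 2-D input and a bias column that is always included. For each degree from 0 up to the requested one, it multiplies together the columns selected by each index combination.

```python
from itertools import combinations, combinations_with_replacement

import numpy as np


def polynomial_features(X, degree, interaction_only=False):
    """Hypothetical helper: expand the columns of X into polynomial terms."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    # Pick the itertools function that matches the interaction_only mode.
    pick = combinations if interaction_only else combinations_with_replacement

    columns = []
    for d in range(degree + 1):
        for idx in pick(range(n_features), d):
            # Start from ones; the empty combination (d == 0) is the bias column.
            col = np.ones(n_samples)
            for j in idx:
                col = col * X[:, j]
            columns.append(col)
    return np.column_stack(columns)


# The X from the question: columns 1..10 and 11..20.
X = np.column_stack([np.arange(1, 11), np.arange(11, 21)])
print(polynomial_features(X, degree=2))
```

With the X from the question and degree=2, the first row comes out as 1, 1, 11, 1, 11, 121, in the same column order as the matrix shown in the question (1, a, b, a^2, ab, b^2).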

Reza Soltani

You could use the get_feature_names() method (replaced by get_feature_names_out() in scikit-learn 1.0 and removed in 1.2) to check the names of the columns in the returned matrix:

from sklearn.preprocessing import PolynomialFeatures
import numpy as np


X = np.arange(6).reshape(3, 2)

poly = PolynomialFeatures(10)
poly.fit(X)
poly.get_feature_names(['first', 'second'])

which will output

Out[12]:
['1',
 'first',
 'second',
 'first^2',
 'first second',
 'second^2',
 'first^3',
 'first^2 second',
 'first second^2',
 'second^3',
 'first^4',
 'first^3 second',
 'first^2 second^2',
 'first second^3',
 'second^4',
 'first^5',
 'first^4 second',
 'first^3 second^2',
 'first^2 second^3',
 'first second^4',
 'second^5',
 'first^6',
 'first^5 second',
 'first^4 second^2',
 'first^3 second^3',
 'first^2 second^4',
 'first second^5',
 'second^6',
 'first^7',
 'first^6 second',
 'first^5 second^2',
 'first^4 second^3',
 'first^3 second^4',
 'first^2 second^5',
 'first second^6',
 'second^7',
 'first^8',
 'first^7 second',
 'first^6 second^2',
 'first^5 second^3',
 'first^4 second^4',
 'first^3 second^5',
 'first^2 second^6',
 'first second^7',
 'second^8',
 'first^9',
 'first^8 second',
 'first^7 second^2',
 'first^6 second^3',
 'first^5 second^4',
 'first^4 second^5',
 'first^3 second^6',
 'first^2 second^7',
 'first second^8',
 'second^9',
 'first^10',
 'first^9 second',
 'first^8 second^2',
 'first^7 second^3',
 'first^6 second^4',
 'first^5 second^5',
 'first^4 second^6',
 'first^3 second^7',
 'first^2 second^8',
 'first second^9',
 'second^10']
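
The length of that list follows a simple counting rule: with n input features and degree d, and with the bias column included and interaction_only left at False, the number of output columns is the binomial coefficient C(n + d, d). A small sketch to check this (the helper name n_output_columns is made up for illustration):

```python
from math import comb


def n_output_columns(n_features, degree):
    """Hypothetical helper: count monomials of total degree <= degree."""
    return comb(n_features + degree, degree)


print(n_output_columns(2, 2))   # 6, the six columns in the question's matrix
print(n_output_columns(2, 10))  # 66, the number of names listed above
```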
Niko Föhr