Questions tagged [feature-scaling]
51 questions
3
votes
1 answer
How to implement PySpark StandardScaler on subset of columns?
I want to use PySpark's StandardScaler on 6 of the 10 columns in my DataFrame. This will be part of a pipeline.
The inputCol parameter seems to expect a vector, which I can pass in after using VectorAssembler on all my features, but this scales all 10…

Insu Q
- 403
- 6
- 13
3
votes
3 answers
Normalizing data before removing low-variance features causes errors
I'm testing the iris dataset (which can be loaded with load_iris() from sklearn.datasets) with the scikit-learn functions normalize and VarianceThreshold.
It seems that if I use MinMaxScaler and then run VarianceThreshold, there are no…

Boom
- 1,145
- 18
- 44
2
votes
0 answers
Some columns became NaN after scaling
I'm trying to scale features with the following function:
def featureNormalize(X):
    '''
    This function takes the features as input and
    returns the normalized values, the mean, as well
    as the standard deviation for each feature.
    '''
    X_norm = (X -…

Krutch Dd
- 33
- 4
2
votes
1 answer
Applying Feature Scaling in a Neural Network
I have two questions:
Do I have to apply feature scaling to ALL features in a neural network (and in deep learning too)?
How can I scale categorical features in a dataset for a neural network (if needed)?

Andrei
- 73
- 1
- 13
2
votes
1 answer
Data normalization and rescaling value in Python
I have a dataset that contains URLs with a publish date (YYYY-MM-DD) and visits. I want to calculate a benchmark (average) of visits for a complete year. Pages were published on different dates, e.g. weightage/contribution of 1st page published in Aug…

ashish1780
- 47
- 1
- 10
2
votes
2 answers
Use same Min and Max Data for Multiple Features in MinMaxScaler
I have a dataset of 5 features. Two of these features are very similar but do not have the same min and max values.
... | feature 2 | feature 3 | ...
--------------------------------
..., 208.429993, 206.619995, ...
..., 207.779999, 205.050003,…

bcsta
- 1,963
- 3
- 22
- 61
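One possible reading of the question above is that a single minimum and maximum should be computed across the related features and then applied to each of them. The following is a minimal sketch in plain Python under that assumption; `shared_min_max_scale` is a hypothetical helper, not a scikit-learn function, and the sample values are the four numbers visible in the excerpt.

```python
def shared_min_max_scale(columns):
    """Scale several columns to [0, 1] with one min and max taken over all of them."""
    all_values = [v for col in columns for v in col]
    lo, hi = min(all_values), max(all_values)
    span = hi - lo
    if span == 0:
        # Every value is identical; map everything to 0.0 instead of dividing by zero.
        return [[0.0 for _ in col] for col in columns]
    return [[(v - lo) / span for v in col] for col in columns]


# The two similar features from the excerpt (first two rows only).
feature_2 = [208.429993, 207.779999]
feature_3 = [206.619995, 205.050003]

scaled_2, scaled_3 = shared_min_max_scale([feature_2, feature_3])
print(scaled_2, scaled_3)
```

Because both columns share the same bounds, equal raw values in either feature map to the same scaled value, which would not hold if each column were scaled with its own min and max.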
1
vote
1 answer
Understanding the Implications of Scaling Test Data Using the Same Scaler Object as Training Data
I am currently working on a machine learning project and have encountered a dilemma regarding the scaling of test data. I understand that when scaling features, we fit the scaler object using the training data and then transform both the training…

hadyaali
- 13
- 3
1
vote
1 answer
Strange results when scaling data using scikit learn
I have an input dataset that has 4 time series with 288 values for 80 days, so the actual shape is (80, 4, 288). I would like to cluster different days. I have 80 days and all of them have 4 time series: outside temperature, solar radiation, electrical…

PeterBe
- 700
- 1
- 17
- 37
1
vote
1 answer
Do features need to be scaled in Logistic Regression?
I have a training set with one feature (credit balance), with values ranging from 0 to 20,000. The response is either 0 (Default=No) or 1 (Default=Yes). This was a simulated training set generated using a logistic function. For reference it is available…

Anirban Chakraborty
- 539
- 1
- 5
- 15
1
vote
1 answer
Feature Scaling for Time Series Forecasting
I am in the process of conducting a time series analysis (a multivariate time series, to be precise), and before feeding the inputs to my LSTM model, I have scaled them. The metrics that I am using to evaluate my model are the loss and mean absolute…

Minura Punchihewa
- 1,498
- 1
- 12
- 35
1
vote
1 answer
Invert feature scaling
In my dataset I have a binary Target (0 or 1) variable, and 8 features: nchar, rtc, Tmean, week_day, hour, ntags, nlinks and nex. week_day is a factor while the others are numeric. I built a decision tree classifier, but my question concerns the…

Mark
- 1,577
- 16
- 43
1
vote
1 answer
MySQL feature-scaling calculation
I need to formulate a MySQL query to select values normalized this way:
normalized = (value-min(values))/(max(values)-min(values))
My attempt looks like this:
select
Measurement_Values.Time,
…

Andrea G
- 63
- 5
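The normalization formula quoted in the question above can be checked on a small list before it is translated into a query. This is a minimal sketch in plain Python, not the asker's SQL; `min_max_normalize` and the sample `readings` are hypothetical names and values chosen for illustration.

```python
def min_max_normalize(values):
    """Apply normalized = (value - min(values)) / (max(values) - min(values))."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # The denominator would be zero, so the formula is undefined here.
        raise ValueError("all values are equal; cannot normalize")
    return [(v - lo) / (hi - lo) for v in values]


readings = [10.0, 15.0, 20.0]  # illustrative data only
normalized = min_max_normalize(readings)
print(normalized)  # [0.0, 0.5, 1.0]
```

Note that the minimum and maximum are aggregates over the whole set of values, while the subtraction is applied to each value, so both have to be computed before any single row can be normalized.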
1
vote
1 answer
Is it right to use different feature scaling techniques to different features?
I read this post about feature scaling:
all-about-feature-scaling
The two main feature scaling techniques are:
Min-max scaler, which works well for features whose distributions are not Gaussian.
Standard scaler, which works well for…

user3668129
- 4,318
- 6
- 45
- 87
1
vote
0 answers
Why do we call fit() only on the training data when scaling?
In feature scaling, we call the fit() method only on the training data,
and not on the validation or test data.
Why don't we use the mean and standard deviation of the validation or test data when scaling them?

Taehyun Kim
- 11
- 2
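The fit-on-train, transform-everywhere pattern asked about above can be sketched without any library. The class below is a hypothetical stand-in that only mirrors the fit()/transform() split of a standard scaler; it is not scikit-learn's implementation, and the data are illustrative.

```python
from statistics import mean, pstdev


class SimpleStandardScaler:
    """Hypothetical minimal scaler: learns mean and std in fit(), reuses them in transform()."""

    def fit(self, values):
        self.mean_ = mean(values)
        self.std_ = pstdev(values)  # population standard deviation
        if self.std_ == 0:
            raise ValueError("constant feature; standard deviation is zero")
        return self

    def transform(self, values):
        # Always uses the statistics learned in fit(), never those of `values`.
        return [(v - self.mean_) / self.std_ for v in values]


train = [1.0, 2.0, 3.0, 4.0, 5.0]
test = [6.0, 7.0]

scaler = SimpleStandardScaler().fit(train)  # statistics come from training data only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)        # same mean and std applied to test data
print(train_scaled, test_scaled)
```

Scaling the test values with the training mean and standard deviation keeps them on the scale the model saw during training. Refitting on the test set would shift that scale and let information from the test data influence preprocessing.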
1
vote
0 answers
Should we normalize / standardize / feature-scale a categorical variable?
variable - 'Item_Fat_Content'
values - 'Low Fat', 'Regular', 'High fat', 'No fat'
When label-encoded, these values become 0, 1, 2, 3. After standardising, they will take numerical values such as 0.0, 0.4, 0.5, 0.9.
Will Python…

harini shre
- 11
- 2