Questions tagged [standardized]

Shifting and rescaling data to ensure zero mean and unit variance.

Overview

Specifically, when $x_i$, $i = 1, \ldots, n$, is a batch of data, its mean is

$$m = \frac{1}{n} \sum_{i=1}^{n} x_i$$

and its variance is

$$s^2 = \frac{1}{\nu} \sum_{i=1}^{n} (x_i - m)^2$$

where $\nu$ is either $n$ or $n - 1$ (the choice varies with the application).

Standardization replaces each $x_i$ with $z_i = (x_i - m)/s$. Do not confuse standardization with normalization.
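
The formulas above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `standardize` and its `ddof` parameter are assumptions made for this example (`ddof=0` divides by n, `ddof=1` divides by n-1), and the sample data is made up.

```python
import math


def standardize(xs, ddof=0):
    """Return z-scores (x - m) / s for a list of numbers.

    ddof is a hypothetical parameter for this sketch:
    ddof=0 divides the sum of squares by n, ddof=1 by n - 1.
    """
    n = len(xs)
    if n - ddof <= 0:
        raise ValueError("need more data points than ddof")
    m = sum(xs) / n                                   # mean
    s2 = sum((x - m) ** 2 for x in xs) / (n - ddof)   # variance
    if s2 == 0:
        raise ValueError("constant data cannot be standardized")
    s = math.sqrt(s2)                                 # standard deviation
    return [(x - m) / s for x in xs]


# Made-up sample: mean 5, variance 4 (dividing by n), so s = 2.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(data)
print(z)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

The choice of divisor matters when comparing tools: the same data standardized with n and with n-1 gives slightly different z-scores, so it is worth checking which convention a given library uses before comparing results.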


Tag usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning, and data analysis.

145 questions
161
votes
9 answers

Can anyone explain me StandardScaler?

I am unable to understand the page of the StandardScaler in the documentation of sklearn. Can anyone explain this to me in simple terms?
nitinvijay23
  • 1,781
  • 3
  • 13
  • 11
56
votes
3 answers

extracting standardized coefficients from lm in R

My apologies for the dumb question...but I can't seem to find a simple solution I want to extract the standardized coefficients from a fitted linear model (in R) there must be a simple way or function that does that. can you tell me what is it? EDIT…
amit
  • 3,332
  • 6
  • 24
  • 32
32
votes
2 answers

Data Standardization vs Normalization vs Robust Scaler

I am working on data preprocessing and want to compare the benefits of Data Standardization vs Normalization vs Robust Scaler practically. In theory, the guidelines are: Advantages: Standardization: scales features such that the distribution is…
Mike
  • 551
  • 1
  • 6
  • 18
32
votes
5 answers

processing strings of text for neural network input

I understand that ANN input must be normalized, standardized, etc. Leaving the peculiarities and models of various ANN's aside, how can I preprocess UTF-8 encoded text within the range of {0,1} or alternatively between the range {-1,1} before it is…
Ælex
  • 14,432
  • 20
  • 88
  • 129
22
votes
3 answers

Python not a standardized language?

I stumbled upon this 'list of programming' languages and found that popular languages like Python are not standardized? Why is that, and what does 'Standardized' mean anyway?
eozzy
  • 66,048
  • 104
  • 272
  • 428
20
votes
4 answers

How to store scaling parameters for later use

I want to apply the scaling sklearn.preprocessing.scale module that scikit-learn offers for centering a dataset that I will use to train an svm classifier. How can I then store the standardization parameters so that I can also apply them to the…
LetsPlayYahtzee
  • 7,161
  • 12
  • 41
  • 65
16
votes
1 answer

How to standardize data with sklearn's cross_val_score()

Let's say I want to use a LinearSVC to perform k-fold-cross-validation on a dataset. How would I perform standardization on the data? The best practice I have read is to build your standardization model on your training data then apply this model to…
als5ev
  • 175
  • 1
  • 5
12
votes
4 answers

Standardize some columns in Python Pandas dataframe?

Python code below only return me an array, but I want the scaled data to replace the original data. from sklearn.preprocessing import StandardScaler df = StandardScaler().fit_transform(df[['cost', 'sales']]) df output array([[ 1.99987622,…
BigData
  • 397
  • 2
  • 3
  • 13
9
votes
3 answers

How to standardize a data frame which contains both numeric and factor variables

My data frame, my.data, contains both numeric and factor variables. I want to standardise just the numeric variables in this data frame. > mydata2=data.frame(scale(my.data, center=T, scale=T)) Error in colMeans(x, na.rm = TRUE) : 'x' must be…
wilga
  • 103
  • 2
  • 7
9
votes
2 answers

Undo L2 Normalization in sklearn python

Once I normalized my data with an sklearn l2 normalizer and use it as training data: How do I turn the predicted output back to the "raw" shape? In my example I used normalized housing prices as y and normalized living space as x. Each used to fit…
5
votes
13 answers

Should developer tools, languages, frameworks, etc. be standardized across an organization?

The organization that I currently work for seems to be heading in the direction of dictating to software developers which tools, languages, frameworks, etc. must be used. However, nobody has convinced me that this is a good thing. The main argument…
rich
  • 989
  • 2
  • 9
  • 17
4
votes
1 answer

Why two different normalized results from Python vs R

Can anyone explain the math behind the scenes? why Python and R return me the different result? which one should I use for real-world business scenario? original data id cost sales item 1 300 50 pen 2 3 88 wf 3 1 …
BigData
  • 397
  • 2
  • 3
  • 13
4
votes
3 answers

R Standardizing numeric variables in dataframe while retaining factor variables

I have a dataframe (dcc) loaded in R which I have narrowed down to complete cases. str(dcc) 'data.frame': 41715 obs. of 9 variables: $ XCoord : num 661382 661412 661442 661472 661502 ... $ YCoord : num …
lambertj
  • 127
  • 1
  • 2
  • 12
4
votes
1 answer

Python (sklearn) - Why am I getting the same prediction for every testing tuple in SVR?

The answers to the similar questions on stackoverflow suggests to change the parameter values in the instance SVR(), but I don't understand how to deal with them. Here's the code that I am using: import json import numpy as np from sklearn.svm…
jatin
  • 635
  • 7
  • 8
4
votes
1 answer

Standardized Error Classification & Handling

I need to standardize on how I classify and handle errors/exceptions 'gracefully'. I currently use a process by which I report the errors to a function passing an error-number, severity-code, location-info and extra-info-string. This function…
slashmais
  • 7,069
  • 9
  • 54
  • 80