-2

I have a dataset of positive integers that I want to scale so that the output range is [0.0,1.0] and the median maps to 0.5.

  1. Is this possible to do at all?

  2. If so, how can I do it in Python using scipy or sklearn?

andrei
  • 1
    If you enter "normalize data" into your browser's search window, you will get many useful references -- more than we can -- or should -- provide here. – Prune May 30 '20 at 19:14

2 Answers

0

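This is mathematically impossible in general with a linear scaling plus translation (that is, x'[i] = a*x[i] + b). Requiring min(x) to map to 0 and max(x) to map to 1 already fixes a and b, so the median lands at (median(x) - min(x)) / (max(x) - min(x)), which equals 0.5 only when the median happens to sit exactly halfway between the minimum and the maximum.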
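If a non-linear map is acceptable, one way around the limitation is a piecewise-linear rescaling: one linear segment from the minimum to the median, and a second one from the median to the maximum. This is a different technique from the single linear transform discussed above, and the helper name `piecewise_scale` is made up for illustration. A minimal sketch, assuming min < median < max:

```python
from statistics import median

def piecewise_scale(xs):
    """Map min -> 0.0, median -> 0.5, max -> 1.0 using two linear segments.

    Hypothetical helper for illustration; assumes min < median < max.
    """
    lo, mid, hi = min(xs), median(xs), max(xs)
    if not lo < mid < hi:
        raise ValueError("need min < median < max")
    out = []
    for x in xs:
        if x <= mid:
            # Lower half: stretch [lo, mid] onto [0.0, 0.5].
            out.append(0.5 * (x - lo) / (mid - lo))
        else:
            # Upper half: stretch [mid, hi] onto [0.5, 1.0].
            out.append(0.5 + 0.5 * (x - mid) / (hi - mid))
    return out

data = [1, 2, 3, 4, 100]
scaled = piecewise_scale(data)
print(scaled)
```

The map is monotone, so the ordering of the data is preserved, but distances are compressed differently on each side of the median. With an even number of points the median is the average of the two middle values, so it maps to 0.5 even though no data point does.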

orlp
  • Actually, there is something called Normalized Rank, which gives an output in the range [0.0, 1.0] such that the median is always 0.5. See [here](https://people.revoledu.com/kardi/tutorial/Similarity/Normalized-Rank.html) for more details. – Pratheek Ponnuru Mar 21 '23 at 16:00
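
The rank-based idea from the comment above could be sketched with `scipy.stats.rankdata`. The formula (rank - 1) / (n - 1) is an assumption about what "normalized rank" means here and may not match the linked page exactly; `normalized_rank` is a made-up helper name.

```python
import numpy as np
from scipy.stats import rankdata

def normalized_rank(values):
    """Map values to [0, 1] by rank: (rank - 1) / (n - 1).

    Hypothetical helper; tied values share their average rank.
    """
    x = np.asarray(values, dtype=float)
    if x.size < 2:
        raise ValueError("need at least two values")
    ranks = rankdata(x, method="average")  # ranks run from 1 to n
    return (ranks - 1) / (x.size - 1)

data = [3, 10, 1, 250, 7]
scaled = normalized_rank(data)
print(scaled)             # the median value 7 maps to 0.5
print(np.median(scaled))
```

This keeps only the ordering of the data, not the spacing between values. If the smallest or largest value is tied, the average rank means the output no longer reaches exactly 0.0 or 1.0.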
0

Scaling the data to the [0,1] interval is easy. It should be x = (x - min(x)) / (max(x) - min(x)).

If you just wanted the data to have a median of 0.5 without the [0,1] requirement, you could do x = 0.5*x/median(x).

But if you want both to be true, it generally can't be done with a single linear rescaling.
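
To see the conflict on concrete numbers, here is a small sketch of the two formulas above in plain Python. The sample data and the function names are made up for illustration, and the data is assumed to contain at least two distinct values.

```python
from statistics import median

def min_max_scale(xs):
    """x -> (x - min) / (max - min); assumes max > min."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def median_scale(xs):
    """x -> 0.5 * x / median; assumes a non-zero median."""
    m = median(xs)
    return [0.5 * x / m for x in xs]

data = [1, 2, 3, 4, 100]  # skewed sample, median is 3

mm = min_max_scale(data)
ms = median_scale(data)

# Min-max scaling hits [0, 1], but the median ends up far below 0.5.
print(min(mm), max(mm), median(mm))

# Median scaling puts the median at 0.5, but the maximum overshoots 1.
print(median(ms), max(ms))
```

On this sample the min-max version puts the median at 2/99, and the median version sends the maximum to about 16.7, so neither transform satisfies both requirements.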

user3433489