I'm using the following code to scale values from one interval to another. "outputs_max" and "outputs_min" are NumPy arrays, and as a result so are "slope" and "intercept".

For clarity when displaying the result "scaled_outputs", I used pandas to create a DataFrame, "output_array", from the file "out.npy". The result "scaled_outputs" is therefore also a DataFrame, which is later saved as a NumPy file.

 import pandas as pd
 import numpy as np
 output_file = np.load("U:\\out.npy")
 output_array = pd.DataFrame(output_file)

 desired_upper_bound = 1
 desired_lower_bound = 0
 slope = (desired_upper_bound - desired_lower_bound) / (outputs_max - outputs_min)
 intercept = desired_upper_bound - (slope * outputs_max)
 scaled_outputs = slope * output_array + intercept
 np.save("U:\\scaled_outputs.npy", scaled_outputs)

Am I losing accuracy by creating a DataFrame and passing it into the equation? Would it be better to pass the NumPy array "output_file" into the equation and create a DataFrame from "scaled_outputs" afterwards?

The result in the console is displayed with at most 5 decimal places, which is why I'm asking.

cs95
Dorian IL
  • I'm not asking specifically how to display more decimals in the console; I just wanted to know whether I lose accuracy by passing a DataFrame into the given equation. Thanks for the answer, though. – Dorian IL Dec 14 '18 at 14:13

1 Answer

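No, you're not losing precision or accuracy by using a DataFrame in your equation. A DataFrame stores its values in NumPy arrays and keeps their dtype, so the arithmetic is carried out on the same floating-point numbers either way. What you're seeing on the console is a result of display precision, which only affects how values are printed. You can change the display.precision option to see more digits when a DataFrame is displayed.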

 pd.set_option("display.precision", 10)
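If you want to check this yourself, a minimal sketch along the following lines compares the two routes. The random data is a hypothetical stand-in for the contents of "out.npy", and the per-column "outputs_max" and "outputs_min" are assumptions, since the question does not show how they are computed.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the contents of out.npy (float64 values).
rng = np.random.default_rng(0)
output_file = rng.uniform(-50.0, 50.0, size=(4, 3))
output_array = pd.DataFrame(output_file)

# Assumed per-column bounds; the question does not show how these are built.
outputs_max = output_file.max(axis=0)
outputs_min = output_file.min(axis=0)

desired_upper_bound = 1
desired_lower_bound = 0
slope = (desired_upper_bound - desired_lower_bound) / (outputs_max - outputs_min)
intercept = desired_upper_bound - slope * outputs_max

# Route 1: scale the DataFrame. Route 2: scale the raw NumPy array.
scaled_from_frame = slope * output_array + intercept
scaled_from_array = slope * output_file + intercept

# Compare dtype and values of the two results.
same_dtype = scaled_from_frame.to_numpy().dtype == scaled_from_array.dtype
same_values = np.array_equal(scaled_from_frame.to_numpy(), scaled_from_array)
print(same_dtype, same_values)

# Printing with more digits changes only the display, not the stored data.
with pd.option_context("display.precision", 10):
    print(scaled_from_frame)
```

Using pd.option_context instead of pd.set_option keeps the higher display precision limited to the with block, which is handy if you only want a one-off look at more digits.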
Bill the Lizard