
I am trying to plot the data shown below in a normalised way, in order to have the maximum value on the y-axis equal to 1.

Dataset:

    %_F     %_M     %_C     %_D    Label
0   0.00    0.00    0.08    0.05    0.0
1   0.00    0.00    0.00    0.14    0.0
2   0.00    0.00    0.10    0.01    1.0
3   0.01    0.01    0.07    0.05    1.0
4   0.00    0.00    0.07    0.14    0.0
6   0.00    0.00    0.07    0.05    0.0
7   0.00    0.00    0.05    0.68    0.0
8   0.00    0.00    0.03    0.09    0.0
9   0.00    0.00    0.04    0.02    0.0
10  0.00    0.00    0.06    0.02    0.0

I tried as follows:

cols_to_norm = ["%_F", "%_M", "%_C", "%_D"]
df[cols_to_norm] = df[cols_to_norm].apply(lambda x: (x - x.min()) / (x.max() - x.min()))

but I am not completely sure about the output. In fact, if I plot as follows

df.pivot_table(index='Label').plot.bar() 

I get a different result. I think this is because the first snippet normalises each column over all rows and does not take the grouping on Label into account.
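The difference can be checked directly. The following is a minimal sketch, assuming the ten rows above are loaded into `df` (built by hand here so it runs on its own); the names `normed`, `row_max`, `group_means` and `bar_max` are only illustrative. It shows that after per-column min-max scaling every column peaks at 1 across the rows, but the per-Label means that `pivot_table` plots are averages of those values and so stay below 1.

```python
import pandas as pd

# The dataset from the question, typed in by hand (row 5 is absent in the original).
df = pd.DataFrame(
    {
        "%_F": [0.00, 0.00, 0.00, 0.01, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
        "%_M": [0.00, 0.00, 0.00, 0.01, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
        "%_C": [0.08, 0.00, 0.10, 0.07, 0.07, 0.07, 0.05, 0.03, 0.04, 0.06],
        "%_D": [0.05, 0.14, 0.01, 0.05, 0.14, 0.05, 0.68, 0.09, 0.02, 0.02],
        "Label": [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    },
    index=[0, 1, 2, 3, 4, 6, 7, 8, 9, 10],
)

cols_to_norm = ["%_F", "%_M", "%_C", "%_D"]

# Same min-max scaling as in the question, applied to a copy.
normed = df.copy()
normed[cols_to_norm] = normed[cols_to_norm].apply(
    lambda x: (x - x.min()) / (x.max() - x.min())
)

# Largest value per column across individual rows: 1 for every column.
row_max = normed[cols_to_norm].max()

# pivot_table defaults to the mean, so the bars show per-Label averages.
group_means = normed.pivot_table(index="Label")

# Tallest bar per column: below 1, because it is a mean of scaled values.
bar_max = group_means[cols_to_norm].max()

print(row_max)
print(bar_max)
```

So the scaling itself behaves as intended; it is the averaging step before plotting that pulls the bar heights under 1.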

LdM

1 Answer

  • There are multiple techniques for normalizing data
  • This shows one technique (min-max scaling) that uses only native pandas
import io
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv(io.StringIO("""    %_F     %_M     %_C     %_D    Label
0   0.00    0.00    0.08    0.05    0.0
1   0.00    0.00    0.00    0.14    0.0
2   0.00    0.00    0.10    0.01    1.0
3   0.01    0.01    0.07    0.05    1.0
4   0.00    0.00    0.07    0.14    0.0
6   0.00    0.00    0.07    0.05    0.0
7   0.00    0.00    0.05    0.68    0.0
8   0.00    0.00    0.03    0.09    0.0
9   0.00    0.00    0.04    0.02    0.0
10  0.00    0.00    0.06    0.02    0.0"""), sep=r"\s+")

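# Min-max scale every column to [0, 1]; Label is already 0/1, so it is unchanged
df2 = (df - df.min()) / (df.max() - df.min())

# Raw values in the top panel, scaled values in the bottom panel
fig, ax = plt.subplots(2, figsize=[10, 6])
df.plot(ax=ax[0], kind="line")
df2.plot(ax=ax[1], kind="line")
plt.show()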


Rob Raymond
  • Thanks Rob Raymond. May I ask why you also included Label in the plot? I would like to plot the percentage columns grouped by Label (a bar chart or histogram would probably be better) – LdM Feb 20 '21 at 18:26
  • No reason - didn't know it had a different meaning so just plotted everything – Rob Raymond Feb 20 '21 at 18:41
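Following up on the comments, here is a minimal sketch of one way to plot the percentage columns grouped by Label with the tallest bar equal to 1. It is not part of the original answer. It assumes the bars should show the per-Label mean (the `pivot_table` default) and that all columns should share one scale, so it aggregates first and divides by the overall maximum afterwards; the names `means` and `scaled` are illustrative.

```python
import pandas as pd

# The dataset from the question, typed in by hand (row 5 is absent in the original).
df = pd.DataFrame(
    {
        "%_F": [0.00, 0.00, 0.00, 0.01, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
        "%_M": [0.00, 0.00, 0.00, 0.01, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
        "%_C": [0.08, 0.00, 0.10, 0.07, 0.07, 0.07, 0.05, 0.03, 0.04, 0.06],
        "%_D": [0.05, 0.14, 0.01, 0.05, 0.14, 0.05, 0.68, 0.09, 0.02, 0.02],
        "Label": [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    },
    index=[0, 1, 2, 3, 4, 6, 7, 8, 9, 10],
)

cols = ["%_F", "%_M", "%_C", "%_D"]

# Aggregate first: mean of each percentage column per Label.
# Label becomes the index, so it is not plotted as a series.
means = df.groupby("Label")[cols].mean()

# Then scale: divide by the single largest mean so the tallest bar is 1.
# (Assumption: one shared scale. Use `means / means.max()` instead to make
# each column peak at 1 separately.)
scaled = means / means.max().max()

print(scaled)
```

Calling `scaled.plot.bar()` (with matplotlib installed) then draws one group of bars per Label, with the y-axis topping out at 1.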