How to normalize data in a Deedle frame?

I've tried this approach, but it doesn't work:

 let iris = Frame.ReadCsv("./iris.csv")
 let keys = iris.ColumnKeys |> Seq.toArray
 let x = iris.Columns.[keys.[0..4]]
 let mu = x |> Stats.mean 
 let std = x |> Stats.stdDev
 // Not working because a series couldn't be subtracted from the frame
 let norm = (x - mu) / std
baio

1 Answer

The frame - series overload expects that you are subtracting the series from all columns of the frame, i.e. that the row keys of the frame and the row keys of the series align.

For your use case, you need to align the column keys. There is no single operator for this, but you can do it using the Frame.mapRowValues function:

let x = iris.Columns.[keys.[0..3]]
let mu = x |> Stats.mean 
let std = x |> Stats.stdDev

let norm = 
  x 
  |> Frame.mapRowValues (fun r -> (r.As<float>() - mu) / std)
  |> Frame.ofRows

I also changed your x to be just from keys.[0..3], because otherwise you'd be trying to normalize a column of type string, which fails.
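
To see the alignment in isolation, here is a minimal sketch that applies the same approach to a small in-memory frame instead of iris.csv. The frame, its column names `a` and `b`, and its values are hypothetical and only stand in for the numeric iris columns; the script assumes F# Interactive with the Deedle NuGet package available.

    #r "nuget: Deedle"
    open Deedle

    // Hypothetical two-column numeric frame (not the iris data).
    let x =
      frame [ "a" => series [ 1 => 1.0; 2 => 2.0; 3 => 3.0 ]
              "b" => series [ 1 => 10.0; 2 => 20.0; 3 => 30.0 ] ]

    // Both statistics are series keyed by column name ("a", "b").
    let mu = x |> Stats.mean
    let std = x |> Stats.stdDev

    // Each row is also a series keyed by column name, so subtracting mu
    // and dividing by std aligns on the column keys.
    let norm =
      x
      |> Frame.mapRowValues (fun r -> (r.As<float>() - mu) / std)
      |> Frame.ofRows

The result keeps the original row and column keys, and every value is centered on its own column's mean and scaled by its own column's standard deviation.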

Tomas Petricek
  • Thanks! Actually, I also tried using mapRowValues; what I failed to understand is that the `.As` operator is needed for this. The docs are a little unclear about this. – baio Aug 26 '16 at 21:54
  • This is because Deedle sadly does not know that the rows are all numeric (which they actually are not, if you do not drop the fourth column). Calling `row.As<float>()` turns a row of type `Series<string, obj>` into `Series<string, float>`, which supports the `-` operator. – Tomas Petricek Aug 26 '16 at 22:17