I am trying to display a scatter chart for two columns of a Deedle data frame, ideally grouped by a third column, and I would like to show a linear regression line on the same chart.

In Python this can be done with seaborn.lmplot (https://seaborn.pydata.org/generated/seaborn.lmplot.html):

sns.lmplot(data=penguins, x="bill_length_mm", y="bill_depth_mm", hue="species")

I was hoping to do something like that with Plotly.NET, but so far I have only managed a simple scatter plot:

(
    df.["rating"].Values,
    df.["calories"].Values
)
||> Seq.zip
|> Chart.Point

How do I add a linear regression line similar to seaborn? Do I need to do it manually somehow?

How do I group the points by a third column? This one I may be able to figure out myself, but I wonder if there is a more elegant solution.

Trenton McKinney
Alkasai
  • Does this article from the Plotly website answer your question? [ML Regression in F#](https://plotly.com/fsharp/ml-regression/) – Brian Berns Nov 14 '22 at 02:08
  • Yes, most definitely! I managed to build a helper that works similarly to `seaborn.lmplot`, except for the ranges. I'll post an answer shortly. – Alkasai Nov 20 '22 at 17:18
  • Posted https://stackoverflow.com/a/74510382/977406 – Alkasai Nov 20 '22 at 17:30

1 Answer

Thanks to Brian Berns' comment pointing me to the ML Regression in F# example (https://plotly.com/fsharp/ml-regression/), I was able to create a helper function that works similarly to Python's seaborn.lmplot function.

Here is the code if anyone wants to use it:

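// Namespaces the helpers rely on (Deedle, Plotly.NET and FSharp.Stats packages).
// The OrdinaryLeastSquares module path below matches the FSharp.Stats release
// used in the linked example; other releases may expose a different API.
open Deedle
open Plotly.NET
open FSharp.Stats
open FSharp.Stats.Fitting.LinearRegression

// helpers
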
let getColVector col (df: Frame<'a, 'b>) =
    vector <| df.[col].Values

let filterByKey fn (df: Frame<'a, 'b>) =
    df.Where(fun (KeyValue(k, _)) -> fn k)

let singleGroupLmplot xCol yCol valuesName df =
    let y = df |> getColVector yCol
    let x = df |> getColVector xCol

    let coefs = OrdinaryLeastSquares.Linear.Univariable.coefficient x y
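    let fittingFunc x = OrdinaryLeastSquares.Linear.Univariable.fit coefs x
    // A straight line needs only its two endpoints; a unit-step float range
    // from min to max can stop short of the largest x value.
    let xRange = [ Seq.min x; Seq.max x ]
    let yPredicted = [for xi in xRange -> fittingFunc xi]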

    let xy = Seq.zip xRange yPredicted
    [
        Chart.Point(x, y, ShowLegend=true, Name=valuesName)
        |> Chart.withXAxisStyle(TitleText=xCol)
        |> Chart.withYAxisStyle(TitleText=yCol)

        Chart.Line(xy, ShowLegend=true, Name=($"Reg. {valuesName}"))
    ]
    |> Chart.combine

let lmplot xCol yCol hue df =
    match hue with
    | None ->
        [ singleGroupLmplot xCol yCol ($"{xCol} vs {yCol}") df ]
    | Some h ->
        let groupedDf = df |> Frame.groupRowsByString h

        groupedDf.RowKeys
        |> Seq.map (fun (g, _) -> g)
        |> Seq.distinct
        |> List.ofSeq
        |> List.map (fun k ->
            groupedDf
            |> filterByKey (fun (g, _) -> g = k)
            |> singleGroupLmplot xCol yCol k
        )
    |> Chart.combine
    |> Chart.withLegendStyle(Orientation=StyleParam.Orientation.Horizontal)

Example of using it to render a scatter plot with a regression line for a data frame:

df |> lmplot "rating" "calories" None

Example of using it to render scatter plots with regression lines for a data frame grouped by the values of a column:

df |> lmplot "rating" "calories" (Some "healthiness")
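
For anyone curious what the fit and the grouping amount to, here is a minimal, dependency-free sketch of the same idea: group rows by a label, fit a least-squares line per group, and keep the two endpoints of each line. It is not how FSharp.Stats implements the fit. The `rows` sample data and the `fitLine` and `linesByGroup` names are hypothetical, made up for this illustration only.

```fsharp
// Hypothetical sample rows: (group label, x, y). Not taken from the data set in the question.
let rows =
    [ "a", 1.0, 3.0
      "a", 2.0, 5.0
      "a", 3.0, 7.0
      "b", 1.0, 2.0
      "b", 2.0, 2.5
      "b", 3.0, 3.0 ]

// Closed-form simple linear regression: y ≈ intercept + slope * x.
// Assumes at least two distinct x values; otherwise sxx is 0 and the slope is undefined.
let fitLine (points: (float * float) list) =
    let xMean = points |> List.averageBy fst
    let yMean = points |> List.averageBy snd
    let sxy = points |> List.sumBy (fun (x, y) -> (x - xMean) * (y - yMean))
    let sxx = points |> List.sumBy (fun (x, _) -> (x - xMean) * (x - xMean))
    let slope = sxy / sxx
    yMean - slope * xMean, slope

// One regression line per group, described by its two endpoints.
let linesByGroup =
    rows
    |> List.groupBy (fun (g, _, _) -> g)
    |> List.map (fun (g, members) ->
        let points = members |> List.map (fun (_, x, y) -> x, y)
        let intercept, slope = fitLine points
        let xs = points |> List.map fst
        let endpoints =
            [ List.min xs; List.max xs ]
            |> List.map (fun x -> x, intercept + slope * x)
        g, endpoints)

for (g, endpoints) in linesByGroup do
    printfn "%s: %A" g endpoints
```

Each list of endpoints could then be passed to Chart.Line in the same way as the `xy` sequence in the helper above.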

Alkasai