0

I want to get a dataframe containing n pairs of xy-coordinates that lie on the straight line between two points P1(x1, y1) and P2(x2, y2).

My approach would be to find the xy-coordinates of two points and calculate the slope and intercept of this linear equation y = slope * x + intercept.

For simplicity, let:

x1 = 0.1
y1 = 0.2
x2 = 1.1
y2 = 0.6

n = c(1:200) #example length of my data

slope = (y2 - y1) / (x2 - x1)
intercept = y1 - slope * x1

So far so good, but now I would like to compute the individual xy-coordinates along this line and store them in a dataframe called baseline with length(n) rows and the columns x_baseline and y_baseline.

How would one go about solving this problem? So far I could not find a satisfactory answer online. Any help is highly appreciated!

For clarification a drawing of my desired output here

Keizei
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. I don't see how `n` is involved in this equation at all. What are these "baseline" values? – MrFlick Mar 11 '23 at 23:05
  • Sorry for the confusion. Let me elaborate: I want to calculate the slope and intercept from the given values x1, x2, y1 and y2 (arbitrary values in this example). Now that I have my linear equation, how can I get n individual pairs of xy-coordinates from P(x1, y1) to P(x2, y2) into a dataframe? I think that in this example (i.e. just computing the xy-coordinates) no sample data is required. In my actual script, I read x1, x2, y1 and y2 from my experimental data. For simplicity and reproducibility I chose the numeric values above. – Keizei Mar 11 '23 at 23:20
  • Do you mean calculating the residuals from a linear regression for your observations? – dufei Mar 11 '23 at 23:24
  • Not exactly. I want the xy-coordinates that are on that line, so that I could, for example, plot them in a scatterplot, giving me a line of n points between P1(x1, y1) and P2(x2, y2). – Keizei Mar 11 '23 at 23:27

3 Answers

1

If you have x1, y1, x2, y2, you can just use some algebra to calculate the interpolation between the points:

steps <- (n-1)/(max(n)-1)
dd <- data.frame(
  x = x1 + (x2-x1) * steps,
  y = y1 + (y2-y1) * steps
)

And we can plot them with

plot(y~x, dd)
points(c(x1,x2), c(y1, y2), col="green", pch=16)

You could also do interpolation with approx

dd <- data.frame(
  x = approx(c(0,1), c(x1, x2), seq(0, 1, length.out = length(n)))$y,
  y = approx(c(0,1), c(y1, y2), seq(0, 1, length.out = length(n)))$y
)

or if you really wanted to use the slope and intercept you could do

dd <- data.frame(
  x = seq(x1, x2, length.out = length(n))
) |>
  transform(y = intercept + x * slope)
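
The three variants above should give the same points. As a minimal sketch, the parametric version could be wrapped in a small helper that returns the x_baseline and y_baseline columns asked for in the question. The name make_baseline and its arguments are made up for illustration and are not part of any package:

```r
# Hypothetical helper (name and arguments are assumptions for this sketch):
# returns n_points evenly spaced points on the segment from P1 to P2.
make_baseline <- function(x1, y1, x2, y2, n_points) {
  steps <- seq(0, 1, length.out = n_points)  # 0 at P1, 1 at P2
  data.frame(
    x_baseline = x1 + (x2 - x1) * steps,
    y_baseline = y1 + (y2 - y1) * steps
  )
}

# Values from the question
x1 <- 0.1; y1 <- 0.2
x2 <- 1.1; y2 <- 0.6
n <- 1:200

baseline <- make_baseline(x1, y1, x2, y2, length(n))

# Cross-check against the slope/intercept form from the question
slope <- (y2 - y1) / (x2 - x1)
intercept <- y1 - slope * x1
stopifnot(isTRUE(all.equal(baseline$y_baseline,
                           intercept + slope * baseline$x_baseline)))
```

Using all.equal rather than == for the check allows for small floating-point differences between the two formulas.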
  
MrFlick
1

If you don't want to do the math yourself, fit a linear model and predict from it:

library(tidyverse)
library(modelr)

baseline <- tibble(
  x = c(0.1, 1.1), 
  y = c(0.2, 0.6)
)

mod <- lm(y ~ x, baseline)

baseline |> 
  data_grid(x = seq_range(x, 200)) |> 
  add_predictions(mod, var = "y_baseline") |> 
  rename(x_baseline = x)
#> # A tibble: 200 × 2
#>    x_baseline y_baseline
#>         <dbl>      <dbl>
#>  1      0.1        0.200
#>  2      0.105      0.202
#>  3      0.110      0.204
#>  4      0.115      0.206
#>  5      0.120      0.208
#>  6      0.125      0.210
#>  7      0.130      0.212
#>  8      0.135      0.214
#>  9      0.140      0.216
#> 10      0.145      0.218
#> # … with 190 more rows

Created on 2023-03-12 with reprex v2.0.2
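
If you would rather not load tidyverse and modelr, roughly the same idea can be sketched in base R with lm() and predict(). This is only an illustration of the approach above. The names pts, n_points and x_new are assumptions, and n_points is assumed to equal length(n) from the question:

```r
# The two known points from the question
pts <- data.frame(x = c(0.1, 1.1), y = c(0.2, 0.6))
n_points <- 200  # assumed to equal length(n) in the question

# With only two points, the fitted line passes through both of them
# (up to floating-point error)
mod <- lm(y ~ x, data = pts)

# Evenly spaced x values between the two points
x_new <- seq(min(pts$x), max(pts$x), length.out = n_points)

baseline <- data.frame(
  x_baseline = x_new,
  # predict() returns a named vector; unname() keeps the column clean
  y_baseline = unname(predict(mod, newdata = data.frame(x = x_new)))
)

head(baseline, 3)
```

Note that summary(mod) will complain about an essentially perfect fit here, which is expected when a line is fitted to exactly two points.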

dufei
0

Your x values would be given by seq(x1, x2, length.out = length(n)), and your new y values are given by this vector multiplied by the slope, plus the intercept:

df <- data.frame(x_baseline = seq(x1, x2, length.out = length(n)), 
                 y_baseline = slope * seq(x1, x2, length.out = length(n)) + intercept)

Let's plot your original points in red, and the dataframe's points in black:

plot(c(x1, x2), c(y1, y2), col = "red", cex = 2)
points(df$x_baseline, df$y_baseline)

Created on 2023-03-11 with reprex v2.0.2
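
One caveat with the slope/intercept form: if both points happen to share the same x value, the slope is a division by zero. A small sketch of that edge case, using made-up coordinates (not from the question), shows that interpolating x and y separately still works:

```r
# Hypothetical vertical segment: both points share the same x
x1 <- 0.5; y1 <- 0.2
x2 <- 0.5; y2 <- 0.6

slope <- (y2 - y1) / (x2 - x1)  # division by zero, not a usable slope
is.finite(slope)

# Interpolating x and y separately needs no slope at all
steps <- seq(0, 1, length.out = 5)
vertical <- data.frame(
  x_baseline = x1 + (x2 - x1) * steps,
  y_baseline = y1 + (y2 - y1) * steps
)
vertical
```

For points read from experimental data this may never happen, but it is a cheap thing to guard against.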

Allan Cameron