A one-row-per-name ("wide") format is probably not the best layout for making predictions on your data. Still, here is an example with dplyr/tidyr that puts the data for each name in one row:
The data:
data <- data.frame(Name = c("Peter Parker", "Peter Parker", "Peter Parker",
"Peter Parker", "Peter Parker", "Benny Thompson", "Benny Thompson",
"Benny Thompson", "Benny Thompson", "Benny Thompson", "Mylo Thony"
), Result = c(150L, 155L, 156L, 154L, 158L, 130L, 132L, 138L,
140L, 139L, 177L), Date = c(" 2018-03-03", " 2018-03-04", " 2018-03-05",
" 2018-03-06", " 2018-03-07", " 2018-03-03", " 2018-03-04", " 2018-03-05",
" 2018-03-07", " 2018-03-09", " 2018-03-11"))
data
#> Name Result Date
#> 1 Peter Parker 150 2018-03-03
#> 2 Peter Parker 155 2018-03-04
#> 3 Peter Parker 156 2018-03-05
#> 4 Peter Parker 154 2018-03-06
#> 5 Peter Parker 158 2018-03-07
#> 6 Benny Thompson 130 2018-03-03
#> 7 Benny Thompson 132 2018-03-04
#> 8 Benny Thompson 138 2018-03-05
#> 9 Benny Thompson 140 2018-03-07
#> 10 Benny Thompson 139 2018-03-09
#> 11 Mylo Thony 177 2018-03-11
To filter and transform the data:
library(dplyr)
library(tidyr)
data_in_one_row <- data |>
  group_by(Name) |>
  mutate(count = n(), id = 1:n()) |>
  filter(count == 5) |>
  pivot_wider(id_cols = c(Name), names_from = id,
              values_from = -c(Name, id))
data_in_one_row
#> # A tibble: 2 × 16
#> # Groups: Name [2]
#> Name Resul…¹ Resul…² Resul…³ Resul…⁴ Resul…⁵ Date_1 Date_2 Date_3 Date_4
#> <chr> <int> <int> <int> <int> <int> <chr> <chr> <chr> <chr>
#> 1 Peter Par… 150 155 156 154 158 " 201… " 201… " 201… " 201…
#> 2 Benny Tho… 130 132 138 140 139 " 201… " 201… " 201… " 201…
#> # … with 6 more variables: Date_5 <chr>, count_1 <int>, count_2 <int>,
#> # count_3 <int>, count_4 <int>, count_5 <int>, and abbreviated variable names
#> # ¹Result_1, ²Result_2, ³Result_3, ⁴Result_4, ⁵Result_5
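The wide table above also carries five `count_*` columns, because `values_from = -c(Name, id)` pivots every remaining column. A minimal sketch of one way to avoid that, assuming you only need `Result` and `Date` in the wide table (the name `wide` is just illustrative and not part of the original answer):

```r
library(dplyr)
library(tidyr)

# Same long-format data as above, rebuilt so the sketch runs on its own
data <- data.frame(
  Name = rep(c("Peter Parker", "Benny Thompson", "Mylo Thony"), times = c(5, 5, 1)),
  Result = c(150L, 155L, 156L, 154L, 158L, 130L, 132L, 138L, 140L, 139L, 177L),
  Date = c(" 2018-03-03", " 2018-03-04", " 2018-03-05", " 2018-03-06",
           " 2018-03-07", " 2018-03-03", " 2018-03-04", " 2018-03-05",
           " 2018-03-07", " 2018-03-09", " 2018-03-11")
)

wide <- data |>
  group_by(Name) |>
  filter(n() == 5) |>            # keep only names with exactly 5 observations
  mutate(id = row_number()) |>   # position of each observation within its name
  ungroup() |>
  pivot_wider(id_cols = Name, names_from = id,
              values_from = c(Result, Date))  # pivot only these two columns
```

Naming the columns in `values_from` explicitly avoids the helper column `count` ending up in the result, and `ungroup()` returns a plain tibble rather than a grouped one.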
Alternatively, depending on what you expect from the prediction, you can fit a single linear model on the untransformed (long-format) dataset and predict the day after the last observation for each name:
# Fit on the original long-format data (no filtering), so every name,
# including Mylo Thony with a single observation, gets a prediction.
model <- lm(Result ~ Name + as.POSIXct(Date), data)
# Prediction time for each name: one day (24 * 3600 seconds) after its last observation
pred_dataset <- data |> group_by(Name) |>
  summarise(Date = max(as.POSIXct(Date)) + 24 * 3600)
pred_dataset$prediction <- predict(model, pred_dataset)
pred_dataset
#> # A tibble: 3 × 3
#> Name Date prediction
#> <chr> <dttm> <dbl>
#> 1 Benny Thompson 2018-03-10 00:00:00 143.
#> 2 Mylo Thony 2018-03-12 00:00:00 179.
#> 3 Peter Parker 2018-03-08 00:00:00 159.
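Because the model uses `Name + Date` without an interaction, it estimates one common slope over time plus a separate intercept per name. The sketch below shows the same idea in base R with the slope expressed per day, which is easier to read than a per-second POSIXct slope. It assumes the leading space in `Date` should be trimmed, and the helper column `Day` and the data frame `newdata` are illustrative names, not part of the original answer:

```r
# Same long-format data as above, rebuilt so the sketch runs on its own
data <- data.frame(
  Name = rep(c("Peter Parker", "Benny Thompson", "Mylo Thony"), times = c(5, 5, 1)),
  Result = c(150L, 155L, 156L, 154L, 158L, 130L, 132L, 138L, 140L, 139L, 177L),
  Date = c(" 2018-03-03", " 2018-03-04", " 2018-03-05", " 2018-03-06",
           " 2018-03-07", " 2018-03-03", " 2018-03-04", " 2018-03-05",
           " 2018-03-07", " 2018-03-09", " 2018-03-11")
)

# Trim the leading space, parse as Date, and keep a numeric day count (days since 1970-01-01)
data$Date <- as.Date(trimws(data$Date))
data$Day <- as.numeric(data$Date)

# One shared slope (Result units per day) and one intercept per Name
model <- lm(Result ~ Name + Day, data = data)
coef(model)[["Day"]]  # the common per-day slope

# Predict one day after the last observation of each name
last_day <- tapply(data$Day, data$Name, max)
newdata <- data.frame(Name = names(last_day), Day = as.numeric(last_day) + 1)
newdata$Date <- as.Date(newdata$Day, origin = "1970-01-01")
newdata$prediction <- as.numeric(predict(model, newdata))
newdata
```

Note that a name with a single observation (Mylo Thony here) contributes nothing to the slope estimate; its prediction is just its one result shifted by the slope learned from the other names. If you expect each name to have its own trend, a model with an interaction (`Name * Day`) or one model per name would be the thing to try, but that needs more than one observation per name.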