Questions tagged [rsample]
16 questions
3
votes
2 answers
Bootstrap resampling and tidy regression models with grouped/nested data
I am trying to estimate regression slopes and their confidence intervals using bootstrapping. I would like to do it for grouped data. I was following the example at this website (https://www.tidymodels.org/learn/statistics/bootstrap/), but I…

D Kincaid
- 167
- 1
- 13
2
votes
1 answer
How to speed up the tidymodels bootstrapping with parallelization
I have the following code that performs bootstrapping
and calculates the confidence interval.
library(rsample)
library(broom)
library(dplyr)
library(purrr)
library(tibble)
lm_est <- function(split, ...) {
lm(mpg ~ disp + hp, data =…

littleworth
- 4,781
- 6
- 42
- 76
1
vote
1 answer
Using Yardstick to calculate RMSE for aggregate of predictions per group
Sometimes I don't want to assess my models on their performance on predicting single observations, but rather I want to assess how a model performs for predictions in aggregate for groups. The group resampling tools in rsample, like group_vfold_cv,…

Econ_Modeler
- 13
- 3
1
vote
1 answer
calculating bootstrap resampling for grouped variables
I have the following dataset to calculate standardized effect size for Soil_N and Soil_P for which I used the code below for each replicate.
df <- tibble(
Soilwater = rep(rep(c("optimal", "reduced"), each = 5), times = 2),
Diversity =…

Amit
- 37
- 5
1
vote
1 answer
Does rsample::bootstraps store data rather than just row indices?
I'm trying to understand why the rsample::bootstraps function apparently stores the entire data set for each bootstrap sample. I was expecting the function would just store the dataset once, along with the bootstrap indices for each resample. In the…

Robert McDonald
- 1,250
- 1
- 12
- 20
1
vote
1 answer
Unnesting deep lists after applying the rolling_origin function from the rsample package
I have some data which looks like:
head:
  dfID  date       group groupValues
1 df1   2020-03-01 grp1  0.175
2 df1   2020-03-01 grp2  0.150
3 df1   2020-03-01 grp3  0.0509
tail:
dfID date …

user113156
- 6,761
- 5
- 35
- 81
0
votes
0 answers
Cannot Extract Information from glm model using tidy function from rsample package
I have been following the logistic regression chapter in Hands-On Programming with R. At first all the code was working fine, but then I restarted my R session, and when I run this code
tidy(model1)
it throws this error message.
`Error in…

George K-Agyen
- 13
- 3
0
votes
1 answer
Compute Gini Index on a nested/rsplit object
I used the rsample::bootstraps function to create a nested object as follows:
Sampled_Data <- bootstraps(credit_data, times = 2, strata = "Home", apparent = TRUE)
What I get is as follows:
splits id

WalliYo_
- 173
- 7
0
votes
1 answer
Match Each Winner with a Unique Prize
In a contest, each winner and prize is assigned a random integer [1, 9] called a "ticket" number and a unique "ID" number [1111, 9999]. Each winner receives a unique prize from a limited stock of prizes based on the winner's ticket number…

Tavaro Evanis
- 180
- 1
- 11
0
votes
1 answer
Python: how to convert monthly employment data into annual (CSV, pandas)
I've been stuck on this problem for two days. Below is the csv file.
df = pd.read_csv('/14100017.csv')
df = pd.DataFrame(data)
df.head()
df_year = df.groupby('REF_DATE')['REF_DATE'].count()
print(df_year)
This is my code. Could you please tell me…

businesstoe
- 13
- 3
0
votes
1 answer
Select a proportionate stratified random sample, where stratification is based on sites and gender
I have three IDBs and this is the number of people registered from each
Site   Female  Male  Total
IDB_A      46    14     60
IDB_B      17    23     40
IDB_C      79    21    100
Total     142    58    200
And this is the sample I want to…

Saed Jama
- 1
- 1
0
votes
1 answer
How does gtsummary produce confidence intervals and standard error statistics for glm models? (Code Examples Included)
Want to preface this with heaps of appreciation for gtsummary -- wonderful package.
After using tidymodels, GLM, and gtsummary for a while, I've been trying to understand gtsummary's computations for GLM model performance and confidence intervals.
Can…

Triage
- 21
- 1
- 3
0
votes
0 answers
set.seed() doesn't create identical outputs across different .rmd files?
I have two .Rmd files that reference the same dataset, but when I use set.seed(), I get different outputs:
library(tidymodels)
# load data and setup
data("ames")
ames_mod <-
ames %>%
select(First_Flr_SF, Sale_Price) %>%
…

Mark Rieke
- 306
- 3
- 13
0
votes
1 answer
How to proportionally split data using initial_split in R
I would like to proportionally split the data I have. For example, I have 100 rows and I want to randomly sample one row out of every two. Using rsample from tidymodels, I assumed I would do the following.
dat <- as_tibble(seq(1:100))
split <- initial_split(dat,…

S_Gill
- 27
- 3
0
votes
2 answers
Block Bootstrapping using Tidymodels
I have a monthly (Jan - Dec) data set for weather and crop yield. The data were collected over multiple years (2002 - 2019). My aim is to obtain the bootstrapped slope coefficient of the effect of temperature in each month on the yield gap. In bootstrapping,…

Mohsin Ramay
- 76
- 8