I am simulating data for a research project, and the simulation is taking a long time. I would like to run some experiments on the data, but I do not yet have enough simulated data for this to be practical. In the meantime, I would like to supplement the data I do have with random data that is normally distributed.
Thus far, I have a data frame that looks like this:
Training_Data <- data.frame( A = runif(5), B = runif(5), C = runif(5), D = runif(5) )
I then took summary statistics of this data frame as shown:
Training_Data_Sum <- as.data.frame(apply(Training_Data[1:4], 2, summary))
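which gives me the min, max, mean, median, and quartiles for each column of data. The standard deviation (STD) is not part of the summary() output, so I would compute it separately with sd().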
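For reference, a minimal sketch of adding the standard deviation as an extra row next to the summary() output. It assumes the per-column sample standard deviation from sd() is what is wanted; the row name "SD" is just a label chosen for illustration.

```r
Training_Data <- data.frame(A = runif(5), B = runif(5), C = runif(5), D = runif(5))

# summary() returns Min., 1st Qu., Median, Mean, 3rd Qu., Max. for each column
Training_Data_Sum <- as.data.frame(apply(Training_Data[1:4], 2, summary))

# Append the per-column sample standard deviation as an extra row
# ("SD" is an illustrative row name, not something summary() produces)
Training_Data_Sum["SD", ] <- sapply(Training_Data[1:4], sd)

Training_Data_Sum
```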
What I would like to do now is write a function that takes the 5 rows of data in the Training_Data data frame and expands them to 50 rows of normally distributed data, using the min, max, mean, and standard deviation of each column of Training_Data.
I assume I would need the rtruncnorm function (from the truncnorm package), as follows:
Training_Data_50A <- rtruncnorm(n = 50, a = A_min, b = A_max, mean = A_mean, sd = A_std)
Training_Data_50B <- rtruncnorm(n = 50, a = B_min, b = B_max, mean = B_mean, sd = B_std)
Training_Data_50C <- rtruncnorm(n = 50, a = C_min, b = C_max, mean = C_mean, sd = C_std)
Training_Data_50D <- rtruncnorm(n = 50, a = D_min, b = D_max, mean = D_mean, sd = D_std)
where the min, max, mean, and std values are obtained from the appropriate column.
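A rough sketch of how the four calls could be folded into one function, assuming the truncnorm package is installed and that every column of the data frame is numeric. The name expand_truncnorm and its arguments are hypothetical, chosen only for illustration.

```r
library(truncnorm)  # provides rtruncnorm(n, a, b, mean, sd)

# Hypothetical helper: for each column of `df`, draw `n` values from a normal
# distribution truncated to that column's observed [min, max], using the
# column's sample mean and standard deviation as the parameters.
expand_truncnorm <- function(df, n = 50) {
  as.data.frame(lapply(df, function(x) {
    rtruncnorm(n, a = min(x), b = max(x), mean = mean(x), sd = sd(x))
  }))
}

set.seed(1)  # only to make the example reproducible
Training_Data <- data.frame(A = runif(5), B = runif(5), C = runif(5), D = runif(5))

Training_Data_50 <- expand_truncnorm(Training_Data, n = 50)
dim(Training_Data_50)  # 50 rows, 4 columns
```

Note that mean and sd here describe the normal distribution before truncation, so the mean and standard deviation of the generated columns will generally not match the inputs exactly. Every generated value does stay within the observed min and max of its column.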
Could someone point me in the correct direction on how to convert this task into a proper R function?