Here is a base R approach that may work for you. It is not what I would call elegant, but it's relatively easy to understand.
Be sure to replace ~/Stack Overflow/ with whatever directory your train directory is located in.
In short, we use dir.create to make the new directories (if they do not exist already). Then we use list.files to make a list of the files in each of the two training directories. Then we use sample to take a sample of those files. Lastly we use file.copy to place them into their new home.
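# Work from the directory that contains train/
setwd("~/Stack Overflow/")
# Fraction of each class to copy into the validation set
sample.fraction <- 0.2
train.true.dir <- "train/hot_dog"
train.false.dir <- "train/not_hot_dog"
valid.true.dir <- "validation/hot_dog"
valid.false.dir <- "validation/not_hot_dog"
# Create validation/ first, then its two subdirectories; existing directories are left alone
sapply(c("validation", valid.true.dir, valid.false.dir), function(x) {dir.create(x, showWarnings = FALSE)})
# List the file names (not full paths) in each training directory
true.files <- list.files(train.true.dir)
false.files <- list.files(train.false.dir)
# Draw a random sample of each class, rounding the sample size up
true.sample <- sample(true.files, size = ceiling(length(true.files) * sample.fraction))
false.sample <- sample(false.files, size = ceiling(length(false.files) * sample.fraction))
# Copy each sampled file to the matching validation directory; the originals stay in train/
sapply(true.sample, function(x) {file.copy(paste(train.true.dir, x, sep = "/"), paste(valid.true.dir, x, sep = "/"))})
sapply(false.sample, function(x) {file.copy(paste(train.false.dir, x, sep = "/"), paste(valid.false.dir, x, sep = "/"))})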
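If you would rather not repeat the same steps for each class, the list/sample/copy sequence can be wrapped in a small function. This is only a sketch: copy_sample and its seed argument are names I made up for illustration, and the demo runs on empty placeholder files in a temporary directory rather than on your real images. It relies on file.copy being vectorized over its first argument and accepting a single existing directory as the destination.

```r
# Hypothetical helper (not part of the code above): copy a random
# fraction of the files in from.dir into to.dir.
copy_sample <- function(from.dir, to.dir, fraction = 0.2, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # optional: make the draw repeatable
  dir.create(to.dir, recursive = TRUE, showWarnings = FALSE)
  files <- list.files(from.dir)
  picked <- sample(files, size = ceiling(length(files) * fraction))
  # file.copy accepts a vector of sources and one existing target directory
  ok <- file.copy(file.path(from.dir, picked), to.dir)
  picked[ok]  # names of the files that were actually copied
}

# Demo on ten empty placeholder files in a throwaway directory
demo.root <- file.path(tempdir(), "copy_sample_demo")
unlink(demo.root, recursive = TRUE)
demo.src <- file.path(demo.root, "train", "hot_dog")
demo.dst <- file.path(demo.root, "validation", "hot_dog")
dir.create(demo.src, recursive = TRUE)
file.create(file.path(demo.src, sprintf("img_%02d.jpg", 1:10)))

copied <- copy_sample(demo.src, demo.dst, fraction = 0.2, seed = 42)
length(copied)  # 2 of the 10 files are copied
```

Passing a seed is optional; it just means that rerunning the script picks the same validation files each time.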
If you wanted to remove the sampled files from the training directories afterwards, so that they do not appear in both sets, you could use these two lines.
Please make a backup first.
sapply(true.sample,function(x){file.remove(paste(train.true.dir,x,sep="/"))})
sapply(false.sample,function(x){file.remove(paste(train.false.dir,x,sep="/"))})
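A slightly more cautious variant is to delete an original only when a file with the same name already exists in the validation directory. The sketch below assumes that check is good enough for your purposes (it does not compare file contents), and remove_if_copied is a hypothetical name, not an existing function. The demo again uses placeholder files in a temporary directory.

```r
# Hypothetical helper: remove files from from.dir only if a file with
# the same name is already present in to.dir.
remove_if_copied <- function(from.dir, to.dir, files) {
  has.copy <- file.exists(file.path(to.dir, files))
  removed <- file.remove(file.path(from.dir, files[has.copy]))
  files[has.copy][removed]  # names of the files that were removed
}

# Demo: three placeholder files, only two of which were copied
rm.root <- file.path(tempdir(), "remove_if_copied_demo")
unlink(rm.root, recursive = TRUE)
rm.src <- file.path(rm.root, "train", "hot_dog")
rm.dst <- file.path(rm.root, "validation", "hot_dog")
dir.create(rm.src, recursive = TRUE)
dir.create(rm.dst, recursive = TRUE)
file.create(file.path(rm.src, c("a.jpg", "b.jpg", "c.jpg")))
file.copy(file.path(rm.src, c("a.jpg", "b.jpg")), rm.dst)

removed <- remove_if_copied(rm.src, rm.dst, c("a.jpg", "b.jpg", "c.jpg"))
list.files(rm.src)  # only "c.jpg" is left, because it was never copied
```

With the variables from the code above, the call would look like remove_if_copied(train.true.dir, valid.true.dir, true.sample), and likewise for the false class.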