I have a massive dataset that I divided into k mini datasets, where k = 100. Now I want to store these mini datasets in different files. To store my massive dataset I used the following instructions:
using JLD, HDF5
X=rand(100000)
file = jldopen("path to my file/mydata.jld", "w") # the extension of file is jld so you should add packages JLD and HDF5, Pkg.add("JLD"), Pkg.add("HDF5"),
write(file, "X", X) # alternatively, say "@write file A"
close(file)
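For reference, a variable stored this way can be read back with JLD's load function; a minimal sketch (same placeholder path as above):

using JLD
d = load("path to my file/mydata.jld")       # Dict of name => value pairs
X = load("path to my file/mydata.jld", "X")  # or fetch a single variable by name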
Now I divided my dataset into k sub-datasets, where k = 100:
function get_mini_batch(X)
    mini_batches = ceil(Int, length(X) / 100)  # number of batches of 100 elements
    for i = 1:mini_batches
        mini_batch = X[((i-1)*100 + 1):min(i*100, end)]
        file = jldopen("/path to my file/mydata.jld", "w")
        write(file, "mini_batch", mini_batch) # alternatively, say `@write file mini_batch`
        close(file)
    end
end
But this function stores every sub-dataset in the same file, which gets overwritten at each iteration. What I want instead is a new file at each iteration: mydata1, mydata2, ..., mydata100:
file= jldopen("/path to my file/mydata1.jld", "w") # at each iteration l want to get files : mydata1, mydata2 ... mydata100
file= jldopen("/path to my file/mydata2.jld", "w")
file= jldopen("/path to my file/mydata3.jld", "w")
file= jldopen("/path to my file/mydata4.jld", "w")
.
.
.
file= jldopen("/path to my file/mydata100.jld", "w")
Alternatively, I tried this procedure:

function get_mini_batch(X)
    mini_batches = ceil(Int, length(X) / 100)
    for i = 1:mini_batches
        mini_batch[i] = X[((i-1)*100 + 1):min(i*100, end)]
        file[i] = jldopen("/path to my file/mydata.jld", "w")
        write(file, "mini_batch", mini_batch) # alternatively, say `@write file mini_batch`
        close(file)
    end
end
but I don't know how to make a variable i = 1, ..., 100 appear inside the filename on this line: file[i] = jldopen("/path to my file/mydata(i).jld", "w")
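To make the intent concrete, here is a minimal sketch of what I imagine the loop should look like, using Julia's string interpolation $(i) to build one filename per iteration; the function name save_mini_batches is just my own placeholder, and the do-block form of jldopen closes each file automatically:

using JLD, HDF5

function save_mini_batches(X)
    mini_batches = ceil(Int, length(X) / 100)  # batches of 100 elements each
    for i = 1:mini_batches
        mini_batch = X[((i-1)*100 + 1):min(i*100, end)]
        # $(i) interpolates the loop index into the filename:
        # mydata1.jld, mydata2.jld, ..., mydata1000.jld
        jldopen("/path to my file/mydata$(i).jld", "w") do file
            write(file, "mini_batch", mini_batch)
        end
    end
end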