Julia: How can I write and store files in a loop?

Question

I have a massive dataset that I divided into k mini datasets, where k = 100. Now I want to store these mini datasets in different files. To store my massive dataset, I used the following instructions:

using JLD, HDF5  # install with Pkg.add("JLD") and Pkg.add("HDF5")
X = rand(100000)
file = jldopen("path to my file/mydata.jld", "w")  # the .jld extension requires the JLD and HDF5 packages
write(file, "X", X)  # alternatively, say "@write file X"
close(file)

Now I divided my dataset into k sub-datasets, where k = 100:

function get_mini_batch(X)

    mini_batches = round(Int, ceil(length(X) / 100))

            for i=1:mini_batches
                mini_batch = X[((i-1)*100 + 1):min(i*100, end)]
                file= jldopen("/path to my file/mydata.jld", "w")
                write(file, "mini_batch", mini_batch)  # alternatively, say "@write file mini_batch"
                close(file)
            end
end

But this function stores every sub-dataset in the same file, which is overwritten at each iteration. What I want instead is one file per iteration:

file= jldopen("/path to my file/mydata1.jld", "w")  # at each iteration I want a new file: mydata1, mydata2, ..., mydata100
file= jldopen("/path to my file/mydata2.jld", "w")
file= jldopen("/path to my file/mydata3.jld", "w")
file= jldopen("/path to my file/mydata4.jld", "w")
.
.
.
file= jldopen("/path to my file/mydata100.jld", "w")

Alternatively, I tried this procedure:

function get_mini_batch(X)

    mini_batches = round(Int, ceil(length(X) / 100))

            for i=1:mini_batches
                mini_batch[i] = X[((i-1)*100 + 1):min(i*100, end)]
                file[i]= jldopen("/path to my file/mydata.jld", "w")
                write(file, "mini_batch", mini_batch)  # alternatively, say "@write file mini_batch"
                close(file)
            end
end

But I don't know how to put the loop variable i = 1, ..., 100 into the file name in this line: file[i]= jldopen("/path to my file/mydata(i).jld", "w")

score 5 · Accepted Answer · edited Sep 21 '18 at 06:07

You are looking for string formatting.

To create the filenames, you can use @sprintf(). Then you can use these strings to write your objects to disk.

julia> using Printf  # needed in Julia 1.0 and later
julia> @sprintf("myfilename%02d.jld", 5)
"myfilename05.jld"

Example in a loop:

julia> for i in 1:3
           println(@sprintf("myfilename%03d.jld", i))
       end
myfilename001.jld
myfilename002.jld
myfilename003.jld

I used %03d here to show how you can add leading zeros to your file names. This helps later when the files need to be sorted by name.
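Applied to the loop from the question, this could look roughly like the sketch below. The helper name `write_mini_batches`, the `batch_size` keyword, and the temporary output directory are illustrative assumptions, not part of the original code. It also assumes the JLD package is installed and uses its do-block form of `jldopen`, which closes the file automatically.

```julia
using JLD      # third-party package (assumed installed); provides jldopen and load
using Printf   # standard library; provides @sprintf

# Hypothetical helper: split X into chunks of `batch_size` elements and
# write each chunk to its own file: mydata001.jld, mydata002.jld, ...
function write_mini_batches(X, dir; batch_size = 100)
    n_batches = cld(length(X), batch_size)   # ceiling division
    paths = String[]
    for i in 1:n_batches
        mini_batch = X[((i - 1) * batch_size + 1):min(i * batch_size, end)]
        # The file name changes with i, so no file is overwritten
        path = joinpath(dir, @sprintf("mydata%03d.jld", i))
        jldopen(path, "w") do file
            write(file, "mini_batch", mini_batch)
        end
        push!(paths, path)
    end
    return paths
end

# Example call: 250 values give three files holding 100, 100 and 50 values
paths = write_mini_batches(rand(250), mktempdir())
```

A single batch can then be read back with `load(paths[1], "mini_batch")`.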

edited Sep 21 '18 at 06:07

Julia Learner

answered Jun 24 '16 at 12:06

niczky12

But how can I open a file, write to it, and store it using @sprintf()? For example: file = jldopen("path to my file/mydata.jld", "w"), then write(file, "X", X). – vincet Jun 24 '16 at 12:14
You can use `@sprintf` to specify the filename. For example, in the second code block of your question, replace `file= jldopen("/path to my file/mydata.jld", "w")` with `file= jldopen(@sprintf("/path to file/mydata%d.jld", i), "w")`, where `i` is the number of the mini-batch you are looping over. – niczky12 Jun 24 '16 at 12:16
My question is related to this topic: "http://stackoverflow.com/questions/37989159/how-to-divide-my-data-into-distincts-mini-batches-randomly-julia". I want to create files to store the X[P] values. How should I proceed? – vincet Jun 27 '16 at 13:48

score 1 · Answer 2 · answered Jun 24 '16 at 13:25

I agree with niczky12 that you are looking for string formatting. But I would personally write it this alternative way:

"/path to my file/mydata$i.jld"

instead of using @sprintf.

Example:

julia> i = 4
4

julia> "/path/mydata$i.jld"
"/path/mydata4.jld"

answered Jun 24 '16 at 13:25

Fengyang Wang

Yep, this is the easy way. I just prefer to have the leading zeros in this case, hence I used `@sprintf`. :) – niczky12 Jun 24 '16 at 13:36
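To illustrate the point about leading zeros, here is a small sketch comparing how plain and zero-padded names sort. The batch numbers in `ids` and the `mydata` prefix are chosen only for illustration. The `lpad` variant shows one way to get the same padding while keeping string interpolation.

```julia
using Printf  # standard library; provides @sprintf

ids = (1, 2, 10)  # illustrative batch numbers

unpadded  = ["mydata$i.jld" for i in ids]
padded    = [@sprintf("mydata%03d.jld", i) for i in ids]
# Same padding with interpolation only, via lpad
padded_lp = ["mydata$(lpad(i, 3, '0')).jld" for i in ids]

println(sort(unpadded))  # mydata10.jld comes before mydata2.jld
println(sort(padded))    # numeric order is preserved
```

Plain names sort lexicographically, so file 10 lands between files 1 and 2. The padded names sort in numeric order as long as the width covers the largest index.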
