
I have a function that is fairly computationally expensive. I expected it to take a long time, but instead it appears to run out of memory. This doesn't make sense, though, as most of the variables in the function are temporary and the function only returns a single value, which is stored in a global array.

Here's the function:

function entropy(img)

    mat = Matrix{Float64}(img)
    mat *= 255
    fx = zeros(size(mat))
    fy = zeros(size(mat))


    fx = Matrix{Float64}(mat[:,3:end] - mat[:,1:end-2])[2:end-1,:]
    fy  = Matrix{Float64}(mat[3:end,:] - mat[1:end-2,:])[:,2:end-1]

    fx = collect(Iterators.flatten(fx))
    fy = collect(Iterators.flatten(fy))
    range = maximum([abs(minimum(fx)),abs(maximum(fx)),abs(minimum(fy)),abs(maximum(fy))])

    bins = 2*range+1

    delDensity,xedges,yedges = hist2D(fx,fy,Int(bins))

    delDensity = delDensity ./ sum(delDensity)

    delDensity = transpose(delDensity)

    p = delDensity[delDensity .!= 0]

    delDensity,mat,fx,fy,range,bins = [0,0,0,0,0,0]
    
    return -0.5*sum(p .* log2.(p))
end

I apply the function over a range of about 2000 parameter values:

as = 0:0.0005:1

ses = Dict(lasm=>[],slmm=>[])

for sys in [lasm,slmm]

    systems = [deepcopy(sys) for _ in 1:Threads.nthreads()-1]
    pushfirst!(systems, sys)

    Threads.@threads for i in eachindex(as)
        system = systems[Threads.threadid()]
        set_parameter!(system, 1, as[i])
        shuf,r,c = shuffle(img,sys)
        push!(ses[sys],entropy(shuf))
    end
end
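A side note on the loop above, offered as a sketch rather than a confirmed fix for the memory growth: calling `push!` on a shared array from several threads at once is not thread-safe in Julia. One common pattern is to preallocate the output and let each iteration write only to its own index. In the sketch below, `work`, `params`, and `results` are hypothetical stand-ins (for the `shuffle` plus `entropy` step, `as`, and `ses[sys]` respectively), not names from the original code.

```julia
# Hypothetical stand-in for the expensive per-parameter computation
# (in the original loop this would be shuffle + entropy).
work(a) = a^2

params = 0:0.0005:1

# Preallocate one slot per parameter instead of push!-ing into a shared array.
results = Vector{Float64}(undef, length(params))

Threads.@threads for i in eachindex(params)
    # Each iteration writes only to its own index, so no lock is needed.
    results[i] = work(params[i])
end
```

This also keeps the results in the same order as the parameters, which `push!` from multiple threads would not guarantee.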

I've even tried deallocating the variables within the function, but it was of no use. The memory usage remains stable for some time and then just skyrockets until the system kills the kernel.

I'm using VSCode to run the .ipynb file using a Julia kernel.

  • Is this a specific Jupyter issue, i.e. does the code not OOM when you run it in a REPL? – Nils Gudat Aug 02 '23 at 13:36
  • `systems = [deepcopy(sys) for _ in 1:Threads.nthreads()-1]` you want to reduce this (i.e. reduce the number of threads); if you don't have the RAM, you have to run them sequentially – jling Aug 02 '23 at 16:57
  • @jling running them sequentially runs into the same problem. – Manav Karthikeyan Aug 05 '23 at 04:15
  • It could be the bug before 1.10 like https://discourse.julialang.org/t/the-improvement-of-memory-management-in-multihtreading-julia-1-10beta-is-amazing/102468 . Is the `push!` thread safe? – xgdgsc Aug 08 '23 at 11:57
  • @xgdgsc It was apparently caused by the hist2D function I was using, which rendered an image every time I called it. In theory the image should not have been stored, since I'm neither plotting it nor storing it. Maybe it has something to do with the garbage collector? I now use a custom function to calculate the histogram. Works fine. – Manav Karthikeyan Aug 09 '23 at 11:07
  • `hist2D` seems to be from PyPlot. It's possible that Python's GC is the one handling the result here, and doesn't know when to free it (that's just my guess, maybe wrong). StatsBase.jl has a `fit(Histogram, ...)` method which might be useful here, but if your custom function fits your needs then that should be fine. – Sundar R Aug 10 '23 at 21:24
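
For readers wondering what the custom-histogram route from the comments might look like, here is a minimal pure-Julia sketch that counts gradient pairs into a 2D grid without creating any plotting objects. The asker's actual replacement function was not posted, so `hist2d_counts` and `entropy_from_gradients` are hypothetical names, and the bin layout (equal-width bins spanning `[-r, r]` on both axes) is an assumption that may not match PyPlot's `hist2D` edges exactly.

```julia
# Hypothetical helper: count (x, y) pairs into an nbins × nbins grid
# that spans [-r, r] on both axes. Assumes r > 0.
function hist2d_counts(x, y, r::Real, nbins::Integer)
    counts = zeros(Int, nbins, nbins)
    width = 2r / nbins
    for (xi, yi) in zip(x, y)
        # Map each value to a 1-based bin index; clamp so that the
        # maximum value r falls into the last bin instead of overflowing.
        i = clamp(floor(Int, (xi + r) / width) + 1, 1, nbins)
        j = clamp(floor(Int, (yi + r) / width) + 1, 1, nbins)
        counts[i, j] += 1
    end
    return counts
end

# Hypothetical entropy step using the same -0.5 * sum(p * log2(p)) formula
# as the question, but built on the plain-Julia histogram above.
function entropy_from_gradients(fx, fy)
    r = max(maximum(abs, fx), maximum(abs, fy))
    r == 0 && return 0.0          # all gradients zero: one occupied bin
    nbins = 2 * ceil(Int, r) + 1
    counts = hist2d_counts(vec(fx), vec(fy), r, nbins)
    p = counts[counts .> 0] ./ sum(counts)   # normalised non-empty bins
    return -0.5 * sum(p .* log2.(p))
end
```

Because nothing here crosses into Python, all intermediate arrays are ordinary Julia objects that Julia's own garbage collector can reclaim.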
