I'm extremely dissatisfied after translating a program from Python to Julia:
- for small/very small inputs, Python is faster
- for medium inputs, Julia is faster (but not that much)
- for big inputs, Python is faster
I think the reason is that I don't understand how memory allocation works (autodidact here, no CS background). I would post my code, but it is too long and too specific to be useful to anybody but me. So I ran some experiments, and now I have some questions.
Consider this simple script.jl:
function main()
    @time begin
        a = [1,2,3]
    end
end
main()
When I run it I get:
$ julia script.jl
0.000004 seconds (1 allocation: 96 bytes)
1. Why 96 bytes? When I set a = [] I get 64 bytes (why does an empty array weigh so much?). 96 bytes - 64 bytes = 32 bytes. But a is an Array{Int64,1}, and 3 * 64 bits = 3 * 8 bytes = 24 bytes != 32 bytes.
2. Why do I get 96 bytes even if I set a = [1,2,3,4]?
3. Why do I get 937.500 KB when I run this:
function main()
    @time begin
        for _ in 1:10000
            a = [1,2,3]
        end
    end
end
main()
and not 960.000 KB?
4. Why is, for instance, filter() so inefficient? Take a look at this:
check(n::Int64) = n % 2 == 0
function main()
    @time begin
        for _ in 1:1000
            a = [1,2,3]
            b = []
            for x in a
                check(x) && push!(b,x)
            end
            a = b
        end
    end
end
main()
$ julia script.jl
0.000177 seconds (3.00 k allocations: 203.125 KB)
With filter() instead:
check(n::Int64) = n % 2 == 0
function main()
    @time begin
        for _ in 1:1000
            a = [1,2,3]
            a = filter(check,a)
        end
    end
end
main()
$ julia script.jl
0.002029 seconds (3.43 k allocations: 225.339 KB)
And if I use an anonymous function (x -> x % 2 == 0) instead of check inside filter, I get:
$ julia script.jl
0.004057 seconds (3.05 k allocations: 206.555 KB)
Why should I use a built-in function if it is slower and needs more memory?
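To double-check the arithmetic in question 1, here is a minimal sketch that measures only the element data with sizeof. As far as I understand, sizeof on an array reports the bytes of the stored elements and nothing else, so any header or padding would not show up here. The variable names are only for illustration, and Int64 is spelled out so the numbers do not depend on the machine's default Int.

```julia
# Sketch: payload size (element data only) of the arrays from question 1.
# sizeof(array) is the number of elements times the size of one element.
empty_arr = Int64[]            # no elements
three     = Int64[1, 2, 3]     # 3 elements of 8 bytes each
four      = Int64[1, 2, 3, 4]  # 4 elements of 8 bytes each

println(sizeof(empty_arr))  # 0
println(sizeof(three))      # 24
println(sizeof(four))       # 32
```

So the 24 bytes I computed for three Int64 values agree with sizeof, and the extra bytes reported by @time presumably come from something other than the elements themselves.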