After seeing a couple tutorials on the internet on Julia parallelism I decided to implement a small parallel snippet for computing the harmonic series.
The serial code is:
harmonic = function (n::Int64)
x = 0
for i in n:-1:1 # summing backwards to avoid rounding errors
x +=1/i
end
x
end
And I made 2 parallel versions, one using @distributed
macro and another using the @everywhere
macro (julia -p 2
btw):
@everywhere harmonic_ever = function (n::Int64)
x = 0
for i in n:-1:1
x +=1/i
end
x
end
harmonic_distr = function (n::Int64)
x = @distributed (+) for i in n:-1:1
x = 1/i
end
x
end
However, when I run the above code and @time
it, I don't get any speedup - in fact, the @distributed
version runs significantly slower!
@time harmonic(10^10)
>>> 53.960678 seconds (29.10 k allocations: 1.553 MiB) 23.60306659488827
job = @spawn harmonic_ever(10^10)
@time fetch(job)
>>> 46.729251 seconds (309.01 k allocations: 15.737 MiB) 23.60306659488827
@time harmonic_distr(10^10)
>>> 143.105701 seconds (1.25 M allocations: 63.564 MiB, 0.04% gc time) 23.603066594889185
What completely and absolutely baffles me is the "0.04% gc time
". I'm clearly missing something and also the examples I saw weren't for 1.0.1 version (one for example used @parallel
).