Julia - the way of kings (generator performance)

Question

I had some python code which I tried to port to Julia to learn this lovely language. I used generators in python. After porting it seems to me (at this moment) that Julia is really slow in this area!

I made part of my code simplified to this exercise:

Think 4x4 chess board. Find every N-moves long path, chess king could do. In this exercise, the king is not allowed to leap twice at the same position in one path. Don't waste memory -> make a generator of every path.

Algorithm is pretty simple:

if we sign every position with numbers:

0  1  2  3
4  5  6  7
8  9  10 11
12 13 14 16

point 0 has 3 neighbors (1, 4, 5). We could find a table for every neighbor for every point:

NEIG = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]]

PYTHON

A recursive function (generator) which enlarge given path from the list of points or from a generator of (generator of ...) points:

def enlarge(path):
    if isinstance(path, list):
        for i in NEIG[path[-1]]:
            if i not in path:
                yield path[:] + [i]
    else:
        for i in path:
            yield from enlarge(i)

Function (generator) which give every path with given length

def paths(length):
    steps = ([i] for i in range(16))  # first steps on every point on board
    for _ in range(length-1):
        nsteps = enlarge(steps)
        steps = nsteps
    yield from steps

We could see that there are 905776 paths with length 10:

sum(1 for i in paths(10))
Out[89]: 905776

JULIA (this code was created by @gggg during our discussion here )

const NEIG_py = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]];
const NEIG = [n.+1 for n in NEIG_py]
function enlarge(path::Vector{Int})
    (push!(copy(path),loc) for loc in NEIG[path[end]] if !(loc in path))
end
collect(enlarge([1]))
function enlargepaths(paths)
    Iterators.Flatten(enlarge(path) for path in paths)
end
collect(enlargepaths([[1],[2]]))
function paths(targetlen)
    paths = ([i] for i=1:16)
    for newlen in 2:targetlen
        paths = enlargepaths(paths)
    end
    paths
end
p = sum(1 for path in paths(10))

benchmark

In ipython we could time it:

python 3.6.3:

%timeit sum(1 for i in paths(10))
1.25 s ± 15.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

julia 0.6.0

julia> @time sum(1 for path in paths(10))
  2.690630 seconds (41.91 M allocations: 1.635 GiB, 11.39% gc time)
905776

Julia 0.7.0-DEV.0

julia> @time sum(1 for path in paths(10))
  4.951745 seconds (35.69 M allocations: 1.504 GiB, 4.31% gc time)
905776

Question(s):

We Julians are saying this: It is important to note that the benchmark codes are not written for absolute maximal performance (the fastest code to compute recursion_fibonacci(20) is the constant literal 6765). Instead, the benchmarks are written to test the performance of identical algorithms and code patterns implemented in each language.

In this benchmark, we are using the same idea. Just simple for cycles over arrays enclosed to generators. (Nothing from numpy, numba, pandas or others c-written and compiled python packages)

Is assumption that Julia's generators are terribly slow right?

What could we do to make it really fast?

`paths(10)` gives `Base.Iterators.Flatten{Base.Generator{Base.Iterators.Flatten{Base.Generator{Base.Iterators.Flatten{Base.Generator{Base.Iterators.Flatten{Base.Generator{Base.Iterators.Flatten{Base.Generator{Base.Iterators.Flatten{Base.Generator{Base.Iterators.Flatten{Base.Generator{Base.Iterators.Flatten{Base.Generator{Base.Iterators.Flatten{Base.Generator{Base.Generator{UnitRange{Int64},##15#16},#enlarge}},#enlarge}},#enlarge}},#enlarge}},#enlarge}},#enlarge}},#enlarge}},#enlarge}},#enlarge}}(Base.Generator{Base.Iterators.F`... I don't think this looks like its going to be fast. — daycaster, Nov 03 '17 at 08:47
Sorry if this is obvious, but you are running the julia one twice and timing the second run not the first right? Otherwise you are just timing the compile time... — Colin T Bowers, Nov 03 '17 at 10:12
I know it's cheating but the most effective way to get what you want is likely `enumerate_paths` (`.pathcounts`) in the LightGraphs package. — Michael K. Borregaard, Nov 03 '17 at 10:12
Looking at the results of the proposed Julia code, I'm not sure if the numbers are correct. For example, eyeballing: `[i for i in paths(3) if i[1] == 1]` I can't see any paths where the king would return to position 1. So instead of the 15 paths from positon 1 this should be 18, right? I might be misunderstanding the problem proposition here. — niczky12, Nov 03 '17 at 11:34
When you ask "What could we do to make it really fast?" you mean necessarily using generators? — Angel de Vicente, Nov 03 '17 at 14:35
yes, right now generators aren't the best way to program but they work. In any case you need to optimize it, you can use a loop, so this has an easy workaround for the time being. There are a few type-inference and inlining issues that come up with generators that cause this, for now, which hopefully get fixed in a 1.x. — Chris Rackauckas, Nov 03 '17 at 15:30
@daycaster it could look horrible but I think there is just several long names created recursively. — Liso, Nov 03 '17 at 16:58
@ColinTBowers I would be glad to be corrected. But I am afraid it is really so slow in julia. Measured time is just approximation. `@btime` in julia also get different result than `@time`. — Liso, Nov 03 '17 at 17:02
@niczky12 thanks for pointing this out! I was too quick to describe "excercise". We are looking path where king don't leap twice at same position. this part of code `if !(loc in path)` is used for this. I will edit question immediately. — Liso, Nov 03 '17 at 17:06
@AngeldeVicente I meant to help improve generator code. Degrading performance in 0.7 suggest that there could be some "unpolished" part of code. — Liso, Nov 03 '17 at 17:11
@ChrisRackauckas sorry - maybe it is because I am not so good in julia maybe my brain has another issues but I don't see any easy (good readable) workaround. I am a little afraid that problem could be deeper in design of generators and that has to be revised before 1.0. — Liso, Nov 03 '17 at 17:15
@MattB. Sometimes quick and dirty (means for example slow) solution is good if you need to do something in REPL and it is not for production code. I was also thinking about QuickAndDirty package for this purposes :) I don't think that generators are second class citizens in Julia (maybe I am wrong). What do you mean with type instability in this case? Could you be more precise please? BTW `repr(paths(10))==repr(paths(10))` returns true. — Liso, Nov 03 '17 at 17:27
@MichaelK.Borregaard thanks! I am going to look at it! :) Cheating could be disqualified for pure benchmarking but anyway it could fine to see faster julia code solving this problem! :) — Liso, Nov 03 '17 at 17:38

score 6 · Answer 1 · answered Nov 03 '17 at 20:29

6

const NEIG_py = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]];
const NEIG = [n.+1 for n in NEIG_py];

function expandto(n, path, targetlen)
    length(path) >= targetlen && return n+1
    for loc in NEIG[path[end]]
        loc in path && continue
        n = expandto(n, (path..., loc), targetlen)
    end
    n
end

function npaths(targetlen)
    n = 0
    for i = 1:16
        path = (i,)
        n = expandto(n, path, targetlen)
    end
    n
end

Benchmark (after executing once for JIT-compilation):

julia> @time npaths(10)
  0.069531 seconds (5 allocations: 176 bytes)
905776

which is considerably faster.

answered Nov 03 '17 at 20:29

tholy

11,882
1
29
42

Thanks @tholy! :) But task is generate all path instead of find number of path. Maybe I made it unclear with showing number of path with `sum(1 for path in paths(10))` - sorry for that, but it is really about paths generator. – Liso Nov 03 '17 at 20:58
1

If the task was to create all the paths, what was then the point in using generators 'save memory'? – DNF Nov 03 '17 at 21:06
@DNF sorry for misunderstanding (my english is not very good). Create (or generate) doesn't mean store every path into one big array (which could be bigger than computer's physical memory). But with generator you could check every path one by one (and forgot it immediately if doesn't satisfy some criteria). – Liso Nov 03 '17 at 21:27
3

This does generate all paths, as you can see by looking at how `expandto` works. But since you said "save memory" it also discards the paths immediately once it knows to count them. – tholy Nov 04 '17 at 09:59
1

The low allocation count of the first one is simply a testament to Julia's amazing compiler, which can infer so much that the first one's allocations are almost entirely elided to give you just the answer you want. As a demonstration that this strategy is general, you could replace the `length(path) >= targetlen && return n+1` line with anything you want, e.g., `length(path) >= targetlen && return n + sum(x->x==2, path)` to count the number of paths that visit node 2 (answer turns out to be 588510). This takes more allocations but still runs in 0.14s, which I bet python can't match. – tholy Nov 04 '17 at 10:08

mbauman · Answer 2 · 2017-11-03T21:51:10.487

Julia's "better performance" than Python isn't magical. Most of it stems directly from the fact that Julia can figure out what each variable's type within a function will be, and then compile highly specialized code for those specific types. This even applies to the elements in many containers and iterables like generators; Julia often knows ahead of time what type the elements will be. Python isn't able to do this analysis nearly as easily (or at all, in many cases), so its optimizations have focused on improving the dynamic behaviors.

In order for Julia's generators to know ahead of time what kinds of types they might produce, they encapsulate information about both the operation they perform and the object they iterate over in the type:

julia> (1 for i in 1:16)
Base.Generator{UnitRange{Int64},getfield(Main, Symbol("##27#28"))}(getfield(Main, Symbol("##27#28"))(), 1:16)

That weird ##27#28 thing is the type of an anonymous function that simply returns 1. By the time the generator gets to LLVM, it knows enough to perform quite a large number of optimizations:

julia> function naive_sum(c)
           s = 0
           for elt in c
               s += elt
           end
           s
       end
       @code_llvm naive_sum(1 for i in 1:16)

; Function naive_sum
; Location: REPL[1]:2
define i64 @julia_naive_sum_62385({ { i64, i64 } } addrspace(11)* nocapture nonnull readonly dereferenceable(16)) {
top:
; Location: REPL[1]:3
  %1 = getelementptr inbounds { { i64, i64 } }, { { i64, i64 } } addrspace(11)* %0, i64 0, i32 0, i32 0
  %2 = load i64, i64 addrspace(11)* %1, align 8
  %3 = getelementptr inbounds { { i64, i64 } }, { { i64, i64 } } addrspace(11)* %0, i64 0, i32 0, i32 1
  %4 = load i64, i64 addrspace(11)* %3, align 8
  %5 = add i64 %4, 1
  %6 = sub i64 %5, %2
; Location: REPL[1]:6
  ret i64 %6
}

It may take a minute to parse through the LLVM IR there, but you should be able to see that it's just extracting the endpoints of the UnitRange (getelementptr and load), subtracting them from each other (sub) and adding one to compute the sum without a single loop.

In this case, though, it works against Julia: paths(10) has a ridiculously complicated type! You're iteratively wrapping that one generator in filters and flattens and yet more generators. It becomes so complicated, in fact, that Julia just gives up trying to figure out with it and decides to live with the dynamic behavior. And at this point, it no longer has an inherent advantage over Python — in fact specializing on so many different types as it recursively walks through the object would be a distinct handicap. You can see this in action by looking at @code_warntype start(1 for i in paths(10)).

My rule of thumb for Julia's performance is that type-stable, devectorized code that avoids allocations is typically within an factor of 2 of C, and dynamic, unstable, or vectorized code is within an order of magnitude of Python/MATLAB/other higher level languages. Often it's a bit slower simply because the other higher level languages have pushed very hard to optimize their case, whereas the majority of Julia's optimizations have been focused on the type-stable side of things. This deeply nested construct puts you squarely in the dynamic camp.

So are Julia's generators terribly slow? Not inherently so; it's just when they become so deeply nested like this that you hit this bad case.

Thanks for your explanation! I am still a little suspicious about difficulty for julia. From my point of view 10 is small number. I expect that it has to do 10 times some kind of simple `type "arithmetics"`. I mean something like 10 times `flatten <| gen <| {T}` which has always to stay `generator{T}` like. Am I missing something? — Liso, Nov 04 '17 at 07:05

Angel de Vicente · Answer 3 · 2017-11-03T21:40:32.297

Not following the same algorithm (and don't know how fast Python would be doing it like this), but with the following code Julia is basically the same for solutions of length=10, and much better for solutions of length=16

In [48]: %timeit sum(1 for path in paths(10))                                                                                                          
1.52 s ± 11.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)                                                                                    

julia> @time sum(1 for path in pathsr(10))                                                                                                             
  1.566964 seconds (5.54 M allocations: 693.729 MiB, 16.24% gc time)                                                                                   
905776                                                                                                                                                 

In [49]: %timeit sum(1 for path in paths(16))                                                                                                          
19.3 s ± 15.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)                                                                                    

julia> @time sum(1 for path in pathsr(16))                                                                                                             
  6.491803 seconds (57.36 M allocations: 9.734 GiB, 33.79% gc time)                                                                                    
343184

Here is the code. I just learnt about tasks/channels yesterday, so probably it can be done better:

const NEIG = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], \
[5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]];

function enlarger(num::Int,len::Int,pos::Int,sol::Array{Int64,1},c::Channel)
    if pos == len
        put!(c,copy(sol))
    elseif pos == 0
        for j=0:num
            sol[1]=j
            enlarger(num,len,pos+1,sol,c)
        end
        close(c)
    else
        for i in NEIG[sol[pos]+1]
            if !in(i,sol[1:pos])
                sol[pos+1]=i
                enlarger(num,len,pos+1,sol,c)
            end
        end
    end
end

function pathsr(len)
    c=Channel(0)
    sol = [0 for i=1:len]
    @schedule enlarger(15,len,0,sol,c)
    (i for i in c)
end

Thanks! Maybe I am doing something wrong but I get: `ERROR (unhandled task failure): BoundsError: attempt to access 16-element Array{Array{Int64,1},1} at index [17]` for len>=3... — Liso, Nov 03 '17 at 21:13
I didn't modify the NEIG array, so it assumes const NEIG = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], \ [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]]; — Angel de Vicente, Nov 03 '17 at 21:36
I will modify the post to include this definition of NEIG to avoid misunderstandings — Angel de Vicente, Nov 03 '17 at 21:39
Nice! :) You could remove `\\` too - it could be simpler to copy+paste+try :) — Liso, Nov 03 '17 at 21:56

score 2 · Answer 4 · answered Nov 04 '17 at 00:11

Following tholy's answer, since tuples seem to be very fast. This is like my previous code, but with the tuple stuff, and it gets substantially better results:

julia> @time sum(1 for i in pathst(10))
  1.155639 seconds (1.83 M allocations: 97.632 MiB, 0.75% gc time)
905776

julia> @time sum(1 for i in pathst(16))
  1.963470 seconds (1.39 M allocations: 147.555 MiB, 0.35% gc time)
343184

The code:

const NEIG = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]];

function enlarget(path,len,c::Channel)
    if length(path) >= len
        put!(c,path)
    else
        for loc in NEIG[path[end]+1]
            loc in path && continue
            enlarget((path..., loc), len,c)
        end
        if length(path) == 1
            path[1] == 15 ? close(c) : enlarget((path[1]+1,),len,c)
        end
    end
end

function pathst(len)
    c=Channel(0)
    path=(0,)
    @schedule enlarget(path,len,c)
    (i for i in c)
end

Thanks! :) Really nice! There is less 16-step paths than 10-step paths => `Channel` could be hurdle here. By cheating with callback function (counter) instead of Channel I get `0.181510s` instead of `1.345273s` for 10-step paths. — Liso, Nov 04 '17 at 08:00
No surprise there are fewer 16-step paths since some shorter paths will close on themselves and then you cannot extend them further without repeating cells. In any case, the python code (which I took as the yardstick) gives the same number of paths for both 10- and 16-steps. — Angel de Vicente, Nov 04 '17 at 08:43
By the way, I'm just learning Julia myself, so I'm interested in the callback approach you mention. Can you post it here somehow or send it to me privately? Cheers. — Angel de Vicente, Nov 04 '17 at 08:48
I am learning too and it took me some time to do it (and I am pretty sure it is not optimal written). I'll create answer — Liso, Nov 04 '17 at 09:00

Dan Getz · Answer 5 · 2017-11-04T16:37:23.620

Since everybody is writing an answer... here is another version, this time using Iterators, which are kind-of more idiomatic than generators in current Julia (0.6.1). Iterators offer many of the benefits generators have. The iterator definition is in the following lines:

import Base.Iterators: start, next, done, eltype, iteratoreltype, iteratorsize

struct SAWsIterator
    neigh::Vector{Vector{Int}}
    pathlen::Int
    pos::Int
end

SAWs(neigh, pathlen, pos) = SAWsIterator(neigh, pathlen, pos)

start(itr::SAWsIterator) = 
    ([itr.pos ; zeros(Int, itr.pathlen-1)], Vector{Int}(itr.pathlen-1),
     2, Ref{Bool}(false), Ref{Bool}(false))

@inline next(itr::SAWsIterator, s) = 
    ( s[4][] ? s[4][] = false : calc_next!(itr, s) ; 
      (s[1], (s[1], s[2], itr.pathlen, s[4], s[5])) )

@inline done(itr::SAWsIterator, s) = ( s[4][] || calc_next!(itr, s) ; s[5][] )

function calc_next!(itr::SAWsIterator, s)
    s[4][] = true ; s[5][] = false
    curindex = s[3]
    pathlength = itr.pathlen
    path, options = s[1], s[2]
    @inbounds while curindex<=pathlength
        curindex == 1 && ( s[5][] = true ; break )
        startindex = path[curindex] == 0 ? 1 : options[curindex-1]+1
        path[curindex] = 0
        i = findnext(x->!(x in path), neigh[path[curindex-1]], startindex)
        if i==0
            path[curindex] = 0 ; options[curindex-1] = 0 ; curindex -= 1
        else
            path[curindex] = neigh[path[curindex-1]][i]
            options[curindex-1] = i ; curindex += 1
        end
    end
    return nothing
end

eltype(::Type{SAWsIterator}) = Vector{Int}
iteratoreltype(::Type{SAWsIterator}) = Base.HasEltype()
iteratorsize(::Type{SAWsIterator}) = Base.SizeUnknown()

Cut-and-pasting the definition above works. The term SAW was used as an acronym of Self Avoiding Walk, which is sometimes used in mathematics for such a path.

Now, to use/test this iterator, the following code can be executed:

allSAWs(neigh, pathlen) = 
  Base.Flatten(SAWs(neigh,pathlen,k) for k in eachindex(neigh))

iterlength(itr) = mapfoldl(x->1, +, 0, itr)

using Base.Test

const neigh = [[2, 5, 6], [1, 3, 5, 6, 7], [2, 4, 6, 7, 8], [3, 7, 8], 
  [1, 2, 6, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 4, 6, 8, 10, 11, 12], 
  [3, 4, 7, 11, 12], [5, 6, 10, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], 
  [6, 7, 8, 10, 12, 14, 15, 16], [7, 8, 11, 15, 16], [9, 10, 14], 
  [9, 10, 11, 13, 15], [10, 11, 12, 14, 16], [11, 12, 15]]

@test iterlength(allSAWs(neigh, 10)) == 905776

for (i,path) in enumerate(allSAWs(neigh, 10))
    if i % 100_000 == 0
        @show i,path
    end
end

@time iterlength(allSAWs(neigh, 10))

It is relatively readable, and the output looks like this:

(i, path) = (100000, [2, 5, 10, 14, 9, 6, 7, 12, 15, 11])
(i, path) = (200000, [4, 3, 8, 7, 6, 10, 14, 11, 16, 15])
(i, path) = (300000, [5, 10, 11, 16, 15, 14, 9, 6, 7, 3])
(i, path) = (400000, [8, 3, 6, 5, 2, 7, 11, 14, 15, 10])
(i, path) = (500000, [9, 14, 10, 5, 2, 3, 8, 11, 6, 7])
(i, path) = (600000, [11, 16, 15, 14, 10, 6, 3, 8, 7, 12])
(i, path) = (700000, [13, 10, 15, 16, 11, 6, 2, 1, 5, 9])
(i, path) = (800000, [15, 11, 12, 7, 2, 3, 6, 1, 5, 9])
(i, path) = (900000, [16, 15, 14, 9, 5, 10, 7, 8, 12, 11])
  0.130755 seconds (4.16 M allocations: 104.947 MiB, 11.37% gc time)
905776

0.13s is not too bad considering this is not as optimized as @tholy's answer, or some others'. Some tricks used in the other answers are deliberately not used here, specifically:

recursion basically uses the stack as a quick way to allocate.
Using tuples combined with specialization hides some run-time complexity in the first compile of methods for each tuple signature.

An optimization not seen in the answers yet could be important is using an efficient Bool array or Dict to speed up the check if a vertex was already used in the path. In this answer, the findnext triggers an allocation, which can be avoided and then this answer will be closer to the minimal memory allocation count.

Cool! :D Thanks! :) I need a time to study it (and learn). One newbie question: are the first and the last bracket in next function necessary? — Liso, Nov 04 '17 at 18:24
The brackets are necessary because it is a multi-statement expression (notice the semi-colon). — Dan Getz, Nov 04 '17 at 21:54
I just test it and found that it returns also wrong paths with zero index inside. (I also found strange behavior which I like to discuss on discourse) — Liso, Nov 13 '17 at 10:32

Liso · Answer 6 · 2017-11-04T10:50:02.730

This is my quick and dirty cheating experiment (I promised to add it here in comment) where I am trying to speedup Angel's code:

const NEIG_py = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]];
const NEIG = [n.+1 for n in NEIG_py]

function enlargetc(path,len,c::Function)
    if length(path) >= len
        c(path)
    else
        for loc in NEIG[path[end]]
            loc in path && continue
            enlargetc((path..., loc), len,c)
        end
        if length(path) == 1
            if path[1] == 16 return
            else enlargetc((path[1]+1,),len,c)
            end
        end
    end
end

function get_counter()
    let helper = 0
        function f(a)
            helper += 1
            return helper
        end
        return f
    end
end

counter = get_counter()
@time enlargetc((1,), 10, counter)  # 0.481986 seconds (2.62 M allocations: 154.576 MiB, 5.12% gc time)
counter.helper.contents  # 905776

EDIT: time in comment is without recompilation! After recompilation it was 0.201669 seconds (2.53 M allocations: 150.036 MiB, 10.77% gc time).

Julia - the way of kings (generator performance)

6 Answers6