5

I'm trying to search for the desired pattern (variable template) in the array X. The length of the template is 9.

I'm doing something like:

function check_alloc{T <: ZeroOne}(x :: AbstractArray{T}, temp :: AbstractArray{T})
    s = 0
    for i in 1 : 1000
        myView = view(x, i : i + 9)
        if myView == temp
            s += 1
        end
    end
    return s
end

and obtain unexpected memory allocations (46 Kbytes) in this short loop. Why do it happen and how can I prevent memory allocations and performance degradation?

Kirill Tsar.
  • 293
  • 1
  • 8
  • What is `ZeroOne`? Also, you say the pattern you are searching for has length 9 but you are creating a view `i:i+9`which has length 10. – carstenbauer Dec 01 '17 at 10:45
  • 1
    This is not about `view`, but `==` operation (you can comment-out it and see `@time`). You can rewrite this comparison manually or look to `@edit (==)(AbstractArray[], AbstractArray[])` – Sairus Dec 01 '17 at 10:50
  • Maybe also look at https://stackoverflow.com/questions/36346005/julia-does-an-array-contain-a-specific-sub-array – carstenbauer Dec 01 '17 at 10:56
  • ZeroOne is the Union{Bool, Int8, UInt8}. It is true that view takes no memory to create, but why is (==) operation so slow? Why do it need such a lot of memory? – Kirill Tsar. Dec 01 '17 at 11:52
  • `==` between two arrays (or `.==`) creates a temporary array. You will want to `res .= myView .== temp` to do the computation inplace, or write out the loop. – Chris Rackauckas Dec 01 '17 at 14:10
  • 1
    `==` doesn't create a temporary array, but `.==` does. – tholy Dec 02 '17 at 10:55
  • Related - [Too many allocations when indexing with slices](https://discourse.julialang.org/t/13532). – Royi Aug 17 '18 at 11:31

3 Answers3

19

The reason you're getting allocations is because view(A, i:i+9) creates a small object called a SubArray. This is just a "wrapper" that essentially stores a reference to A and the indices you passed in (i:i+9). Because the wrapper is small (~40 bytes for a one-dimensional object), there are two reasonable choices for storing it: on the stack or on the heap. "Allocations" refer only to heap memory, so if Julia can store the wrapper on the stack it would report no allocations (and would also be faster).

Unfortunately, some SubArray objects currently (as of late 2017) have to be stored on the heap. The reason is because Julia is a garbage-collected language, which means that if A is a heap-allocated object that is no longer in use, then A might be freed from memory. The key point is this: currently, references to A from other variables are counted only if those variables are stored on the heap. Consequently, if all SubArrays were stored on the stack, you would have a problem for code like this:

function create()
    A = rand(1000)
    getfirst(view(A, 1:10))
end

function getfirst(v)
    gc()   # this triggers garbage collection
    first(v)
end

Because create doesn't use A again after that call to getfirst, it's not "protecting" A. The risk is that the gc call could end up freeing the memory associated with A (and thus breaking any usage of entries in v itself, since v relies on A), unless having v protects A from being garbage-collected. But currently, stack-allocated variables can't protect heap-allocated memory: the garbage collector only scans variables that are on the heap.

You can watch this in action with your original function, modified to be slightly less restrictive by getting rid of the (irrelevant, for these purposes) T<:ZeroOne and allowing any T.

function check_alloc(x::AbstractArray{T}, temp::AbstractArray{T}) where T
    s = 0
    for i in 1 : 1000
        myView = view(x, i : i + 9)
        if myView == temp
            s += 1
        end
    end
    return s
end

a = collect(1:1010);      # this uses heap-allocated memory
b = collect(1:10);

@time check_alloc(a, b);  # ignore the first due to JIT-compilation
@time check_alloc(a, b)

a = 1:1010                # this doesn't require heap-allocated memory
@time check_alloc(a, b);  # ignore due to JIT-compilation
@time check_alloc(a, b)

From the first one (with a = collect(1:1010)), you get

julia> @time check_alloc(a, b)
  0.000022 seconds (1.00 k allocations: 47.031 KiB)

(notice this is ~47 bytes per iteration, consistent with the size of the SubArray wrapper) but from the second (with a = 1:1010) you get

julia> @time check_alloc(a, b)
  0.000020 seconds (4 allocations: 160 bytes)

There's an "obvious" fix to this problem: change the garbage collector so that stack-allocated variables can protect heap-allocated memory. That will happen some day, but it's an extremely complex operation to support properly. So for now, the rule is that any object that contains a reference to heap-allocated memory must be stored on the heap.

There's one final subtlety: Julia's compiler is quite smart, and in some cases elides the creation of the SubArray wrapper (basically, it rewrites your code in a way that uses the parent array object and the indices separately so that it never needs the wrapper itself). For that to work, Julia has to be able to inline any function calls into the function that created the view. Unfortunately, here == is slightly too big for the compiler to be willing to inline it. If you manually write out the operations that will be performed, then the compiler will elide the view and you'll also avoid allocations.

tholy
  • 11,882
  • 1
  • 29
  • 42
  • 1
    Thank you for sharing in depth knowledge on what's going on behind the scene. Exposed from - [Too many allocations when indexing with slices](https://discourse.julialang.org/t/13532). – Royi Aug 17 '18 at 11:30
1

This at least works for arbitrary sized temp and x but still has ~KB allocations.

function check_alloc{T}(x :: AbstractArray{T}, temp :: AbstractArray{T})
    s = 0
    pl = length(temp)
    for i in 1:length(x)-pl+1
        @views if x[i:i+pl-1] == temp
            s += 1
        end
    end
    return s
end

EDIT: As suggested by @Sairus in the comments, one can do something in the spirit of this:

function check_alloc2{T}(x :: AbstractArray{T}, temp :: AbstractArray{T})
    s = 0
    pl = length(temp)
    plr = 1:pl
    for i in 1:length(x)-pl+1
        same = true
        for k in plr
            @inbounds if x[i+k-1] != temp[k]
                same = false
                break
            end
        end
        if same
            s+=1
        end
    end
    return s
end

This has no allocations:

julia> using BenchmarkTools

julia> a = collect(1:1000);

julia> b = collect(5:12);

julia> @btime check_alloc2($a,$b);
  1.195 μs (0 allocations: 0 bytes)
carstenbauer
  • 9,817
  • 1
  • 27
  • 40
  • Thank you very much! I was completely puzzled with this situation. Is it really needed for (==) operation to allocate such a lot of memory? – Kirill Tsar. Dec 01 '17 at 11:55
1

As of Julia 1.7.0 (maybe even earlier), the first code from @carstenbauer with a view, does not allocate any more (on the heap):

function check_alloc(x :: AbstractArray{T}, temp :: AbstractArray{T}) where T
     s = 0
     pl = length(temp)
     for i in 1:length(x)-pl+1
          @views if x[i:i+pl-1] == temp
             s += 1
           end
     end
     return s
end
using BenchmarkTools
a = collect(1:1000);
b = collect(5:12);
@btime check_alloc($a,$b);
# returns
#  8.495 μs (0 allocations: 0 bytes)
Alex338207
  • 1,825
  • 12
  • 16