
I am trying to run a neural net on Metal. The basic idea is data duplication: each GPU thread runs one copy of the net on random data points.

I have written other shaders that work fine.

I also tried my code in a C++ command line app. No errors there. The shader itself compiles without errors too.

I used the Apple documentation to convert the code to Metal's C++ dialect, since not everything from C++11 is supported.

It crashes after loading the kernel function, at the point where newComputePipelineStateWithFunction is called on the Metal device. This means there is a problem with the code that isn't caught at compile time.

MCVE:

kernel void net(const device float *inputsVector [[ buffer(0) ]], // layout of net *
                uint id [[ thread_position_in_grid ]]) {

    uint floatSize = sizeof(tempFloat);
    uint inputsVectorSize = sizeof(inputsVector) / floatSize;

    float newArray[inputsVectorSize];


    float test = inputsVector[id];

    newArray[id] = test;

}

Update

It has everything to do with dynamic arrays.

Since it fails while creating the pipeline state, and not while running the actual shader, it must be a coding issue and not an input issue.

Assigning values from a dynamic array to a buffer makes it fail.

R Menke
  • The amount of text in your question is terrifying. I did read it all, and still can't answer the very basic questions: where is the minimum reproducible example of code, and what do you understand by "don't work". – Codeguard Aug 28 '15 at 21:34
  • You should replace everything with minimum code that reproduces error, and precise description of what the error is. – Codeguard Aug 28 '15 at 21:34
  • Are you familiar with Metal? Then you know it is not possible at the moment to pinpoint the line of code that gives the error, or get a detailed state and find values that break the program. Anyway I asked the question after working on this problem for one day. I kept adding what I tried and found out. Sorry about the amount of text, but since this is all new territory it is better to add more info. – R Menke Aug 28 '15 at 23:27
  • The exact error is clearly stated in the title. The offensive code is found in the example. The complete, but hugely simplified shader is added to make it possible for others to quickly try and reproduce my crashes. Comments are added in the code to show where I found a piece that causes a crash. – R Menke Aug 28 '15 at 23:29
  • p.s. "What is a bounty? How can I start one? If you’ve asked a good question, edited it with status and progress updates, and still are not receiving answers, you can draw attention to your question by placing a bounty on it." – R Menke Aug 29 '15 at 13:19
  • you **can** do it your way, but you will probably get no answers. – Codeguard Aug 29 '15 at 19:06
  • From your profile I am guessing you don't code for mac. So you don't have any idea what a pain Metal can be. I would appreciate it if we can keep this on topic and not make it about form. When and if ever Apple makes it possible to understand why a certain pipeline failed there will be plenty of questions in good form. Until then it will be messy, frustrating and without a lot of answers. Maybe I get lucky and someone already had a similar issue. – R Menke Aug 29 '15 at 19:25
  • Since I am not familiar with Metal (I have some experience with nVidia CUDA), I can only offer some general suggestions: 1) kernel code like the above can fail for memory overflow from the buffered parameters. In your example above how big are the [[ buffer(?) ]] stores? Are there any differences between your working examples and the failing examples in terms of the way the buffer values are accessed? 2) I have found that it often helps to implement old fashioned debugging using printf on the device. Have you tried this? This often gives me insight into the tom-foolery I have implemented. – William Jones Aug 30 '15 at 11:36
  • thx for the suggestions! 1) The buffers are between 1 and 32 floats at the moment. I have run buffers with 500,000 values in them. No problem. Also the program fails before actually assigning the buffers. Metal goes through a check when it starts a new function. Obviously this should all be done at compile time resulting in compile errors. 2) using print wouldn't help very much since there is something wrong with the actual code and not the values. So checking which line it failed on isn't possible, since it only "read" the shader, deemed it not worthy and gave up. – R Menke Aug 30 '15 at 12:12
  • @RMenke sounds an awful lot like memory management issues. – user2398029 Aug 30 '15 at 13:20
  • Also "just don't add the actual outcome of the calculations to the output it runs fine" should be your starting point for debugging. – user2398029 Aug 30 '15 at 13:22
  • @louism I don't think it is a memory management issue. At least not a real one. Then it would crash on running the shader. It is possible that Metal thinks the shader might produce memory trouble and therefore refuses to run it. – R Menke Aug 30 '15 at 14:55
  • @louism I tried to explain it better how Metal works and at what point it crashes. Adding more txt, making the entire thing more terrifying and gathering more down votes. But I just want to find a solution to this. – R Menke Aug 30 '15 at 15:03
  • "Assigning values from a dynamic array to a buffer makes it fail" - more evidence that this could be memory management issues? Sorry, I don't have the time to go through the whole thing in detail. It's just a hunch. – user2398029 Aug 31 '15 at 03:05
  • @louism I would normally agree. But is it possible to have memory management issues on code that is not running? On values that are not initialised, not assigned any values? If so, how does the Metal API determine that there will be a memory management issue? – R Menke Aug 31 '15 at 04:05
  • At the moment I am trying something new. Since buffers are basically dynamic arrays, I am removing all arrays created inside the shader and replacing them with buffers. Which means a bit more work for the cpu, but worth it, if it works. – R Menke Aug 31 '15 at 13:52
  • At everyone! Thanks for the help and sorry for being a stubborn ass. Turns out it is a memory issue disguised as an initialising issue. It fails to determine how much memory it needs during the init. – R Menke Sep 01 '15 at 16:31

2 Answers


The real problem: It is a memory issue!

To all the people saying that it was a memory issue, you were right! Here is some pseudo code to illustrate it. Sorry that it is Swift-flavoured, but that is easier to read. Metal shaders have a funky way of coming to life. They are first initialised without values, just to reserve the memory. It was this step that failed, because it relied on a later step: setting the buffer.

It all comes down to which values are available when. My understanding of newComputePipelineStateWithFunction was wrong. It is not simply getting the shader function. It is also a tiny step in the initialising process.

class MetalShader {

    // buffers
    var aBuffer : [Float]
    var aBufferCount : Int

    // step One : newComputePipelineStateWithFunction
    memory init() {
        // assign shader memory

        // create memory for one int
        let aStaticValue : Int
        // create memory for one int
        var aNotSoStaticValue : Int // this will succeed, assigns memory for one int

        // create memory for 10 floats
        var aStaticArray : [Float] = [Float](count: aStaticValue, repeatedValue: y) // this will succeed

        // create memory for x floats
        var aDynamicArray : [Float] = [Float](count: aBuffer.count, repeatedValue: y) // this will fail
        var aDynamicArray : [Float] = [Float](count: aBufferCount, repeatedValue: y) // this will fail

        let tempValue : Float // one float from a loop

    }

    // step Two : commandEncoder.setBuffer()
    assign buffers (buffers) {

        aBuffer = cpuMemoryBuffer

    }

    // step Three : commandEncoder.endEncoding()
    actual init() {
        // set shader values

        let aStaticValue : Int = 0

        var aNotSoStaticValue : Int = aBuffer.count

        var aDynamicArray : [Float] = [Float](count: aBuffer.count, repeatedValue: 1) // this could work, but the app already crashed before getting to this point.

    }

    // step Four : commandBuffer.commit()
    func shaderFunction() {
        // do stuff
        for i in 0..<aBuffer.count {

            let tempValue = aBuffer[i]

        }
    }
}

Fix:

I finally realised that buffers are technically dynamic arrays and instead of creating arrays inside the shader, I could also just add more buffers. This obviously works.
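
A minimal sketch of that fix, written as plain C++ on the CPU rather than as a real shader, so treat it as an illustration only. `netThread`, `runNet` and `scratch` are hypothetical names; in Metal, `inputs` and `scratch` would both be buffer arguments and `id` would come from `thread_position_in_grid`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for one kernel invocation.
// Instead of declaring an array inside the function, the working
// storage (`scratch`) is handed in by the caller, like an extra buffer.
void netThread(const float *inputs, float *scratch, std::size_t count, std::size_t id) {
    if (id >= count) {
        return; // guard against threads beyond the end of the data
    }
    scratch[id] = inputs[id];
}

// Host side: the size of `scratch` is decided here, before any "thread" runs.
std::vector<float> runNet(const std::vector<float> &inputs) {
    std::vector<float> scratch(inputs.size(), 0.0f);
    for (std::size_t id = 0; id < inputs.size(); ++id) {
        netThread(inputs.data(), scratch.data(), inputs.size(), id);
    }
    return scratch;
}
```

The point is that the host decides how big `scratch` is, so nothing inside the kernel needs a size that is only known at run time. If every thread needs its own working array, the scratch buffer would presumably have to hold one slice per thread.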

R Menke

I think your problem is with this line:

uint schemeVectorSize = sizeof(schemeVector) / uintSize;

Here schemeVector is a dynamically sized buffer, so, as in classic C++, you cannot use sizeof on it to get the number of elements. sizeof only works that way on arrays defined locally/statically in the Metal shader code.

Just imagine how it works internally: at compile time, the Metal compiler is supposed to turn the sizeof call into a constant, but it can't, since schemeVector is a parameter of your shader and can therefore have any size.

So for me the solution would be to compute schemeVectorSize in the C++/Objective-C/Swift part of your code and pass it as a parameter to the shader (a uniform, in OpenGL ES terminology).
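
To illustrate the point in plain C++ (the Metal shading language is C++-based, but this is a CPU sketch, not shader code): `sizeof` on a pointer parameter gives the size of the pointer, so the element count has to come from the caller. The function names are made up for the example.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a kernel parameter such as `const device uint *schemeVector`.
// sizeof only sees the pointer type, so this returns the pointer size divided
// by the element size (typically 1 or 2), not the number of elements.
std::size_t countViaSizeof(const unsigned int *schemeVector) {
    const std::size_t pointerBytes = sizeof(schemeVector);
    return pointerBytes / sizeof(unsigned int);
}

// The caller knows the real length and passes it in explicitly.
unsigned int sumScheme(const unsigned int *schemeVector, std::size_t schemeVectorSize) {
    unsigned int total = 0;
    for (std::size_t i = 0; i < schemeVectorSize; ++i) {
        total += schemeVector[i];
    }
    return total;
}
```

With a 32-element buffer, `countViaSizeof` still reports only the pointer-sized result, while `sumScheme` walks all 32 elements because the count was supplied from outside.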

VB_overflow
  • I did some research on how the compiler works and how Apple decided to keep it agnostic of many easy to catch bugs. I believe the code is not seen as wrong because you could declare a `uint` statically and use that to set the size of multiple arrays. But I think `newComputePipelineStateWithFunction` also reserves the memory for all things declared in the shader. So passing the `schemeVectorSize` as a parameter wouldn't (and doesn't) work. It still would not know how much memory to reserve for the non-buffer part of the code. – R Menke Sep 01 '15 at 13:41
  • What I am saying is that `uint schemeVectorSize = sizeof(schemeVector) / uintSize;` is equivalent to `uint schemeVectorSize = sizeof(const device uint*) / uintSize;` and this will probably give you `schemeVectorSize` being equal to 1 or 2 ... so this is probably not at all what you expect ... Have you tried to use the offline Metal compiler to compile the shader in order to see if you have any error? – VB_overflow Sep 01 '15 at 15:21
  • And see also [here](http://stackoverflow.com/questions/26461063/can-i-get-the-size-of-a-buffer-from-my-metal-shader) – VB_overflow Sep 01 '15 at 15:38
  • The correct answer there would have been, "yes you can". You can get the size of a buffer and use it for loops or if statements. You can't use it to build dynamic arrays. Because, as far as I can tell, every instance in the shader that is not a buffer has to be created before buffers are assigned. – R Menke Sep 01 '15 at 15:45
  • I always use the offline compiler. No errors there. See my first comment for why. – R Menke Sep 01 '15 at 15:48
  • I do not know what to say to you at this point ... you seem to refuse to see the issue ... Just go to the [Apple Metal forum](https://devforums.apple.com/community/ios/graphics/metal) and ask your question there if you need an indisputable answer from Apple engineers. – VB_overflow Sep 01 '15 at 15:49
  • I updated the question with some pseudo code explaining in great detail how I now understand it works. The method described in your answer still causes the same crash. – R Menke Sep 01 '15 at 16:28
  • Menke, you cannot call sizeof(pointer) inside shader code and expect it to work. You have to pass a constant buffer size into a shader invocation as a constant space value, typically wrapped in a struct. The shader code does not know the length of buffers that will be passed into the shader later on after it was created. – MoDJ Sep 06 '18 at 01:11