I am using VS 2012, .NET 4.5 (64-bit), and CUDAfy 1.12, and I have the following proof of concept:

using System;
using System.Runtime.InteropServices;
using Cudafy;
using Cudafy.Host;
using Cudafy.Translator;

namespace Test
{
[Cudafy(eCudafyType.Struct)]
[StructLayout(LayoutKind.Sequential)]
public struct ChildStruct
{
    [MarshalAs(UnmanagedType.LPArray)]
    public float[] FArray;
    public long FArrayLength;
}

[Cudafy(eCudafyType.Struct)]
[StructLayout(LayoutKind.Sequential)]
public struct ParentStruct
{
    public ChildStruct Child;
}

public class Program
{
    [Cudafy]
    public static void KernelFunction(GThread gThread, ParentStruct parent)
    {
        long length = parent.Child.FArrayLength;
    }

    public static void Main(string[] args)
    {
        var module = CudafyTranslator.Cudafy(
          ePlatform.x64, eArchitecture.sm_35,
          new[] {typeof(ChildStruct), typeof(ParentStruct), typeof(Program)});
        var dev = CudafyHost.GetDevice();
        dev.LoadModule(module);

        float[] hostFloat = new float[10];
        for (int i = 0; i < hostFloat.Length; i++) { hostFloat[i] = i; }

        ParentStruct parent = new ParentStruct
        {
            Child = new ChildStruct
            {
                FArray = dev.Allocate(hostFloat),
                FArrayLength = hostFloat.Length
            }
        };

        dev.Launch(1, 1, KernelFunction, parent);

        Console.ReadLine();
    }
}
}

When the program runs, I get the following error at the dev.Launch call:

Type 'Test.ParentStruct' cannot be marshaled as an unmanaged structure; no meaningful size or offset can be computed.

If I remove the float array from the ChildStruct, it works as expected.

Having worked with C, C++/CLI, and CUDA C in the past, I am aware of the nature of the error. Some solutions to this error suggest setting the size manually (the Size parameter of StructLayout, or SizeConst on MarshalAs), but this is not possible due to the variety of types within the struct.

I looked at the generated .cu file, and it declares the float array as a float *, which is what I expected.

Is there a way to pass an array within a struct to the kernel? If there isn't, what is the next-best alternative? This problem doesn't exist in CUDA C; it arises only because we are marshaling from the CLR.

Adam
  • Does this also mean that List would not be possible with CUDAFY? – Hans Rudel Jun 04 '13 at 16:01
  • First, it needs to be an array that you are sending, not a list. However, I don't think this is a problem unless your struct has an array. – Adam Jun 04 '13 at 16:59
  • OK, cool. The struct only contains DateTime, string, decimal, etc. I have a feeling the decimal might be an issue though, as I didn't see it mentioned in "CUDA by Example". +1 on your comment and question, by the way ;) – Hans Rudel Jun 04 '13 at 17:10

2 Answers

I spent a good deal of time reading the CUDAfy source code to see whether there is a solution to this problem.

CUDAfy tries to make things simple for .NET developers and to shield them from IntPtr and other pointer concepts. However, that level of abstraction makes it very hard to solve this problem without a major refactoring of how the library works.

Not being able to send a float array within a struct is a showstopper. I ended up calling the CUDA Runtime directly through P/Invoke instead of using CUDAfy.

Adam
  • Or you could have sent a pseudo-pointer through another array. As a C# and C++/ASM dev, I've used this quite often; IMO C# integration nowadays requires such solutions: decoupled memory management. – Jamie Nicholl-Shelley Sep 21 '16 at 05:25
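
One possible reading of the "pseudo-pointer" suggestion above is sketched here: the struct stores only an offset and a length (both blittable), while the float data lives in a single flat array that would be passed to the kernel as its own argument. The names ArraySlice, FlatBufferDemo, Pack, and Sum are hypothetical and not part of CUDAfy, and the Sum method runs on the CPU purely as a stand-in for what device code would do with the same indices.

```csharp
using System;

// Hypothetical blittable descriptor: no array reference, only a position
// inside a shared flat buffer.
public struct ArraySlice
{
    public int Offset;  // index of the first element in the flat buffer
    public int Length;  // number of elements that belong to this slice
}

public static class FlatBufferDemo
{
    // Packs several arrays into one flat buffer and records where each starts.
    public static float[] Pack(float[][] arrays, out ArraySlice[] slices)
    {
        slices = new ArraySlice[arrays.Length];
        int total = 0;
        for (int i = 0; i < arrays.Length; i++)
        {
            slices[i] = new ArraySlice { Offset = total, Length = arrays[i].Length };
            total += arrays[i].Length;
        }

        float[] flat = new float[total];
        for (int i = 0; i < arrays.Length; i++)
        {
            Array.Copy(arrays[i], 0, flat, slices[i].Offset, arrays[i].Length);
        }
        return flat;
    }

    // CPU stand-in for kernel code: resolves the pseudo-pointer against the buffer.
    public static float Sum(float[] flat, ArraySlice slice)
    {
        float sum = 0f;
        for (int i = 0; i < slice.Length; i++)
        {
            sum += flat[slice.Offset + i];
        }
        return sum;
    }

    public static void Main()
    {
        float[][] arrays = { new float[] { 1f, 2f, 3f }, new float[] { 10f, 20f } };
        ArraySlice[] slices;
        float[] flat = Pack(arrays, out slices);

        Console.WriteLine(Sum(flat, slices[0])); // prints 6
        Console.WriteLine(Sum(flat, slices[1])); // prints 30
    }
}
```

With this layout the struct no longer contains a managed reference, so the marshaling error from the question should not apply to it; the flat buffer and the slice descriptors would each travel to the device as ordinary arrays.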

This is a limitation of .NET, not CUDAfy. Data must be blittable, and an array without a fixed size is not. The following is valid and is based on the CUDAfy unit tests on CodePlex:

[Cudafy]
[StructLayout(LayoutKind.Sequential, Size=64, CharSet = CharSet.Unicode)]
public unsafe struct PrimitiveStruct
{
    public fixed sbyte Message[32];
    public fixed char MessageChars[16];
}

There is also no reason to store the array length explicitly since you can use the Length property within device code.
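
Applied to the struct from the question, the fixed-size approach might look like the sketch below. FixedChildStruct, FixedBufferDemo, and Create are hypothetical names, the capacity of 10 is an assumption taken from the question's host array, and the code must be compiled with unsafe code enabled. Whether CUDAfy translates a fixed float buffer the same way as the sbyte and char buffers above has not been verified here; the sketch only shows that .NET can compute a size for such a struct.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical variant of the question's ChildStruct with the array stored inline.
[StructLayout(LayoutKind.Sequential)]
public unsafe struct FixedChildStruct
{
    public const int Capacity = 10;       // assumed upper bound, known at compile time
    public fixed float FArray[Capacity];  // floats embedded in the struct itself
    public long FArrayLength;             // number of elements actually in use
}

public static class FixedBufferDemo
{
    // Copies a managed array into the inline buffer of a new struct.
    public static unsafe FixedChildStruct Create(float[] source)
    {
        if (source.Length > FixedChildStruct.Capacity)
        {
            throw new ArgumentException("source exceeds the fixed capacity");
        }

        FixedChildStruct child = new FixedChildStruct();
        for (int i = 0; i < source.Length; i++)
        {
            child.FArray[i] = source[i];
        }
        child.FArrayLength = source.Length;
        return child;
    }

    public static unsafe void Main()
    {
        FixedChildStruct child = Create(new float[] { 0f, 1f, 2f });

        // Marshal.SizeOf succeeds because the struct holds no managed references.
        Console.WriteLine(Marshal.SizeOf(typeof(FixedChildStruct)));
        Console.WriteLine(child.FArray[2]); // prints 2
    }
}
```

The trade-off is that every instance carries the full capacity whether or not it is used, so this only fits cases where a small upper bound on the array length is known in advance.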