I am writing a script to perform an FFT using the GPU/CUDA-based cuFFT library. cuFFT requires input data in the "cufftComplex" format, but my input data is in the numpy.complex64 format. I am using the Python C-API to send data from Python to C. How can I convert between the two formats? Currently my code looks like this:

#include<python2.7/Python.h>
#include<numpy/arrayobject.h>
#include<cufft.h>


void compute_BP(PyObject* inputData, PyObject* OutputData, int Nfft)
{
   cufftHandle plan;
   cufftPlan1d(&plan, Nfft, CUFFT_C2C, 1);  // last argument is the batch count
   // this is the line the compiler rejects: PyObject* is not cufftComplex*
   cufftExecC2C(plan, inputData, OutputData, CUFFT_INVERSE);
   ...
}

When compiling I get the following error:

Error: argument of type "PyObject *" is incompatible with parameter of type "cufftComplex".

DLH
  • cufft is not callable from a `__global__` function (there is no cufft device API), so I doubt your code looks like this. If it does look like this, it won't work regardless of your concerns about data format. – Robert Crovella May 22 '18 at 22:30
  • @RobertCrovella: Removing `__global__` – DLH May 22 '18 at 22:40
  • `std::complex<double>` layout in C++ should exactly match `cufftDoubleComplex`, and `std::complex<float>` layout in C++ should exactly match `cufftComplex`. With that information, your question effectively becomes how to convert python (numpy) complex types to C++ and vice-versa. For that, [this question](https://stackoverflow.com/questions/44415752/passing-big-complex-arrays-from-python-to-c-whats-my-best-option) provides a suitable answer. – Robert Crovella May 23 '18 at 01:47
  • Your code/thinking here is broken beyond what has already been pointed out. You can't just blindly cast an arbitrary Python object pointer to a C array pointer, and you can't pass a host C array pointer to cuFFT. You must use the Python [buffer interface](https://docs.python.org/2/c-api/buffer.html#c.Py_buffer) to access the array memory, and then the appropriate CUDA APIs to allocate device memory and copy the data to the device – talonmies May 23 '18 at 02:28
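
The layout claim in the comments above can be checked on the host without a GPU. The following is only a sketch: `ComplexPair` and `to_pairs` are hypothetical names, and `ComplexPair` is a local stand-in for `cufftComplex` (a struct of two floats, `x` for the real part and `y` for the imaginary part), declared here so the code compiles without the CUDA toolkit. numpy.complex64 stores the same interleaved pair of 32-bit floats, so the same byte-for-byte argument should carry over to a contiguous complex64 array.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for cufftComplex: two floats, real part first.
// Assumption: declared locally so this compiles without cufft.h.
struct ComplexPair {
    float x;  // real part
    float y;  // imaginary part
};

// The two types must have the same size and no padding for a raw copy to be valid.
static_assert(sizeof(ComplexPair) == sizeof(std::complex<float>), "size mismatch");
static_assert(sizeof(ComplexPair) == 2 * sizeof(float), "unexpected padding");

// Copy interleaved (real, imag) values into the stand-in type byte for byte.
std::vector<ComplexPair> to_pairs(const std::vector<std::complex<float>>& in) {
    std::vector<ComplexPair> out(in.size());
    if (!in.empty()) {
        std::memcpy(out.data(), in.data(), in.size() * sizeof(ComplexPair));
    }
    return out;
}
```

Calling `to_pairs` on a vector holding `{1, 2}` and `{3, -4}` should yield pairs with `x == 1, y == 2` and `x == 3, y == -4`, which is why no per-element conversion is needed before copying the data to the device.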

1 Answer


Borrowing from my answer here, this is a worked example of how you could use ctypes in Python to call a function from the cuFFT library, using numpy data:

$ cat mylib.cpp
#include <cufft.h>
#include <stdio.h>
#include <assert.h>
#include <cuda_runtime_api.h>
extern "C"
void fft(void *input, void *output, size_t N){

  cufftHandle plan;
  cufftComplex *d_in, *d_out;
  size_t ds = N*sizeof(cufftComplex);
  // cuFFT operates on device memory, so allocate device buffers
  cudaMalloc((void **)&d_in,  ds);
  cudaMalloc((void **)&d_out, ds);
  // 1D complex-to-complex plan of length N, batch count 1
  cufftResult res = cufftPlan1d(&plan, N, CUFFT_C2C, 1);
  assert(res == CUFFT_SUCCESS);
  cudaMemcpy(d_in, input, ds, cudaMemcpyHostToDevice);
  res = cufftExecC2C(plan, d_in, d_out, CUFFT_FORWARD);
  assert(res == CUFFT_SUCCESS);
  // copy the result back into the caller's (numpy) buffer
  cudaMemcpy(output, d_out, ds, cudaMemcpyDeviceToHost);
  printf("%s\n", cudaGetErrorString(cudaGetLastError()));
  printf("from shared object:\n");
  for (size_t i = 0; i < N; i++)
    printf("%.1f + j%.1f, ", ((cufftComplex *)output)[i].x, ((cufftComplex *)output)[i].y);
  printf("\n");
  // release the plan and the device buffers
  cufftDestroy(plan);
  cudaFree(d_in);
  cudaFree(d_out);
}

$ cat t8.py
import ctypes
import numpy as np

mylib = ctypes.cdll.LoadLibrary('libmylib.so')

N = 4
mydata = np.ones((N), dtype = np.complex64)
myresult = np.zeros((N), dtype = np.complex64)
mylib.fft(ctypes.c_void_p(mydata.ctypes.data), ctypes.c_void_p(myresult.ctypes.data), ctypes.c_size_t(N))
print(myresult)

$ g++ -fPIC -I/usr/local/cuda/include --shared mylib.cpp -L/usr/local/cuda/lib64 -lcufft -lcudart -o libmylib.so
$ LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd` python t8.py
no error
from shared object:
4.0 + j0.0, 0.0 + j0.0, 0.0 + j0.0, 0.0 + j0.0,
[4.+0.j 0.+0.j 0.+0.j 0.+0.j]
$
Robert Crovella
  • This sidesteps the whole need to use the buffer interface by using the ctypes interface baked into numpy arrays, which is a pretty common approach – talonmies May 23 '18 at 08:27
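
As a sanity check on the output printed in the answer (4 in bin 0 and zeros elsewhere for an all-ones input of length 4), a naive CPU reference can be compared against whatever the GPU returns. This is a sketch only: `reference_dft` is a hypothetical helper, not part of cuFFT, and it assumes the usual unnormalized forward transform with a negative exponent, which is the convention `CUFFT_FORWARD` uses.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(N^2) forward DFT: X[k] = sum_j x[j] * exp(-2*pi*i*k*j/N).
// Unnormalized, so an all-ones input of length N gives N in bin 0.
std::vector<std::complex<float>> reference_dft(const std::vector<std::complex<float>>& x) {
    const std::size_t n = x.size();
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<std::complex<float>> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        // accumulate in double precision to limit rounding error
        std::complex<double> acc(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = -two_pi * static_cast<double>(k * j) / static_cast<double>(n);
            acc += std::complex<double>(x[j]) * std::polar(1.0, angle);
        }
        out[k] = std::complex<float>(acc);
    }
    return out;
}
```

Since it is quadratic in N, this is only practical for small test sizes, but it gives an independent result to compare element by element (within a small tolerance) against the array filled in by the shared library.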