
I am currently working on a project in Python, and I would like to make use of the GPU for some calculations.

At first glance it seems like there are many tools available; at second glance, I feel like I'm missing something.

Copperhead looks awesome, but it has not yet been released. It would appear that I'm limited to writing low-level CUDA or OpenCL kernels; no Thrust, no CUDPP. If I'd like to have something sorted, I'm going to have to do it myself.

That doesn't seem quite right to me. Am I indeed missing something? Or is this GPU scripting not quite living up to the hype yet?

Edit: GPULIB seems like it might be what I need. The documentation is rudimentary, and the Python bindings are mentioned only in passing, but I'm applying for a download link right now. Does anyone have experience with it, or links to similar free-for-academic-use GPU libraries? ReEdit: OK, the Python bindings are in fact nonexistent.

Edit2: So I guess my best bet is to write something in C/CUDA and call that from Python?
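Edit3: For reference, the "call it from Python" half of that plan can be prototyped with ctypes from the standard library. A minimal sketch; it uses the C math library as a stand-in for a CUDA shared library, and the `nvcc` build step in the comment is an assumption about how you would produce your own `.so`, not something I have tested:

```python
import ctypes
import ctypes.util

# Load a shared library. Here the C math library stands in for a library of
# your own; for the real thing you would compile your CUDA code with
# something like `nvcc --shared -Xcompiler -fPIC kernels.cu -o libkernels.so`
# (assumed command) and pass that path to CDLL instead.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# ctypes assumes int arguments and return values unless told otherwise,
# so declare the C signature explicitly.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(1024.0))  # 32.0
```

The same CDLL/argtypes/restype pattern applies unchanged to a function exported from your own CUDA library, as long as it is declared `extern "C"` so the symbol name is not mangled.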

Eelco Hoogendoorn

7 Answers


PyCUDA provides very good integration with CUDA and has several helper interfaces that make writing CUDA code easier than with the straight C API. Here is an example from the wiki that does a 2D FFT without needing any C code at all.

Joseph Lisee
  • Thanks; I'm well aware of PyCUDA. What I don't get is that a library such as CUDPP has no Python bindings. How do I sort a list? – Eelco Hoogendoorn May 11 '11 at 07:40
  • 3
    @Eelco Hoogendoorn: The fundamental problem with PyCUDA in the past was that it was built on the CUDA driver API, whereas most of the algorithm libraries (CUBLAS, CUFFT, CUDPP, CUSPARSE) were written for the CUDA runtime API. There was no official interoperability between the two APIs in CUDA until quite recently. That has been fixed, and PyCUDA based bindings for these libraries are slowly appearing. I know that doesn't help you today, but it explains why things are the way they are right now..... – talonmies May 11 '11 at 09:00
  • Thanks, good to know things are moving. I guess if CUFFT bindings already exist, CUDPP and the rest can't be long in coming. Too bad I lack the low-level know-how myself. – Eelco Hoogendoorn May 11 '11 at 09:24
  • PyCUDA has a sister project to provide this functionality against OpenCL: https://pypi.python.org/pypi/pyopencl – Javier Apr 12 '15 at 22:11

I will share here some information that I read on Reddit. It should be useful for people arriving without a clear idea of what the different packages do and how they connect CUDA with Python:


From: Reddit

There's a lot of confusion in this thread about what various projects aim to do and how ready they are. There is no "GPU backend for NumPy" (much less for any of SciPy's functionality). There are a few ways to write CUDA code inside Python, and some GPU array-like objects that support subsets of NumPy's ndarray methods (but not the rest of NumPy, like linalg, fft, etc.).

  • PyCUDA and PyOpenCL come closest. They eliminate a lot of the plumbing surrounding launching GPU kernels (simplified array creation and memory transfer, no need for manual deallocation, etc.). For the most part, however, you're still stuck writing CUDA kernels manually; they just happen to be inside your Python file as a triple-quoted string. PyCUDA's GPUArray does include some limited NumPy-like functionality, so if you're doing something very simple you might get away without writing any kernels yourself.

  • NumbaPro includes a "cuda.jit" decorator which lets you write CUDA kernels using Python syntax. It's not actually much of an advance over what PyCUDA does (quoted kernel source); it's just that your code now looks more Pythonic. It definitely doesn't, however, automatically run existing NumPy code on the GPU.

  • Theano lets you construct symbolic expression trees and then compiles them to run on the GPU. It's not NumPy and only has equivalents for a small subset of NumPy's functionality.

  • gnumpy is a thinly documented wrapper around CudaMat. The only supported element type is float32, and only a small subset of NumPy is implemented.

Update:

By now (2023) there are many more and better options (not from Reddit):

  • CUDA Python: low-level bindings for the CUDA runtime and driver APIs. It is very similar to PyCUDA, but officially maintained and supported by Nvidia, like CUDA C++.

  • Numba CUDA: the same as NumbaPro above, but now part of the open-source Numba code-generation framework. Ideal when you want to write your own kernels, but in a Pythonic way instead of the usual C++ dialect.

  • CuPy: NumPy/SciPy implementation by Preferred Networks, Inc.

  • cunumeric: NumPy implementation for HPC multi-node, multi-GPU computing by Nvidia. Uses the "Legate" abstraction layer.

  • RAPIDS: Nvidia's accelerated data science libraries. Includes accelerated Pandas-like data frames (cuDF) and much more.
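Because CuPy tracks the NumPy API closely, a common pattern is to write backend-agnostic functions that take the array module as a parameter; the same code then runs on the CPU with NumPy or on the GPU with CuPy. A sketch, run here with NumPy since it needs no GPU (the `normalize` function is my own illustration, not part of either library):

```python
import numpy as np

def normalize(x, xp=np):
    """Zero-mean, unit-variance scaling; `xp` is numpy or cupy."""
    return (x - xp.mean(x)) / xp.std(x)

a = np.array([1.0, 2.0, 3.0, 4.0])
z = normalize(a)  # CPU path via NumPy
# import cupy; normalize(cupy.asarray(a), xp=cupy)  # GPU path, needs CUDA
print(z.mean(), z.std())  # approximately 0.0 and 1.0
```

CuPy also provides `cupy.get_array_module(x)`, which returns the right module for a given array, so library code can dispatch automatically without the explicit parameter.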


paleonix
Heberto Mayorquin

I know that this thread is old, but I think I can bring some relevant information that answers the question asked.

Continuum Analytics has a package containing libraries that handle the CUDA computing for you. Basically, you instrument the code that needs to be parallelized (within a function) with a decorator, and you import a library. Thus, you don't need any knowledge of CUDA instructions.

Information can be found on the NVIDIA page:

https://developer.nvidia.com/anaconda-accelerate

or you can go directly to Continuum Analytics' page:

https://store.continuum.io/cshop/anaconda/

There is a 30-day trial period and a free license for academics.

I use this extensively, and it accelerates my code by a factor of 10 to 50.

Bogdan
  • @lemarc: I do numerical integration. As a starter, you can look at the Anaconda examples and compare the speed of the CUDA-enabled fractal computation with the non-enabled one. You will notice at least a 10-times speedup. – Bogdan Feb 04 '14 at 21:13
  • I want to be able to load any Python function onto the GPU. Is that possible to do? – scottydelta Feb 05 '14 at 17:59

Theano looks like it might be what you're looking for. From what I understand, it is very capable of doing some heavy mathematical lifting with the GPU and appears to be actively maintained.

Good luck!

  • Theano is definitely awesome; it is the (very good) reason I'm currently locked into Python. But while it is awesome at what it does, it's not a general-purpose math or GPU library. It's not going to do my collision detection, or even sort my array; not now, nor in the future, I think. – Eelco Hoogendoorn May 11 '11 at 10:01

Check this page for an open-source library distributed with Anaconda: https://www.anaconda.com/blog/developer-blog/open-sourcing-anaconda-accelerate/

"Today, we are releasing two new Numba sub-projects called pyculib and pyculib_sorting, which contain the NVIDIA GPU library Python wrappers and sorting functions from Accelerate. These wrappers work with NumPy arrays and Numba GPU device arrays to provide access to accelerated functions from:

  • cuBLAS: linear algebra
  • cuFFT: fast Fourier transform
  • cuSparse: sparse matrix operations
  • cuRand: random number generation (host functions only)
  • Sorting: fast sorting algorithms ported from CUB and ModernGPU

Going forward, the Numba project will take stewardship of pyculib and pyculib_sorting, releasing updates as needed when new Numba releases come out. These projects are BSD-licensed, just like Numba."

Cristiana SP

Have you taken a look at PyGPU?

http://fileadmin.cs.lth.se/cs/Personal/Calle_Lejdfors/pygpu/

onteria_

I can recommend scikits.cuda, but for that you need to download the full version of CULA (free for students). Another option is CUV. If you are looking for something better and are ready to pay for it, you can also take a look at ArrayFire. Right now I am using scikits.cuda and am quite satisfied so far.

Moj