
In the IPython Notebook, I am trying to use the cell magic %%cython_pyximport to write a Cython function that I can call later in my notebook.

I want to use this command as opposed to %%cython because there seems to be quite a bit of overhead with it. For example when I profile my code I get this:

168495 function calls in 4.606 seconds

      Ordered by: internal time


   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    3.234    3.234    4.605    4.605 {_cython_magic_0ef63e1ad591c89b73223c7a86d78802.knn_alg}
    11397    0.326    0.000    0.326    0.000 {method 'reduce' of 'numpy.ufunc' objects}
      987    0.152    0.000    0.266    0.000 decomp.py:92(eig)
      987    0.118    0.000    0.138    0.000 function_base.py:3112(delete)

I'm hoping that using %%cython_pyximport will cut down time spent calling this function. If there is a better way please let me know.

Now to my actual question: when I use %%cython_pyximport I get this error:

ImportError: Building module function failed: ['DistutilsPlatformError: Unable to find vcvarsall.bat\n']

Maybe it's related to something not being on my PATH but I'm not sure. What do I have to do to fix this?

I'm using Windows 7, Python 2.7.6 (installed with Anaconda), Cython 0.20.1, and IPython Notebook 2.1.0.

EDIT: After following @IanH's suggestion I now have this error:

fatal error: numpy/arrayobject.h: No such file or directory

It seems that additional header files need to be included for NumPy to work with pyximport. The page https://github.com/cython/cython/wiki/InstallingOnWindows mentions this error and how to solve it, but I am lost as to how to apply that fix so that the %%cython_pyximport command works in my notebook.

pbreach

3 Answers

1

There are two different issues here. I'll first address the one you seem to care about.

Using pyximport instead of the cython magic function should not increase speed at all. Given your profiling results, it appears that the real problem here is that you are calling NumPy functions inside a loop. In Cython you have to keep track of which function calls are done in C and which are done in Python. NumPy universal functions are Python functions, so every call to one incurs the cost of a Python function call.
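To see how per-call overhead shows up in a profile like the one in the question, here is a minimal sketch using only the standard library. It assumes Python 3, and `small_step` and `hot_loop` are hypothetical stand-ins, not the asker's `knn_alg`. The `ncalls` column for `small_step` grows with the loop count, which is the same pattern as the 11397 `reduce` calls above.

```python
import cProfile
import io
import pstats


def small_step(x):
    # Hypothetical stand-in for a cheap Python-level call made inside a loop.
    return x * 2


def hot_loop(n):
    total = 0
    for i in range(n):
        total += small_step(i)  # one Python function call per iteration
    return total


profiler = cProfile.Profile()
profiler.enable()
result = hot_loop(10000)
profiler.disable()

# Sort by "tottime" (internal time), as in the question's output.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(5)
report = stream.getvalue()
print(report)
```

If most of the time sits in many small calls rather than in the Cython function itself, changing how the module is built is unlikely to help.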

How you would want to fix this depends entirely on what you are doing. If you can cleverly vectorize away the loop using NumPy operations, that is probably the best way, but not all problems can easily be solved that way. There are ways to call LAPACK routines from Cython, as described in this answer. If you are doing simpler operations (like summing along axes), you can write a function that uses Cython memoryviews to pass slices around internally in your Cython module. There is some discussion of the proper way to do that in this blog post. These sorts of operations are usually a little harder to write in Cython, but it is still a very approachable problem.

Now, though I'm not convinced that pyximport will actually do what you want it to, I will still tell you how to get it working. The error you are seeing happens when distutils tries to use the Visual Studio compiler even though it has not been set up. By default, Anaconda uses MinGW for Cython extensions, but for some reason it isn't set up to use MinGW with pyximport. That's easy to fix though. In your Python installation directory (probably C:\Anaconda or something along those lines), there should be a file Anaconda\Lib\distutils\distutils.cfg. (Create it if it doesn't exist.) Modify it so that it contains both of the following options:

[build]
compiler = mingw32

[build_ext]
compiler = mingw32

If I remember correctly, the first is already included in Anaconda. As of this writing, the second is not. You will need it there to make pyximport work.
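If you would rather script that change than edit the file by hand, the following is a rough sketch using the standard `configparser` module (Python 3 name). The helper `write_mingw_cfg` is hypothetical, and the demonstration writes to a throwaway file rather than guessing where your distutils.cfg lives.

```python
import configparser
import os
import tempfile


def write_mingw_cfg(cfg_path):
    """Set compiler=mingw32 in both [build] and [build_ext] of cfg_path."""
    parser = configparser.ConfigParser()
    parser.read(cfg_path)  # a missing file is silently skipped
    for section in ("build", "build_ext"):
        if not parser.has_section(section):
            parser.add_section(section)
        parser.set(section, "compiler", "mingw32")
    with open(cfg_path, "w") as handle:
        parser.write(handle)


# Demonstration on a temporary file; pass the real distutils.cfg path
# only after checking where it is in your own installation.
demo_path = os.path.join(tempfile.mkdtemp(), "distutils.cfg")
write_mingw_cfg(demo_path)
with open(demo_path) as handle:
    written = handle.read()
print(written)
```

Note that `configparser` keeps existing options but drops any comments in the file when it rewrites it, so back up the original first.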

IanH
  • I have written cython functions for things like mean, variance, scaling data. The numpy functions that I am using in the loop are dot,array,random.randint,where,cov,cumsum,argmin and eig from scipy.linalg so I don't think I'll be rewriting those in cython anytime soon. – pbreach Jun 10 '14 at 03:24
  • I think you're right about the pyximport option not being the solution. I ran the same code on my computer in Canada remotely and the process took 1.2s with 0.87 for the cython_magic line in the profile. I'm thinking it has to do with my Anaconda installation because before I installed a new copy I had similar results to my computer in Canada. – pbreach Jun 10 '14 at 03:30
  • Back to the actual question again, sorry. I added the `[build_ext]` option you mentioned, which was not there, and now get this error: `ImportError: Building module foo failed: ["CompileError: command 'gcc' failed with exit status 1\n"]`, which I think is related to MinGW? – pbreach Jun 10 '14 at 03:51
  • Hmm. Yes, you're right on that one. The gcc error isn't particularly helpful though. It's a really generic error. If you are using the IPython notebook, look at the output that shows up in the command terminal where you started the notebook. It may show something helpful there. Usually the output from the compiler shows there, so it may give you a better idea of what to do to fix the problem. – IanH Jun 10 '14 at 06:03
  • Well, it seems the numpy/arrayobject.h file can't be found: `C:\Users\Patrick\.pyxbld\temp.win-amd64-2.7\Release\pyrex\knnfun.c:346:31: fatal error: numpy/arrayobject.h: No such file or directory`. I found something similar to this and a possible solution, but I'm not sure whether it can be used with `%%cython_pyximport` within the notebook (see edited question). – pbreach Jun 15 '14 at 11:36
  • Yes, judging by the current [source code](https://github.com/ipython/ipython/blob/master/IPython/extensions/cythonmagic.py#L110), that isn't something that can be done without modifying IPython. In the docstring they recommend using the plain `%%cython` magic, so you may want to stick with that if you can. – IanH Jun 16 '14 at 22:55
0

Since I have VS2015 installed, I had to add a new environment variable:

SET VS90COMNTOOLS=%VS140COMNTOOLS%

Source: https://stackoverflow.com/a/10558328/625189

Jaanus
0

I had this exact same problem. Instead of using the distutils.cfg settings from the answer by IanH:

[build]
compiler=mingw32

[build_ext]
compiler = mingw32

I chose to call cython and gcc directly. I figured that would sidestep all the possible build-tool errors.

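  1. First, translate the Cython source file to C with the cython command:

    import subprocess

    # file_path is the .pyx source; c_file_name is where the generated C goes
    cython_commands = ['cython', '-a', '-l', '-p', '-o', c_file_name, file_path]
    cython_feedback = subprocess.call(cython_commands)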
    
  2. Then compile that .c file with gcc, telling the compiler where to find the Python headers and libraries.

    gcc_commands = ['gcc', '-shared', '-Wall', '-O3', '-I', py_include_dir, '-L', py_libs_dir, '-o', output_name,
                   c_file_name, '-l', a_lib]
    gcc_error = subprocess.call(gcc_commands)
    

py_include_dir: The path to the directory in python installation labeled 'include'

py_libs_dir: The path to the directory in python installation labeled 'libs'

c_file_name: The path where you wish to save the intermediate C file

a_lib: The name of the Python library to link against, which matches your Python version (e.g. 'python34', 'python35' or 'python27')
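As a rough sketch, the three installation-dependent values above can be looked up from the running interpreter instead of typed in by hand. This assumes Python 3 and the usual Windows layout, where the import libraries sit in a `libs` folder under the installation prefix; check the printed paths against your own installation before passing them to gcc.

```python
import os
import sys
import sysconfig

# Directory containing Python.h for the running interpreter.
py_include_dir = sysconfig.get_paths()["include"]

# Assumption: standard Windows layout with a "libs" folder under the prefix.
py_libs_dir = os.path.join(sys.base_prefix, "libs")

# Library name built from the interpreter version, e.g. "python34" for 3.4.
a_lib = "python{}{}".format(sys.version_info.major, sys.version_info.minor)

print(py_include_dir)
print(py_libs_dir)
print(a_lib)
```

Deriving the values from the interpreter that will import the module avoids mixing headers and libraries from two different Python installations.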

Nick Pandolfi