
Here is example code that reproduces the problem I am facing:

import numba, numpy as np

@numba.jit
def f_plain(x):
   return x * (x - 1)

@numba.jit
def integrate_f_numba(a, b, N):
   s = 0
   dx = (b - a) / N
   for i in range(N):
       s += f_plain(a + i * dx)
   return s * dx

@numba.jit
def apply_integrate_f_numba(col_a, col_b, col_N):
   n = len(col_N)
   result = np.empty(n, dtype='float64')
   assert len(col_a) == len(col_b) == n
   for i in range(n):
      result[i] = integrate_f_numba(col_a[i], col_b[i], col_N[i])
   new = result
   #return result
   del result

def compute_numba(df):
   result = apply_integrate_f_numba(df['a'].values, df['b'].values, df['N'].values)
   return Series(result, index=df.index, name='result')

I run it using the commands below:

import pandas as pd
from pandas import DataFrame, Series
from numpy.random import randn, randint
import numpy as np

df = DataFrame({'a': randn(1000), 'b': randn(1000),'N': randint(100, 1000, (1000)), 'x': 'x'})
%timeit compute_numba(df)

But I get this error when `del result` is present in the `apply_integrate_f_numba` function:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-31-6c46b74dae81> in <module>()
      8 
      9 df = DataFrame({'a': randn(1000), 'b': randn(1000),'N': randint(100, 1000, (1000)), 'x': 'x'})
---> 10 get_ipython().magic(u'timeit compute_numba(df)')

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\IPython\core\interactiveshell.pyc in magic(self, arg_s)
   2203         magic_name, _, magic_arg_s = arg_s.partition(' ')
   2204         magic_name = magic_name.lstrip(prefilter.ESC_MAGIC)
-> 2205         return self.run_line_magic(magic_name, magic_arg_s)
   2206 
   2207     #-------------------------------------------------------------------------

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\IPython\core\interactiveshell.pyc in run_line_magic(self, magic_name, line)
   2124                 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals
   2125             with self.builtin_trap:
-> 2126                 result = fn(*args,**kwargs)
   2127             return result
   2128 

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\IPython\core\magics\execution.pyc in timeit(self, line, cell)

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\IPython\core\magic.pyc in <lambda>(f, *a, **k)
    191     # but it's overkill for just that one bit of state.
    192     def magic_deco(arg):
--> 193         call = lambda f, *a, **k: f(*a, **k)
    194 
    195         if callable(arg):

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\IPython\core\magics\execution.pyc in timeit(self, line, cell)
   1011             number = 1
   1012             for _ in range(1, 10):
-> 1013                 if timer.timeit(number) >= 0.2:
   1014                     break
   1015                 number *= 10

C:\Program Files\Python\python-2.7.9.amd64\lib\timeit.pyc in timeit(self, number)
    193         gc.disable()
    194         try:
--> 195             timing = self.inner(it, self.timer)
    196         finally:
    197             if gcold:

<magic-timeit> in inner(_it, _timer)

<ipython-input-30-f288c9b5ebe9> in compute_numba(df)
     25 
     26 def compute_numba(df):
---> 27    result = apply_integrate_f_numba(df['a'].values, df['b'].values, df['N'].values)
     28    return Series(result, index=df.index, name='result')

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\numba\dispatcher.pyc in _compile_for_args(self, *args, **kws)
    151         assert not kws
    152         sig = tuple([self.typeof_pyval(a) for a in args])
--> 153         return self.jit(sig)
    154 
    155     def inspect_types(self, file=None):

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\numba\dispatcher.pyc in jit(self, sig, **kws)
    142         """Alias of compile(sig, **kws)
    143         """
--> 144         return self.compile(sig, **kws)
    145 
    146     def _compile_for_args(self, *args, **kws):

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\numba\dispatcher.pyc in compile(self, sig, locals, **targetoptions)
    277                                           self.py_func,
    278                                           args=args, return_type=return_type,
--> 279                                           flags=flags, locals=locs)
    280 
    281             # Check typing error if object mode is used

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\numba\compiler.pyc in compile_extra(typingctx, targetctx, func, args, return_type, flags, locals, library)
    550     pipeline = Pipeline(typingctx, targetctx, library,
    551                         args, return_type, flags, locals)
--> 552     return pipeline.compile_extra(func)
    553 
    554 

C:\Program Files\Python\python-2.7.9.amd64\lib\site-packages\numba\compiler.pyc in compile_extra(self, func)
    265             return self.stage_compile_interp_mode()
    266         else:
--> 267             raise res.exception
    268 
    269     def compile_bytecode(self, bc, lifted=(),

NotImplementedError: offset=142 opcode=0x7e opname=DELETE_FAST

I'm not sure what to do now. I need the `del` statement because, in my original code, I have to free up memory while working with huge datasets.

AdityaGoel
  • You can't `del` a variable and then return it. You're trying to free up memory that is still being used. – Kevin Aug 06 '15 at 12:38
  • Oops minor error! But it still doesn't work. Same error. – AdityaGoel Aug 07 '15 at 13:38
  • Firstly - it would be a good idea to edit that `del` out of the code in your question, or everyone's going to pick up on it first. Secondly - I suspect you might have a 32 / 64 bit compatibility issue. Try making sure all the numbers that you pass in (i.e. the `df` in your second code block) have type `np.float64`. Just a guess - I'm sorry I don't have time to repro and test it right now. Ping me if you try it and let me know the results. – J Richard Snape Aug 07 '15 at 14:02
  • OK - I think I've figured out what you meant to do. Note that editing your code in response to @Kevin's comment would have sped that process up. Please have a look at my answer and see whether it addresses your issue. I have a feeling you might have an [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) here and might need to post another question with a more realistic dataset and showing where and why you need to free memory.... – J Richard Snape Aug 11 '15 at 10:12

1 Answer


The problem occurs when the numba just-in-time (JIT) compiler tries to compile this function (N.B. edited to reflect what I believe the OP intended with the `del`):

@numba.jit
def apply_integrate_f_numba(col_a, col_b, col_N):
   n = len(col_N)
   result = np.empty(n, dtype='float64')
   assert len(col_a) == len(col_b) == n
   for i in range(n):
      result[i] = integrate_f_numba(col_a[i], col_b[i], col_N[i])
   new_res = result
   del result
   return new_res

Specifically - the issue is with the line

del result

The JIT is actually telling you what the problem is in the error message:

NotImplementedError: offset=142 opcode=0x7e opname=DELETE_FAST

i.e. the numba compiler has not implemented a way to compile the DELETE_FAST Python opcode, which is the bytecode instruction that a `del` of a local variable compiles to. If you're interested in the code, it looks like the error is thrown from here, and that file contains a list of the bytecodes that numba can deal with.
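To see where DELETE_FAST comes from, you can disassemble a small function with the standard-library `dis` module. This is only a sketch: `rebind_and_delete` is a hypothetical stand-in for `apply_integrate_f_numba`, a plain list replaces the NumPy array, and `dis.get_instructions` assumes Python 3.4 or later rather than the Python 2.7 shown in the traceback.

```python
import dis


def rebind_and_delete():
    # Hypothetical stand-in for apply_integrate_f_numba: same del pattern,
    # but with a plain list instead of a NumPy array.
    result = [0.0] * 4
    new_res = result
    del result          # a del of a local name compiles to DELETE_FAST
    return new_res


# Collect the opcode names CPython generated for the function body.
opnames = [ins.opname for ins in dis.get_instructions(rebind_and_delete)]
has_delete_fast = "DELETE_FAST" in opnames
print(has_delete_fast)
```

Any compiler that works from bytecode, as numba does, has to handle each opcode in that list, which is why a single unsupported one stops compilation of the whole function.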

You will have noticed, I'm sure, that if you simply return result from apply_integrate_f_numba, everything works fine and you get a ~2x speed-up over using that function without numba.jit (assuming you leave the other functions with the @numba.jit decorator).

I think you might be trying to achieve something impossible with your del statement. This answer to a more general question explains what del does: it removes the binding (reference) to the object, not the object itself. You seem to be trying to free memory by deleting a name for an object that you have just assigned another name to (i.e. new). So your code won't actually free up memory, because there's still a reference to the object, and you need that reference in order to return it. Note also that all local references are dropped anyway when you exit the function, and result is local to the apply_integrate_f_numba function, so the del is actually redundant.
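The point about bindings can be checked in plain Python, without numba. The sketch below is illustrative only: `Payload` and `build` are hypothetical names, a small class stands in for the large array so that it can be weak-referenced, and the immediate release after the last `del` relies on CPython's reference counting.

```python
import weakref


class Payload:
    """Hypothetical stand-in for a large array (instances can be weak-referenced)."""


def build():
    result = Payload()
    new_res = result                 # second name bound to the same object
    probe = weakref.ref(result)      # watches the object without keeping it alive
    del result                       # removes one name only
    still_alive = probe() is not None
    return new_res, probe, still_alive


kept, probe, still_alive = build()
print(still_alive)                   # the object survived the del inside build()

del kept                             # drop the last remaining reference
freed = probe() is None              # on CPython the object is gone at this point
print(freed)
```

The object only goes away once the last name referring to it is removed, which is why the `del result` inside the function cannot free anything while `new_res` is still being returned.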

If you have such a large dataset that memory is an issue, you could del it after you've finished with it completely - e.g. once you've written it out to file, plotted it or whatever else you want to do with it. Simply assigning another name to it and `del`-ing the original won't do it - with the added negative side effect that you will get this numba error.
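As a rough sketch of that pattern, the `del` moves out of the compiled function and into the caller, after the data has been used for the last time. Everything here is an assumption for illustration: `compute` is a hypothetical placeholder for `apply_integrate_f_numba`, and the standard-library `array` module stands in for NumPy.

```python
import os
import tempfile
from array import array


def compute(n):
    # Hypothetical placeholder for apply_integrate_f_numba: returns n doubles.
    return array("d", (i * 0.5 for i in range(n)))


def compute_and_store(path, n):
    result = compute(n)
    with open(path, "wb") as fh:
        result.tofile(fh)            # last real use of the data
    nbytes = len(result) * result.itemsize
    del result                       # no other name refers to the array now
    return nbytes


# Usage: write to a temporary file, then compare sizes.
fd, path = tempfile.mkstemp()
os.close(fd)
written = compute_and_store(path, 1000)
size_on_disk = os.path.getsize(path)
os.remove(path)
print(written == size_on_disk)
```

Because nothing else holds a reference when the `del` runs, the memory can be reclaimed before the caller moves on to the next dataset.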

J Richard Snape
  • All I want to do is to recover some memory by using the del command. I am using big datasets with pandas. Any way to implement that in numba? If I do not use the del command, I run out of memory. – AdityaGoel Aug 11 '15 at 13:00
  • OK - well the short answer is no. The longer answer is - see my earlier comment on your question - give us the context code and we might be able to suggest something. e.g. Do you run out of memory on a single pass through the function, or is it repeated runs that crash your program? Is the data in the pandas dataframe already? Do you create any copies of it etc, etc. If my answer explains why your error occurs, please vote / accept. If you have a further question - post a new one. As I say in my answer - the `del` in the position you have it will do nothing, with or without `numba` – J Richard Snape Aug 11 '15 at 13:10
  • Also - I see you have edited your question, but now it makes very little sense, because `apply_integrate_f_numba` returns nothing. Which will obviously free the memory (you don't even need the `del` to do that), but means you can't do anything with its result. – J Richard Snape Aug 11 '15 at 13:12