I get the following UserWarning when trying to cache results using joblib:

import numpy
from tempfile import mkdtemp
from joblib import Memory

cachedir = mkdtemp()
memory = Memory(cachedir=cachedir, verbose=0)

@memory.cache
def get_nc_var3d(path_nc, var, year):
    """
    Get value from netcdf for variable var for year
    :param path_nc:
    :param var:
    :param year:
    :return:
    """
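    # open_or_die and logger are defined elsewhere in the project
    hndl_nc = None
    try:
        hndl_nc = open_or_die(path_nc)
        val = hndl_nc.variables[var][int(year), :, :]
    except Exception:
        val = numpy.nan
        logger.info('Error in getting var ' + var + ' for year ' + str(year) + ' from netcdf ')
    finally:
        # Close the handle only if it was actually opened
        if hndl_nc is not None:
            hndl_nc.close()
    return val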

I get the following warning when calling this function using parameters:

UserWarning: Persisting input arguments took 0.58s to run.
If this happens often in your code, it can cause performance problems 
(results will be correct in all cases). 
The reason for this is probably some large input arguments for a wrapped function (e.g. large strings).
THIS IS A JOBLIB ISSUE. If you can, kindly provide the joblib's team with an example so that they can fix the problem.

Input parameters: C:/Users/rit/Documents/PhD/Projects/\GLA/Input/LUWH/\LUWLAN_v1.0h\transit_model.nc range_to_large 1150

How do I get rid of the warning? And why is it happening, since the input parameters are not too long?

wovano
user308827
  • Perhaps it is worthwhile using [functools.lru_cache](https://docs.python.org/3/library/functools.html#functools.lru_cache) instead. – jadelord Apr 29 '20 at 04:27
  • @jadelord My understanding is that `lru_cache` is good for small input/outputs, but `joblib.Memory` is better for large input/outputs. – Michael Jun 04 '21 at 16:06
  • Are all of your inputs the strings you gave above, or are they more complicated objects? If they're just strings or something simple, joblib doesn't give me that warning message. – K. P. Jan 05 '22 at 07:50
  • Does it happen every time or just intermittently? It seems that any time it takes longer than 0.5s to persist the JSON file, the warning is triggered. Could a slow disk or another thread be causing the delay? – Anentropic Apr 20 '22 at 15:18
  • How big is the object you are returning? Maybe joblib tries to cache that and fails because the object is too big? – irdkwmnsb Apr 25 '22 at 13:18
  • It's been a while, but could you share precisely which joblib version you were using? I am assuming something in the 0.9 series, so my first guess would be a bug in joblib itself. Could you also share your Python version and OS version at the time, and tell us whether the issue can be reproduced with an updated setup? – N1ngu May 22 '22 at 10:55
  • FYI: related GitHub issues: https://github.com/joblib/joblib/issues?q=%22Persisting+input+arguments+took%22 – wovano Jun 08 '22 at 13:33
  • What is `open_or_die`, which is not defined? `logger` also shows as not defined. – D.L Aug 16 '22 at 08:54
  • Could you use the netCDF4 library to open the netcdf file instead of the custom open_or_die() function? – James Sep 24 '22 at 23:36
  • The only way I could reproduce this warning was by passing large input arguments. (so it doesn't really depend on the functionality and how long it takes inside the function). I was using the most recent version of joblib, so I think upgrading the library and then debugging to make sure the input is actually the path to the file and not the content would fix the warning. – Mahdi Sadeghi Oct 08 '22 at 16:58
  • Can you share the exact code you are using to call this function? Would help others replicate the result. – Yaakov Bressler Nov 14 '22 at 02:15
  • 6 years in and this answer has only attracted two low quality questions; one being a copy paste of other answers on suppressing warnings, the other being guess based. – ferreiradev Nov 21 '22 at 16:54
  • When faced with this issue, I pass a unique identifier as an additional parameter to the cached function(s) (I use the uuid4 method from the uuid library). I then update the memory.cache decorator to ignore all parameters except the unique identifier. The documentation on ignoring parameters is here: https://joblib.readthedocs.io/en/latest/memory.html#ignoring-some-arguments Let me know if my explanation isn't clear and I'll try again. – somebody3697 Mar 02 '23 at 01:56
  • Maybe you should check the size of `hndl_nc`. What is the `joblib` version used? – Memristor May 01 '23 at 19:19

5 Answers

I don't have an answer to the "why doesn't this work?" portion of the question. However, to simply ignore the warning you can use warnings.catch_warnings with warnings.simplefilter:

import warnings

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    your_code()

Obviously, I don't recommend ignoring the warning unless you're sure it's harmless, but if you're going to do it, this way only suppresses warnings inside the context manager and comes straight from the Python docs.
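
If only this particular message should be silenced, the filter can be narrowed instead of ignoring everything. A minimal sketch, assuming the warning text still starts with "Persisting input arguments took" as quoted in the question; noisy_call is a hypothetical stand-in for the cached function, and record=True is used only to show which warnings get through:

```python
import warnings


def noisy_call():
    # Hypothetical stand-in that emits the same warning text as in the question.
    warnings.warn("Persisting input arguments took 0.58s to run.", UserWarning)
    return 42


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # `message` is a regex matched against the start of the warning text.
    warnings.filterwarnings(
        "ignore",
        message="Persisting input arguments took",
        category=UserWarning,
    )
    result = noisy_call()
    # Any other warning is still recorded in `caught`.
    warnings.warn("some other warning", UserWarning)
```

Other warnings raised inside the block remain visible, so an unrelated problem is less likely to be hidden.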

General4077

UserWarning: Persisting input arguments took 0.58s to run. If this happens often in your code, it can cause performance problems (results will be correct in all cases). The reason for this is probably some large input arguments for a wrapped function (e.g. large strings). THIS IS A JOBLIB ISSUE. If you can, kindly provide the joblib's team with an example so that they can fix the problem.

In my humble opinion, the warning is self-explanatory. The cause may be in your own code, in which case you can try to reduce the size of the inputs. Otherwise, you can report the problem to the joblib team so that they can either improve joblib or suggest a better way of using it that avoids this kind of performance warning.
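
One way to check whether the inputs really are large is to measure them before the call. A rough sketch that uses the length of repr() as a proxy for how much text each argument contributes; arg_sizes is a hypothetical helper, not part of joblib, and the file name is shortened from the one in the question:

```python
def arg_sizes(*args, **kwargs):
    """Return the length of repr() for each positional and keyword argument."""
    sizes = {f"arg{i}": len(repr(a)) for i, a in enumerate(args)}
    sizes.update({name: len(repr(v)) for name, v in kwargs.items()})
    return sizes


# Arguments modelled on the ones reported in the question.
sizes = arg_sizes("transit_model.nc", "range_to_large", year=1150)
```

If every entry is only a few dozen characters, as here, large inputs are unlikely to be the cause and a slow disk or an old joblib release becomes the more plausible suspect.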

auvipy

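To get rid of the warning, exclude the large argument from the cache key. Note that ignore is a parameter of the memory.cache decorator, not of the Memory constructor, and it must list argument names of the decorated function (here "verbose" stands for an argument of your own function). Ignored arguments no longer distinguish cache entries, so only ignore arguments that do not affect the result:

@memory.cache(ignore=["verbose"])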
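
A minimal sketch of how ignoring an argument behaves, assuming joblib 0.12 or later (where the constructor argument is location rather than cachedir). The function summarize and its arguments key and payload are hypothetical names for illustration; excluding payload from the cache key should also keep it out of the inputs joblib has to persist:

```python
from tempfile import mkdtemp

from joblib import Memory

# `location` is the argument name in joblib >= 0.12 (older releases used `cachedir`).
memory = Memory(location=mkdtemp(), verbose=0)

calls = []  # records every real (non-cached) execution


@memory.cache(ignore=["payload"])
def summarize(key, payload):
    # `payload` is excluded from the cache key; only `key` identifies an entry.
    calls.append(key)
    return len(payload)


first = summarize("a", "x" * 10)
second = summarize("a", "y" * 20)  # same key, so the cached result is reused
```

The second call returns the result cached for the first one even though payload differs, which is why only arguments that cannot change the result should be ignored.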
Esmaeli

The warning indicates that persisting (caching) the input arguments to the function get_nc_var3d took a relatively long time. It suggests that this is likely due to large input arguments, such as large strings, being passed to the function.

One way to get rid of the warning is to pass smaller input arguments to the function, if possible. Changing the size of the cache does not help here: the size limit only controls when old entries are evicted, not how long the input arguments take to persist. Also note that Memory has no max_size parameter; the limit is called bytes_limit. For example, in joblib 1.3 and later (older releases accept bytes_limit in the Memory constructor instead):

memory.reduce_size(bytes_limit=1024*1024*100)

This trims the cache down to at most 100MB.

It's also possible that the warning message is caused by an issue with joblib itself. In this case, you may need to provide more information to the joblib team in order to help them diagnose and fix the problem. You can do this by submitting an issue on the joblib GitHub repository and providing a minimal example that reproduces the warning.

Chahrazed

The warning occurs because joblib persists the input arguments to disk for caching purposes, and this operation is taking longer than expected. The cause could be some large input arguments, such as a long string, that take time to serialize.

To resolve the issue, you can either disable caching altogether, or preprocess the input arguments to reduce their size before calling the cached function. memory.cache has no persist argument; caching is disabled by creating the Memory object with no location (cachedir=None in old joblib releases):

memory = Memory(location=None, verbose=0)
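
One way to shrink an input is to pass a short, stable key in place of a large value. A sketch using a SHA-256 digest; short_key is a hypothetical helper, and the cached function would then have to look up the real data by that key:

```python
import hashlib


def short_key(text):
    """Return a fixed-length hex digest that identifies `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# A deliberately large string standing in for an oversized input argument.
big_input = "x" * 10_000_000
key = short_key(big_input)
```

The digest has the same length regardless of input size, so the argument that gets persisted stays small.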
Girolamo