
I'm using PySpark 1.5.2. After I issue the command .collect(), I get this warning: UserWarning: Please install psutil to have better support with spilling

Why is this warning shown?

How can I install psutil?

WoodChopper
wannik

2 Answers

pip install psutil

If you need to install it for Python 2 or Python 3 specifically, use pip2 or pip3 instead; psutil works with both major versions. Here is the PyPI package for psutil.
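To confirm that the install landed in the interpreter you actually run PySpark with, a quick check like the one below may help. It is only a sketch: module_available is a hypothetical helper name, not part of PySpark or psutil, and it assumes you run it with the same Python that your Spark workers use.

```python
import sys


def module_available(name):
    """Return True if the named module can be imported by this interpreter.

    Hypothetical helper for illustration; not part of PySpark or psutil.
    """
    try:
        __import__(name)
        return True
    except ImportError:
        return False


# Show which interpreter is being checked, then whether psutil is visible to it.
print("interpreter:", sys.executable)
print("psutil available:", module_available("psutil"))
```

If this prints False, pip probably installed psutil for a different Python than the one shown on the first line.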

Cassidy Laidlaw

You can clone or download the psutil project from the following link: https://github.com/giampaolo/psutil.git

Then run setup.py to install psutil (for example, python setup.py install).

In 'spark/python/pyspark/shuffle.py' you can see the following code:

def get_used_memory():
    """ Return the used memory in MB """
    if platform.system() == 'Linux':
        for line in open('/proc/self/status'):
            if line.startswith('VmRSS:'):
                return int(line.split()[1]) >> 10

    else:
        warnings.warn("Please install psutil to have better "
                      "support with spilling")
        if platform.system() == "Darwin":
            import resource
            rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            return rss >> 20
        # TODO: support windows

    return 0

So my guess is that when your OS is not Linux, this code path runs, and that is why installing psutil is suggested.
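For comparison, here is roughly how the same measurement could be taken with psutil once it is installed. This is a minimal sketch and not copied from Spark: get_used_memory_mb is a hypothetical name, and it only assumes psutil's Process.memory_info(), which reports the resident set size in bytes on every platform psutil supports.

```python
import os

try:
    import psutil
except ImportError:
    # psutil is optional here; without it we cannot measure memory portably.
    psutil = None


def get_used_memory_mb():
    """Return this process's resident memory in MB, or 0 if psutil is missing.

    Hypothetical helper for illustration; not the function from shuffle.py.
    """
    if psutil is None:
        return 0
    # memory_info().rss is in bytes; shifting right by 20 converts to MB.
    rss_bytes = psutil.Process(os.getpid()).memory_info().rss
    return rss_bytes >> 20


print("used memory (MB):", get_used_memory_mb())
```

Because psutil hides the per-OS differences, there is no need for the Linux and Darwin special cases, which is presumably why the warning recommends it.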

Dazhuang