I'm trying to install the RAPIDS library with cuDF and cuML in a Colab session, running the code from this example: Install RAPIDS library on Google Colab notebook

!wget -nc https://raw.githubusercontent.com/rapidsai/notebooks-contrib/890b04ed8687da6e3a100c81f449ff6f7b559956/utils/rapids-colab.sh
!bash rapids-colab.sh

import sys, os

dist_package_index = sys.path.index("/usr/local/lib/python3.6/dist-packages")
sys.path = sys.path[:dist_package_index] + ["/usr/local/lib/python3.6/site-packages"] + sys.path[dist_package_index:]
sys.path
if os.path.exists('update_pyarrow.py'): ## This file only exists if you're using RAPIDS version 0.11 or higher
  exec(open("update_pyarrow.py").read(), globals())

During the installation process I got this error:


  - cudf=0.11

Current channels:

  - https://conda.anaconda.org/rapidsai-nightly/label/xgboost/linux-64
  - https://conda.anaconda.org/rapidsai-nightly/label/xgboost/noarch
  - https://conda.anaconda.org/rapidsai-nightly/linux-64
  - https://conda.anaconda.org/rapidsai-nightly/noarch
  - https://conda.anaconda.org/nvidia/linux-64
  - https://conda.anaconda.org/nvidia/noarch
  - https://conda.anaconda.org/conda-forge/linux-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/free/linux-64
  - https://repo.anaconda.com/pkgs/free/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/pro/linux-64
  - https://repo.anaconda.com/pkgs/pro/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

I have tried to install cuDF and cuML separately with

conda install -c rapidsai -c nvidia -c conda-forge \
    -c defaults cudf=0.12 python=3.6 cudatoolkit=10.0

but I still get the error:

ModuleNotFoundError Traceback (most recent call last)

<ipython-input-10-a95ca25217db> in <module>()
----> 1 import cudf
      2 import io, requests
      3 
      4 # download CSV file from GitHub
      5 url="https://github.com/plotly/datasets/raw/master/tips.csv"

ModuleNotFoundError: No module named 'cudf'

How can I solve this error?

Sathish Chelladurai
try
  • Is your Jupyter using the right Python kernel, i.e. the anaconda/miniconda one? – FlyingTeller Feb 12 '20 at 12:30
  • I use this code to install conda: ```!wget -c https://repo.anaconda.com/miniconda/Miniconda3-4.5.4-Linux-x86_64.sh !chmod +x Miniconda3-4.5.4-Linux-x86_64.sh !bash ./Miniconda3-4.5.4-Linux-x86_64.sh -b -f -p /usr/local``` – try Feb 12 '20 at 13:06
  • I was only able to successfully install RAPIDS / cuDF version 0.10 – try Feb 12 '20 at 13:10
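
One way to check the point raised in the first comment, namely whether the notebook's interpreter can see the conda-installed packages at all, is a short diagnostic like the one below. It is only a sketch: the helper name `describe_module` is made up for illustration, and it relies on nothing beyond the standard library (`sys.executable` and `importlib.util.find_spec`).

```python
import importlib.util
import sys


def describe_module(name):
    """Return the file `name` would be imported from, or None if it is not on sys.path."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Which Python binary is the notebook kernel actually running?
print("interpreter:", sys.executable)
# None here means no directory on sys.path contains a cudf package.
print("cudf found at:", describe_module("cudf"))
```

If the second line prints `None` even though conda reports the package as installed, the install directory is probably not on `sys.path` for this kernel.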

2 Answers

UPDATE (12/21/2020): To jump right into a GPU-powered RAPIDS notebook online, you can use BlazingSQL (RAPIDS 0.15+) or continue using Colaboratory (RAPIDS 0.14 only).

UPDATE (2/19/2020): Circling back to this question, Colab is working @try. Have fun!

Let us know if you have any other questions. If you need to update your personal Colab notebooks, please use this script to install RAPIDS:

# Install RAPIDS
!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!bash rapidsai-csp-utils/colab/rapids-colab.sh

import sys, os

dist_package_index = sys.path.index('/usr/local/lib/python3.6/dist-packages')
sys.path = sys.path[:dist_package_index] + ['/usr/local/lib/python3.6/site-packages'] + sys.path[dist_package_index:]
sys.path
exec(open('rapidsai-csp-utils/colab/update_modules.py').read(), globals())
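
Note that `sys.path.index(...)` raises `ValueError` when the `dist-packages` entry is not on `sys.path`, for example if the runtime's Python version is not 3.6. A more defensive variant of the path splice could look like the sketch below. The helper `add_before` is a hypothetical name, not part of the RAPIDS scripts, and the two paths are simply the ones used in the snippet above.

```python
import sys


def add_before(paths, anchor, new_entry):
    """Return a copy of `paths` with `new_entry` placed just before `anchor`.

    If `new_entry` is already present nothing changes; if `anchor` is
    absent, `new_entry` is appended instead of raising ValueError.
    """
    if new_entry in paths:
        return list(paths)
    if anchor in paths:
        i = paths.index(anchor)
        return paths[:i] + [new_entry] + paths[i:]
    return list(paths) + [new_entry]


# Paths taken from the snippet above; adjust the Python version to match the runtime.
sys.path = add_before(
    sys.path,
    "/usr/local/lib/python3.6/dist-packages",
    "/usr/local/lib/python3.6/site-packages",
)
```

Running the cell twice is harmless with this version, because the entry is only added once.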

Previous response:

We are in the middle of transitioning our Colab scripts to a new repo. We should have all our notebooks updated soon and will try to help others migrate as well, likely within 24 hours, if not by EOD today PST.

TaureanDyerNV
  • It seems this asks for `pyarrow` and `pynvml` first, so I installed them beforehand with `!conda install --yes --prefix /usr/local pyarrow pynvml`. It might be good to include this in the script. – igorkf Nov 11 '20 at 18:32
  • Following these directions, I got `pyarrow` installed without adding it, but `cudf` fails to load because of a missing shared library: Could not load shared object file: libllvmlite.so – epifanio Dec 17 '20 at 21:27
  • I somehow fixed that by copying it with `!cp /usr/local/lib/python3.6/dist-packages/llvmlite/binding/libllvmlite.so .` but then got stuck on a numba import error: `AttributeError: module 'numba' has no attribute 'core'` (conda says numba is installed correctly and up to date). This is with Python 3.6. – epifanio Dec 17 '20 at 21:34
  • Same here, I am also getting the error `AttributeError: module 'numba' has no attribute 'core'`. I tried upgrading numba and making changes in the `__init__.py` within the numba folder, but the issue doesn't seem to resolve. – Vivek Dec 20 '20 at 06:54
  • @epifanio - I went to the `__init__.py` within the numba folder in the Python site-packages folder, changed all the statements doing `import numba.core.types` to `from numba.core import types`, and additionally disabled the check `#_ensure_llvm()`. After this I was able to import cudf and also read a CSV file with it. The `__init__.py` file is in **/usr/local/lib/python3.6/site-packages/numba/** – Vivek Dec 20 '20 at 07:42
  • igorkf @vivek epifanio Right now, Google Colab supports Python 3.6 and RAPIDS 0.14. For RAPIDS 0.15+ (we're on 0.17 at the time of this comment), you can try app.blazingsql.com. They have instances that are RAPIDS-ready and require no additional installs. I will update my answer. I will also look into what fixes can be made to the rapids-colab install script. – TaureanDyerNV Dec 21 '20 at 17:33
  • Thank you @TaureanDyerNV - I have tried BlazingSQL and found it useful, given that the environment is already set up – Vivek Dec 22 '20 at 07:24
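
A possible way to investigate the `module 'numba' has no attribute 'core'` error from the comments, before editing files inside the numba package, is to check which copy of numba Python actually imports and what version it reports. As far as I know the `numba.core` subpackage only exists from Numba 0.49 onward, so an older copy that shadows the conda one (for example from `dist-packages`) would be consistent with this error, but that is an assumption about the cause, not a confirmed diagnosis. The helpers `parse_version` and `loaded_version` are hypothetical names used for illustration.

```python
import importlib


def parse_version(text):
    """Turn '0.48.0' into (0, 48, 0), ignoring non-numeric suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def loaded_version(name):
    """Import `name` and return (file, version tuple), or None if the import fails."""
    try:
        module = importlib.import_module(name)
    except Exception:
        # Broad on purpose: a missing shared library can surface as OSError, not ImportError.
        return None
    version = parse_version(getattr(module, "__version__", "0"))
    return getattr(module, "__file__", None), version


# Shows the path numba is loaded from and its version, or None if it cannot be imported.
print(loaded_version("numba"))
```

If the reported path is under `dist-packages` rather than `site-packages`, the conda-installed numba is not the one being imported.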

After running @TaureanDyerNV's code, RAPIDS suggested making the following code change. It takes 15 minutes to run.

!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!bash rapidsai-csp-utils/colab/rapids-colab.sh 0.19

import sys, os, shutil

sys.path.append('/usr/local/lib/python3.7/site-packages/')
os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'
os.environ['CONDA_PREFIX'] = '/usr/local'
for so in ['cudf', 'rmm', 'nccl', 'cuml', 'cugraph', 'xgboost', 'cuspatial']:
    fn = 'lib'+so+'.so'
    source_fn = '/usr/local/lib/'+fn
    dest_fn = '/usr/lib/'+fn
    if os.path.exists(source_fn):
        print(f'Copying {source_fn} to {dest_fn}')
        shutil.copyfile(source_fn, dest_fn)
if not os.path.exists('/usr/lib64'):
    os.makedirs('/usr/lib64')
for so_file in os.listdir('/usr/local/lib'):
    if 'libstdc' in so_file:
        shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib64/'+so_file)
        shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib/x86_64-linux-gnu/'+so_file)
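
Because the loop above overwrites files under `/usr/lib`, it can be useful to preview what it would copy first. The sketch below only lists the source and destination pairs and copies nothing. `plan_copies` is a hypothetical helper written for this illustration, while the directory names and library stems are the ones from the code above.

```python
import os


def plan_copies(source_dir, dest_dir, stems):
    """List (source, destination) pairs for lib<stem>.so files present in source_dir."""
    pairs = []
    for stem in stems:
        name = "lib" + stem + ".so"
        source = os.path.join(source_dir, name)
        if os.path.exists(source):
            pairs.append((source, os.path.join(dest_dir, name)))
    return pairs


stems = ["cudf", "rmm", "nccl", "cuml", "cugraph", "xgboost", "cuspatial"]
# Dry run: print what would be copied without touching /usr/lib.
for source, dest in plan_copies("/usr/local/lib", "/usr/lib", stems):
    print(f"would copy {source} -> {dest}")
```

An empty listing means none of the expected libraries were found under `/usr/local/lib`, which would suggest the install script did not finish.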
Echo9k
  • Given Google Colab's limits on GPU usage, how comfortable is it to work with RAPIDS in Colab notebooks? Have you hit any inconveniences? – Sergey Bushmanov Jul 25 '21 at 19:29