I'm trying to use Modin on Databricks, but reading a file with it fails with the error shown below.

I've tried both `pip install modin[all]` and `pip install modin[ray]`.

First, the installation takes 15 minutes, which seems unusually long.

After installing, I run:

    import modin.pandas as md
    df = md.read_parquet('s3://path/to/file')

This raises:

    ModuleNotFoundError: No module named 'ray'

I have also tried setting `os.environ["MODIN_ENGINE"] = "ray"`, with the same result.

Peter Cordes
Vishal Balaji
  • @Ramya Ravi - I don't think we need to add the [intel] tag when there's a specific tag like [intel-modin] that covers what makes the question Intel-related. I follow the [intel] tag because sometimes people tag it instead of [x86] on [assembly] questions, but I don't want to see it tagged on questions where it doesn't need to be, like this or [intel-fortran] or other questions about software Intel happens to make. The Intel collective already includes all the `[intel-whatever]` tags, or should. Is that how you folks working for Intel(?) see the [intel] tag? – Peter Cordes Oct 20 '22 at 13:27

2 Answers

I followed the steps below to install Modin with the Ray execution engine. First, install Modin together with Ray and its dependencies:

    pip install modin[ray] 

If you want to customize the Ray environment that Modin uses, initialize Ray yourself before importing Modin:

    import ray
    ray.init()  # initialize Ray before Modin is imported
    import modin.pandas as pd

For installation issues, and for accelerating pandas workflows on Intel architectures, see the Intel Distribution of Modin (https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-of-modin.html#gs.14j7r0) and the official Modin documentation (https://modin.readthedocs.io/en/stable/).
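Regarding the `MODIN_ENGINE` variable mentioned in the question: as far as I know, it should be set before Modin is imported, because a value set afterwards may not be picked up. Below is a minimal sketch of that ordering. The helper name `select_modin_engine` is hypothetical (it is not part of Modin), and it only sets the variable and reports whether Modin was already loaded in the current session.

```python
import os
import sys


def select_modin_engine(engine: str = "ray") -> bool:
    """Set MODIN_ENGINE and report whether this happened before Modin was imported.

    Hypothetical helper for illustration; not part of the Modin API.
    """
    # Record whether any Modin module is already loaded in this interpreter.
    already_imported = "modin" in sys.modules
    os.environ["MODIN_ENGINE"] = engine
    return not already_imported


if select_modin_engine("ray"):
    print("MODIN_ENGINE set before importing Modin")
else:
    print("Modin was already imported; consider restarting the session")
```

After this runs, `import modin.pandas as pd` would follow as in the snippet above. Note that this does not install Ray; the engine setting only matters once the `ray` package is importable.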

Ramya R

Try installing Ray directly:

    pip install ray

Maybe this will help.
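If the error persists after installing, one possible cause (an assumption, not something confirmed in the question) is that `pip` installed the package into a different Python environment than the one running the notebook. A quick sketch to check which interpreter is in use and whether it can find `modin` and `ray`, using only the standard library; `is_importable` is a hypothetical helper name:

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Show which Python executable this session is running.
print("interpreter:", sys.executable)

for name in ("modin", "ray"):
    print(name, "importable:", is_importable(name))
```

If `ray` is reported as not importable, installing with the interpreter shown above (for example by running that executable with `-m pip install ray`) targets the same environment the code runs in.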

Berlin Benilo