Highest Voted 'modin' Questions

18

votes

3 answers

Cannot install RAY

Ray library from RISE Lab (https://rise.cs.berkeley.edu/blog/pandas-on-ray/). I am using Windows 10 Pro, 64-bit, and running these scripts from the Anaconda prompt. I have tried both pip install ray and pip3 install ray with the same…

asked Feb 08 '19 at 07:55

cube

345
1
2
9

16

votes

2 answers

Comparison between Modin | Dask | Data.table | Pandas for parallel processing and out of memory csv files

What are the fundamental differences and primary use cases for Dask | Modin | Data.table? I checked the documentation of each library; all of them seem to offer a 'similar' solution to pandas' limitations.

python pandas dask modin

asked Jun 06 '19 at 19:31

Shubham Samant

171
1
5

7

votes

4 answers

Error while importing library "modin" in Python 3.6

import modin.pandas as pd I am importing the modin.pandas library on my Windows 10 machine but getting the error "AttributeError: module 'ray' has no attribute 'utils'". Did I miss anything while installing the modin library?

python python-3.x pandas ray modin

asked May 01 '21 at 11:18

Learnings

2,780
9
35
55

7

votes

0 answers

Is modin useful on AWS Lambda

AWS Lambda comes with 6 vCPUs. Modin for pandas promises to use cores to make processing efficient. Does it actually deliver on AWS Lambda, which otherwise does not support multi-threading, multi-processing, etc.? # import pandas as pd import…

pandas aws-lambda modin

asked Mar 29 '21 at 04:36

bonney

537
4
15

5

votes

2 answers

how to load modin dataframe from pyarrow or pandas

Since Modin does not support loading from multiple pyarrow files on s3, I am using pyarrow to load the data. import s3fs import modin.pandas as pd from pyarrow import parquet s3 = s3fs.S3FileSystem( key=aws_key, …

pyarrow modin

asked Sep 02 '20 at 12:23

galinden

610
8
13

4

votes

1 answer

modin pandas read_parquet() failed on ETag KeyError trying to read a partitioned parquet from s3

I created a dataframe in pandas and used to_parquet(...) to write it to S3 directly. The arguments are: df.to_parquet('s3://bucket/fn.parquet', compression='gzip', engine='fastparquet', partition_cols=['col1']) When I use pandas's…

python dataframe amazon-s3 parquet modin

asked Jun 23 '21 at 15:59

michaelgbj

290
1
10

4

votes

1 answer

Modin is taking more time than pandas for reading CSV

I'm using modin.pandas to scale pandas for a large dataset. However, when using pd.read_csv to load a 5 MB CSV dataset in a Jupyter notebook to compare the performance of modin.pandas and pandas, it gives an unexpected time duration of…

python pandas parallel-processing jupyter-notebook modin

asked Jan 18 '21 at 14:58

Shradha

2,232
1
14
26

3

votes

1 answer

Ray object store running out of memory using out of core. How can I configure an external object store like s3 bucket?

    import ray
    import numpy as np

    ray.init()

    @ray.remote
    def f():
        return np.zeros(10000000)

    results = []
    for i in range(100):
        print(i)
        results += ray.get([f.remote() for _ in range(50)])

Normally, when the object store fills up, it begins…

python ray modin

asked Feb 27 '21 at 23:50

testgauss321

77
1
5

3

votes

1 answer

Speeding up reading and operating on 30,000 csv files

I am using Python 3 and pandas (pd.read_csv) to read the files. There are no headers and the separator is ' |, | '. Also, the files are not .csv files and the operating system is CentOS. There are 30,000 files in a folder with a total size of 10 GB.…

python pandas modin

asked Sep 22 '20 at 16:25

Adienl

155
6

3

votes

3 answers

Unable to fully install and import Modin Package

I am trying to use the modin package to speed up my pandas dataframe calculations. In short, the installation has not been as straightforward as pip install modin. When simply running pip install modin, everything seems to be going fine (except for…

python-3.x pandas modin

asked Nov 14 '19 at 02:06

Merv Merzoug

1,149
2
19
33

2

votes

1 answer

modin shows a warning message "Perhaps you already have a cluster running?"

I am using modin to read an SQL table; however, I am getting this warning. import pyodbc import sqlalchemy as sal from sqlalchemy import create_engine import modin.pandas as pd from distributed import Client client = Client() …

python pandas sqlalchemy modin

asked Apr 12 '21 at 09:59

Debayan

572
6
16

2

votes

1 answer

ERROR: No matching distribution found for pandas==1.0.3 (from modin)

I'm trying to speed up my code using parallel processing with the modin library. I tried to do it with the dask engine on my Windows 10 computer, but it didn't work; I thought that was because it is still under development. I read that you can't…

python pandas parallel-processing ray modin

asked Jul 16 '20 at 12:58

Geno

21
1
3

2

votes

1 answer

Faster pandas apply using modin.pandas

I am trying to use all cores for this apply function using modin.pandas: from nltk.sentiment.vader import SentimentIntensityAnalyzer sid = SentimentIntensityAnalyzer() # sentiment Score of essay data = data.merge(data.essay.apply(lambda s:…

python pandas nlp modin

asked Jan 11 '20 at 07:43

dracarys3

107
2
12

2

votes

1 answer

My code is running properly in pandas, but not in modin

When I use pandas, the code works perfectly (but very slowly), but when I use modin and concatenate dataframes, it shows me an error. contador = 0 df = pd.DataFrame() data = pd.DataFrame() for file in range(len(files)): usefile = files[file] …

python pandas csv concatenation modin

asked Apr 15 '19 at 19:46

zkittlez

21
3

1

vote

1 answer

import modin.pandas and ray() don't close file

I'm trying to use modin and ray(), but I can't move a file after reading it. At the line shutil.move(f"./IMPORT/"+file,f"./IMPORTED/"+file) the file is still open. Is there some way to close it and move it to another folder? Here is the entire code: import os …

python ray modin

asked Feb 16 '23 at 07:29

Angelo Malfitano

27
6
