Questions tagged [pandarallel]

21 questions
3
votes
2 answers

How Does Python Apply a Method from one Library to the Object of Another?

When using pandarallel to use all cores when running .apply methods on my dataframes, I came across a syntax which I never saw before. Rather, it's a way of using dot syntax that I don't understand. import pandas as pd from pandarallel import…
Alan
  • 1,746
  • 7
  • 21
2
votes
0 answers

Not able to run apply function in parallel processing using Python

I have a DataFrame with a column named label. The values present in the column are: label [1,2] [0,2,1]. I want to create a vector of dimension 240 having value 1 at the positions present in label…
MAC
  • 1,345
  • 2
  • 30
  • 60
1
vote
1 answer

Is there a way to speed up this pandas function that extracts from a list by its index position?

I'm using some machine learning from the SBERT Python module to calculate the top K most common strings given an input corpus and a target corpus (in this case 100K vs 100K in size). The module is pretty robust and gets the comparison done pretty…
GreenGodot
  • 6,030
  • 10
  • 37
  • 66
1
vote
1 answer

"'utf-8' codec can't decode byte 0x80 in position 3131: invalid start byte" while reading XML files

I want to define a function that can be applied to each XML file in the directory in order to parse it and collect the content of its tags in a DataFrame. from xml.etree import ElementTree def func(path, filename): for filename in…
i_Tanya
  • 89
  • 1
  • 6
1
vote
0 answers

Getting error from pandarallel with groupby, while trying to parallelize prophet to panel-timeseries

Hi, I am trying to parallelize facebook-prophet over panel time series. Each series is independent of the others, so there should be no problem fitting them all together. What I want to do is fit a Prophet model to each series simultaneously. I tried…
CheeseBurger
  • 175
  • 5
1
vote
1 answer

pandarallel widgets don't work on Google Colab

Pandarallel supports nice progress widgets. However, I can't get them to appear when using Google Colab. I get output like this instead: This chunk of code, which is supposed to enable the widgets, runs successfully in my notebook (before I use any…
Brannon
  • 5,324
  • 4
  • 35
  • 83
1
vote
1 answer

Pandarallel not progressing and stuck in a deadlock

I am running an apply function on a pandas DataFrame using the pandarallel package, initialized with 4 cores. Unfortunately, the process is not processing even a single record, whereas the same operation without pandarallel's parallel functionality takes 3…
Jack Daniel
  • 2,527
  • 3
  • 31
  • 52
0
votes
1 answer

pandarallel and lambda function

I'm struggling with the pandarallel library. Here is what I'm doing: def ponerfecha(row): import datetime a = datetime.datetime(2023, 9, 10, row['HORA'], row['MINUTO']) return a CargaT['FECHATRX'] = CargaT.parallel_apply(lambda row:…
FG85
  • 41
  • 4
0
votes
0 answers

NameError looking for function when using parallel_apply from pandarallel

For some reason, when I try to use parallel_apply(), it raises a NameError for the function, even though the function has been declared. Even when I set the axis parameter, it says it's not supposed to be there. If I use the normal apply(),…
huhehu
  • 1
  • 1
0
votes
0 answers

Parallel computing on dataset

I'm totally new to Python (especially to parallel computing), and recently I was given a task to count all words, count unique words, and find the top 10 most frequent words in a given dataset (it contains 3 columns and 10k rows) in a parallel…
0
votes
0 answers

Python: Why is Pandarallel's first worker incredibly slow?

I am using pandarallel to apply a function to my pandas DataFrame. Everything works as expected, but the first worker (out of eight) is extremely slow: INFO: Pandarallel will run on 8 workers. INFO: Pandarallel will use standard multiprocessing data…
diggi2395
  • 185
  • 8
0
votes
0 answers

Pandas pandarallel parallel_apply

Here is a simple program that works in parallel. But it has an issue when I want to use a previous result in the apply. import pandas as pd import numpy as np from pandarallel import pandarallel pandarallel.initialize(nb_workers=8) #…
0
votes
0 answers

How can I parallelize this code in Python (pandas)?

I am trying to fuzzy-merge two tables, but they are really big and it takes too much time. Could you help and tell me how I can parallelize this code? Many thanks! for i in list1: mat1.append(process.extract(i, list2, limit=2)) SE['MergeName'] = mat1…
0
votes
0 answers

pandarallel package on Windows: infinite loop bug

This is not really a question but rather a bug report for the pandarallel package. This is the end of my code: ... print('Calculate costs NEG...') for i, group in tqdm(df_mol_neg.groupby('DELIVERY_DATE')): srl_slice =…
maxxel_
  • 437
  • 3
  • 13
0
votes
1 answer

How to use apply_parallel on db calls

I was using the apply_parallel function from the pandarallel library. The snippet below (a function call) iterates over rows and fetches data from MongoDB. Executing it throws an EOFError and a Mongo client warning, as given below: Mongo…
GeekGroot
  • 102
  • 6