I have a pandas DataFrame and would like to speed up `ffill` and `bfill` operations across multiple columns. What methods exist for running these fills on multiple columns in parallel?

  1. One alternative would be to use NumPy structured arrays and JIT-compile the code, operating on each column with `numba.prange`. This requires writing efficient ffill and bfill routines in NumPy.

  2. Is there another way to parallelize this operation, possibly using Dask or some other parallelization technique?
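
For reference, here is a minimal sketch of the kind of NumPy fill routine that option 1 would need, assuming all columns are floats and missing values are NaN. The function names `ffill_columns` and `bfill_columns` are hypothetical, and the code is vectorized across columns rather than parallel, so it is only a possible starting point for a Numba or Dask version:

```python
import numpy as np


def ffill_columns(arr):
    """Forward-fill NaNs down each column of a 2-D float array (hypothetical helper)."""
    arr = np.asarray(arr, dtype=float)
    n_rows = arr.shape[0]
    # Row index of every valid entry; 0 wherever the entry is NaN.
    idx = np.where(~np.isnan(arr), np.arange(n_rows)[:, None], 0)
    # Running maximum per column: the row of the most recent valid value so far.
    np.maximum.accumulate(idx, axis=0, out=idx)
    # Gather the values at those rows. Leading NaNs point at row 0 and stay NaN.
    return np.take_along_axis(arr, idx, axis=0)


def bfill_columns(arr):
    """Backward-fill NaNs: forward fill applied to the row-reversed array."""
    arr = np.asarray(arr, dtype=float)
    return ffill_columns(arr[::-1])[::-1]


data = np.array([
    [np.nan, 1.0],
    [2.0, np.nan],
    [np.nan, np.nan],
    [4.0, 5.0],
])
forward = ffill_columns(data)
backward = bfill_columns(data)
```

With a DataFrame `df`, the result could be wrapped back as `pd.DataFrame(ffill_columns(df.to_numpy(dtype=float)), index=df.index, columns=df.columns)`. Whether this beats the built-in `df.ffill()` depends on the data, so it is worth benchmarking both before adding any parallelism.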

DonQuixote
  • Related - https://stackoverflow.com/questions/41190852 – Divakar Nov 11 '19 at 20:33
  • Have you done any benchmarks? How do you know that `ffill()` and `bfill()` are problematic and would benefit from parallelization? – AMC Nov 11 '19 at 21:32

0 Answers