0

I have a (actually, two) questions regarding Python and showing progress bars. I am writing an algorithm that operates on large datasets (~10^10 elements or more) and the two most time-consuming steps are performing a 2D-FFT (implemented using scipy.fft.fft2) and sorting the flattened array (implemented using numpy.argsort). As such, I would love to have some sort of progress indication (preferably using tqdm), but I am not sure how to implement this on package supplied functions.

As I understand a 2D-FFT is just a 1D-fft on all the rows and then the columns: therefore, I suspect it is possible to show the current row/column on which the FFT is performed? As far as numpy.argsort goes (which I have implemented using the default QuickSort), I am not sure if the runtime can be determined a priori (or estimated) as I am not really familiar with the algorithm. Any solution, tips or feedback is appreciated!

MWE:

import numpy as np
from scipy.fft import fft2
# Import tqdm       << preferably some wrapper for tqdm?

# Create random data
data = np.random.uniform(0, 1E4, (10000, 10000))

# Perform fft2
transform = fft2(data)
print("FFT complete!")

# Sort the array
indices = np.argsort(data.flatten())
print("Sort complete!")

I have not really tried writing my own implementation of 2D-FFT/QuickSort, as that seems like a last resort and has the major disadvantages that subsequent updates/improvements of the former methods are not incorperated.

Ideally, I would (probably) like some kind of wrapper, but I don't know if this is possible without modifying the original methods..

  • 1
    A 2D FFT *can* be implemented via repeated 1D FFTs but specialized 2D implementations are faster so I doubt you get anything good out of this. In contrast, I recommend you switch to [pyFFTW](https://stackoverflow.com/questions/6365623/improving-fft-performance-in-python) if that is license-compatible (GPL) with your application. It is generally faster, especially when you use the parallelized versions of FFTW. – Homer512 Aug 24 '23 at 22:12
  • Thanks, I was guessing that it would be cumbersome (but hoping that it would be circumventable :) And thanks for the tip, I'll definitely check that out! – Bart Wolleswinkel Aug 25 '23 at 09:49
  • 1
    Short of benchmarking and curve-fitting, I don't think estimating the runtime a-priori will work. I mean, everyone knows that sorting and 1D FFT take O(n log(n)) time but on real hardware you'll find discontinuities where implementations change, effects depending on cache sizes, and of course background activity – Homer512 Aug 25 '23 at 10:01

0 Answers0