Hi there, I'm trying to run a big for loop with 239500 iterations. In my tests, 200 iterations take about 1 hour, so the full run would need roughly 1200 hours, i.e. about 50 days (nearly 2 months) of CPU time.

This is the loop:

for i in range(0, MonteCarlo):
    print('Performing Monte Carlo ' + str(i) + '/' + str(MonteCarlo))

    MCR = scramble(YearPos)
    NewPos = reduce(operator.add, YearPos)
    C = np.cov(VAR[NewPos, :], rowvar=0)

    s, eof = eigs(C, k=neof, which='LR')
    sc = (s.real / np.sum(s) * 100)**2
    tcs = np.sum(sc)

    MCH = sc/tcs

    Hits[(MCH >= pcvar)] += 1

    if (Hits >= CL).all():
        print("Number of Hits is greater than 5 !!!")
        break

Here np stands for numpy and scramble stands for random.shuffle. The calculations performed within the for loop do not depend on each other.

Is there any way to run the loop in parallel? I have 12 cores and only 1 is running. In Matlab I would use a parfor; is there anything similar in Python?

Thanks in advance

Weather-Pirate
  • May I assume `np` is `numpy`? Maybe provide relevant import lines to bring in more context. – woozyking Jan 17 '14 at 16:28
  • Take a look at the multiprocessing package. http://docs.python.org/3.3/library/multiprocessing.html – Jason Lv Jan 17 '14 at 16:30
  • Yes There Is A Way `:)` However, you're going to run into trouble if you want to write to the global `Hits` array. Also, if `YearPos` contains the same data between iterations, you use the same `NewPos` each time, because your `reduce` statement is just a sum. Please correct me if I'm wrong. –  Jan 17 '14 at 18:55
  • A running time of 5 *days* is good enough for you? Because, even if we assume 0 overhead due to multiprocessing (which is *utterly* false) and a total running time of 2 months, you cannot expect the code to be faster than that. You should probably change the approach to achieve a better running time. – Bakuriu Jan 17 '14 at 19:16
  • @Bakuriu, 5 days is perfect compared to 2 months (I can wait 5 days =) ) – Weather-Pirate Jan 20 '14 at 10:44
  • @moarningsun, the Hits variable can be the sum of the Hits from the processes. MCR is an n by p matrix that can be generated in each process independently, and NewPos is an n*p vector. I use the reduce to reshape MCR (type list) to a vector. Why is MCR a list? I have daily values for some measurements, and to perform a Monte Carlo statistical analysis on an SVD I need to scramble the data by year, maintaining the year variability. I use a list so that the positions of each year in the original matrix, for years with 365 and 366 days, can be held in the same object. – Weather-Pirate Jan 20 '14 at 10:54
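
To illustrate the idea discussed in the comments (each process runs its own share of the iterations and the per-process Hits are summed at the end), here is a minimal sketch using `multiprocessing.Pool`. It is not the asker's actual computation: `one_trial`, `run_chunk`, `parallel_hits`, and the values of `NEOF` and `PCVAR` are hypothetical names and numbers chosen for illustration, random data stands in for `VAR[NewPos, :]`, and `np.linalg.eigvalsh` is swapped in for `eigs` so the example needs only NumPy (1.17 or later, for `default_rng`). The early `break` from the original loop is dropped here, since the chunks run independently.

```python
import multiprocessing as mp

import numpy as np

NEOF = 3        # number of leading eigenvalues to keep (assumed value)
PCVAR = 0.4     # threshold compared against MCH (assumed value)


def one_trial(rng):
    """Placeholder for one Monte Carlo iteration; returns a boolean mask of hits."""
    data = rng.standard_normal((50, 6))      # stand-in for VAR[NewPos, :]
    c = np.cov(data, rowvar=False)
    # eigvalsh is a stand-in for eigs(C, k=neof, which='LR'); C is symmetric.
    s = np.linalg.eigvalsh(c)[::-1][:NEOF]   # largest NEOF eigenvalues
    sc = (s / s.sum() * 100) ** 2
    return sc / sc.sum() >= PCVAR


def run_chunk(task):
    """Run n_iter trials in one worker and return that worker's hit counts."""
    seed, n_iter = task
    rng = np.random.default_rng(seed)        # separate random stream per chunk
    hits = np.zeros(NEOF, dtype=int)
    for _ in range(n_iter):
        hits[one_trial(rng)] += 1
    return hits


def parallel_hits(n_total, n_workers, base_seed=0):
    """Split n_total trials into n_workers chunks and sum the per-chunk hits."""
    sizes = [n_total // n_workers + (k < n_total % n_workers)
             for k in range(n_workers)]
    tasks = [(base_seed + k, n) for k, n in enumerate(sizes)]
    with mp.Pool(n_workers) as pool:
        partial = pool.map(run_chunk, tasks)
    return np.sum(partial, axis=0)


if __name__ == "__main__":
    print(parallel_hits(n_total=1200, n_workers=4))
```

Giving each worker one large chunk rather than one iteration per task keeps the inter-process overhead small, and returning the counts avoids writing to a shared global `Hits` array. With 12 cores one would presumably pass `n_workers=12` and check `(hits >= CL).all()` on the summed result.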
