I'm fairly new to Python, so I apologize in advance for the (numerous) errors you will see in my code.
I have a fairly simple Python script that is causing me some trouble: the code reads an output file coming from a Gate simulation, whose content is mostly numerical values in ASCII format. The data in the file is organized in rows of 22 numbers each; the total number of rows depends on the simulation running time and can be quite large (the largest file I have at the moment is around 11 GB). Here's a typical line of the file as reference:
0 0 -1 0 0 -1 -1 4.99521423437116252053158e-01 7.910e-02 1.347e+01 -1.600e+01 -1.600e+01 -1.347e+01 22 1 0 -1 -1 -1 Compton NULL NULL
What my Python code does is read the numerical values from each row of this file and extract the ones I'm interested in; some of the extracted values are then analyzed in a for loop, and finally a group of these values is written to an output file in ASCII format.
The code works fine; however, as I started working with longer simulation times (and therefore larger input files), the running time of the code has become too long, making it impractical. My goal is to reduce the running time as much as possible.
Here's the part of the code that I'm currently using and that is causing the slowdown:
#position of centres
PMTloc =np.array([[-270.00, -156.00], [-270.00, -104.00], [-270.00, -52.00], [-270.00, 0.00], [-270.00, 52.00], [-270.00, 104.00], [-270.00, 156.00], [-225.00, -182.00], [-225.00, -130.00], [-225.00, -78.00], [-225.00, -26.00], [-225.00, 26.00], [-225.00, 78.00], [-225.00, 130.00], [-225.00, 182.00], [-180.00, -208.00], [-180.00, -156.00], [-180.00, -104.00], [-180.00, -52.00], [-180.00, 0.00], [-180.00, 52.00], [-180.00, 104.00], [-180.00, 156.00], [-180.00, 208.00], [-135.00, -234.00], [-135.00, -182.00], [-135.00, -130.00], [-135.00, -78.00], [-135.00, -26.00], [-135.00, 26.00], [-135.00, 78.00], [-135.00, 130.00], [-135.00, 182.00], [-135.00, 234.00], [-90.00, -208.00], [-90.00, -156.00], [-90.00, -104.00], [-90.00, -52.00], [-90.00, 0.00], [-90.00, 52.00], [-90.00, 104.00], [-90.00, 156.00], [-90.00, 208.00], [-45.00, -234.00], [-45.00, -182.00], [-45.00, -130.00], [-45.00, -78.00], [-45.00, -26.00], [-45.00, 26.00], [-45.00, 78.00], [-45.00, 130.00], [-45.00, 182.00], [-45.00, 234.00], [0.00, -208.00], [0.00, -156.00], [0.00, -104.00], [0.00, -52.00], [0.00, 0.00], [0.00, 52.00], [0.00, 104.00], [0.00, 156.00], [0.00, 208.00], [45.00, -234.00], [45.00, -182.00], [45.00, -130.00], [45.00, -78.00], [45.00, -26.00], [45.00, 26.00], [45.00, 78.00], [45.00, 130.00], [45.00, 182.00], [45.00, 234.00], [90.00, -208.00], [90.00, -156.00], [90.00, -104.00], [90.00, -52.00], [90.00, 0.00], [90.00, 52.00], [90.00, 104.00], [90.00, 156.00], [90.00, 208.00], [135.00, -234.00], [135.00, -182.00], [135.00, -130.00], [135.00, -78.00], [135.00, -26.00], [135.00, 26.00], [135.00, 78.00], [135.00, 130.00], [135.00, 182.00], [135.00, 234.00], [180.00, -208.00], [180.00, -156.00], [180.00, -104.00], [180.00, -52.00], [180.00, 0.00], [180.00, 52.00], [180.00, 104.00], [180.00, 156.00], [180.00, 208.00], [225.00, -182.00], [225.00, -130.00], [225.00, -78.00], [225.00, -26.00], [225.00, 26.00], [225.00, 78.00], [225.00, 130.00], [225.00, 182.00]])
#checking number of arguments passed
n = len(sys.argv)
if n < 3:
    print("Error: number of arguments passed is incorrect. Try again...")
    sys.exit(1)
print("argv[0]: {0}".format(argv[0]))
print("argv[1]: {0}".format(argv[1]))
input_file = argv[1]
output_file = argv[2]
hits = []
with open(input_file, 'r') as DataIn:
    for line in DataIn:
        hits += [line.split()]
totalPMT = 108
runID = [x[0] for x in hits]
runid = [int(i) for i in runID]
eventID = [x[1] for x in hits]
eventid = [int(i) for i in eventID]
time = [x[7] for x in hits]
time_fl = [float(i) for i in time]
posX = [x[10] for x in hits]
posx = [float(i) for i in posX]
posY = [x[11] for x in hits]
posy = [float(i) for i in posY]
posZ = [x[12] for x in hits]
posz = [float(i) for i in posZ]
partID = [x[13] for x in hits]
partid = [int(i) for i in partID]
PMTs = [0]*108 #initial output signal
nscinti = 0
nphoton = 0
startscinti = 0
zscinti = 0
Etres = 100 #energy threshold
numb_22 = 0
numb_0 = 0
eventIDnow = [0]*len(eventid)
eventidnow = [int(i) for i in eventIDnow]
runIDnow = [0]*len(runid)
runidnow = [int(i) for i in runIDnow]
for i in range(len(eventid)):
    if ((eventid[i] > eventidnow[i]) | (runid[i] > runidnow[i])):
        if ((nscinti > 0) & (partid[i] == 22) & (nphoton > Etres)): #check if last recorded event should be written
            with open(output_file, 'a+') as DataOut:
                DataOut.write("{0} {1} {2}\n".format(startscinti, zscinti, ' '.join(map(str, PMTs))))
        for l in range(len(eventidnow)): #updating values of eventidnow and runidnow
            eventidnow[l] = eventid[i]
            runidnow[l] = runid[i]
        startscinti = time_fl[i]
        zscinti = posz[i]
        nphoton = 0
        nscinti += 1
        for j in range(len(PMTs)):
            PMTs[j] = 0
    #checking where the event gets recorded
    for k in range(len(PMTs)):
        if ((k <= totalPMT) & ((posx[i]-PMTloc[k][0])*(posx[i]-PMTloc[k][0]) + (posy[i]-PMTloc[k][1])*(posy[i]-PMTloc[k][1]) <= 23*23)):
            PMTs[k] += 1
            nphoton += 1
            break
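As a side note on the inner PMT search: since the centres are already a NumPy array, the per-row loop over all 108 centres could be expressed as a single vectorized distance test. This is only a sketch of the idea (with a shortened, made-up PMTloc just for the example), not a tested drop-in replacement for my loop:

```python
import numpy as np

# Shortened list of centres just for this example; the real script uses all 108.
PMTloc = np.array([[-270.0, -156.0], [-270.0, -104.0], [-270.0, -52.0]])

def find_pmt(px, py, centres, radius=23.0):
    """Return the index of the first PMT whose centre lies within `radius`
    of (px, py), or None if the hit lands outside every PMT."""
    d2 = (centres[:, 0] - px) ** 2 + (centres[:, 1] - py) ** 2
    inside = np.flatnonzero(d2 <= radius * radius)
    return int(inside[0]) if inside.size else None
```

With such a helper, the inner `for k in range(len(PMTs))` loop would collapse to a single `find_pmt(posx[i], posy[i], PMTloc)` call per row; whether it is actually faster would need measuring, since the original loop already breaks early on a match.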
I'm fully aware that this code is far from perfect and optimized, so I'm open to any suggestion on how to improve it.
From the measurements I have done, it takes around 35 minutes for the script to complete on a 224 MB input file; with an input file of around 10 GB, the running time goes above 100 hours.
I would like to add that I have access to a cluster where I can run this code with up to 12-16 cores; ideally I would like to take advantage of these cores to improve the performance of my script. By doing some research I found out about multiprocessing and parallelization, but, due to my inexperience, I'm not sure whether those methods can be applied to my case, nor whether they would actually reduce the running time; in any case, I wasn't able to implement them correctly. Any help on how I could use multiprocessing or parallelization would be appreciated.
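From my reading, the usual pattern would be to read the file once, split the list of rows into consecutive chunks, and hand each chunk to a worker with `multiprocessing.Pool`. Below is a minimal sketch of that structure (the `process` function here is just a placeholder for the real per-chunk analysis); one caveat I'm aware of is that my analysis carries state across rows (event boundaries), so the chunk boundaries would need extra care to avoid splitting an event between workers:

```python
from multiprocessing import Pool

def chunks(seq, n):
    """Split seq into n consecutive chunks of near-equal length."""
    k, m = divmod(len(seq), n)
    return [seq[i * k + min(i, m):(i + 1) * k + min(i + 1, m)] for i in range(n)]

def process(chunk):
    # Placeholder for the real analysis: it would return the output
    # lines produced from this chunk of rows.
    return [row[0] for row in chunk]

if __name__ == "__main__":
    hits = [["0", "x"], ["1", "x"], ["2", "x"], ["3", "x"], ["4", "x"]]
    with Pool(2) as pool:
        partial = pool.map(process, chunks(hits, 2))
    # pool.map preserves chunk order, so flattening gives results in file order
    results = [line for part in partial for line in part]
```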
In any case, I'm open to any possible suggestion to improve my code; it doesn't have to take advantage of the cluster if there are better ways to achieve my goal.
Thanks everyone for your time!
UPDATE 19/08
Thanks to the help of @DarrylG I was able to improve the running time of my script to around 1/8 of the initial time, and I'm happy with this result. Here is the version that I'm using right now:
import sys
from sys import argv
import os
import itertools
import time
import numpy as np
from timeit import default_timer as timer
from datetime import timedelta
def int_or_float(s):
    ' Convert string to int or float '
    try:
        return int(s)
    except ValueError:
        try:
            return float(s)
        except ValueError:
            return s
def process(hits):
    totalPMT = 108
    runid, eventid, time_fl, posx, posy, posz, partid = zip(*[(x[0], x[1], x[7], x[10], x[11], x[12], x[13]) for x in hits])
    runid = [int(i) for i in runid]
    eventid = [int(i) for i in eventid]
    time_fl = [float(i) for i in time_fl]
    posx = [float(i) for i in posx]
    posy = [float(i) for i in posy]
    posz = [float(i) for i in posz]
    partid = [int(i) for i in partid]
    PMTs = [0]*108 #initial output signal
    nscinti, nphoton, startscinti, zscinti = [0]*4
    Etres = 100 #energy threshold
    # Previous value of eventID and runID
    eventIDnow = 0
    runIDnow = 0
    results = []
    for i in range(len(eventid)):
        if (eventid[i] > eventIDnow) or (runid[i] > runIDnow):
            if (nscinti > 0) and (partid[i] == 22) and (nphoton > Etres):
                print("Writing output file")
                results.append("{0} {1} {2}\n".format(startscinti, zscinti, ' '.join(map(str, PMTs))))
            eventIDnow = eventid[i]
            runIDnow = runid[i]
            startscinti = time_fl[i]
            zscinti = posz[i]
            nphoton = 0
            nscinti += 1
            PMTs = [0]*len(PMTs)
        #checking where the event gets recorded
        k = next((k for k in range(len(PMTs))
                  if (k <= totalPMT) and
                     ((posx[i]-PMTloc[k][0])*(posx[i]-PMTloc[k][0]) +
                      (posy[i]-PMTloc[k][1])*(posy[i]-PMTloc[k][1]) <= 23*23)),
                 None)
        if k is not None:  # `if k:` would wrongly skip PMT index 0
            PMTs[k] += 1
            nphoton += 1
    return results
#position of centres
PMTloc =np.array([[-270.00, -156.00], [-270.00, -104.00], [-270.00, -52.00], [-270.00, 0.00], [-270.00, 52.00], [-270.00, 104.00], [-270.00, 156.00], [-225.00, -182.00], [-225.00, -130.00], [-225.00, -78.00], [-225.00, -26.00], [-225.00, 26.00], [-225.00, 78.00], [-225.00, 130.00], [-225.00, 182.00], [-180.00, -208.00], [-180.00, -156.00], [-180.00, -104.00], [-180.00, -52.00], [-180.00, 0.00], [-180.00, 52.00], [-180.00, 104.00], [-180.00, 156.00], [-180.00, 208.00], [-135.00, -234.00], [-135.00, -182.00], [-135.00, -130.00], [-135.00, -78.00], [-135.00, -26.00], [-135.00, 26.00], [-135.00, 78.00], [-135.00, 130.00], [-135.00, 182.00], [-135.00, 234.00], [-90.00, -208.00], [-90.00, -156.00], [-90.00, -104.00], [-90.00, -52.00], [-90.00, 0.00], [-90.00, 52.00], [-90.00, 104.00], [-90.00, 156.00], [-90.00, 208.00], [-45.00, -234.00], [-45.00, -182.00], [-45.00, -130.00], [-45.00, -78.00], [-45.00, -26.00], [-45.00, 26.00], [-45.00, 78.00], [-45.00, 130.00], [-45.00, 182.00], [-45.00, 234.00], [0.00, -208.00], [0.00, -156.00], [0.00, -104.00], [0.00, -52.00], [0.00, 0.00], [0.00, 52.00], [0.00, 104.00], [0.00, 156.00], [0.00, 208.00], [45.00, -234.00], [45.00, -182.00], [45.00, -130.00], [45.00, -78.00], [45.00, -26.00], [45.00, 26.00], [45.00, 78.00], [45.00, 130.00], [45.00, 182.00], [45.00, 234.00], [90.00, -208.00], [90.00, -156.00], [90.00, -104.00], [90.00, -52.00], [90.00, 0.00], [90.00, 52.00], [90.00, 104.00], [90.00, 156.00], [90.00, 208.00], [135.00, -234.00], [135.00, -182.00], [135.00, -130.00], [135.00, -78.00], [135.00, -26.00], [135.00, 26.00], [135.00, 78.00], [135.00, 130.00], [135.00, 182.00], [135.00, 234.00], [180.00, -208.00], [180.00, -156.00], [180.00, -104.00], [180.00, -52.00], [180.00, 0.00], [180.00, 52.00], [180.00, 104.00], [180.00, 156.00], [180.00, 208.00], [225.00, -182.00], [225.00, -130.00], [225.00, -78.00], [225.00, -26.00], [225.00, 26.00], [225.00, 78.00], [225.00, 130.00], [225.00, 182.00]])
#checking number of arguments passed
if __name__ == "__main__":
    n = len(sys.argv)
    if n < 3:
        print("Error: number of arguments passed is incorrect. Try again...")
        sys.exit(1)
    print("argv[0]: {0}".format(argv[0]))
    print("argv[1]: {0}".format(argv[1]))
    input_file = argv[1]
    output_file = argv[2]
    #t0 = time.time()
    start = timer()
    hits = []
    with open(input_file, 'r') as data_in:
        for line in data_in:
            hits += [line.split()]
    results = process(hits)
    with open(output_file, 'a') as data_out:
        data_out.writelines(results)
    end = timer()
    print("Elapsed Time: ", timedelta(seconds=end-start))
I do have a question about this script, though; to be precise, it concerns this part:
hits = []
with open(input_file, 'r') as data_in:
    for line in data_in:
        hits += [line.split()]
This way of splitting the lines of the input file should be very inefficient compared to the following one, which uses the function int_or_float(s):
with open(input_file, 'r') as data_in:
    hits = [[int_or_float(s) for s in line.split()] for line in data_in]
However, after testing with multiple input files (with sizes ranging from a few MB to a couple of GB), I'm getting the best results, time-wise, with the first method (the supposedly inefficient one). As a reference, analyzing the same input file of around 2.3 GB, the "inefficient" method takes around 50 minutes while the "function" method takes around 55 minutes; in both cases I'm using the same Python code to analyze the input file, the only difference being which of the two reading methods is used.
Any idea why this could be happening?
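My current guess (not something I have profiled properly): int_or_float() adds a Python function call, and often a raised-and-caught ValueError, for every one of the 22 tokens on every line, while the "inefficient" version defers conversion and the analysis later converts only the 7 columns it actually uses. A third option I'd like to time is converting just the needed columns while reading; a sketch, using the column indices from the script above:

```python
def parse_line(line):
    """Convert only the columns the analysis uses:
    runID, eventID, time, posX, posY, posZ, partID."""
    t = line.split()
    return (int(t[0]), int(t[1]), float(t[7]),
            float(t[10]), float(t[11]), float(t[12]), int(t[13]))

sample = ("0 0 -1 0 0 -1 -1 4.99521423437116252053158e-01 7.910e-02 1.347e+01 "
          "-1.600e+01 -1.600e+01 -1.347e+01 22 1 0 -1 -1 -1 Compton NULL NULL")
row = parse_line(sample)
```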
Thank you for your help!
UPDATE 20/08
Thanks to @DarrylG I got a new version of the script, which I had the chance to test this morning. I compared my latest version (the one posted in the 19/08 update above) against DarrylG's latest version, which can be found in the accepted answer (noted as "Faster"). I measured the time both versions need to generate a complete output file, using input files of different sizes:
Input file size 224 MB
- Computational time of my version: 4 minutes 10 seconds.
- Computational time of DarrylG's version: 4 minutes 22 seconds.
Input file size 2.3 GB
- Computational time of my version: 48 minutes 3 seconds.
- Computational time of DarrylG's version: 42 minutes 2 seconds.
I also checked the output files that were generated to see if there were any differences, and they're identical.
Although this is far from an in-depth test, I believe it's safe to say that DarrylG's latest version is the fastest one, especially when working with large input files.
Thanks everyone for your help, in case someone has more questions about the performance of the script feel free to contact me.