I have about 4 input text files that I want to read and then write into one separate file.
I use two threads so that it runs faster.
Here are my questions and my code in Python:
1-Does each thread have its own copy of variables such as "lines" inside the function "writeInFile"?
2-Since I copied some parts of the code from Tutorialspoint, I don't understand what "while 1: pass" on the last line does. Can you explain? Link to the original code: http://www.tutorialspoint.com/python/python_multithreading.htm
3-Does it matter what delay I pass to the threads?
4-If I have about 400 input text files and want to do some operations on them before writing all of them into a separate file, how many threads can I use?
5-Assuming I use 10 threads, is it better to split the inputs into different folders (10 folders with 40 input text files each) and give each thread one folder, OR to keep what I have already done in the code below, where each thread reads any of the 400 input text files that has not already been read by another thread?
import glob
import thread
import time
import timeit

# Files already read by one thread, so the other thread does not read them again
processedFiles = []

# Function run by the threads
def writeInFile(threadName, delay):
    for file in glob.glob("*.txt"):
        if file not in processedFiles:
            processedFiles.append(file)
            f = open(file, "r")
            lines = f.readlines()
            f.close()
            time.sleep(delay)
            # open the output file to append to
            f = open('myfile', 'a')
            f.write("%s \n" % lines)
            f.close()
            print "%s: %s" % (threadName, time.ctime(time.time()))

# Create two threads as follows
try:
    # empty the output file before the threads start appending to it
    f = open('myfile', 'r+')
    f.truncate()
    f.close()
    start = timeit.default_timer()
    thread.start_new_thread(writeInFile, ("Thread-1", 0,))
    thread.start_new_thread(writeInFile, ("Thread-2", 0,))
    stop = timeit.default_timer()
    print stop - start
except:
    print "Error: unable to start thread"

while 1:
    pass
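
For comparison, here is a minimal sketch of how the "skip files another thread already took" check could be made safe, assuming the `threading` module is used instead of `thread`. The names `claim`, `worker`, and `run` are hypothetical and not part of the code above, and the file reading and writing are left out so that only the coordination is shown. A lock makes the check-and-mark step atomic, and `join()` waits for the threads to finish, which is one possible substitute for the `while 1: pass` loop.

```python
import threading


def claim(path, processed, lock):
    """Return True only for the first thread that asks for `path` (hypothetical helper)."""
    with lock:
        # The membership test and the insertion happen under one lock,
        # so two threads can never both claim the same file.
        if path in processed:
            return False
        processed.add(path)
        return True


def worker(paths, processed, lock, results):
    """Claim each unclaimed path and record which thread handled it."""
    for path in paths:
        if claim(path, processed, lock):
            # Local names such as `entry` are private to the running thread;
            # only `processed` and `results` are shared between threads.
            entry = (threading.current_thread().name, path)
            with lock:
                results.append(entry)


def run(paths, n_threads=2):
    """Start `n_threads` workers over the same path list and wait for all of them."""
    processed, lock, results = set(), threading.Lock(), []
    threads = [
        threading.Thread(
            target=worker,
            args=(paths, processed, lock, results),
            name="Thread-%d" % (i + 1),
        )
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until the worker has finished, instead of spinning forever
    return results
```

Calling `run(glob.glob("*.txt"))` would return one `(thread name, file name)` pair per file, with each file appearing exactly once regardless of which thread picked it up.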