In a script that processes 500k links, validating each one as XML and against a RELAX NG schema, I tried to count the different cases in myFunc(). Since the counters are global variables, I had to declare them global in myFunc() before I could change them. When I print their values inside myFunc(), I can see them change to 1, 2, 3, 4 and so on. But when I print them in run(), I do not get the changed values: all three variables are 0 in run(), just as they were before myFunc() changed them.
I know there are much better ways to do this job. But my question is why the changes to the globals are not visible in run(), and whether there is a way to make this work.
Does it have to do with multiprocessing?
valid = 0
excpt = 0
relaxerr = 0
def myFunc(link):
    try:
        global valid
        valid += 1
        print valid
        doc = etree.parse(urllib2.urlopen(link))
    except Exception, e:
        global excpt
        excpt += 1
        print excpt
        with open('log.txt', 'a') as f:
            f.write('%s\n' % e)
        return
    if not RELAXNG.validate(doc):
        global relaxerr
        relaxerr += 1
        print relaxerr
        with open('log.txt', 'a') as f:
            f.write('%s\n' % RELAXNG.error_log)
        return
    ....
    do stuff for valid ....
def run():
    ...
    result = pool.map_async(myFunc, links, 64)
    result.wait()
    print valid
    print excpt
    print relaxerr