I'm trying to write a simple function that reads a series of files and performs some regex search (or just a word count) on them and then return the number of matches, and I'm trying to make this run in parallel to speed it up, but so far I have been unable to achieve this.
If I do a simple loop with a math operation I do get significant performance increases. However, a similar idea for the grep function doesn't provide speed increases:
function open_count(file)
fh = open(file)
text = readall(fh)
length(split(text))
end
tic()
total = 0
for name in files
total += open_count(string(dir,"/",name))
total
end
toc()
elapsed time: 29.474181026 seconds
tic()
total = 0
total = @parallel (+) for name in files
open_count(string(dir,"/",name))
end
toc()
elapsed time: 29.086511895 seconds
I tried different versions but also got no significant speed increases. Am I doing something wrong?