I'm parsing a large source code directory (100k files). I traverse every line in every file and do some simple regex matching. I tried splitting the work across multiple threads but got no speedup. Only multiprocessing managed to cut the run time by 70%. I'm aware of the GIL's death grip, but aren't threads supposed to help with I/O-bound work?
If the disk access is serial, how come several processes finish the job quicker?
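
A minimal sketch of the kind of comparison described above, for reference. It is not the actual code: the pattern `TODO`, the helper names `count_matches`, `list_files` and `scan`, and the worker count are all hypothetical, and it assumes each file can be processed independently. The same per-file function is run through either a thread pool or a process pool, so the only thing that changes is the executor.

```python
import os
import re
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

# Hypothetical pattern; the real regexes are not shown in the question.
PATTERN = re.compile(r"TODO")


def count_matches(path):
    """Count the lines in one file that match PATTERN."""
    count = 0
    with open(path, encoding="utf-8", errors="ignore") as f:
        for line in f:
            # The regex search is pure-Python-level CPU work that holds the GIL.
            if PATTERN.search(line):
                count += 1
    return count


def list_files(root):
    """Collect every file path under root."""
    return [
        os.path.join(dirpath, name)
        for dirpath, _dirs, names in os.walk(root)
        for name in names
    ]


def scan(paths, executor_cls, workers=4):
    """Sum match counts over paths using the given executor class."""
    with executor_cls(max_workers=workers) as pool:
        # chunksize batches tasks for process pools; thread pools ignore it.
        return sum(pool.map(count_matches, paths, chunksize=64))
```

Timing `scan(paths, ThreadPoolExecutor)` against `scan(paths, ProcessPoolExecutor)` with `time.perf_counter()` would reproduce the comparison. The process-pool call should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.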