I can see two possible approaches here.
First, split your problem in two: (1) work out what to process, and (2) do the processing. Part 1 probably has to run on its own so that you end up with a 100% accurate list of what needs processing. Then you can implement fancy (or not very fancy) logic for splitting that list and handing it out to multiple threads.
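A minimal sketch of that first approach in Python, assuming the files can be matched with a glob pattern (`*.dat` here is just a stand-in, as is `process_file` — substitute your real selection logic and per-file work):

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def process_file(path):
    # Placeholder for your actual per-file work.
    return path.name

def run(input_dir, batch_size=100, workers=4):
    # Part 1: build the complete, accurate list up front, on its own.
    files = sorted(Path(input_dir).glob("*.dat"))

    # Part 2: slice the list into batches and hand them to a thread pool.
    batches = [files[i:i + batch_size] for i in range(0, len(files), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda batch: [process_file(f) for f in batch], batches)
    # Flatten the per-batch results back into one list.
    return [r for batch in results for r in batch]
```

Because the list is fixed before any processing starts, the batching can stay dumb: no file can be picked up twice, and nothing new can sneak in mid-run.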
Second, do something similar to what @CarlosGrappa suggests: essentially, you create each thread with its own "pre-programmed" filter. It could be the year, as Carlos suggests. Or you could create 24 threads, one for each hour of the file timestamp, or 60 threads, each looking at a particular minute past the hour. It can basically be any criterion that (a) splits the load as evenly as possible, and (b) guarantees that each data file is processed once and only once.
Clearly the second of these approaches would run more quickly, but you'd have to put some extra thought into how you split the files up. With the first method, once you've got the full list, you can chuck files at your workers 100, 1000, or 10000 at a time without being overly smart about how you do it.