I have a few million to a few billion (up to 10^9) data input sets that need to be processed. Each is quite small (< 1 kB), and each takes about 1 second to process.
I have read a lot about Apache Hadoop, MapReduce, and StarCluster, but I am not sure which is the most efficient and fastest way to process this workload.
I am thinking of using Amazon EC2 or a similar cloud service.
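To give a sense of the scale involved, here is a rough back-of-envelope calculation based on the numbers above (10^9 tasks at ~1 second each; the core counts are just illustrative assumptions):

```python
# Rough sizing estimate for an embarrassingly parallel workload.
# Assumed numbers: 1e9 tasks, ~1 second of CPU time each.
tasks = 10**9
seconds_per_task = 1.0
total_cpu_seconds = tasks * seconds_per_task  # 1e9 s, roughly 31.7 CPU-years

for cores in (100, 1000, 10000):
    days = total_cpu_seconds / cores / 86400  # 86400 seconds per day
    print(f"{cores:>5} cores -> ~{days:.1f} days of wall-clock time")
```

So even with 1,000 cores running flat out, this is on the order of 11–12 days of wall-clock time, which is worth knowing before choosing between Hadoop/MapReduce and a simpler job-queue setup on EC2.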