I need the equivalent of SQL's AUTO_INCREMENT id in Hadoop.
When my reduce task identifies a new item, that item needs a unique ID assigned.
How can I share an atomic counter across the cluster? The Reporter counters seem to be increment-only; I don't see any getAndIncrement feature.
Also, how can I set that counter's starting value before the map/reduce phase of the job starts?
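
To make the semantics concrete: within a single JVM, something like the sketch below using java.util.concurrent.atomic.AtomicLong would do what I want. The class name IdCounterSketch, the method assignId, and the starting value 1000 are made up for illustration. This only works inside one process; what I can't see is how to get the same behavior across all the reduce tasks in the cluster.

```java
import java.util.concurrent.atomic.AtomicLong;

public class IdCounterSketch {
    // Hypothetical starting value, e.g. the highest ID already handed out.
    // This is the value I'd want to set before the job starts.
    static final AtomicLong NEXT_ID = new AtomicLong(1000L);

    // Atomically returns the current value and advances the counter by one,
    // so no two callers in this JVM ever receive the same ID.
    static long assignId() {
        return NEXT_ID.getAndIncrement();
    }

    public static void main(String[] args) {
        long first = assignId();
        long second = assignId();
        System.out.println(first + " " + second);
    }
}
```

Each call to assignId() hands back a distinct, consecutive value, which is the property I need for new items found in the reducers.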