8

I have written a custom partitioner. When I have number of reduce tasks greater than 1, the job is failing. This is the exception which I'm getting:

 java.io.IOException: Illegal partition for weburl_compositeKey@804746b1 (-1)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:930)
 at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:499)

The code which I have written is

public int getPartition(weburl_compositeKey key, Text value, int numPartitions)
{
    return (key.hashCode()) % numPartitions;
}

This the key.hashCode() equals -719988079 and mod of this value is returning -1.

Appreciate your help on this. Thanks.

harpun
  • 4,022
  • 1
  • 36
  • 40
Maverick
  • 484
  • 2
  • 9
  • 20

3 Answers3

24

The calculated partition number by your custom Partitioner has to be non-negative. Try:

public int getPartition(weburl_compositeKey key, Text value, int numPartitions)
{
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
}
harpun
  • 4,022
  • 1
  • 36
  • 40
4

A warning about using:

public int getPartition(weburl_compositeKey key, Text value, int numPartitions)
{
    return Math.abs(key.hashCode()) % numPartitions;
}

If you hit the case where the key.hashCode() is equal to Integer.MIN_VALUE you will still get a negative partition value. It is an oddity of Java, but Math.abs(Integer.MIN_VALUE) returns Integer.MIN_VALUE ( as in -2147483648). You are safer taking the absolute value of the modulus, as in:

public int getPartition(weburl_compositeKey key, Text value, int numPartitions)
{
    return Math.abs(key.hashCode() % numPartitions);
}
starkadder
  • 311
  • 3
  • 3
2

Or you can use

public int getPartition(weburl_compositeKey key, Text value, int numPartitions)
{
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
}
Tanveer
  • 890
  • 12
  • 22