
I'm trying to understand how to write an int array to, and read it back from, a file in HDFS. Since an int[] array isn't a Writable object, I'm currently using the class org.apache.hadoop.io.ObjectWritable.

So the write task boils down to the following call:

new ObjectWritable(int[].class, array).write(arg0);

The read task, in turn, looks like this:

int[] array = {};
new ObjectWritable(int[].class, array).readFields(arg0);

I'm not so sure about the last code snippet. In fact, if I try to execute it, I get a NullPointerException on the second line.

How can I correctly read an int[] array?

harpun
Matteo

2 Answers


For arrays of objects you should use the built-in class ArrayWritable. As the javadoc states, you have to subclass it and create a new type such as IntArrayWritable, which sets the proper class type of the array elements.

Take a look at an example showing how to populate an IntArrayWritable in the mapper.

harpun

Even though harpun's solution works properly, another question has come to mind. It is more about performance, since my application does tons of int[] writes and reads.

I think the following solution should perform better:

WRITE:

WritableUtils.writeVInt(out, array.length);
for(int i=0; i<array.length; i++)
   WritableUtils.writeVInt(out, array[i]);

READ:

int[] array = new int[WritableUtils.readVInt(in)];
for(int i=0; i<array.length; i++)
   array[i] = WritableUtils.readVInt(in);

That is, instead of wrapping the array in an IntArrayWritable every time:

WRITE:

IntWritable[] a = new IntWritable[array.length];
for(int i=0; i<a.length; i++)
   a[i] = new IntWritable(array[i]);
IntArrayWritable arrayWritable = new IntArrayWritable();
arrayWritable.set(a);
arrayWritable.write(arg0);

READ:

IntArrayWritable arrayWritable = new IntArrayWritable();
arrayWritable.readFields(arg0);
Writable[] a = arrayWritable.get();
int[] array = new int[a.length];
for(int i=0; i<array.length; i++)
   array[i] = ((IntWritable)a[i]).get();

Is that right? What do you think?
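
For anyone who wants to try the length-prefixed idea outside a Hadoop job, here is a minimal round-trip sketch. It makes some assumptions: IntArrayCodec and its methods are hypothetical names, not Hadoop classes, and it uses plain java.io writeInt/readInt (a fixed 4 bytes per value) so that it runs without Hadoop on the classpath. Inside a Writable you would presumably swap those calls for WritableUtils.writeVInt/readVInt as in the snippets above. It only shows that the format round-trips, it does not measure performance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper (not part of Hadoop): length-prefixed int[] serialization.
final class IntArrayCodec {
    private IntArrayCodec() {}

    // Writes the element count first, then every element in order.
    static void write(DataOutput out, int[] array) throws IOException {
        out.writeInt(array.length);      // Hadoop variant: WritableUtils.writeVInt(out, array.length)
        for (int value : array) {
            out.writeInt(value);         // Hadoop variant: WritableUtils.writeVInt(out, value)
        }
    }

    // Reads the element count, allocates the array, then fills it.
    static int[] read(DataInput in) throws IOException {
        int[] array = new int[in.readInt()];   // Hadoop variant: WritableUtils.readVInt(in)
        for (int i = 0; i < array.length; i++) {
            array[i] = in.readInt();
        }
        return array;
    }

    // Serializes to an in-memory buffer and reads the result back.
    static int[] roundTrip(int[] array) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                write(out, array);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return read(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] original = {3, -1, 42};
        System.out.println(Arrays.toString(roundTrip(original)));
    }
}
```

Because the read side allocates the int[] directly, no IntWritable object is created per element, which is the point of the comparison above.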

Matteo