I am new to Hadoop and would like to understand the following line:
DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);
Specifically, what does new Path(args[0]).toUri() mean,
and what does addCacheFile do?
Thanks
The addCacheFile() method of DistributedCache takes the URI of the file to be added to the distributed cache. new Path(args[0]) wraps whatever path was passed as the first input argument, toUri() converts that path to a URI, and this URI is then used to add the file to Hadoop's distributed cache.
Path: can be the name of a file or a directory.
Once the file is added to the distributed cache, it is available to all the mappers. This is one of the optimization techniques in Hadoop when you have a file that is small in size: you can make it accessible on all the nodes for faster access to its data.
For more details, see:
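To make the path-to-URI step a bit more concrete, here is a minimal sketch that uses only java.net.URI from the standard library. It is an analogue, not Hadoop's own Path class: the class name PathToUriSketch, the helper toUri, and the sample path are all hypothetical and only stand in for new Path(args[0]).toUri().

```java
import java.net.URI;
import java.net.URISyntaxException;

// Standard-library analogue only; Hadoop's Path class is not used here.
final class PathToUriSketch {

    // Builds a URI from a plain path string, leaving scheme and authority
    // empty, similar in spirit to calling toUri() on a scheme-less path.
    static URI toUri(String pathString) {
        try {
            return new URI(null, null, pathString, null, null).normalize();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid path: " + pathString, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical default standing in for args[0].
        String cachePath = args.length > 0 ? args[0] : "/user/mostafa/cache/polygons.txt";
        URI uri = toUri(cachePath);
        System.out.println("path   = " + uri.getPath());
        System.out.println("scheme = " + uri.getScheme());
    }
}
```

With a plain path like this one the URI has no scheme, so which file system it refers to is left to the configuration.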
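The usual way to benefit from a small cached file is to load it into memory once per task and then look values up for every input record. The sketch below imitates that pattern with plain Java only; it does not use any Hadoop classes. The class CachedLookupSketch, its methods, and the key<TAB>value line format are assumptions made for illustration. In a real Hadoop 1.x job the reader would be opened on the local copy of the cached file (for example one of the paths returned by DistributedCache.getLocalCacheFiles(conf)).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Plain-Java imitation of "load the small file once, use it for every record".
final class CachedLookupSketch {
    private final Map<String, String> lookup = new HashMap<>();

    // Plays the role of a mapper's setup step: read the small side file once.
    // Assumed (hypothetical) line format: key<TAB>value.
    void load(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab > 0) {
                    lookup.put(line.substring(0, tab), line.substring(tab + 1));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Plays the role of map(): each record is enriched from memory, with no
    // further file access.
    String enrich(String key) {
        return key + "=" + lookup.getOrDefault(key, "<missing>");
    }

    public static void main(String[] args) {
        CachedLookupSketch task = new CachedLookupSketch();
        // In-memory stand-in for the contents of the cached file.
        task.load(new StringReader("1\tsquare\n2\ttriangle\n"));
        System.out.println(task.enrich("2"));
        System.out.println(task.enrich("9"));
    }
}
```

Because the lookup table lives in memory, this only makes sense when the side file is small, which matches the advice above.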
https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/fs/Path.html
Thanks, Paritosh Ahuja.
I have two text files describing polygons. Here is my complete code:
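import java.io.IOException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class OverlayPhase2 extends Configured implements Tool
{
    public int run(String[] args) throws IOException
    {
        JobConf conf = new JobConf(getConf(), OverlayPhase2.class);
        if (conf == null) {
            return -1;
        }
        conf.setOutputKeyClass(IntWritable.class);
        conf.setOutputValueClass(Text.class);
        conf.setMapperClass(OverlayPhase2Mapper.class);
        conf.setReducerClass(OverlayPhase2Reducer.class);
        conf.setNumMapTasks(2);
        conf.setNumReduceTasks(8);
        // args[0] is the file to place in the distributed cache.
        DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);
        Path inp1 = new Path(args[1]);
        Path inp2 = new Path(args[2]);
        Path out1 = new Path(args[3]);
        // setInputPaths replaces any earlier input list, so both inputs
        // are passed in a single call.
        FileInputFormat.setInputPaths(conf, inp1, inp2);
        // setOutputPath belongs to FileOutputFormat, not FileInputFormat.
        FileOutputFormat.setOutputPath(conf, out1);
        JobClient.runJob(conf);
        return 0;
    }

    public static void main(String[] args) throws Exception
    {
        int exitCode = ToolRunner.run(new OverlayPhase2(), args);
        System.exit(exitCode);
    }
}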
and I set args[1], args[2], and args[3] to:
args[1] = /home/mostafa/Desktop/b1.txt
args[2] = /home/mostafa/Desktop/b2.txt
args[3] = /home/mostafa/Desktop/output
So what should args[0] be?
Best Regards, Mostafa