
I am new to Hadoop. I want to understand the following code:

DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);

Specifically, what does new Path(args[0]).toUri() mean, and what does addCacheFile do?

Thanks

2 Answers


The addCacheFile() method of DistributedCache takes the URI of the file to add to the distributed cache. new Path(args[0]) wraps the first command-line argument, whatever path it holds, in a Path object. toUri() then converts that Path to a URI, and that URI is what addCacheFile() uses to register the file with Hadoop's distributed cache.

Path: can name either a file or a directory.
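To make the URI part concrete, here is a small JDK-only sketch. Hadoop's Path.toUri() returns a java.net.URI, so the URIs built below stand in for that return value. The host, port, and file name are made up for illustration, and the CacheUriSketch class and its describe() helper are hypothetical names, not part of Hadoop.

```java
import java.net.URI;

// JDK-only sketch: the URIs here stand in for what Path.toUri() would return.
class CacheUriSketch {

    // Summarise the two parts of a URI that matter when locating a file.
    static String describe(URI uri) {
        String scheme = (uri.getScheme() == null) ? "(none)" : uri.getScheme();
        return "scheme=" + scheme + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // Hypothetical fully qualified HDFS location (made-up host and port).
        URI qualified = URI.create("hdfs://namenode:9000/user/mostafa/polygons.txt");
        // The same path without a scheme or host.
        URI bare = URI.create("/user/mostafa/polygons.txt");

        System.out.println(describe(qualified));
        System.out.println(describe(bare));
    }
}
```

The scheme tells Hadoop which file system holds the file. When the scheme is missing, Hadoop typically resolves the path against the default file system set in the job configuration.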

Once the file is added to the distributed cache, it is available to all the mappers. This is one of Hadoop's optimization techniques for a file that is small in size: the file is made accessible on every node, so tasks can read it quickly.

For more details, see:
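The usual way to use such a small file is to load it into memory once per task and then look values up while processing records. The sketch below shows only that pattern, using the JDK alone so it runs without Hadoop. The "id<TAB>label" line format, the SideDataSketch class, and its method names are assumptions made for illustration. In a real job with the old API, the loading step would normally sit in the mapper's configure() method, with the local file location obtained from DistributedCache.getLocalCacheFiles(conf).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// JDK-only sketch of the "load small side file once, look up per record" pattern.
class SideDataSketch {

    // Build a lookup table from lines of the assumed form "id<TAB>label".
    // Lines without a tab are skipped.
    static Map<String, String> loadSideData(List<String> lines) {
        Map<String, String> table = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2) {
                table.put(parts[0], parts[1]);
            }
        }
        return table;
    }

    // Per-record lookup; returns "unknown" when the id is not in the table.
    static String lookup(Map<String, String> table, String id) {
        return table.getOrDefault(id, "unknown");
    }

    public static void main(String[] args) {
        // Stand-in for the contents of the cached file.
        List<String> cachedLines = Arrays.asList("1\tsquare", "2\ttriangle");
        Map<String, String> table = loadSideData(cachedLines);

        System.out.println(lookup(table, "2"));
        System.out.println(lookup(table, "7"));
    }
}
```

Because the table is built once and kept in memory, each record costs only a hash lookup, which is why this approach suits small files.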

https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/fs/Path.html

Confusion about distributed cache in Hadoop

Paritosh Ahuja

Thanks, Paritosh Ahuja.

I have two text files containing polygons. My complete code is:

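import java.io.IOException;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// OverlayPhase2Mapper and OverlayPhase2Reducer are defined elsewhere.
public class OverlayPhase2 extends Configured implements Tool
{
    public int run(String[] args) throws IOException
    {
        JobConf conf = new JobConf(getConf(), OverlayPhase2.class);
        conf.setOutputKeyClass(IntWritable.class);
        conf.setOutputValueClass(Text.class);
        conf.setMapperClass(OverlayPhase2Mapper.class);
        conf.setReducerClass(OverlayPhase2Reducer.class);
        conf.setNumMapTasks(2);
        conf.setNumReduceTasks(8);

        // args[0]: the file to share with every task through the distributed cache
        DistributedCache.addCacheFile(new Path(args[0]).toUri(), conf);

        Path inp1 = new Path(args[1]);
        Path inp2 = new Path(args[2]);
        Path out1 = new Path(args[3]);
        // Register both inputs in one call; a second setInputPaths call would replace the first
        FileInputFormat.setInputPaths(conf, inp1, inp2);
        // The output path is set on FileOutputFormat, not FileInputFormat
        FileOutputFormat.setOutputPath(conf, out1);
        JobClient.runJob(conf);
        return 0;
    }

    public static void main(String[] args) throws Exception
    {
        int exitCode = ToolRunner.run(new OverlayPhase2(), args);
        System.exit(exitCode);
    }
}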

I set args[1], args[2], and args[3] to the following:

args[1] = /home/mostafa/Desktop/b1.txt
args[2] = /home/mostafa/Desktop/b2.txt
args[3] = /home/mostafa/Desktop/output

So what should args[0] be?

Best Regards, Mostafa