Questions tagged [mrv2]

Next Generation MapReduce Architecture.

'Next Generation MR' or 'NextGen MR' or 'MRv2' or 'MR2' is a major revamp of the MapReduce engine and will be part of the 0.23 release. MRv1, the old MapReduce engine, will not be supported in the 0.23 release. The underlying engine has been revamped in 0.23, but the API for interfacing with the engine remains the same, so existing MapReduce code written for the MRv1 engine should run without modification on MRv2.

16 questions
95
votes
9 answers

Container is running beyond memory limits

In Hadoop v1, I have assigned each of the 7 mapper and reducer slots a size of 1GB, and my mappers and reducers run fine. My machine has 8G of memory and 8 processors. Now with YARN, when I run the same application on the same machine, I get a container error. By…
Lishu
  • 1,438
  • 1
  • 13
  • 14
4
votes
2 answers

New MapReduce Architecture and Eclipse

Some major refactoring is happening in Hadoop around MapReduce. Details can be found in the JIRA below: https://issues.apache.org/jira/browse/MAPREDUCE-279 It has ResourceManager, NodeManager and HistoryServer daemons. Has anyone tried…
Praveen Sripati
  • 32,799
  • 16
  • 80
  • 117
4
votes
0 answers

Hadoop Parameter explanation

Hadoop-2.6 has the following parameters, as given in the documentation: mapreduce.job.max.split.locations (the max number of block locations to store for each split for locality calculation). How does it use this in locality…
Novice
  • 155
  • 3
  • 14
3
votes
1 answer

Hive running in local mode, taking excessive amount of /tmp local disk space

I'm running a complex query in Hive which, when run, starts using a huge amount of local disk space in the /tmp folder and eventually ends with a space error as the /tmp folder fills up completely with the intermediate map-reduce results because of the…
user5092078
  • 51
  • 1
  • 5
2
votes
2 answers

Yarn NodeManager and ResourceManager in the same node

(By default) Is there a "node manager" on the same node as the "resource manager" in Hadoop YARN? If not, is it possible to run them on the same node?
polerto
  • 1,750
  • 5
  • 29
  • 50
1
vote
1 answer

Regarding Hadoop secondarynamenode concept

As per the documentation (http://hadoop.apache.org/common/docs/r0.20.203.0/hdfs_user_guide.html), secondarynamenode is deprecated from the hadoop 0.20.203.0 release onwards and replaced by checkpointnode and backupnode. But in cluster set up doc…
MRK
  • 573
  • 2
  • 8
  • 21
1
vote
1 answer

Understanding mapreduce.framework.name wrt Hadoop

I am learning Hadoop and came to know that there are two versions of the framework, viz. Hadoop1 and Hadoop2. If my understanding is correct, in Hadoop1 the execution environment is based on two daemons, viz. TaskTracker and JobTracker, whereas in…
CuriousMind
  • 8,301
  • 22
  • 65
  • 134
1
vote
1 answer

How to submit a Hadoop streaming job and check execution history with Hadoop 2.x

I am a newbie to Hadoop. In Hadoop 1.X, I can submit a Hadoop streaming job from the master node and check the result and execution time from the namenode web. The following is the sample code for Hadoop streaming in Hadoop 1.X: $HADOOP_HOME/bin/hadoop …
user3713489
  • 53
  • 1
  • 8
1
vote
2 answers

Hadoop / Yarn (v0.23.3) Pseudo-Distributed Mode setup :: No job node

I just set up Hadoop/Yarn 2.x (specifically, v0.23.3) in Pseudo-Distributed mode. I followed the instructions of a few blogs & websites which, more or less, provide the same prescription for setting it up. I also followed the 3rd-Edition of…
NYCeyes
  • 5,215
  • 6
  • 57
  • 64
0
votes
2 answers

Hadoop cluster set up with 0.23 release (MRv2 or NextGen MR)

As I see it, the latest stable release of Hadoop is 0.20.x, and the latest release is 0.23. It seems there are a lot of changes from 0.20.x to 0.23.x. We are able to set up a small cluster with the stable release (0.20.2) and are practicing MapReduce programming. We have…
MRK
  • 573
  • 2
  • 8
  • 21
0
votes
2 answers

YARN and MapReduce Framework

I am aware of the basics of the YARN framework; however, I still feel I lack some understanding with regard to MapReduce. With YARN, I have read that MapReduce is just one of the applications which can run on top of YARN; for example, with YARN, on the same…
CuriousMind
  • 8,301
  • 22
  • 65
  • 134
0
votes
1 answer

YARN MRv2 JobClient equivalent

I'm unable to find a JobClient (Java, MRv1) equivalent for MRv2. I'm trying to read MR job status, counters etc. for a running job. I'd have to get the information from the resource manager, I believe (since the History server wouldn't have the…
Praneeth
  • 309
  • 4
  • 14
0
votes
1 answer

YARN: Controlling concurrency of jobs

I've been trying to make use of YARN's resource queues to control contention by controlling the number of jobs (I only have MR jobs, no other YARN applications) at any given time. The situation I have is: I have a service that accepts requests from…
Praneeth
  • 309
  • 4
  • 14
0
votes
1 answer

Determining optimal number of reducers in Yarn

In MRv1 we had the two configurable parameters below to set the number of map and reduce slots per node: mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum. Also, it was advisable to have the number of map slots a little higher…
vin15
  • 23
  • 4
0
votes
1 answer

MRv1 and MRv2 parameters

The entire list of parameters (for Hadoop-2.6) is given at the link. But you can execute a job in either MRv1 or MRv2 style. I think there are some parameters that are only applicable to MRv1, like mapreduce.tasktracker.map.tasks.maximum. Is this true?…
Novice
  • 155
  • 3
  • 14