Use for questions specific to Apache Hadoop 3.x features (e.g. Erasure Coding, YARN Timeline Service v2, Opportunistic Containers, 3+ NameNode fault tolerance). For general questions about Apache Hadoop, use the [hadoop] tag.
Questions tagged [hadoop3]
112 questions
36
votes
5 answers
HDFS_NAMENODE_USER, HDFS_DATANODE_USER & HDFS_SECONDARYNAMENODE_USER not defined
I am new to Hadoop.
I'm trying to install Hadoop on my laptop in pseudo-distributed mode.
I am running it as the root user, but I'm getting the error below.
root@debdutta-Lenovo-G50-80:~# $HADOOP_PREFIX/sbin/start-dfs.sh
WARNING: HADOOP_PREFIX has…

Sujata Roy
- 427
- 1
- 6
- 8
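
On Hadoop 3, the start scripts refuse to launch daemons as root unless the corresponding *_USER variables are defined. A minimal sketch of the commonly reported fix, assuming the install lives under /opt/hadoop and the daemons really are meant to run as root:

cat >> /opt/hadoop/etc/hadoop/hadoop-env.sh <<'EOF'
# Users the Hadoop 3 start scripts check before launching each daemon
# (running everything as root is an assumption carried over from the question)
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
EOF
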
16
votes
6 answers
Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
Contents of mapred-site.xml:
mapreduce.framework.name
yarn
yarn.app.mapreduce.am.env
…

CuriousCoder
- 291
- 2
- 5
- 14
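
A fix frequently reported for this error on Hadoop 3 is to tell the MapReduce ApplicationMaster and the tasks where the MapReduce framework lives. A sketch of mapred-site.xml written as a heredoc, with /opt/hadoop standing in for the real HADOOP_HOME:

cat > /opt/hadoop/etc/hadoop/mapred-site.xml <<'EOF'
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- Point the AM and the tasks at the MapReduce install; /opt/hadoop is an assumed path -->
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop</value>
  </property>
</configuration>
EOF
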
9
votes
1 answer
Spark and Hive in Hadoop 3: Difference between metastore.catalog.default and spark.sql.catalogImplementation
I'm working on a Hadoop cluster (HDP) with Hadoop 3. Spark and Hive are also installed.
Since the Spark and Hive catalogs are separate, it is sometimes a bit confusing to know how and where to save data in a Spark application.
I know that the property…

D. Müller
- 3,336
- 4
- 36
- 84
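
For orientation: spark.sql.catalogImplementation chooses whether Spark SQL uses its in-memory catalog or the Hive metastore, while metastore.catalog.default (an HDP 3 / Hive 3 setting) selects which catalog inside that metastore is read. A hedged sketch of pointing both at Hive; passing the second property through the spark.hadoop. prefix is an assumption about how it reaches the metastore client:

spark-shell \
  --conf spark.sql.catalogImplementation=hive \
  --conf spark.hadoop.metastore.catalog.default=hive
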
9
votes
8 answers
Hadoop : start-dfs.sh Connection refused
I have a Vagrant box on debian/stretch64.
I am trying to install Hadoop 3 following the documentation at
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.htm
When I run start-dfs.sh
I get this message:
vagrant@stretch:/opt/hadoop$…

Bob's Jellyfish
- 353
- 1
- 3
- 8
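
start-dfs.sh reaches each host over ssh (localhost in pseudo-distributed mode), so "Connection refused" at this stage is usually sshd or missing passwordless keys rather than HDFS itself. A sketch, assuming a Debian-style box:

sudo apt-get install -y openssh-server            # make sure sshd is present and running
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa          # create a key if none exists yet
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost true                                # should return without a password prompt
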
8
votes
1 answer
java.net.ConnectException: Your endpoint configuration is wrong;
I am running a word count program from my Windows machine on a Hadoop cluster that is set up on a remote Linux machine.
The program runs successfully and I get output, but I get the following exception, and my waitForCompletion(true) is not…

CuriousCoder
- 291
- 2
- 5
- 14
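
This particular ConnectException often appears after the job has already finished, when the client tries to fetch the final status from a MapReduce JobHistory server it only knows as 0.0.0.0:10020. A sketch of the usual remedy; the hostname cluster-host, the jar name, and the use of ToolRunner (so that -D is honoured) are all assumptions:

mapred --daemon start historyserver               # on the remote cluster
hadoop jar wordcount.jar WordCount \
  -D mapreduce.jobhistory.address=cluster-host:10020 \
  /input /output                                  # test run from the Windows client
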
6
votes
0 answers
How to integrate Spark 2.2 with Hadoop 3.1 manually?
I want to use Spark 2.2 and the latest Hadoop version, 3.1. Can I integrate Spark and Hadoop manually?
I have already installed Spark 2.2 built for Hadoop 2.6 or later, but I want to update Hadoop. Is it possible to find the Hadoop directory in Spark with…

griez007
- 439
- 1
- 5
- 11
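
Spark documents a "Hadoop free" build that picks up whatever Hadoop is installed on the machine via SPARK_DIST_CLASSPATH; whether Spark 2.2 is actually compatible with Hadoop 3.1 is not guaranteed, so treat this purely as a sketch of the wiring (paths assumed):

cat >> /opt/spark/conf/spark-env.sh <<'EOF'
# Let Spark use the Hadoop client jars already installed on this machine
export SPARK_DIST_CLASSPATH=$(/opt/hadoop/bin/hadoop classpath)
EOF
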
6
votes
4 answers
Hadoop Error starting ResourceManager and NodeManager
I'm trying to set up Hadoop3-alpha3 as a single-node cluster (pseudo-distributed) using the Apache guide. I've tried running the example MapReduce job, but every time the connection is refused. After running sbin/start-all.sh I've been…

user2361174
- 1,872
- 4
- 33
- 51
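
For a single-node setup, a minimal yarn-site.xml along the lines of the Apache guide is the usual starting point before debugging further; a sketch, with the path and hostname as assumptions:

cat > /opt/hadoop/etc/hadoop/yarn-site.xml <<'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
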
5
votes
4 answers
Hadoop-3.1.2: Datanode and Nodemanager shuts down
I am trying to install Hadoop 3.1.2 on Windows 10, but the DataNode and NodeManager shut down.
I have tried downloading and placing the winutils.exe and hadoop.dll files under the bin directory. I have also tried changing the permissions of the files…

Stuxen
- 708
- 7
- 21
5
votes
1 answer
YARN FairScheduler configuration
The resource model in Hadoop 3 allows us to define custom resource types. I did some googling but couldn't find anything that explains how the YARN FairScheduler can be configured to distribute/isolate these resources among its pools.

mazaneicha
- 8,794
- 4
- 33
- 52
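
Declaring a custom resource type is the documented part; whether FairScheduler pools can then cap it is exactly what the question leaves open. A sketch using a made-up resource named fpga and assumed paths:

cat > /opt/hadoop/etc/hadoop/resource-types.xml <<'EOF'
<configuration>
  <property>
    <name>yarn.resource-types</name>
    <value>fpga</value>
  </property>
</configuration>
EOF
cat > /opt/hadoop/etc/hadoop/node-resources.xml <<'EOF'
<configuration>
  <!-- How many units of the custom resource this NodeManager advertises -->
  <property>
    <name>yarn.nodemanager.resource-type.fpga</name>
    <value>2</value>
  </property>
</configuration>
EOF
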
5
votes
1 answer
"start-all.sh" and "start-dfs.sh" from master node do not start the slave node services?
I have updated the /conf/slaves file on the Hadoop master node with the hostnames of my slave nodes, but I'm not able to start the slaves from the master. I have to start the slaves individually, and then my 5-node cluster is up and running. How can…

ingmid
- 61
- 3
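
In Hadoop 3 the start scripts read etc/hadoop/workers rather than the old slaves file, and they need passwordless ssh from the master to every worker. A sketch, with /opt/hadoop and the hostnames as placeholders:

cat > /opt/hadoop/etc/hadoop/workers <<'EOF'
worker1
worker2
worker3
worker4
EOF
ssh worker1 true                 # must succeed without a password prompt
/opt/hadoop/sbin/start-dfs.sh    # now starts the DataNodes on the workers too
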
4
votes
0 answers
Run Docker container through Oozie
I'm trying to build an Oozie workflow that executes, every day, a Python script which needs specific libraries to run.
At the moment I have created a Python virtual environment (using venv) on one node of my cluster (which consists of 11 nodes).
Through Oozie I saw…

AGL
- 116
- 1
- 7
4
votes
1 answer
Pig is not running in mapreduce mode (Hadoop 3.1.1 + Pig 0.17.0)
I am very new to Hadoop. My Hadoop version is 3.1.1 and my Pig version is 0.17.0.
Everything works as expected when running this script in local mode:
pig -x local
grunt> student = LOAD '/home/ubuntu/sharif_data/student.txt' USING PigStorage(',') as…

sharif2008
- 2,716
- 3
- 20
- 34
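
In mapreduce mode Pig reads from HDFS instead of the local filesystem, so the input has to be uploaded first; a sketch reusing the paths from the question (note that Pig 0.17's compatibility with Hadoop 3.1 may itself be the underlying problem):

hdfs dfs -mkdir -p /user/ubuntu/sharif_data
hdfs dfs -put /home/ubuntu/sharif_data/student.txt /user/ubuntu/sharif_data/
pig -x mapreduce                 # then LOAD '/user/ubuntu/sharif_data/student.txt' in the grunt shell
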
4
votes
2 answers
If I already have Hadoop installed, should I download Apache Spark WITH Hadoop or WITHOUT Hadoop?
I already have Hadoop 3.0.0 installed. Should I now install the with-hadoop or without-hadoop version of Apache Spark from this page?
I am following this guide to get started with Apache Spark.
It says
Download the latest version of Apache Spark…

JBel
- 329
- 1
- 5
- 19
4
votes
1 answer
Error applying authorization policy on hive configuration: Couldn't create directory ${system:java.io.tmpdir}\${hive.session.id}_resources
I run Hadoop 3.0.0-alpha1 on Windows and have added Hive 2.1.1 to it. When I try to open Hive's Beeline with the hive command, I get an error:
Error applying authorization policy on hive configuration:
Couldn't create directory…

Benvorth
- 7,416
- 8
- 49
- 70
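
The workaround usually reported for this Hive-on-Windows error is to replace the unresolved ${system:java.io.tmpdir} (and ${system:user.name}) references in hive-site.xml with concrete paths; a sketch run from Git Bash, with the directories and the hive user name as assumptions:

mkdir -p /c/tmp/hive
sed -i.bak 's#\${system:java.io.tmpdir}#/c/tmp/hive#g; s#\${system:user.name}#hiveuser#g' \
  /c/hive/conf/hive-site.xml     # keeps a .bak copy of the original file
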
3
votes
1 answer
Where would namenode and datanode be installed if not defined in hdfs-site.xml?
My hdfs-site.xml has ONLY the following:
dfs.replication
1
Question: Where would the NameNode and DataNode be installed?
I am using…

Gautam De
- 39
- 4
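
When hdfs-site.xml only sets dfs.replication, the NameNode and DataNode directories fall back to their defaults under hadoop.tmp.dir (which itself defaults to /tmp/hadoop-${user.name}). The effective values can be read back rather than guessed:

hdfs getconf -confKey hadoop.tmp.dir          # usually /tmp/hadoop-<user> unless overridden
hdfs getconf -confKey dfs.namenode.name.dir   # where fsimage/edits end up
hdfs getconf -confKey dfs.datanode.data.dir   # where block data ends up
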