I'm using a Docker stack that runs, on a single machine, a Hadoop NameNode, two DataNodes, two NodeManagers, a ResourceManager, a History Server, and other technologies.
I ran into an issue with the Configured Capacity value shown in the HDFS UI.
The machine has 256 GB of storage capacity and runs the two DataNodes mentioned above. Instead of splitting the total capacity between the two nodes, HDFS counts the capacity of the entire machine twice, reporting 226.87 GB for each DataNode, as you can see here.
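To illustrate what I suspect is going on, here is a minimal Python sketch. It assumes each DataNode derives its capacity from the size of the filesystem that backs its data directory, and that both Docker volumes live on the same host disk. The function names `reported_capacity` and `physical_capacity` are hypothetical and only serve the illustration; this is not Hadoop code.

```python
import os
import shutil


def reported_capacity(data_dirs):
    """Sum the filesystem size as seen from each data directory.

    Assumption: this mirrors how each DataNode reports its own capacity
    independently, without knowing that it shares a disk with another node.
    """
    return sum(shutil.disk_usage(d).total for d in data_dirs)


def physical_capacity(data_dirs):
    """Count each underlying filesystem only once, keyed by its device id."""
    sizes_by_device = {}
    for d in data_dirs:
        sizes_by_device[os.stat(d).st_dev] = shutil.disk_usage(d).total
    return sum(sizes_by_device.values())
```

With two directories on the same filesystem, `reported_capacity` returns twice what `physical_capacity` returns, which matches the 2 x 226.87 GB total I see in the UI.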
Any thoughts on how to make HDFS show the right capacity?
Here is the portion of the docker-compose file that defines the Hadoop services mentioned above:
services:
  # Hadoop master
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
    container_name: namenode
    ports:
      - 9870:9870
      - 8020:8020
    volumes:
      - ./namenode/home/${ADMIN_NAME:?err}:/home/${ADMIN_NAME:?err}
      - ./namenode/hadoop-data:/hadoop-data
      - ./namenode/entrypoint.sh:/entrypoint.sh
      - hadoop-namenode:/hadoop/dfs/name
    env_file:
      - ./hadoop.env
      - .env
    networks:
      - hadoop

  resourcemanager:
    restart: always
    image: bde2020/hadoop-resourcemanager:2.0.0-hadoop3.2.1-java8
    container_name: resourcemanager
    ports:
      - 8088:8088
    environment:
      SERVICE_PRECONDITION: "namenode:9870 datanode1:9864"
    env_file:
      - ./hadoop.env
    networks:
      - hadoop

  # Hadoop slave 1
  datanode1:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode1
    volumes:
      - hadoop-datanode-1:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env
    networks:
      - hadoop

  nodemanager1:
    image: bde2020/hadoop-nodemanager:2.0.0-hadoop3.2.1-java8
    container_name: nodemanager1
    volumes:
      - ./nodemanagers/entrypoint.sh:/entrypoint.sh
    environment:
      SERVICE_PRECONDITION: "namenode:9870 datanode1:9864 resourcemanager:8088"
    env_file:
      - ./hadoop.env
      - .env
    networks:
      - hadoop

  # Hadoop slave 2
  datanode2:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode2
    volumes:
      - hadoop-datanode-2:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop.env
    networks:
      - hadoop

  nodemanager2:
    image: bde2020/hadoop-nodemanager:2.0.0-hadoop3.2.1-java8
    container_name: nodemanager2
    volumes:
      - ./nodemanagers/entrypoint.sh:/entrypoint.sh
    environment:
      SERVICE_PRECONDITION: "namenode:9870 datanode2:9864 resourcemanager:8088"
    env_file:
      - ./hadoop.env
      - .env
    networks:
      - hadoop

  historyserver:
    image: bde2020/hadoop-historyserver:2.0.0-hadoop3.2.1-java8
    container_name: historyserver
    ports:
      - 8188:8188
    environment:
      SERVICE_PRECONDITION: "namenode:9870 datanode1:9864 datanode2:9864 resourcemanager:8088"
    volumes:
      - hadoop-historyserver:/hadoop/yarn/timeline
    env_file:
      - ./hadoop.env
    networks:
      - hadoop