
I'm using Hadoop 1.0.3 with Oracle Java 7. When I run the word count code on a large input of nearly 1.5 GB, the reduce phase takes a very long time: 10 hours or more in the copy step alone. The cluster has 16 nodes: one master and 15 slaves. The cluster summary is as follows:

Configured Capacity: 2.17TB
DFS Used: 4.23GB
Non DFS Used: 193.74GB
DFS Remaining: 1.98TB
DFS Used%: 0.19%
DFS Remaining%: 91.09%
Live Nodes: 16
Dead Nodes: 0
Decommissioned Nodes: 0
Number of Under Replicated Blocks: 0

the reducer output

I tried it with 29 mappers and with 1, 16, 35, and 56 reducers. The problem is the same each time, and the error "too many fetch failures" appears.

seso

1 Answer


How many mappers and reducers are being used?
It looks like you are using very few reducers.
With too few reducers, you will see poor performance.
You need to configure the number of mappers and reducers to suit your workload and the number of available worker nodes.
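
As a rough illustration of sizing reducers to the cluster, the sketch below applies the rule of thumb from the Hadoop MapReduce tutorial: 0.95 or 1.75 times (number of nodes × reduce slots per node). The function name `suggested_reducers` and the value of 2 reduce slots per node are assumptions for illustration, not values taken from this cluster.

```python
import math


def suggested_reducers(worker_nodes, reduce_slots_per_node, factor=0.95):
    """Return a reducer count from the 0.95 / 1.75 rule of thumb.

    factor=0.95 lets all reducers start in a single wave;
    factor=1.75 gives faster nodes a second wave of reducers.
    """
    if worker_nodes < 1 or reduce_slots_per_node < 1:
        raise ValueError("node and slot counts must be positive")
    total_slots = worker_nodes * reduce_slots_per_node
    # Round down so the job never asks for more than the rule allows.
    return max(1, math.floor(factor * total_slots))


if __name__ == "__main__":
    # Hypothetical example: 15 worker nodes, 2 reduce slots each (assumed).
    for factor in (0.95, 1.75):
        print(factor, suggested_reducers(15, 2, factor))
```

The result is only a starting point. A sensible reducer count does not help if the reducers cannot fetch map output from the other nodes.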

Leet-Falcon
  • If you have 16 nodes, then it would make sense to have at least 16 reducers. The defaults provided for you are not always good. – Leet-Falcon Dec 25 '15 at 16:28
  • I increased it to 16, 35, and 56 according to the equation (nodes * mapred.tasktracker.tasks.maximum) * {0.95 or 1.75}, but the "too many fetch failures" error still appears. – seso Dec 25 '15 at 16:34
  • You have a network misconfiguration. See http://stackoverflow.com/questions/27978936/too-many-fetch-faliuers – Leet-Falcon Dec 25 '15 at 16:36
  • I saw this link before, but I couldn't understand the point very well, or at least how to solve it. My /etc/hosts looks like this: 127.0.1.1 justcbuser-virtual-machine; 10.242.21.160 master; 10.242.21.161 slave1; 10.242.21.162 slave2; 10.242.21.163 slave3; 10.242.21.164 slave4; 10.242.21.165 slave5 – seso Dec 25 '15 at 16:40
  • Run "ping localhost" and use the name it reports in place of 127.0.0.1. – Leet-Falcon Dec 25 '15 at 16:43
  • I tried "ping localhost" and the result is "PING localhost (127.0.0.1) 56(84) bytes of data.", which means the name is just "localhost". – seso Dec 25 '15 at 17:07
  • Then replace it with the actual IP address. – Leet-Falcon Dec 25 '15 at 17:11
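
The advice in the comments amounts to checking whether a node's own hostname resolves to a loopback address. A minimal sketch of that check follows, assuming the flattened /etc/hosts quoted above had one entry per line. The helper name `loopback_hostnames` and the set of names treated as harmless are hypothetical choices for illustration.

```python
import ipaddress

# Names that are expected to point at loopback (assumed list).
LOCAL_NAMES = {"localhost", "localhost.localdomain", "ip6-localhost", "ip6-loopback"}


def loopback_hostnames(hosts_text):
    """Return (address, name) pairs where a non-local name maps to loopback."""
    flagged = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # skip malformed entries
        if ip.is_loopback:
            flagged.extend((address, n) for n in names if n not in LOCAL_NAMES)
    return flagged


# First entries from the question, with line breaks reconstructed (assumption).
SAMPLE = """127.0.1.1 justcbuser-virtual-machine
10.242.21.160 master
10.242.21.161 slave1
"""

if __name__ == "__main__":
    for address, name in loopback_hostnames(SAMPLE):
        print(f"{name} maps to loopback address {address}")
```

Any name this reports is a candidate for remapping to the machine's real network address, as the last comment suggests. Whether that resolves the fetch failures on this cluster is not confirmed in the thread.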