Big Data, Hadoop 1st generation. I am very new to Apache Hadoop and I have a doubt; maybe my question is irrelevant.
Problem : Word count problem (dry run / manual trace).
Example :
File Name : test.txt
File Size : 120 MB
Default Block size : 64 MB
File Content :
Hello StackOverflow
Hi StackOverflow
Hola StackOverflow
Mushi Mushi StackOverflow
.....
.....
.....
Mushi Mushi StackOverflow
Number of blocks will be : 2 (64 MB + 56 MB)
Block 1 contains :
Hello StackOverflow
Hi StackOverflow
Hola StackOverflow
Mushi Mus
Block 2 contains :
hi StackOverflow
.....
.....
.....
Mushi Mushi StackOverflow
NOTE : Here the word "Mushi" is split between Block 1 and Block 2, because the 64 MB block boundary falls right after "Mus", so the remaining "hi" lands in Block 2.
Now my questions are:
Q1) Is this scenario possible?
Q2) If not, why?
Q3) If yes, what will the word count output be?
Q4) What will the mapper's output be for each block? (The mapper I mean is the standard WordCount mapper; see the sketch below.)
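For reference, this is the kind of mapper I am dry-running: the standard WordCount mapper, written against the org.apache.hadoop.mapreduce API (a minimal sketch only; the class name is illustrative, and the job/driver setup and reducer are omitted):

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Standard WordCount mapper: emits (word, 1) for every token in each input line.
    public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // 'value' is one line of the file as handed to the mapper
            // (by the default TextInputFormat / LineRecordReader).
            StringTokenizer tokenizer = new StringTokenizer(value.toString());
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, ONE);   // e.g. ("Hello", 1), ("StackOverflow", 1)
            }
        }
    }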