I'm new to Hadoop and have only set up a 3-server Debian cluster for practice.
I was researching best practices for Hadoop and came across this advice: JBOD, no RAID. Filesystem: ext3, ext4, xfs - none of that fancy COW stuff you see with zfs and btrfs.
So I raise these questions...
Everywhere I read that JBOD is better than RAID for Hadoop, and that the best filesystems are xfs, ext3, and ext4. The filesystem part makes total sense to me, but how do you implement this JBOD? You will see my confusion if you do the Google search yourself: JBOD usually refers to a linear concatenation of "just a bunch of disks", kind of like a logical volume (at least that's how some people explain it), but Hadoop seems to want a JBOD that doesn't combine the disks. Nobody expands on that...
- Question 1) What does everyone in the Hadoop world mean by JBOD, and how do you implement it?
- Question 2) Is it as simple as mounting each disk to a different directory?
- Question 3) Does that mean Hadoop runs best on a JBOD where each disk is simply mounted to a different directory?
- Question 4) And then you just point Hadoop at those data.dirs?
- Question 5) I see JBOD going two ways: either each disk gets a separate mount, or the disks are linearly concatenated, which can be done with mdadm --linear mode (and I bet LVM can do it too), so I don't see the big deal with that. If the JBOD people are referring to is this concat of disks, built with mdadm --linear or LVM, then which is the best way to "JBOD" or linearly concat disks for Hadoop?
This is off topic, but can someone verify whether this is correct as well? Filesystems that use COW (copy-on-write), like zfs and btrfs, just slow Hadoop down, and on top of that the COW implementation is wasted on Hadoop.
Question 6) Why are COW and things like RAID a waste on Hadoop? The way I see it, if your system crashes and you use the COW-ness of it to restore it, then by the time you have restored the system there have been so many changes to HDFS that it will probably just consider that machine faulty, and it would be better to rejoin it from scratch (bring it up as a fresh new datanode)... Or how will the Hadoop system see the older datanode? My guess is it won't think it's old or new or even a datanode, it will just see it as garbage... I don't know...
Question 7) What happens if Hadoop sees a datanode that fell off the cluster and then the datanode comes back online with slightly older data? Is there a limit to how old the data can be? How does this work?
REASKING QUESTIONS 1 THROUGH 4
I just realized my question is so simple, yet so hard for me to explain, that I had to split it into 4 questions, and I still didn't get the answer I'm looking for from what sound like very smart individuals, so I must re-ask it differently.
On paper or with a drawing I could explain it easily... I'll attempt it with words again.
In case you are confused about what I am asking in the JBOD question...
** I'm just wondering what kind of JBOD everyone keeps referring to in the Hadoop world, that's all **
JBOD seems to be defined differently in the Hadoop world than in the normal world, and I want to know the best way to set up Hadoop: on a concat of the disks (sda+sdb+sdc+sdd), or just leaving the disks alone (sda, sdb, sdc, sdd)?
I think the graphical representation below explains what I am asking best.
(JBOD METHOD 1)
Normal world: a JBOD is a concatenation of disks. If you were to use Hadoop, you would overlay the data.dir (where HDFS virtually sits) onto a directory inside this concat of disks, and all of the disks would appear as one. So if you had sda, sdb, and sdc as the data disks in your node, you would make them appear as some entity1 (either with the hardware of the motherboard, or mdadm, or LVM), which is a linear concat of sda, sdb, and sdc. You would then mount this entity1 to a folder in the Unix namespace, like /mnt/jbod/, and set up Hadoop to run within it.
TEXT SUMMARY: if disk1, disk2, and disk3 were 100 GB, 200 GB, and 300 GB respectively, then this JBOD would be 600 GB, and Hadoop would gain 600 GB in capacity from this node.
TEXTO-GRAPHICAL OF LINEAR CONCAT OF DISKS BEING A JBOD:
* disk1, disk2, and disk3 are used for the Hadoop datanode
* disk1 is sda, 100 GB
* disk2 is sdb, 200 GB
* disk3 is sdc, 300 GB
* sda + sdb + sdc = JBOD named entity1
* HOW THE JBOD IS MADE DOESN'T MATTER, THAT'S NOT MY QUESTION: maybe we made entity1 with LVM, or mdadm using linear concat, or hardware JBOD drivers that combine the disks and show them to the operating system as entity1. Either way it's still a JBOD
* This is the type of JBOD I am used to, and the one I keep coming across when I Google JBOD
* cat /proc/partitions would show sda, sdb, sdc, and entity1. If we used hardware JBOD, maybe sda, sdb, and sdc would not show and only entity1 would. Again, who cares how it shows
* mount entity1 to /mnt/entity1
* running "df" would show that entity1 is 100+200+300 = 600 GB
* we then set up Hadoop to run its datanode on /mnt/entity1, so that the datadir property points at /mnt/entity1 and the cluster just gained 600 GB of capacity
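To illustrate what I mean by method 1, here is a small Python sketch of what I would expect cat /proc/partitions to report. The sample text is made up to match the sizes above, and md0 is just a hypothetical device name standing in for entity1. A real linear array would likely come out slightly smaller than the exact sum because of metadata overhead.

```python
# Sample text in the layout of /proc/partitions (sizes are in 1 KiB blocks).
# Device names and sizes are invented to match the example above;
# "md0" is a hypothetical name standing in for entity1.
SAMPLE = """\
major minor  #blocks  name

   8        0  104857600 sda
   8       16  209715200 sdb
   8       32  314572800 sdc
   9        0  629145600 md0
"""


def parse_partitions(text):
    """Return {device_name: size_in_GiB} from /proc/partitions-style text."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and blank lines: data rows have 4 fields
        # and start with a numeric major number.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        blocks_kib = int(fields[2])
        sizes[fields[3]] = blocks_kib / (1024 * 1024)
    return sizes


sizes = parse_partitions(SAMPLE)
members = ["sda", "sdb", "sdc"]
member_total = sum(sizes[d] for d in members)

for name, gib in sizes.items():
    print(f"{name}: {gib:.0f} GiB")
print("concat equals sum of members:", member_total == sizes["md0"])
```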
..the other perspective is this..
(JBOD METHOD 2)
In Hadoop, it seems to me they want every disk separate. So I would mount sda, sdb, and sdc in the Unix namespace to /mnt/a, /mnt/b, and /mnt/c. From reading across the web, it seems lots of Hadoop experts define JBOD as exactly that, just a bunch of disks, so to Unix they would look like individual disks, not a concat of the disks. Of course I could then combine them into one entity with the logical volume manager (LVM) or mdadm (in a RAID or linear fashion, linear preferred for JBOD)... but... nah, let's not combine them, because it seems that in the Hadoop world a JBOD is just a bunch of disks sitting by themselves.
If disk1, disk2, and disk3 were 100 GB, 200 GB, and 300 GB respectively, then the mounts disk1->/mnt/a, disk2->/mnt/b, and disk3->/mnt/c would be 100 GB, 200 GB, and 300 GB respectively, and Hadoop would gain 600 GB in capacity from this node.
TEXTO-GRAPHICAL OF SEPARATE DISKS BEING A JBOD:
* disk1, disk2, and disk3 are used for the Hadoop datanode
* disk1 is sda, 100 GB
* disk2 is sdb, 200 GB
* disk3 is sdc, 300 GB
* WE DO NOT COMBINE THEM TO APPEAR AS ONE
* sda mounted to /mnt/a
* sdb mounted to /mnt/b
* sdc mounted to /mnt/c
* running "df" would show that sda, sdb, and sdc have the following sizes: 100, 200, and 300 GB respectively
* we then set up Hadoop via its config files to lay its HDFS on this node across the following "datadirs": /mnt/a, /mnt/b, and /mnt/c, gaining 100 GB for the cluster from a, 200 GB from b, and 300 GB from c, for a total gain of 600 GB from this node. Nobody using the cluster would be able to tell the difference
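For method 2, this is roughly how I imagine pointing Hadoop at the separate mounts. It is only a sketch: I am assuming the property is dfs.datanode.data.dir (older Hadoop 1.x releases call it dfs.data.dir) and that it takes a comma-separated list of directories. The helper name data_dir_property is my own invention, not something from Hadoop.

```python
import xml.etree.ElementTree as ET


def data_dir_property(mount_points, name="dfs.datanode.data.dir"):
    """Build an hdfs-site.xml style property element listing the data dirs.

    The property name is an assumption (Hadoop 2.x naming); the value is
    the mount points joined with commas.
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = ",".join(mount_points)
    return ET.tostring(prop, encoding="unicode")


# Method 1: one concatenated volume, so a single directory.
method1 = data_dir_property(["/mnt/entity1"])

# Method 2: every disk mounted separately, so one directory per disk.
method2 = data_dir_property(["/mnt/a", "/mnt/b", "/mnt/c"])

print(method1)
print(method2)
```

Either way the node contributes the same 600 GB; the only difference in the config is whether the value holds one path or three.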
SUMMARY OF QUESTION
** Which method is everyone referring to as BEST PRACTICE for Hadoop: the combination JBOD, or the separation of disks, which is also still a JBOD according to online documentation? **
- Both cases would gain Hadoop 600 GB. It's just that method 1 looks like a concat, one entity that is a combination of all the disks, which is what I always thought a JBOD was, while in method 2 each disk in the system is mounted to a different directory. The end result is the same to Hadoop capacity-wise... I'm just wondering which is the best way for performance.