1) Will this change affect the existing data in HDFS?
No, it will not. Old files keep the block size they were written with. For the new block size to take effect, you need to rewrite the data: do either a hadoop fs -cp or a distcp on your data. The new copy will have the new block size, and you can then delete your old data.
2) Do I need to propagate this change to all the nodes in the Hadoop cluster, or is changing it only on the NameNode sufficient?
I believe you could get away with changing it in one place, but note that the block size is actually applied by the client writing the file, not enforced by the NameNode, so whichever node the writing client runs on needs the new value. Either way, relying on a single node's config is a very bad idea. You need to keep all of your configuration files in sync, otherwise clients on different nodes will write files with different block sizes. When you get more serious about your Hadoop deployment, you should probably start using something like Puppet or Chef to manage your configs.
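One way to spot config drift is to ask each node what value it actually resolves (assuming a Hadoop 2.x+ release, where the setting is named dfs.blocksize):

    # run on any node: prints the block size its local config resolves to, in bytes
    hdfs getconf -confKey dfs.blocksize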
Also, note that whenever you change a configuration, you need to restart the NameNode and DataNodes for the change to take effect.
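Assuming a small cluster managed with the stock scripts in $HADOOP_HOME/sbin (your deployment may use service scripts or a manager like Ambari instead), a restart is just:

    # stop and start all HDFS daemons (NameNode, SecondaryNameNode, DataNodes)
    $HADOOP_HOME/sbin/stop-dfs.sh
    $HADOOP_HOME/sbin/start-dfs.sh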
Interesting note: you can set the block size of individual files as you write them, overriding the default. E.g., hadoop fs -D dfs.blocksize=134217728 -put a b (the property is dfs.block.size on older releases).
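You can also verify that the override took by stat-ing the file afterwards; the file and path names below are placeholders:

    # upload with a 128 MB block size regardless of the cluster default
    hadoop fs -D dfs.blocksize=134217728 -put mylocalfile /data/myfile

    # %o prints the block size of the stored file, in bytes
    hadoop fs -stat %o /data/myfile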