67

How does someone fix an HDFS that's corrupt? I looked on the Apache Hadoop website and it said to use its fsck command, which doesn't fix it. Hopefully someone who has run into this problem before can tell me how to fix this.

Unlike a traditional fsck utility for native file systems, this command does not correct the errors it detects. Normally NameNode automatically corrects most of the recoverable failures.

When I ran bin/hadoop fsck / -delete, it listed the files that were corrupt or had missing blocks. How do I make it not corrupt? This is on a practice machine, so I COULD blow everything away, but when we go live I won't be able to "fix" it by blowing everything away, so I'm trying to figure it out now.

Classified

4 Answers

108

You can use

  hdfs fsck /

to determine which files are having problems. Look through the output for missing or corrupt blocks (ignore under-replicated blocks for now). This command is really verbose, especially on a large HDFS filesystem, so I normally get down to the meaningful output with

  hdfs fsck / | egrep -v '^\.+$' | grep -v eplica

which ignores lines with nothing but dots and lines talking about replication.

Once you find a file that is corrupt, run

  hdfs fsck /path/to/corrupt/file -locations -blocks -files

Use that output to determine where blocks might live. If the file is larger than your block size it might have multiple blocks.

You can use the reported block numbers to search the datanode and namenode logs for the machine or machines where the blocks lived. Try looking for filesystem errors on those machines: missing mount points, a datanode that isn't running, or a file system that was reformatted or reprovisioned. If you can find a problem that way and bring the block back online, the file will be healthy again.
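For example, here's a minimal sketch of that log search; the block ID, the log locations under /var/log/hadoop-hdfs/, and the DataNode data directory are all assumptions that will vary with your distribution:

  # Block ID taken from the fsck output (placeholder value)
  BLOCK=blk_1073741825

  # On the NameNode: find which DataNodes were known to hold this block
  grep "$BLOCK" /var/log/hadoop-hdfs/*namenode*.log*

  # On each suspect DataNode: look for deletions, I/O errors, or corruption reports
  grep "$BLOCK" /var/log/hadoop-hdfs/*datanode*.log*

  # Also check whether the block file still exists on that DataNode's disks
  # (the data directory path is an assumption; check dfs.datanode.data.dir)
  find /hadoop/hdfs/data -name "${BLOCK}*" 2>/dev/null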

Lather, rinse, and repeat until all files are healthy or you exhaust all alternatives looking for the blocks.

Once you determine what happened and you cannot recover any more blocks, just use the

  hdfs dfs -rm /path/to/file/with/permanently/missing/blocks

command to get your HDFS filesystem back to a healthy state so you can start tracking new errors as they occur.
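If many files are affected, a rough sketch of that cleanup follows. It assumes the default fsck output format, where each problem line starts with the affected file path followed by a colon, so review the generated list before deleting anything:

  # Collect the paths of files fsck reports as CORRUPT or MISSING
  hdfs fsck / | egrep -i 'MISSING|CORRUPT' | egrep -o '^/[^:]+' | sort -u > /tmp/bad_files

  # Inspect /tmp/bad_files by hand, then remove the files you cannot recover
  while read -r f; do
    hdfs dfs -rm "$f"
  done < /tmp/bad_files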

mobileAgent
  • Thx for your reply. I'll try your suggestion the next time the HDFS has issues. Somehow, it fixed itself when I ran `bin/hadoop fsck / -delete`. After that, the HDFS was no longer corrupted and some files ended up in /lost+found. It didn't do that before when I stopped the HDFS and restarted several times. I upvoted and accepted your answer =) Thx again. – Classified Oct 14 '13 at 20:19
  • But if a file is replicated 3 times in the cluster, can't I just get it back from another node? I know I had some data loss on one machine, but isn't the whole point of HDFS that this shouldn't matter? – Marius Soutier Aug 05 '14 at 19:44
  • I have done this numerous times and didn't get the issue resolved. But I am aware that there is no other option to recover corrupted or lost data in HDFS. I can still see the corrupted blocks issue even though I cleared the data from all the data nodes. – S.K. Venkat May 30 '16 at 17:36
  • The name node might have some way to get this resolved. If anybody has an idea about that, please share the answer. Thanks in advance... – S.K. Venkat May 30 '16 at 17:38
  • Having had a problem with only one node (it crashed and lost some of its files), the easiest solution was the one suggested by @Classified: simply execute `hadoop fsck / -delete`. – sofia Jul 21 '16 at 08:43
  • I had to run `hdfs dfsadmin -safemode leave` before `hadoop fsck -delete` would work. Otherwise it just kept cryptically saying `fsck failed on '/'` – makhdumi Apr 24 '17 at 16:33
  • Wouldn't deleting the missing blocks cause data loss? hdfs fs -rm /path/to/file/with/permanently/missing/blocks @mobileAgent – spark_dream Apr 18 '18 at 22:13
  • There are often times when applications write intermediate data that is temporary stuff that can easily be re-generated on failure and is therefore stored with a replication factor of 1. If these types of applications crash for any reason and do not clean up, they will leave behind this data. If at some point in the future the DataNode with the one replica crashes, you will see corrupt blocks. This happens every so often and isn't a big deal. This data can safely be removed to restore the health of the cluster. – davidemm Jul 02 '20 at 20:25
  • Best comment from @davidemm; we ran a series of test scenarios for our client and one of them was datanode failure. Since the only available data were application logs and terasort tests, which are stored with replication factor 1 by default, there were plenty of missing blocks. The only resolution was to delete the given files or restore the failed datanodes. – 32cupo Jun 22 '23 at 09:43
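Following up on the comments above about data written with replication factor 1: here is a quick way (a sketch; the paths are placeholders) to check how a particular file is replicated before deciding whether losing it matters:

  # %r prints the file's replication factor, %n its name
  hdfs dfs -stat "%r %n" /path/to/suspect/file

  # The second column of -ls output also shows the replication factor
  hdfs dfs -ls /path/to/suspect/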
33

If you just want to get your HDFS back to a normal state and don't worry much about the data, then:

This will list the corrupt HDFS blocks:

hdfs fsck / -list-corruptfileblocks

This will delete the files with corrupt HDFS blocks:

hdfs fsck / -delete

Note that you might have to prefix these commands with `sudo -u hdfs` if you are not running them as the HDFS superuser (assuming "hdfs" is the name of that user).
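For example, assuming the HDFS service user really is named "hdfs" (typical on packaged distributions), the two commands above would be run as:

  sudo -u hdfs hdfs fsck / -list-corruptfileblocks
  sudo -u hdfs hdfs fsck / -delete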

PradeepKumbhar
2

The solution here worked for me: https://community.hortonworks.com/articles/4427/fix-under-replicated-blocks-in-hdfs-manually.html

su - <$hdfs_user>

bash-4.1$ hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files 

-bash-4.1$ for hdfsfile in `cat /tmp/under_replicated_files`; do echo "Fixing $hdfsfile :" ;  hadoop fs -setrep 3 $hdfsfile; done
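A small variation, in case you want each command to block until re-replication actually completes and then confirm nothing is still flagged; the `-w` flag and the final check are the only additions to the loop above:

  # -w makes setrep wait until the target replication factor is reached
  for hdfsfile in `cat /tmp/under_replicated_files`; do
    echo "Fixing $hdfsfile :"
    hadoop fs -setrep -w 3 "$hdfsfile"
  done

  # Count how many files fsck still reports as under replicated (0 means done)
  hdfs fsck / | grep -c 'Under replicated'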
abc123
  • I also had to fail over my primary name node before I ran the above commands because it had entered SAFE MODE. The failover made the standby node become Active, and I could then run the above commands and get rid of the corrupt blocks :) – abc123 Jul 19 '18 at 21:42
-6

Start all daemons and run the command `hadoop namenode -recover -force`, then stop the daemons and start them again. Wait some time for the data to be recovered.