HBase cluster with corrupt region file on HDFS

Question

We have this HBase cluster: 30+ nodes, 48 tables, 40+ TB at the HDFS level, replication factor 2. Due to disk failures on two nodes, we have a corrupt file on HDFS.

Current HDFS status

Excerpt of hdfs fsck / output, which shows a corrupt HBase region file:

/user/hbase/table_foo_bar/295cff9c67379c1204a6ddd15808af0b/n/ae0fdf7d0fa24ad1914ca934d3493e56: 
 CORRUPT blockpool BP-323062689-192.168.12.45-1357244568924 block blk_9209554458788732793
/user/hbase/table_foo_bar/295cff9c67379c1204a6ddd15808af0b/n/ae0fdf7d0fa24ad1914ca934d3493e56:
 MISSING 1 blocks of total size 134217728 B

  CORRUPT FILES:        1
  MISSING BLOCKS:       1
  MISSING SIZE:         134217728 B
  CORRUPT BLOCKS:       1

The filesystem under path '/' is CORRUPT

The lost data is not recoverable (the disks are broken).
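To make the scope of the damage easier to read, the excerpt above can be summarized per file. The following is a minimal sketch, assuming the fsck detail lines look exactly like the excerpt (a path ending in a colon, followed by CORRUPT/MISSING lines); the function name and the dictionary keys are made up for illustration, and other Hadoop versions may format the output differently.

```python
import re

# Text copied from the fsck excerpt above.
FSCK_EXCERPT = """\
/user/hbase/table_foo_bar/295cff9c67379c1204a6ddd15808af0b/n/ae0fdf7d0fa24ad1914ca934d3493e56: 
 CORRUPT blockpool BP-323062689-192.168.12.45-1357244568924 block blk_9209554458788732793
/user/hbase/table_foo_bar/295cff9c67379c1204a6ddd15808af0b/n/ae0fdf7d0fa24ad1914ca934d3493e56:
 MISSING 1 blocks of total size 134217728 B
"""

# Assumed line shapes, derived only from the excerpt.
PATH_PREFIX = re.compile(r"^(/[^:\s]+):\s*(.*)$")
CORRUPT = re.compile(r"CORRUPT blockpool (\S+) block (\S+)")
MISSING = re.compile(r"MISSING (\d+) blocks of total size (\d+) B")


def summarize_fsck(text):
    """Group CORRUPT/MISSING details under the file path they belong to."""
    files = {}
    current = None
    for line in text.splitlines():
        m = PATH_PREFIX.match(line)
        if m:
            current = files.setdefault(
                m.group(1),
                {"blocks": [], "missing_blocks": 0, "missing_bytes": 0},
            )
            detail = m.group(2)  # details may share the line with the path
        else:
            detail = line.strip()
        if current is None or not detail:
            continue
        m = CORRUPT.search(detail)
        if m:
            current["blocks"].append(m.group(2))
        m = MISSING.search(detail)
        if m:
            current["missing_blocks"] += int(m.group(1))
            current["missing_bytes"] += int(m.group(2))
    return files


if __name__ == "__main__":
    for path, info in summarize_fsck(FSCK_EXCERPT).items():
        mib = info["missing_bytes"] // (1024 * 1024)
        print(path, info["blocks"], "%d MiB missing" % mib)
```

For the excerpt this yields one file with one block; 134217728 B is 128 * 1024 * 1024, so exactly one 128 MiB block is gone.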

Current HBase status

According to HBase, on the other hand, everything is fine and dandy.

hbase hbck says:

Version: 0.94.6-cdh4.4.0
...
 table_foo_bar is okay.
   Number of regions: 1425
   Deployed on:  ....
...
0 inconsistencies detected.
Status: OK

Moreover, it seems that we can still query data from the non-lost blocks of the corrupt region file (as far as I could check, based on the start and end row keys of the region).
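To find out which region and column family the damaged file belongs to, the HDFS path can be split into its components. This is only a sketch: it assumes the 0.94-style layout of root directory, table, encoded region name, column family, store file, and it assumes /user/hbase is the HBase root directory (inferred from the fsck path, not confirmed). The helper name and the tuple type are hypothetical.

```python
from collections import namedtuple

# Hypothetical container for the four path components.
StoreFile = namedtuple("StoreFile", "table region family hfile")


def split_store_file_path(path, root_dir="/user/hbase"):
    """Split <root>/<table>/<region>/<family>/<hfile> into its parts.

    root_dir is an assumption taken from the fsck output; adjust it to the
    cluster's actual hbase.rootdir.
    """
    prefix = root_dir.rstrip("/") + "/"
    if not path.startswith(prefix):
        raise ValueError("path is not under %s: %s" % (root_dir, path))
    parts = path[len(prefix):].split("/")
    if len(parts) != 4:
        raise ValueError("expected table/region/family/hfile, got: %s" % path)
    return StoreFile(*parts)


if __name__ == "__main__":
    sf = split_store_file_path(
        "/user/hbase/table_foo_bar/295cff9c67379c1204a6ddd15808af0b"
        "/n/ae0fdf7d0fa24ad1914ca934d3493e56"
    )
    print(sf.table, sf.region, sf.family, sf.hfile)
```

The encoded region name (the second component) is what can then be matched against the region list to look up the start and end row keys.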

Next steps

Because the file block data is not recoverable, it seems the only option is to remove the complete corrupt file (with hadoop fs -rm or hadoop fsck -delete /). This will "fix" corruption at the HDFS level.
However, I'm afraid removing the HDFS file will introduce corruption at the HBase level, as a complete region file will be gone.
I considered hadoop fsck -move / to move the corrupt file to /lost+found and see how HBase would take that, but moving to /lost+found is not as reversible as it seems, so I'm hesitant about that as well.

Concrete questions:

Should I just remove the file? (Losing the data corresponding to that region is reasonably fine for us.) What bad things happen when you manually remove an HBase region file in HDFS? Does it just remove the data, or would it introduce ugly metadata corruption in HBase that also has to be taken care of?

Or can we actually leave the situation as-is, which seems to work at the moment (HBase is not complaining about/seeing corruption)?

So the problem with hdfs is that a single block is corrupt but does that mean hbase should have problems? You most probably would have replication in hdfs >1. So just fix the corrupt blocks? See https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hadoop-hdfs?rq=1 — FUD, Jun 24 '15 at 03:40
@FUD: Thanks for your response, but as I noted in my original post, we have replication factor 2, and the lost block is not recoverable. Also note that the main question is not about the HDFS issue in itself, but about the implications at the HBase level — Stefaan, Jun 24 '15 at 15:46

Dan M · Answer 1 · 2015-06-26T23:57:49.430

We had a similar situation: 5 missing blocks, 5 corrupted files for an HBase table.
HBase version: 0.94.15
distro: CDH 4.7
OS: CentOS 6.4

Recovery instructions:

su hbase: switch to the hbase user
hbase hbck -details: to understand the scope of the problem
hbase hbck -fix: to try to recover from region-level inconsistencies
hbase hbck -repair: tried to auto-repair, but this actually increased the number of inconsistencies by 1
hbase hbck -fixMeta -fixAssignments
hbase hbck -repair: this time the tables got repaired
hbase hbck -details: to confirm the fix

At this point HBase was healthy: it had added an additional region and de-referenced the corrupted files. However, HDFS still had 5 corrupted files. Since they were no longer referenced by HBase, we deleted them:

su hdfs: switch to the hdfs user
hdfs fsck /: to understand the scope of the problem
hdfs fsck / -delete: remove corrupted files only
hdfs fsck /: to confirm healthy status

NOTE: it is important to fully stop the stack to reset caches
(stop all services: Thrift, HBase, ZooKeeper, HDFS, and start them again in reverse order).
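The two command sequences above can also be written down as data, which makes it easy to print or review the plan before touching the cluster. This is a rough sketch, not a recommended automation: the (user, command) pairing, the function names, and the injected runner are all illustrative, and the default runner only prints. It also assumes, without having verified it, that the check commands may exit non-zero while problems remain, so steps are not aborted on failure.

```python
import shlex

# (user, command) pairs in the order listed above; the tuple grouping is an
# illustrative choice, not something hbck or fsck require.
RECOVERY_PLAN = [
    ("hbase", "hbase hbck -details"),
    ("hbase", "hbase hbck -fix"),
    ("hbase", "hbase hbck -repair"),
    ("hbase", "hbase hbck -fixMeta -fixAssignments"),
    ("hbase", "hbase hbck -repair"),
    ("hbase", "hbase hbck -details"),
    ("hdfs", "hdfs fsck /"),
    ("hdfs", "hdfs fsck / -delete"),
    ("hdfs", "hdfs fsck /"),
]


def run_plan(plan, runner):
    """Call runner(user, argv) for every step and collect (command, exit_code).

    The loop does not stop on a non-zero exit code (assumption: the check
    commands can return non-zero while inconsistencies still exist).
    """
    results = []
    for user, command in plan:
        code = runner(user, shlex.split(command))
        results.append((command, code))
    return results


def dry_run(user, argv):
    """Hypothetical runner that only prints what would be executed."""
    print("[as %s] %s" % (user, " ".join(argv)))
    return 0


if __name__ == "__main__":
    run_plan(RECOVERY_PLAN, dry_run)
```

A real runner would have to switch users and inspect each command's output between steps; that part is deliberately left out here.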

[1] Cloudera page for hbck command:
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/admin_hbck_poller.html

@dan m: We had the same problem two years ago and couldn't stabilize the cluster due to an under-replicated blocks issue. In the end we formatted HDFS and reloaded the data to resolve the problem. Nice to see that you found a solution. — S.K. Venkat, Jul 06 '16 at 18:12

score 3 · Answer 2 · answered Jun 24 '15 at 21:48

FYI: I decided to bite the bullet and manually deleted the corrupt file from HDFS with:

hdfs dfs -rm /user/hbase/table_foo_bar/295cff9c67379c1204a6dd....

(hdfs fsck -move did not work for me, not sure why)

After that, I checked HBase's health with hbck, and no inconsistencies were detected:

$ hbase hbck
...
0 inconsistencies detected.
Status: OK

So in our case the manual deletion of the region file did not introduce HBase corruption, if I understand correctly, which is nice, but confusing. (I hope this does not backfire and the corruption doesn't manifest itself at a later point in time)
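For repeated checks after a change like this, the two summary lines of the hbck output can be extracted automatically. A minimal sketch, assuming the summary always has the two-line shape shown above; the function name is made up. A clean summary presumably only says that hbck's own consistency checks passed, not that every store file is readable.

```python
import re

# Tail of the hbck output as shown above.
HBCK_TAIL = """\
0 inconsistencies detected.
Status: OK
"""


def parse_hbck_summary(text):
    """Return (inconsistency_count, status); either is None if not found."""
    count = re.search(r"^(\d+) inconsistencies detected\.", text, re.MULTILINE)
    status = re.search(r"^Status: (\S+)", text, re.MULTILINE)
    return (
        int(count.group(1)) if count else None,
        status.group(1) if status else None,
    )


if __name__ == "__main__":
    print(parse_hbck_summary(HBCK_TAIL))
```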

issue closed

Your mileage may vary.

score 3 · Answer 3 · answered Aug 24 '17 at 05:29

If region-level inconsistencies are found, use the -fix argument to direct hbck to try to fix them:

$ ./bin/hbase hbck -fix

With -fix, the following sequence of steps is followed:

The standard check for inconsistencies is run.
If needed, repairs are made to tables.
If needed, repairs are made to regions. Regions are closed during repair.

If you want to fix individual region-level inconsistencies separately, before running -fix, the following options are available:

-fixAssignments (equivalent to the 0.90 -fix option) repairs unassigned, incorrectly assigned or multiply assigned regions.

-fixMeta removes META rows when the corresponding regions are not present in HDFS, and adds new META rows if the regions are present in HDFS but not in META.

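Note that -fix itself is kept for backwards compatibility and, as stated above, corresponds to -fixAssignments; to repair META as well, add -fixMeta explicitly: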

 $ ./bin/hbase hbck -fixAssignments
 $ ./bin/hbase hbck -fixAssignments -fixMeta

There are a few classes of table integrity problems that are low-risk repairs. The first two are degenerate (startkey == endkey) regions and backwards regions (startkey > endkey). These are automatically handled by sidelining the data to a temporary directory (/hbck/xxxx). The third low-risk class is HDFS region holes.

Holes can be repaired with the -fixHdfsHoles option, which fabricates new empty regions on the file system. If holes are detected you can use -fixHdfsHoles and should include -fixMeta and -fixAssignments to make the new region consistent.

 $ ./bin/hbase hbck -fixAssignments -fixMeta -fixHdfsHoles

-repairHoles includes {-fixAssignments -fixMeta -fixHdfsHoles }

 $ ./bin/hbase hbck -repairHoles
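The relationship between the shorthand and the individual options can be written as a small lookup. This sketch only encodes the -repairHoles expansion stated above; the table and function names are hypothetical, and it does not call hbck.

```python
# Only the expansion given in this answer is listed; other composite options
# are deliberately left out rather than guessed.
COMPOSITE_OPTIONS = {
    "-repairHoles": ["-fixAssignments", "-fixMeta", "-fixHdfsHoles"],
}


def expand_options(args):
    """Replace composite options by their parts, dropping duplicates in order."""
    expanded = []
    for arg in args:
        for opt in COMPOSITE_OPTIONS.get(arg, [arg]):
            if opt not in expanded:
                expanded.append(opt)
    return expanded


if __name__ == "__main__":
    print(expand_options(["-repairHoles"]))
```

Under this mapping, hbck -repairHoles and hbck -fixAssignments -fixMeta -fixHdfsHoles request the same set of repairs.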
