
In a small HBase cluster, all the slave nodes got restarted. When I started HBase services, one of the tables (test) became inconsistent.

Some blocks (HBase blocks) were missing in HDFS, so the NameNode was in safe mode. I forced it out with the safemode -leave command.
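
The safe-mode command and a block check would look roughly like this (a sketch assuming a stock HDFS CLI; exact options may vary by version):

    hdfs fsck /hbase -list-corruptfileblocks   # list missing/corrupt block files under the HBase root
    hdfs dfsadmin -safemode leave              # force the NameNode out of safe mode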

Then HBase table (test) became inconsistent.

I performed below mentioned actions:

  1. I executed "hbase hbck" several times; it reported 2 inconsistencies for table "test":

    ERROR: Region { meta=>test,1m\x00\x03\x1B\x15,1393439284371.4c213a47bba83c47075f21fec7c6d862., hdfs => hdfs://master:9000/hbase/test/4c213a47bba83c47075f21fec7c6d862, deployed => } not deployed on any region server.

  2. I ran hbase hbck -fixMeta -fixAssignments, which reported:

    HBaseFsckRepair: Region still in transition, waiting for it to become assigned:
    {NAME => 'test,1m\x00\x03\x1B\x15,1393439284371.4c213a47bba83c47075f21fec7c6d862.', STARTKEY => '1m\x00\x03\x1B\x15', ENDKEY => '', ENCODED => 4c213a47bba83c47075f21fec7c6d862,}

  3. I ran hbase hbck -repair, which reported:

    HBaseFsckRepair: Region still in transition, waiting for it to become assigned:
    {NAME => 'test,1m\x00\x03\x1B\x15,1393439284371.4c213a47bba83c47075f21fec7c6d862.', STARTKEY => '1m\x00\x03\x1B\x15', ENDKEY => '', ENCODED => 4c213a47bba83c47075f21fec7c6d862,}

  4. I checked datanode logs in parallel.

    Logs:

    org.apache.hadoop.hdfs.server.datanode.DataNode: opReadBlock BP-1015188871-192.168.1.11-1391187113543:blk_7616957984716737802_27846 received exception java.io.EOFException
    WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.1.12, storageID=DS-831971799-192.168.1.12-50010-1391193910800, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=CID-7f99a9de-258c-493c-9db0-46b9e84b4c12;nsid=1286773982;c=0):Got exception while serving BP-1015188871-192.168.1.11-1391187113543:blk_7616957984716737802_27846 to /192.168.1.12:36127

  5. I checked the NameNode logs:

    ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:ubuntu (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /hbase/test/4c213a47bba83c47075f21fec7c6d862/C
    2014-02-28 14:13:15,738 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 10.10.242.31:42149: error: java.io.FileNotFoundException: File does not exist: /hbase/test/4c213a47bba83c47075f21fec7c6d862/C
    java.io.FileNotFoundException: File does not exist: /hbase/test/4c213a47bba83c47075f21fec7c6d862/C
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1301)

However, I am still able to browse and download the file from HDFS. How can I recover the data?

How can I make the "test" table consistent?


3 Answers


In HBase 2.0 (and possibly in previous versions), "not deployed on any region server" is typically solved by getting the region assigned.

  1. Authenticate if you're on a secured cluster. You are on a secured cluster, aren't you? ;)

    kinit -kt <keytab> <principal>
    
  2. Run HBase check to see which regions specifically are unassigned

    hbase hbck -details
    
  3. If you see an error like this:

    ERROR: Region { 
        meta => my.tablename,,1500001112222.abcdef123456789abcdef12345678912., 
        hdfs => hdfs://cluster/apps/hbase/data/data/default/my.tablename/abcdef123456789abcdef12345678912,
        deployed => ,
        replicaId => 0 
    } not deployed on any region server.
    

    (the key being "not deployed on any region server"), then you should assign the region. This, it turns out, is pretty simple. Proceed to step 4.

  4. Open an hbase shell

    hbase shell
    
  5. Assign the region by passing the encoded region name to the assign command. As noted in the help documentation, this should not be called without doing the previous due diligence, as it will force a reassign. The docs say, and I caution: for experts only.

    hbase(main):001:0> assign 'abcdef123456789abcdef12345678912'
    
  6. Double-check your work by running hbase check for your table that had the unassigned regions.

    hbase hbck my.tablename 
    

    If you did everything correctly and if there's no underlying HDFS issue, you should see this message near the bottom of the hbck output:

    0 inconsistencies detected.
    Status: OK
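
    Optionally, you can also confirm from the shell that the region now shows a hosting server. A minimal check, assuming an HBase 1.4+/2.0 shell where the list_regions command is available (my.tablename is the same placeholder table as above):

    hbase(main):001:0> list_regions 'my.tablename'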
    
WattsInABox

In HBase 2.0.2 there is no repair option in hbck to recover inconsistencies.

  1. Run the hbase hbck command.

  2. Check whether the error messages look like the ones below:

    ERROR: Region { meta => EMP_NMAE,\x02\x00\x00\x00\x00,1571419090798.054b393c37a80563ae1aa60f29e3e4df., hdfs => hdfs://node1:8020/apps/hbase/data/data/LEVEL_RESULT/054b393c37a80563ae1aa60f29e3e4df, deployed => , replicaId => 0 } not deployed on any region server.
    ERROR: Region { meta => TABLE_3,\x02174\x0011100383\x00496\x001,1571324271429.6959c7157693956825be65676ced605c., hdfs => hdfs://node1:8020/apps/hbase/data/data/TABLE_NAME/6959c7157693956825be65676ced605c, deployed => , replicaId => 0 } not deployed on any region server.

  3. Copy these errors to a file and pull out the encoded region names (the alphanumeric values). If the inconsistency count is small you can pick the values out manually, but with many regions that gets tedious, so use the command below to narrow the output down to just the encoded names, which can then be pasted into the hbase shell in one go (a scripted version is sketched right after this list):

    cat inconsistant.out | awk -F'.' '{print $2}'

  4. Open the hbase shell and assign these regions manually, like below:

    assign '054b393c37a80563ae1aa60f29e3e4df'
    assign '6959c7157693956825be65676ced605c'
    assign '7058dfe0da0699865a5b63be9d3799ab'
    assign 'd25529539bae49eb078c7d0ca6ce84e4'
    assign 'e4ad94f58e310a771a0f5a1eade884cc'
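
A minimal scripted sketch of steps 3 and 4 combined, assuming the hbck output format shown above (where the encoded region name is the second '.'-delimited field) and that the hbase binary is on the PATH:

    # Keep only the "not deployed" errors reported by hbck
    hbase hbck 2>/dev/null | grep 'not deployed on any region server' > inconsistant.out
    # Field 2 (splitting on '.') is the encoded region name; wrap each one in an assign command
    awk -F'.' '{print "assign \047" $2 "\047"}' inconsistant.out > assign_regions.txt
    # Run the generated commands through a scripted hbase shell session
    hbase shell assign_regions.txt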

Once the assigning is completed, run the hbase hbck command again.

Hakan Dilek

I had the same problem. It turned out there were region overlaps. Here is how I fixed it:

  1. Try to assign the undeployed region from the hbase shell: assign 'Abcd...'
  2. Check the HBase Master log for an ERROR from AssignmentManager (something like: Trying to assign region {ENCODED => Abcd..., NAME => ..., ts=1591351130943, server=server1,6020,1581641930622}); see the sketch after this list.
  3. Turn off the region server on server1
  4. Run hbase hbck -repair my_table
  5. Repeat for every undeployed region
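
A minimal sketch for step 2, assuming the master log sits under /var/log/hbase with the default hbase-<user>-master-<host>.log naming (adjust the path for your distribution):

    # Look for assignment errors mentioning the stuck region in the active master's log
    grep 'AssignmentManager' /var/log/hbase/hbase-*-master-*.log | grep 'Trying to assign region'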

Or you can just restart HBase and run 'hbase hbck -repair'.

pyOwner