Questions tagged [hbase]

HBase is the Hadoop database (columnar). Use it when you need random, real-time read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware.

HBase is an open source, non-relational, distributed, versioned, column-oriented database written in Java and modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of the Hadoop Distributed File System (HDFS). It is developed as part of the Apache Software Foundation's Apache Hadoop project. HBase includes:

  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive, and Pig source and sink modules
  • Query predicate push down via server side scan and get filters
  • Optimizations for real-time queries
  • A Thrift gateway and a RESTful web service that supports XML, Protobuf, and binary data encoding options
  • Extensible JRuby-based (JIRB) shell
  • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX
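
As a rough illustration of the "versioned, column-oriented" model described above, here is a minimal in-memory sketch in Python. It is not the HBase client API: the class name `VersionedTable`, its methods, and the `max_versions` parameter are all hypothetical and exist only to show how row keys, `family:qualifier` columns, and timestamped cell versions relate, and how a scan over a sorted key range can stop early.

```python
from collections import defaultdict


class VersionedTable:
    """Toy model of a Bigtable-style table (hypothetical, not the HBase API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, ts):
        """Store a new version of a cell and drop versions beyond max_versions."""
        cells = self._rows[row][column]
        cells.append((ts, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)
        del cells[self.max_versions:]

    def get(self, row, column):
        """Return the newest value of a cell, or None if the cell is absent."""
        cells = self._rows.get(row, {}).get(column, [])
        return cells[0][1] if cells else None

    def scan(self, start_row, stop_row, limit=None):
        """Yield (row key, newest values) in key order for start_row <= key < stop_row."""
        count = 0
        for key in sorted(self._rows):
            if key < start_row or key >= stop_row:
                continue
            if limit is not None and count >= limit:
                break
            yield key, {col: cells[0][1] for col, cells in self._rows[key].items()}
            count += 1
```

A real table keeps rows sorted on disk and spreads them across region servers; this sketch re-sorts the keys on every scan purely for brevity.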
6961 questions
202
votes
17 answers

When to use Hadoop, HBase, Hive and Pig?

What are the benefits of using either Hadoop or HBase or Hive? From my understanding, HBase avoids using map-reduce and has column-oriented storage on top of HDFS. Hive is a SQL-like interface for Hadoop and HBase. I would also like to know how…
Khalefa
  • 2,294
  • 3
  • 14
  • 12
150
votes
5 answers

Difference between HBase and Hadoop/HDFS

This is kind of a naive question, but I am new to the NoSQL paradigm and don't know much about it. So if somebody can help me clearly understand the difference between HBase and Hadoop, or give some pointers which might help me understand the…
Dhaval Shah
  • 1,515
  • 2
  • 10
  • 5
117
votes
18 answers

How to delete all data from solr and hbase

How do I delete all data from Solr by command? We are using Solr with Lily and HBase. How can I delete data from both HBase and Solr? http://lucene.apache.org/solr/4_10_0/tutorial.html#Deleting+Data
XMen
  • 29,384
  • 41
  • 99
  • 151
87
votes
3 answers

Large scale data processing Hbase vs Cassandra

I have nearly settled on Cassandra after my research on large-scale data storage solutions. But it's generally said that HBase is a better solution for large-scale data processing and analysis. While both are the same key/value storage and both are/can run…
Gary Lindahl
  • 5,341
  • 2
  • 19
  • 18
58
votes
5 answers

Command like SQL LIMIT in HBase

Does HBase have any command that works like an SQL LIMIT query? I can do it with setStart and setEnd, but I do not want to iterate over all rows.
Mohammad
  • 1,474
  • 2
  • 11
  • 20
57
votes
7 answers

How does Hive compare to HBase?

I'm interested in finding out how the recently-released (http://mirror.facebook.com/facebook/hive/hadoop-0.17/) Hive compares to HBase in terms of performance. The SQL-like interface used by Hive is very much preferable to the HBase API we have…
mrhahn
  • 607
  • 1
  • 7
  • 7
57
votes
11 answers

Scalable Image Storage

I'm currently designing an architecture for a web-based application that should also provide some kind of image storage. Users will be able to upload photos as one of the key features of the service. Also viewing these images will be one of the…
b_erb
  • 20,932
  • 8
  • 55
  • 64
56
votes
12 answers

Hbase quickly count number of rows

Right now I implement the row count over a ResultScanner like this: for (Result rs = scanner.next(); rs != null; rs = scanner.next()) { number++; } If the data reaches millions of rows, the computation time is large. I want to compute in real time that I don't want to…
cldo
  • 1,735
  • 6
  • 21
  • 26
52
votes
6 answers

Hive load CSV with commas in quoted fields

I am trying to load a CSV file into a Hive table like so: CREATE TABLE mytable ( num1 INT, text1 STRING, num2 INT, text2 STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ","; LOAD DATA LOCAL INPATH '/data.csv' OVERWRITE INTO TABLE mytable; …
Martijn Lenderink
  • 535
  • 1
  • 5
  • 5
47
votes
4 answers

How to read from hbase using spark

The code below reads from HBase, converts the result to a JSON structure, and then converts it to a schemaRDD. But the problem is that I am using a List to store the JSON strings and then passing it to a javaRDD; for data of about 100 GB the master will be loaded with…
madan ram
  • 1,260
  • 2
  • 19
  • 26
44
votes
7 answers

Scan with filter using HBase shell

Does anybody know how to scan records based on some scan filter, i.e. column:something = "somevalue"? Something like this, but from the HBase shell?
Gandalf StormCrow
  • 25,788
  • 70
  • 174
  • 263
42
votes
1 answer

Why HBase is a better choice than Cassandra with Hadoop?

Why is using HBase a better choice than using Cassandra with Hadoop? Can anyone please give a detailed explanation on this? Thanks
Niladri Biswas
  • 4,153
  • 2
  • 17
  • 24
41
votes
5 answers

How to connect to remote HBase in Java?

I have a standalone HBase server. This is my hbase-site.xml: hbase.rootdir file:///hbase_data I am trying to write a Java program to manipulate the data…
leon
  • 10,085
  • 19
  • 60
  • 77
39
votes
1 answer

HBase REST Filter ( SingleColumnValueFilter )

I cannot figure out how to use filters in the HBase REST interface (HBase 0.90.4-cdh3u3). The documentation just gives me a schema definition for a "string", but doesn't show how to use it. So, I'm able to do this: curl -v -H 'Content-Type:…
Mario
  • 1,801
  • 3
  • 20
  • 32
39
votes
6 answers

storing massive ordered time series data in bigtable derivatives

I am trying to figure out exactly what these newfangled data stores such as Bigtable, HBase, and Cassandra really are. I work with massive amounts of stock market data, billions of rows of price/quote data that can add up to 100s of gigabytes every…
Shahbaz
  • 10,395
  • 21
  • 54
  • 83