5

I am fairly new to MapR, but I am familiar with HBase. I was watching a video in which I learned that MapR-DB is a NoSQL database in MapR and that it is similar to HBase. In addition, HBase can also be run on MapR. I am confused about MapR-DB and HBase. What is the exact difference between them?

When should I use MapR-DB and when should I use HBase?

Basically, I have some Java code that does a bulk load into HBase on MapR. If I use the same code that I used for Apache Hadoop, will it work here?

Please help me to avoid this confusion.

Jabberwocky
Shashi

2 Answers

8

They are both NoSQL wide-column stores.

HBase is open source and can be installed as a part of a Hadoop installation.

MapR-DB is a proprietary (not open source) NoSQL database that MapR offers. The core difference MapR emphasizes, along with its own file system (it does not use HDFS), is that MapR-DB offers significantly better performance and scalability than HBase (unlimited tables and columns and a re-architected design, to name a few).

MapR maintains that you can use MapR-DB or HBase interchangeably. I suggest testing on both extensively before committing to one or the other. You also need to realize that MapR-DB is MapR's proprietary NoSQL equivalent of HBase, so if you require support for MapR-DB you'll have to get it from MapR (HBase support can come from any of the other Hadoop distributions as well as the open source community).
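One practical place where "interchangeable" shows up is table naming. As a rough sketch, and assuming MapR's convention that MapR-DB tables are addressed by a file-system path while HBase tables use plain names, a client could tell the two apart like this. The class `TableTarget`, its `classify` method, and the example path are hypothetical names made up for illustration, not part of the HBase or MapR APIs, and the leading-slash rule is a simplification you should verify against your MapR version.

```java
// Sketch only: the routing rule below is an assumption about MapR's
// path-style table naming, not a documented API.
class TableTarget {

    enum Kind { MAPR_DB, HBASE }

    // Assumption: identifiers that look like absolute file-system paths
    // refer to MapR-DB tables; plain names refer to HBase tables.
    static Kind classify(String tableName) {
        if (tableName == null || tableName.isEmpty()) {
            throw new IllegalArgumentException("table name must not be empty");
        }
        return tableName.startsWith("/") ? Kind.MAPR_DB : Kind.HBASE;
    }

    public static void main(String[] args) {
        // Hypothetical table identifiers, for illustration only.
        System.out.println(classify("/user/demo/events"));
        System.out.println(classify("events"));
    }
}
```

The point is only that the same client code may end up talking to a different store depending on how the table is named, which is one more reason to test against both.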

Some links you should look at: http://www.theregister.co.uk/2013/05/01/mapr_hadoop_m7_edition_solr/ https://www.mapr.com/blog/get-real-hadoop-enterprise-grade-nosql#.VVfHuvlVhBc

Larry Advey
  • Thank you Larry for such a nice explanation. As I understand it, MapR-DB and HBase have similar features, with MapR-DB having more advantages. So can something written for HBase (a piece of Java code for bulk load) be used directly with MapR-DB, or will it need some special changes to make it compatible? – Shashi May 18 '15 at 06:43
  • Based on MapR's goal of full compatibility between HDFS and MapRFS, I fully expect it to work on their platform. – Larry Advey May 18 '15 at 21:18
  • @Shashi All you need to do to get your HBase code to work on MapR-DB is to simply add the `mapr-hbase-4.0.1-mapr.jar` to your classpath. – Matthew Moisen Mar 10 '16 at 19:45
  • Note that MapR DB does not support co-processors and a few other capabilities that require code injection into the database core. This is for security reasons. Most applications do not need these capabilities. – Ted Dunning May 05 '16 at 17:52
  • If they do not use HDFS, what do they use? It is a distributed database right? – r4bb1t Mar 06 '22 at 19:16
7

They are similar but not the same. MapR claims that MapR-DB is faster and more efficient because the critical functionality has been migrated to native C/C++ code while the interface is kept the same. But at the end of the day, MapR-DB is proprietary and you depend on MapR's support for anything that is done differently than in HBase. I didn't like MapR-DB because it's not compatible with Apache Phoenix, the SQL way of accessing HBase-style NoSQL databases (HBase coprocessors are not present in MapR-DB). Limitations that I have taken from the MapR documentation:

  • Custom HBase filters are not supported.
  • User permissions for column families are not supported. User permissions for tables and columns are supported.
  • HBase authentication is not supported.
  • HBase replication is handled with Mirror Volumes.
  • Bulk loads using the HFiles workaround are not supported and not necessary.
  • HBase coprocessors are not supported.
  • Filters use a different regular expression library.
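A minimal sketch of how the list above could be turned into a quick pre-migration check: you list the HBase features your application relies on and see which ones appear among the limitations. The class name and the feature labels are hypothetical strings chosen for illustration; they are not identifiers from the HBase or MapR APIs, and the set only mirrors the bullets quoted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical pre-migration check. The feature labels are illustrative
// strings, not identifiers from the HBase or MapR APIs.
class MaprDbCompatibilityCheck {

    // Features listed above as unsupported in MapR-DB.
    static final Set<String> UNSUPPORTED = new LinkedHashSet<>(List.of(
            "custom-filters",
            "column-family-permissions",
            "hbase-authentication",
            "hfile-bulk-load",
            "coprocessors"));

    // Returns the features the application uses that are on the
    // unsupported list, in the order they were given.
    static List<String> blockers(List<String> featuresUsed) {
        List<String> result = new ArrayList<>();
        for (String feature : featuresUsed) {
            if (UNSUPPORTED.contains(feature)) {
                result.add(feature);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> used = List.of("puts", "scans", "coprocessors", "hfile-bulk-load");
        System.out.println("Blockers: " + blockers(used));
    }
}
```

An empty result does not prove compatibility (for example, filters that depend on regular expression behavior may still differ), so it is a starting point for testing rather than a replacement for it.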

So I second the previous answer: try out your solution on both (MapR-DB vs. HBase) before going too far. I didn't like the very idea of MapR-DB because it's proprietary and the code is not open source. If any Hadoop distributor is enhancing hadoop - they should also make it available to open source community. Why should one rely entirely on commercial support when using open source?

Ashu
  • 1
    Replication in MapR DB has two possible mechanisms. The first is via mirrors (as mentioned in this answer). The more commonly used mechanism is the near real-time table replication system built into the MapR platform that allows global scale multi-master table replication. – Ted Dunning May 05 '16 at 17:53
  • 2
    `"If any Hadoop distributor is enhancing hadoop - they should also make it available to open source community"` - a quick comment here. It's not really enhancing the Hadoop implementation, since it's a complete rewrite with architectural differences. Could they have made the total rewrite open source? Yes, possibly they could have. The community could possibly take and apply some of the ideas from the MapR implementation as well. – IceMan Dec 17 '16 at 01:00
  • 2
    To add to IceMan's comment, I wouldn't even describe it as a rewrite of Hadoop. It's an entirely different system that happens to expose some of the same APIs to make the transition easier. – TurnipEntropy Apr 18 '19 at 18:17