32

I just started exploring Hive. It has structures similar to an RDBMS, such as tables, joins, and partitions. As I understand it, Hive still uses HDFS for storage and is an SQL abstraction over HDFS. From this I am not sure whether Hive itself is a database solution like HBase or Cassandra, or simply a query system on top of HDFS. I don't think it is simply a query language, because it has tables, joins, and partitions.

behold
Brainchild
  • You should go through the docs more deeply. It does have joins, partitions, etc., but it is not a full-fledged real-time database at all. Check this for what [Hive is NOT](https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-WhatHiveIsNOT) – Suvarna Pattayil Nov 18 '13 at 05:57

1 Answer

49

Hive is a data warehousing package/infrastructure built on top of Hadoop. It provides an SQL dialect called Hive Query Language (HQL, also written HiveQL) for querying data stored in a Hadoop cluster. Like all SQL dialects in widespread use, HQL doesn't fully conform to any particular revision of the ANSI SQL standard. It is perhaps closest to MySQL's dialect, but with significant differences. When this answer was written, Hive offered no support for row-level inserts, updates, and deletes, and it did not support transactions, so it cannot be compared directly with an RDBMS. Hive adds extensions to provide better performance in the context of Hadoop and to integrate with custom extensions and even external programs. It is well suited to batch processing of data, for example log processing, text mining, document indexing, customer-facing business intelligence, predictive modeling, and hypothesis testing.

Hive is not designed for online transaction processing and does not offer real-time queries.
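To make the "SQL layer over files" idea a bit more concrete, here is a toy Python sketch. It is not Hive and uses none of Hive's APIs. The `metastore` dict, `create_external_table`, `scan`, and `count_by` are all hypothetical names invented for illustration, and a local temp directory stands in for HDFS. It only shows the general shape: a "table" is metadata pointing at a directory of files, and a query is a batch scan that applies the schema while reading.

```python
import csv
import tempfile
from collections import Counter
from pathlib import Path


def create_external_table(metastore, name, columns, location):
    # Hypothetical metastore entry: only a column list and a directory.
    # No data is copied or converted; the files stay where they are.
    metastore[name] = {"columns": columns, "location": Path(location)}


def scan(metastore, name):
    # Schema-on-read: the column names are applied while the files are read.
    table = metastore[name]
    for path in sorted(table["location"].glob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.reader(handle):
                yield dict(zip(table["columns"], row))


def count_by(metastore, name, column):
    # Rough analogue of: SELECT column, COUNT(*) FROM name GROUP BY column
    # It always reads every file, which is why this style suits batch work.
    return Counter(row[column] for row in scan(metastore, name))


def demo():
    with tempfile.TemporaryDirectory() as warehouse:
        # Pretend some other job already wrote these files to storage.
        Path(warehouse, "part-0.csv").write_text("alice,GET\nbob,POST\n")
        Path(warehouse, "part-1.csv").write_text("alice,GET\n")

        metastore = {}
        create_external_table(metastore, "logs", ["user", "method"], warehouse)
        return count_by(metastore, "logs", "method")


if __name__ == "__main__":
    print(dict(demo()))
```

Every query in this sketch rescans all the files, which is a simplified picture of why a system built this way fits batch analysis better than low-latency lookups.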

behold
Sandeep Singh
  • Note that [HIVE-5317 - Implement insert, update, and delete in Hive with full ACID support](https://issues.apache.org/jira/browse/HIVE-5317) is actively being worked on; see [Adding ACID to Apache Hive](http://hortonworks.com/blog/adding-acid-to-apache-hive/) – Remus Rusanu Nov 17 '13 at 14:11
  • Thank you @RemusRusanu for the updated information. – Sandeep Singh Nov 17 '13 at 15:26
  • This feature has since made it into Hive. – Matt Johnson Oct 27 '16 at 15:31
  • But is it a query language (which translates to MapReduce in the background) or a database? I am still not sure whether Hive is a database. – Vibha Oct 08 '19 at 07:52
  • It's `not` a database. `It's an interface` to access a file-based database (HDFS). – Koushik Roy Apr 08 '21 at 07:40