Questions tagged [apache-hive]

Apache Hive supports analysis of large datasets stored in Hadoop's HDFS and compatible file systems such as Amazon S3. It provides an SQL-like language called HiveQL with schema on read and transparently converts queries to MapReduce, Apache Tez, or Spark jobs. All three execution engines can run on Hadoop YARN. To accelerate queries, it provides indexes, including bitmap indexes.

Key features:

1. Indexing to accelerate queries; index types include compaction and bitmap indexes as of 0.10, and more index types are planned.
2. Different storage types such as plain text, RCFile, HBase, ORC, and others.
3. Metadata storage in an RDBMS, significantly reducing the time to perform semantic checks during query execution.
4. Operating on compressed data stored in the Hadoop ecosystem, using algorithms including DEFLATE, BWT, Snappy, etc.
5. Built-in user-defined functions (UDFs) for manipulating dates and strings, plus other data-mining tools. Hive supports extending the UDF set to handle use cases not supported by built-in functions.
6. SQL-like queries (HiveQL), which are implicitly converted into MapReduce, Tez, or Spark jobs.
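As a rough illustration of the SQL-like querying, schema on read, and built-in functions mentioned above, here is a minimal HiveQL sketch. The table name `page_views`, its columns, and the HDFS path are hypothetical, chosen only for this example; the statements assume tab-delimited text files already exist at that location.

```sql
-- Hypothetical table over existing tab-delimited text files.
-- EXTERNAL means Hive only records metadata; the files stay where they are.
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  user_id   STRING,
  url       STRING,
  view_time TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/page_views';  -- assumed path, for illustration only

-- The schema is applied when the files are read, not when they are written.
-- to_date() and count() are built-in functions; Hive compiles the query
-- into jobs for whichever execution engine is configured.
SELECT to_date(view_time) AS view_date,
       count(*)           AS views
FROM page_views
GROUP BY to_date(view_time);
```

Because the table is external, dropping it should remove only the metastore entry and leave the underlying files in place.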

96 questions

votes

1 answer

Distinct on Multiple columns in Hive

Hi, does Hive support distinct on multiple columns, like select distinct(a, b, c, d) from table? If not, is there a way to achieve this?

hadoop hive apache-hive

asked Mar 13 '14 at 12:16

Bhaskar Mishra


votes

2 answers

How do I access HBase table in Hive & vice-versa?

As a developer, I've created an HBase table for our project by importing data from an existing MySQL table using a Sqoop job. The problem is that our data analyst team is familiar with MySQL syntax, which implies they can query a Hive table easily. For them, I need to…

hive hbase sqoop apache-hive

asked May 08 '15 at 15:07

Abhishek


votes

3 answers

Select all columns of a Hive Struct

I have a requirement to select all columns from a Hive struct. The Hive create table script is below. Select * from the table displays each struct as a column: select * from table. The requirement I have is to display all…

struct hive udf apache-hive hive-udf

asked Mar 16 '17 at 22:54

Abhijit Nayak

votes

2 answers

Insert timestamp into Hive

Hi, I'm new to Hive and I want to insert the current timestamp into my table along with a row of data. Here is an example of my team table: team_id int, fname string, lname string, time timestamp. I have looked at some other examples, How to…

hadoop hive apache-hive

asked Jun 16 '16 at 15:17

Frostie_the_snowman

votes

2 answers

Apache hive MSCK REPAIR TABLE new partition not added

I am new to Apache Hive. While working on an external table partition, if I add a new partition directly to HDFS, the new partition is not added after running MSCK REPAIR TABLE. Below is the code I tried: -- creating external table hive> create…

hadoop mapreduce hive apache-hive

asked Aug 03 '15 at 07:46

Green

votes

1 answer

Hive subquery in where clause (Select * from table 1 where dt > (Select max(dt) from table2) )..please suggest an alternative

I am looking for something in Hive like Select * from table 1 where dt > (Select max(dt) from table2). Obviously Hive doesn't support subqueries in the where clause, and even if I use joins or a semi join, it compares only = and not > (As far as I…

hive hiveql apache-hive

asked Jul 01 '14 at 17:21

user2957483

votes

1 answer

Get all Hive table/database creation/deletion details (audit logs)

Let's say I have a database, project. I created a table named tab1 and then later tab2. Now I dropped the table tab1. Where do I look for the logs that say I have dropped the table tab1 from database project? I would like to get the time, user…

hadoop hive apache-hive

asked Feb 01 '17 at 21:47

K S Nidhin


votes

2 answers

Is it possible to add new column partition to already existing partitioned table in hive

I have a partitioned table called employee_part. This table is partitioned by hiredate. It has metadata as given below. When I tried to add a new column partition to the employee_part table, I'm getting an error saying ALTER TABLE employee_part ADD…

hadoop apache-hive

asked Jun 17 '15 at 07:58

marjun

votes

5 answers

Spark SQL on ORC files doesn't return correct Schema (Column names)

I have a directory containing ORC files. I am creating a DataFrame using the code below: var data = sqlContext.sql("SELECT * FROM orc.`/directory/containing/orc/files`"); It returns a data frame with this schema: [_col0: int, _col1: bigint] Whereas…

apache-spark apache-spark-sql apache-hive

asked Jul 30 '16 at 13:46

Ramu Malur

votes

1 answer

Apache Hive - Single Insert Date Value

I'm trying to insert a date into a date column using Hive. So far, here's what I've tried: INSERT INTO table1 (EmpNo, DOB) VALUES ('Clerk#0008000', cast(substring(from_unixtime(unix_timestamp(cast('2016-01-01' as string), 'yyyy-MM-dd')),1,10) as…

hadoop hiveql apache-hive

asked Jun 23 '16 at 11:36

Abbas Gadhia


votes

0 answers

Is there a way to keep track of schema change in a Hive metastore?

I'm looking for a possible solution to keep track of all schema changes in a Hive metastore, such as creating a new table, adding or removing columns, changing a column type, etc. I haven't found any so far. Should I just monitor the MySQL db that stores the meta…

hive apache-hive

asked Jun 07 '16 at 02:12

piggybox


votes

2 answers

Why do I get the error "Thrift::TException=HASH(0x122b9e0)" when I try to execute a statement with Thrift::API::HiveClient?

I am trying to connect to Apache Hive from a Perl script but I'm getting the following error: Thrift::TException=HASH(0x122b9e0) I am running with Hadoop version 2.7.0, Hive version 1.1.0, and Thrift::API::HiveClient version 0.003. Here is the…

perl apache-hive

asked Mar 04 '16 at 15:33

Koushik Chandra


votes

1 answer

How to change the length of a column name in a Hive table?

I have a Hive table where the column names are longer than usual. I referred to the Hive metastore for the table definition. This is how it looks: DESCRIBE hive.columns_v2; Output: Name || Null || Type ----------- …

hive apache-hive

asked Jan 20 '16 at 22:26

trips

votes

2 answers

Can a Hive custom SerDe produce multiple rows?

I am using Hive 0.13.1 and I created a custom SerDe that is able to process a special kind of XML data. So far so good. I also created a class for the InputFormat that splits the input data. Is it possible to produce multiple rows (output) in…

hive apache-hive

asked Dec 11 '15 at 12:50

S. Walz

votes

2 answers

I'm installing Hive 2.0.0 with Hadoop 2.7.2

I'm trying to install Hive 2.0.0 with Hadoop 2.7.2, but I don't know what the problem is in my execution: parallels@ubuntu:/usr/local/apache-hive-2.0.0-bin$ ./bin/hive SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in…

apache hadoop hive apache-hive

asked Oct 14 '16 at 12:33

jjj111144444
