
Apache Drill has a nice feature of producing Parquet files out of many kinds of incoming datasets, but it seems there is not a lot of information on how to use those Parquet files later on, specifically in Hive.

Is there a way for Hive to make use of those "1_0_0.parquet", etc. files? Maybe create a table and load the data from the Parquet files, or create a table and somehow place those Parquet files inside HDFS so that Hive reads them?

  • Possible duplicate of [Dynamically create Hive external table with Avro schema on Parquet Data](http://stackoverflow.com/questions/34181844/dynamically-create-hive-external-table-with-avro-schema-on-parquet-data) – Ani Menon Jan 13 '17 at 04:19
  • Unfortunately Apache Drill does not create Avro schema, are you suggesting that I manually create one? – Pavel Jan 13 '17 at 04:33
  • Yes.. Refer http://kitesdk.org/docs/0.17.1/labs/4-using-parquet-tools-solution.html – Ani Menon Jan 13 '17 at 04:45
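
As the comments suggest, one way to get a schema to work from is to inspect the Parquet files Drill produced. A minimal sketch using the parquet-tools CLI (the file path here is illustrative, not from the question):

parquet-tools schema /user/drill/output/1_0_0.parquet

The printed schema can then be translated by hand into Hive column definitions (or into an Avro schema, per the Kite SDK approach linked above).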

1 Answer


I have faced this problem. If you are using a Cloudera distribution, you can create the tables using Impala (Impala and Hive share the metastore); it allows creating a table directly from a Parquet file. Unfortunately, Hive doesn't allow this:

CREATE EXTERNAL TABLE table_from_file
LIKE PARQUET '/user/etl/destination/datafile1.parquet'
STORED AS PARQUET
LOCATION '/user/test/destination';
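
If Impala is not available, a common alternative is to declare the schema by hand in Hive and point an external table at the directory containing Drill's output files ("1_0_0.parquet" and friends). A minimal sketch, assuming the files live under /user/drill/output and contain the columns shown (the column names and types here are illustrative, not taken from the question):

CREATE EXTERNAL TABLE drill_output (
  id BIGINT,
  name STRING,
  created_at TIMESTAMP
)
STORED AS PARQUET
LOCATION '/user/drill/output';

Hive resolves Parquet columns by name, so the declared column names must match the field names inside the files; every Parquet file under the LOCATION directory is then readable through the table without any explicit load step.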