We are looking for a way to create an external Hive table that reads data from Parquet files according to a Parquet/Avro schema.

In other words, how can we generate a Hive table from a Parquet/Avro schema?

thanks :)

Mehdi TAZI

1 Answer

Try the following, which uses an Avro schema:

CREATE TABLE avro_test ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS AVRO TBLPROPERTIES ('avro.schema.url'='myHost/myAvroSchema.avsc'); 

CREATE EXTERNAL TABLE parquet_test LIKE avro_test STORED AS PARQUET LOCATION 'hdfs://myParquetFilesPath';

The same question is asked in "Dynamically create Hive external table with Avro schema on Parquet Data".
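
If you would rather not create the intermediate Avro table, another option is to generate the DDL yourself from the `.avsc` file. Below is a minimal Python sketch of that idea, not a tested production tool. The helper names (`hive_type`, `create_table_ddl`) and the example schema are hypothetical, and it assumes a flat record that contains only Avro primitive types or nullable unions of them; nested records, arrays, maps, and logical types are rejected.

```python
import json

# Avro primitive types and the Hive column types they correspond to.
AVRO_TO_HIVE = {
    "string": "STRING",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
    "bytes": "BINARY",
}

# Hypothetical example schema, used only for illustration.
EXAMPLE_AVSC = """
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": ["null", "string"]}
  ]
}
"""


def hive_type(avro_type):
    """Map an Avro primitive, or a nullable union of one, to a Hive type."""
    if isinstance(avro_type, list):
        # A union such as ["null", "string"] becomes a plain nullable column.
        non_null = [t for t in avro_type if t != "null"]
        if len(non_null) != 1:
            raise ValueError(f"unsupported union: {avro_type}")
        return hive_type(non_null[0])
    if isinstance(avro_type, str) and avro_type in AVRO_TO_HIVE:
        return AVRO_TO_HIVE[avro_type]
    # Complex and logical types are out of scope for this sketch.
    raise ValueError(f"unsupported Avro type: {avro_type}")


def create_table_ddl(avsc_text, table, location):
    """Build a CREATE EXTERNAL TABLE statement from Avro schema JSON text."""
    schema = json.loads(avsc_text)
    if schema.get("type") != "record":
        raise ValueError("top-level Avro schema must be a record")
    columns = ",\n".join(
        f"  `{field['name']}` {hive_type(field['type'])}"
        for field in schema["fields"]
    )
    return (
        f"CREATE EXTERNAL TABLE {table} (\n{columns}\n)\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{location}';"
    )
```

Calling `print(create_table_ddl(EXAMPLE_AVSC, "parquet_test", "hdfs://myParquetFilesPath"))` prints a statement you can paste into Hive. The two-statement approach above remains simpler when the schema has nested types, because Hive's AvroSerDe does the type conversion for you.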

Ram Manohar
  • Can I create a table from a Parquet file directly? Or how do I get the Avro schema from a specific Parquet file? – Gary Gauh Mar 19 '17 at 16:42
  • @GaryGauh Here is my answer to your second question: using parquet tools, you can extract the Avro schema of a particular Parquet file. Please refer to this link for more details: http://kitesdk.org/docs/0.17.1/labs/4-using-parquet-tools-solution.html – JKC Jan 29 '18 at 04:21
  • It worked for me, but can I use a Parquet schema (`org.apache.parquet.schema.MessageType`) to create tables? – Vikram Gulia Aug 29 '18 at 15:37