I have a Sqoop job that runs via an Oozie coordinator. After a major upgrade we can no longer use the Hive CLI and were told to use Beeline instead, but I'm not sure how to do that. Here is the current process.
I have a Hive script, hive_ddl.hql:
use schema_name;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.max.dynamic.partitions=100000;
SET hive.exec.max.dynamic.partitions.pernode=100000;
SET mapreduce.map.memory.mb=16384;
SET mapreduce.map.java.opts=-Xmx16G;
SET hive.exec.compress.output=true;
SET mapreduce.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
drop table if exists `table_name_stg` purge;
create external table if not exists `table_name_stg`
(
col1 string,
col2 string,
...
)
row format delimited
fields terminated by '\001'
stored as textfile
location 'my/location/table_name_stg';
drop table if exists `table_name` purge;
create table if not exists `table_name`
stored as parquet
tblproperties('parquet.compression'='SNAPPY') as
select * from schema_name.table_name_stg;
drop table if exists `table_name_stg` purge;
This is pretty straightforward: create a staging table, use it to build the final Parquet table, then drop the staging table.
The script is then called from a .sh file like this:
hive -f "$HOME/my/path/hive_ddl.hql"
I'm new to most of this and not sure what Beeline is, and I couldn't find any examples of how to use it to do the same thing my Hive CLI call does. I'm hoping it's as simple as calling the hive_ddl.hql file differently, rather than having to rewrite everything.
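To make the question concrete, here is a rough sketch of what I imagine the Beeline version of the .sh call would look like. It uses Beeline's -u (JDBC URL), -n (user name) and -f (script file) options. The host name, user name and the RUN_BEELINE switch are placeholders I made up, 10000 is only the default HiveServer2 port, and a Kerberized cluster would presumably also need a principal appended to the URL. I have not been able to confirm that this is the right approach.

```shell
#!/bin/sh
# Hypothetical connection details; replace with the real HiveServer2 values.
HS2_HOST="hs2.example.com"          # placeholder host, not a real server
HS2_PORT="10000"                    # default HiveServer2 port (assumed)
HS2_USER="${HS2_USER:-hive_user}"   # placeholder user name

# Beeline connects to HiveServer2 over JDBC instead of running Hive locally.
JDBC_URL="jdbc:hive2://${HS2_HOST}:${HS2_PORT}/schema_name"

# Same script file as before; only the launcher changes.
HQL_FILE="$HOME/my/path/hive_ddl.hql"

if [ "${RUN_BEELINE:-0}" = "1" ]; then
  # Real run: execute every statement in the file, then exit.
  beeline -u "$JDBC_URL" -n "$HS2_USER" -f "$HQL_FILE"
else
  # Dry run (default): only print the command that would be executed.
  echo "beeline -u $JDBC_URL -n $HS2_USER -f $HQL_FILE"
fi
```

If this is roughly correct, running the script with RUN_BEELINE=1 set would replace the old Hive CLI call, and hive_ddl.hql itself would stay unchanged.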
Any help is greatly appreciated.