I am using a Hive external table to dump data as JSON. The contents of the dump files look fine, but the files Hive writes vary widely in size, from around 400 MB to 7 GB. I want each output file to have a fixed maximum size (say 1 GB), and I have not been able to achieve that. Any help would be appreciated.

My query:
INSERT OVERWRITE DIRECTORY '/myhdfs/location'
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.DelimitedJSONSerDe'
SELECT * FROM MY_EXTERNAL_TABLE;
Hive Version: Hive 1.1.0-cdh5.14.2
Hadoop Version: Hadoop 2.6.0-cdh5.14.2