How to reduce the number of files generated by "ALTER TABLE/PARTITION CONCATENATE" in Hive?

Question

Hive version: 1.2.1

Configuration:

set hive.execution.engine=tez;
set hive.merge.mapredfiles=true;
set hive.merge.smallfiles.avgsize=256000000;
set hive.merge.tezfiles=true;

HQL:

ALTER TABLE `table_name` PARTITION (partion_name1 = 'val1', partion_name2='val2', partion_name3='val3', partion_name4='val4') CONCATENATE;

I use this HQL to merge the files of a specific table/partition. However, after execution there are still many files in the output directory, and their sizes are far less than 256000000 bytes. How can I decrease the number of output files?

BTW, using MapReduce instead of Tez didn't work either.

score -2 · Answer 1 · answered Apr 19 '16 at 13:20

You can set the number of reducers to 1 so that the job creates only one output file.

You can do it with the following setting:

set mapred.reduce.tasks=1;

answered Apr 19 '16 at 13:20 by Ducaz035

Please check the remark in the question: "BTW, using MapReduce instead of Tez didn't work either." So he may use MapReduce as well if he wants to. In addition, you can also use the configuration above with Tez. – Ducaz035 Apr 19 '16 at 13:31
I can also assure you that it does solve the issue. Maybe Tez is a slightly different story, but it does work for MapReduce, and that is what the user asked about. – Ducaz035 Apr 19 '16 at 14:00
I tried it just now and ended up with 25 files. Moreover, the triggered MapReduce job is a map-only job. Maybe you are using a different Hive version. I'm using Hive 1.2.1 and the files are ORC. Under these conditions, your solution doesn't work. – mgaido Apr 19 '16 at 14:09
Well, can you please try setting the number of mappers to 1? – Ducaz035 Apr 19 '16 at 14:21
Well, then I am out of ideas, sorry about that. – Ducaz035 Apr 19 '16 at 14:26
This does not work. – pavel_orekhov Jul 22 '23 at 23:05

score -2 · Answer 2 · edited Aug 16 '17 at 04:47

Maybe you can try INSERT OVERWRITE TABLE ... PARTITION ( ... ) SELECT * FROM ...

Unlike CONCATENATE, this statement can use the merge setting for Tez files (hive.merge.tezfiles).
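A minimal sketch of what such a rewrite might look like, reusing the settings and partition spec from the question. The column names `col1` and `col2` are placeholders (the question does not show the table's schema), and `hive.merge.size.per.task` is an extra setting not mentioned in the question. This has not been verified against Hive 1.2.1 with ORC files, so try it on a copy of the partition first.

```sql
-- Sketch only: table, column, and partition names are placeholders.
set hive.execution.engine=tez;
-- Merge small files at the end of a Tez job.
set hive.merge.tezfiles=true;
-- Start the merge step when the average output file size is below this value (bytes).
set hive.merge.smallfiles.avgsize=256000000;
-- Target size of each merged file (bytes).
set hive.merge.size.per.task=256000000;

-- Rewrite the partition onto itself. The non-partition columns are listed
-- explicitly, because SELECT * would also return the partition columns and
-- would not match the target of a static-partition insert.
INSERT OVERWRITE TABLE table_name
PARTITION (partion_name1='val1', partion_name2='val2', partion_name3='val3', partion_name4='val4')
SELECT col1, col2
FROM table_name
WHERE partion_name1='val1'
  AND partion_name2='val2'
  AND partion_name3='val3'
  AND partion_name4='val4';
```

Unlike CONCATENATE, this runs a full query that reads and rewrites every row, so it is likely to be slower, and whether the file count actually drops may depend on the Hive version and file format.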

edited Aug 16 '17 at 04:47 by Fabien

answered Aug 16 '17 at 00:38 by heyhey
