I ran a SQL query like the following:
select count(*) from test_table where columna='a' and columnb in ('test1', 'test2')
In Impala on Cloudera it takes around 2 minutes, but in Hive it takes 20 minutes. Is this normal? If so, why does Impala run so much faster than Hive on Cloudera? And in what kinds of scenarios would Hive be faster than Impala?
Thanks.