I am new to Hadoop and Hive, and I am developing a reporting solution. The problem is that query performance is really slow (Hive 0.10, HBase 0.94, Hadoop 1.1.1). One of the queries is:
select a.*, b.country, b.city
from p_country_town_hotel b
inner join p_hotel_rev_agg_period a on (a.key.hotel = b.hotel)
where b.hotel = 'AdriaPraha' and a.min_date < '20130701'
order by a.min_date desc
limit 10;
which takes quite a long time (50 s). I know the join is on a string field rather than an integer, but the data sets are not big (about 3,300 and 100,000 records). I tried hints on this SQL, but that didn't make it any faster. The same query on MS SQL Server takes 1 s. Even a simple count(*) on a table takes 7-8 s, which is shocking (the table has 3,300 records). I really don't know what the issue is. Any ideas, or have I misunderstood Hadoop?
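
For context, a map-join hint on this query would look roughly like the sketch below. It only illustrates the kind of hint meant above and is not necessarily the exact statement that was run. It assumes that p_country_town_hotel (alias b) is the smaller table, which the question does not state. The explain statement is included because its output lists the stages Hive plans to run, which may help show where the time goes.

```sql
-- Sketch only: assumes b (p_country_town_hotel) is the small table.
-- The MAPJOIN hint asks Hive to load b into memory and join on the map side.
select /*+ MAPJOIN(b) */ a.*, b.country, b.city
from p_country_town_hotel b
inner join p_hotel_rev_agg_period a on (a.key.hotel = b.hotel)
where b.hotel = 'AdriaPraha' and a.min_date < '20130701'
order by a.min_date desc
limit 10;

-- Show the execution plan (stages) without running the query.
explain
select count(*) from p_country_town_hotel;
```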