Here's the scenario: the table record is as follows:
A | B | C
1 | 1 | 1
2 | 1 | 1
3 | 1 | 1
4 | 1 | 2
5 | 1 | 2
6 | 1 | 3
The result of the HQL query
select * from record where B = 1 and C < 3 limit 2
would be:
A | B | C
1 | 1 | 1
2 | 1 | 1
But what I want is:
A | B | C
1 | 1 | 1
2 | 1 | 1
4 | 1 | 2
5 | 1 | 2
That is, I want to limit the number of records for each value matched by the condition (here, each value of C), not the total number of records returned.
I really need this to be done in Hive alone. Could anyone give me an idea? Thanks a lot!
To summarize: here's a nice way to solve this problem: http://ragrawal.wordpress.com/2011/11/18/extract-top-n-records-in-each-group-in-hadoophive/
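
For reference, here is a minimal sketch of one way to get "at most N rows per group" directly in HQL. It is not necessarily the method used in the linked post. It assumes a Hive version that supports windowing functions (0.11 or later), that the groups are the distinct values of C, and that "first" within a group means the smallest A. The alias names ranked and rn are only illustrative.

```sql
-- Number the rows within each value of C, then keep the first 2 per group.
-- Assumes Hive 0.11+ (windowing functions); "ranked" and "rn" are illustrative names.
SELECT A, B, C
FROM (
  SELECT A, B, C,
         row_number() OVER (PARTITION BY C ORDER BY A) AS rn
  FROM record
  WHERE B = 1 AND C < 3
) ranked
WHERE rn <= 2;
```

With the sample table above, this keeps the rows with A = 1, 2, 4 and 5. The order of the output rows is not guaranteed unless an ORDER BY is added to the outer query.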