5

I have a table in hive

S.no  Age  minutes  code  
 1     10   20     75081     
 2     11   114    75080      
 3     21   104    75180     
 4     31   124    75108    
 5     10   20     75083     
 6     11   114    75180    
 7     21   104    75180    

I want to write an hivesql/sql query that gives ranked list of total minutes spoken based on the region i.e first 4 digits of code. How should i go about it? I know that SUBSTRING() gives me the required cut, but I have not been able to go from there.

Select code, minutes as total  
from TableT   
where S.no > 1
group by code 
order by total

Edit: The ranked results based on first 4 digits of zipcode should look something like this

total code

322(i.e 104+114+104) 7518
154(i.e 20+114+20) 7508
124 7510

DJ_Stuffy_K
  • 615
  • 2
  • 11
  • 29

1 Answers1

15

Hmmm. I'm thinking you want this:

select substr(zip, 1, 4) as region, sum(minutes) as minutes,
       rank() over (order by sum(minutes) desc) as therank
from tableT
where s.no > 1
group by substr(zip, 1, 4)
order by minutes desc;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • 1
    thank you Gordon Linoff . I think this is what i was looking for. – DJ_Stuffy_K Oct 13 '15 at 22:14
  • 1
    My data is partitioned based on year month and day , so I have the three columns specifying month date and year. how do i edit the above query to query over a specific date range? I know sql If I have single date field I could use 'select date BETWEEN yyyy-mm-dd and yyyy-mm-dd' or do a ' >= <=', – DJ_Stuffy_K Oct 13 '15 at 22:19
  • 1
    @user4590025 . . .You make careful use of `year`, `month`, and `day`. For a given date, you would use `where year = 2015 and month = 10 and day = 13`, for instance. – Gordon Linoff Oct 13 '15 at 22:51
  • In you solution you have done `select substr(zip, 1, 4) as region` . so towards the end should we not use `group by region ` as opposed to `group by substr(zip, 1, 4) ` – DJ_Stuffy_K Oct 13 '15 at 23:51
  • 1
    @user4590025 . . . You can use `group by region`, I think. I tend not to remember off-hand which databases allow aliases in the `group by` and which do not. – Gordon Linoff Oct 14 '15 at 03:30