0

I want to translate a question to sql type.The question is this: "Most successful SKU in terms of revenue (in USD) per country" So i need the best selling product for each country

My data look like this:

,Platform,Userid,In-app Product,Country,Currency,Timestamp,Transaction Receipt,Price
0,IOS,5.669802981,offer4,US,USD,28/03/2017 02:30,240200367307.75848,1.99
1,ANDROID,6.831350998,offer4,US,USD,27/03/2017 23:23,436138823984.4136,3.99
2,ANDROID,6.831233793,pack2.cash,CA,USD,27/03/2017 18:32,192318559985.1093,4.112

This is what i have tried

select Sum("Price"),"In-app Product","Country" from cleandataset group by "Country","In-app Product" order by "sum" desc,"Country" 

The problem is i dont know how to get only the first row from each group.Any help would be appreciated!

My query right now produce something like this:

sum                In-app Product Country
2578.591999999975   "offer4"    "US"
1917.3325600000014  "offer10"   "US"
1885.1750599999978  "pack4.shields" "US"
1387.155899999998   "pack3.shields" "US"
1200.6980999999992  "offer10"   "FR"
1150.638300000004   "offer4"    "FR"
811.1277999999998   "pack4.shields" "FR"
790.5170000000022   "offer4"    "IT"
773.3864000000004   "offer10"   "ES"
763.307199999999    "pack2.shields" "US"
735.6938099999996   "pack3.cash"    "US"
702.2611000000019   "offer4"    "ES"
701.3979000000004   "offer10"   "IT"
675.4060400000003   "offer15"   "US"
672.7070000000007   "pack3.shields" "FR"
660.80911   "pack5.shields" "US"

So i think i need to select the Product with the biggest sum in each country

5 Answers5

2
  • Calculate sum per app product and country
  • Create a rank based on sum calculated in step 1 in the DESC order which means we are ranking from top to bottom starting with the largest sum valye
  • choose rank = 1 to get rows with the highest sum, eliminating the duplicates
with total as (
select 
    Country,app_product, sum(price) over(partition by country,app_product) as total
from [table]  
)
,
ranking as (
select
 *, row_number() Over(Partition by country by total desc) as rank_
from total_
)
select 
country, app_product, total from ranking where rank_ = 1
trillion
  • 1,207
  • 1
  • 5
  • 15
  • Unfortunatly its doent run.I use pgadmin for the queries.It says that it have synstax error near of the select – xaroulis gekas Sep 06 '21 at 15:56
  • @xaroulis gekas: In case of ties this query picks one of the top products per country arbitrarily. If you want to show all top #1 products, then replace `ROW_NUMBER` by `RANK` or `DENSE_RANK`. – Thorsten Kettner Sep 07 '21 at 04:41
1

You can resolve it using a ROW_NUMBER function with PARTITION BY Product, Country and select the ROW_NUMBER = 1

See https://www.postgresqltutorial.com/postgresql-row_number/

1

You were almost there. Use your query and distinct on :

select distinct on (country) * 
from 
(
  select country, app_product, sum(price) as "sum" 
  from the_table
  group by country, app_product
) as t
order by country, "sum" desc;
Stefanov.sm
  • 11,215
  • 2
  • 21
  • 21
1

While PostgreSQL has DISTINCT ON to get only one row per group, this often doesn't suffice, because it cannot deal with ties. If two products in a country have the same top total, then you'll probably want to show both.

You can use this standard SQL query instead:

select
  country,
  in_app_product,
  sum(price) as total
from cleandataset
group by country, in_app_product
order by rank() over (partition by country order by total desc)
fetch first row with ties;

Or this:

select country, in_app_product, total
from
(
  select
    country,
    in_app_product,
    sum(price) as total,
    max(sum(total)) over (partition by country) as max_total
  from cleandataset
  group by country, in_app_product
)
where total = max_total;

And there are still other ways to do this.

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
1

You can use distinct on directly with group by:

select distinct on ("Country") Sum("Price"), "In-app Product", "Country"
from cleandatase
group by "Country", "In-app Product"
order by "Country", Sum("Price") desc;

Note: As Thorsten points out, if there are ties and you want all the ties, then distinct on is not the simplest solution.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786