0

I have a SQL aggregate function that will get data for every unique normalised_brand, everything works except for my field 'brand_gap', in this query, the brand 'Richell' has 5 results, each result has the brand_gap column set to equal a string 'no', how can I get a single string result from the brand_gap column in my aggregate function?

SELECT 
    normalised_brand, 
    COUNT(DISTINCT merch1) merch1_distinct_count,
    COUNT(DISTINCT category_level_1) category_level_1_distinct_count,
    COUNT(*) product_distinct_count,
    CONCAT(CAST(MIN(effective_price) as varchar(10)),' - ', CAST(MAX(effective_price) as varchar(10))) price_range,
    null amazon_choice,
    CAST(ROUND(COALESCE(AVG(rating),0),2) as varchar(10)) rating,
    CAST(COALESCE(SUM(review_count),0) as varchar(10)) review_count,
    SUM(CAST(questions_count AS INTEGER)) Q_and_A
    -- CONCAT(CAST(brand_gap as varchar(10)),' - ') brand_gap,
    -- FIRST (brand_gap)
FROM  
    "scoring"."final_data" 
WHERE 
    product_gap = 'yes' 
    AND store_name = 'petco' 
    AND normalised_brand = 'Richell'
GROUP BY 
    normalised_brand

So the brand_gap column will always have one value for all the results, how can I get a single brand_gap value? like pick the most commonly occurring value? or pick a random value? or pick the first results value?

Thanks

Piotr Findeisen
  • 19,480
  • 2
  • 52
  • 82
Martin
  • 1,336
  • 4
  • 32
  • 69

3 Answers3

1

You tagged the wrong DBMS. While in MySQL you would use ANY_VALUE, you can use MIN or MAX in Amazon Athena. Which doesn't matter, as the value is the same for all rows in the group.

SELECT 
    normalised_brand, 
    ...
    MIN(brand_gap) AS brand_gap
...
If you just want any value (as they are all the same for the group) use `ANY_VALUE`:
SELECT 
    normalised_brand, 
    ...
    ANY_VALUE(brand_gap) AS brand_gap
...
Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
  • tried adding ```ANY_VALUE(brand_gap) AS brand_gap``` but got the error ```SYNTAX_ERROR: line 11:5: Function any_value not registered``` i am on aws athena btw – Martin Jul 20 '20 at 07:56
  • 1
    You tagged your request `mysql`, but obviously you are not using MySQL. You can use `MIN` or `MAX` then instead. Which doesn't matter, as they return the same value. I've changed the request tag for you. – Thorsten Kettner Jul 20 '20 at 07:58
0

You can use FIRST_VALUE or FIRST function to fetch first from the group by result based on database.

Refer http://www-db.deis.unibo.it/courses/TW/DOCS/w3schools/sql/sql_func_first.asp.html for database specific first value

SELECT 
    normalised_brand, 
    ...
    FIRST_VALUE(brand_gap) AS brand_gap
...

Other alternatives can be found in Selecting first and last values in a group

Naruto
  • 4,221
  • 1
  • 21
  • 32
  • `FIRST_VALUE` is usually a window function and hence not appropriate here. `FIRST` is MS Access and would work here, if it were about MS Access. The w3schools docsumentation for `FIRST`, though, is extremely flawed (as their docs are so often, unfortunately). – Thorsten Kettner Jul 20 '20 at 08:20
0

In Athena there is an aggregate function called ARBITRARY which picks a value from the group, and seems semantically suitable in this situation.

Theo
  • 131,503
  • 21
  • 160
  • 205