-1

I would like to achieve SELECT TOP PERCENT in MySQL.

I used Victor Sorokin's idea in Select TOP X (or bottom) percent for numeric values in MySQL, and got the following query:

SELECT x.log AS Login, 
       AVG(x.PROFIT) AS 'Expected Shortfall', 
       MAX(x.PROFIT) AS '40%VaR'
  FROM
  (SELECT t.PROFIT, 
          @counter := @counter +1 AS counter, 
          t.LOGIN AS log 
     FROM (SELECT @counter:=0) initvar, trades AS t
     WHERE t.LOGIN IN (100,101)
     ORDER BY t.PROFIT) AS x
  WHERE x.counter <= (40/100 * @counter)
GROUP BY x.log

Which return the following result:

Login Expected Shortfall 40%VaR
101 -85 -70

This works when I change WHERE t.LOGIN IN (100,101) to a single value like WHERE t.LOGIN=100. Whereby it will return me values for each login as following:

Login Expected Shortfall 40%VaR
100 -4.5 -4
Login Expected Shortfall 40%VaR
101 -95 -90

I'm not really sure what is happening and I was wondering if there is a way to use the query for multiple accounts or there is a better way to solve the issue? Was thinking of a LOOP statement?

I'm currently using MySQL version 5.7.34. Please do not hesitate to let me know if any clarification is needed. Any ideas would be much appreciated!

Edit: To replicate the issue:

CREATE TABLE trades (
TICKET int(11) PRIMARY KEY,
LOGIN int(11),
PROFIT double)

INSERT INTO trades (TICKET,LOGIN,PROFIT)
VALUES
(1,100,-5),
(2,100,-4),
(3,100,-3),
(4,100,-2),
(5,100,-1),
(6,101,-100),
(7,101,-90),
(8,101,-80),
(9,101,-70),
(10,101,-60),
(11,101,-50),
(12,101,500)

The expected output is just like the outputs you would get if you ran the query for 100 and 101 separately:

Expected Output

LOGIN ES 40%VAR
100 -4.5 -4
101 -95 -90

Expected Output

FanoFN
  • 6,815
  • 2
  • 13
  • 33
KevinK9899
  • 13
  • 4
  • 1
    Please [edit your question](https://stackoverflow.com/posts/67978388/edit) and provide a [Minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) .. e.g. sample data and expected output. – FanoFN Jun 15 '21 at 01:29
  • Assigning @variables inside a `SELECT` is deprecated. You have found one of the reasons for such. – Rick James Jun 15 '21 at 03:56
  • Thanks @FaNo_FN, I have edited the question and provided a reproducible example. Please note I have changed 1/100 * at counter to 40/100 * at counter to make the example minimal. Please let me know if there is anything else you would like me to clarify. – KevinK9899 Jun 16 '21 at 00:03
  • Okay... The solution I referred to was posted like 8 years ago so that is why. Thank you @RickJames – KevinK9899 Jun 16 '21 at 00:05
  • What MySQL version are you using? You can run `SELECT @@version` to check. – FanoFN Jun 16 '21 at 00:32
  • Server: version 5.7.34 ; Local: 8.0.21 @FaNo_FN – KevinK9899 Jun 16 '21 at 01:45
  • You mentioned that the query works on single login, I assume on the same data sets. Can you provide details about the query you use and the result you got when you do on single login? – FanoFN Jun 16 '21 at 02:31
  • Sure. I just used the query in the question, but changed the where clause to: WHERE t.LOGIN IN (100). And the output is LOGIN, E.S., 40%VaR // 100, -4.5, -4 -- The same with the expected output Does it make sense? – KevinK9899 Jun 16 '21 at 02:45
  • Here take a look at this fiddle https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=4b484efb89bb3c321b4dd2335d818892 – FanoFN Jun 16 '21 at 03:08
  • Wow, I think this would work - thanks for that @FaNo_FN But it seems like ROW_NUMBER is not supported "(" is not valid at this position. Possibly because the server version is still 5.7? – KevinK9899 Jun 16 '21 at 19:45
  • Yes, ROW_NUMBER() in MySQL only supported from v8 onwards and on MariaDB v10.2 onwards. I did a quick search and actually found a way to emulate the [adding ROW_NUMBER() to each group in older MySQL version](https://www.mysqltutorial.org/mysql-row_number/). I've tested it and it works, [see this fiddle update](https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=4c291366ef58acc955e3a4514a2f0547). – FanoFN Jun 17 '21 at 00:47
  • Thanks, @FaNo_FN. I figured a similar way from your previous fiddle. Now, how do I get this question reopened so that I can select your answer? – KevinK9899 Jun 21 '21 at 20:36
  • You'll need three votes to reopen, me and another person have already voted so you only need one more person with "open vote privilege" to vote. – FanoFN Jun 22 '21 at 00:33
  • I look forward to October 2023, when [MySQL 5.7 reaches its end-of-life](https://endoflife.date/mysql). Then we can tell people with this type of question, "you need to upgrade — really." Even sooner is [MariaDB 10.2's end-of-life](https://endoflife.date/mariadb), in May 2022. – Bill Karwin Jun 28 '21 at 03:09
  • @BillKarwin I wish I was in charge -- really. – KevinK9899 Jun 29 '21 at 23:56

1 Answers1

0

The reason why the end result was not according to the single value queries was caused by the @row_number assignment. Taking the base query (the subquery) to run alone will return the following results:

PROFIT counter log
-100 1 101
-90 2 101
-80 3 101
-70 4 101
-60 5 101
-50 6 101
-5 7 100
-4 8 100
-3 9 100
-2 10 100
-1 11 100
500 12 101

As you can see, the counter value that was generated using @row_number is giving a running number for all of the data in the table regardless of it's log value. The result below shows the differences with query that using a single log value:

PROFIT counter log
-5 1 100
-4 2 100
-3 3 100
-2 4 100
-1 5 100

Here you can see that if using log=100, you'll get a counter (@row_number) generated from 1-5 as opposed to it being generated from 7-11 in the combined log IN (100,101). This is why WHERE x.counter <= (40/100*v.ctr) in the final query only take log=101 because it's the only one matches the condition. What you're looking for is a counter value separated by log. On MySQL 8.0+ (or MariaDB 10.2+) that support window function, this can be done by using ROW_NUMBER(). However, since OP is using an older version, I found a way to emulate the functionality of ROW_NUMBER() accordingly.

This is the final query generated:

SELECT x.log AS Login,
        AVG(x.PROFIT) AS 'Expected Shortfall', 
       MAX(x.PROFIT) AS '40%VaR'
  FROM
  (SELECT t.PROFIT,
    @row_number:=CASE
        WHEN @id = LOGIN THEN @row_number + 1
          ELSE 1 END AS counter,
    @id:=LOGIN ID, t.LOGIN AS log 
FROM trades t 
CROSS JOIN (SELECT @id:=0,@row_number:=0) as n
ORDER BY LOGIN) AS x 
  JOIN (SELECT Login,COUNT(*) ctr FROM trades GROUP BY login) AS v
  ON x.log=v.login
  WHERE x.counter <= (40/100*v.ctr)
  GROUP BY x.log
  ORDER BY x.log;

And here is the demo fiddle (inclusive of ROW_NUMBER()) on MySQL 8.0+ query.

FanoFN
  • 6,815
  • 2
  • 13
  • 33