1

I wish to update some rows of a table, univ_members, distinguished by their universe #, with their percentile or rank. That pctile value is calculated within a subquery. The subquery works fine by itself, but when fed to an update, fails with "Subquery returns more than one row". Line 23 in the code block is the offending expression. (This bit of code was gleaned from here (stackoverflow), here (a poster's blog) and ultimately (I should think) from a book, "High Performance MySQL", Schwartz, Zaitsev, Tkachenko, p. 648 et seq. in the 3rd edition)

I can't get it to work with the update. I'm beginning to suspect, after reading maybe 30 different posts here concerning update and "Subquery returns more than one row" that this isn't possible in MySQL in a single complex query, rather, I need to write a proc to do it via a loop. Can this be done in MySQL in a query or is a loop required? If with a query, how to do it? I want to mention that the fact that this query uses no self-join is very valuable for me, for performance. My production table starts with 112,000 recs and is likely to grow to 4 million.

Line 11 is where the pctile value is calculated. Line 13 is where the counter variables are initialized (once only, evidently). The inner select orders the set by std_dev_rank (line 15), the standard deviation of each of the pid's $vol (dollar-volume), both calculated previously. @curr is initially set to the highest of them (line 10), then on 13, compared with the previous one and is thus given a rank accordingly. Some simple arithmetic ensues. I omit variable definitions and other details necessary for you to try this yourselves, but I do include the univ_members table (line 19).

In case you're interested, this is out of my stock momentum/price estimation and prediction program. $-volume is a measure of liquidity. This is for preparing to select stocks (the pids) that are good candidates for further examination. (Each 'universe' is stocks across time, often overlapping. That's why the table can grow beyond the number of stocks in existence.)

1  create temporary table um like univ_members;
2  insert into um select * from univ_members where id_universe = @univ_num;
3
4  update um
5    set pctile =
6    (
7    select pctile from(
8      select
9        @prev := @curr,
10       @curr := std_dev_rank,
11       (100- ((@rank := IF(@prev= @curr, @rank, @rank+1))* @recs_per_pctile)) AS pctile
12       from univ_members
13         ,(select @curr := null, @prev := null, @rank := 0) x
14       where id_universe = @univ_num
15       order by std_dev_rank desc) q
16   where id_universe = @univ_num);
17
18 -- the univ_members table
19 create table univ_members
20 (
21   id_universe    smallint(5) unsigned not null,
22   pid            char(8) not null,
23   $vol           double unsigned default null,
24   coef_cor       double default null, -- 'distance' from median $vol
25   pctile         double unsigned default null, -- another 'distance' measure
26   std_dev_rank   double default null, -- individual std dev bracket
27
28   primary key pk_univ_members (id_universe, pid),
29   key k_$vol ($vol)
30 ) engine=InnoDB;
Community
  • 1
  • 1
Shellsunde
  • 76
  • 10
  • If you run the subquery by itself, how many rows does it return? Which one should be used in the update? – Barmar Jun 19 '14 at 23:09
  • By itself, the subquery returns the numbers of rows where id_universe = @univ_num: For my 2-universe test db, either 4 or 5. I want each row of the returned set, which is unique over pid and contains one field per row, pctile, to update the um rec to which it corresponds by pid. Looks like I didn't actually do that in the update or query, either. – Shellsunde Jun 20 '14 at 01:03

1 Answers1

4

You need to have the subquery return both the primary key and the percentiles. Then you can join with the subquery and update the column in the corresponding rows:

UPDATE um
JOIN (select id_universe, pid,
             @prev := @curr,
             @curr := std_dev_rank,
             (100- ((@rank := IF(@prev= @curr, @rank, @rank+1))* @recs_per_pctile)) AS pctile
      from univ_members
           ,(select @curr := null, @prev := null, @rank := 0) x
      where id_universe = @univ_num
      order by std_dev_rank desc) q
SET um.pctile = q.pctile
WHERE um.id_universe = q.id_universe AND um.pid = q.pid
Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Very good, much appreciated. I don't properly understand what you did. Would you please explain how there is room for anything between 'um', the name of the table to update and the 'set' operator? In the MySQL docs, it just says 'table_reference', which implies one such table only. It never entered my mind to put the results of a join in that position, either, so that's a great innovation for me. What does it do, make available to the update the columns id_universe and pid as a table? How does the execution proceed? – Shellsunde Jun 20 '14 at 01:15
  • 1
    Look at the second syntax diagram, labeled "Multiple-table syntax". It says ___table_references___. That plural allows you to use a `JOIN`. Further down, it says "multiple-table UPDATE statements can use any type of join permitted in SELECT statements" – Barmar Jun 20 '14 at 01:25
  • Aha, OK. That reveals my lack of understanding. I thought 'multiple tables' meant updating multiple table in one update and I didn't suspect that included what you did. So now, explain please how this works. You present two tables to the update operator and then instruct it to take values from one into the other? Is that what 'multiple-table syntax' means? Or not entirely? – Shellsunde Jun 20 '14 at 01:39
  • Yes, that's exactly it. Basically, any values you can compute with a `SELECT` query can be joined with the target table, to get the values to assign. – Barmar Jun 20 '14 at 02:03
  • Thanks very much. This is a great learning experience for me. I haven't seen a book or post yet that explained SQL like this. The way to understand SQL, it seems, is to grasp that it is a formalization of presenting sets, their attributes and operations, and their consequent interactions. – Shellsunde Jun 20 '14 at 02:33
  • This site explains it exactly that way: http://blog.codinghorror.com/a-visual-explanation-of-sql-joins/ – Barmar Jun 20 '14 at 14:20