1

I have a table of football results:

    CREATE TABLE matches(
    season NUMBER(4),
    matchDate DATE,
    homeTeam VARCHAR2(25),
    awayTeam VARCHAR2(25),
    homeGoals NUMBER(2),
    awayGoals NUMBER(2),
    totalGoals NUMBER(3));

For each row, I wish to update the totalGoals column. totalGoals is calculated by adding the goals scored (homeGoals + awayGoals) in the 5 most recent matches where the homeTeam played at home to the goals scored in the 5 most recent matches where the awayTeam played away.

It calculates goals scored from using only matches from the same season. It does NOT include goals scored in the current row. If either team has not played the required number of matches in the season, totalGoals remains NULL.

I can update this using PL/SQL, but is there a way to do this using only SQL?

Tams
  • 182
  • 1
  • 1
  • 8
  • Is this some kind of homework? Because wanting SQL is somewhat weird. You would need to run it on a server of some type. Or you want to support multiple server types (MS SQL, MySQL, etc)? – Marcel N. Jun 16 '12 at 11:15
  • 1
    Can you give some sample data? Your explanation is really confusing –  Jun 16 '12 at 11:25
  • 1
    @marceln - if you mean schoolwork, then no. I left school a long time ago. Yes, I'm weird. Well spotted. Are you any good with SQL? – Tams Jun 16 '12 at 11:27
  • I can give you an example that uses the ranking functions in TSQL (Microsoft's version), but not in the standard SQL (SQL 92). – Marcel N. Jun 16 '12 at 11:30
  • @a_horse_with_no_name. I love that song! Trying to post sample data, but hit by limits. Will try again. – Tams Jun 16 '12 at 12:05
  • You're probably looking at a TOP-n-GROUP-BY pattern. IMHO, unless you're actually interested in the theory, your time is better spent parsing it row by row, then having your application properly calculate the totalGoals value on every new insert. That is, of course, dependant on this question actually requiring a practical solution. –  Jun 16 '12 at 12:07
  • @Geoist - Yes I'm very interested in he theory. Can it be done using SQL alone? – Tams Jun 16 '12 at 12:11
  • Yes, http://stackoverflow.com/questions/1442527/how-to-select-the-newest-four-items-per-category/1442867#1442867 –  Jun 16 '12 at 12:12
  • @horse_with_no_name. INSERT INTO matches VALUES ('2012','26-DEC-11','Chelsea','Fulham',1,1,null); INSERT INTO matches VALUES ('2012','31-DEC-11','Chelsea','Aston Villa',1,3,null); INSERT INTO matches VALUES ('2012','04-JAN-12','Newcastle','Man United',3,0,null); INSERT INTO matches VALUES ('2012','22-JAN-12','Arsenal','Man United',1,2,null); INSERT INTO matches VALUES ('2012','31-JAN-12','Everton','Man City',1,0,null); INSERT INTO matches VALUES ('2012','05-FEB-12','Chelsea','Man United',3,3,null); To simplify, lets say 2 most recent matches. For Chelsea v Man United, it would be 1+1+2+0=4 – Tams Jun 16 '12 at 12:19
  • @Tams: Please use something like this to share your information if it's too large to put in a comment.- https://gist.github.com/ –  Jun 16 '12 at 12:27
  • @Geoist and horse_with_no_name. Full season data here https://gist.github.com/2941229. Enjoy! Not sure how your previous solution fits with this query. Can you modify it? I can't use LIMIT in Oracle. – Tams Jun 16 '12 at 12:40
  • I've submitted my actual solution below. I'm just unsure about the join. Are you tallying only when the two teams play eachother, or are you including other teams as long as they play home / away conssitently.. –  Jun 16 '12 at 12:45
  • @Geoist. Example. Last match of season was Norwich v Aston Villa. I want to add the goals in Norwich's 5 previous home matches to Aston Villa's 5 previous away matches. Norwich's 5 previous home matches include 28-Apr which had 3 goals to 11 Mar which had 2 goals. Aston Villa's 5 most recent matches date from 28-Apr which had 0 goals to 3-March with 2 goals. In total those combined 10 matches contained 30 goals, which is the value I want to update on the row dated 13-May-2012. – Tams Jun 16 '12 at 13:17
  • See below for addition comment – Tams Jun 16 '12 at 13:35

2 Answers2

2

If I've understood what you want, you can do this with analytic functions and windowing clauses.

select season, matchdate, hometeam, awayteam, homegoals, awaygoals,
    case when home_cnt >= 5 and away_cnt >= 5 then
        home_tot + away_tot
    else null end as totalgoals
from (
    select season, matchdate, hometeam, awayteam, homegoals, awaygoals,
        count(*) over (partition by season, hometeam
            order by matchdate
            rows between 5 preceding and 1 preceding) as home_cnt,
        sum(homegoals + awaygoals) over (partition by season, hometeam
            order by matchdate
            rows between 5 preceding and 1 preceding) as home_tot,
        count(*) over (partition by season, awayteam
            order by matchdate
            rows between 5 preceding and 1 preceding) as away_cnt,
        sum(homegoals + awaygoals) over (partition by season, awayteam
            order by matchdate
            rows between 5 preceding and 1 preceding) as away_tot
    from matches
)
order by season, matchdate, hometeam, awayteam;

The inner select calculates the number of matches and the total number of goals across them, for each home/away team in each season, using the analytic version of count and sum, and the window clause rows between ... limits both to the previous five, excluding the current row, which I think is what you want. The outer select then adds the relevant totals together for the two teams in the current row, but checks both counts and leaves the total null if either is < 5. Note that it only hits the matches table once.

With an additional filter immediately before the order-by:

where season = 2012 and homeTeam = 'Norwich' and awayteam = 'Aston Villa'

... you get:

    SEASON MATCHDATE HOMETEAM                  AWAYTEAM                   HOMEGOALS  AWAYGOALS TOTALGOALS
---------- --------- ------------------------- ------------------------- ---------- ---------- ----------
      2012 13-MAY-12 Norwich                   Aston Villa                        2          0         30

You could use this to update the table for the matching row, though generally I'd calculate it as needed to avoid potential data integrity errors, possibly in a view.

Alex Poole
  • 183,384
  • 11
  • 179
  • 318
  • You the man! It works for several that I've checked AND it's bloody fast. Thanks and well done! How do I select your answer as the correct one? – Tams Jun 16 '12 at 15:54
  • According to the [FAQ](http://stackoverflow.com/faq#howtoask) there should be a grey tick under the upvote/downvote buttons on the left; just click that and it will go green. There may be a time delay before new users can see/click it though, I'm not sure. (Feel free to up-vote as well, of course ). – Alex Poole Jun 16 '12 at 16:01
  • I've given it the green tick, but it won't allow me to vote. I have a very poor reputation! – Tams Jun 16 '12 at 16:10
0

I tried the following on MySQL, but it failed with "Error Code: 1093. You can't specify target table 'm' for update in FROM clause".

Maybe you can try it with Oracle. You'll have to replace the "limit 0,5" clause with Oracle's specific way of limiting the number of rows. I think they use "where rownum < 6".

update matches m
  set totalGoals = (select sum(homeGoals) + sum(awayGoals) 
                      from matches
                      where homeTeam = m.homeTeam
                        and season = m.season
                        and matchDate < m.matchDate
                      having count(matchDate) > 4
                      order by matchDate limit 0,5) +
                   (select sum(homeGoals) + sum(awayGoals) 
                      from matches
                      where awayTeam = m.awayTeamm
                        and season = m.season
                        and matchDate < m.matchDate
                      having count(matchDate) > 4
                      order by matchDate limit 0,5);
Henrique Ordine
  • 3,337
  • 4
  • 44
  • 70
  • That solution is counting the current game's score, which it shouldn't do according to the OP. And.. it's not ignoring when teams haven't played at least 5 games, which it should. –  Jun 16 '12 at 12:47
  • These are the matches I'm attempting to update..... UPDATE (select m1.matchdate, m1.hometeam, m1.awayteam, m1.totalGoals from matches m1 where 5 <= (select count(0) from matches m2 where M1.season = m2.season and m1.hometeam = m2.hometeam and m2.matchdate < m1.matchdate group by M2.season, m2.hometeam) and 5 <= (select count(0) from matches m2 where M1.season = m2.season and m1.awayteam = m2.awayteam and m2.matchdate < m1.matchdate group by M2.season, m2.awayteam)) SET totalGoals = .... – Tams Jun 16 '12 at 13:36
  • @Geoist Just change <= for < for it to stop counting the current record being updated. – Henrique Ordine Jun 16 '12 at 15:09
  • I've edited it to cover the requirements. You still need to check whether oracle will accept this type of correlated query. – Henrique Ordine Jun 16 '12 at 15:19
  • @Henrique. Rownum can be used to restrict the number of rows returned from querying a single table. It wouldn't be a problem to retrieve the 5 most recent matches from the table. This is how it's done. select * from (select * from matches order by matchdate desc) where rownum <= 5; The difficulty is retrieving 5 rows for each individual row in the table. – Tams Jun 16 '12 at 15:37