0

I have two tables:

  • match_rating, which have data on a team's performance in a match. There are naturally two tuples for every matchId (since there are two teams to each match). The PK is matchId, teamId.
  • event, which has information on events during matches. The PK is an autoincremented UID, and it contains the Foreign Keys match_id and subject_team_id as well.

Now I want to create a new view which counts how many times certain events happen in a match, for each team, with fields like this:

DB table fields

But for the life of me I cannot get around the fact that there are 1) two tuples for each match in the match_rating table, and 2) querying the event table on match_id returns events for both teams.

The closest I got was something like this:

SELECT SUM(
        CASE 
            WHEN evt.event_type_id = 101 THEN 1
            WHEN evt.event_type_id = 111 THEN 1
            WHEN evt.event_type_id = 121 THEN 1
            [etc]
        END    
    ) AS 'mid_chances',
    SUM(
        CASE 
            WHEN evt.event_type_id = 103 THEN 1
            WHEN evt.event_type_id = 113 THEN 1
            WHEN evt.event_type_id = 123 THEN 1
            [etc]
        END    
    ) AS 'right_chances',
    mr.tactic,
    mr.tactic_skill,
    mr.bp,
    evt.match_id,
    evt.subject_team_id    
FROM event evt
JOIN match_rating mr
     ON evt.match_id = mr.match_id
WHERE   evt.event_type_id BETWEEN 100 AND 104 OR 
        evt.event_type_id BETWEEN 110 AND 114 OR 
        evt.event_type_id BETWEEN 120 AND 124 OR 
        [etc]
GROUP BY evt.match_id  
ORDER BY `right_chances`  DESC

But still, this counts the events twice, reporting 2 events where there was only 1, 6 for 3 events and so on. I have tried grouping on team_id as well (GROUP BY evt.match_id AND team_id) , but that returns only 2 rows with all events counted.

I hope I have made my problem clear, and it should be obvious that I really need a good tip or two.

Edit for clarity (sorry): Sample data for match_rating table: enter image description here

Sample data for the event table: enter image description here

What I would like to see as the result is this: enter image description here

That is, two tuples for each match, one for each team, where the types of events that team had is summed up. Thanks so much for looking into this!

Tobias
  • 37
  • 7
  • Please edit your post to include example fields and data from the two tables, and expected final output based on that data. – Ian Kemp Oct 07 '20 at 12:38

1 Answers1

1

Update after comments/feedback

OK.. just to confirm, what you want is

  • Each row of the output represents a team within a match
  • Other values (other than match_id and team_id) are sums or other aggregations across multiple rows?

If that is the case, then I believe you should be doing a GROUP BY the match_id and team_id. This should cause the correct number of rows to be generated (one for each match_id/team_id combination). You say in your question that you have tried it already - I suggest reviewing it (potentially after also considering the below).

With your data, it appears that the 'event' table also has a field which indicates the team_id. To ensure you only get the relevant team's events, I suggest your join between match_rating and event be on both fields e.g.,

FROM event evt
JOIN match_rating mr
     ON evt.match_id = mr.match_id
     AND evt.subject_team_id = mr.team_id

Previous answer - does not answer the question (as per later comments)

Just confirming - the issue is that when you run it, for each match it returns 2 rows - one for each team - but you want to do processing on both teams as one row only?

As such, you could do a few things (e.g., self-join the match rating table to itself, with Team1 ratings and Team2 ratings).

Alternatively, you could modify your FROM to have joins to match_rating twice - where the first has the lower ID for the two teams e.g.,

FROM event evt
JOIN match_rating mr_team1
     ON evt.match_id = mr_team1.match_id
JOIN match_rating mr_team2
     ON evt.match_id = mr_team2.match_id
     AND mr_team1.match_id < mr_team2.match_id

Of course, your processing then needs to be modified to take this into account e.g., one row represents a match, and you have a bunch of data for team1 and similar data for team2. You'd then, I assume, compare the data for team1 columns and team2 columns to get some sort of rating etc (e.g., chance for Team1 to win, etc).

seanb
  • 6,272
  • 2
  • 4
  • 22
  • It's not quite what I'm aiming for - please look at my update to the question. I'm sorry for being unclear and thanks a bunch for your time :) – Tobias Oct 07 '20 at 13:16
  • Thanks for your time seanb, I really appreciate this - and yes, that's what I'm looking for. I did as you suggested, but I still only get those two rows. I made a screenshot: https://imgur.com/a/LOhSS7M. Yes, some events have no subject_team_id, since not all events involve a specific team - but I don't think that should matter...? – Tobias Oct 08 '20 at 13:36
  • If there are some events that don't have match_ratings for both teams, you can use a LEFT JOIN instead of an INNER JOIN to still report on all pairs e.g., `FROM event evt LEFT JOIN match_rating mr ON ...` Regarding the group-by: I don't code MySQL specifically, but I think your GROUP BY has issues. Instead of `GROUP BY evt.match_id AND evt.subject_team_id`, try using commas between fields e.g., `GROUP BY evt.match_id, evt.subject_team_id` - see https://stackoverflow.com/questions/1841426/is-it-possible-to-group-by-multiple-columns-using-mysql – seanb Oct 08 '20 at 15:22
  • You may also need to review your fields after you get the GROUP BY working. Every field should either be a) part of the group-by, or b) be some sort of aggregate (sum, max, avg, etc). For example, you have two fields in the SELECT component `mr.tactic_type` and `mr.tactic_skill` that don't have this specified. It's not clear what you want to happen with them. – seanb Oct 08 '20 at 15:30
  • Finally, you may also want to review your SQL mode, particularly for ONLY_FULL_GROUP_BY. Here's a [db<>fiddle](https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=93b3f703b09667d5ce79a1fd0ce9f32c) showing a relatively simple scenario and multiple attempts at grouping - seeing what works and what doesn't. The first result (labelled 'standard SQL') is a template of what we *should* be getting. – seanb Oct 08 '20 at 15:54