0

I'm fairly new to SQL and trying to do calculate a new field based on 3 very large database tables. Here's an abstraction:

  • Table 1: players (contains a birthday DATE field)
  • Table 2: tournaments (contains a DATETIME field)
  • Table 3: games (has foreign keys referencing players.id and tournaments.id)

I would like to add to the games table a field containing a decimal representation of the player's age in years on the day of the tournament.

Would I need to write a procedure/operation loop to accomplish this? or is there a way to do it with an update query?

user3925713
  • 509
  • 1
  • 4
  • 10
  • Are you sure you want to store a calculated value? If a birth date gets corrected or a tournament rescheduled will you handle updating the calculated age field? – Mark Leiber May 26 '15 at 16:03
  • It might not be ideal, but for the sake of simplicity and performance (when doing statistics) I want to store it in this case. I don't anticipate making changes to birth dates (in the rare occasion that I do, I would just update the field again) and the tournaments are all finished so there won't be reschedulings. – user3925713 May 26 '15 at 16:13
  • You might want to take a look at the age calculation in the official documentation, which uses the TIMESTAMPDIFF function: http://dev.mysql.com/doc/refman/5.0/en/date-calculations.html. That just gets a whole number. If you need precise calculations, have a look at the algorithm here: http://stackoverflow.com/a/30377142/4665459 – Mark Leiber May 26 '15 at 16:29

2 Answers2

1

You can do this with a UPDATE statement. (But, do you really want to?)

First, write the SELECT statement that would give you the result.

SELECT g.player_id
     , g.tournament_id
     , p.birthday 
     , t.tournament_date
     , DATEDIFF(t.tournament_date,p.birthday) AS days_diff
  FROM games g
  JOIN player p ON p.id = g.player_id
  JOIN tournament t ON t.id = g.tournament_id 

Just add an expression that does the calculation you want to perform, and return that as a column.

Then, you can use that SELECT statement as an inline view in an UPDATE statement. Assuming you've added a age_in_days column to the games table

UPDATE games d
  JOIN ( 
         SELECT g.player_id
              , g.tournament_id
              , p.birthday 
              , t.tournament_date
              , DATEDIFF(t.tournament_date,p.birthday) AS days_diff
           FROM games g
           JOIN player p ON p.id = g.player_id
           JOIN tournament t ON t.id = g.tournament_id 
       ) s
   ON s.player_id     = d.player_id
  AND s.tournament_id = d.tournament_id
  SET s.age_in_days   = s.days_diff

Next, we have to be concerned about "large" tables, and avoiding a huge transaction that generates a boatload of rollback (if these are InnoDB tables).

If that's a concern, do the UPDATE operation in "batches" by breaking it up, by adding a predicate in the inline view query, for example, WHERE g.player_id = 42, and running that same UPDATE, once for each player_id.


If you want age in "decimal years", you'd need to develop an expression to return that. (As a rudimentary approximation, divide the difference in days by the number of days in a year 365.25. That would get you pretty close.)


(At the beginning of my answer, I asked, do you really want to.

Beware that if a player birthday value is updated, or a tournament_date is changed, the calculated values stored in the games table will be inconsistent.

spencer7593
  • 106,611
  • 15
  • 112
  • 140
  • awesome! Thank you. I didn't know you could join on a query instead of a table. (As I said, I'm new XD) – user3925713 May 26 '15 at 16:24
  • Beware of inline views and the performance implications. (The `SELECT` query wrapped in parens, that is treated like a table. MySQL calls that a "derived table". The results from the query are materialized into a "temporary" table, and then the outer statement runs against the "temporary" table. I put temporary in quotes because it's different from an actual `TEMPORARY TABLE`.) There are other query patterns that could be used as well, I just gave an example of one approach. – spencer7593 May 26 '15 at 16:28
0

You can try this

Update games 
set age = (SELECT FLOOR(DATEDIFF(DAY, players.BirthDate, tournaments.datetime) / 365.25) from players,tournaments) 
where players.id=games.id;
mathielo
  • 6,725
  • 7
  • 50
  • 63