3

I have a bunch of candidates, who have had one or more jobs, each with a company, using some skills.

Bad ascii art follows:

         ---------------                              ---------------   
         | candidate 1 |                              | candidate 2 |
         --------------- \                             --------------      
           /              \                                 |
       -------              --------                        etc
       |job 1|              | job 2 |  
       -------              ---------  
        /     \              /      \  
  ---------   ---------  ---------   --------  
  |company |  | skills | |company | | skills |  
  ---------   ---------  ---------- ----------  

Here's my database:

mysql> describe jobs;
+--------------+---------+------+-----+---------+----------------+
| Field        | Type    | Null | Key | Default | Extra          |
+--------------+---------+------+-----+---------+----------------+
| job_id       | int(11) | NO   | PRI | NULL    | auto_increment |
| candidate_id | int(11) | NO   | MUL | NULL    |                |
| company_id   | int(11) | NO   | MUL | NULL    |                |
| start_date   | date    | NO   | MUL | NULL    |                |
| end_date     | date    | NO   | MUL | NULL    |                |
+--------------+---------+------+-----+---------+----------------+

.

mysql> describe candidates;
+----------------+----------+------+-----+---------+----------------+
| Field          | Type     | Null | Key | Default | Extra          |
+----------------+----------+------+-----+---------+----------------+
| candidate_id   | int(11)  | NO   | PRI | NULL    | auto_increment |
| candidate_name | char(50) | NO   | MUL | NULL    |                |
| home_city      | char(50) | NO   | MUL | NULL    |                |
+----------------+----------+------+-----+---------+----------------+

.

mysql> describe companies;
+-------------------+---------------+------+-----+---------+----------------+

| Field             | Type          | Null | Key | Default | Extra          |
+-------------------+---------------+------+-----+---------+----------------+
| company_id        | int(11)       | NO   | PRI | NULL    | auto_increment |
| company_name      | char(50)      | NO   | MUL | NULL    |                |
| company_city      | char(50)      | NO   | MUL | NULL    |                |
| company_post_code | char(50)      | NO   |     | NULL    |                |
| latitude          | decimal(11,8) | NO   |     | NULL    |                |
| longitude         | decimal(11,8) | NO   |     | NULL    |                |
+-------------------+---------------+------+-----+---------+----------------+

.

mysql> describe skills;
+----------+---------+------+-----+---------+-------+
| Field    | Type    | Null | Key | Default | Extra |
+----------+---------+------+-----+---------+-------+
| skill_id | int(11) | NO   | MUL | NULL    |       |
| job_id   | int(11) | NO   | MUL | NULL    |       |
+----------+---------+------+-----+---------+-------+

.

mysql> describe skill_names;
+------------+----------+------+-----+---------+----------------+
| Field      | Type     | Null | Key | Default | Extra          |
+------------+----------+------+-----+---------+----------------+
| skill_id   | int(11)  | NO   | PRI | NULL    | auto_increment |
| skill_name | char(32) | NO   | MUL | NULL    |                |
+------------+----------+------+-----+---------+----------------+

Here's my attempt at a query (note, I intend to change the wildcards to field names; I am just trying to get something working):

SELECT can.* , co.*, j.*, sn.*
FROM candidates AS can
JOIN jobs AS j 
JOIN companies AS co ON j.company_id = co.company_id
JOIN skills AS s ON s.job_id = j.job_id
JOIN skill_names AS sn ON s.skill_id = s.skill_id

HediSql says /* Affected rows: 0 Found rows: 34,461,651 Warnings: 0 Duration for 1 query: 0.000 sec. (+ 105.078 sec. network) */

What's wrong with the query? I hope that it is clear from the bad ascii art what I am trying to achieve.

(also, does it make any speed difference which order I join the tables? I will worry about the new MySql v8 functions which retrieve it as nested JSON later)

GMB
  • 216,147
  • 25
  • 84
  • 135
Mawg says reinstate Monica
  • 38,334
  • 103
  • 306
  • 551
  • Actually, sample data wouid greatly help your question. – Tim Biegeleisen Jan 17 '20 at 15:30
  • In what way? I can post data, but it will be humorousness (and randomly generated). I am more than happy to help, just sure how it makes a difference, given the description of a bunch of candidates, who have had one or more jobs, each with a company, using some skills`. – Mawg says reinstate Monica Jan 17 '20 at 15:43
  • 1
    How many rows in each table (at least approximately). – Rick James Jan 17 '20 at 18:30
  • Currently? As stated, about 100 candidates with approx 5 jobs with approx 3 skills. In "real life", maybe 5k candidates, with 10 jobs with 20 skills. I will now auto-generate more ransom data & test. Currently 0.16 seconds after @GMS's excellent answer. – Mawg says reinstate Monica Jan 17 '20 at 18:39

1 Answers1

5

You are missing a join condition between candidates and jobs, so you get a cartesian product between both tables. Also, there is a problem with the join condition on skill_names, where both columns are the same (this again generates a cartesian product).

SELECT can.* , co.*, j.*, sn.*
FROM candidates AS can
JOIN jobs AS j ON j.candidate_id = can.candidate_id --> here: missing join condition
JOIN companies AS co ON j.company_id = co.company_id
JOIN skills AS s ON s.job_id = j.job_id
JOIN skill_names AS sn ON sn.skill_id = s.skill_id  --> and here: wrong join condition

Many RDBMS would raise a syntax error on a JOIN without an ON clause (if you do want a cartesian product, you need to be explicit about it by using CROSS JOIN), but, alas, not MySQL.


When it comes to this question:

does it make any speed difference which order I join the tables?

No. As long as you are using inner joins (not left joins), the join order does not matter to the query planner, which will rearrange them in the order which it thinks is the more efficient.

GMB
  • 216,147
  • 25
  • 84
  • 135