1

I'm using MySQL and I have a query. There is also a subquery.

SELECT * FROM rg, list, status
WHERE (
  (rg.required_status_id IS NULL AND rg.incorrect_status_id IS NULL) || 
  (status.season_id = rg.required_status_id AND status.user_id = list.user_id) || 

  (rg.incorrect_status_id IS NOT NULL AND 
    list.user_id NOT IN (SELECT user_id FROM status WHERE user_id = list.user_id AND season_id = rg.incorrect_status_id)
  )
)

The problem is the following part of the code:

(rg.incorrect_status_id IS NOT NULL AND 
  list.user_id NOT IN (SELECT user_id FROM status WHERE user_id = list.user_id AND season_id = rg.incorrect_status_id)
)

How could I check if the table "status" has a row where user_id is same as list.user_id and season_id is same as rg.incorrect_status_id?

Update

Here is my current code, but it does not work at all. I do not know what to do.

SELECT * FROM rg, list, status
  LEFT JOIN status AS stat
    INNER JOIN rg AS rglist
    ON rglist.incorrect_status_id = stat.season_id
  ON stat.season_id = rglist.incorrect_status_id

  WHERE (
    (rg.required_status_id IS NULL AND rg.incorrect_status_id IS NULL) || 
    (status.season_id = rg.required_status_id AND status.user_id = list.user_id) || 

    (rg.incorrect_status_id IS NOT NULL AND stat.user_id IS NULL)
  )
)

Update 2

I modified the names, but the basic idea is same.

FROM sarjojen_rglistat, sarjojen_rglistojen_osakilpailut, kilpailukausien_kilpailut, sarjojen_osakilpailuiden_rgpisteet
, sarjojen_kilpailukaudet, sarjojen_kilpailukausien_kilpailusysteemit
/* , kayttajien_ilmoittautumiset */
/* , sarjojen_kilpailukausien_pelaajastatukset */

LEFT OUTER JOIN sarjojen_kilpailukausien_pelaajastatukset
ON sarjojen_kilpailukausien_pelaajastatukset.sarjan_kilpailukausi_id = sarjojen_rglistat.vaadittu_pelaajastatus_id

LEFT OUTER JOIN kayttajien_ilmoittautumiset
ON kayttajien_ilmoittautumiset.kayttaja_id = sarjojen_kilpailukausien_pelaajastatukset.kayttaja_id

Now this says:

Column not found: 1054 Unknown column 'sarjojen_rglistat.vaadittu_pelaajastatus_id' in 'on clause'

Why is that so?

I have a table called "sarjojen_rglistat" and there is a column "vaadittu_pelaajastatus_id".

xms
  • 429
  • 5
  • 24
  • MySQL doesn't optimize `OR` very well. Try splitting your query into separate queries for each condition, and combine them with `UNION`. – Barmar Jun 21 '17 at 23:42
  • You understand that in the first case, it will join the rows with `NULL` with a full cross-product of every `list` and `status` row, right? – Barmar Jun 21 '17 at 23:43
  • See http://stackoverflow.com/questions/21633115/return-row-only-if-value-doesnt-exist?lq=1 for better ways to write the third query that looks for rows that don't match in another table. – Barmar Jun 21 '17 at 23:44
  • I have tried and tried, no success. – xms Jun 22 '17 at 00:36
  • 1
    show sample data and desired output.. it could be better. – Vijunav Vastivch Jun 22 '17 at 00:36
  • The question isn't very clear. You wrote **where `user_id` is same as `list.user_id` and `season_id` is same as `rg.incorrect_status_id`**. But your query is looking for a row that *doesn't* have any matches. – Barmar Jun 22 '17 at 00:43
  • 1
    Don't mix cross-product and explicit `JOIN`. – Barmar Jun 22 '17 at 00:44
  • I updated the opening message. – xms Jun 22 '17 at 00:45
  • Still not clear. The query uses `NOT IN`, but your description says they should match. – Barmar Jun 22 '17 at 00:46
  • In your update you added explicit joins without removing the cross joins, meaning you're now including both `rg` and `status` twice!! Also the INNER JOIN to `rglist` negates the LEFT OUTER JOIN to `stat`. – Disillusioned Jun 22 '17 at 00:57
  • Sorry, I can't help you if you keep moving the goal posts. You've added a bunch of extra tables with implicit cross joins and if you miss a comma, suddenly one table will be aliased as another. E.g. `SELECT Table2.Table1Column FROM Table1 Table2 /* Missing comma makes the Table2 be an alias that actually references Table1 */` This is almost certainly the cause of your error if as you say, the table and column do exist. – Disillusioned Jun 22 '17 at 03:11
  • PS: Your update 2 looks like you've decided to completely ignore my advice, so I'll stop wasting my time. – Disillusioned Jun 22 '17 at 03:14
  • @CraigYoung Thank you very much for your help. It is 6:27 AM and I should go to sleep. I will try my best after that. Anyway, I did not notice any missing commas. – xms Jun 22 '17 at 03:28
  • @xms Replace your commas forming implicit joins with explicit `CROSS JOIN`s if you insist on cross-joining the tables. The commas actually separate parts of the FROM clause. So it seems (from some simple experimentation) the section where you're doing your LEFT OUTER JOINs cannot reference any of the first 5 tables. – Disillusioned Jun 22 '17 at 05:10

2 Answers2

1

1) Simpler queries are easier for the query engine to interpret and produce an efficient plan.

If you pay careful attention to the following part of your query, you may realise something a little "weird" is going. This is a clue the approach is perhaps a little too complicated.

...(
list.user_id NOT IN (
    SELECT  user_id
    FROM    status
            /* Note the sub-query cannot ever return a user_id different
               to the one checked with "NOT IN" above */
    WHERE   user_id = list.user_id
        AND season_id = rg.incorrect_status_id)
)

The query filtering where list.user_id is not in a result set that cannot contain user_id's other than list.user_id. Of course the sub-query could return zero results. So basically it boils down to a simple existence check.

So for a start, you should rather write:

...(
NOT EXISTS (
    SELECT  *
    FROM    status
    WHERE   user_id = list.user_id
        AND season_id = rg.incorrect_status_id)
)

2) Be clear about your "what joins the tables together" (this refers back to 1 as well).

Your query selects from 3 tables without specifying any join conditions:

FROM rg, list, status

This would result in a cross join producing a result set that is a permutation combination of all possible row matches. If your WHERE clause were simple, the query engine might be able to implicitly promote certain filter conditions into join conditions, but that's not the case. So even if for example you have a very small number of rows in each table:

status   20
rg       100
list     1000

Your intermediate result set (before WHERE is applied),
would need 1000 * 100 * 20 = 2000000 rows!

It helps tremendously to make it clear with join conditions how the rows of each table are intended to match up. Not only does it make the query easier to read and understand, but it also helps avoid overlooking join conditions which can be the bane of performance considerations.

Note that when specifying join conditions, some rows might not have matches and this is where knowing and understanding the different types of joins is extremely important. Particularly in your case, most of the complexity in your WHERE clause seems to come from trying resolve when rows do/do not match. See this answer for some useful information.

Your FROM/WHERE clause should probably look more like the following. (Difficult to be certain because you haven't stated your table relationships or expected input/output of your query. But it should set you on the right track.)

FROM    rg
        /* Assumes rg rows form the base of the query, and not to have
           some rg rows excluded due to non-matches in list or status. */
        LEFT OUTER JOIN status ON
            status.season_id = rg.required_status_id
        LEFT OUTER JOIN list ON
            status.user_id = list.user_id
WHERE   rg.incorrect_status_id IS NULL
    /* As Barmar commented, it may also be useful to break this
       OR condition out as a separate query UNION to the above.  */
    OR  (
            rg.incorrect_status_id IS NOT NULL
        AND NOT EXISTS (
            SELECT  *
            FROM    status
            WHERE   user_id = list.user_id
                AND season_id = rg.incorrect_status_id)
        )

Note that this query is very clear about the distinction between how the tables are joined, and what is used to filter the joined result set.

3) Finally and very importantly, even the best queries are of little benefit without the correct indexes!

A good query with bad indexes (or conversely a bad query with good indexes) is going to be inefficient either way. Computers are fast enough that you might not notice on small databases, but you do experiment with candidate indexes to find the best combination for your data and workload.

In the above query you likely need indexes on the following. (Some may already be covered by Primary Key constraints.)

status.season_id
status.user_id
list.user_id
rg.required_status_id
rg.incorrect_status_id
Disillusioned
  • 14,635
  • 3
  • 43
  • 77
  • @xms It's your table, you should know if it has a `required_status_id` column. Certainly you've used it often enough in your own queries in your question. If the table does have the column I can only guess you've pasted something incorrectly resulting in another table being aliased as `rg`. – Disillusioned Jun 22 '17 at 02:21
  • Please take a look at my opening message. There is update 2. – xms Jun 22 '17 at 02:59
  • I managed to solve a problem with unknown column error, but I still have problems. This will return no results. It should return a row in the cases: 1) if `status.season_id` = `rg.required_status_id` or 2) `status.season_id` = `rg.incorrect_status_id` does not exist. – xms Jun 22 '17 at 09:54
1

Use a UNION of subqueries that handle the 3 cases that you're combining with OR. You can then use explicit JOIN in each subquery to make it clear how the tables are related to each other (or not related at all when you're doing a full cross-product, as is the case when rg.required_status_id IS NULL AND rg.incorrect_status_id IS NULL).

SELECT rg.*, list.*, status.*
FROM rg
CROSS JOIN list
CROSS JOIN status
WHERE rg.required_status_id IS NULL AND rg.incorrect_status_id IS NULL

UNION ALL

SELECT rg.*, list.*, status.*
FROM rg
JOIN status ON rg.required_status_id = status.season_id
JOIN list ON status.user_id = list.user_id

UNION ALL

SELECT rg.*, list.*, status.*
FROM rg
CROSS JOIN list
LEFT JOIN status ON status.user_id = list.user_id AND status.season_id = rg.required_status_id
WHERE rg.incorrect_status_id IS NOT NULL AND status.season_id IS NULL
Barmar
  • 741,623
  • 53
  • 500
  • 612
  • +1 Only thing to note is that if the union queries are mutually exclusive, `UNION ALL` will prevent an inefficient attempt to remove non-existent duplicates. – Disillusioned Jun 22 '17 at 00:52
  • Good point. I wasn't sure if they were, but I think because of the conditions on the null fields they will be. – Barmar Jun 22 '17 at 00:54
  • @Barmar Thanks for your code! Does this return a row in two cases: 1) if `status.season_id = rg.required_status_id` or 2) `status.season_id = rg.incorrect_status_id` does not exist? – xms Jun 22 '17 at 10:00
  • The middle query returns a row in case 1, the bottom query returns a row in case 2. – Barmar Jun 22 '17 at 15:35
  • Couldn't you just use a normal `LEFT JOIN` to get both cases with one query? – Barmar Jun 22 '17 at 15:36