1

Say I have two tables:

Table: customers
Fields:  customer_id, first_name, last_name

Table:  customer_cars
Fields:  car_id, customer_id, car_brand, car_active

Say I am trying to write a query that shows all customers with a first name of "Karl," and the brands of the ** active ** cars they have. Not all customers will have an active car. Some cars are active, some are inactive.

Please keep in mind that this is a representative example that I just made up, for sake of clarity and simplicity. Please don't reply with questions about why we would do it this way, that I could use table aliases, how it's possible to have an inactive car, or that my field names could be better written. It's a fake example that is intended be very simple in order to illustrate the point. It has a structure and issue that I encounter all the time.

It seems like this would be best done with a LEFT JOIN and subquery.

SELECT
    customer_id,
    first_name,
    last_name,
    car_brand

FROM
    customers
    LEFT JOIN
        (SELECT
            customer_id,
            car_brand
        FROM
            customer_cars
            INNER JOIN customers ON customer_cars.customer_id = customers.customer_id
        WHERE
            first_name = 'Karl' AND
            customer_cars.car_active = '1') car_query ON customers.customer_id = car_query.customer_id

WHERE
    first_name = 'Karl'

The results might look like this:

first_name    last_name    car_brand
Karl          Johnson      Dodge
Karl          Johnson      Jeep
Karl          Smith        NULL
Karl          Davis        Chrysler

Notice the duplication of 'Karl' in both WHERE clauses, and the INNER JOIN in the subquery that is the same table in the outer query. My understanding of mySQL is that this duplication is necessary because it processes the subquery first before processing the outer query. Therefore, the subquery must be properly limited so it doesn't scan all records, then it tries to match on the resulting records.

I am aware that removing the car_active = '1' condition would change things, but this is a requirement.

I am wondering if a query like this can be done in a different way that only causes the criteria and joins to be entered once. Is there a recommended way to prioritize the outer query first, then match to the inner one?

I am aware that two different queries could be written (find all records with Karl, then do another that finds matching cars). However, this would cause multiple connections to the database (one for every record returned) and would be very taxing and inefficient.

I am also aware of correlating subqueries, but from my understanding and experience, this is for returning one field per customer (e.g., an aggregate field such as how much money Karl spent) within the fieldset. I am looking for a similar approach as this, but where one customer could be matched to multiple other records like in the sample output above.

In your response, if you have a recommended query structure that solves this problem, it would be really helpful if you could write a clear example instead of just describing it. I really appreciate your time!

R.Conroy
  • 13
  • 2
  • [Is there any rule of thumb to construct SQL query from a human-readable description?](https://stackoverflow.com/a/33952141/3404097) PS Please read & act on [mcve]. You could also read about the general principles of relational query optimization/implementation. – philipxy Mar 29 '18 at 09:55

1 Answers1

1

First, is a simple and straight query not enough?

Say I am trying to write a query that shows all customers with a first name of "Karl," and the brands of the ** active ** cars they have. Not all customers will have an active car. Some cars are active, some are inactive.

Following this requirement, I can just do something like:

SELECT C.first_name
    , C.last_name
    , CC.car_brand
    FROM customers C
    LEFT JOIN cutomer_cars CC ON CC.customer_id = C.customer_id
        AND car_active = 1
    WHERE C.first_name = 'Karl'

Take a look at the SQL Fiddle sample.

KaeL
  • 3,639
  • 2
  • 28
  • 56
  • 1
    Hi, if you do what you suggest, it loses its effect of being a LEFT JOIN. It would only show records that have a direct match and wouldn't show the NULL results. – R.Conroy Mar 29 '18 at 04:32
  • 1
    @R.Conroy I updated my answer. If this works out for you, kindly accept my answer ;) – KaeL Mar 29 '18 at 06:13
  • 1
    I tried that, and it worked. I did not know you could add conditions into the join line directly like that. Thank you very much. Does anybody else have any other suggested methods? – R.Conroy Mar 29 '18 at 14:15
  • @R.Conroy sorry for the late reply. Yes you can add conditions to the `ON` clause depending on the results that you want. `SQL` is pretty flexible to adjust on your requirements. – KaeL Apr 03 '18 at 01:41