Take good care with only
if your subqueries don't select the primary key.
Example:
class Customer:
pass
class Order:
customer: Customer
pass
class OrderItem:
order: Order
is_recalled: bool
- Customer has Orders
- Order has OrderItems
Now we are trying to find all customers with at least one recalled order-item.(1)
This will not work properly
order_ids = OrderItem.objects \
.filter(is_recalled=True) \
.only("order_id")
customer_ids = OrderItem.objects \
.filter(id__in=order_ids) \
.only('customer_id')
# BROKEN! BROKEN
customers = Customer.objects.filter(id__in=customer_ids)
The code above looks very fine, but it produces the following query:
select * from customer where id in (
select id -- should be customer_id
from orders
where id in (
select id -- should be order_id
from order_items
where is_recalled = true))
Instead one should use select
order_ids = OrderItem.objects \
.filter(is_recalled=True) \
.select("order_id")
customer_ids = OrderItem.objects \
.filter(id__in=order_ids) \
.select('customer_id')
customers = Customer.objects.filter(id__in=customer_ids)
(1) Note: in a real case we might consider 'WHERE EXISTS'