Given three tables as follows,
Table item_attributes
:
item_id | store | color | brand |
---|---|---|---|
123 | ABC | red | kings |
456 | ABC | blue | kings |
111 | XYZ | green | queens |
Table item_cost
:
item_id | store | currency | price |
---|---|---|---|
123 | ABC | usd | 2.34 |
111 | XYZ | usd | 9.21 |
122 | ABC | usd | 6.31 |
Table item_names
(main table)
item_id | store | name |
---|---|---|
123 | ABC | Short sleeve t-shirt |
111 | XYZ | Plaid skirt |
The goal is to combine selected columns using AWS Athena from item_cost
(price) and item_attributes
(brand) to the main item_names
tables based on the item_id
as the main join key.
I've tried this and it kinda works:
SELECT
att.item_id,
att.store,
att.brand,
cost.item_id,
cost.store,
cost.price,
main.item_id,
main.store,
main.name,
FROM (item_name main LEFT OUTER JOIN item_attributes att) LEFT OUTER JOIN item_cost cost
ON main.item_id=att.item_id AND main.item_id=cost.item_id
WHERE main.store=cost.store AND main.store=att.store
But my question (in parts are):
- Is there a simpler/clean query to perform the join without increasing the complexity (Big-O)?
- Would it be computationally cheaper if I do nested join of
main JOIN att
thenJOIN price
? Or would it be the same as the chainedLEFT OUTER JOIN
as per the code above? - Would it be computationally cheaper if I
main UNION att UNION price
thengroup by item_id
?