customer - c_id, c_name, c_address product - p_id, p_name, price supplier - s_id, s_name, s_address orders - o_id, c_id, p_id, quantity, time
SELECT o.o_id,
c.c_id,
c.c_name,
p.p_id,
p.p_name,
p.price * o.quantity AS amount
FROM customer c
JOIN orders o ON o.c_id = c.c_id
JOIN product p ON p.p_id = o.p_id;
I want to execute above query without fetching 3 tables as individual data frames in pyspark and performing joins on dataframes.