What SQL query to access data from a table twice-removed, fully normalized

Question

Let's say there are three tables, users, orders, transactions, which look like this:

Users: 
id | name
---------
1  | Mike
2  | Sara

Orders: 
id | user_id | total | transaction_id
-------------------------------------
1  |       1 |   100 |              1
2  |       2 |    10 |              2
3  |       1 |     5 |              1
4  |       1 |     4 |              3
5  |       1 |    40 |              3
6  |       2 |    80 |              2

Transactions: 
id | total |  created_at 
------------------------
1  |   105 | [timestamp]
2  |    90 | [timestamp]
3  |    44 | [timestamp]

And let's say I want to select users.name but sort the names representing the most recent activity (i.e. by most recent transaction).

As you can see the data I want is twice-removed (maybe misusing that word) in the sense that I would need to go through the orders table to figure out the most recent transaction.

I figured I would design it to be as normalized as possible but then I realized I had no idea how to structure a proper query for this and other similar scenarios. I can obviously pull all the data and sort it out in the application code. I could also of course add a "last_transaction_at" column on users that I keep in sync.

So I guess there's two pieces to the question. First is how to do this with a SQL query, and second is whether this is more or less efficient than de-normalizing.

If two users have the same name, what exactly do you want? Names listed in order of the most recent transaction of some user with that name? Or a list where the nth element is the name of the user with the nth most recent transaction time? — philipxy, Aug 12 '17 at 02:48

score 2 · Accepted Answer · edited Aug 11 '17 at 16:21

2

Select u.Name, MAX(created_at) TransactionTime FROM #Users u
INNER JOIN  #ORDERS o
ON u.Id = o.user_id
INNER JOIN #Transactions t
ON o.transaction_id = t.id
GROUP BY u.Name
ORDER BY MAX(created_at) desc

I feel the schema which you created in fine

edited Aug 11 '17 at 16:21

Strawberry

33,750
13
40
57

answered Aug 11 '17 at 16:17

Rahul Patel

500
3
11

philipxy · Answer 2 · 2017-08-12T08:19:02.370

Normalization to higher normal forms replaces a table by components (tables that are projections of it) that (relational natural) join back to the original.

To make sense of joining back in SQL, let's assume that when we normalized we treated null as just another value, except that columns shared by components were not null; that there are no duplicate rows; and that we ignore column ordering.

Then we get back the table via select distinct * from a sequence of natural joins of the components. Or via select distinct of each component column name from a sequence of inner joins of components on equality of same-named columns.

That ties normalization to joins. But you might not want to get back exactly a table that you decomposed into components. Why do we join? Each base table holds the rows that make a true proposition (statement) from some predicate (statement template) parameterized by column names:

-- user I is named N
User(I,N)
-- order I is user U spending $T in transaction X
Orders(I, U, T, X)
-- transaction I for $T was created on C
Transactions(I,T,C)

In a from an alias names a table like a preceding table but with each column renamed to alias.column. Its rows satisfy the predicate of the preceding table. where ANDs in its condition.

from Users u
--     user u.I is named u.N
where N = 'disantlor'
-- AND N = 'disantlor'

Then a cross join b holds the rows that satisfy the AND of alias predicates. And a inner join b holds the rows that satisfy the AND of alias predicates & the on condition.

from Users u
--     user u.I is named u.N
cross join Orders o
-- AND order o.I is user o.U spending $o.T in transaction o.X
inner join Transactions x on o.X = x.I
-- AND transaction x.I for $x.T was created on x.C AND  o.X = x.I
where o.U = u.I
-- AND o.U = u.I

So we join to get rows satisfying the conjunction of given predicates. And normalization replaces a table with predicates using ANDs by components with predicates that are conjuncts. Is there any rule of thumb to construct SQL query from a human-readable description?

Your design is subject to the constraint that select id, total from Transactions =

select transaction_id as id, sum(total) as total
from Orders
group by transaction_id

It's possible that you got this design from normalizing a certain original table without preserving FDs (functional dependencies), but probably what you mean by "normalized" is just "well-designed". Cross Table Dependency/Constraint in SQL Database

What SQL query to access data from a table twice-removed, fully normalized

2 Answers2