Google Big Query SQL - Get Most Recent Column Value

Question

I have a Google Big Query Table that has an email column in it. Basically each rows shows a state the user with that email address existed in. What I want to do is query the table to get a result showing the most recent row per email address. I've tried all sorts of GROUP BY's, JOINing the table against itself and the usual fun stuff that I would use in MySQL, but I keep getting duplicate emails returned if the entire row isn't a match.

Any help is much appreciated!

Sample Data

user_email     | user_first_name | user_last_name | time      | is_deleted
test@test.com  | Joe             | John           | 123456790 |  1
test@test.com  | Joe             | John           | 123456789 |  0
test2@test.com | Jill            | John           | 123456789 |  0

So if sampling that data I would want to return:

user_email     | user_first_name | user_last_name | time      | is_deleted
test@test.com  | Joe             | John           | 123456790 |  1
test2@test.com | Jill            | John           | 123456789 |  0

score 18 · Accepted Answer · answered Aug 12 '14 at 17:36

18

SELECT user_email, user_first_name, user_last_name, time, is_deleted 
FROM (
 SELECT user_email, user_first_name, user_last_name, time, is_deleted
      , RANK() OVER(PARTITION BY user_email ORDER BY time DESC) rank
 FROM table
)
WHERE rank=1

answered Aug 12 '14 at 17:36

Felipe Hoffa

54,922
16
151
325

How would you do the same thing if your table contains nested and repeated fields ? – mravey Jan 29 '16 at 12:52
If you can provide a sample table with nested and repeated fields, I'll do my best to provide an answer. Please start a new question too! – Felipe Hoffa Jan 30 '16 at 04:48
Ok, I've created a new question that you can find here: http://stackoverflow.com/questions/35150732/get-most-recent-column-value-with-nested-and-repeated-fields – mravey Feb 02 '16 at 10:22

Daniel Warner · Answer 2 · 2022-02-13T07:36:32.810

There is a potential shortcoming of the use of RANK() over the alternative numbering function ROW_NUMBER(). The accepted answer does provide the desired solution, except in the event in a tie in the order by clause, were duplicate records are again returned:

with minimal_reproducible as (
select 'test@test.com' as user_email, 'Joe' as user_first_name, 'John' as user_last_name, 123456789 as time, 1 is_deleted
union all
select 'test@test.com', 'Joe', 'John', 123456789, 0
union all
select 'test2@test.com', 'Jill', 'John', 123456789, 0
)

select user_email, user_first_name, user_last_name, time, is_deleted from (
    select *, 
    rank() over (partition by user_email order by time desc) as rank
    from minimal_reproducible) inner_table 
where rank = 1;

A better solution therefore is to use ROW_NUMBER() in the place of RANK() to ensure (albeit arbitrarily) unique user_email come what may:

with minimal_reproducible as (
select 'test@test.com' as user_email, 'Joe' as user_first_name, 'John' as user_last_name, 123456789 as time, 1 is_deleted
union all
select 'test@test.com', 'Joe', 'John', 123456789, 0
union all
select 'test2@test.com', 'Jill', 'John', 123456789, 0
)

select user_email, user_first_name, user_last_name, time, is_deleted from (
    select *, 
    row_number() over (partition by user_email order by time desc) as row_number
    from minimal_reproducible) inner_table 
where row_number = 1;

score 2 · Answer 3 · answered Aug 12 '14 at 16:35

2

Solved!

SELECT l.* FROM [mytable.list] l JOIN (
    SELECT user_email, MAX(time) as time FROM [mytable.list] GROUP EACH BY user_email
) j ON j.user_email = l.user_email WHERE j.time = l.time;

answered Aug 12 '14 at 16:35

Eric Uldall

2,282
1
17
30

Good! Take a look at using OVER() functions too, look at the alternative answer. – Felipe Hoffa Aug 12 '14 at 17:36
Thanks @FelipeHoffa, What is the benefit and/or contrast of using your solution? – Eric Uldall Aug 13 '14 at 23:22
@FelipeHoffa I agree, add some explaining with your solution please. – Tjorriemorrie Aug 14 '14 at 05:55

Google Big Query SQL - Get Most Recent Column Value

3 Answers3

Linked