0

I came across a couple of scenarios now when working on a large table of session data. That I wanted to get the first event within a session that meets certain criteria (or sometimes plain simple the first event). Most of the time I would want a certain column from that event.

Table could look like this.

session_id, timestamp, column1, column2 ... columnn
xyz, datetime_1, attribute1_a, attribute2, ... attribute n
xyz, datetime_2, attribute1_b, attribute2, ... attribute n

Some approaches have been discussed here and also to a certain extent here.

I really like the window function approach, as I find it very clear to read, i.e. something along the line of ROW_NUMBER() OVER(PARTITION BY session_id ORDER BY timestamp) AS first_created. what I don 't like about this is that it requires an outer query, where you have to condition on first_created = 1. Additionally, this fails for large amounts of data.

The inner join approach mentioned in works well, but also feels a tad convoluted. Since the post is from a while ago, I was wondering if there has since been an easier way developed to achieve the desired result.

Fabian Bosler
  • 2,310
  • 2
  • 29
  • 49
  • What does the table look like? – Martin Weitzmann Jul 03 '19 at 09:06
  • Marking as duplicate, but let me know if it's not. – Felipe Hoffa Jul 03 '19 at 18:34
  • this one - https://stackoverflow.com/a/43371222/5221944 - might be even more relevant :o) – Mikhail Berlyant Jul 03 '19 at 18:41
  • thx Felipe and Mikhail! If I am not mistaken, both use the same approach of array_agg on the column/table and then offset to get the first element, which is very easy to read! Maybe its worthwhile summarising somewhere, when a row_number, inner_join or array_agg + offset approach is best suited. I'll see if I get around to do that. – Fabian Bosler Jul 04 '19 at 06:09

0 Answers0