48

I have a report that looks something like this:

CompanyA      Workflow27     June5
CompanyA      Workflow27     June8
CompanyA      Workflow27     June12
CompanyB      Workflow13     Apr4
CompanyB      Workflow13     Apr9
CompanyB      Workflow20     Dec11
CompanyB      Wofkflow20     Dec17

This is done with SQL (specifically, T-SQL version Server 2005):

SELECT company
   , workflow
   , date
FROM workflowTable

I would like the report to show just the earliest dates for each workflow:

CompanyA      Workflow27     June5
CompanyB      Workflow13     Apr4
CompanyB      Workflow20     Dec11

Any ideas? I can't figure this out. I've tried using a nested select that returns the earliest tray date, and then setting that in the WHERE clause. This works great if there were only one company:

SELECT company
   , workflow
   , date
FROM workflowTable
WHERE date = (SELECT TOP 1 date
              FROM workflowTable
              ORDER BY date)

but this obviously won't work if there is more than one company in that table. Any help is appreciated!

dvanaria
  • 6,593
  • 22
  • 62
  • 82

4 Answers4

68

Simply use min()

SELECT company, workflow, MIN(date) 
FROM workflowTable 
GROUP BY company, workflow
denis.peplin
  • 9,585
  • 3
  • 48
  • 55
Achim
  • 15,415
  • 15
  • 80
  • 144
  • 3
    This works as long as you also aren't trying to get a Unique Workflow as well. In this example it would return both Workflow13 and Workflow20 for Company B, but if you needed the earliest date for each company to identify the Workflow you would need to include a sub-select. – Vallier Aug 01 '19 at 17:45
32

In this case a relatively simple GROUP BY can work, but in general, when there are additional columns where you can't order by but you want them from the particular row which they are associated with, you can either join back to the detail using all the parts of the key or use OVER():

Runnable example (Wofkflow20 error in original data corrected)

;WITH partitioned AS (
    SELECT company
        ,workflow
        ,date
        ,other_columns
        ,ROW_NUMBER() OVER(PARTITION BY company, workflow
                            ORDER BY date) AS seq
    FROM workflowTable
)
SELECT *
FROM partitioned WHERE seq = 1
Cœur
  • 37,241
  • 25
  • 195
  • 267
Cade Roux
  • 88,164
  • 40
  • 182
  • 265
  • Why does the WITH command have a semicolon before it? I've seen this a few other times and at first always thought it was a typo from the person's code I was viewing – Alex Watts Jul 21 '16 at 16:01
  • 2
    Precautionary convention. WITH *must* be the first thing in a statement and SQL Server does not enforce all statements ending in semicolons. – Cade Roux Jul 21 '16 at 16:04
8
SELECT company
   , workflow
   , MIN(date)
FROM workflowTable
GROUP BY company
       , workflow
ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
0
select company ,workflow ,min(date)
from workflowTable 
group by company 
Suraj Rao
  • 29,388
  • 11
  • 94
  • 103