0

I'm hoping that someone is willing to help. I'm new to SQL and struggling to get a simple query debugged.

I want to create a simple frequency table to see how many values are missing (i.e. = 00000000) for the ArrestDate field.

SELECT 
CASE WHEN ArrestDate=00000000 THEN 'NO DATE' ELSE 'HAS DATE' END AS HasDate,
CASE WHEN ArrestDate=00000000 THEN 0 ELSE 1 END as nDate
FROM ARREST_INFO
--GROUP BY HasDate
--GROUP BY CASE WHEN ArrestDate=00000000 THEN 'NO DATE' ELSE 'HAS DATE' END 
  • Lines 1-4 parse without errors (output below).
  • Line 5 returns: Invalid column name 'HasDate'.
  • Line 6 returns: 'ARREST_INFO.ArrestDate' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

enter image description here

Since lines 1-4 run properly, I tried wrapping them inside another SELECT statement but this also elicits an error ("Invalid column name 'HasDate'.")

SELECT * FROM (
    SELECT 
    CASE WHEN ArrestDate=00000000 THEN 'NO DATE' ELSE 'HAS DATE' END AS HasDate,
    CASE WHEN ArrestDate=00000000 THEN 0 ELSE 1 END as nDate
    FROM ARREST_INFO
    )
 
GROUP BY HasDate
--GROUP BY CASE WHEN ArrestDate=00000000 THEN 'NO DATE' ELSE 'HAS DATE' END 

Sadly, I can't update the SQL Server version. I'd be very grateful for any insight!

nbk
  • 45,398
  • 8
  • 30
  • 47
overhear
  • 3
  • 3
  • you can pnly GROUP BY column so the last should work, but what result do you want to achieve also see https://meta.stackoverflow.com/questions/333952/why-should-i-provide-a-minimal-reproducible-example-for-a-very-simple-sql-query – nbk Sep 30 '22 at 16:42
  • GROUP BY without using aggregate functions is a bit odd. What are you actually trying to do? A [mcve] would make things clearer. – jarlh Sep 30 '22 at 17:10
  • I'm trying to create a table that shows how many records have missing vs. non-missing arrest date values (missing arrest dates are coded as 00000000). I'm not allowed to post inline images, but you can see a link to a screenshot of the output I finally managed to generate below. – overhear Sep 30 '22 at 18:17

1 Answers1

0

This answer to a previous question helped me figure out at least part of my problem. My Select statement referred to 2 columns, (nDate and HasDate), but my Group By statement only referred to one of them, so SQL didn't have enough information to determine how to display the unaggregated column data.

@Bill Karwin wrote: https://stackoverflow.com/a/13999903/9499084

This demonstrates the single-value rule, which prohibits the undefined results you get when you run a GROUP BY query, and you include any columns in the select-list that are neither part of the grouping criteria, nor appear in aggregate functions (SUM, MIN, MAX, etc.).

I'm still not sure why SQL doesn't recognize the new column defined in the Select statement (HasDate) when it's redeployed in the Group By statement, but at least I have functioning code now -- see screenshot below for results.

Here's my updated code:

SELECT 
CASE WHEN ArrestDate=00000000 THEN 'NO DATE' ELSE 'HAS DATE' END AS HasDate,
SUM(CASE WHEN ArrestDate=00000000 THEN 0 ELSE 1 END) AS nDate,
COUNT(*) as TotalCases

FROM ARREST_INFO
GROUP BY CASE WHEN ArrestDate=00000000 THEN 'NO DATE' ELSE 'HAS DATE' END

enter image description here

overhear
  • 3
  • 3