-1

I have the following criteria that won't run.

I have tried:

SELECT
    Main_ID AS [First_Custom_Column],
    CASE 
       WHEN Column_1 = 'Y' AND
            SUM(CASE WHEN Column_3 > 1 THEN Column_2 END) > 0 AND
            COUNT(CASE WHEN Column_3 > 1 THEN 2 END) > 1 # Will fail, Error below. 
           THEN 1 
           ELSE 0
    END AS [Second_Custom_Column]
FROM
    [Table_Name] 
WHERE 
    [Date] = '2020-11-01';

Error: 'Main_ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

Then I tried this:

SELECT
    Main_ID AS [First_Custom_Column],
    CASE 
       WHEN Column_1 = 'Y' AND
            (CASE WHEN Column_3 > 1 THEN 1 END) > 0 AND
            (CASE WHEN Column_3 > 1 THEN 2 END) > 1
           THEN 1 
           ELSE 0
    END AS [Second_Custom_Column]
FROM 
    [Table_Name] 
WHERE 
    [Date] = '2020-11-01';

Error: The conversion of the varchar value '3344511715' overflowed an int column.

I am essentially attempting to replicate COUNTIF and SUMIF from Excel. Once fully replicated I'll turn the query into a function, but first I need it to work. The third column, Column_3, has duplicated data that the function is intended to count to see if they recur.

Has anyone successfully replicated COUNTIF with multiple criteria? How?

Desired results from Excel.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Rookie
  • 140
  • 1
  • 3
  • 13
  • 2
    Please provide sample data and desired results. – Gordon Linoff Dec 01 '20 at 22:52
  • 1
    Just like the error says, you need to add 'group by main_id' to the end. Is there an issue with doing that? – pwilcox Dec 01 '20 at 22:58
  • 2
    As it written, `COUNT()` will try to count over the full list of rows. If you need to count over a subset of rows you should try `COUNT() OVER()`. – The Impaler Dec 01 '20 at 22:59
  • @pwilcox, I don't understand why that would solve it. FYI I tried and it gave me an error that Columns 1 & 2 are not contained in aggregate data. But mostly what escapes me is the logic. Wouldn't it be impossible to group a field that is in it's nature a unique field?@The Impaler, I am trying to count the duplicate entries that fit the criteria within that given date range. Does that qualify as a 'subset'? – Rookie Dec 01 '20 at 23:01
  • @Rookie *Replicating* something is not a good idea, because different languages and platforms use different paradigms. For example, SQL can efficiently work with large amount of data in one pass, so "turn it into a function" sounds frightening. You should *solve* your *overall* task not to fall into [XY problem](https://xyproblem.info/). And what about your requirement it should be just `COUNT` with all the `CASE`s inside it or you need to `GROUP BY Main_ID, Column_1, Column_2` depending on what do you need to count. – astentx Dec 01 '20 at 23:13
  • 2
    What about your confusion: every time you get confused by some error or output you should imagine yourself a DBMS, clear all the context and meaning of the data (you do not know it) and do just what is written. DBMS does not know the nature of your data or semantics or something that you can get with just a look on it. It does what you ask it to do. – astentx Dec 01 '20 at 23:18
  • @astentx, I need a function because I will call this particular script numerous times. It will be part of the application. – Rookie Dec 01 '20 at 23:24
  • 3
    @Rookie Row-by-row processing is bad database design and such calculation can be easily done with single statement. So it would be good if you can update your question and provide source data from where you want to call that function and relationship between that data and the reference dataset to calculate something on it. – astentx Dec 01 '20 at 23:28
  • @astentx, I will only run the function once every <2000 rows/day in order to perform the necessary transformations and save the results within the table in a helper column. It is part of an ETL. – Rookie Dec 01 '20 at 23:31
  • If you want help you will need to provide sample data in the form of DDL / DML scripts. See beginning of [this](https://stackoverflow.com/a/65044832/6305294) answer for an example. – Alex Dec 02 '20 at 04:14

1 Answers1

0

Got it to work:

COUNT([Column_3]) OVER (PARTITION BY [Column_3] ORDER BY [Column_3]) AS [xCountIF],
SUM([Column_2]) OVER (PARTITION BY [Column_3] ORDER BY [Column_3]) AS [xSumIF]

If anyone has a similar problem respond to this answer and I'll walk you through it.

Rookie
  • 140
  • 1
  • 3
  • 13
  • Please explain this solution. How is it related to `COUNT(CASE WHEN Column_3 > 1 THEN 2 END)` if it does not contain any CASEs? What is the purpose of `ORDER BY` clause inside analytical function? "It just works" is not a useful answer to learn something from it, especially when you have not very clear question and respond to it by yourself. – astentx Dec 03 '20 at 08:39