1

I am trying to get the status of 2 files submitted, whether 'Passed', 'Failed' or 'Not Submitted' from the table PROCESSED_FILE_LOGS.

The 1st file name does not contain 'PCR' in the filename, and the 2nd file name contains 'PCR' in the file name and that's how i differentiate between these 2 files.

The Contract # are the ones that are submitting these 2 files and I am combining 4 different tables to populate the below.

enter image description here

2 Issues I have with this table, when i run my query[below]:

i. 'H0000' gets repeated, in spite of the query, to pull the MAX(PF.processed_date) - I am trying to get the last processed status for both file 1 and file 2.

ii. For 'H0000', the status for both the files are not displaying in the same line., meaning - the status for File 1 displays in one row, and the status for File 2 displays in the second row.

My query:

select  distinct
          
          OC.cms_contract_number as 'Contract #'  
          ,case when (PF.PROCESSED_FILE_NAME not like '%PCR%' and FS.DISPLAY is not null) then FS.display   else 'Not Submitted' end as 'File 1 Status'
          ,case when (PF.PROCESSED_FILE_NAME like '%PCR%'  and FS.DISPLAY is not null) then FS.display   else 'Not Submitted'  end as 'File 2 Status'
          ,MAX(PF.processed_date) as 'Date Submitted'
          
   from 
                  ((((persons P join person_affiliations PA on P.person_id = PA.person_id)
          join external_orgs EO on EO.org_id = PA.org_id)
          join org_contracts OC on OC.org_contracts_id = PA.org_contracts_id)
          left outer join      processed_file_logs PF on PF.org_contracts_id = OC.org_contracts_id)
          left outer join processed_file_status FS on FS.file_status_id = PF.file_status_id 
   group by  OC.CMS_CONTRACT_NUMBER,  PF.SUBMISSION_ID,
     FS.DISPLAY, PF.PROCESSED_FILE_NAME

Where am i going wrong ?

Community
  • 1
  • 1
Ron
  • 1,901
  • 4
  • 19
  • 38
  • What happens if you remove PF.SUBMISSION_ID from the GROUP BY? – jarlh Mar 16 '17 at 15:35
  • Is this in microsoft access? Please tag the dbms you are using. – SqlZim Mar 16 '17 at 15:38
  • @jarlh, I am getting an error saying 'Column 'processed_fole_logs.SUBMISSION_ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause – Ron Mar 16 '17 at 15:49
  • @SqlZim, It's SQL Server 2008 R2 – Ron Mar 16 '17 at 15:49
  • @Ron Please, do not use parentheses around your `join`s without documenting **why** you are doing that. It can have a significant impact on how the query runs, and if you don't know why you are doing it, don't do it at all. It is the equivalent of using query hints to force join order. [Forcing Join Order Without Hints - Erik Darling](https://www.brentozar.com/archive/2015/05/forcing-join-order-without-hints/) – SqlZim Mar 16 '17 at 15:53
  • @Ron, I can't find SUBMISSION_ID in the select list. – jarlh Mar 16 '17 at 15:56

1 Answers1

1

Ok, updated this answer to reflect your comments below - there is a whole lot of syntax here I may have messed up and it is unlikely I'm going to be able to give you the perfect final solution without access to those tables, but here you go:

Basically your data is disjointed since you want to summarize the data across multiple rows into two different columns. We can use the "with" queries to create tables of data that are finding the last submission for each of the two files and then re-matching it to the main dataset by the contract number key.

With A as (select OC.cms_contract_number as cms_contract_number, PF.PROCESSED_FILE_NAME as PROCESSED_FILE_NAME, FS.DISPLAY as DISPLAY, PF.processed_date as processed_date
from 
          ((((persons P join person_affiliations PA on P.person_id = PA.person_id)
  join external_orgs EO on EO.org_id = PA.org_id)
  join org_contracts OC on OC.org_contracts_id = PA.org_contracts_id)
  left outer join      processed_file_logs PF on PF.org_contracts_id = OC.org_contracts_id)
  left outer join processed_file_status FS on FS.file_status_id = PF.file_status_id )

select  distinct

  A.cms_contract_number as 'Contract #'  
  ,max(s1.FileStatus) as 'File 1 Status'
  ,max(s2.FileStatus) as 'File 2 Status'
  ,MAX(A.processed_date) as 'Date Submitted'

from A
left outer join 
(
    select  distinct

      A.cms_contract_number as 'Contract'  
      ,case when (A.DISPLAY is not null) then A.display   else 'Not Submitted' end as 'FileStatus'
      ,MAX(A.processed_date) as 'DateSubmitted'

    from  A
    where A.PROCESSED_FILE_NAME not like '%PCR%' or A.PROCESSED_FILE_NAME is null 
    group by  A.CMS_CONTRACT_NUMBER, case when (A.DISPLAY is not null) then A.display   else 'Not Submitted' end 
) s1 on s1.Contract = A.CMS_CONTRACT_NUMBER and s1.DateSubmitted = A.processed_date
left outer join
(
    select  distinct

      A.cms_contract_number as 'Contract'  
      ,case when (A.DISPLAY is not null) then A.display   else 'Not Submitted' end as 'FileStatus'
      ,MAX(A.processed_date) as 'DateSubmitted'

    from  A
    where A.PROCESSED_FILE_NAME like '%PCR%'  or A.PROCESSED_FILE_NAME is null 
    group by  A.CMS_CONTRACT_NUMBER, case when (A.DISPLAY is not null) then A.display   else 'Not Submitted' end 
) s2 on s2.Contract = A.CMS_CONTRACT_NUMBER and s2.DateSubmitted = A.processed_date
group by  A.CMS_CONTRACT_NUMBER

Note also, that if you want the date Last submitted for each File1 and File2, it would be a fairly trivial exercise to pull it out now also:

In the main query ONLY just change:

MAX(PF.processed_date) as 'DateSubmitted' 

to:

max((select s1.DateSubmitted from StatusOfLastFile1 s1 where s1.Contract = A.CMS_CONTRACT_NUMBER)) as 'File 1 Last Submit',
max((select s2.DateSubmitted from StatusOfLastFile2 s2 where s2.Contract = A.CMS_CONTRACT_NUMBER)) as 'File 2 Last Submit'
user681574
  • 553
  • 2
  • 15
  • Aha - so what do you want to be displayed in the calculated fields that provide the statuses for file 1 and file 2? Do you want the status of the most recent file submission of that type or "Not submitted" if that file type is null? (Or do you want the 'best' status of any file submitted?) – user681574 Mar 16 '17 at 18:30
  • I would like to have the most recent processing of file submission for both files on the same row. file 1 could have processed yesterday after 10th time with passed status and file 2 could have processed today with failed and my row should contain the last processed file 1 status and last processed file 2 status. – Ron Mar 16 '17 at 19:50
  • I've updated the answer to hopefully capture these requirements. – user681574 Mar 16 '17 at 21:03
  • I made a classic mistake there and unfortunately couldn't test it myself. I've reorganized it following this logic: http://stackoverflow.com/questions/3680254/t-sql-selecting-column-based-on-maxother-column – user681574 Mar 17 '17 at 14:35
  • I am getting incorrect syntaxt near the keyword 'as' in line 18 and 30 - group by OC.CMS_CONTRACT_NUMBER, case when (FS.DISPLAY is not null) then FS.display else 'Not Submitted' end as 'FileStatus' and also, a red squiggly under A on the 1st line – Ron Mar 17 '17 at 14:49
  • I removed the aliases in the group by which I accidentally left in from a copy and paste from the select statement. This might clear up the sqiggly under the A, but I don't see anything specifically wrong with the A right now. – user681574 Mar 17 '17 at 15:09
  • I'm so terrible without access the the tables. It's pretty quick to try and review this when I can test the query myself. It is missing the keyword "End" for the case statement. Sorry for the syntax junk! – user681574 Mar 17 '17 at 15:58
  • 1
    Well shoot - that's the danger of using select * in a CTE (with statement). The join must bring back the same column name from multiple tables. To fix this, you'll need to type out each of the columns that you'd like to utilize from the joined tables to replace the * in that select statement. If you want to utilize columns from two tables in the join with the same name, it will require aliasing those columns. I will give this a shot, but if I miss one, just add the 'error column' to the list. – user681574 Mar 17 '17 at 16:05
  • 1
    hmm, not sure which DB you are on. This would work in Oracle, but may need to be adapted slightly for SQL Server or another DB. You may be able to remove the 'max' function on those completely as the value returned should already be the 'max' and it should be a single value. You also may need to add the aliases 'File 1 Status' and 'File 2 Status' to the Group By if you remove the max function. – user681574 Mar 17 '17 at 17:43
  • 1
    Looking over a few StackOverflow articles, it may be useful to change the "With" statements into Joins within the main query depending on the DB. – user681574 Mar 17 '17 at 17:54
  • 1
    Ok, I've refactored the CTEs (with statements) that contained grouping functions to make them into joins. This appears to solve most issues on other similar errors in SQL Server. – user681574 Mar 17 '17 at 19:46
  • I am getting error on line 12 - Column 's1.FileStatus' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause., but when i added s1.filestatus & s2.filestatus in GROUP By clause, I am getting results, but H0000 gets repeated with null values and passed and failed status in different rows. – Ron Mar 20 '17 at 12:31
  • Contract # File 1 Status File 2 Status Date Submitted H0000 NULL NULL 2017-03-08 15:37:32.590 H0000 NULL Failed 2017-03-16 10:33:39.173 H0000 Failed NULL 2017-02-14 15:52:02.177 H0000 Passed NULL 2017-03-08 15:37:32.663 H0022 NULL NULL NULL – Ron Mar 20 '17 at 12:34
  • 1
    I modified your query to add MAX(s1.FileStatus) as 'File 1 Status' and MAX(s2.FileStatus) as 'File 2 Status' on line 12 & 13 and now getting the desired results. But I am not getting status as 'Not Submitted' for File 1 and File 2 for null values. Thanks for your great help @user681574 – – Ron Mar 20 '17 at 13:47
  • Ok, I may have inadvertently eliminated the null values by including the 'like' statements in the where clauses on the joins. I've hopefully added them back in with the " or A.PROCESSED_FILE_NAME is null" in each of the join where clauses. – user681574 Mar 20 '17 at 14:52
  • still 'Not Submitted' is not displaying for null...tried to modify your query to include 'Not submitted' for null, but no luck – Ron Mar 20 '17 at 16:15