2

I have an SSIS package which has two source inputs sorted identically and with the same collation. The Merge Join is doing a full outer join.

SSIS Package

Merge Join

The two Queries in the source are as follows:

SELECT [PolicyReference]
      ,[PolicyNarrativeReference]
      ,[PolicyNarrativeTypeCode]
      ,[PolicySystemCode]
      ,[NaturalKeyHash]
      ,[FullRowHash]
      ,[SystemName]
      ,[FunctionalEntityName]
  FROM [KeyHash].[dbo].[vw_Debug_PolicyNarrative_Server_KeyHash]
  ORDER BY [PolicyReference] COLLATE Latin1_General_CI_AS
      ,[PolicyNarrativeReference] COLLATE Latin1_General_CI_AS
      ,[PolicyNarrativeTypeCode] COLLATE Latin1_General_CI_AS
      ,[PolicySystemCode] COLLATE Latin1_General_CI_AS


SELECT [PolicyReference]
      ,[PolicyNarrativeDate]
      ,[PolicyNarrativeReference]
      ,[PolicyNarrativeText]
      ,[PolicyNarrativeTypeCode]
      ,[PolicyNarrativeSystemCode]
      ,[PolicySystemCode]
  FROM [KeyHash].[dbo].[vw_Debug_PolicyNarrative_Server_Source]
  ORDER BY [PolicyReference] COLLATE Latin1_General_CI_AS
      ,[PolicyNarrativeReference] COLLATE Latin1_General_CI_AS
      ,[PolicyNarrativeTypeCode] COLLATE Latin1_General_CI_AS
      ,[PolicySystemCode] COLLATE Latin1_General_CI_AS

Now when this Package is run on the SQL Server through the SQL Agent, The sorting of the two sets becomes skewed and creates 2995 rows which have a blank output

SELECT *
  FROM [KeyHash].[dbo].[Debug_PolicyNarrative_Server_MJ]
  where NaturalKeyHash = 0

Skewed sorting

However when we run the exact same SSIS package locally with the exact same variables, Servers, etc, the sorting works as intended and produces exactly what we want.

SELECT *
  FROM [KeyHash].[dbo].[Debug_PolicyNarrative_Local_MJ]
  where NaturalKeyHash = 0

No problem

The data is exactly the same, the queries are exactly the same, the collations are identical. What on earth could be causing this strange behaviour?

We have tried executing this same package with the same variables and connections but we are seeing the results as above and we are completely stumped.

Also worth mentioning as per David Browne's comment that we do not have Duplicate values within the Order By columns

SELECT COUNT(*)
,[PolicyReference] 
      ,[PolicyNarrativeReference] 
      ,[PolicyNarrativeTypeCode] 
      ,[PolicySystemCode] 
  FROM [KeyHash].[dbo].[vw_Debug_PolicyNarrative_Server_KeyHash]
  GROUP BY [PolicyReference]
      ,[PolicyNarrativeReference]
      ,[PolicyNarrativeTypeCode]
      ,[PolicySystemCode] 
HAVING COUNT(*) > 1

SELECT COUNT(*)
,[PolicyReference] 
      ,[PolicyNarrativeReference] 
      ,[PolicyNarrativeTypeCode] 
      ,[PolicySystemCode] 
  FROM [KeyHash].[dbo].[vw_Debug_PolicyNarrative_Server_Source]
  GROUP BY [PolicyReference] 
      ,[PolicyNarrativeReference] 
      ,[PolicyNarrativeTypeCode] 
      ,[PolicySystemCode] 
HAVING COUNT(*) > 1

Output NoDuplicates

  • Are there any duplicates for the ORDER BY columns? If so the ordering within each group is non-deterministic. – David Browne - Microsoft Jan 28 '23 at 18:39
  • If you modify your views to break your problem down to 2 rows. 1 that finds a match and one that does not find a match in the server execution scenario, see whether you consistently get the same erroneous results. The other thing I would do is set up two [data taps](https://www.timmitchell.net/post/2016/12/13/ssis-data-taps/) between "MH -> MRGJ" and the other between "CNT -> MRGJ" That will allow you to see the few rows of data flowing there and mentally compare them to see whether the output is as expected. Otherwise, does the locale of your dev machine match server? Patch level match? – billinkc Jan 28 '23 at 21:06
  • @DavidBrowne-Microsoft - Thanks for your comment, I have updated the question with this info. We do not have duplicates. – Michael Oakman Jan 28 '23 at 22:28
  • @billinkc - Thank you for your comment, The locale and patching is definitely something we may need to look at. The data taps are an interesting concept, however we have done something very similar where we have looked at both outputs before the merge join locally and when executed on the server. The outputs are sorted identically at this stage on both Local and server. it seems that only the Merge Join is behaving differently – Michael Oakman Jan 28 '23 at 22:33
  • New Development: Executed the package from SQL Agent job in 32 bit mode and it ran successfully so we now believe the issue is specifically with 64 bit runtime – Michael Oakman Jan 30 '23 at 17:16

0 Answers0