I need some suggestions for troubleshooting this issue. I have the following setup:
Application T-SQL
I have SQL code in an application that executes a SQL Agent Job.
(exec @rtn = msdb.dbo.sp_start_job 'Job Name')
If the SQL Agent Job returns a 0, I report a successful launch on the UI.
If the SQL Agent Job returns a 1, I report a failure to launch on the UI.
SQL Agent Job
Step 1 – Execute SSIS Package
On Success – End Job Successfully
On Failure – Go To Next Step
Step 2 – Send Custom Failure Email
On Success – End Job Successfully
On Failure – End Job Unsuccessfully
*In the event of unsuccessful SQL Agent job run, a default SQL Agent failure email is sent.
SSIS Package
Runs a bunch of steps to import a spreadsheet into a table.
If any step fails, it generates a custom failure email particular to the step that failed.
If whole package succeeds, it generates a custom success email.
Here is the behavior:
Every two weeks, the user manually initiates the T-SQL from within the application. The first time the user runs it that day, they get a message from the application saying the job started successfully, but then they get the custom failure email from SQL Agent under Step 2. Based on the state of the data, the package clearly has not run successfully. None of the package’s custom emails are received.
Next the user re-runs the job; they do nothing else but re-run it. Now, the entire thing runs successfully. And it runs successfully no matter how many times it is run again that day. That is until two weeks later (after two weeks of not running it) and then the cycle repeats.
My troubleshooting steps:
I’ve manually run the SSIS package multiple times by itself. I’ve update the file that it imports the same way the user does. I can’t get the SSIS package to fail.
I’ve run the SQL Agent manually multiple times. Each time it kicks off the SSIS package just fine. But I’ve done this after I’ve received the initial failure email. I’ve yet to be able to reproduce the error. But I have yet to try this after a week of downtime. I plan to test that next.
My thoughts:
SQL Agent is telling the T-SQL execution that it successfully started, which is corroborated by the fact that SQL Agent is saying it failed to execute the SSIS package via the custom email. And the SSIS package is not saying anything, which suggests it isn’t running at all. I'm thinking this is an issue between SQL Agent and SSIS.
I have several other jobs set up the exact same way. None of them are having any issues. The only difference is that these other jobs are scheduled to run nightly and are only occasionally kicked off manually. The one that is failing is the only one that sits dormant for days/weeks at a time. That is the only thing different that I can point to.
The SQL server and the SSIS packages are on the same physical server. So, it’s unlikely to be a connection issue from a network perspective. It’s possible that it’s a Microsoft issue since SQL Agent is using a proxy to access the SSIS package based on standard configuration. But I can’t find anything on the tech support sites where someone has reported the same symptoms. Everything I’ve researched on that topic discusses things that would stop it from working each and every time. I’ve found nothing that would cause it to fail once and then start working… as if something were asleep and being woken up.
EDIT Turns out it is a failure on the SSIS Package. Here is the SQL Agent History:
03/23/2017 06:32:31,PTO Balance Import,Error,1,SERVERNAME,PTO Balance Import,ExecutePackage,,Executed as user: SERVERNAME\appPROXY. Microsoft (R) SQL Server Execute Package Utility Version 12.0.5000.0 for 32-bit Copyright (C) Microsoft Corporation. All rights reserved. Started: 6:32:31 AM Error: 2017-03-23 06:32:32.70 Code: 0xC0202009 Source: PTO Balance Import Connection manager "Excel Connection Manager" Description: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft Access Database Engine" Hresult: 0x80004005 Description: "External table is not in the expected format.". End Error Error: 2017-03-23 06:32:32.70 Code: 0xC020801C Source: Import PTO Balances PTO Balance [14] Description: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager "Excel Connection Manager" failed with error code 0xC0202009. There may be error messages posted before this with more information on why the AcquireConnection method call failed. End Error Error: 2017-03-23 06:32:32.70 Code: 0xC0047017 Source: Import PTO Balances SSIS.Pipeline Description: PTO Balance failed validation and returned error code 0xC020801C. End Error Error: 2017-03-23 06:32:32.70 Code: 0xC004700C Source: Import PTO Balances SSIS.Pipeline Description: One or more component failed validation. End Error Error: 2017-03-23 06:32:32.72 Code: 0xC0024107 Source: Import PTO Balances Description: There were errors during task validation. End Error DTExec: The package execution returned DTSER_FAILURE (1). Started: 6:32:31 AM Finished: 6:32:32 AM Elapsed: 0.797 seconds. The package execution failed. The step failed.,00:00:01,0,0,,,,0