Client Details: The issue occurs with code acting as a producer pushing messages into IBM MQ queues and topics. The producer is exposed as a REST interface using WebApi2 and is deployed on IIS. We are using a C# .NET (4.5.2) client to connect to IBM MQ. We connect using the CCDT file AMQCLCHL.TAB to get the client connection details. The underlying libraries used are Apache NMS (1.8.0.4573) and IBM XMS (2.5.0.3).

Exception Received: CWSMQ0006E: An exception was received during the call to the method ConnectionFactory.CreateConnection: CompCode: 2, Reason: 2058.

Error Details: The client works correctly and we are able to push hundreds of thousands of messages through to MQ queues and topics. However, after a random period of time ranging from a few hours to more than a week, the client starts failing with the error noted above. A few more details:

  • The error is resolved by restarting the IIS pool or reloading the application
  • Connecting to the same MQ server from another client (IIS server 2) continues to work while the first client (IIS server 1) continues to have issues

Error seen in AMQERR01.LOG file.

AMQ9516: File error occurred.

EXPLANATION: The filesystem returned error code 6 for file '\\...\AMQCLCHL.TAB'.

ACTION: Record the name of the file '\\...\AMQCLCHL.TAB' and tell the systems administrator, who should ensure that file '\\...\AMQCLCHL.TAB' is correct and available. 

Error code 6 is ERROR_INVALID_HANDLE.

JoshMc
Kailash
  • From cmd.exe, run `netstat -a` to check the status of the failed connection. It appears the connection is half-closing so that the same client cannot reconnect, but other clients can connect. – jdweng Aug 07 '18 at 15:38
  • Will have the admins check on that – Kailash Aug 07 '18 at 15:45
  • What does the queue manager's `AMQERR01.LOG` have at the same time you receive the `2058`? Are you running in XMS managed or unmanaged mode? What is the MQ version that your XMS dlls and amqmdnet.dll are from? – JoshMc Aug 09 '18 at 04:19
  • `2058 = MQRC_Q_MGR_NAME_ERROR`. This means either the queue manager you connected to does not have the name you specified, or the client could not find, for instance, the CCDT for some reason, then tried to find the queue manager locally on your machine, where it does not exist, and so returned `2058`. Based on the symptoms it may be the latter case. Review the `%MQ_FILE_PATH%\errors\AMQERR01.LOG` file to see any possible local causes. Also check the contents of the CCDT to make sure there are no extra entries with the same `QMNAME` that point to the wrong `IP(PORT)`. – JoshMc Aug 10 '18 at 00:19
  • The reason codes that @TheSoftwareJedi is probably misremembering are `2059 = MQRC_Q_MGR_NOT_AVAILABLE` or `2009 = MQRC_CONNECTION_BROKEN`. Auto-reconnect logic would only help with the connection-broken reason, because it only goes into action if you are already connected and the connection is then broken. It does not help if you can't connect in the first place, which seems to be the case when you get a `2058` error. So while auto-reconnect can be a good idea if it fits your needs, in this case it would likely provide no help; you need to understand why you are getting that error message. – JoshMc Aug 10 '18 at 00:22
  • Host Info:- Windows Server 2008 R2 Server Standard Edition, Build 7601: SP1 – Kailash Aug 10 '18 at 12:51
  • Version:- 7.5.0.1 (p750-001-130308) – Kailash Aug 10 '18 at 12:51
  • The client is running in XMS unmanaged mode – Kailash Aug 10 '18 at 12:52
  • Error seen in AMQERR01.LOG file. AMQ9516: File error occurred. EXPLANATION: The filesystem returned error code 6 for file '\\...\AMQCLCHL.TAB'. ACTION: Record the name of the file '\\...\AMQCLCHL.TAB' and tell the systems administrator, who should ensure that file '\\...\AMQCLCHL.TAB' is correct and available. Error code 6 is ERROR_INVALID_HANDLE. The file is currently on the NAS. Moving it to the local disk on the server may help with this issue; going to try that next. – Kailash Aug 10 '18 at 12:55
  • It would be helpful if you edited the question to add the additional information from the logs, since it can be formatted there. 7.5 is out of support and 7.5.0.1 is almost 6 years old. I would suggest you move to a supported version. 9.1 has a redist version that comes as a zip file you can just extract, and it has all the dlls required for XMS.NET. Even if you find the NAS was the cause, it is better to be on a supported version. If you want someone to know you replied, add @user in your comment to ping them. – JoshMc Aug 14 '18 at 06:58
  • @JoshMc, I definitely agree we should be on a supported version of IBM MQ. However, that decision is completely out of our control and we lost the battle to upgrade to a supported version or use an alternate MQ implementation ages ago. – Kailash Aug 22 '18 at 15:19
  • Kailash, note that the client version does not need to match the queue manager version. You can connect from an MQ client at 9.1.0.0 to a queue manager that is at 7.5.0.1, so even if you only have influence and control over the server where the .NET app runs, you can still go to a higher version of the MQ client. As stated, the Redist client does not even require an install; you can unzip it, and it includes all the dlls needed for XMS.NET. – JoshMc Aug 22 '18 at 15:53

2 Answers


This happens when the connection was closed (could be remote server restart, network issues, etc...). This really takes me back - I remember dealing with this back in 2002 connecting a Java J2EE application to MQ on an OS/390.

Recently IBM has implemented auto reconnect settings that can be set in the CCDT or manually on the C# object. This is summarized on the XMS page, and the documentation for implementing that is here.

The properties Client Reconnect Options, Client Reconnect Timeout, and Connection Namelist can also be set via Client Channel Definitions Table (CCDT) or by enabling the client reconnection via the mqclient.ini file.
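
As a rough illustration of the mqclient.ini route, the stanza below is a minimal sketch, assuming the native MQ client used by the application reads an `mqclient.ini` file and that its version supports these attributes. The values shown are illustrative, not recommendations. Reconnection applies to connections that were already established and then broke.

```ini
# Sketch of a CHANNELS stanza in mqclient.ini (values are assumptions).
CHANNELS:
   # Let client connections reconnect automatically after a break.
   DefRecon=YES
   # Give up reconnecting after this many seconds.
   MQReconnectTimeout=1800
```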

TheSoftwareJedi
  • I noted this, will do some more related research and try this out – Kailash Aug 07 '18 at 19:58
  • See my comments on the question above, I think you may be remembering the 2059/2009 return codes, as 2058 is not related to a closed connection. – JoshMc Aug 10 '18 at 00:25

Based on suggestions from @JoshMc, we noticed intermittent errors related to accessing the AMQCLCHL.TAB file on the NAS in the AMQERR01.LOG file. This seemed to mess up the unmanaged client at our end which could only be fixed with an IIS restart. Our setup was updated to move this file to local disk on the server and then point our code to it. This resolved the issue and we have been going strong without issues for the last two weeks since this change was made.
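
How the code was pointed at the local file is not shown here. One way to direct an unmanaged client to a local CCDT without a code change is the `CHANNELS` stanza of `mqclient.ini`, sketched below. The directory `C:\mq\ccdt` is a hypothetical local path, and the sketch assumes the application does not set the CCDT location itself.

```ini
# Sketch only: C:\mq\ccdt is a hypothetical local directory.
CHANNELS:
   # Directory on local disk (not a NAS share) holding the CCDT.
   ChannelDefinitionDirectory=C:\mq\ccdt
   # File name of the CCDT inside that directory.
   ChannelDefinitionFile=AMQCLCHL.TAB
```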

Kailash
  • Glad I could help point you in the right direction. I went ahead and added the errors you noted in the comment to your question as well to help people in the future find this Q/A in google searches. – JoshMc Aug 22 '18 at 16:52