I'm developing a cloud service (worker role) that collects data from a number of instruments. The instruments report data at irregular intervals, roughly once a minute. The service itself is not performance critical and doesn't need to be asynchronous. If a connection attempt fails, the instruments can keep resending their data for up to an hour.
I have tried several implementations for my cloud service, including this one:
http://msdn.microsoft.com/en-us/library/system.net.sockets.tcplistener.stop(v=vs.110).aspx
But all of them hang my cloud service sooner or later (sometimes within an hour). I suspect something is wrong with my code. I have a lot of logging in my code, but I get no errors. The service just stops receiving incoming connections.
In the Azure portal the service appears to be running fine: no error logs, no suspicious CPU usage, etc.
If I restart the service, it runs fine again until the next time it hangs.
I would be most grateful if someone could help me with this.

    public class WorkerRole : RoleEntryPoint
    {
        private LoggingService _loggingService;

        public override void Run()
        {
            _loggingService = new LoggingService();
            StartListeningForIncommingTCPConnections();
        }

        private void StartListeningForIncommingTCPConnections()
        {
            TcpListener listener = null;
            try
            {
                listener = new TcpListener(RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["WatchMeEndpoint"].IPEndpoint);
                listener.Start();
                while (true)
                {
                    _loggingService.Log(SeverityLevel.Info, "Waiting for connection...");
                    // Blocks until an instrument connects; clients are handled one at a time.
                    var client = listener.AcceptTcpClient();
                    var remoteEndPoint = client.Client != null ? client.Client.RemoteEndPoint.ToString() : "Unknown";
                    _loggingService.Log(SeverityLevel.Info, String.Format("Connected to {0}", remoteEndPoint));
                    var netStream = client.GetStream();
                    var data = String.Empty;
                    using (var reader = new StreamReader(netStream, Encoding.ASCII))
                    {
                        // Returns only when the remote side closes the connection.
                        data = reader.ReadToEnd();
                    }
                    _loggingService.Log(SeverityLevel.Info, "Received data: " + data);
                    ProcessData(data); //data is processed and stored in database (all resources are released when done)
                    client.Close();
                    _loggingService.Log(SeverityLevel.Info, String.Format("Connection closed for {0}", remoteEndPoint));
                }
            }
            catch (Exception exception)
            {
                _loggingService.Log(SeverityLevel.Error, exception.Message);
            }
            finally
            {
                if (listener != null)
                    listener.Stop();
            }
        }

        private void ProcessData(String data)
        {
            try
            {
                var processor = new Processor();
                var lines = data.Split('\n');
                foreach (var line in lines)
                    processor.ProcessLine(line);
                processor.ProcessMessage();
            }
            catch (Exception ex)
            {
                _loggingService.Log(SeverityLevel.Error, ex.Message);
                // Rethrow the original exception; ex.InnerException may be null.
                throw;
            }
        }
    }

One strange observation I just made:
I checked the log recently and no instrument had connected for the last 30 minutes (which indicates that the service is down).
I connected to the service with a TCP client I've written myself and uploaded some test data.
This worked fine.
When I checked the log again, my test data had been stored.
The strange thing is that 4 other instruments had connected at about the same time and sent their data successfully.
Why couldn't they connect by themselves before I connected with my test client?
Also, what does the idleTimeoutInMinutes setting on an InputEndpoint in the .csdef do?
===============================================
Edit:
For the past couple of days my cloud service had been running successfully.
Unfortunately, this morning the last log entry was from this line:

    _loggingService.Log(SeverityLevel.Info, String.Format("Connected to {0}", remoteEndPoint));

No other connections could be made after this, not even from my own test TCP client (I didn't get any error, but no data was stored and no new log entries appeared).
This makes me think that the following code causes the service to hang:

    var netStream = client.GetStream();
    var data = String.Empty;
    using (var reader = new StreamReader(netStream, Encoding.ASCII))
    {
        data = reader.ReadToEnd();
    }

I've read somewhere that StreamReader's ReadToEnd() can hang. Is this possible?
I have now changed this piece of code to this:

    int i;
    var bytes = new Byte[256];
    var data = new StringBuilder();
    const int dataLimit = 10;
    var dataCount = 0;
    while ((i = netStream.Read(bytes, 0, bytes.Length)) != 0)
    {
        data.Append(Encoding.ASCII.GetString(bytes, 0, i));
        if (dataCount >= dataLimit)
        {
            _loggingService.Log(SeverityLevel.Error, "Reached data limit");
            break;
        }
        dataCount++;
    }

Another explanation could be something hanging in the database. I use the SqlConnection and SqlCommand classes to read and write to my database. I always close my connection afterwards (finally block).
SqlConnection and SqlCommand should have default timeouts, right?
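For reference, here is a minimal sketch of setting both database timeouts explicitly instead of relying on the defaults. To my knowledge the connect timeout defaults to 15 seconds and SqlCommand.CommandTimeout to 30 seconds, so neither should block forever. The class name DbTimeouts, the helper names, and the server and database names are hypothetical and not taken from the service above.

```csharp
using System.Data.SqlClient;

public static class DbTimeouts
{
    // Builds a connection whose Open() gives up after connectSeconds.
    // "localhost" and "Example" are placeholder values (assumptions).
    public static SqlConnection CreateConnection(int connectSeconds)
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "localhost",
            InitialCatalog = "Example",
            ConnectTimeout = connectSeconds
        };
        return new SqlConnection(builder.ConnectionString);
    }

    // Creates a command that is abandoned with a SqlException
    // if it runs longer than commandSeconds.
    public static SqlCommand CreateCommand(SqlConnection connection, string sql, int commandSeconds)
    {
        var command = connection.CreateCommand();
        command.CommandText = sql;
        command.CommandTimeout = commandSeconds;
        return command;
    }
}
```

Neither call touches the network; the timeouts only take effect once Open() or one of the Execute methods is called.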
===============================================
Edit:
After some more debugging I found out that when the service stopped responding, it hung on this line of code:

    while ((i = netStream.Read(bytes, 0, bytes.Length)) != 0)

After some digging I found out that the NetworkStream class and its Read methods can actually block indefinitely, even though MS states otherwise.
I've now changed my code into this:
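
    var streamReadSucceeded = false;
    Thread thread = null;
    var task = Task.Factory.StartNew(() =>
    {
        // Remember the worker thread so it can be aborted on timeout.
        thread = Thread.CurrentThread;
        while ((i = netStream.Read(bytes, 0, bytes.Length)) != 0)
        {
            // Translate data bytes to an ASCII string.
            data.Append(Encoding.ASCII.GetString(bytes, 0, i));
        }
        streamReadSucceeded = true;
    });
    // Wait at most 5 seconds for the read to finish.
    task.Wait(5000);
    if (streamReadSucceeded)
    {
        //Process data
    }
    else if (thread != null)
    {
        // The task may not have started yet, in which case thread is still null.
        thread.Abort();
    }
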
Hopefully this will stop the hanging.
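A possible alternative to aborting the thread, sketched under assumptions: NetworkStream has a ReadTimeout property (infinite by default), and a Read call that exceeds it throws an IOException. The class name TimedReader and the helper ReadAllWithTimeout are hypothetical names, and the timeout applies to each individual Read call, not to the transfer as a whole.

```csharp
using System.IO;
using System.Net.Sockets;
using System.Text;

public static class TimedReader
{
    // Reads ASCII data until the remote side closes the connection.
    // Returns null if a single Read call waits longer than timeoutMs
    // or the connection fails. (Hypothetical helper, not part of the service above.)
    public static string ReadAllWithTimeout(TcpClient client, int timeoutMs)
    {
        var stream = client.GetStream();
        stream.ReadTimeout = timeoutMs; // the default is Timeout.Infinite
        var buffer = new byte[256];
        var data = new StringBuilder();
        try
        {
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) != 0)
            {
                data.Append(Encoding.ASCII.GetString(buffer, 0, read));
            }
            return data.ToString();
        }
        catch (IOException)
        {
            // The read timed out or the connection broke; the caller
            // can close the client and go back to accepting connections.
            return null;
        }
    }
}
```

In the accept loop this would be called as ReadAllWithTimeout(client, 5000), followed by client.Close() in either case, so a silent instrument can only hold up the listener for a bounded time.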