I have a strange problem. For some time I've been trying to replace a small protocol converter (basically a two way serial to ethernet ... master and slave) that I've got for something that has more features.
Backstory
After a lot of reverse engineering I found out how the device works and I've been trying to replicate it and I've been successful in connecting my board to the device ... I've tried connecting the original as the master and my board as slave and vice versa and everything works perfectly, it's actually better since at higher speeds there are no more packet losses (connecting 2 original ones would cause packet losses).
However when I tried connecting my device as master and another one of my devices as slave .. running the exact same piece of code it works for 2 or 3 exchanges and then it stops ... eventually SOMETIMES after some minutes it will try again 2 or 3 more times.
How the tests were made
- I connected a modbus master and slave (modbustools, two different instances). The master is a serial RTU modbus and the slave is an serial RTU modbus;
- I configure one of my devices as master and connect it to the serial port so that it receives the serial modbus and sends the protocol to a device connected to it;
- I configure my slave so that it connects via the serial port to the slave modbus. Basically it works by creating a socket and connecting to the master's IP, it then waits for a master transmission via ethernet, sends it via serial to the slave modbus (modbustools), receives a response, sends it its master and then it sends it to the modbus master (modbustools);
I's a bit confusing but that's how it works ... my master awaits a socket connection and then the communication between them starts, because that is how the old ones work.
I've written an echo client now to test the connection. Basically now, my code connects to a server (my master), it receives a packet, then it replies back the same packet that it received. When I try connecting this to my 2 boards they don't work. It's more of the same, 2 or 3 exchanges and then it stops, but when I connect it to the original device it keeps running without a hitch.
Sources
Here is my TCP master (server actually) initialization:
void initClient() {
if(tcp_modbus == NULL) {
tcp_modbus = tcp_new();
previousPort = port;
tcp_bind(tcp_modbus, IP_ADDR_ANY, port);
tcp_sent(tcp_modbus, sent);
tcp_poll(tcp_modbus, poll, 2);
tcp_setprio(tcp_modbus, 128);
tcp_err(tcp_modbus, error);
tcp_modbus = tcp_listen(tcp_modbus);
tcp_modbus->so_options |= SOF_KEEPALIVE; // enable keep-alive
tcp_modbus->keep_intvl = 1000; // sends keep-alive every second
tcp_accept(tcp_modbus, acceptmodbus);
isListening = true;
}
}
static err_t acceptmodbus(void *arg, struct tcp_pcb *pcb, err_t err) {
tcp_arg(pcb, pcb);
/* Set up the various callback functions */
tcp_recv(pcb, modbusrcv);
tcp_err(pcb, error);
tcp_accepted(pcb);
gb_ClientHasConnected = true;
}
//receives the packet, puts it in an array "ptransparentmessage->data"
//states which PCB to use in order to reply and the length that was received
static err_t modbusrcv(void *arg, struct tcp_pcb *pcb, struct pbuf *p, err_t err) {
if(p == NULL) {
return ERR_OK;
} else if(err != ERR_OK) {
return err;
}
tcp_recved(pcb, p->len);
memcpy(ptransparent.data, p->payload,p->len);
ptransparent->pcb = pcb;
ptransparent->len = p->len;
}
The serial reception is basically this: detect one byte received, start timeout, when timeout ends send whatever was received via a TCP socket that was already connected to the server .. it then receives the packet via the acceptmodbus function and sends it via serial port.
This is my client's (slave) code:
void init_slave() {
if(tcp_client == NULL) {
tcp_client = tcp_new();
tcp_bind(tcp_client, IP_ADDR_ANY, 0);
tcp_arg(tcp_client, NULL);
tcp_recv(tcp_client, modbusrcv);
tcp_sent(tcp_client, sent);
tcp_client->so_options |= SOF_KEEPALIVE; // enable keep-alive
tcp_client->keep_intvl = 100; // sends keep-alive every 100 mili seconds
tcp_err(tcp_client, error);
err_t ret = tcp_connect(tcp_client, &addr, portCnt, connected);
}
}
The rest of the code is the identical. The only thing that changes is the flow of operation.
- Connect to server
- Wait for packet
- send it via serial
- wait for response timeout (same timeout as the server, it justs starts counting in a different way ... server starts after receiving one byte and client after it sent something via the serial port)
- get response and send it to the server
Observation:
No error is detected in the communication. After some testing it doesn't seem to be the number of exchanges that causes the hang. It happens after some time. In my opinion this sounds like a disconnection problem or timeout error, but no disconnection occurs and no more packets are received. When I stop debugging and check the sockets nothing out of the ordinary is detected.