Summary

I have what is, essentially, a load balancer implemented in Node.js. It takes requests from client sockets and pipes them to the appropriate server socket based on hostname. It works great for my use case, except that after running for several hours with a steady stream of connections, memory usage grows from an initial 40% to 80% and beyond. I can't seem to find where the memory leak is.

What I've Tried

I've tried replacing all anonymous functions with named functions that use `this` context instead of closures, but that didn't seem to help. Sample code is below.

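// Assumes CommonJS; in an ES module use `import net from 'net'` instead.
const net = require('net')

const run = async () => {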
  const gateways = await fetchGateways()
  const gatewayIndexByHostname = {}

  const getGatewayIP = (hostname) => {
    // reads from gateways and mutates gatewayIndexByHostname
    // returns destination IP address
  }

  const server = net.createServer()

  server.on('connection', (clientSocket) => {
    clientSocket.on('error', () => {
      clientSocket.destroy()
    })

    clientSocket.once('data', (buffer) => {
      const data = buffer.toString()

      const destinationHostname = getDestinationHostname(data)
      const gatewayIP = getGatewayIP(destinationHostname)

      const [host, port] = gatewayIP.split(':')
      const gatewaySocket = net.createConnection({ host, port })

      gatewaySocket.on('error', (err) => {
        clientSocket.end(handleError(500, err))
        gatewaySocket.destroy()
      })

      gatewaySocket.on('connect', () => {
        clientSocket.pipe(gatewaySocket)
        gatewaySocket.pipe(clientSocket)
        gatewaySocket.write(data)
      })
    })

  })

  server.listen(PORT, () => {
    console.log(`listening on port ${PORT}`)
  })

}

run()

That's more or less what the code is doing. My heap snapshots show that `Socket` objects and everything related to them increase with every request. Any help in understanding the cause of the memory leak would be awesome.
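To check whether every socket that opens also closes, a small counter can help. The following is a minimal sketch, not part of the code above: `createSocketTracker`, `track`, and `report` are hypothetical names, and it relies only on the `close` event that `net.Socket` emits.

```javascript
// Hypothetical diagnostic helper: counts sockets seen and sockets closed,
// grouped by a caller-supplied label such as 'client' or 'gateway'.
const createSocketTracker = () => {
  const counts = {}

  // Register a socket under a label and count its eventual 'close' event.
  const track = (socket, label) => {
    const entry = counts[label] || (counts[label] = { opened: 0, closed: 0 })
    entry.opened += 1
    // 'once' so a socket is never counted as closed more than one time.
    socket.once('close', () => {
      entry.closed += 1
    })
    return socket
  }

  // Return a snapshot of the counters, including how many are still open.
  const report = () => {
    const snapshot = {}
    for (const [label, { opened, closed }] of Object.entries(counts)) {
      snapshot[label] = { opened, closed, stillOpen: opened - closed }
    }
    return snapshot
  }

  return { track, report }
}
```

In the `connection` handler this could be called as `tracker.track(clientSocket, 'client')` and, right after `net.createConnection`, as `tracker.track(gatewaySocket, 'gateway')`. Logging `tracker.report()` on an interval would then show whether `stillOpen` keeps growing for either side.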

svaterlaus
  • does your load balancer close all the sockets when the connection finishes? – user253751 Nov 06 '20 at 16:23
  • One tool commonly used for this is to take [heap snapshots](https://medium.com/@wavded/how-to-heap-snapshots-aac9284d5329) before and after and then use analysis tools to examine what is taking up the additional memory between snapshots. This will sometimes show you what types of data are causing the increase which you can then use as a clue in your source code as to where to look for a problem. – jfriend00 Nov 06 '20 at 16:34
  • If you think you might be failing to close sockets, you can also probably look at some system metrics for your process to see if it has a larger than expected (or growing over time) usage of certain types of system resources (such as sockets). – jfriend00 Nov 06 '20 at 16:36
  • @user253751, I've put listeners for the `close` event on both the client and gateway sockets and it triggers for both after every request, so I assume that means they're closing properly? Or do I still need to run `socket.close` for some reason? – svaterlaus Nov 06 '20 at 16:40
  • https://stackoverflow.com/questions/9191587/how-to-disconnect-from-tcp-socket-in-nodejs this says you need to call `destroy` – user253751 Nov 06 '20 at 16:41
  • @jfriend00, it's definitely the sockets that are taking up the resources. I've checked and the `close` event is triggering for every request on both sockets. I assume that means that they're fully closing? – svaterlaus Nov 06 '20 at 16:42
  • Hmmm, @user253751, I've read that `destroy` is only for when a socket errors out... since you can't close it properly... – svaterlaus Nov 06 '20 at 16:43
  • Does EVERY socket get closed? If you count how many open and how many close, do they match? I've had issues with `.pipe()` that doesn't always clean up properly in some error circumstances. – jfriend00 Nov 06 '20 at 16:45
  • @jfriend00, I'll run a test to be sure. Good idea. – svaterlaus Nov 06 '20 at 16:45
  • What is `proxySocket` in your code? Also, why are you setting up the `.pipe()` when receiving the first `data` message rather than upon connection? Won't the piped targets miss that first `data` event because it has already happened? FYI, there is battle tested proxy code available for nodejs that you could use rather than write your own from scratch. – jfriend00 Nov 06 '20 at 16:55
  • @jfriend00, ran a test and about 3% of client sockets are not closing after 30 seconds, so I'll fix that and see how much it helps. Thanks for the tip! `proxySocket` was just a typo, should be `gatewaySocket`; I'll fix that. My custom routing mechanism depends on the hostname of the request and I need to get that from the first data message. So that's why I wait until I have that before piping the request, so I know which gateway to pipe it to. At the end there you can see that I write that first chunk of data to the gateway socket after I've read it so the target still gets it. – svaterlaus Nov 06 '20 at 17:08
  • @jfriend00, if you could point me to the battle-tested proxy code, that would be great! Hopefully I'll still be able to make it work with my routing mechanism. – svaterlaus Nov 06 '20 at 17:09
  • You can look at [proxy](https://www.npmjs.com/package/proxy) and [http-proxy](https://www.npmjs.com/package/http-proxy). Outside of nodejs programs, [NGINX](https://www.nginx.com/) is often a go-to solution for high performance load balancing. – jfriend00 Nov 06 '20 at 17:16
  • Thanks! I'll check it out and see what I can do. – svaterlaus Nov 06 '20 at 17:24
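
Given that some client sockets were found not to close, one option is to tie the two sockets' lifetimes together and add an idle timeout. This is a sketch under assumptions, not a confirmed fix for the leak: `linkSockets` and the 30 second default are hypothetical, and it uses only `socket.setTimeout`, `socket.end`, `socket.destroy`, and the `timeout`, `error`, and `close` events from Node's `net` module.

```javascript
// Hypothetical helper: pipes two sockets into each other and makes sure
// that neither one outlives the other or sits idle forever.
const linkSockets = (clientSocket, gatewaySocket, idleMs = 30000) => {
  const sockets = [clientSocket, gatewaySocket]

  // Hard teardown, used when something went wrong or nothing is happening.
  const destroyBoth = () => {
    for (const socket of sockets) {
      if (!socket.destroyed) socket.destroy()
    }
  }

  sockets.forEach((socket, index) => {
    const peer = sockets[1 - index]

    // 'timeout' only notifies; the socket stays open unless it is
    // explicitly ended or destroyed.
    socket.setTimeout(idleMs)
    socket.on('timeout', destroyBoth)
    socket.on('error', destroyBoth)

    // When one side is fully closed, end the other side gracefully so
    // data already written to it can still be flushed.
    socket.on('close', () => {
      if (!peer.destroyed) peer.end()
    })
  })

  clientSocket.pipe(gatewaySocket)
  gatewaySocket.pipe(clientSocket)
}
```

It would replace the two `.pipe()` calls in the `connect` handler, followed by the existing `gatewaySocket.write(data)`. The existing `error` handlers would need to be reconciled with it, and a client that never sends its first chunk would still need its own `setTimeout` at connection time.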

0 Answers