
The child-process aspect may or may not be relevant; the issue might simply be the number of concurrent/parallel connections/requests, but even that is not high (50 max).

My architecture is that my main application spins up child processes on a CRON schedule. Each spawned process is passed a unique sports event to focus on, and the CRON job is based on the start time of that event. If event times are the same, the spawns are spaced 1.2 seconds apart. The number of child processes varies, but the number running at any one time should never exceed 50. Each child process creates a new connection to an internal API. The internal API has only 1 process running and, as far as I understand, cannot scale to other machines or fork child processes (as is suggested here & here), because the internal API, in turn, calls an external API that originally threw back ECONNRESETs after many requests. Since I didn't know the nature of the external API, I put my own API in front of it to make sure all requests to the external API go through a single connection. However, now I am getting ECONNRESETs back from my internal API even when only 10 child processes have been spun up.

Connections to my internal API are kept alive to decrease response times and are capped at a single socket using Axios. Diagram of the architecture below:

*(architecture diagram: child processes → internal API → external API)*

The child processes poll the internal API on a schedule, so it's possible (likely) that parallel requests are being made to it. However, because the processes are spawned at least 1.2 seconds apart, I assumed this would decrease the likelihood of parallel requests. What is the difference between Express's ability to handle parallel requests and its ability to handle concurrent requests? I know that Express can handle thousands of concurrent requests without fail.

axios.js (worker)

const axios = require('axios')
const http = require('http')
const https = require('https')

module.exports = axios.create({
    baseURL: process.env.INTERNAL_API_ENDPOINT,
    // Reuse a single TCP connection to the internal API
    httpAgent: new http.Agent({
        keepAlive: true,
        maxSockets: 1
    }),
    httpsAgent: new https.Agent({
        keepAlive: true,
        maxSockets: 1
    }),
    timeout: 60000, // 1 min timeout
    auth: {
        username: process.env.INTERNAL_API_USERNAME || 'user',
        password: process.env.INTERNAL_API_PASSWORD || 'pass'
    },
    headers: {
        Connection: 'keep-alive',
        'Content-Type': 'application/json',
        Accept: 'application/json'
    }
})

request.js

const API = require('./axios')

module.exports = async function (params) {
    // Note: without `await`, the original try/catch could never catch a
    // request error; returning the promise propagates rejections instead
    return API.get('/api/events', { params })
}

event.js (internal API)

const { Router } = require('express')
const external = require('../client/betting')
const router = Router()

/**
 * GET /api/events
 */
router.get('/events', async (req, res, next) => {
    const { query } = req

    try {
        // Call to External API
        const response = await external.GetEvents(query)

        res.send(response.data)
    } catch (err) {
        next(err)
    }
})

module.exports = router

This answer suggests that my internal API could be overloaded with requests and is dropping connections as a result.

What would seem to be the issue here? Is it my parallel requests that are causing the ECONNRESETs? If so, I can think of 3 potential remedies (looking for the best option, really):

  1. I could queue the requests on my internal API side, as is suggested here
  2. I could refactor my code not to spin up child processes and, therefore, have only 1 process and consequently 1 connection to the API. This is not preferable as it is a big architecture change, but I will do it if it's the best suggestion
  3. Is there a way to scale my internal API where the child processes can share the TCP connection of the master so that, again, there is only 1 connection to the external API? Something like cluster-client and the Leader/Follower pattern mentioned there
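For option 1, a minimal in-process sketch of such a queue could look like this (illustrative only; in production you might reach for an existing library such as p-queue, or an external broker):

```javascript
// A serial queue: each task starts only after the previous one settles,
// so at most one request is in flight to the external API at any time.
class SerialQueue {
    constructor() {
        this.tail = Promise.resolve()
    }

    enqueue(task) {
        const run = this.tail.then(() => task())
        // Swallow rejections on the chain so one failed task
        // doesn't block every task queued after it
        this.tail = run.catch(() => {})
        return run
    }
}

module.exports = SerialQueue
```

The `/events` handler would then call something like `queue.enqueue(() => external.GetEvents(query))` instead of calling the client directly.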

If none of this is clear then I can provide more clarity where needed :) thanks!

wmash
  • Is the external API still closing its connection to your internal API? i.e. is your internal API receiving `ECONNRESET` from the external? The code sample looks like the internal API is not throttling requests to the external API: `await external.GetEvents(query)` – dm03514 Feb 10 '20 at 17:33

1 Answer

If I'm reading your question correctly, the issue is a strict limit on the external API, which is a bottleneck for your application.

This is rough because you may create more events than the external API can possibly handle! At some point, it's conceivable that the external API just can't keep up with your load.

For the initial implementation I would explore your first approach:

I could queue the requests on my internal API side as is suggested here

I would go as far as making the internal API reactive, so that it reads from a queue or event stream.

In this architecture your workers enqueue an event and then continue. The internal API listens for events on the queue; it can either pull events one after another, making one request to the external API per event, or, if you begin to overwhelm the external API, batch events (i.e. pull 100 events at a time, or pull on some interval).

Hopefully the external API has enough capacity to handle your load and you are able to batch events. But at some point your load could exceed what the external API is actually able to handle.
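As a rough sketch of such a batching consumer (the names, the 100-event batch size, and `sendBatch` are illustrative, not a real client):

```javascript
// Pull up to maxBatch events off an in-memory queue (an array used as a FIFO)
function drainBatch(queue, maxBatch = 100) {
    return queue.splice(0, maxBatch)
}

// Poll the queue on a fixed interval and make one external-API call per
// batch; sendBatch is a placeholder for the real external client call
function startConsumer(queue, sendBatch, intervalMs = 1000) {
    return setInterval(async () => {
        const batch = drainBatch(queue)
        if (batch.length > 0) {
            await sendBatch(batch)
        }
    }, intervalMs)
}

module.exports = { drainBatch, startConsumer }
```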

dm03514
  • After doing some more debugging, I was wrong about which API I was getting the ECONNRESET from. In the end, it was the API managing my database connection. This API works fine when I run it locally but fails when I deploy it to a DO box. I'm assuming because the box does not have the resources required. This internal API is on a single process. I am not sure of its scalability in production as the box is 'Standard' & has shared resources. I think the only way to use a box with dedicated resources – wmash Feb 12 '20 at 20:18