4

My app is a sport betting bot. It's a containerised app running on a DO box which spins up multiple NodeJS process' with cluster. Each child process monitors a single sporting event throughout several betting exchanges (Betfair, Smarkets, & Matchbook).

The master process is the orchestrator of these child ones. On interval, the master process gets the state of each live event. If an event has ended, it will send a message to the child process/worker monitoring that event to die.

The app places 2 bets against each other to cover all outcomes. These 2 bets are supposed to be placed at the same time but, sadly, this is not always the case. If this happens, there is a process to rectify ("rectification process") this position. This is where the problem occurs. Sporadically, a child process will just stop. There are no logs from that process after a certain point. The rectification process works on a recursive setTimeout:

src/rectify.js

const logError = require('./log-error')
const logger = require('./logger/winston')
const Helpers = {
    betfair: {
        tradeOut: require('./betfair/trade-out'),
        retry: require('./betfair/retry')
    },
    smarkets: {
        tradeOut: require('./smarkets/trade-out'),
        retry: require('./smarkets/retry'),
    },
    matchbook: {
        tradeOut: require('./matchbook/trade-out'),
        retry: require('./matchbook/retry')
    }
}

let resolutionAttempts = 0

module.exports = async function(bet, runner) {
    try {
        const _promise = new Promise((resolve, reject) => {
            async function rectify() {
                try {
                    const [lowEx, upEx] = [runner.exchange.toLowerCase(), runner.exchange.toUpperCase()]

                    resolutionAttempts++

                    const isAttemptTradeOut = resolutionAttempts % 2 === 0
                    const _attempt = isAttemptTradeOut ? 'TRADE OUT' : 'RETRY'
                    const _lossMsg = `${(resolutionLossAllowed * 100).toFixed(2)}% LOSS ALLOWED`

                    logger.info(`[ATTEMPTING ${_attempt} FOR ${upEx} WITH ${_lossMsg} (Process: ${process.pid})]`)
 
                    const runners = {
                        matched: { ... },
                        unmatched: { ... }
                    }
                    const result = isAttemptTradeOut
                        ? await Helpers[lowEx].tradeOut(bet, runners)
                        : await Helpers[lowEx].retry(bet, runners)

                    if (result) {
                        return resolve(result)
                    }
                    setTimeout(rectify, 10000) // 10 secs
                } catch(err) {
                    logError(err, __filename)

                    return reject(err)
                }
            }
            rectify()
        })
        const _res = await _promise

        return _res
    } catch(err) {
        logError(err, __filename)

        throw err
    }
})

An example of one of the retry.js files for Matchbook (these files are the same throughout the exchanges):

src/matchbook/retry.js

const bettingApi = require('../api/betting')
const { writeParams } = require('../../../../lib/logger/writer')
const logger = require('../../../../lib/logger/winston')
const utils = {
    isArbStillValid: require('../../../utils/is-arb-still-valid'),
    getFormattedArb: require('../../../utils/get-formatted-arb'),
    logJsonError: require('../../../utils/log-json-error')
}
const arbitrage = {
    getBackBackArbTrade: require('../../../strategies/arbitrage/get-back-back-arb-trade'),
    getBackLayArbTrade: require('../../../strategies/arbitrage/get-back-lay-arb-trade')
}
const PlaceReturns = require('../../../../lib/consts/place-returns')
const placeKeepBet = require('./place-keep-bet')
const placeImmediateBet = require('./place-immediate-bet')
const isNotFoundError = require('./is-not-found-error')

async function retry(arb, config, runners, lossAllowed = 0, isKeep) {
    try {
        if (process.env.NODE_ENV === 'development') {
            writeParams('exchanges/matchbook/helpers/retry-arb', { arb, config, runners, lossAllowed })
        }
        const betId = arb._id.toString()
        const { eventId, marketId, runnerId } = runners.unmatched
        const response = await bettingApi.getPrices(eventId, marketId, runnerId)
        const { prices } = response.data

        if (prices?.length) {
            const samePrice = prices.find(_price => {
                return _price.side === runners.unmatched.side
            })

            if (samePrice) {
                const odds = samePrice['decimal-odds'] || samePrice.odds
                const newArb = {
                    ...arb,
                    runners: [ ... ]
                }
                const isValidArb = utils.isArbStillValid(newArb, lossAllowed)

                if (isValidArb) {
                    const formattedArb = utils.getFormattedArb(newArb)
                    const _newArb = arb.subType === 'BackLay'
                        ? arbitrage.getBackLayArbTrade(formattedArb, config, lossAllowed)
                        : arbitrage.getBackBackArbTrade(formattedArb, config, lossAllowed)

                    if (_newArb) {
                        const unmatchedRunner = _newArb.runners.find(runner => {
                            return runner.exchange === runners.unmatched.exchange
                        })

                        if (unmatchedRunner) {
                            const placeParams = { ... }
                            const result = isKeep
                                ? await placeKeepBet(placeParams)
                                : await placeImmediateBet(placeParams)

                            if (result) {
                                return result
                            }
                        } else {
                            logger.info(`[NO UNMATCHED RUNNER (RETRY | MATCHBOOK | Bet: ${betId} | Process: ${process.pid})]`)
                        }
                    } else {
                        logger.info(`[NO NEW ARB (RETRY | MATCHBOOK | Bet: ${betId} | Process: ${process.pid})]`)
                    }
                } else {
                    logger.info(`[NO VALID ARB (RETRY | MATCHBOOK | Bet: ${betId} | Process: ${process.pid})]`)
                }
                return
            }
            logger.info(`[NO SAME PRICE (RETRY | MATCHBOOK | Bet: ${betId} | Process: ${process.pid})]`)
        } else {
            logger.info(`[NO PRICES (RETRY | MATCHBOOK | Bet: ${betId} | Process: ${process.pid})]`)
        }
        const _response = await bettingApi.getMarket(eventId, marketId)
        const { status } = _response.data

        if (status === 'closed') {
            logger.info(`[MATCHBOOK MARKET HAS ENDED (RETRY | Bet: ${betId} | Process: ${process.pid})]`)

            return PlaceReturns.MATCHBOOK.ENDED
        }
    } catch (err) {
        if (['ECONNRESET', 'EPIPE'].includes(err.code)) {
            logger.info(`${err.code} (MATCHBOOK) - retryArb [process: ${process.pid}]`)

            return retry(arb, config, runners, lossAllowed, isKeep)
        }
        const notFoundError = isNotFoundError(err)

        if (notFoundError) {
            return PlaceReturns.MATCHBOOK.ENDED
        }
        utils.logJsonError(__filename, 'retryArb', err, {
            _throw: true
        })
    }
}

module.exports = retry

src/matchbook/trade-out.js

const bettingApi = require('../api/betting')
const { writeParams } = require('../../../../lib/logger/writer')
const logger = require('../../../../lib/logger/winston')
const utils = {
    getTradeOutProfitStakePairs: require('../../../utils/get-trade-out-profit-stake-pairs'),
    getBestProfitStakePair: require('../../../utils/get-best-profit-stake-pair'),
    logJsonError: require('../../../utils/log-json-error'),
    getCanGreenUp: require('../../../utils/get-can-green-up')
}
const PlaceReturns = require('../../../../lib/consts/place-returns')
const placeKeepBet = require('./place-keep-bet')
const placeImmediateBet = require('./place-immediate-bet')
const isNotFoundError = require('./is-not-found-error')

async function tradeOut(arb, matchedRunner, config, lossAllowed = 0, isKeep) {
    try {
        if (process.env.NODE_ENV === 'development') {
            writeParams('exchanges/matchbook/helpers/trade-out', { arb, matchedRunner, config, lossAllowed, isKeep })
        }
        const betId = arb._id.toString()
        const { eventId, marketId, runnerId } = matchedRunner
        const response = await bettingApi.getPrices(eventId, marketId, runnerId)
        const matchedLay = matchedRunner.side === 'lay'
        const lowEx = matchedRunner.exchange.toLowerCase()
        const { prices } = response.data
        const maxToInvest = config.balances[lowEx] * 0.999

        if (prices?.length) {
            const oppositePrice = prices.find(price => {
                return price.side !== matchedRunner.side
            })

            if (oppositePrice) {
                const odds = oppositePrice['decimal-odds'] || oppositePrice.odds
                const canGreenUp = utils.getCanGreenUp(matchedLay, odds, matchedRunner.price, lossAllowed)

                if (canGreenUp) {
                    const profitStakePairs = utils.getTradeOutProfitStakePairs(matchedRunner, arb, config, lossAllowed)

                    if (profitStakePairs.length) {
                        const bestProfitStakePair = utils.getBestProfitStakePair('BackLay', profitStakePairs, config)
                        const stakeToUse = bestProfitStakePair.side === 'lay' ? bestProfitStakePair.stakes.lay : bestProfitStakePair.stakes.back
                        const placeParams = { ... }
                        const result = isKeep
                            ? await placeKeepBet(placeParams)
                            : await placeImmediateBet(placeParams)

                        if (result) {
                            return result
                        }
                    } else {
                        logger.info(`[NO PROFIT/STAKE PAIRS (TRADE-OUT | MATCHBOOK | Bet: ${betId} | Process: ${process.pid})]`)
                    }
                } else {
                    logger.info(`[CANNOT GREEN UP (TRADE-OUT | MATCHBOOK | Bet: ${betId} | Process: ${process.pid})]`)
                }
                return
            }
            logger.info(`[NO OPPOSITE PRICE (TRADE-OUT | MATCHBOOK | Bet: ${betId} | Process: ${process.pid})]`)
        } else {
            logger.info(`[NO PRICES (TRADE-OUT | MATCHBOOK | Bet: ${betId} | Process: ${process.pid})]`)
        }
        const _response = await bettingApi.getMarket(eventId, marketId)
        const { status } = _response.data

        if (status === 'closed') {
            logger.info(`[MATCHBOOK MARKET HAS ENDED (TRADE-OUT | Bet: ${betId} | Process: ${process.pid})]`)

            return PlaceReturns.MATCHBOOK.ENDED
        }
    } catch (err) {
        if (['ECONNRESET', 'EPIPE'].includes(err.code)) {
            logger.info(`${err.code} (MATCHBOOK) - tradeOut [process: ${process.pid}]`)

            return tradeOut(arb, matchedRunner, config, lossAllowed, isKeep)
        }
        const notFoundError = isNotFoundError(err)

        if (notFoundError) {
            return PlaceReturns.MATCHBOOK.ENDED
        }
        utils.logJsonError(__filename, 'tradeOut', err, {
            _throw: true
        })
    }
}

module.exports = tradeOut

Below are the files that retry.js calls that trade-out.js does not:

src/utils/get-formatted-arb.js

const { writeParams } = require('../../lib/logger/writer')

module.exports = function(arb) {
    if (process.env.NODE_ENV === 'development') {
        writeParams('utils/get-formatted-arb', { arb })
    }
    return {
        ...arb,
        runners: arb.runners.reduce((acc, _runner) => {
            if (arb.subType === 'BackBack') {
                const otherRunner = arb.runners.find(__runner => {
                    return __runner.eventId !== _runner.eventId && __runner.marketId !== _runner.marketId && __runner.runnerId !== _runner.runnerId
                })
                const outcome = (_runner.price < otherRunner.price) ? 'outcome1' : 'outcome2'

                acc[outcome] = _runner
                return acc
            }
            acc[_runner.side] = _runner
            return acc
        }, {})
    }
}

src/utils/is-arb-still-valid.js

const ArbTable = require('../../lib/consts/arb-table')
const { writeParams } = require('../../lib/logger/writer')
const logJsonError = require('./log-json-error')

module.exports = function(arb, lossAllowed = 0) {
    try {
        if (process.env.NODE_ENV === 'development') {
            writeParams('utils/is-arb-still-valid', { arb, lossAllowed })
        }
        const { runners } = arb

        if (arb.subType === 'BackBack') {
            const [r0Price, r1Price] = [runners[0].price, runners[1].price]
            const [r0PriceMinusLoss, r1PriceMinusLoss] = [r0Price - (r0Price * lossAllowed), r1Price - (r1Price * lossAllowed)]

            if (r0PriceMinusLoss < 1.2 && r1PriceMinusLoss < 1.2) {
                return false
            }
            return ArbTable.find(_arb => {
                const outcome1 = runners.find(runner => {
                    return String(runner.outcomeSide) === '1'
                })
                const outcome2 = runners.find(runner => {
                    return String(runner.outcomeSide) === '2'
                })
                const [o1MinusLoss, o1AddLoss] = [_arb.outcome1 - (_arb.outcome1 * lossAllowed), _arb.outcome1 + (_arb.outcome1 * lossAllowed)]
                const [o2MinusLoss, o2AddLoss] = [_arb.outcome2 - (_arb.outcome2 * lossAllowed), _arb.outcome2 + (_arb.outcome2 * lossAllowed)]

                return outcome1.price >= o1MinusLoss && outcome1.price < (o1AddLoss + 0.1) && outcome2.price > o2MinusLoss
            })
        }
        const backRunner = runners.find(_runner => _runner.side === 'back')
        const layRunner = runners.find(_runner => _runner.side === 'lay')
        const [layPrice, backPrice] = [layRunner.price, backRunner.price]
        const layPriceMinusLoss = layPrice - (layPrice * lossAllowed)

        return layPriceMinusLoss < backPrice
    } catch(err) {
        logJsonError(__filename, '_init_', err, {
            _throw: true
        })
    }
}

src/arbitrage/get-back-lay-arb-trade.js

const utils = {
    getHighestLowestExchangeBalances: require('../../utils/get-highest-lowest-exchange-balances'),
    getBackLayStakesAndProfits: require('../../utils/get-back-lay-stakes-and-profits'),
    logJsonError: require('../../utils/log-json-error')
}
const { writeParams } = require('../../../lib/logger/writer')
const logger = require('../../../lib/logger/winston')

module.exports = function(arb, config, lossAllowed = 0) {
    try {
        if (process.env.NODE_ENV === 'development') {
            writeParams('strategies/arbitrage/get-back-lay-arb-trade', { arb, config })
        }

        const highestLowestExchangeBalances = utils.getHighestLowestExchangeBalances(arb.runners, config, lossAllowed)
        const backLayStakesAndProfits = utils.getBackLayStakesAndProfits({
            arb,
            maxToInvest: highestLowestExchangeBalances.lowest,
            config,
            lossAllowed
        })

        if (backLayStakesAndProfits && Object.keys(backLayStakesAndProfits).length) {
            const backLayArbTrade = {
                ...arb,
                type: 'Arb',
                subType: 'BackLay',
                netProfits: backLayStakesAndProfits.netProfits,
                grossProfits: backLayStakesAndProfits.grossProfits,
                runners: Object.keys(arb.runners).map(side => {
                    return {
                        stake: backLayStakesAndProfits.stakes[side],
                        side,
                        ...arb.runners[side]
                    }
                })
            }

            logger.debug(`backLayArbTrade: ${JSON.stringify(backLayArbTrade)}`)

            return backLayArbTrade
        }
    } catch(err) {
        utils.logJsonError(__filename, '_init_', err, {
            _throw: true
        })
    }
}

src/utils/get-highest-lowest-exchange-balances.js

const { writeParams } = require('../../lib/logger/writer')

module.exports = function(runners, config, lossAllowed = 0) {
    if (process.env.NODE_ENV === 'development') {
        writeParams('utils/get-highest-lowest-exchange-balances', { runners, config, lossAllowed })
    }
    const exchangeBalances = Object.values(runners).map(val => {
        const lowEx = val.exchange.toLowerCase()
        const amountToSave = (config.amountToSave[lowEx] || 0)

        if (lossAllowed > 0) {
            return config.balances[lowEx] - amountToSave
        }
        return (config.balances[lowEx] - amountToSave) * (config.maxPercentToUse || 1)
    })

    return {
        lowest: Math.min(...exchangeBalances),
        highest: Math.max(...exchangeBalances)
    }
}

src/utils/get-back-lay-stakes-and-profits.js

const { writeParams } = require('../../lib/logger/writer')
const trimNumber = require('./trim-number')
const getBestProfitStakePair = require('./get-best-profit-stake-pair')
const getPairsBetterThanCurrent = require('./get-pairs-better-than-current')
const logger = require('../../lib/logger/winston')
const logJsonError = require('./log-json-error')

module.exports = function({ arb, maxToInvest, config, lossAllowed = 0 }) {
    try {
        if (process.env.NODE_ENV === 'development') {
            writeParams('utils/get-back-lay-stakes-and-profits', { arb, maxToInvest, config, lossAllowed })
        }
        const profitStakePairs = []
        const [backRunner, layRunner] = [arb.runners.back, arb.runners.lay]
        const [backEx, layEx] = [backRunner.exchange.toLowerCase(), layRunner.exchange.toLowerCase()]
        const [minBack, minLay] = [config.minimumBetSizes[backEx].back, config.minimumBetSizes[layEx].lay]
        const maxQuantityToUse = config.maxQuantityToUse || 1
        const [backCommission, layCommission] = [config.commissions[backEx].back, config.commissions[layEx].lay]
        const [backPrice, layPrice] = [backRunner.price, layRunner.price]
        const [backQuantity, layQuantity] = [backRunner.quantity, layRunner.quantity]
        const [backStake, layStake] = [backRunner.stake, layRunner.stake]
        const [backIsMatched, layIsMatched] = [backRunner.status === 'MATCHED', layRunner.status === 'MATCHED']
        const [matchedBackStake, matchedLayStake] = [backIsMatched && backStake, layIsMatched && layStake]
        const profitStakePairsNotInParams = []

        let isExisting = false

        if (arb._id && process.CURRENT_BETS[arb._id.toString()]) {
            isExisting = true

            const matchedRunner = process.CURRENT_BETS[arb._id.toString()].runners.find(runner => {
                return runner.status.toUpperCase() === 'MATCHED'
            })
            const { netProfits } = process.CURRENT_BETS[arb._id.toString()]

            if (matchedRunner) {
                const matchedNetProfit = process.CURRENT_BETS[arb._id.toString()].subType === 'BackBack' ? netProfits[`outcome${matchedRunner.outcomeSide}`] : netProfits[matchedRunner.side.toLowerCase()]

                lossAllowed = (matchedNetProfit * lossAllowed) * -1
            }
        }

        let [potentialLayStake, potentialBackStake] = [undefined, maxToInvest * 0.999]

        do {
            if (!matchedLayStake) {
                potentialLayStake = ((backPrice - backCommission) * (matchedBackStake || potentialBackStake)) / (layPrice - layCommission)
            }
            const [layStakeToUse, backStakeToUse] = [matchedLayStake || potentialLayStake, matchedBackStake || potentialBackStake]
            const grossLayProfit = layStakeToUse
            const layLiability = (layPrice - 1) * layStakeToUse
            const netLayProfit = layStakeToUse * (1 - layCommission) - backStakeToUse
            const calculatedBackGrossProfit = (backPrice - 1) * backStakeToUse
            const grossBackProfit = backIsMatched ? arb.grossProfits?.back || calculatedBackGrossProfit : calculatedBackGrossProfit
            const backCommissionPaid = grossBackProfit * backCommission
            const netBackProfit = grossBackProfit - backCommissionPaid - layLiability
            const withinLossAllowed = netBackProfit >= lossAllowed && netLayProfit >= lossAllowed
            const amountToSave = config.amountToSave[layEx] || 0
            const trueLayExchangeBalance = (config.balances[layEx] - amountToSave) * (config.maxPercentToUse || 1)
            const betWithinParameters = layStakeToUse <= maxToInvest && layStakeToUse > minLay && layLiability <= trueLayExchangeBalance
            const quantityMagnifier = lossAllowed > 0 ? 1 : maxQuantityToUse
            const stakeWithinQuantityLimits = layStakeToUse <= (layQuantity * quantityMagnifier) && backStakeToUse <= (backQuantity * quantityMagnifier)
            const currProfitStakePair = {
                netProfits: {
                    lay: parseFloat(netLayProfit.toFixed(2)),
                    back: parseFloat(netBackProfit.toFixed(2))
                },
                grossProfits: {
                    lay: parseFloat(grossLayProfit.toFixed(2)),
                    back: parseFloat(grossBackProfit.toFixed(2))
                },
                stakes: {
                    lay: trimNumber(matchedLayStake || potentialLayStake, 2),
                    back: trimNumber(matchedBackStake || potentialBackStake, 2)
                }
            }

            if (betWithinParameters && stakeWithinQuantityLimits && withinLossAllowed) {
                profitStakePairs.push(currProfitStakePair)
            }

            if (potentialBackStake > minBack) {
                potentialBackStake = matchedBackStake ? minBack : potentialBackStake * 0.999
            }
        } while (potentialBackStake > minBack)

        if (profitStakePairs.length) {
            const pairsBetterThanCurrent = getPairsBetterThanCurrent(arb, 'BackLay', profitStakePairs)

            return getBestProfitStakePair('BackLay', pairsBetterThanCurrent, config)
        }
    } catch (err) {
        logJsonError(__filename, '_init_', err, {
            _throw: true
        })
    }
}

src/matchbook/place-keep-bet.js

const logError = require('./log-error')
const logger = require('./logger/winston')
const api = require('./api')

async function placeKeepBet(bet) {
    const betId = bet.id.toString()
    const offer = { ... }
    const _attempt = ...

    try {
        logger.info(`[MATCHBOOK OFFER (PLACE-KEEP-BET | ${_attempt} | Bet: ${betId} | Process: ${process.pid})]${JSON.stringify(offer)}`)

        const response = await api.submitOffers(offer)
        const _offer = response.data.offers[0]

        if (_offer.status === 'failed') {
            logger.error(`[PLACE BET ERROR (MATCHBOOK | PLACE-KEEP-BET | ${_attempt} | Bet: ${betId} | Process: ${process.pid})]`)

            return 'FAILED'
        }
        if (_offer.status === 'matched') {
            logger.info(`[MATCHED (MATCHBOOK | PLACE-KEEP-BET | ${_attempt} | Bet: ${betId} Process: ${process.pid})]`)

            return 'MATCHED'
        }
        if (['flushed', 'cancelled'].includes(_offer.status)) {
            logger.info(`[CANCELLED (MATCHBOOK | PLACE-KEEP-BET | ${_attempt} | Bet: ${betId} | Process: ${process.pid})]`)

            return 'CANCELLED'
        }
        if (['delayed', 'open'].includes(_offer.status)) {
            const [_stake, _remaining] = [_offer.stake || offer.offers[0].stake, _offer.remaining || offer.offers[0].stake]
            const _status = _remaining < _stake ? 'PARTIALLY_MATCHED' : 'UNMATCHED'

            logger.info(`[${_status} (MATCHBOOK | PLACE-KEEP-BET | ${_attempt} | Bet: ${betId} | Process: ${process.pid})]`)

            return 'UNMATCHED'
        }
        const _mresponse = await api.getMarket()
        const market = _mresponse?.data

        if (!['open', 'suspended'].includes(market?.status)) {
            return 'ENDED'
        }
        return 'FAILED'
    } catch (err) {
        logError(err, __filename)

        throw err
    }
}

module.exports = placeKeepBet

Example expected behaviour for logs is:

{"message":"[ATTEMPTING RETRY FOR MATCHBOOK WITH 30.20% LOSS ALLOWED (Process: 174)]","level":"info","timestamp":"2020-12-29 17:38:16"}
{"message":"[MATCHBOOK OFFER (PLACE-KEEP-BET | RETRY | Bet: 1234 | Process: 174)]","level":"info","timestamp":"2020-12-29 17:38:32"}
{"message":"[MATCHED (MATCHBOOK | PLACE-KEEP-BET | RETRY | Bet: 1234 | Process: 174)]","level":"info","timestamp":"2020-12-29 17:38:38"}
{"message":"[EVENT HAS ENDED (Process: 174)]","level":"info","timestamp":"2020-12-29 18:25:01"}
{"message":"[EXITING WITH CODE 0 (Process: 174)]","level":"info","timestamp":"2020-12-29 18:25:01"}

The "EVENT HAS ENDED ..." message is logged from the master process that is polling the state of all the live events. The "EXITING WITH CODE ..." message is logged by the child process acknowledging that the master process has told it to die.

When this issue occurs, actual logs are:

{"message":"[ATTEMPTING RETRY FOR MATCHBOOK WITH 30.20% LOSS ALLOWED (Process: 174)]","level":"info","timestamp":"2020-12-29 17:38:16"}
{"message":"[EVENT HAS ENDED (Process: 174)]","level":"info","timestamp":"2020-12-29 18:25:01"}

Even if there is no offer logged as the rectification is not possible, I would still expect more logs. I have logs throughout retry.js to inform me of what's happening. The timeout callback has been invoked but nothing after that.

I have these hooks in the entry point for a worker (src/worker/index.js) which are not logging either:

const logger = require('../logger/winston')

module.exports = async function() {
     try {
        process.on('exit', code => {
            logger.info(`[EXITING WITH CODE ${code} (Process: ${process.pid})]`)
        })
        process.on('disconnect', () => {
            logger.info(`[WORKER DISCONNECTED (Process: ${process.pid})]`)
        })
     } catch(err) {
         // handle error
     }
}

In the latter example of the logs (Actual) that are not working correctly, the master process has logged that the event has ended but the child process has not. This makes me think that the child process has already exited but has not logged that it has? Is this possible?

It's worth mentioning that the master process continued to log & that other child process' were spun up (& logging) whilst this one was stalled.

To provide a little more info on the "rectification process". The process is a recursive setTimeout. The example I gave above of the expected logs were that of best case scenario where the other side of the "matched bet" was matched instantly & the process succeeded on the 1st pass.

However, this is not always the case & could take many iterations of the process to be rectified. An example of logs here would be:

{"message":"[ATTEMPTING RETRY FOR MATCHBOOK WITH 30.20% LOSS ALLOWED (Process: 174)]","level":"info","timestamp":"2020-12-29 17:38:16"}
{"message":"[NO MATCHBOOK SAME PRICE (RETRY | Bet: 1234 | Process: 174)]","level":"info","timestamp":"2020-12-29 17:38:32"}
{"message":"[ATTEMPTING TRADE OUT FOR SMARKETS WITH 32.10% LOSS ALLOWED (Process: 174)]","level":"info","timestamp":"2020-12-29 17:48:28"}
{"message":"[NO SMARKETS RUNNER (RETRY | Bet: 1234 | Process: 174)]","level":"info","timestamp":"2020-12-29 17:48:34"}
{"message":"[ATTEMPTING RETRY FOR MATCHBOOK WITH 34.50% LOSS ALLOWED (Process: 174)]","level":"info","timestamp":"2020-12-29 17:58:37"}
{"message":"[MATCHBOOK OFFER (PLACE-KEEP-BET | RETRY | Bet: 1234 | Process: 174)]","level":"info","timestamp":"2020-12-29 17:58:38"}
{"message":"[MATCHED (MATCHBOOK | PLACE-KEEP-BET | RETRY | Bet: 1234 | Process: 174)]","level":"info","timestamp":"2020-12-29 17:58:43"}
{"message":"[EVENT HAS ENDED (Process: 174)]","level":"info","timestamp":"2020-12-29 18:25:01"}
{"message":"[EXITING WITH CODE 0 (Process: 174)]","level":"info","timestamp":"2020-12-29 18:25:01"}

EDIT:

This happened again this evening:

enter image description here

These are the logs from a sporting event this evening. These logs are stripped from my UI. I query them via the Loggly API.

To cut through the noise in those logs, what you need to know is @ 06/01/2021 21:06:49 the "rectification process" was started with the log [HANDLING INCOMPLETE IMMEDIATE ARB .... There was 1 attempt shown using a 48.5% loss allowed with Matchbook. There are no further logs after this. There was no logs to say that the worker had died/been disconnected either. This event was not over until approximately 21:45:00.

EDIT 2:

This happened again around 6 hours ago:

enter image description here

The "rectification process" consists of 2 methods which 'take it in-turns'; retry or trade-out. I've noticed that this only happens when it's retry's turn irrespective of the exchange being used. This makes me thing that the error lies in the retry.js files for the exchanges. I've added the trade-out.js file as well for comparison & have edited retry.js to the unedited version.

From all the files I've now added above, they are all wrapped in try...catch's but still no errors.

EDIT 3:

enter image description here

^^ an example showing logs that the app continued to spin up other child process' even though 1 has been stalled

EDIT 4:

About resource usage on the DO box. The CPU usage is pretty much constantly running at 100% CPU. There are 6 docker containers running on the same DO box. 5 of these are just APIs & don't consume many resources but better, this one, is the one that does. These are the results from docker stats:

enter image description here

The reason I wrote off resource usage as an issue was that this issue was happening when the CPU usage on the DO box was only about 20% & docker stats was showing CPU usage for better at below 100%. Running ps -aux, I get this:

enter image description here

As you can imagine, these are all child process' that have been spun up to monitor each event. When spawning them, I don't specify max-old-space-size option. The cluster module must just do that. Preferably, each 1 of these child process' should only consume 0.5gb (512). The command to initially run the docker container is:

docker run -d -p ${PORT}:${PORT} --restart always -m ${MEM_LIMIT} --log-opt max-size=${LOG_LIMIT} --init --name ${CI_PROJECT_NAME} --net ${NETWORK_NAME} --ip ${NETWORK_IP} ${CI_REGISTRY}/${CI_PROJECT_PATH}:latest;

The MEM_LIMIT is 5g for this container. I understand resources might be an issue in the long run but I'm not 100% sure it's an issue here even when the app runs fine without hiccups at 100% CPU usage but causes issues at only 20% usage. Mem usage is > 50% btw.

wmash
  • 4,032
  • 3
  • 31
  • 69
  • 1
    `catch (err) { throw err }` is pointless - just omit the whole `try`/`catch`. And avoid the [`Promise` constructor antipattern](https://stackoverflow.com/q/23803743/1048572?What-is-the-promise-construction-antipattern-and-how-to-avoid-it)! – Bergi Dec 29 '20 at 22:48
  • "*The master process is the orchestrator of these child ones - if the event has ended, it kills the child process monitoring it.*" - are you certain that your master process is not the one killing your processes and all you notice is that they "*sporadically stop*"? I would revisit the entire design - why not just the child processes exit themselves when the event finishes? – Bergi Dec 29 '20 at 22:51
  • @Bergi the `catch` is not pointless. It does not just `throw` the error but logs it. I just stripped that out as to not overload the question with code that didn't matter. I will research the Promise constructor anti-pattern – wmash Dec 29 '20 at 22:57
  • @Bergi I will add in code to show the exit logs. With regards to the child process handling the event, it originally was (https://stackoverflow.com/questions/65030139/child-process-setinterval-sporadically-not-firing) however, it was polling the event every 10 secs or so which, if you had many child process', drove up the CPU usage as each one was polling. It was less resource intensive to have the master do 1 poll but handle many events at once rather than many child process' polling it's own event – wmash Dec 29 '20 at 22:58
  • @Bergi I have updated the question to show the lifecycle hooks for child process'. If the entire process was behaving correctly, I've updated the logs. The master process logs that the event has ended which sends a message to the worker handling it to die. In the latter logs that are borked, there is no message from the worker to acknowledge that the even has ended & is exiting making me think that the child process is already dead but has not logged that it has. – wmash Dec 29 '20 at 23:15
  • Now that you've posted the event handling, I believe you actually have a problem with logging only. What transports is your winston logger using? Have a look at https://github.com/winstonjs/winston/issues/1629 and https://stackoverflow.com/questions/58933772/how-to-force-nodejs-winston-log-to-file-before-process-exit – Bergi Dec 30 '20 at 16:16
  • @Bergi please see my edit. I am using 2 main transport layers. The links you have posted, the GitHub one mentions that it is still being logged to the Console if not to the File. This is not the case for me, there is no further logging to the Console either. I'm intrigued by the `finish` hook on the logger but how would I use that if **all** process' are logging to the same file? If I could get the child process to log if it sporadically exits then that would help with narrowing down the issue. – wmash Dec 31 '20 at 10:36
  • "*all processes log to the same file*" - this might be the actual issue. [It's hard to make this work](https://stackoverflow.com/q/36658995/1048572), and I doubt winston's file transport takes such precautions. I would expect your processes to overwrite each other's logs. – Bergi Dec 31 '20 at 15:25
  • @Bergi would you suggest a different logging mechanism then? Maybe syslog or log stash? I am confused though as to why it would not even appear in the console logs. We've found out that the process is dying but haven't found out why. It's just surprising that it's so sporadic & that 60-70% (ish) of the time, it's fine. I may also limit the number of child process' that can be spun up at any one time. I will do that & switch to another logging mechanism. Do you have any other suggestions to try & narrow down why this would be happening? I will also add a bit more info for you to the question – wmash Dec 31 '20 at 16:59
  • Oh right if you're also logging the console, those should work. So there's probably an additional issue. – Bergi Dec 31 '20 at 17:00
  • @Bergi I've added **EDIT 2** to see if additional context to the "rectification process" will help – wmash Dec 31 '20 at 17:30
  • Could you maybe shorten the question a bit by stripping out all the irrelevant details about the betting logic? That would raise the chance of answers :-) – Bergi Jan 01 '21 at 20:55
  • @Bergi I have stripped as much as I can. I think some of the betting details are relevant to explain the *rectification process* & how it happens as I think that's where the error lies. I have stripped out the initial edit I made about the logging since we ruled out that that was not the issue as it was not logging to the console either – wmash Jan 01 '21 at 21:08
  • @wmash are you sure the server this thing runs on has enough resources? Maybe OS kills the child process when there's high ram or cpu usage – Sebastian Kaczmarek Jan 08 '21 at 20:07
  • @SebastianKaczmarek please take a look at **EDIT 4** – wmash Jan 08 '21 at 20:58

1 Answers1

0

Can you log the process status to identify if one of your process is crashing / being suspended ?

#!/bin/bash
while :
do
    ps aux | grep node | grep -v "grep" | sed "s#^#$(date +%Y/%m/%d/%H:%M) #" >> process_log
    sleep 1
done
Daphoque
  • 4,421
  • 1
  • 20
  • 31
  • To keep this running constantly? As I can't just run it once as child process' sporadically die – wmash Jan 08 '21 at 11:13
  • yeah then when you discover a "fail" in you logs check at the same time in this file to see if your process has really been killed/suspended – Daphoque Jan 08 '21 at 11:17