Precise method of segmenting & transcoding video+audio (via ffmpeg), into an on-demand HLS stream?

Question

Recently I've been messing around with FFmpeg and streams through Node.js. My ultimate goal is to serve a transcoded video stream - from any input file type - via HTTP, generated in real time, in segments, as it's needed.

I'm currently attempting to handle this using HLS. I pre-generate a dummy m3u8 manifest using the known duration of the input video. It contains a bunch of URLs that point to individual constant-duration segments. Then, once the client player starts requesting the individual URLs, I use the requested path to determine which time range of video the client needs. Then I transcode the video and stream that segment back to them.

Now for the problem: This approach mostly works, but has a small audio bug. Currently, with most test input files, my code produces a video that - while playable - seems to have a very small (< .25 second) audio skip at the start of each segment.

I think this may be an issue with splitting using time in ffmpeg, where possibly the audio stream cannot be accurately sliced at the exact frame the video is. So far, I've been unable to figure out a solution to this problem.
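
To illustrate that suspicion, here is a rough back-of-the-envelope sketch. It assumes AAC-LC audio, which packs 1024 samples into each encoded frame; the helper names are made up for illustration. With the 48000 Hz sample rate and 5-second segments used in the code below, a segment does not contain a whole number of audio frames, so a cut at an exact video timestamp cannot also fall on an audio frame boundary. This may or may not be the whole story, since each independent encoder run also typically adds its own priming samples at the start.

```javascript
// Assumption: AAC-LC packs 1024 PCM samples into each encoded audio frame.
const AAC_SAMPLES_PER_FRAME = 1024;

// How many audio frames fit into one segment of `segmentDur` seconds.
function audioFramesPerSegment(sampleRate, segmentDur) {
    return (sampleRate * segmentDur) / AAC_SAMPLES_PER_FRAME;
}

// Duration of audio that can be written if only whole frames are allowed.
function wholeFrameDuration(sampleRate, segmentDur) {
    const frames = Math.floor(audioFramesPerSegment(sampleRate, segmentDur));
    return (frames * AAC_SAMPLES_PER_FRAME) / sampleRate;
}

console.log(audioFramesPerSegment(48000, 5)); // 234.375, not a whole number of frames
console.log(wholeFrameDuration(48000, 5));    // 4.992 seconds of whole frames
```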

If anybody can point me in the right direction - or even to a pre-existing library/server that solves this use case - I'd appreciate the guidance. My knowledge of video encoding is fairly limited.

I'll include an example of my relevant current code below, so others can see where I'm stuck. You should be able to run this as a Node.js Express server, then point any HLS player at localhost:8080/master to load the manifest and begin playback. See the transcode.get('/segment/:seg.ts', ...) handler at the end for the relevant transcoding bit.

'use strict';
const express = require('express');
const ffmpeg = require('fluent-ffmpeg');
let PORT = 8080;
let HOST = 'localhost';
const transcode = express();


/*
 * This file demonstrates an Express-based server, which transcodes & streams a video file.
 * All transcoding is handled in memory, in chunks, as needed by the player.
 *
 * It works by generating a fake manifest file for an HLS stream, at the endpoint "/m3u8".
 * This manifest contains links to each "segment" video clip, which browser-side HLS players will load as-needed.
 *
 * The "/segment/:seg.ts" endpoint is the request destination for each clip,
 * and uses FFMpeg to generate each segment on-the-fly, based off which segment is requested.
 */


const pathToMovie = 'C:\\input-file.mp4';  // The input file to stream as HLS.
const segmentDur = 5; //  Controls the duration (in seconds) that the file will be chopped into.


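const getMetadata = (file) => {
    return new Promise((resolve, reject) => {
        ffmpeg.ffprobe(file, function(err, metadata) {
            // Surface ffprobe failures instead of resolving with undefined metadata.
            if (err) return reject(err);
            console.log(metadata);
            resolve(metadata);
        });
    });
};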



// Generate a "master" m3u8 file, which the player should point to:
transcode.get('/master', async(req, res) => {
    res.set({"Content-Disposition":"attachment; filename=\"m3u8.m3u8\""});
    res.send(`#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=150000
/m3u8?num=1
#EXT-X-STREAM-INF:BANDWIDTH=240000
/m3u8?num=2`)
});

// Generate an m3u8 file to emulate a premade video manifest. Guesses segments based off duration.
transcode.get('/m3u8', async(req, res) => {
    let met = await getMetadata(pathToMovie);
    let duration = met.format.duration;

    let out = '#EXTM3U\n' +
        '#EXT-X-VERSION:3\n' +
        `#EXT-X-TARGETDURATION:${segmentDur}\n` +
        '#EXT-X-MEDIA-SEQUENCE:0\n' +
        '#EXT-X-PLAYLIST-TYPE:VOD\n';

    let splits = Math.ceil(duration / segmentDur); // Round up so a trailing partial segment is still listed.
    for(let i=0; i< splits; i++){
        out += `#EXTINF:${segmentDur},\n/segment/${i}.ts\n`;
    }
    out+='#EXT-X-ENDLIST\n';

    res.set({"Content-Disposition":"attachment; filename=\"m3u8.m3u8\""});
    res.send(out);
});

// Transcode the input video file into segments, using the given segment number as time offset:
transcode.get('/segment/:seg.ts', async(req, res) => {
    const segment = parseInt(req.params.seg, 10); // Route params arrive as strings.
    const time = segment * segmentDur; // Start offset of this segment, in seconds.

    let proc = new ffmpeg({source: pathToMovie})
        .seekInput(time)
        .duration(segmentDur)
        .outputOptions('-preset faster')
        .outputOptions('-g 50')
        .outputOptions('-profile:v main')
        .withAudioCodec('aac')
        .outputOptions('-ar 48000')
        .withAudioBitrate('155k')
        .withVideoBitrate('1000k')
        .outputOptions('-c:v h264')
        .outputOptions(`-output_ts_offset ${time}`)
        .format('mpegts')
        .on('error', function(err, st, ste) {
            console.log('an error happened:', err, st, ste);
        }).on('progress', function(progress) {
            console.log(progress);
        })
        .pipe(res, {end: true});
});

transcode.listen(PORT, HOST);
console.log(`Running on http://${HOST}:${PORT}`);

To avoid the glitch at the seams, you have to transcode the entire audio in one encoding instance. So, pre-encode the audio. — Gyan, Nov 17 '19 at 12:10
Unfortunately, I was not. I put the project aside for a while and haven't circled back. It seems like there isn't a simple way to reliably split the audio at the precise point I would need. — Felix, May 01 '21 at 20:08
@Felix thanks for the reply. I am also struggling through the same problem, let's update here if anything works for us. — Dinesh Kumar, May 05 '21 at 05:00
I've found a solution to this: instead of creating each segment every time it's requested, you should start the HLS transcoding when the m3u8 file is requested. That way you are doing a regular HLS transcoding, just with your manually created m3u8 file. I will create an answer when I have finalized the code. — Gustav P Svensson, Nov 04 '21 at 22:05
Interesting. I'd be happy to know if you get any working implementation of this. — Felix, Nov 13 '21 at 00:23

score 4 · Answer 1 · answered Nov 27 '21 at 02:33

I had the same problem as you, and I managed to fix it, as I mentioned in the comments, by starting a complete HLS transcoding instead of manually producing each segment the client requests. I'm going to simplify what I've done and also share the link to my GitHub repo where I've implemented this. I generate the m3u8 manifest the same way you do:

        const segmentDur = 4; // Segment duration in seconds
        const splits = Math.ceil(duration / segmentDur); // duration = duration of the video in seconds; round up to a whole segment count
        let out = '#EXTM3U\n' +
            '#EXT-X-VERSION:3\n' +
            `#EXT-X-TARGETDURATION:${segmentDur}\n` +
            '#EXT-X-MEDIA-SEQUENCE:0\n' +
            '#EXT-X-PLAYLIST-TYPE:VOD\n';

        for (let i = 0; i < splits; i++) {
            out += `#EXTINF:${segmentDur}, nodesc\n/api/video/${id}/hls/${quality}/segments/${i}.ts?segments=${splits}&group=${group}&audioStream=${audioStream}&type=${type}\n`;
        }
        out += '#EXT-X-ENDLIST\n';
        res.send(out);
        resolve();
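
As a self-contained variant of the same idea, here is a minimal sketch of a manifest builder. The function name and the `urlFor` callback are hypothetical, not taken from my repo, and it assumes `segmentDur` is a whole number of seconds. Unlike the snippet above, it gives the final segment only the remaining duration instead of a full `segmentDur`.

```javascript
// Hypothetical helper: builds a VOD playlist for a file of `duration` seconds.
// `urlFor(i)` is an assumed callback that returns the URL of segment i.
function buildManifest(duration, segmentDur, urlFor) {
    const count = Math.ceil(duration / segmentDur);
    const lines = [
        '#EXTM3U',
        '#EXT-X-VERSION:3',
        `#EXT-X-TARGETDURATION:${segmentDur}`,
        '#EXT-X-MEDIA-SEQUENCE:0',
        '#EXT-X-PLAYLIST-TYPE:VOD',
    ];
    for (let i = 0; i < count; i++) {
        // The last segment only covers whatever is left of the file.
        const length = Math.min(segmentDur, duration - i * segmentDur);
        lines.push(`#EXTINF:${length.toFixed(3)},`, urlFor(i));
    }
    lines.push('#EXT-X-ENDLIST');
    return lines.join('\n') + '\n';
}

// A 10.5 second file with 4 second segments yields three entries: 4, 4 and 2.5 seconds.
console.log(buildManifest(10.5, 4, i => `/segments/${i}.ts`));
```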

This works fine when you transcode the video (i.e. use, for example, libx264 as the video encoder in the ffmpeg command later on). If you copy the video codec instead, the segments won't match the segment duration in my testing, presumably because with stream copy ffmpeg can only cut at keyframes that already exist in the source. Now you have a choice: either you start the ffmpeg transcoding at this point, when the m3u8 manifest is requested, or you wait until the first segment is requested. I went with the second option, since I want to support starting the transcoding based on which segment is requested.

Now comes the tricky part. When the client requests a segment (api/video/${id}/hls/<quality>/segments/<segment_number>.ts in my case), you first have to check whether a transcoding is already active. If one is active, you have to check whether the requested segment has been processed or not. If it has been processed, we can simply send the requested segment back to the client. If it hasn't been processed yet (for example, because of a user seek action), we can either wait for it (if the latest processed segment is close to the requested one) or stop the previous transcoding and restart at the newly requested segment.
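
The decision described above could be sketched roughly like this. It is a simplified illustration, not the code from my repo: the function name, the shape of the `transcoding` object and the `MAX_SEGMENTS_TO_WAIT` threshold are all assumptions made for the example.

```javascript
// Hypothetical threshold: how far ahead of the encoder a request may be before we restart.
const MAX_SEGMENTS_TO_WAIT = 5;

// `transcoding` is an assumed plain object { startSegment, latestSegment, finished },
// or null when no transcoding is running for this client.
function decideAction(requested, transcoding) {
    if (!transcoding) return 'start';
    const { startSegment, latestSegment, finished } = transcoding;
    // Segments before the start of the current run were never produced by it.
    if (requested < startSegment) return 'restart';
    // Already written to disk: just send the file.
    if (finished || requested <= latestSegment) return 'serve';
    // Close enough to the encoder's position: wait for it to catch up.
    if (requested - latestSegment <= MAX_SEGMENTS_TO_WAIT) return 'wait';
    // Far ahead (e.g. a long seek): stop and restart at the requested segment.
    return 'restart';
}

console.log(decideAction(12, { startSegment: 0, latestSegment: 10, finished: false }));  // wait
console.log(decideAction(500, { startSegment: 0, latestSegment: 10, finished: false })); // restart
```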

I'm going to try to keep this answer as simple as I can. The ffmpeg command I use to achieve the HLS transcoding looks like this:

        this.ffmpegProc = ffmpeg(this.filePath)
            .withVideoCodec(this.getVideoCodec())
            .withAudioCodec(audioCodec)
            .inputOptions(inputOptions)
            .outputOptions(outputOptions)
            .on('end', () => {
                this.finished = true;
            })
            .on('progress', progress => {
                // Position in the source file: time encoded so far plus the applied seek time.
                const seconds = this.addSeekTimeToSeconds(this.timestampToSeconds(progress.timemark));
                // - 1 because the first segment is 0
                this.latestSegment = Math.floor(seconds / Transcoding.SEGMENT_DURATION) - 1;
            })
            .on('start', (commandLine) => {
                logger.DEBUG(`[HLS] Spawned Ffmpeg (startSegment: ${this.startSegment}) with command: ${commandLine}`);
                resolve();
            })
            .on('error', (err, stdout, stderr) => {
                // Skip logging for errors that are expected, such as ffmpeg being killed on purpose.
                if (err.message !== 'Output stream closed' && err.message !== 'ffmpeg was killed with signal SIGKILL') {
                    logger.ERROR(`Cannot process video: ${err.message}`);
                    logger.ERROR(`ffmpeg stderr: ${stderr}`);
                }
            })
            .output(this.output);
        this.ffmpegProc.run();

Where output options are:

    return [
        '-copyts', // Fixes timestamp issues (Keep timestamps as original file)
        '-pix_fmt yuv420p',
        '-map 0',
        '-map -v',
        '-map 0:V',
        '-g 52',
        `-crf ${this.CRF_SETTING}`,
        '-sn',
        '-deadline realtime',
        '-preset:v ultrafast',
        '-f hls',
        `-hls_time ${Transcoding.SEGMENT_DURATION}`,
        '-force_key_frames expr:gte(t,n_forced*2)',
        '-hls_playlist_type vod',
        `-start_number ${this.startSegment}`,
        '-strict -2',
        '-level 4.1', // Fixes chromecast issues
        '-ac 2', // Set two audio channels. Fixes audio issues for chromecast
        '-b:v 1024k',
        '-b:a 192k',
    ];

And input options:

        let inputOptions = [
            '-copyts', // Fixes timestamp issues (Keep timestamps as original file)
            '-threads 8',
            `-ss ${this.startSegment * Transcoding.SEGMENT_DURATION}`
        ];

One parameter worth noting is -start_number in the output options. It tells ffmpeg which number to use for the first segment: if the client requests, for example, segment 500, we want to keep it simple and start the numbering at 500 if we have to restart the transcoding. Then we have the standard HLS settings (hls_time, hls_playlist_type and f). In the input options I use -ss to seek to the requested segment. Since we told the client in the generated m3u8 manifest that each segment is 4 seconds long, we can just seek to 4 * requestedSegment.
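
To make the relationship between the requested segment, -ss and -start_number concrete, here is a small sketch. The function name is hypothetical and it only covers the options that depend on the start segment; the remaining options from the lists above would be added alongside these.

```javascript
// Hypothetical helper: derives the seek-dependent ffmpeg options for a (re)start
// at `startSegment`, given the segment length promised in the manifest.
function buildSeekOptions(startSegment, segmentDuration) {
    return {
        // Seek the input to the start time of the requested segment.
        inputOptions: [`-ss ${startSegment * segmentDuration}`],
        // Number the first produced segment file after the requested segment.
        outputOptions: [
            `-start_number ${startSegment}`,
            `-hls_time ${segmentDuration}`,
        ],
    };
}

// Restarting at segment 500 with 4 second segments seeks to 2000 seconds.
console.log(buildSeekOptions(500, 4));
```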

You can see that in the 'progress' event from ffmpeg I calculate the latest processed segment by looking at the timemark. By converting the timemark to seconds and then adding the seek time applied to the transcoding, we can calculate approximately which segment was just finished by dividing the number of seconds by the segment duration, which I've set to 4.
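
The helpers used in that event handler are not shown above, so here is a minimal sketch of what such a calculation could look like. It is an illustration rather than the code from my repo, and it assumes the timemark is a non-negative "HH:MM:SS.xx" string that counts from the start of the current transcoding run.

```javascript
// Assumption: timemark looks like "00:01:23.45" and is never negative.
function timestampToSeconds(timemark) {
    const [hours, minutes, seconds] = timemark.split(':').map(Number);
    return hours * 3600 + minutes * 60 + seconds;
}

// Returns the index of the last fully written segment, or startSegment - 1
// when the first segment of this run is still in progress.
function latestFinishedSegment(timemark, startSegment, segmentDuration) {
    const seekSeconds = startSegment * segmentDuration;
    const position = seekSeconds + timestampToSeconds(timemark);
    // The segment containing `position` is still being written, hence the - 1.
    return Math.floor(position / segmentDuration) - 1;
}

// Started at segment 500 (2000 s) and encoded 9.5 s so far: 500 and 501 are done.
console.log(latestFinishedSegment('00:00:09.50', 500, 4)); // 501
```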

There is a lot more to keep track of than just this. You have to save the ffmpeg processes that you've started, so that when a segment is requested you can check whether a transcoding is active and whether that segment is finished. You also have to stop already running transcodings if the user requests a segment far in the future, so you can restart with the correct seek time.
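
A bare-bones way to keep track of the running processes might look like the sketch below. The class and its session key are hypothetical; the only thing it relies on is that each stored entry has a `proc` with a `kill(signal)` method, which fluent-ffmpeg commands provide.

```javascript
// Hypothetical registry of running transcodings, keyed by a session id chosen by the caller.
class TranscodingRegistry {
    constructor() {
        this.active = new Map();
    }

    // Returns the running transcoding for this session, or null if there is none.
    get(sessionId) {
        return this.active.get(sessionId) || null;
    }

    // Registers a new run; any previous run for the same session is stopped first.
    start(sessionId, transcoding) {
        this.stop(sessionId);
        this.active.set(sessionId, transcoding);
    }

    // Kills the ffmpeg process of the session's run. Returns false if nothing was running.
    stop(sessionId) {
        const previous = this.active.get(sessionId);
        if (!previous) return false;
        previous.proc.kill('SIGKILL');
        this.active.delete(sessionId);
        return true;
    }
}
```

On a long seek, calling `start` again with a transcoding that begins at the newly requested segment replaces the old run in one step.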

The downside to this approach is that the file is actually being transcoded and saved to your file system while the transcoding is running, so you need to remove the files when the user stops requesting segments.
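
For the cleanup itself, something along these lines could work. It is a sketch under the assumption that each transcoding writes its playlist and .ts files into its own output directory; the function name is made up, and deciding when a user has "stopped requesting segments" (for example with an idle timer) is left out.

```javascript
const fs = require('fs');
const path = require('path');

// Deletes generated playlist and segment files from `outputDir`, leaving anything else alone.
// Returns the number of files removed.
function removeSegmentFiles(outputDir) {
    let removed = 0;
    for (const name of fs.readdirSync(outputDir)) {
        if (name.endsWith('.ts') || name.endsWith('.m3u8')) {
            fs.unlinkSync(path.join(outputDir, name));
            removed++;
        }
    }
    return removed;
}
```

Stopping the ffmpeg process before calling this avoids deleting files that are still being written.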

I've implemented this so that it handles the things I've mentioned (long seeks, different resolution requests, waiting until a segment is finished, etc.). If you want to have a look at it, it's located here: Github Dose. The most interesting files are the transcoding class, the hlsManger class and the endpoint for the segments. I tried to explain this as well as I can, so I hope you can use it as some sort of base or idea for how to move forward.
