I'm using Windows 11 and Chrome for the web client. I have a Go program that runs two C++ programs as subprocesses. The first uses the NVIDIA Video Codec SDK to set up an HEVC encoder:

    NV_ENC_INITIALIZE_PARAMS IP = {};
    NV_ENC_CONFIG C = {};
    IP.encodeConfig = &C;
    IP.encodeConfig->version = NV_ENC_CONFIG_VER;
    IP.version = NV_ENC_INITIALIZE_PARAMS_VER;

    IP.encodeGUID = NV_ENC_CODEC_HEVC_GUID;
    IP.presetGUID = NV_ENC_PRESET_P7_GUID;
    IP.tuningInfo = NV_ENC_TUNING_INFO_LOW_LATENCY;
    IP.encodeWidth = 1920;
    IP.encodeHeight = 1080;
    IP.frameRateNum = 60;
    IP.frameRateDen = 1;
    IP.enablePTD = 1;
    IP.enableEncodeAsync = false;

    NV_ENC_PRESET_CONFIG presetConfig = { NV_ENC_PRESET_CONFIG_VER, { NV_ENC_CONFIG_VER } };
    NvEncFunctions.nvEncGetEncodePresetConfigEx(pEncoder, NV_ENC_CODEC_HEVC_GUID, NV_ENC_PRESET_P7_GUID, NV_ENC_TUNING_INFO_LOW_LATENCY, &presetConfig);
    memcpy(IP.encodeConfig, &presetConfig.presetCfg, sizeof(NV_ENC_CONFIG));

    IP.encodeConfig->frameIntervalP = 1;
    IP.encodeConfig->gopLength = NVENC_INFINITE_GOPLENGTH;
    IP.encodeConfig->rcParams.rateControlMode = NV_ENC_PARAMS_RC_CBR;
    IP.encodeConfig->rcParams.averageBitRate = 800000;
    IP.encodeConfig->rcParams.enableAQ = 1;
    IP.encodeConfig->rcParams.zeroReorderDelay = 1;

    res = NvEncFunctions.nvEncInitializeEncoder(pEncoder, &IP);
    if (res != NV_ENC_SUCCESS) {
        cerr << "nvEncInitializeEncoder " << res;
        return 1;
    }

The second process uses opus.lib to set up an audio encoder:

    // Define the Opus encoder parameters
    OpusEncoder* pEncoder = opus_encoder_create(48000, 2, OPUS_APPLICATION_AUDIO, (int*)&res);
    if (res != OPUS_OK) {
        cerr << "opus_encoder_create " << res;
        return 1;
    }

    // Define the Opus encoder bitrate
    opus_encoder_ctl(pEncoder, OPUS_SET_BITRATE(40000));
    opus_encoder_ctl(pEncoder, OPUS_SET_COMPLEXITY(10));
    opus_encoder_ctl(pEncoder, OPUS_SET_VBR_CONSTRAINT(0));
    opus_encoder_ctl(pEncoder, OPUS_SET_SIGNAL(OPUS_SIGNAL_MUSIC));
    opus_encoder_ctl(pEncoder, OPUS_SET_APPLICATION(OPUS_APPLICATION_AUDIO));
    opus_encoder_ctl(pEncoder, OPUS_SET_BANDWIDTH(OPUS_BANDWIDTH_FULLBAND));

    //opus_encoder_ctl(pEncoder, OPUS_SET_INBAND_FEC(1));
    //opus_encoder_ctl(pEncoder, OPUS_SET_PACKET_LOSS_PERC(100));

The bitstreams output by these encoders are sent via UDP to the loopback interface (127.0.0.1) and received by the Go host process, which promptly forwards them to a remote web client via WebTransport (the webtransport-go package). Only the audio path is shown below:

    var audioStream webtransport.SendStream
    //audioStreamOnline := false
    go func() {
        audioPipeAddr, err := net.ResolveUDPAddr("udp", "127.0.0.1:10050")
        if err != nil {
            panic(err)
        }
        audioPipe, err := net.ListenUDP("udp", audioPipeAddr)
        if err != nil {
            panic(err)
        }
        err = audioPipe.SetReadBuffer(100)
        if err != nil {
            panic(err)
        }

        for {
            buffer := make([]byte, 100)
            len, _, err := audioPipe.ReadFromUDP(buffer)
            if err != nil {
                panic(err)
            }

            audioStream.Write(buffer[0:len])

            //fmt.Printf("Received %d bytes from audio pipe: %s\n", len, string(buffer[:len]))
        }
    }()
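
A side note on the forwarding loop above, which is not part of the original post: a WebTransport stream is a reliable, ordered byte stream, so one `Write` per UDP packet on the sender does not guarantee one `reader.read()` per packet in the browser. Reads may return part of a packet or several packets joined together. Whether that explains the corruption described here is not established. A common way to keep packet boundaries over a byte stream is a length prefix. The sketch below is a minimal illustration in Go; the names `AppendFrame` and `SplitFrames` and the 4-byte big-endian prefix are assumptions made for the example, and the receiving half would have to be ported to JavaScript for the browser.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// AppendFrame appends a 4-byte big-endian length prefix followed by the
// packet bytes to dst and returns the extended slice.
// (Hypothetical helper; the prefix layout is an assumption.)
func AppendFrame(dst, packet []byte) []byte {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(packet)))
	dst = append(dst, hdr[:]...)
	return append(dst, packet...)
}

// SplitFrames extracts every complete frame from buf. It returns the
// packets found and the unconsumed tail, which the caller keeps and
// extends with the next chunk read from the stream.
func SplitFrames(buf []byte) (packets [][]byte, rest []byte) {
	for len(buf) >= 4 {
		n := binary.BigEndian.Uint32(buf[:4])
		if uint64(len(buf)-4) < uint64(n) {
			break // the frame body has not fully arrived yet
		}
		packets = append(packets, buf[4:4+int(n)])
		buf = buf[4+int(n):]
	}
	return packets, buf
}

func main() {
	// Sender side: two encoder packets written back to back.
	var wire []byte
	wire = AppendFrame(wire, []byte("opus-packet-1"))
	wire = AppendFrame(wire, []byte("opus-packet-2"))

	// Receiver side: the stream hands over arbitrary 5-byte chunks.
	var pending []byte
	for len(wire) > 0 {
		n := 5
		if n > len(wire) {
			n = len(wire)
		}
		pending = append(pending, wire[:n]...)
		wire = wire[n:]

		var packets [][]byte
		packets, pending = SplitFrames(pending)
		for _, p := range packets {
			fmt.Printf("complete packet: %q\n", p)
		}
	}
}
```

With framing like this, each decoder chunk would be built from one complete packet rather than from whatever a single read happened to return.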

On the client side, the received bitstream is fed directly into the WebCodecs Opus and HEVC decoders:

    // Video stream reader
    let ts = 0

    while (true) {
        const {done, value} = await reader.read()
        if (done) return

        lastVideoPacket = performance.now()

        videoDecoder.decode(new EncodedVideoChunk({
            type: "delta",
            data: value.slice(5),
            timestamp: ts,
            duration: 16000
        }))

        ts += 16000
    }

    // Audio stream reader (separate scope from the video reader)
    let ts = 0

    while (true) {
        const {done, value} = await reader.read()
        if (done) return

        lastAudioPacket = performance.now()

        audioDecoder.decode(new EncodedAudioChunk({
            type: "key",
            data: value,
            timestamp: ts,
            duration: 10000
        }))

        ts += 10000
    }

The decoders are configured as follows:

    videoDecoder.configure({
        codec: "hev1.2.4.L120.B0",
        codedWidth: 1920,
        codedHeight: 1080,
        hardwareAcceleration: "prefer-hardware",
        optimizeForLatency: true
    })

    audioDecoder.configure({
        codec: "opus",
        sampleRate: 48000,
        numberOfChannels: 2
    })

However, the decoded audio and video show clear corruption, as shown in this video (https://youtu.be/wAY5w4zlku4). It may seem that the audio corruption is caused by me moving the windows, but I can confirm that it happens all the time. The video is actually on the less glitchy side of what I have experienced, and if I left it running, the decoders would eventually hit a fatal error and close.

This was one of the first problems I ran into when I started this project, and I have tried hard to fix it. The bitstreams were initially written to and read from stdout instead of being sent through the loopback interface, and switching made no difference. Now I am out of ideas, and I would like those experienced with encoded AV bitstreams to look at the code above and see whether something is wrong, or whether the corruption in the video looks familiar.

Thanks in advance!

Edit: I had the subprocesses write the bitstreams to files once, and the HEVC file played perfectly in ffplay with no issues. I haven't tested playing the audio file, but I'm fairly sure there is nothing wrong with the subprocesses themselves.

Tiger Yang
  • Please take a look at a [UDP overview](https://en.wikipedia.org/wiki/User_Datagram_Protocol), particularly "It has no handshaking dialogues, and thus exposes the user's program to any unreliability of the underlying network; there is no guarantee of delivery, ordering, or duplicate protection". While UDP *is* used like you're using it there is effectively a layer on top of it to manage these challenges. – stdunbar May 04 '23 at 23:08
  • I understand the downsides of UDP, but in this case the underlying network is the loopback interface, so there shouldn't be any worries since the packets don't leave the computer. Additionally, the bitstream was originally sent through the stdout pipe and the result was the same, so if the bitstream is being corrupted, it must be by something more general that's unrelated to both the stdout pipe and the loopback interface. – Tiger Yang May 05 '23 at 21:07

1 Answer

SOLVED! It turns out there's some weird bug in the webtransport-go package that corrupts the data in some way when I send it as a stream. (I'm pretty sure it's something more subtle than the system swapping line endings, as in my experience that makes the bitstream completely unplayable.) I sent the bitstreams as datagrams instead, using SendMessage(msg []byte), and implemented my own packet fragmentation system for the larger video packets, and it works perfectly!
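
For readers wondering what such a fragmentation layer might look like, here is a minimal sketch in Go. It is not the code actually used in this project: the 8-byte header (packet ID, fragment index, fragment count), the names `Fragment` and `Reassembler`, and the 1200-byte datagram limit in the demo are all assumptions made for illustration. The reassembly half would run in the browser, so it would need to be ported to JavaScript.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerSize is the size of the hypothetical fragment header:
// 4 bytes packet ID, 2 bytes fragment index, 2 bytes fragment count.
const headerSize = 8

// Fragment splits payload into datagrams of at most maxDatagram bytes,
// each carrying the header described above.
func Fragment(id uint32, payload []byte, maxDatagram int) ([][]byte, error) {
	chunk := maxDatagram - headerSize
	if chunk <= 0 {
		return nil, errors.New("maxDatagram too small for header")
	}
	count := (len(payload) + chunk - 1) / chunk
	if count == 0 {
		count = 1 // an empty payload still produces one datagram
	}
	if count > 0xFFFF {
		return nil, errors.New("payload needs too many fragments")
	}
	out := make([][]byte, 0, count)
	for i := 0; i < count; i++ {
		start := i * chunk
		end := start + chunk
		if end > len(payload) {
			end = len(payload)
		}
		d := make([]byte, headerSize+end-start)
		binary.BigEndian.PutUint32(d[0:4], id)
		binary.BigEndian.PutUint16(d[4:6], uint16(i))
		binary.BigEndian.PutUint16(d[6:8], uint16(count))
		copy(d[headerSize:], payload[start:end])
		out = append(out, d)
	}
	return out, nil
}

// Reassembler rebuilds one packet at a time. If a fragment of a different
// packet arrives first, the incomplete packet is dropped, which suits
// lossy datagram delivery.
type Reassembler struct {
	id    uint32
	parts [][]byte
	got   int
}

// Push feeds one datagram and returns the full packet once every
// fragment of it has been seen, in any order.
func (r *Reassembler) Push(d []byte) ([]byte, bool) {
	if len(d) < headerSize {
		return nil, false
	}
	id := binary.BigEndian.Uint32(d[0:4])
	idx := int(binary.BigEndian.Uint16(d[4:6]))
	count := int(binary.BigEndian.Uint16(d[6:8]))
	if count == 0 || idx >= count {
		return nil, false
	}
	if r.parts == nil || id != r.id || count != len(r.parts) {
		r.id, r.parts, r.got = id, make([][]byte, count), 0
	}
	if r.parts[idx] == nil { // ignore duplicate fragments
		part := make([]byte, len(d)-headerSize)
		copy(part, d[headerSize:])
		r.parts[idx] = part
		r.got++
	}
	if r.got < count {
		return nil, false
	}
	var full []byte
	for _, p := range r.parts {
		full = append(full, p...)
	}
	r.parts = nil
	return full, true
}

func main() {
	payload := make([]byte, 3000) // stands in for one encoded video frame
	frags, err := Fragment(7, payload, 1200)
	if err != nil {
		panic(err)
	}
	var r Reassembler
	for _, f := range frags {
		if full, ok := r.Push(f); ok {
			fmt.Println("fragments:", len(frags), "reassembled bytes:", len(full))
		}
	}
}
```

Because datagrams can be lost, a frame with a missing fragment is simply never completed in this sketch; with an infinite GOP the decoder would then need some way to recover, which is outside what the example covers.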

Tiger Yang
  • Sounds like you might want to consider reporting an issue at https://github.com/quic-go/webtransport-go/issues, so the webtransport-go maintainers can investigate, and to help other developers avoid running into the same problem in the future. – sideshowbarker May 06 '23 at 23:21
  • Got it! https://github.com/quic-go/webtransport-go/issues/76 – Tiger Yang May 09 '23 at 07:57