I have searched online for samples, but I can't find one that provides enough information.
I have tried Microsoft Expression Encoder, but the delay is too large when I use the broadcast method.
With DirectShow.NET, the DxWebCam sample looks promising, but it does not cover audio.
My idea is to send audio and video frames separately over TCP (or maybe UDP, as suggested by @macbral), but I am not sure how to handle synchronisation.
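To make the synchronisation question concrete, here is a rough sketch of one common approach: stamp every audio and video packet with a capture timestamp, play audio as it arrives, and let the audio playback position act as the clock that decides which video frame to show. It is written in Python only for brevity, not against DirectShow.NET, and everything in it is an assumption for illustration: the 13-byte header layout, the names `pack_packet`, `unpack_packet` and `next_frame_to_show`, and the 80 ms lag tolerance.

```python
import struct
from collections import deque

# Hypothetical wire header (not from any library):
# stream type (1 byte), capture timestamp in ms (8 bytes), payload length (4 bytes).
HEADER = struct.Struct("!BQI")
AUDIO, VIDEO = 0, 1


def pack_packet(stream_type, pts_ms, payload):
    """Prefix a payload with its stream type, timestamp and length."""
    return HEADER.pack(stream_type, pts_ms, len(payload)) + payload


def unpack_packet(data):
    """Split one received packet back into (stream_type, pts_ms, payload)."""
    stream_type, pts_ms, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    return stream_type, pts_ms, payload


def next_frame_to_show(video_queue, audio_clock_ms, max_lag_ms=80):
    """Pick the video frame that matches the current audio position.

    video_queue is a deque of (pts_ms, frame) in timestamp order.
    Every frame whose timestamp the audio clock has reached is removed;
    the newest of those is returned unless it is more than max_lag_ms
    behind, in which case it is dropped as stale. Returns None when no
    frame is due yet.
    """
    frame = None
    while video_queue and video_queue[0][0] <= audio_clock_ms:
        pts_ms, candidate = video_queue.popleft()
        if audio_clock_ms - pts_ms <= max_lag_ms:
            frame = candidate
    return frame
```

On the receiving side, the audio clock would be the timestamp of the audio sample currently being played, and `next_frame_to_show` would be called on each render tick. Over TCP the length field is what lets the receiver split the byte stream back into packets; over UDP each datagram could carry one packet, and late or lost frames would simply be skipped.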
I'm looking for free samples, as the current design is a 1-to-1 video conference over an intranet.
Thanks for any help in advance.