
I want to use Azure for my video cam live stream, and I want to use that stream for a real-time object detection process.

Problem:
I used Azure Media Services, but it has an approximately 4-second delay, which is not acceptable for a real-time operation.

So, are there any other options available for a live cam stream using Azure with ultra-low latency, around 200 ms?

Or is there any modification to Media Services that would give it ultra-low latency?

I used Azure Media Services live streaming.
I am expecting an ultra-low latency of around 200 ms.

sourab maity
  • Azure IoT Edge is a framework for centrally managing solutions where processing modules are deployed near telemetry sources. https://learn.microsoft.com/en-us/azure/iot-edge/about-iot-edge?view=iotedge-1.4 Object recognition in video streams is a common use case for this. – David Browne - Microsoft May 21 '23 at 12:55
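
A minimal sketch of what such an edge-side module could look like, assuming the azure-iot-device Python SDK, OpenCV for local capture and a hypothetical detect() placeholder standing in for the actual model; the output name is illustrative, not taken from the question:

    # Sketch of an IoT Edge-style module: inference runs next to the camera,
    # only small detection results are sent upstream (raw frames never cross the WAN).
    import json
    import cv2
    from azure.iot.device import IoTHubModuleClient, Message

    def detect(frame):
        # hypothetical placeholder for a locally deployed object-detection model
        return [{"label": "person", "confidence": 0.97}]

    def main():
        client = IoTHubModuleClient.create_from_edge_environment()
        client.connect()
        cam = cv2.VideoCapture(0)                 # local camera feed, no WAN hop for raw frames
        try:
            while True:
                ok, frame = cam.read()
                if not ok:
                    continue
                results = detect(frame)           # inference at the edge
                client.send_message_to_output(Message(json.dumps(results)), "detections")
        finally:
            cam.release()
            client.shutdown()

    if __name__ == "__main__":
        main()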

1 Answer


Q : "any other options available for a live cam stream ... with ultra-low latency, around 200 ms?"

Yes ... sure, as long as the object detector's own latency does not already exceed that 200 ms end-to-end latency target.

Q : "any other options available for live cam stream using azure with ultra low latency like 200 ms."

Well, it depends ... it depends a lot ... details ( as always ) matter ... a lot

           CAM       ISP.a     TELCO.b  CARRIER.c CARRIER.xyz      +-------------+
          +-----+    :         :        :         :                |             |
Hic     |\|     |____o_________o________o__ ... __o________________|+ISO-OSI-L1  |signal    DSP
Sunt    |/|     |                                 :    ^    ^node  | +ISO-OSI-L2 |framing   decode/reject
Leones    +-----+                                 :     rack       |  +ISO-OSI-L3|tcp:stack decode/confirm/resend
                :                                 :                +---+?--------+ - - - if Virtual, add HyperVisor driven shared(!) vCPU throttling & work-stealing from phys CPU ( Cloud-marketing tries not tell you this )
( in            :                       ^         DataCentre       ^    +O/S                kernel/schedule jobs
  real          :              ^        ^         TELEHOUSE        ^     +app stream.recv() FIRST BYTE of scene raw-data arrived
  time )        :    ^         ^        ^         ^                ^      +???...???:       LAST  BYTE of scene raw-data arrived  ( depend on size + eff.BW )
             To+0....:.........:........:.  ... ..:................:                ++M/L   scene raw-data decoded + CV-scene setup depend on in?(v)RAM? (v)CPU? resources )
                :  +10 ms      :        :         :                :                 ++M/L  CV-model detect                       ( depend on in?(v)RAM? (v)CPU? resources )
                :            +10 ms     :         :                :                  ++M/L CV-object identified?                 ( depend on in?(v)RAM? (v)CPU? resources )
                :                     +10 ms      :                :                    ?
                :                              +150 ms             :                    ?
                :                                               +0.5 ms                 ?
                :                                                               In To+200 ms ?
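
To make the diagram's point explicit, summing just the illustrative per-hop figures above ( examples, not measurements ) already consumes most of the 200 ms budget before a single byte has been decoded:

    # Illustrative per-hop latencies taken from the diagram above (examples, not measurements)
    hops_ms = {
        "CAM -> ISP.a": 10,
        "ISP.a -> TELCO.b": 10,
        "TELCO.b -> CARRIER.c": 10,
        "CARRIER.c -> CARRIER.xyz / DataCentre": 150,
        "in-DC cross-connect to the node": 0.5,
    }
    transport_ms = sum(hops_ms.values())
    print(f"transport alone  : {transport_ms} ms of the 200 ms budget")    # 180.5 ms
    print(f"left for the rest: {200 - transport_ms} ms")                   # 19.5 ms for framing,
    # TCP/IP stack, decode, CV-scene setup, model inference and object identification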

While
still ignoring the whole lump sum of the CAM-side latency ( raw CCD-data I/O, white-balance and other colour-space and error corrections, chopping and re-framing the data into a packet-data transport service, be it { tcp | udp | whatever }-network-protocol driven ), most of the end-to-end latency gets collected along the ingress-ISP / egress-DataCentre TELEHOUSE transport path: imagine long, indeed long, submarine and trans-continental optical cables, lambda- and retiming-amplifiers ( yes, they also add latency ), core switches operating the pass-through network interconnections, their non-zero packet waiting-queues in the line-cards' I/O-buffers, then some waiting for the traffic routing-engines' decisions, congestion policies, load balancing, re-routing, etc. So before the very first byte of the CAM-captured scene data arrives into the hands of the final DataCentre TELEHOUSE, your accumulated latency might already be near, if not above, the E2E latency target you wished for.
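
For a sense of scale of the cable part alone: light in optical fibre travels at roughly 2/3 of c, so distance by itself puts a hard floor under the transport latency, before any switch, buffer or amplifier is counted. The distances below are assumptions for illustration:

    # Hard physical floor of one-way propagation delay in optical fibre
    # (assumed route lengths; real routes are longer and add switching/queueing on top)
    FIBRE_KM_PER_MS = 300_000 / 1.47 / 1_000   # ~204 km per millisecond at refractive index ~1.47

    for route, km in {
        "same metro area": 50,
        "cross-continent": 4_000,
        "trans-atlantic submarine cable": 6_500,
    }.items():
        one_way_ms = km / FIBRE_KM_PER_MS
        print(f"{route:32s} ~{one_way_ms:5.1f} ms one-way, ~{2 * one_way_ms:5.1f} ms round-trip")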

Next
comes the in-DC "hydraulics". Bytes move relatively fast through the TELEHOUSE internal cross-connects into "your" rented "service" ( hosted in some of the racks, using one of the rental-pool's nodes ).

Arriving here is fast,
yet the worst is only now to come. If "your" service runs on shared, virtualised vCPU/vRAM computing, sold to the general public as a "Cloud", you never know when, where and how much of the physical resources you will actually get. The physical cache lines are almost always dirty, evicted or depleted by other tenants, so none of your expensively retrieved, LRU-cached data ever gets re-used: the M/L models will never be fast, because they have to re-fetch the whole decision-making engine's data from (v)RAM again and again ( and that is expensive, very expensive, repeatedly very expensive ). These are the real add-on costs you pay as you go for the "promised benefits" of inexpensive, "democratised", "efficiently shared", "green-energy" virtualised resources, rented from cloud operators and shared among you and the other cloud tenants with weak or no guarantees of when, where and for how long they will be available for your workloads.
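
A back-of-the-envelope illustration of what "always re-fetch from (v)RAM" costs per frame; the model size and effective memory bandwidth below are assumptions, and on a contended shared node the effective bandwidth is typically lower and far less predictable:

    # Rough cost of re-streaming model weights from RAM on every frame, once shared
    # vCPUs have evicted your cache lines (assumed figures, for illustration only)
    model_bytes     = 250e6     # e.g. a ~250 MB CV model resident in (v)RAM
    dram_bw_bytes_s = 20e9      # assumed effective memory bandwidth on a contended node, ~20 GB/s

    refetch_ms_per_frame = model_bytes / dram_bw_bytes_s * 1_000
    print(f"re-fetching weights : ~{refetch_ms_per_frame:.1f} ms per frame")   # ~12.5 ms
    print(f"at 25 fps           : ~{refetch_ms_per_frame * 25:.0f} ms of memory traffic per second")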

Possibilities?

Cables
could be shorter, not by a knife, but by moving the whole M/L model, the CV processing and the decision-making engines much closer to the source-feed CAM.

Once
you have co-located the CV processing as close to the CAM as feasible, never rent "shared" virtualised hardware emulation for latency-sensitive RTMP stream processing. Never. Even if you get a fantastic offer or a free-meal voucher. It will never fly as fast as a wisely crafted, manageable setup, properly configured for your RTMP performance/latency targets: a set of dedicated physical multi-core CPUs with properly sized, fast enough, never-shared physical RAM, for in-RAM, lightning-fast, real-time object detection on the streamed data.
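
A minimal sketch of such a co-located, dedicated-hardware loop, assuming a Linux box sitting next to the camera, OpenCV for capture and a hypothetical load_model() placeholder standing in for the real detector; the pinned core numbers are illustrative:

    # Co-located sketch: capture and detect on the same box as the CAM, pinned to
    # dedicated physical cores, with the model loaded once and kept hot in never-shared RAM.
    import os
    import time
    import cv2

    def load_model(path):
        # hypothetical placeholder: load your detector once, weights stay resident in RAM
        def run(frame):
            return []                          # replace with real inference
        return run

    os.sched_setaffinity(0, {2, 3})            # Linux-only: pin this process to dedicated cores
    model = load_model("/opt/models/detector.bin")

    cam = cv2.VideoCapture(0)                  # local camera: no WAN, no TELEHOUSE, no hypervisor
    cam.set(cv2.CAP_PROP_BUFFERSIZE, 1)        # keep the driver-side queue short, avoid stale frames

    while True:
        t0 = time.perf_counter()
        ok, frame = cam.read()
        if not ok:
            continue
        detections = model(frame)              # in-RAM inference, no per-frame weight re-fetch
        print(f"capture+detect: {(time.perf_counter() - t0) * 1_000:.1f} ms, objects: {len(detections)}")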

Any compromise in doing this just explodes your End-to-End latencies.

If you re-read the actual latency costs of data operations, you will know exactly where, how much and why.
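
For reference, the approximate orders of magnitude behind that statement ( the well-known "latency numbers every programmer should know" ):

    # Approximate latency costs of common data operations (nanoseconds, order of magnitude)
    latency_ns = {
        "L1 cache reference":                          0.5,
        "L2 cache reference":                            7,
        "main-memory (RAM) reference":                  100,
        "send 1 KB over a 1 Gbps network":           10_000,
        "read 1 MB sequentially from RAM":          250_000,
        "round trip within the same datacentre":    500_000,
        "intercontinental packet round trip":   150_000_000,
    }
    for op, ns in latency_ns.items():
        print(f"{op:40s} ~{ns:>13,} ns")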

user3666197