Identifying Frame boundaries in RTP stream

Question

I have a few doubts regarding frame boundaries in RTP packets.
First, if the marker bit is set, does that mean a new frame has begun (this is what I understand from RFC 3551)?
Second, according to what I have read, a frame starts with an I-frame followed by P and B frames. Which field indicates this? And does the I-frame have the marker bit set?
Third, if I need to find the start and end of a frame, would checking the marker bit suffice?

Thanks!

@Cipi may have more helpful info on this http://stackoverflow.com/a/1968958/127938 — Paul Gregoire, Dec 21 '14 at 14:18

score 1 · Accepted Answer · edited May 23 '17 at 11:49

The RTP entry on the Wireshark Wiki provides a lot of information, including (edit) sample captures. You could explore it, and it might answer some of your questions. If you're going to write code that works with RTP, Wireshark is useful for monitoring and debugging.

Edit: For your first question about the marker bit, this FAQ might help. Also, finding the frame types (I, P, B) depends on the payload format. There's another question here with an answer showing how I, P, and B frames are identified for MPEG. The h263-over-rtp.pcap sample capture has examples with I and P frames for H.263.

answered Apr 19 '12 at 17:52 by Fuhrmanator · edited May 23 '17 at 11:49 by Community

I did explore it, but I was a little confused about those things, so I asked here. Any help in answering them would be appreciated – user1192671 Apr 20 '12 at 06:52

score 0 · Answer 2 · answered Jul 31 '21 at 18:37

This is an old question, but I think it is a good one.

Since you mention I, P, and B frames, and this was asked in 2012, you are likely referring to H.264 over RTP.

According to RFC 6184, the marker bit is set on the last packet of a frame (an access unit, in the RFC's terms). So the marker bit does indicate the end of one frame, and the next packet in sequence is the start of the next frame.

According to the same RFC, all packets of a frame carry the same RTP timestamp, so a change in the RTP timestamp is another indicator that the previous frame has ended and a new one has started.
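To make the two indicators concrete, here is a minimal Python sketch. It assumes you already have raw RTP packets in sequence order with nothing lost; the names `RtpHeader`, `parse_rtp_header`, and `split_frames` are made up for illustration. It reads the fixed 12-byte RTP header and closes a frame on either a marker bit or a timestamp change.

```python
import struct
from typing import Iterable, Iterator, List, NamedTuple


class RtpHeader(NamedTuple):
    marker: bool
    payload_type: int
    seq: int
    timestamp: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fields we need from the fixed 12-byte RTP header."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Byte 0: V(2) P(1) X(1) CC(4); byte 1: M(1) PT(7); then seq and timestamp.
    b0, b1, seq, ts = struct.unpack("!BBHI", packet[:8])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    return RtpHeader(bool(b1 & 0x80), b1 & 0x7F, seq, ts)


def split_frames(headers: Iterable[RtpHeader]) -> Iterator[List[RtpHeader]]:
    """Group in-order packets into frames (assumes no loss or reordering)."""
    frame: List[RtpHeader] = []
    for h in headers:
        # A timestamp change closes the previous frame even without a marker.
        if frame and h.timestamp != frame[-1].timestamp:
            yield frame
            frame = []
        frame.append(h)
        # The marker bit closes the frame this packet belongs to.
        if h.marker:
            yield frame
            frame = []
    if frame:
        yield frame  # trailing packets of a frame whose end we have not seen
```

Using both checks means a frame is still closed when the packet carrying the marker bit is missing, although that frame is then incomplete.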

Things get trickier when you lose packets. For example, say you lose packets 5 and 6, and these were the last packet of frame 1 and the first packet of frame 2. You know to discard frame 1 because you never got a packet with the marker bit for that frame, but how can you know whether frame 2 is whole? Maybe both lost packets were part of frame 1, or maybe the second one was part of frame 2.
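Detecting the loss itself is the easy part. The sketch below (the helper name `missing_between` is made up) compares consecutive 16-bit sequence numbers and allows for wraparound. It tells you which sequence numbers are missing, but not which frame they belonged to, which is exactly the ambiguity described above.

```python
from typing import List


def missing_between(prev_seq: int, seq: int) -> List[int]:
    """Return the sequence numbers lost between two received packets.

    RTP sequence numbers are 16 bits wide and wrap around, so the
    difference is taken modulo 2**16. A "gap" of half the number space
    or more is treated as a late or reordered packet, not as a loss
    (an assumption of this sketch).
    """
    gap = (seq - prev_seq) & 0xFFFF
    if gap >= 0x8000:
        return []
    return [(prev_seq + i) & 0xFFFF for i in range(1, gap)]
```

In the example, receiving packet 7 right after packet 4 gives `missing_between(4, 7) == [5, 6]`.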

RFC 6184 defines a start bit that is set in the first packet of a fragmented NAL unit. If the NAL unit is not fragmented then, by definition, we have the whole NAL unit if we have the packet. This means we can tell whether we received a complete NAL unit. Unfortunately, that does not guarantee we have the full frame, since a frame can contain multiple NAL units (e.g. multiple slices) and we may have lost the first one. I don't have a solution for this problem, but maybe someone will provide one sometime in the next 10 years.
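For reference, here is a rough sketch of reading that start bit. It assumes `payload` is the RTP payload with the RTP header already stripped, and it handles only single NAL unit packets (types 1 to 23) and FU-A fragments (type 28). The function name and the returned tuple layout are illustrative, not taken from any library.

```python
from typing import Tuple


def classify_h264_payload(payload: bytes) -> Tuple[str, int, bool, bool]:
    """Return (kind, nal_type, is_start, is_end) for an H.264 RTP payload."""
    if not payload:
        raise ValueError("empty RTP payload")
    nal_type = payload[0] & 0x1F  # low 5 bits of the first payload byte
    if 1 <= nal_type <= 23:
        # Single NAL unit packet: the packet is the whole NAL unit.
        return ("single", nal_type, True, True)
    if nal_type == 28:
        # FU-A: the second byte is the FU header, S(1) E(1) R(1) Type(5).
        if len(payload) < 2:
            raise ValueError("FU-A payload is missing its FU header")
        fu = payload[1]
        return ("fu-a", fu & 0x1F, bool(fu & 0x80), bool(fu & 0x40))
    # Aggregation packets and other types are not handled in this sketch.
    return ("other", nal_type, False, False)
```

A fragmented NAL unit is complete only if you see the fragment with the start bit, the fragment with the end bit, and no sequence-number gap between them. As said above, this confirms a complete NAL unit, not a complete frame.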
