I understand that a JPEG file starts with 0xFFD8 (SOI), followed by a number of 0xFFEn segments holding metadata, then a number of segments holding the compression relate data (DQT, DHT, etc) of which the final one is 0xFFDA (SOS); then comes the actual image data which ends with 0xFFD9 (EOI). Each of those segments states its length in the two bytes following the JPEG marker so it is a trivial execise to calculate the end of a segment/start of next segment and the start of the image data can be calculated from the length of the SOS segment.
Up to that point, the appearance of 0xFFD9 (EOI) is irrelevant 1, because the segments are identified by the length. As far as I can see, however, there is no way of determining the length of the image data other than finding the 0xFFD9(EOI) marker following the SOS segment. In order for that to be assured, it would mean that 0xFFD9 must not appear inside the actual image data itself. Is there something built into the JPEG algorithm to ensure that or am I missing something here?
1 A second 0xFFD8 and 0xFFD9 can appear if a thumbnail is included in the image but that is taken care of by the length of the containing segment - usually a 0xFFE1 (APP1) segment from what I have seen. In images I have checked so far, the start and size of the thumbnail image data is still given in the 0x0201 (JPEGInterchangeFormat - Offset to JPEG SOI)and 0x202 (JPEGInterchangeFormatLength - Bytes of JPEG data) fields in IFD1, even though these were deprecated in Tech Note #2.