15

There is a question with the same title, but unfortunately it doesn't help me.

I am trying to parse the data of an SOS marker. All the documentation I can find says that the marker (0xFFDA) is followed by a two-byte number that defines how long this segment is -- for example here -- as with most variable-size markers. But I don't seem to understand it correctly in this case. It works for every other marker type.

I checked multiple files but just can't get this right. Doesn't this number define how long the complete SOS field is? For a baseline JPEG there should be exactly one SOS segment, followed by the End Of Image marker. If it is progressive there can be multiple SOS segments, but all of them should still have a length field.

I have one picture with a SOF0 marker, so it should be baseline. I believe this is the correct SOFn marker because the image resolution can be found after it. With a hex editor I found three 0xFFDA markers, and all of them have 0x000C in the following 2 bytes. So that segment, as I understand it, should always be 12 bytes long. But in all three cases no new marker follows after 12 bytes of data. I guess the very last one is the scan I am looking for, because whenever the value 0xFF comes up it is followed by 0x00 -- except for the restart markers.

Are the two bytes following 0xFFDA not the length field?

EDIT: Thanks to the comments and the answer, it seems there is no length field for the actual compressed data, and the only way to know where it ends is to decode it.

Why does a Baseline DCT Image have multiple scans? I would understand why it has two; the main image and a thumbnail, but what is the third scan?

But there is one more thing. The DRI marker (Define Restart Interval) contains the interval after which a scan should have a restart marker (0xFFD0 - 0xFFD7). But I seem to misunderstand that too, or I'm not doing it right. For example, one DRI marker contained the value 0x0140 as the restart interval. In the following scan I started from the beginning and searched for the first 0xFFD0, but it came after 862 bytes instead of 320.

hippietrail
ap0
  • 3
    The 2 bytes following FFDA are the length (12), but immediately after the SOS marker is the compressed image data. You need to decode that "scan" of data and the next FFxx marker will be after the compressed data. – BitBank Nov 03 '14 at 14:06
  • 1
    @BitBank, so there is no field which tells me how long the compressed data is? – ap0 Nov 03 '14 at 15:10
  • 2
    The compressed data doesn't have a length field; it must be decoded to find the end, or if you must know where the end is, look for the next FFD9 marker after the FFDA. – BitBank Nov 03 '14 at 17:16
  • @BitBank, I have edited my question, maybe you could help one more time, please. – ap0 Nov 04 '14 at 07:49
  • 1
    Baseline images will have one scan. A file with a thumbnail is really 2 JPEG images and 2 scans. Please post a sample image which has 3 scans and I'll take a look. Here is my previous answer about restart markers: http://stackoverflow.com/questions/8748671/jpeg-restart-markers?s=5|1.9294 – BitBank Nov 04 '14 at 09:50
  • @BitBank, I have seen your answer to that question and understand it now. Unfortunately I do not know where I can find the number of vertical and horizontal MCUs in one image. Here is the link to the image in which I found 3 scans: http://oi61.tinypic.com/28hhr1e.jpg – ap0 Nov 04 '14 at 09:55
  • Your image was generated by Photoshop and it has a photoshop thumbnail in the FFED (PhotoShop APP13 marker) as well as an EXIF thumbnail and the main image (3 images total). In order to understand JPEG, you must understand how color and color subsampling are defined with different MCU configurations. A non-subsampled YCC image will have each MCU define 8x8 pixels -> a 640x480 image will have 80x60 MCU blocks = 4800 total MCUs (and 14400 total DCT blocks - 4800 each for Y, Cr, Cb) – BitBank Nov 04 '14 at 11:06
  • Baseline JPEG can have multiple scans. The only effective difference between baseline and extended sequential is that baseline is limited to 2 Huffman tables and 2 quantization tables; a somewhat nonsensical limit, as it takes no more code to decode a stream with 9999999 Huffman or quantization tables than it does with 2. – user3344003 Nov 04 '14 at 16:41

2 Answers

5

The SOS marker contains the compressed data, the most complex part of the JPEG stream. The SOFn marker indicates the format of the data. SOF0 and SOF1 are processed identically. SOF2 (progressive) is quite a bit different. (The rest of the SOFn markers are not in common use or commonly supported.)

The length is that of the SOS header, not the compressed data. Most of the header is only applicable to progressive scans (SOF2).

The compressed data follows the header. It has no length field, so you have to scan through the data to find the next marker.
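
As a rough illustration of where the header ends and the compressed data begins, here is a minimal Python sketch. The function name `parse_sos_header` and its return shape are made up for this example; the field layout (length, component count, one selector pair per component, then three spectral/approximation bytes) is the one described for the scan header in ITU T.81. It assumes `pos` points at the `FF DA` bytes and does no bounds checking beyond what slicing gives you.

```python
import struct

def parse_sos_header(data: bytes, pos: int):
    """Parse the SOS header whose 0xFF 0xDA marker starts at `pos`.

    Hypothetical helper for illustration. Returns
    (components, ss, se, ah, al, data_start), where data_start is the
    offset of the first byte of compressed data.
    """
    if data[pos:pos + 2] != b"\xff\xda":
        raise ValueError("no SOS marker at this offset")
    # The length is big-endian and counts its own two bytes, not the marker.
    (length,) = struct.unpack_from(">H", data, pos + 2)
    ns = data[pos + 4]  # number of components in this scan
    if length != 6 + 2 * ns:
        raise ValueError("unexpected SOS header length")
    components = []
    for i in range(ns):
        cs = data[pos + 5 + 2 * i]      # component selector
        tables = data[pos + 6 + 2 * i]  # DC table (high nibble), AC table (low nibble)
        components.append((cs, tables >> 4, tables & 0x0F))
    # Spectral selection start/end and successive approximation (progressive only).
    ss, se, ahal = data[pos + 5 + 2 * ns:pos + 8 + 2 * ns]
    data_start = pos + 2 + length
    return components, ss, se, ahal >> 4, ahal & 0x0F, data_start
```

For a three-component scan the length is 6 + 2 * 3 = 12, which matches the 0x000C seen in the question; the compressed data then starts 14 bytes after the start of the marker.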

user3344003
  • 5
    You don't have to decode the data to find the end of the SOF marker. Scan for FF in the stream. FFFF means a compressed FF value (skip those). FFD0 - FFD7 are restart markers. Ignore those. Any other FFxx value should be the next block in the stream. The restart markers mean nothing unless you decompress. They are MCU intervals, not byte intervals. – user3344003 Nov 04 '14 at 17:25
  • This is incorrect for some reasons since it is possible to create FF as values and along. So if you are using it you must check for even more information. – Martin Kersten Mar 14 '16 at 16:35
  • 5
    I have one error. FF is encoded as FF00. That's all you have to check for other than a restart marker. – user3344003 Mar 15 '16 at 00:16
  • It depends on what position you are really talking about. Basically you just have to add more checks to it than just FF+some more. All you can do is reduce the probability that you interpret something as the start that is not. So basically one should adapt to it, or it will generate hard-to-spot random defects in the upper layers of your application (users of your decoder code). – Martin Kersten Mar 15 '16 at 10:03
  • 4
    @MartinKersten, I believe user3344003 is right. from wikipedia [page](https://en.wikipedia.org/wiki/JPEG): *Within the entropy-coded data, after any 0xFF byte, a 0x00 byte is inserted by the encoder before the next byte, so that there does not appear to be a marker where none is intended, preventing framing errors. Decoders must skip this 0x00 byte. This technique, called byte stuffing (see JPEG specification section F.1.2.3)* – AaA Jun 22 '16 at 07:12
5

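Summary of how to find the next marker after the SOS marker (0xFFDA):

  1. Skip the SOS header: the 2-byte length field after the marker gives its size, counting the length field itself (at the very least, skip the first 3 bytes: 2 bytes of header size + 1 byte for the number of image components in the scan).
  2. Search for the next FFxx marker (skip every FF00 and the range FFD0 to FFD7, because they are part of the scan).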

*This is a summary of the comments below user3344003's post + my knowledge + Table B.1 from https://www.w3.org/Graphics/JPEG/itu-t81.pdf.

*Based on Table B.1, I suspect that the values FF01 and FF02 through FFBF should also be skipped in point 2, but I am not sure whether they can appear as part of the encoded SOS data.
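
The search from point 2 can be sketched in Python roughly like this. The function name `find_next_marker` is made up for the example, and it assumes an ITU T.81 stream that uses FF00 byte stuffing, with `start` already pointing past the SOS header. Treating a run of FF bytes as fill bytes before a marker is my reading of T.81 section B.1.1.2, not something taken from the comments.

```python
def find_next_marker(data: bytes, start: int) -> int:
    """Return the offset of the first real marker at or after `start`, or -1.

    Hypothetical helper for illustration; `start` should point into the
    compressed data that follows an SOS header.
    """
    i = start
    last = len(data) - 1
    while i < last:
        if data[i] != 0xFF:
            i += 1
            continue
        nxt = data[i + 1]
        if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:
            i += 2  # stuffed FF00 or restart marker: still part of the scan
        elif nxt == 0xFF:
            i += 1  # fill byte: the following FF may start the real marker
        else:
            return i  # offset of the FF that begins the next marker
    return -1
```

For a baseline file the marker found this way is normally EOI (FFD9); in a progressive file it is usually a DHT or the next SOS.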


Additional question above:

Why does a Baseline DCT Image have multiple scans? I would understand why it has two; the main image and a thumbnail, but what is the third scan?

If the image stream contains an APP2 marker (0xFFE2), which tells us it can be (but does not have to be) a Multi-Picture JPEG (tag named MPF), we can have more than 3 images. In general, APP markers can store anything; there are many standards related to APP segments in JPEG files.

The first table tells us about the "things" that can be stored in each APP segment:
https://exiftool.org/TagNames/JPEG.html
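
To check which APP segments a particular file carries, you can walk the top-level segments by their length fields instead of searching for raw byte patterns. Below is a minimal Python sketch; the function name `list_segments` is made up, and it assumes the file starts with SOI and that every segment before the first top-level SOS has a length field (which is the usual case). Thumbnails embedded inside APP segments, such as the EXIF and Photoshop ones mentioned in the comments on the question, are stepped over, so their FFDA markers are not reported.

```python
import struct

def list_segments(data: bytes):
    """Return [(marker, offset, length)] up to and including the first top-level SOS.

    Hypothetical helper for illustration; `marker` is the second marker
    byte (0xE1 for APP1, 0xDA for SOS, and so on).
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        # Big-endian length that includes its own two bytes.
        (length,) = struct.unpack_from(">H", data, pos + 2)
        segments.append((marker, pos, length))
        if marker == 0xDA:  # compressed data follows; stop here
            break
        pos += 2 + length
    return segments
```

Calling it as `list_segments(open("photo.jpg", "rb").read())` (the file name is a placeholder) and comparing the result with a hex search for FFDA shows which of the scans sit inside APP segments rather than at the top level.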

Gringo Suave
fider
  • The answer is misleading. It actually depends on the compression the JPEG container uses. The original ITU T.81 says that the encoding process shall escape a 0xFF byte with the 0xFF 0x00 byte sequence, and that the decoding process should skip the 0x00 in a 0xFF 0x00 byte sequence. Other JPEG standard extensions do not follow this rule literally. For example, the ITU T.87 JPEG-LS encoding samples clearly have 0xFF not followed by 0x00 (for example 0xFF 0x7F) in the sample .jls files available with the original standard document. Is this a failure of the standard, of the standard's examples, or just of the people who wrote the standard? – SoLaR Jun 04 '21 at 08:51