The Photoshop file format documentation mentions Pascal strings without explaining what they are.

So, what are they, and how are they encoded?

Drew Noakes

1 Answer

A Pascal-style string has one leading byte that holds the length, followed by that many bytes of character data.

This means that a Pascal-style string can hold between 0 and 255 bytes of data, which is 0 to 255 characters under a single-byte character encoding such as ASCII.
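To make the layout concrete, here is a minimal Python sketch of encoding and decoding such a string. The function names are my own, and ASCII is assumed as the character encoding; neither comes from the Photoshop documentation, which may also apply padding rules not shown here.

```python
from typing import Tuple


def encode_pascal_string(text: str, encoding: str = "ascii") -> bytes:
    """Encode text as one length byte followed by the character data.

    Hypothetical helper; the encoding default is an assumption.
    """
    data = text.encode(encoding)
    if len(data) > 255:
        # A single unsigned byte cannot describe more than 255 bytes.
        raise ValueError("Pascal-style strings are limited to 255 bytes")
    return bytes([len(data)]) + data


def decode_pascal_string(
    buf: bytes, offset: int = 0, encoding: str = "ascii"
) -> Tuple[str, int]:
    """Decode a Pascal-style string at offset.

    Returns the text and the offset of the first byte after the string.
    """
    length = buf[offset]  # the leading length byte
    start = offset + 1
    end = start + length
    if end > len(buf):
        raise ValueError("buffer ends before the declared string length")
    return buf[start:end].decode(encoding), end
```

For example, `encode_pascal_string("abc")` yields `b"\x03abc"`, and the empty string is the single byte `b"\x00"`. Returning the next offset makes it easy to keep reading further fields from the same buffer.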

As an aside, another popular string encoding is the C-style string, which has no length specifier and instead uses a zero byte to mark the end of the string. C-style strings therefore have no inherent length limit, but they cannot contain a zero byte as data.
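For comparison, reading a C-style string means scanning for the terminator instead of reading a length. A rough Python sketch, with a made-up function name and ASCII assumed:

```python
from typing import Tuple


def decode_c_string(
    buf: bytes, offset: int = 0, encoding: str = "ascii"
) -> Tuple[str, int]:
    """Decode a zero-terminated string at offset.

    Returns the text and the offset just past the terminator.
    Hypothetical helper for illustration only.
    """
    # bytes.index raises ValueError if no zero byte is found.
    end = buf.index(b"\x00", offset)
    return buf[offset:end].decode(encoding), end + 1
```

Note that the reader has to examine every byte to find the end, whereas with a length prefix it can skip straight past the string.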

Yet other encodings may use a greater number of prefix bytes to facilitate longer strings. Terminator bytes/sentinels may also be used along with length prefixes.
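As an illustration of a wider prefix, the sketch below uses a 4-byte unsigned length. The width, the big-endian byte order and the function names are all assumptions chosen for the example, not a description of any particular file format.

```python
import struct
from typing import Tuple

# Assumed layout: 4-byte big-endian unsigned length, then the data.
_PREFIX = ">I"
_PREFIX_SIZE = struct.calcsize(_PREFIX)  # 4 bytes


def encode_prefixed(data: bytes) -> bytes:
    """Prefix data with its length as a 4-byte big-endian integer."""
    return struct.pack(_PREFIX, len(data)) + data


def decode_prefixed(buf: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Read a length-prefixed byte string at offset.

    Returns the data and the offset of the first byte after it.
    """
    (length,) = struct.unpack_from(_PREFIX, buf, offset)
    start = offset + _PREFIX_SIZE
    end = start + length
    if end > len(buf):
        raise ValueError("buffer ends before the declared length")
    return buf[start:end], end
```

With four prefix bytes the limit rises from 255 bytes to 2^32 - 1 bytes, at the cost of three extra bytes per string.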

Drew Noakes
  • This description is correct, but it applies to traditional Pascal strings only. Modern Object Pascal dialects like those in Delphi, Free Pascal and Lazarus support a plethora of string types with different encodings, some of them without a length limit. See http://wiki.freepascal.org/Character_and_string_types for reference. I assume, however, that "Pascal strings" in the Photoshop documentation refers to traditional Pascal strings. – jwdietrich Feb 15 '15 at 21:09
  • Indeed, the (rather long-lived) Photoshop spec uses _Pascal string_ in place of the more precise [_Pascal ShortString_](http://wiki.freepascal.org/Character_and_string_types#ShortString). Thanks for the reference. – Drew Noakes Feb 15 '15 at 21:55
  • Only Delphi renamed "string" to shortstring because they introduced ansistring. The origin of string afaik is UCSD Pascal, see http://stackoverflow.com/questions/25068903/what-are-pascal-strings/25079998#25079998 – Marco van de Voort Feb 16 '15 at 12:07