
I'm not a Pascal newbie, but I still don't understand why Delphi and Free Pascal usually declare parameters and return values as signed integers when, as far as I can see, they should always be positive. For example:

  • Pos() returns an Integer. Can the result ever be negative?
  • SetLength() declares its NewLength parameter as an Integer. Can a string have a negative length?
  • System.THandle is declared as Longint. Can a handle ever be a negative number?

There are many decisions like these in Delphi and Free Pascal. What considerations were behind them?

Peter Mortensen
Astaroth
  • FPC's consideration is simple: be Delphi-compatible. Delphi's consideration for Pos probably goes back (via Turbo Pascal) to the 1974 Pascal Report. – Arioch 'The Oct 08 '12 at 12:10
  • Arioch: FPC also got it via TP. This kind of thing was decided long before FPC started working on Delphi compatibility. – Marco van de Voort Oct 08 '12 at 13:54

4 Answers


In Pascal, Integer (signed) is the base type. All other integer types are subranges of Integer. (This is not entirely true in Borland dialects, given Longint in TP and Int64 in Delphi, but it is close enough.)

An important reason for this is that if an intermediate result of a calculation becomes negative while you are calculating with unsigned integers, range check errors will trigger. And since most older programming languages do NOT assume two's-complement integers, the result (with range checks off) might even be corrupt.

The THandle case is much simpler. Delphi didn't have a proper 32-bit unsigned type until D4, only a 31-bit Cardinal. (Since a 32-bit unsigned integer is not a subrange of Integer, the later unsigned types are subranges of Int64, which moved the problem to UInt64, which was only added in D2010 or so.)

So in many places in the headers, signed types are used where the WinAPI uses unsigned types, probably to avoid the 32nd bit getting accidentally corrupted in those versions, and the custom stuck.

But the winapi case is different from the general case.

Added later: Some Pascal (and Modula-2/3) implementations circumvent this trap by making Integer larger than the word size and requiring all numeric types to declare a proper subrange, as in the program below.

The first preserves the primary assumption that everything is a subrange of Integer, and the second allows the compiler to scale nearly everything down again to fit in registers, especially if the CPU has some operations on larger-than-word operands (like x86, where a 32-bit * 32-bit mul gives a 64-bit result) or can detect word-size overflows using status bits (e.g. to generate range exceptions for adds without doing a full 2*wordsize add).

   var x : 0..20;
       y : -10..10;
       
   begin
     // any expression of x and y has a range -10..20
   end.
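As a rough illustration of the "wide Integer plus declared subranges" idea, here is a minimal sketch in C rather than Pascal. It assumes a 64-bit type stands in for the oversized Integer; the `subrange` struct and the helpers `wide_difference` and `assign_checked` are hypothetical names made up for this example, not part of any compiler or RTL.

```c
#include <stdint.h>

/* Hypothetical subrange descriptor: inclusive bounds, like Pascal's lo..hi. */
typedef struct {
    int64_t lo;
    int64_t hi;
} subrange;

/* Evaluate x - y in a type wider than either operand, so a negative or
   larger-than-32-bit intermediate result is represented exactly. */
int64_t wide_difference(uint32_t x, int32_t y)
{
    return (int64_t)x - (int64_t)y;
}

/* Store value into *out only if it fits the target subrange.
   Returns 1 on success, 0 otherwise (the analogue of a range-check error). */
int assign_checked(subrange target, int64_t value, int64_t *out)
{
    if (value < target.lo || value > target.hi) {
        return 0;
    }
    *out = value;
    return 1;
}
```

With a target of `{0, 20}`, `wide_difference(0, 10)` yields -10 in the wide type and is only rejected at the point of assignment, which is where a range check belongs in this scheme.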

Turbo Pascal and Delphi emulate an integer type twice the wordsize for their 16-bit and 32-bit offerings. The handling of the highest unsigned type is hacky at best.

Marco van de Voort
  • In recent versions, Delphi has moved to using unsigned types for Windows API types where appropriate. For instance, `THandle` now maps to `NativeUInt`. – Remy Lebeau Oct 08 '12 at 19:26
  • I would expect that, but I happened to compile the most recent VST (5.01) yesterday, which is said to be XE2 compatible. I still got many signed<->unsigned differences in the interface definitions that VST implements. – Marco van de Voort Oct 08 '12 at 19:34

Well, for a start THandle is declared incorrectly. It's unsigned in the Windows headers and should be so in Delphi. In fact I think this was corrected in a recent release of Delphi.

I'd imagine that the preference for signed over unsigned is largely historical and not particularly significant. However, I can think of one example where it is important. Consider the for loop:

for i := 0 to Count-1 do

If i is unsigned and Count is 0 then this loop runs from 0 to $FFFFFFFF which is not what you want. Using a signed integer loop variable avoids that problem.

Pascal is a victim of its syntax here. The equivalent C or C++ loop has no such trouble

for (unsigned int i=0; i<Count; i++)

due to the syntactic difference and use of a comparison operator as stopping condition.
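The difference can be sketched in C by emulating the Pascal-style inclusive loop, whose upper bound `Count-1` is computed once before the loop starts. This is only an illustration: the function names are hypothetical, and the `cap` parameter is an assumption added so that the runaway case stops after a bounded number of iterations instead of running about four billion times.

```c
#include <stdint.h>

/* Inclusive loop "for i := first to last", counted in a wide type so the
   counter itself cannot overflow. Stops early once cap iterations are done. */
static uint64_t inclusive_loop(int64_t first, int64_t last, uint64_t cap)
{
    uint64_t n = 0;
    for (int64_t i = first; i <= last && n < cap; ++i) {
        ++n;
    }
    return n;
}

/* Unsigned Count: Count-1 wraps to 0xFFFFFFFF when Count is 0. */
uint64_t iterations_unsigned(uint32_t count, uint64_t cap)
{
    uint32_t last = count - 1u;
    return inclusive_loop(0, (int64_t)last, cap);
}

/* Signed Count: Count-1 is -1 when Count is 0, so the range is empty.
   Assumes count is not INT32_MIN. */
uint64_t iterations_signed(int32_t count, uint64_t cap)
{
    int32_t last = count - 1;
    return inclusive_loop(0, (int64_t)last, cap);
}

/* C-style loop with an exclusive comparison: no subtraction, no wraparound. */
uint64_t iterations_c_style(uint32_t count, uint64_t cap)
{
    uint64_t n = 0;
    for (uint32_t i = 0; i < count && n < cap; ++i) {
        ++n;
    }
    return n;
}
```

Calling `iterations_unsigned(0, 1000)` runs into the cap, while `iterations_signed(0, 1000)` and `iterations_c_style(0, 1000)` both perform no iterations.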

This could also be the reason why Length() on a string or dynamic array returns a signed value. And so for consistency, SetLength() should accept signed values. And given that the return value of Pos() is used to index strings, it should be signed also.

Here's another Stack Overflow discussion of the topic: Should I use unsigned integers for counting members?

Of course, I'm speculating wildly here. Perhaps there was no design and just out of habit the precedent of using signed values was set and became enshrined.

David Heffernan
  • +1 from me. Your answer, as well as Marco van de Voort's, deserves to be accepted. However, I must choose one of them. – Astaroth Oct 08 '12 at 14:19
  • @Astaroth Marco is an [authority on the subject](http://www.freepascal.org/aboutus.var). His answer is the right one to accept. – David Heffernan Oct 08 '12 at 14:21
  • This is one area where C/C++ outshines Delphi. A C/C++ `for` loop is much more flexible than Delphi's `for` loop, eg: `for(unsigned i = 0; i < Count; ++i)` does not suffer from the overflow issue when `Count` is unsigned, thus no need to check if `Count` is 0 beforehand. I have never liked that Delphi's loop condition is inclusive, requiring you to perform a bounds check beforehand. – Remy Lebeau Oct 08 '12 at 19:28
  • @RemyLebeau On the other hand, the C family of languages don't have a loop which evaluates the terminating condition once only. So if the loop condition is `i – David Heffernan Oct 08 '12 at 19:32
  • True, but that can be done using a local variable, the same as the Delphi compiler uses implicitly: `unsigned condition = ...; for(unsigned i = 0; i < condition; ++i)` – Remy Lebeau Oct 08 '12 at 20:05
  • @RemyLebeau Yes, that's exactly what I said. But it clutters the code. – David Heffernan Oct 08 '12 at 20:16
  • Pascal only calculates the higher bound once (since FOR is not a while under the hood). It could simply check for underflow there. It might not even require extra instructions, just a modification of the branch instruction that does the not nil check now. – Marco van de Voort May 20 '15 at 12:39
  • Apples to oranges. If the for loop were different, say 1 to Count, then C has a problem and will loop forever: I = 1; I <= Count; I++. So C has its flaws too. In Delphi the code would work perfectly again: I := 1 to Count. So the conclusion: for zero-based for loops Delphi has an issue, for one-based for loops C has an issue. Flip-flop! – oOo Nov 12 '21 at 11:29
  • Some string related search functions return -1 when nothing is found.
  • I believe the reasoning behind this is that MaxInt is 2GB, which is the maximum size for strings in 32-bit Delphi. This is because a single process can have up to 2GB of memory.
whosrdaddy
  • In Delphi Pos() returns 0 when nothing is found. – MBo Oct 08 '12 at 12:30
  • `Pos()` was a bad example, but some other VCL functions/methods (e.g. `TStrings.IndexOf()`) return -1 when not found. – afrazier Oct 08 '12 at 13:13
  • @whosrdaddy, another question arises, why MaxInt is 2GB, not 4GB? Why the maximum size for strings is 2GB, not 4GB? – Astaroth Oct 08 '12 at 14:21
  • @Astaroth this is CPU architecture/OS related. $7FFFFFFF = 2GB, which is the maximum positive value for a 32-bit integer – whosrdaddy Oct 08 '12 at 14:49
  • It appears that the string length limit in 64 bit Delphi is also about 1 billion characters, which is about 2 gigabytes of memory. – Warren P Oct 09 '12 at 13:21
  • String size scales with the size of the largest possible single allocation which is not necessarily the same as the range of a pointer. Classically, 32-bit OSes reserved two GB for kernels and reserved I/O leaving 2GB max for an application. Larger memory sizes require special bits in the PE header to be set (large address aware). Though I think this is simply a 640k like artifact, from a time when 2GB seemed "huge". – Marco van de Voort Dec 06 '13 at 12:42

There are many reasons for using signed integers, even some that might apply when you do not intend to return a negative value.

Imagine I write code that calls Pos, and I want to do math with the results. Would you rather have a negative result (Pos('x',s)-5) raise a range-check exception, underflow and become a very large unsigned number around 4 billion, or go negative, if Pos('x',s) returns 1? Either one is a source of problems for new users who seldom think about these cases, but the long-established tradition is that by using Integer results, it's your job to check for negative and zero results and not use them as string offsets. There is an advantage for beginning and for advanced programmers, in using Integer, and not having "negative" values roll under and become large unsigned values or raise range exceptions.
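The three outcomes described above can be sketched in C, where unsigned wraparound is well defined. This is only an illustration under stated assumptions: `pos` stands in for the result of Pos('x',s), and the three function names are hypothetical.

```c
#include <stdint.h>

/* Signed arithmetic: the result simply goes negative. */
int32_t offset_signed(int32_t pos)
{
    return pos - 5;
}

/* Unsigned arithmetic without range checks: the result wraps around
   to a very large value near 4 billion. */
uint32_t offset_unsigned(uint32_t pos)
{
    return pos - 5u;
}

/* Unsigned arithmetic with a range check: report failure instead of
   wrapping. Returns 1 and stores the result on success, 0 on underflow. */
int offset_checked(uint32_t pos, uint32_t *out)
{
    if (pos < 5u) {
        return 0;
    }
    *out = pos - 5u;
    return 1;
}
```

For `pos = 1`, the signed version returns -4, the unchecked unsigned version returns 4294967292, and the checked version reports failure, which corresponds to the range-check exception case.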

Secondly, remember that in beginning programming, one usually introduces Integer (signed) types long before one introduces unsigned types like Cardinal. Beginners often work with functions like Pos, and it makes sense to use the type that will create the least-unfriendly set of side effects. There are no negative side effects to having a range larger than the one you absolutely need (the range you probably need for Pos is 1 to maximum-string-length-in-delphi). There is zero benefit in 32-bit Delphi to using the Cardinal type for Pos, and there definitely ARE downsides to choosing it.

Once you get to 64-bit Delphi, however, you could theoretically have strings LARGER than an Integer can hold, and moving to Cardinal wouldn't fix all your potential problems. However, the chance of anyone having a 2+ GB string is probably nil, and the Delphi 64-bit compiler doesn't allow a >2 GB string anyway. In my testing, I can achieve an almost 1 GB String in 64-bit Delphi. So the practical length limit for a Win64 string is about a billion (1073741814) characters, which uses nearly 2 GB of actual RAM. At that limit, I get either EIntOverflow or EAccessViolation, and it seems I am hitting Delphi run-time library (RTL) bugs, not properly defined limits, so your mileage may vary.

Warren P