Why does len() returned a signed value?

Question

Go's builtin len() function returns a signed int. Why wasn't a uint used instead?

Is it ever possible for len() to return something negative?
As far as I can tell, the answer is no:

Arrays: "The number of elements is called the length and is never negative."
Slices: "At any time the following relationship holds: 0 <= len(s) <= cap(s)"
Maps "The number of map elements is called its length". (I couldn't find anything in the spec that explicitly restricts this to a nonnegative value, but it's difficult for me to understand how there could be fewer than 0 elements in a map)
Strings "A string value is a (possibly empty) sequence of bytes.... The length of a string s (its size in bytes) can be discovered using the built-in function len()" (Again, hard to see how a sequence could have a negative number of bytes)
Channels "number of elements queued in channel buffer (ditto)

A `uint` might be superior in case you want to address the 2147483648th element of a slice. But then, you'd be in trouble if you wanted to address the 4294967296th element. — Jesse Amano, Aug 22 '16 at 21:37

score 48 · Answer 1 · edited Mar 21 '18 at 14:26

48

len() (and cap()) return int because that is what is used to index slices and arrays (not uint). So the question is more "Why does Go use signed integers to index slices/arrays when there are no negative indices?".

The answer is simple: It is common to compute an index and such computations tend to underflow much too easy if done in unsigned integers. Some innocent code like i := a-b+7 might yield i == 4294967291 for innocent values for aand b of 6 and 10. Such an index will probably overflow your slice. Lots of index calculations happen around 0 and are tricky to get right using unsigned integers and these bugs hide behind mathematically totally sensible and sound formulas. This is neither safe nor convenient.

This is a tradeoff based on experience: Underflow tends to happen often for index calculations done with unsigned ints while overflow is much less common if signed integers are used for index calculations.

Additionally: There is basically zero benefit from using unsigned integers in these cases.

edited Mar 21 '18 at 14:26

Jonathan Hall

75,165
16
143
189

answered Aug 23 '16 at 04:22

Volker

40,468
7
81
87

1

I thought that Go runtime does bounds checking for any kind of access to arrays or slices, thus underflows with both signed or unsigned integers are absolutely equal in its consequences (run-time panic) when we're talking about _indexes_. But unsigned math can still be problematic when one uses it to compute _length_ or _capacity_ of the array to allocate. – Roman Khimov Aug 23 '16 at 09:20
1

@RomanKhimov Yes, **if** an over-/underflow happens the consequences are the same. My argument is: There are less ...flows if signed ints are used and more ...flows if unsigned ints are used for index calculations because indices around 0 are much more common than indices around MaxUint32. – Volker Aug 23 '16 at 09:25
3

Is there an error in the example here? I don't think 6 - 10 + 7 underflows, does it? Regardless, I'm not sure I follow the argument. Is it that in an underflow scenario an obvious negative index error is preferable to the more subtle bugs that could be caused by a large unsigned integer index? – Thom Carter Feb 09 '21 at 20:53
@Carter 1. I never claimed 6-10+7 would underflow. This is a constant expression evaluated at compiletime with arbitrary precission. I said a-b+7 could result in unexpected behaviour if a and b wäre unsigned.) 2. Every over or underflow is a bug. The argument is: index calculations around 0 are much more common than araound MaxUint. This doing Index calculations with uints is more dangerous as there will be more errors. – Volker Feb 10 '21 at 08:22
1

question: what would be an example of undefined behavior that happens in the context of such a "temporary under-zero dip"? the a+b-6 appears to be an example of a "temporary underflow" - the final result is as expected but an intermediate value blows up to a huge value. – anon2328 Feb 05 '22 at 18:24
@aleks224 "an example of undefined behavior" there is no undefined behavior in GO for such things. I have to admit I do not understand your question. – Volker Feb 06 '22 at 12:34
"basically zero benefit from unsigned integers" Isn't the benefit that your slices can be twice as big (full 64-bit address space)? – weberc2 Apr 03 '23 at 21:09
@weberc2 I'm not sure how to interpret your comment as it's either a few days late to be a funny April Fools' Day joke or lacks understanding on how large 2^63 actually _is_, how many bits of 64-bit address space are used at all and how large hardware is going to be the next 50 years. – Volker Apr 04 '23 at 06:10
@Volker yeah, I mathed wrong. – weberc2 Apr 04 '23 at 14:47

score 7 · Answer 2 · edited Apr 07 '21 at 00:34

There is a proposal in progress "issue 31795 Go 2: change len, cap to return untyped int if result is constant"

It might be included for Go 1.14 (Q1 2010)

we should be able to do it for len and cap without problems - and indeed there aren't any in the stdlib as a type-checking it via a modified type checker shows

See CL 179184 as a PoC: this is still experimental.

As noted below by peterSO, this has been closed.

Robert Griesemer explains:

As you noted, the problem with making len always untyped is the size of the result. For booleans (and also strings) the size is known, no matter what kind of boolean (or string).

Russ Cox added:

I am not sure the costs here are worth the benefit. Today there is a simple rule: len(x) has type int. Changing the type to depend on what x is will interact in non-orthogonal ways with various code changes. For example, under the proposed semantics, this code compiles:
const x string = "hello"
func f(uintptr)
...
f(len(x))
but suppose then someone comes along and wants to be able to modify x for testing or something like that, so they s/const/var/. That's usually fairly safe, but now the f(len(x)) call fails to type-check, and it will be mysterious why it ever worked.

This change seems like it might add more rough edges than it removes.

[proposal: Go 2: change len, cap to return untyped ints if result is constant](https://github.com/golang/go/issues/31795) has been retracted and closed. — peterSO, Dec 21 '19 at 04:32

score 6 · Answer 3 · 2016-08-22T22:47:46.783

6

Length and capacity

The built-in functions len and cap take arguments of various types and return a result of type int. The implementation guarantees that the result always fits into an int.

Golang is strongly typed language, so if len() was uint then instead of:

i := 0 // int
if len(a) == i {
}

you should write:

if len(a) == uint(i) {
}

or:

if int(len(a)) == i {
}

Also See:

uint either 32 or 64 bits
int same size as uint
uintptr an unsigned integer large enough to store the uninterpreted bits of a pointer value

Also for compatibility with C: CGo the C.size_t and size of array in C is of type int.

edited Aug 22 '16 at 22:47

answered Aug 22 '16 at 21:40

Are the conversions required? A quick experiment indicates that you can compare `int` and `uint` values to integer literals without any problem (though trying to compare `int` and `uint` *variables* causes a "mismatched types" error). – Keith Thompson Aug 22 '16 at 21:53
@KeithThompson for untyped numbers You are right , this is imaginary example, thanks. – Aug 22 '16 at 22:06

score 1 · Answer 4 · answered Aug 22 '16 at 21:49

1

From the spec:

The length is part of the array's type; it must evaluate to a non-negative constant representable by a value of type int. The length of array a can be discovered using the built-in function len. The elements can be addressed by integer indices 0 through len(a)-1. Array types are always one-dimensional but may be composed to form multi-dimensional types.

I realize it's maybe a little circular to say the spec dictates X because the spec dictates Y, but since the length can't exceed the maximum value of an int, it's equally as impossible for len to return a uint-exclusive value as for it to return a negative value.

answered Aug 22 '16 at 21:49

Jesse Amano

800
5
16

(Speculative) There are a few good reasons I can think of for a language's author to desire that arrays be indexed by *signed* integers. Arrays are usually allocated as cohesive blocks of memory[citation needed] with pointer arithmetic used to address an element[citation needed]. You might store metadata, such as the array's length, in a negative offset[citation needed], and it may be circumstantially easier to represent certain array operations by adding an offset which may or may not be negative, as opposed to conditionally adding or subtracting offsets. – Jesse Amano Aug 22 '16 at 21:57

Why does len() returned a signed value?

4 Answers4

Linked