
TypeScript allows us to alias an array-typed variable with a variable of a supertype (TypeScript arrays are covariant):

const nums: number[] = [];
const things: (number | string)[] = nums;
things.push("foo");
nums[0] *= 3;
console.log(nums[0]); // `NaN` !!

Why? This seems like a nice place to protect us from runtime errors. Given how Java was mocked for having covariant arrays, it seems this was an intentional TS feature.

This was asked by someone else on a stale TypeScript issue, but I didn't see any answers.

Max Heiber

2 Answers


As you've noted, array covariance is unsound and can lead to errors at runtime. One of TypeScript's Design Non-Goals is

  1. Apply a sound or "provably correct" type system. Instead, strike a balance between correctness and productivity.

which means that if some unsound language feature is very useful, and if requiring soundness would make the language very difficult or annoying to use, then it's likely to stay, despite potential pitfalls.

Apparently there comes a point when it is "a fool's errand" to try to guarantee soundness in a language whose primary intent is to describe JavaScript.


I'd say that the underlying issue here is that TypeScript wants to support some very useful features, which unfortunately play poorly together.

The first is subtyping, where types form a hierarchy, and individual values can be of multiple types. If a type S is a subtype of type T, then a value s of type S is also a value of type T. For example, if you have a value of type string, then you can also use it as a value of type string | number (since string is a subtype of string | X for any X). The entire edifice of interface and class hierarchy in TypeScript is built on the notion of subtyping. When S extends T or S implements T, it means that S is a subtype of T. Without subtyping, TypeScript would be harder to use.

The second is aliasing, whereby you can refer to the same data with multiple names and don't have to copy it. JavaScript allows this: const a = {x: ""}; const b = a; b.x = 1;. Except for primitive data types, JavaScript values are references. If you tried to write JavaScript without passing around references, it would be a very different language. If TypeScript enforced that in order to pass an object from one named variable to another you had to copy all of its data over, it would be harder to use.

The third is mutability. Variables and objects in JavaScript are generally mutable; you can reassign variables and object properties. Immutable languages are often considered easier to reason about, cleaner, and more elegant, but it's useful to mutate things. JavaScript is not immutable, and so TypeScript allows mutation. If I have a value const a: {x: string} = {x: "a"};, I can follow up with a.x = "b"; with no error. If TypeScript required that all aliases be immutable, it would be harder to use.

But put these features together and things can go bad:

let a: { x: string } = { x: "" }; // subtype
let b: { x: string | number }; // supertype 
b = a; // aliasing
b.x = 1; // mutation
a.x.toUpperCase(); // explosion: a.x is now the number 1, so this throws a TypeError at runtime


Some languages solve this problem by requiring variance markers. Java's wildcards serve this purpose, but they are fairly complicated to use properly and (anecdotally) considered annoying and difficult.

TypeScript has decided not to do anything here and treat all property types as covariant, despite suggestions to the contrary. Productivity is valued above correctness in this aspect.


For similar reasons, function and method parameters were checked bivariantly until TypeScript 2.6 introduced the --strictFunctionTypes compiler option, at which point only method parameters are still always checked bivariantly.
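As a minimal sketch of what method parameter bivariance permits (the `Animal`, `Dog`, and `MethodStyle` names are hypothetical, made up for illustration), the following compiles even with --strictFunctionTypes enabled, because `handle` is declared with method syntax:

```typescript
interface Animal { name: string }
interface Dog extends Animal { bark(): string }

// Declared with method syntax, so its parameter is compared bivariantly.
interface MethodStyle {
  handle(a: Animal): string;
}

// An implementation that only knows how to deal with dogs.
const dogOnly = {
  handle(d: Dog): string {
    return d.bark();
  },
};

// Accepted by the compiler: Dog is related to Animal, and for method
// parameters that is enough in either direction.
const h: MethodStyle = dogOnly;

let failed = false;
try {
  // The type checker is happy, but this value has no bark() at runtime.
  h.handle({ name: "cat" });
} catch {
  failed = true; // a TypeError was thrown: d.bark is not a function
}
```

If `handle` were instead declared as a function-typed property, `handle: (a: Animal) => string`, the assignment to `h` should be rejected when --strictFunctionTypes is on.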

Bivariant type checking is unsound. But it's useful because it allows mutations, aliasing, and subtyping (without harming productivity by requiring developers to jump through hoops). And method parameter bivariance results in array covariance in TypeScript.
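To make the connection to arrays concrete, here is a rough sketch using a hypothetical `MethodBox<T>` interface and `makeBox` helper (neither is a real library API). Like `Array<T>`, whose `push` is declared with method syntax, the box reads `T` in one method and writes `T` in another, and TypeScript ends up treating it as covariant in `T`:

```typescript
// Hypothetical container with a reader and a writer, both in method syntax.
interface MethodBox<T> {
  get(): T;
  set(value: T): void;
}

// Hypothetical helper that builds a box around a single mutable slot.
function makeBox<T>(initial: T): MethodBox<T> {
  let current = initial;
  return {
    get: () => current,
    set: (value: T) => {
      current = value;
    },
  };
}

const numBox: MethodBox<number> = makeBox(1);

// Accepted, just like assigning number[] to (number | string)[].
const wideBox: MethodBox<number | string> = numBox;

wideBox.set("foo"); // writes a string through the wider alias
const result = numBox.get() * 3; // "foo" * 3 evaluates to NaN at runtime
```

This reproduces the array example from the question with a hand-written type, which suggests the behavior comes from how method parameters are compared rather than from anything special about arrays.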


Okay, hope that helps; good luck!

jcalz
  • The example with the `number | string` union is especially helpful for me! I hadn't realized that the issues with variance ran so deep. – Max Heiber Mar 30 '20 at 13:21
  • The [fool's errand](https://github.com/microsoft/TypeScript/issues/9825#issuecomment-306272034) link was interesting. I'd never considered computed properties affecting semantics of functions like `toString`. Funny stuff. – Nathan Chappell Sep 25 '20 at 07:45
  • Couldn't they fix this problem of array unsoundness by saying that every formerly invariant type parameter consists of two implicit parameter halves, one `in` parameter and one `out` parameter? It could infer both halves of the type parameter automatically. For example, reading Animals from an array and putting Tigers into it would work with a Cat array, and other array-like structures with one explicit `in` parameter and one explicit `out` parameter could work just as well. I believe this could improve productivity because it is more generic than invariance and doesn't hide bugs. – ChrisoLosoph Jul 10 '23 at 16:52

This isn't an issue of covariance; it's an issue of aliasing. Forbidding array covariance is incredibly frustrating, so much so that even languages that are almost entirely invariant (Swift) include an exception for arrays. (Swift avoids the problem you've shown here by preventing aliasing, so this bug is not possible in Swift.)

Imagine a function that accepted a list of optional numbers:

function sum(values: (number|undefined)[]): number {
  return values.reduce((s: number, x?: number) => s + (x ?? 0), 0)
}

Imagine if you could not pass number[] to this function. It is not hard to understand why they did not impose that.
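As a hedged aside, when a function like this only reads its input, the parameter can be declared `readonly`. A `number[]` is still accepted, and the function body can no longer write through the wider element type. This is a sketch of that variant, with the loop rewritten for clarity, not a claim about how the original `sum` should be written:

```typescript
// Same idea as sum above, but the parameter is a read-only view of the array.
function sum(values: readonly (number | undefined)[]): number {
  let total = 0;
  for (const x of values) {
    total += x ?? 0; // treat undefined entries as 0
  }
  // values.push(undefined); // would be a compile error: push does not exist on a readonly array
  return total;
}

const nums: number[] = [1, 2, 3];
const total = sum(nums); // number[] is assignable to readonly (number | undefined)[]
```

Covariance is only dangerous when someone writes through the wider alias, so a read-only parameter keeps the convenience without that particular hazard.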

The frustration is that JavaScript makes it too easy to mutate an alias, but this is a broad class of problem that's much larger than the covariance case. IMO, the following is the major headache that TypeScript should be helping us avoid (difficult as that might be to do given how TS works). If this code raised an error, it would solve the covariance problem as a special case:

const nums: number[] = [];
const things: number[] = nums;
things.push(1);
nums[0] *= 3;
console.log(things[0]); // `3` !!
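For contrast, a minimal sketch of the same snippet where sharing is avoided by copying explicitly. The language does not do this for you; the copy is an opt-in, and it is shallow, so object elements would still be shared:

```typescript
const nums: number[] = [];
const things: number[] = nums.slice(); // shallow copy: a new array, not an alias

things.push(1); // only the copy changes

const numsLength = nums.length;     // nums is still empty
const thingsLength = things.length; // things now has one element
```
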
Rob Napier
  • Can you elaborate on what you mean by "alias" here? Is it just having a different reference to the same array? If so, then if I call `sum(nums)`, is `values` inside the function implementation an alias to `nums`? If not, what's the difference? – jcalz Mar 29 '20 at 22:05
  • Yes, just as you're thinking. When you alias you have two references to the same thing (at least one of which is writable; if neither is writable, aliasing doesn't really matter). The fact that arrays are mutable objects that can be modified from multiple points is a major problem in JavaScript. I just discovered about 5 minutes ago that this is the source of a really confusing bug in my current project; I had another aliasing bug in JS this morning, and a couple more earlier this week. Accidentally sharing an object is just too easy. – Rob Napier Mar 29 '20 at 23:37
  • When I write `let x = 4; let y = x; x++`, I don't expect `y` to increment (and it doesn't). Why does this suddenly change when I write `let x = [4]; let y = x; x.push(1)`? We get used to it, but it's a major problem in the language, of which this question is a small subset. @jcalz – Rob Napier Mar 29 '20 at 23:40
  • @RobNapier doesn't `readonly number[]` do what you're looking for? – Max Heiber Mar 31 '20 at 20:58
  • @MaxHeiber I may not want the array to be readonly. I'm intentionally modifying `things` and I'm intentionally modifying `nums`. What is surprising is that they're the same thing. (I realize that after many years, and many mistakes, the surprise wears off. But if `let x = 4; let y = x; x++` modified `y`, I suspect you'd be surprised. Why should it be different just because it's an array of values?) – Rob Napier Mar 31 '20 at 22:19
  • @RobNapier I see what you're saying, I hadn't thought about it like that. What languages get this "right"? – Max Heiber Apr 01 '20 at 09:47
  • @MaxHeiber C++ stdlib does a good job IMO, though it gives the programmers so many options that I think it can be confusing. ObjC also does a good job. It's not automatic, but following well-established best practice will avoid problems. Rust has an excellent ownership model that is probably best in class for these problems, and Swift IMO is brilliant with its copy-on-write value types for most collections (though they make reasoning about performance very hard). In my experience Python and Perl are in the doghouse with JavaScript. I don't know PHP well enough to grade it. – Rob Napier Apr 01 '20 at 14:26
  • Go and C are…different…about this. They definitely have the aliasing problem and make it hard to create copies, but C collections are very manual things (which means there are all kinds of mistakes you can make; aliasing is a minor one…) Go collections are just different beasts. The fact that you work on slices rather than arrays makes it hard to compare to other languages (aliasing is the normal way you program, so it's much less surprising). Conventions can fix a lot of this (like in ObjC), and JS is also just really bad about not having good universal conventions. – Rob Napier Apr 01 '20 at 14:33
  • (I intentionally leave out "FP" languages like Haskell, Scala, and SML, since they rely primarily on immutable collections. They alias all the time, but aliasing is not an issue when everything is immutable. But immutability introduces its own trade-offs, and many languages handle mutability very well, particularly newer languages like Rust and Swift.) – Rob Napier Apr 01 '20 at 14:36