What are the limits of type checking and type systems?

Question

Type systems are often criticised, for being to restrictive, that is limiting programming languages and prohibiting programmers to write interesting programmes.

Chris Smith claims:

We get assurance that the program is correct (in the properties checked by this type checker), but in turn we must reject some interesting programs.

and

Furthermore, there is an ironclad mathematical proof that a type checker of any interest at all is always conservative. Building a type checker that doesn't reject any correct programs isn't just difficult; it's impossible.

Could someone please outline what kind interesting programmes this could be? Where is it proven that type checkers have to conservative?

And more general: What are the limits of type checking and type systems?

try putting "static vs dynamic languages" into Bing, there's a host of papers giving you lot of information. Be aware that the author may not be 100% objective or have full understanding of the other viewpoint — Rich Seller, Aug 09 '09 at 17:32
Type checking System F is undecidable: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.6.6483 — , Aug 11 '09 at 22:51
I realised that the claim is not about dynamic vs. static typing (which is a debate on its own), but rather about typed vs. untyped. Thus, I changed the question. — , Jul 11 '10 at 19:23
@user141335, type checking System F is perfectly decidable (and almost trivial). Type _inference_ is undecidable for it. — Andreas Rossberg, Apr 14 '13 at 06:24
Also see [what-are-the-deficiencies-of-the-java-c-type-system?](http://stackoverflow.com/questions/884391/what-are-the-deficiencies-of-the-java-c-type-system?lq=1) — nawfal, Jul 08 '14 at 10:14

score 8 · Answer 1 · answered Jul 17 '10 at 20:04

I think the first statement is technically wrong, though correct in practice.

Static typing is essentially the same as proving properties of programs, and by using a sufficiently powerful logic you could prove that all interesting programs are correct.

The problem is that for powerful logics type inference no longer works, and you have to provide the proofs as part of the program so that the type checker can do its job. A concrete example are higher order provers such as Coq. Coq uses an extremely powerful logic, but it is also extremely tedious to get anything done in Coq, because you have to provide proofs in great detail.

The kind of interesting programs that will give you the most trouble will be interpreters, as their behavior depends entirely on an input. Essentially you will need to reflectively prove the correctness of type checking for that input.

As for the second statement, he may be referring to Gödels incompleteness theorem. It states that for any given proof system there are true statements of arithmetic (the logic of addition and multiplication of natural numbers) that cannot be proved in the proof system. Translated to static type systems you would have a program that doesn't do anything bad but the static type system couldn't prove that.

These counterexamples are constructed by referring back to the definition of the proof system, saying essentially "I cannot be proved" translated into arithmetic, which is not very interesting. IMHO a program constructed analogously wouldn't be interesting either.

score 6 · Answer 2 · answered Aug 09 '09 at 17:45

6

You can express everything in both static and dynamic language. Proof == you can write any language compiler in any turing complete language. So whatever the language is, you can create the static language that does X.

What can be interesting in dynamic typing?... With good enough duck typing you could interact with objects over the network without ever knowing their types and pass their results (of a type unknown to you) as parameters to local functions which may actually do something useful.

Static answer to that problem would be to wrap everything in "exportable interface" providing .call() and .provides?() working on text name, but that would be definitely harder.

That's the most "limiting" case I know and it's really stretching things a bit (only really useful with mock objects?). As for theoretical limits, there are none - you just need some extra code to overcome the practical issues.

answered Aug 09 '09 at 17:45

viraptor

33,322
10
107
191

Their claims seem to be more existential, because they say that a type checker rejects certain programmes which dynamic typing allows. So the dynamically typed programmes are able to do something statically typed programmes will never be able to, no matter what tricks you use. – Aug 09 '09 at 18:58
6

@ott: that's not possible. If you give an example of such program A(x), I'll respond with compiler B for that language converted to assembler (which is not a dynamic language). Now B(A,x) is a new program in a static language, doing exactly the same as your program A, but takes A as a text input (data). – viraptor Aug 09 '09 at 19:19
your argument is mostly correct, save for the fact that assembler is neither statically nor dynamically types: the two notions are not opposite. Anyway, Turing completeness certainly guarantees that you can do the same in, say, Haskell. – Andrea Oct 12 '12 at 13:54

score 5 · Answer 3 · answered Oct 12 '12 at 14:04

I think there is a misunderstanding. It is true that any type system will reject a correct program (I do not recall the exact name of the result, so I cannot look it up right now, sorry). At the same it is true that any Turing complete language can do the same as any other one, so it is false that there are some programs in dynamically typed languages that you cannot reproduce, say, in Haskell.

The catch is that the fact that a type system will reject a program does not mean that it will reject all programs equivalent to it. So some programs will be rejected, but you can replace them with other, equivalent, programs. As an example take the following Scala program

def foo: Int =
  if (1 > 0) 15 else "never happens"

The type checker will reject it, as the expression if (1 > 0) 15 else "never happens" is formally of type Any. When you run it, it will certainly return an integer, but without evaluating 1 > 0 you cannot be sure that it will not return a string. You could write, in Python

def foo():
  if 1 > 0:
    return 15
  else:
    return "never happens"

and the Python compiler would not care.

Of course there are programs equivalent to this one that you can write in Scala, the simplest being

def foo: Int = 15

In Haskell you could write something like: `foo :: Thing` and `foo = if (1 > 0) Num 15 else Str "never happens"` where `data Thing = Num Int | Str String`. You could do the same in Scala, but I don't know the syntax; the important part here being that *this is what the Python program is doing*. Since you can create a universal type that is enumerated for each different inner type you care about, you can rewrite any of those Python programs in a statically typed language with that universal type. — porglezomp, Nov 29 '15 at 04:20

score 5 · Answer 4 · answered Oct 27 '09 at 12:48

An interesting example is This Paper, which is probably the only apples-to-apples comparison ever done on static vs dynamic typing. They implemented self (a language like smalltalk, but with prototypes instead of classes,) with both type inference (static), and type feedback (dynamic).

The most interesting result is that the type inference engine had a lot of trouble resolving between machine integers and arbitrary precision integers -- They auto-promoted in vanilla self, and hence could not be told apart by the type system, which meant that the compiler had to include code to promote to BigInt on every integer operation.

They ran up against a limit of their type system: It could not examine the actual value of integers.

I think that there are no theoretical limits to type systems in general, but any given type checker can only deal with a specific, limited type system, and there will be programs where it cannot determine what is going on. Since the self type inferencer allowed for sets of types, it simply compiled both paths. A type checker that required convergence on a single type would have to reject the program. (Though it would probably have special code for this case.)

One thing I would like to see in a programming language would be a guarantee that code would behave as though all intermediate subexpressions whose type was not forced would behave as though they were performed with the largest fixed-sized integer type in all cases where computations with such type would not cause overflows either with that type, or when coercing to smaller types. Other than certain expressions involving division (or right-shifts) by non-constant values, I would expect a compiler could generally figure out when promotion was required and when it was not. — supercat, May 29 '14 at 20:22
If `a`, `b`, and `c` are 32-bit integers, it shouldn't be hard to have a compiler perform a statement like `a=b+c;` using 32-bit math, but ensure that `a=(b+c)/2;` would yield arithmetically-correct results in the case were the result would bit in a 32-bit integer [perhaps using an `ADD` followed by `ROR`, or by using higher-precision math if it didn't recognize that trick]. I don't think the compiler complexity needed to make things "just work" would be much worse than what would be required to avoid an expensive library call if given `a=((Int64)b*c)>>32`. — supercat, May 29 '14 at 20:28

Brian · Answer 5 · 2009-08-09T18:01:06.923

It's hard to find objective pragmatic comparisons of static versus dynamic typing issues because it's so often such a religious war. The little summary blubs that you've quoted tend to be the same boilerplate disclaimer everone makes that seems to be 'acceptable' to everyone these days.

As someone who's experience is mostly in statically-typed languages, I tried to understand some of the trade-offs in a blog series a while back. Lots of caveats, but you might check out the second half of this blog entry for some comparison that is suggestive as the answer to your question.

Here's one quote that suggests a window into the trade-offs:

In a sense, this tiny function captures the essence of the ongoing debate between static and dynamic typing. The programmer can create domain-specific static types to structure the program, convey intent, and rule out a class of errors and behaviors at compile-time, at the price of having to author that structure and mediate structural mismatches at module boundaries. Or the programmer can choose to compute most everything with just scalars and lists, in which case data flows easily everywhere, resulting in short code, but with the loss of compile-time checks and conveyance of intent.

and the running example shows a case where the statically-typed program disallows a useful/interesting behavior by its nature.

Tamás Szelei · Answer 6 · 2009-08-09T19:09:32.183

1

I think an eval function could be handy sometimes, but never necessary (it is uncommon for a statically typed language, see the explanation on the link).

edited Aug 09 '09 at 19:09

answered Aug 09 '09 at 19:02

Tamás Szelei

23,169
18
105
180

1

I don't see what eval() has to do with type systems. You can have statically and dynamically typed languages which have an eval() function. – Aug 09 '09 at 19:03
@ott: can you give me an example of eval in a statically typed language? – Tamás Szelei Aug 09 '09 at 19:04
@ott: can you be more specific? I looked up it's documentation (http://www.haskell.org/ghc/docs/latest/html/users_guide/index.html), but I was unable to find an eval function. The closest thing was an interactive prompt. I'm really interested in it's inner workings if it does not compile the expression at runtime. – Tamás Szelei Aug 09 '09 at 19:28
@sztomi: I think it compiles the code at runtime, but I'm the wrong person to ask about this. Maybe you should ask on haskell-cafe. – Aug 09 '09 at 19:32

score 0 · Answer 7 · answered Aug 09 '09 at 17:31

People have found that if you write your tests first in a truly TDD manner, your tests typically do a better job at verifying correctness than a strict type checking system would. So for gauging correctness, a strong type system doesn't really measure up.

Strong typing can often buy you some speed because compilers can easily use native types instead of having to do type checks at runtime. Case in point: if you're doing a lot of integer math, you'll find a strongly typed implementation will probably outrun a weakly typed one, since your figures can typically just be immediately used by the CPU and won't have to be verified at runtime.

"Interesting" programs? Sure. It's easier to write extensions off of a dynamic language. Also, in distributed systems, it can be really convenient to have a weakly typed UI and not have to generate specific data transfer objects. For example, if you had a JavaScript front end and a C# backend, you could use LINQ on the C# end to generate anonymous classes and send them to your Javascript via JSON. From the JS layer, you could just use those anonymous types as though they were first-class objects and not have to go through the pain of coding up all of your data contracts explicitly.

ilya n. · Answer 8 · 2009-08-14T08:35:53.553

Here's a simple example (in Python, but completely unrelated to its typing issues):

# The company deals with rectangles. They have horizontal and vertical lengths
# that should not be mixed. Programmer A writes:

class Rectange:
    def __init__(self, width, height):
        enforce_type('horizontal', width)
        enforce_type('vertical', height)
        #
        # A: Hehe! I'm so smart! With my patented type system if I try to
        #    write "width * width" I'll have a loud error so I'll know I'm wrong.  
        #
        area = width * height
        enforce_type('square_meters', area)
        ...

# Everyone's happy. The company shows off A's type system on the conference.

...

# Much later, the clients request ability to specify square by entering only 
# one length. Programmer B writes:

class Square(Rectangle):
    def __init__(self, width):
         Rectange.__init__(self, width, width)
         # !!
         # Error happens! 'horizontal' is not 'vertical'!
         #
         # B: Dear Management, I would like to request a weeklong leave since I 
         #    have to track Programmer A's house and either talk him into changing
         #    his patented type system or BEAT HIM WITH MY LAPTOP until he agrees.
         #

It's very hard to create a type system that would foresee any type of possible exceptions from the rules, especially if you're creating base framework that will be used by people much later.

To answer your direct question, on whose side you're on: Programmer A or Programmer B?

I think that's a bad example, both `width` and `height` should have type `meters` as `area` has type `square_meters`. Then the static typing would prevent programmer C from calculating `10 foot * 3 meters`. — swegi, Jan 11 '10 at 08:53
To be more specific, programmer A misused the static type system as programmer D could misuse a dynamic typed or untyped language to write `99 + "bottles of beer" / 3 * "people"` — swegi, Jan 11 '10 at 08:58
This is exactly the tradeoff quoted in the question between being more **correct** or more **interesting**. Programmer A didn't really misuse the typing system since he was told by the management that it's only possible to multiply vertical and horizontal. — ilya n., Jan 11 '10 at 13:14

score 0 · Answer 9 · answered Jul 13 '10 at 12:10

I am not sure but I believe the issues you are referring to are with Algebraic Type Systems like Haskell and ML. These languages try to do very complete "static" analysis of the types before you even run the program.

Some interesting annoyances with very strict static analysis in algebraic type systems is that it is very difficult to have a container that contains a mixture of objects of varying types.

For example in most mainstream languages you can have a heterogeneous mixture of types in a list. An example in python would be:

["a",1,"b",2]

To do this in a strict type system you would have to make a wrapping unified type and then pattern match all the possible types.

But the real interesting thing that is missing from the languages is powerful introspection that you have in most modern language (eg. C#,Java,Python,Ruby,Lisp).

Consequently these languages allow you to do some powerful meta-programming and data binding that you cannot do with complete static analysis.

Although some statically typed languages -- say Scala -- would just unify this at `List[Any]` which is fine and *perfectly type-safe*. However, as pointed out, to do anything *interesting* with the data requires matching (e.g. on type) -- Any has a toString method, but that's only so interesting. However, Scala's type system is *really quite different* from Haskell or SML. — , Jan 26 '12 at 06:35

Adrian Panasiuk · Answer 10 · 2012-02-22T13:43:44.380

0

There is a lot of complex examples, but it is seems that most people miss out on the trivial example.

This is a correct python program which answers the question of what are the absolute values of 5 and -5.

def abs(x):
    def signum(f):
        if f > 0:
            return 1
        else:
            return "non-positive"
    if signum(x) == 1:
        return x
    else:
        return -x

print("abs(5) = %r and abs(-5) = %r" % (abs(5), abs(-5)))

Obviously abs and signum take int as parameter; abs always returns int, but signum can return int or string. Now, if we introduced a type checker (but not any type-checker; scala's type checker would just say "signum is int->Any" !) this program would be rejected... yet, it is correct and will never crash with the non-strong-type-conformance as the crash reason.

edited Feb 22 '12 at 13:43

answered Jul 14 '10 at 10:43

Adrian Panasiuk

7,249
5
33
54

1

I'm not quite sure I buy this. In a statically typed language with type inference (like SML), `signum` would be rejected because it returns either a `string` or an `int`. If you change it to return 1 or -1 it's type would be inferred as `int -> int`. The hint to the type inference engine is the line `f > 0` which implies `f` is and `int`. – snim2 Feb 04 '11 at 15:26
Yea, it was supposed to be a program that gets rejected by a type checker – Adrian Panasiuk Mar 14 '11 at 12:48
There is no reason that this program needs to be rejected by the type checker as it's a valid program in Python. If the type checker infers your function return type to be Either then when the equality operator (==) is bound to the either you will either get 1 == 1 OR 1 == "non-positive" giving you the Python behaviour. – Ben Seidel Jan 17 '18 at 06:14
And its probably a good idea that this is rejected because there is a shorter version to write the same logic that will not be rejected. I have yet to see a real example where a useful program is rejected that is not just badly written but correct – Tadeo Hepperle Feb 14 '23 at 14:34

score 0 · Answer 11 · answered Jul 14 '10 at 21:22

0

Spec# has a type system with non null types and it is statically safe. (http://research.microsoft.com/en-us/projects/specsharp/)

answered Jul 14 '10 at 21:22

thequark

726
1
7
17

How does this differ from Haskell or Coq in this aspect? – Jan 26 '12 at 06:33

score 0 · Answer 12 · answered Jul 16 '10 at 23:57

0

What are the limits of type checking and type systems?

I'm going to assume that "type systems" implies a statically typed programming language.

Static typing means that types are checked at compile time. The limitation here is that the compiler can only optimize based on the information that is available at compile time. This could be considered a limitation. In dynamically typed programming languages it would be possible for the interpreter to analyze and optimize the program at runtime. This means it can optimize based on usage patterns.

answered Jul 16 '10 at 23:57

StackedCrooked

34,653
44
154
278

Although there is profile-guided optimization, which partly takes care of the "optimization by usage pattern". The disadvantage is that the optimization process is not transparent to the user, and one cannot easily reoptimize it during runtime. – Lajnold Jul 17 '10 at 21:23
Not true. Compiled / interpreted is completely orthogonal to statically / dynamically typed. And compiled / interpreted is a property of implementation, not of the programming language. There is a paper from people who implemented interpreted version of C on the JVM that can take advantage of JIT. Matthias Grimmer, Manuel Rigger, Roland Schatz, Lukas Stadler, Hanspeter Mössenböck: Truffle C: Dynamic Execution of C on the Java Virtual Machine. In Proceedings of the Intl. Conf. on Principles and Practice of Programming in Java (PPPJ'14), 2014 – user7610 Dec 10 '15 at 16:56

score 0 · Answer 13 · answered Aug 20 '10 at 12:02

0

If you look at Mitchell's book Concepts in Programming Languages p.134, you'll find some details on what is called the "Conservativity of Compile-Time Checking". The problem is that some "interesting" features like out-of-bounds accesses for arrays cannot be checked statically, as they would require evaluation of the program/every possible program-run. The standard undecidability-result tells you'd have to solve the Halting-problem to indeed check EVERY array access.

answered Aug 20 '10 at 12:02

Volker Stolz

7,274
1
32
50

This is not necessarily correct. *Some* languages, like ADA, allow subtypes over ranges of arrays, for instance. – Jan 26 '12 at 06:31
@pst Yes, to workaround the general limitations, people have designed systems that offer tractable subsystems where you can make some guarantees (usually at the cost of having one arm tied behind your back). – Volker Stolz Jan 26 '12 at 09:18

What are the limits of type checking and type systems?

13 Answers13

Linked