9

When writing interpreted languages, is it faster to have weak typing or strong typing?

I was wondering this, because often the faster dynamically typed interpreted languages out there (Lua, Javascript), and in fact most interpreted languages use weak typing.

But on the other hand strong typing gives guarantees weak typing does not, so, are optimization techniques possible with one that aren't possible with the other?


With strongly typed I mean no implicit conversions between types. For example this would be illegal in a strongly typed, but (possibly) legal in a weakly typed language: "5" * 2 == 10. Especially Javascript is notorious for these type conversions.

orlp
  • 112,504
  • 36
  • 218
  • 315
  • 1
    +1 An interesting question, though I fear it's hard to answer in general. I'm looking forward to gory details about interpreter optimizations. –  Mar 10 '12 at 13:13
  • Just to be clear, what do you mean by weak and strong typing? (I try to avoid using these terms for the [reasons given here](http://stackoverflow.com/questions/2690544/what-is-the-difference-between-a-strongly-typed-language-and-a-statically-typed) -- people use them in many incompatible ways.) – DSM Mar 10 '12 at 13:18
  • Strongly typed as in no implicit conversions between types. For example this would be illegal in a strongly typed, but (possibly) legal in a weakly typed language: `"5" * 2 == 10`. Especially Javascript is notorious for these type-conversions. EDIT: added it to the question. – orlp Mar 10 '12 at 13:34
  • Most modern JavaScript engines no longer interpret but compile directly into bytecode these days. So it's difficult to apply this question to that language. – Jivings Mar 10 '12 at 13:44
  • Jivings: why would the machine it's executing on (directly on the machine or a VM) matter for optimization techniques? – orlp Mar 10 '12 at 13:48
  • I assume the JavaScript would be compiled into the appropriate type so it's no longer weakly typed. – Jivings Mar 10 '12 at 13:56
  • are you sure that is what you mean by "strongly typed"? it's quite possible for implicit type conversions to happen in languages that most people would consider strongly typed - for example in java you can implicitly convert anything to a string, while in c you can convert between floats and ints. more usually when people are asking about performance they are concerned with whether the type of an object is known at compile time, which can allow the compiler to optimize (and this advantage is reduced by just-in-time compilation, which uses the runtime type) – andrew cooke Mar 10 '12 at 14:11
  • @andrewcooke: I'm sure this is what I mean with strongly typed, neither Java or C are strongly typed. Yes, they are _statically_ typed, but not strongly. (For example in C you can convert between pointers to different types implicitly). But really, I mean strongly typed (no implicit conversions) and not statically typed (the declaration of type in the code). – orlp Mar 10 '12 at 14:13
  • ah, ok. sorry for the misunderstanding. – andrew cooke Mar 10 '12 at 14:14
  • @Jivings JS is still dynamically typed, so you can't really settle for any type during compilation (especially when compiling to bytecode - you sure you aren't talking about JIT compilers?). The best you can get is speculating that something will usually be of a certain type, and generate code based on that assumption. But you still need guards (conditionals checking that the assumption is indeed true) and a more general/less optimized version to fall back to (when it was incorrect). But none of this matters to this question, as it's due to dynamic typing, not due to weak/strong typing. –  Mar 10 '12 at 14:17
  • so which interpreted languages have strong typing? – andrew cooke Mar 10 '12 at 14:32
  • 1
    This question doesn't really make sense, in two ways. Firstly: the overhead of implicit conversions is incurred when those implicit conversions are performed. A program that doesn't perform any such implicit conversions will be the same whether or not the language supports them. So your question is like asking whether programming languages that have a built-in arctangent function are slower than languages that do not. Secondly: the implementation of implicit conversions is quite different in dynamically-typed languages from in statically-typed languages; so even if your question had a coherent – ruakh Mar 10 '12 at 14:36
  • answer for the one case, there would be no sense trying to apply that answer to the other case. (And, for that matter, it's also sometimes quite different between different languages even within those groups.) – ruakh Mar 10 '12 at 14:37
  • @andrewcooke: Which languages at *all* have strong typing by this definition? Offhand, the only ones I can think of are functional languages like Standard ML. – ruakh Mar 10 '12 at 14:39
  • For example Python comes pretty close to pure strong typing, with very few operators defined to work with mixed types. But Javascript on the other hand has an enormous amount of conversions defined between strings, numbers and arrays, introducing a lot of WTF moments. So I wondered that perhaps there is a speed benefit of allowing that. – orlp Mar 10 '12 at 14:46

2 Answers2

3

it seems to me that the question is going to be hard to answer with explicit examples because of the lack of "strongly typed interpreted languages" (using the definitions i understand from the question comments).

i cannot think of any language that is interpreted and does not have implicit conversions. and i think this for two reasons:

  1. interpreted languages tend not be statically typed. i think this is because if you are going to implement a statically typed language then, historically, compilation is relatively easy and gives you a significant performance advantage.

  2. if a language is not statically typed then it is forced into having implicit conversions. the alternative would make life too hard for the programmer (they would have to keep track of types, invisible in the source, to avoid runtime errors).

so, in practice, all interpreted languages are weakly typed. but the question of a performance increase or decrease implies a comparison with some that are not. at least, it does if we want to get into a discussion of different, existing implementation strategies.

now you might reply "well, imagine one". ok. so then you are asking for the performance difference between code that detects the need for a conversion at runtime with code where the programmer has explicitly added the conversion. in that case you are comparing the difference between dynamically detecting the need for a conversion and calling an explicit function specified by the programmer.

on the face of it, detection is always going to add some overhead (in a [late-]compiled language that can be ameliorated by a jit, but you are asking about interpreters). but if you want fail-fast behaviour (type errors) then even the explicit conversion has to check types. so in practice i imagine the difference is relatively small.

and this feeds back to the original point - since the performance cost of weak typing is low (given all the other constraints/assumptions in the question), and the usability costs of the alternative are high, most (all?) interpreted languages support implicit conversion.

[sorry if i am still not understanding. i am worried i am missing something because the question - and this answer - does not seem interesting...]

[edit: maybe a better way of asking the same(?) thing would be something like "what are the comparative advantages/disadvantages of the various ways that dynamic (late binding?) languages handle type conversion?" because i think there you could argue that python's approach is particularly powerful (expressive), while having similar costs to other interpreted languages (and the question avoids having to argue whether python or any other language is "weakly typed" or not).]

andrew cooke
  • 45,717
  • 10
  • 93
  • 143
  • Python, as you surely know, has very few if any implicit conversions. Numeric types are *promoted* (in one-way fasion, and always from one type to a more general one), and one might argue `if`/`and`/`or`/`not` implicitly convert to booleans (though one could also say all objects have a truth value and these simply get that value and process it - the fact that `and`/`or` return one of the operands supports this view). Other than that, I'm now aware of any implicit conversions. And I doubt Python is alone in that regard. So I dare challenge your reason #2. –  Mar 10 '12 at 17:31
  • by implicit conversion i mean implicit invocation of the methods like `__float__` - see http://docs.python.org/reference/datamodel.html#specialnames (so `__nonzero__` is what you are referring to with conversion to boolean). – andrew cooke Mar 10 '12 at 17:47
  • a really common example in python is conversion to iterator. that is all over the place in idiomatic python code, and is implemented in the same way (`__iter__` iirc). – andrew cooke Mar 10 '12 at 17:55
  • `__float__` is not called implicitly, it's only called from `float` AFAIK (perhaps during numeric promotion too, but we've already got that one). `__iter__` is indeed called implicitly, but only in the context of `for` loops and {list,dict,set} comprehensions so it's extremely restricted and *always* called. For that reason, I don't really consider it an implicit conversion. And most of the "special methods" don't convert at all - they serve purposes like attribute access and operator overloading. –  Mar 10 '12 at 18:06
  • `__str__` is another common one. and iirc `__float__` is called by the default implementation of `__add__`. more generally, i am not sure what your point is - yes, these blur into message dispatch and so you can argue that it's not implicit type conversion but method invocation. and that's cool - it's exactly what i meant when i said python had a good solution and that arguing about exactly what is and isn't "weak typing" is largely pointless. so pointless that i will not do any more. – andrew cooke Mar 10 '12 at 18:15
  • My point is not that strongly/weakly typed is a useful distinction or that we should discuss what exactly counts as weak typing. I'm merely doubtful of your second reason (that dynamic typing makes implicit conversions necessary). On your other examples: `__str__` is only called by `str` (it may be indirect, but it's definitely explicit). Arithmetic operators, as we both already stated, indeed include some implicit promotions/conversions. –  Mar 10 '12 at 18:32
1

With strongly typed I mean no implicit conversions between types.

"5" * 2 == 10

The problem is that "weak typing" is not a well-defined term, since there are two very different ways such "implicit conversions" can happen, which have pretty much the opposite effect on performance:

  • The "scripting language way": values have a runtime type and the language implicitly applies semantic rules to convert between types (such as formatting a binary number as a decimal string) when an operation calls for the different type. This will tend to decrease performance since it A) requires there to be type information at runtime and b) requires that this information be checked. Both of these requirements introduce overhead.
  • The "C way": at runtime, it's all just bytes. If you can convince the compiler to apply an operation that takes a 4 byte integer on a string, then depending on how exactly you do it, either the first 4 bytes of that string will simply be treated as if they were a (probably very large) integer, or you get a buffer overrun. Or demons flying out of your nose. This method requires no overhead and leads to very fast performance (and very spectacular crashes).
Community
  • 1
  • 1
Michael Borgwardt
  • 342,105
  • 78
  • 482
  • 720
  • There are actually three ways, since you left out the C way: at compile-time, expressions have types, and implicit conversions can be inserted to convert an expression from one type to another as needed. For example, if `i` is an `int`, then `5.0 * i` will implicitly convert `i` from `int` to `double`. The decision to make this conversion is performed at compile-time. (What you describe as "the 'C way'" is not really the C way of performing implicit conversions; rather, it's the result of *bypassing* C's implicit conversions.) – ruakh Mar 10 '12 at 16:24