19

I'm a big fan of PHP and it's obviously a very weakly-typed language. I realize some of the benefits include the general independence of changing variable types on the fly and such.

What I'm wondering about are the drawbacks. What can you get out of a strongly-typed language like C that you otherwise can't get from a weakly-typed one like PHP? Also, with type casting (like (double)$variable), one could argue that even a weakly-typed language can act just like a strongly-typed one.

So. Weak-type. What are some benefits I didn't include? More importantly, what are the drawbacks?

EGP
  • 3
    What do you mean by "strong" and "weak" typing? Static vs. dynamic? Explicit vs. implicit type names? Explicit vs. implicit casting? Safe vs. unsafe casting? – dan04 Jul 31 '10 at 00:33
  • 3
    @thebackhand: ANSI C is reasonably strongly typed because variables have a definite declared type, and most types cannot be implicitly converted to others. C does have *some* implicit conversions (e.g. among various floating point and integer types - but then again so do most languages), and it does allow you to *explicitly* override the type system in some cases, but that isn't enough to make it generically "weakly typed". Unfortunately, many C compilers default to being quite lax about allowing implicit conversions that they really shouldn't (e.g. integer to pointer). – caf Jul 31 '10 at 00:50
  • 2
    It's become very common to mistake manifest typing for strong typing. A strongly typed language does not coerce values to other types. A weakly typed language does. Python, for example, is largely strongly typed even though it isn't normally manifestly typed, because it seldom (other than boolean contexts) coerces types, even though it doesn't have type declarations. – user1277476 Aug 21 '12 at 19:08

7 Answers

12

The cited advantage of static typing is that whole classes of errors are caught at compile time and so cannot reach runtime. For example, if you have a statically-typed class or interface as a function parameter, then you are darn well not going to accidentally pass in an object of the wrong type (without an explicit and incorrect cast, that is).

Of course, this doesn't stop you passing in the wrong object of the right type, or an implementation of an interface where you've given it the right functions but they do the wrong things. Furthermore, if you have 100% code coverage, say the PHP/Python/etc folks, who cares whether you catch the error at compile time or at run time?
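
To make both halves of that concrete, here is a minimal sketch in Python with type hints. The function `discounted_price` and its parameters are made up for illustration. A static checker such as mypy would be expected to reject the call that passes a string, but no type system objects to two floats passed in the wrong order.

```python
def discounted_price(price: float, percent: float) -> float:
    """Return the price reduced by the given percentage."""
    return price * (1 - percent / 100)


# Correct call: 10% off 200.
ok = discounted_price(200.0, 10.0)

# Wrong object of the right type: the arguments are swapped, both are
# floats, so neither a type checker nor the runtime complains.
swapped = discounted_price(10.0, 200.0)

# Wrong type: without a static check, this only fails when the line runs,
# because a str cannot be multiplied by a float.
try:
    discounted_price("200", 10.0)
    wrong_type_error = None
except TypeError as exc:
    wrong_type_error = exc

print(ok, swapped, type(wrong_type_error).__name__)
```

The swapped call returns a negative price without any complaint, which is the kind of bug only a test (or a more specific type than `float`) would catch.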

Personally, I've had fun times in languages with static typing, and fun times in languages without. It's rarely the deciding issue, since I've never had to choose between two languages which are identical other than their kind of typing and there are normally more important things to worry about. I do find that when I'm using statically typed languages I deliberately "lean on the compiler", trying to write code in such a way that if it's wrong, it won't compile. For instance there are certain refactors which you can perform by making a change in one place, and then fixing all the compilation errors which result, repeat until clean compile. Doing the same thing by running a full test suite several times might not be very practical. But it's not unheard-of for IDEs to automate the same refactors in other languages, or for tests to complete quickly, so it's a question of what's convenient, not what's possible.

The folks who have a legitimate concern beyond convenience and coding style preference are the ones working on formal proofs of the correctness of code. My ignorant impression is that static type deduction can do most (but not all) of the work that explicit static typing does, and saves considerable wear and tear on the keyboard. So if static typing forces people to write code in a way that makes it easier to prove, then there could well be something to it from that POV. I say "if": I don't know, and it's not as if most people prove their statically-typed code anyway.

"changing variable types on the fly and such"

I think that's of dubious value. It's always so tempting to do something like (Python/Django):

user = request.GET['username']
# do something with the string variable, "user"
user = get_object_or_404(User, username=user)  # keyword lookup; assumes User has a "username" field
# do something with the User object variable, "user"

But really, should the same name be used for different things within a function? Maybe. Probably not. "Re-using", for example, integer variables for other things in statically typed languages isn't massively encouraged either. The desire to avoid thinking up concise, descriptive variable names probably shouldn't, 95% of the time, override the desire for unambiguous code...
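
A minimal sketch of the alternative, in plain Python rather than Django. The `User` class, the `USERS` dictionary and `find_user` are hypothetical stand-ins for the model and the database lookup; the point is only that each name keeps one meaning.

```python
from dataclasses import dataclass


@dataclass
class User:
    username: str


# Hypothetical stand-in for a database table.
USERS = {"alice": User("alice")}


def find_user(username: str) -> User:
    """Look up a User by name; raises KeyError if there is none."""
    return USERS[username]


def handle(params: dict) -> tuple:
    username = params["username"]  # always a str
    user = find_user(username)     # always a User
    return username, user


name, found = handle({"username": "alice"})
print(name, found)
```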

Btw, usually weak typing means that implicit type conversions occur, and strong typing means they don't. By this definition, C is weakly typed as far as the arithmetic types are concerned, so I assume that's not what you mean. I think it's widely considered that full strong typing is more of a nuisance than a help, and "full weak typing" (anything can be converted to anything else) is nonsensical in most languages. So the question there is about how many and what implicit conversions can be tolerated before your code becomes too difficult to figure out. See also, in C++, the ongoing difficulty in deciding whether to implement conversion operators and non-explicit one-arg constructors.
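
As a rough illustration of where one dynamically typed language draws that line, here is a small Python sketch (my own example, not a complete list of Python's rules). It shows a few conversions Python performs implicitly and one it requires you to spell out.

```python
# int is implicitly widened to float in mixed arithmetic.
total = 1 + 2.5

# bool is a subclass of int, so True takes part in arithmetic as 1.
count = True + 1

# Numeric comparison works across int and float.
same_number = (1 == 1.0)

# Any object can be used in a boolean context; an empty list is falsy.
empty_is_falsy = not []

# A str is never converted to a number implicitly; the conversion
# has to be written explicitly.
explicit_sum = int("3") + 4

print(total, count, same_number, empty_is_falsy, explicit_sum)
```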

Steve Jessop
  • "Weak" and "Strong" is clearly a continuum though, not an either/or proposition - after all, in C you can assign an `int` to `float`, but you can't assign `struct a` to `struct b` even if those types are defined identically. – caf Jul 31 '10 at 01:12
  • @caf: true, and just what I was writing about as you were commenting :-) – Steve Jessop Jul 31 '10 at 01:19
9

I have been using both strongly typed (like Java) and weakly typed (like JavaScript) languages for some time now. What I have found is that the convenience of weakly typed languages is great for small applications. Unfortunately, as the application grows in size, it becomes impossible to manage. There is too much to keep track of in your head, and you have to start depending on your IDE and the compiler more and more or your coding grinds to a halt. That is when strongly typed languages start to become more useful: when the application grows very large.

Two things that constantly drive me nuts in weakly typed JavaScript are using external libraries that are not thoroughly documented, and refactoring.

External libraries: When dealing with a strongly typed language, the code from the library itself provides self-documentation. When I create a variable of type Person, the IDE can inspect the code and tell there is a getFirstName(), getLastName() and getFullName(). In weakly typed languages this is not the case, as a variable could be anything, have any kind of property or function, and have function arguments that could also be anything (they are not explicitly defined). As a result, a developer has to lean heavily on documentation, web searches, discussion forums and their memory of past usages. I find it can take hours of looking things up in JavaScript for external libraries, while with Java I just hit the "." key and it pops up all my options with documentation attached. When you encounter libraries that are not fully documented, it can be really frustrating with weakly typed languages. I recently found myself asking "What is argument 'plot' in function 'draw'?" when using jqplot, a fairly well but not completely documented JavaScript library. I had to spend an hour or two digging through source code before finally giving up and finding an alternative solution.
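
A small sketch of the same idea in Python, where type hints are optional. The `Person` class and `greet` function are hypothetical; `typing.get_type_hints` is used here only to show the kind of information a tool can read from declared types without any separate documentation.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Person:
    first_name: str
    last_name: str

    def get_full_name(self) -> str:
        return f"{self.first_name} {self.last_name}"


def greet(person: Person) -> str:
    """The signature alone says what 'person' is and what comes back."""
    return f"Hello, {person.get_full_name()}!"


# What an IDE or other tool can discover from the declaration.
hints = get_type_hints(greet)
greeting = greet(Person("Ada", "Lovelace"))

print(hints)
print(greeting)
```

Without the annotation on `person`, the question "what is this argument?" can only be answered by reading the function body or the documentation.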

Refactoring: With strongly typed languages, I find myself able to refactor quickly by just changing the file I need to change and then going to fix the compiler errors. Some tools will even refactor for you with a simple click of a button. With weakly typed languages, you have to do a search and then replace with care and then test, test, TEST and then test some more. You are seldom entirely sure you have found and fixed everything you broke, especially in large applications.
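
Here is a minimal Python sketch of how a missed rename hides until the right branch runs. The `Report` class and `export` function are invented for the example; assume `render` was renamed to `render_text` and one call site was overlooked.

```python
class Report:
    # Renamed from "render" during a refactor.
    def render_text(self) -> str:
        return "report body"


def export(report: Report, as_html: bool = False) -> str:
    if as_html:
        # Call site missed by the search-and-replace: still uses the old name.
        return "<p>" + report.render() + "</p>"
    return report.render_text()


# The common path works, so a quick manual check looks fine.
plain = export(Report())

# The stale call only blows up when the rarely used branch executes.
try:
    export(Report(), as_html=True)
    stale_call_error = None
except AttributeError as exc:
    stale_call_error = exc

print(plain, type(stale_call_error).__name__)
```

A compiler for a statically typed language would refuse to build this, whereas here only a test that exercises `as_html=True` finds it.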

For simple needs and small applications, these two concerns are minimal to non-existent. But if you are working with an application with hundreds of thousands or millions of lines of code, weakly typed languages will drive you nuts.

I think many developers get upset about this and turn it into an emotional discussion because we sometimes get it in our heads that there is one right and one wrong approach. But each approach has its own advantages and disadvantages. Once you recognize that, you can set the emotion aside and choose what is best for what you need right now.

Aaron Bono
2

Many a book has been written about this sort of thing. There's an inherent tradeoff; with a weakly-typed language a lot of annoyances simply cease to be. For instance, in Python you never have to worry about dividing a float by an int; adding an int to a list; typing functions' arguments (did you know, OCaml has special +. operators for adding floats because (+) sends ints to ints!); forgetting that a variable can be null... those sorts of problems simply vanish.

In their place come a whole host of new runtime bugs: Python's [0]*5 gives, wait for it, [0,0,0,0,0]! OCaml, for all the annoyance of strong typing, catches many many bugs with its compiler; and this is precisely why it's good. It's a tradeoff.
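
For anyone who hasn't run into it, this is what Python's `*` does on sequences, next to the per-element version you have to write yourself. A small sketch, nothing beyond the standard language:

```python
# "*" on a list repeats the list; it does not multiply the elements.
zeros = [0] * 5
repeated = [0, 1, 2] * 2

# Per-element multiplication has to be spelled out.
scaled = [x * 2 for x in [0, 1, 2]]

# Strings behave the same way as lists.
text = "xyz" * 2

# Repetition is consistent with "+" meaning concatenation.
a = [1, 2]
consistent = (2 * a == a + a)

print(zeros, repeated, scaled, text, consistent)
```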

Katriel
  • 1
    "Runtime bugs"? Sequence repetition instead of per-element mapping is quite reasonable, consistent, and expected. After all, what would you expect from `"xyz" * 5`? – Jon Purdy Jul 31 '10 at 01:04
  • 5
    @Jon: depends whether arithmetic on chars is valid in your language (in C it is, unfortunately). Per-element mapping is *also* quite reasonable and consistent where the operators make sense: in mathematics we're perfectly happy that [0,1,2] *2 is [0,2,4], not [0,1,2,0,1,2], aren't we? It would be a nightmare if Mathematica defined vector-scalar multiplication like Python list-scalar multiplication. I don't think either side has much right to be astonished by the conventions of the other, you pretty much just have to learn what `*` means in your language. – Steve Jessop Jul 31 '10 at 01:12
  • >>> 1 + []
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for +: 'int' and 'list'
    – nate c Jul 31 '10 at 01:15
  • 1
    @Steve: Perfectly reasonable. The important thing is that it's usually not fair to speak in absolutes: languages geared toward mathematics and those for general computation have different target audiences with different assumptions, and consequently different rules. Language designers are responsible for making decisions that are appropriate to the target users, and the less misguided of them make their rules self-consistent, at least. – Jon Purdy Jul 31 '10 at 01:26
  • @jon: well, this is something that occasionally annoys me about python -- I don't think strings should be sequences. They should be sliceable, but being full sequences has caused me a fair few headaches (special-case code for container flattening being the worst). Unfortunately Guido disagrees with me =p. So I think `"abc" * 5` should repeat, but `[ "a", "b", "c" ] * 5` should blow up :p. – Katriel Jul 31 '10 at 01:30
  • But this is sort of the point, no? The fact that we're arguing about what implicit operator overloading should occur just goes to show that you'll never please everyone when designing a language, and static-er typing avoids it! – Katriel Jul 31 '10 at 01:31
  • @katrielalex: Actually, you've sort of contradicted yourself. Sure, you'll never please everyone when designing a language, and a more static typing system may please you, but it may also displease others. The point is that static and dynamic typing have significant bearing on the idioms of the language, but none whatsoever on its overall quality. We can argue all day about this or that design decision of a particular language. – Jon Purdy Jul 31 '10 at 01:43
  • @katrielalex: isn't that another orthogonal issue? What operator overloads are defined, and what they do, is independent of whether the compiler knows which one it's going to be? I have no problem with strings being sequences (they are in C++ too, for what it's worth), and it's just unfortunate that there are two different operations which could reasonably be described as "multiplying a sequence by an integer". Btw, if you want `"abc" * 5` to repeat, and you want per-element multiplication of lists, shouldn't `["a","b","c"]*5` be `["aaaaa","bbbbb","ccccc"]`? ;-) – Steve Jessop Jul 31 '10 at 01:44
  • Python's `*` operator on sequences is consistent with its `+` operator: `2 * a == a + a` whether a is a number or sequence. I personally would prefer having different operators for addition and concatenation, but it's too late to change it now. – dan04 Jul 31 '10 at 01:54
  • And strange operator overloading is unrelated to static typing. Look at C++, where `<<` means both "bitwise left shift" and "print to a file"! – dan04 Jul 31 '10 at 01:57
1

Weak and Strong are loaded terms. (Do you want to be a weak language programmer?) Dynamic and Static are better, but I would imagine most people would rather be a dynamic programmer than a static one. I would call PHP a promiscuous language (That's not a loaded term ;) )

PHP:

garbage_out = garbage_in * 3; // garbage_in was not initialized yet
"hello world" + 2; // does this make sense?

Allowing uninitialized variables creates very hard-to-find errors from misspellings. Allowing operations on unrelated types is also almost always an error that should be reported. Most interpreted dynamic languages do not allow these things, for good reason. You can have a dynamically typed language without allowing garbage.
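
For comparison, a minimal sketch of the same two lines in Python, which is dynamically typed but rejects both. The misspelled name `garbage_inn` is deliberate.

```python
garbage_in = 7

# A misspelled (and therefore never assigned) name is an error,
# not a silent empty value.
try:
    garbage_out = garbage_inn * 3
    name_error = None
except NameError as exc:
    name_error = exc

# Adding a number to a string is reported instead of coerced.
try:
    "hello world" + 2
    type_error = None
except TypeError as exc:
    type_error = exc

print(type(name_error).__name__, type(type_error).__name__)
```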

nate c
  • "does this make sense" - yes it flipping does, and whenever Python has thrown an exception because I wrote `"hello world"+2`, I *always always always* meant `"hello world" + str(2)`. My theory is that it's a cunning trick to get people to use `%` (format strings) instead, since that can give you a free conversion to string. `"hello world%s" % 2` is fine. If it was such a trick, it basically worked on me ;-) – Steve Jessop Jul 31 '10 at 01:53
  • The first line is not valid PHP code. But even assuming it is, PHP will throw a warning about an undefined variable in your face. This does not demonstrate anything about weak or promiscuous or any typing. The second example is more legit, but then the use case is `$a = '2'; $b = $a + 1; // $b == 3`. It's not entirely nonsensical to allow `+` with strings. – deceze Jul 31 '10 at 02:14
1

See this table, showing the results of the PHP == operator applied to pairs of empty or other principal values of various types, such as 0, "0", NULL and "". The operator is non-transitive, and the table is quite unintuitive, to say the least. (Not that it would make much sense to ask if a string is equal to an array, in fairness -- but you can do that in a weakly typed language, which is the pitfall, and may Turing help you if the language tries to "help" you.)
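
For contrast, a small sketch of the same kind of comparisons in Python (my own illustration, not taken from the table). Python's `==` does not convert strings, None or lists before comparing, so all of these cross-type pairs are simply unequal; only numeric types compare across each other.

```python
# Pairs of "empty-ish" values of different types.
pairs = [(0, "0"), (0, ""), (None, ""), (None, 0), ("", [])]

# Each entry is (left, right, result of left == right).
comparisons = [(a, b, a == b) for a, b in pairs]

# Numeric types are the exception: these compare by value.
numeric_equal = (1 == 1.0) and (0 == False)

for left, right, equal in comparisons:
    print(repr(left), "==", repr(right), "->", equal)
print("numeric:", numeric_equal)
```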

aib
  • 1
    While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. – Rostyslav Dzinko Aug 21 '12 at 09:55
  • @RostyslavDzinko: You are absolutely correct. Thanks for the suggestion. – aib Aug 21 '12 at 18:49
0

You have to understand that PHP was created for the context of web applications. Everything in the context of the web is a string. Therefore it is very rare for strong typing to be beneficial.
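
To show what "everything is a string" looks like in practice, here is a minimal Python sketch using the standard library's `urllib.parse.parse_qs`. The query string and the pagination arithmetic are made up for the example. In a language that does not coerce, the conversion to numbers has to be written out.

```python
from urllib.parse import parse_qs

# Every value arriving in a query string is text.
params = parse_qs("page=2&limit=10")

# Explicit conversion before doing arithmetic.
page = int(params["page"][0])
limit = int(params["limit"][0])
offset = (page - 1) * limit

print(params, offset)
```

PHP would let you do the arithmetic on the raw strings directly, which is the convenience this answer is pointing at.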

Pete
0

Straight from wikipedia:

The advantage claimed of weak typing is that it requires less effort on the part of the programmer than strong typing, because the compiler or interpreter implicitly performs certain kinds of conversions. However, one claimed disadvantage is that weakly typed programming systems catch fewer errors at compile time and some of these might still remain after testing has been completed.

That's about the same thing I would say. However, beware of the relative ambiguity of these terms ("strong typing" and "weak typing"), since implicit conversions blur the line.

Source: http://en.wikipedia.org/wiki/Weak_typing

Vasiliy Sharapov
  • 6
    I completely DISAGREE with `The advantage claimed of weak typing is that it requires less effort on the part of the programmer than strong typing, because the compiler or interpreter implicitly performs certain kinds of conversions`. Instead, it adds much more effort for developers to debug the STUPID typing issues..... – Peter Lee Nov 29 '12 at 22:01
  • @PeterLee: Fair point, perhaps the optimistic assumption is that the programmer is mature and aware of common caveats. Your point rings very true in "Intro to CS" settings but gets less relevant as the developer gets better (if they get better). – Vasiliy Sharapov Dec 08 '12 at 20:08
  • 1
    I found that when I write more and more code in weakly typed languages, the programmer himself/herself becomes the compiler, while from my point of view, developers/programmers should be focusing on the business logic, not these typing caveats, which should be caught by the tools (the real compiler on your computer). The other part is, thanks to all the frameworks we have, such as JavaScript jQuery....... – Peter Lee Dec 09 '12 at 21:19