
I've always considered C++ to be one of the most strongly typed languages out there.
So I was quite shocked to see Table 3 of this paper state that C++ is weakly typed.

Apparently,

C and C++ are considered weakly typed since, due to type-casting, one can interpret a field of a structure that was an integer as a pointer.

Is the existence of type casting all that matters? Does the explicit-ness of such casts not matter?

More generally, is it really generally accepted that C++ is weakly typed? Why?

chb
user541686
  • 6
    C++ isn't weakly typed, but you can subvert the type system if you want to. So one could argue that it isn't fully strongly typed. – juanchopanza Nov 05 '14 at 09:28
  • .. and if it allows you (type punning is tricky) – Karoly Horvath Nov 05 '14 at 09:29
  • So how is C# considered strongly typed? Can't you do exactly the same thing with `unsafe` or by using the `Marshal` class? – user541686 Nov 05 '14 at 09:31
  • 5
    @Mehrdad I think "weakly typed" is a quite subjective term. "Strictly typed" and "statically typed" vs. "loosely typed" and "dynamically typed" are more objective, more precise words. From what I can tell, generally people use "weakly typed" as a diminutive-pejorative term which means "I don't like the notion of types in this language". It's sort of an argumentum ad hominem (or rather, argumentum ad *linguam*) for those who can't bring up professional-technical arguments against a particular language. – The Paramagnetic Croissant Nov 05 '14 at 09:32
  • @TheParamagneticCroissant: I see. So what would "strictly typed" mean? – user541686 Nov 05 '14 at 09:34
  • Also the quote would suggest that it may be *dynamically typed*, rather than *weakly typed*. – juanchopanza Nov 05 '14 at 09:34
  • I believe you can do `int x = false;` because C++ allows conversion between `bool` and `int` (note that the compiler may WARN about it, but if it's not an error, then the language is weakly typed at least when it comes to `bool` and `int` combinations) – Mats Petersson Nov 05 '14 at 09:34
  • 1
    @Mehrdad It also has slightly different interpretations; the generally accepted meaning is "the compiler generates errors if types don't match up". Another interpretation is that "there are no or few implicit conversions". Based on this, C++ can actually be considered a strictly typed language, and most often it *is* considered as such. – The Paramagnetic Croissant Nov 05 '14 at 09:36
  • @Mehrdad Also, there are some programmers, especially beginners unfamiliar with a lot of languages, who don't intend to or can't make the distinction between "strict" and "static", "loose" and "dynamic", and conflate the two - otherwise orthogonal - concepts based on their limited experience (i. e. the correlation of dynamism and loose typing in popular scripting languages, for example). In reality, parts of C++ (virtual calls) impose the requirement that the type system be partially dynamic, but other things in the standard require that it be strict. Again, this is not a problem. – The Paramagnetic Croissant Nov 05 '14 at 09:41
  • 4
    Bjarne Stroustrup's book mentions that C++ is a strongly typed language on page 2. Who would know better than him :-) – Damon Nov 05 '14 at 11:04
  • @MatsPetersson : I think it's worth pointing out that warnings can also be promoted to errors (`-Werror` with g++ for example). – PaulR Nov 29 '18 at 21:02
  • That paper contradicts itself, and even the cited quote contains two false statements. First, "weakly typed" is a term they invented by inverting "strictly typed". Second, "one can interpret a field of a structure that was an integer as a pointer" seems to be an oddly narrow case that almost never happens or is used (and is generally undefined behaviour), and it can happen only through an explicit cast that must dodge all the restrictions the language rules use to prohibit exactly that – Swift - Friday Pie Nov 26 '22 at 12:31
  • They literally say that a gun is dangerous for its owner because owner can commit suicide with its help. – Swift - Friday Pie Nov 26 '22 at 12:33

6 Answers

39

That paper first claims:

In contrast, a language is weakly-typed if type-confusion can occur silently (undetected), and eventually cause errors that are difficult to localize.

And then claims:

Also, C and C++ are considered weakly typed since, due to type-casting, one can interpret a field of a structure that was an integer as a pointer.

This seems like a contradiction to me. In C and C++, the type-confusion that can occur as a result of casts will not occur silently -- there's a cast! This does not demonstrate that either of those languages is weakly-typed, at least not by the definition in that paper.
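
To make that concrete, here is a minimal C++ sketch. The `Record` type and the helper names are made up for illustration, and it assumes the platform provides `std::uintptr_t`. The integer-to-pointer reinterpretation the paper describes only compiles when the cast is spelled out:

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical structure with an integer field, as in the paper's quote.
struct Record {
    std::uintptr_t field;
};

// There is no implicit conversion from the integer type to a pointer type.
static_assert(!std::is_convertible<std::uintptr_t, int*>::value,
              "integer -> pointer must be requested explicitly");

// Reinterpreting the field as a pointer requires a visible reinterpret_cast.
inline int* field_as_pointer(const Record& r) {
    return reinterpret_cast<int*>(r.field);
}

// Stores a real address in the field and reads it back through the cast.
inline bool round_trip_preserves_address() {
    int target = 7;
    Record r{reinterpret_cast<std::uintptr_t>(&target)};
    return field_as_pointer(r) == &target && *field_as_pointer(r) == 7;
}
```

Writing `int* p = r.field;` instead would be rejected by the compiler, so whatever confusion follows is at least not silent.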

That said, by the definition in the paper, C and C++ may still be considered weakly-typed. There are, as noted in the comments on the question already, cases where the language supports implicit type conversions. Many types can be implicitly converted to bool, a literal zero of type int can be silently converted to any pointer type, there are conversions between integers of varying sizes, etc, so this seems like a good reason to consider C and C++ weakly-typed for the purposes of the paper.
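
A short sketch of the implicit conversions listed above, written as C++ (the helper names are hypothetical). None of the function bodies contains a cast, yet each one changes the type of its operand:

```cpp
#include <cstdint>
#include <type_traits>

// Pointer -> bool: a null pointer becomes false, anything else true.
inline bool has_target(const int* p) { return p; }

// Literal zero of type int -> null pointer of any pointer type.
inline const char* no_string() { return 0; }

// int -> 8-bit integer: compiles silently and may change the value.
inline std::int8_t narrow(int wide) { return wide; }

// The type traits agree that these conversions are implicit.
static_assert(std::is_convertible<const int*, bool>::value,
              "pointer -> bool is implicit");
static_assert(std::is_convertible<int, std::int8_t>::value,
              "int -> int8_t is implicit");
```

Compilers can warn about some of these (for example with `-Wconversion` on g++), but by default the code is accepted.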

For C (but not C++), there are also more dangerous implicit conversions that are worth mentioning:

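int main(void) {
  int i = 0;
  void *v = &i;  /* any object pointer converts to void * implicitly */
  char *c = v;   /* C also converts void * back implicitly; C++ rejects this line */
  return *c;
}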

For the purposes of the paper, that must definitely be considered weakly-typed. The reinterpretation of bits happens silently, and can be made far worse by modifying it to use completely unrelated types, which has silent undefined behaviour that typically has the same effect as reinterpreting bits, but blows up in mysterious yet sometimes amusing ways when optimisations are enabled.
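
For comparison, a sketch of how the same code has to be written in C++ (the function name `first_byte` is made up here). The conversion to `void *` is still implicit, but the way back needs a cast:

```cpp
#include <type_traits>

// C++ rejects the implicit void* -> char* step that C accepts ...
static_assert(!std::is_convertible<void*, char*>::value,
              "void* -> char* needs a cast in C++");
// ... but still allows the safe direction, T* -> void*.
static_assert(std::is_convertible<int*, void*>::value,
              "int* -> void* is implicit");

// The C example above, with the second conversion made visible.
inline char first_byte(int i) {
    void* v = &i;                     // implicit, as in C
    char* c = static_cast<char*>(v);  // explicit: omitting the cast is a compile error
    return *c;                        // inspecting an object through char* is permitted
}
```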

In general, though, I think there isn't a fixed definition of "strongly-typed" and "weakly-typed". There are various grades: a language that is strongly-typed compared to assembly may be weakly-typed compared to Pascal. To determine whether C or C++ is weakly-typed, you first have to ask what you want weakly-typed to mean.

  • 1
    +1 great point. But to answer the question you should also mention whether or not C++ is strongly typed by whatever the accepted definition is! – user541686 Nov 05 '14 at 09:35
  • 2
    @Mehrdad Agreed, and expanded my answer. –  Nov 05 '14 at 09:40
  • Casts in C and C++ can occur silently, you know. Actually, both languages are full of subtle type traps. In my experience, most invocations of `printf` (and family) written by an average C/C++ programmer contain undefined behavior :) – ach Nov 05 '14 at 09:53
  • 4
    @AndreyChernyakhovskiy No, they cannot. "Cast" means "explicit conversion" (or specifically, the syntax used to write an explicit conversion). There are implicit conversions, like I already noted in my answer, but they aren't called casts. –  Nov 05 '14 at 09:55
  • 1
    @AndreyChernyakhovskiy: Unsafe conversions don't occur implicitly in C++, but they do in C (e.g. pointer conversions). – user541686 Nov 05 '14 at 09:55
  • 1
    @Mehrdad Ah, yes, I think you may have a good point there. I should add that to my answer. –  Nov 05 '14 at 09:57
  • @hvd That's true, from the C/C++ language-lawyer point of view. However, even those of us who only program in C and C++ sometimes laxly use the word 'cast' with the meaning of 'conversion'. – ach Nov 05 '14 at 10:07
  • @AndreyChernyakhovskiy If that's what the authors of the paper meant, then I agree that you have a good point, but the impression I got was that they were after `void *p = ...; int i = (int)p;`. –  Nov 05 '14 at 10:09
  • Mehrdad, unfortunately, I have to disagree. `int i; printf("%p, %i", &i, sizeof i);`. There are two type system violations in this simple code. – ach Nov 05 '14 at 10:11
  • 1
    @AndreyChernyakhovskiy: I guess I was talking about the language itself, not the standard library. If you look at it that way, you might as well claim that every language that allows calling C functions allows unsafe conversions. – user541686 Nov 05 '14 at 10:11
  • @Mehrdad, but you could write an implementation of `printf` yourself, and it would still have the same vulnerabilities. Therefore it is the *language itself* that provokes these type system violations. – ach Nov 05 '14 at 10:15
  • @AndreyChernyakhovskiy: No. My implementation would have those vulnerabilities if it uses varargs, but `va_arg` performs an explicit conversion, not an implicit one. – user541686 Nov 05 '14 at 10:16
  • @Mehrdad, you couldn't avoid using varargs before C++11 to implement `printf`, could you? – ach Nov 05 '14 at 10:28
  • For C++ (but not C), there are also more dangerous implicit conversions that are worth mentioning: Create an array of derived class (`Derived a[10]`), then pass it to a function that takes an array of base class (`void f(Base x[]); f(a);`). Watch it crash and burn. In other words, the implicit `Derived *` to `Base *` conversion is unsafe due to the existence of native arrays. – melpomene May 02 '17 at 06:00
  • 1
    I think one distinction that is not being made here (despite the fact that many people here are experienced and aware of it), is beyond the technical differences in C and C++ (legal conversions), there is a huge difference in the strength of typing in idiomatic code. For instance, C ellipses arguments are legal C++, but not idiomatic (replaced by variadic templates which are strongly typed). Another example: user-stateful callbacks in C are handled by function pointer taking void*. In C++ handled by function object/std::function. Generally void* usage in C++ is far, far more rare. Etc. – Nir Friedman May 02 '17 at 06:13
  • Your example is compiler-wise great but it does not reach to an obvious failure. – ar2015 Aug 19 '18 at 04:23
  • Not to say it's wrong, but there are implicit casts in C and C++ (with different rules, by the way). In C++ they happen mainly with arithmetic types and booleans. In C they may also happen with `void*` pointers. Implicit and contextual casts are well-defined – Swift - Friday Pie Nov 26 '22 at 12:27
13

"Weakly typed" is quite a subjective term. I prefer the terms "strictly typed" and "statically typed" vs. "loosely typed" and "dynamically typed", because they are more objective and more precise.

From what I can tell, people generally use "weakly typed" as a diminutive-pejorative term which means "I don't like the notion of types in this language". It's sort of an argumentum ad hominem (or rather, argumentum ad linguam) for those who can't bring up professional or technical arguments against a particular language.

The term "strictly typed" also has slightly different interpretations; the generally accepted meaning, in my experience, is "the compiler generates errors if types don't match up". Another interpretation is that "there are no or few implicit conversions". Based on this, C++ can actually be considered a strictly typed language, and most often it is considered as such. I would say that the general consensus on C++ is that it is a strictly typed language.

Of course, we could take a more nuanced approach and say that most of the language is strictly typed, while other parts are loosely typed (a few implicit conversions, e.g. the arithmetic conversions, plus the four kinds of explicit cast).

Furthermore, some programmers, especially beginners who are familiar with only a few languages, don't intend to or can't distinguish "strict" from "static" and "loose" from "dynamic". They conflate these two otherwise orthogonal concepts based on their limited experience (usually the correlation of dynamism and loose typing in popular scripting languages).

In reality, parts of C++ (virtual calls) impose the requirement that the type system be partially dynamic, but other things in the standard require that it be strict. Again, this is not a problem, since these are orthogonal concepts.
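
A small sketch of what "partially dynamic" means here; the `Shape` and `Square` classes are invented for the example. The static type of the parameter is fixed at compile time, while the function that actually runs depends on the object's dynamic type:

```cpp
#include <typeinfo>

struct Shape {
    virtual ~Shape() = default;
    virtual int corners() const { return 0; }
};

struct Square : Shape {
    int corners() const override { return 4; }
};

// The static type of `s` is always Shape; the override is chosen at run time.
inline int count_corners(const Shape& s) { return s.corners(); }

// typeid on a polymorphic reference reports the dynamic type.
inline bool is_square(const Shape& s) { return typeid(s) == typeid(Square); }
```

Passing something that is not a `Shape` to either function is still a compile-time error, which is the strict part.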

To sum up: probably no language fits completely, perfectly into one category or another, but we can say which particular property of a given language dominates. In C++, strictness definitely does dominate.

4

Well, since the creator of C++, Bjarne Stroustrup, says in The C++ Programming Language (4th edition) that the language is strongly typed, I would take his word for it:

C++ programming is based on strong static type checking, and most techniques aim at achieving a high level of abstraction and a direct representation of the programmer’s ideas. This can usually be done without compromising run-time and space efficiency compared to lower-level techniques. To gain the benefits of C++, programmers coming to it from a different language must learn and internalize idiomatic C++ programming style and technique. The same applies to programmers used to earlier and less expressive versions of C++.

In this video lecture from 1994 he also states that the weak type system of C really bothered him, and that's why he made C++ strongly typed: The Design of C++, lecture by Bjarne Stroustrup

ruohola
3

In contrast, a language is weakly-typed if type-confusion can occur silently (undetected), and eventually cause errors that are difficult to localize.

Well, that can happen in C++, for example:

#define _USE_MATH_DEFINES  // non-standard: needed for M_PI with some compilers
#include <iostream>
#include <cmath>
#include <limits>

void f(char n) { std::cout << "f(char)\n"; }
void f(int n) { std::cout << "f(int)\n"; }
void g(int n) { std::cout << "g(int)\n"; }

int main()
{
    float fl = M_PI;   // silent conversion from double to float may lose precision

    f(8 + '0'); // 8 + '0' has type int, so f(int) is called: potentially unintended

    unsigned n = std::numeric_limits<unsigned>::max();
    g(n);  // silent unsigned -> int conversion: potentially unintended treatment as int
}

Also, C and C++ are considered weakly typed since, due to type-casting, one can interpret a field of a structure that was an integer as a pointer.

Ummmm... not via any implicit conversion, so that's a silly argument. C++ allows explicit casting between types, but that's hardly "weak": it doesn't happen accidentally or silently, as required by the paper's own definition above.

Is the existence of type casting all that matters? Does the explicit-ness of such casts not matter?

Explicitness is a crucial consideration IMHO. Letting a programmer override the compiler's knowledge of types is one of the "power" features of C++, not some weakness. It's not prone to accidental use.

More generally, is it really generally accepted that C++ is weakly typed? Why?

No - I don't think it is accepted. C++ is reasonably strongly typed, and the leniencies that have historically caused trouble have been pruned back: implicit conversions from void* to other pointer types were removed, and the named cast operators and explicit constructors give finer-grained control.
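
As an illustration of that finer-grained control, here is a sketch with a made-up `Meters` type. Marking the constructor `explicit` removes the implicit conversion from `double` while keeping direct construction available:

```cpp
#include <type_traits>

// Hypothetical unit type used only for this example.
struct Meters {
    explicit Meters(double v) : value(v) {}
    double value;
};

// A bare double no longer converts to Meters behind the programmer's back ...
static_assert(!std::is_convertible<double, Meters>::value,
              "no implicit double -> Meters");
// ... but it can still be constructed on request.
static_assert(std::is_constructible<Meters, double>::value,
              "explicit construction is allowed");

inline double doubled(Meters m) { return m.value * 2.0; }
```

With these definitions `doubled(1.5)` does not compile, whereas `doubled(Meters{1.5})` does.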

Tony Delroy
1

In General:

There is some confusion around the subject. Some terms differ from book to book (not considering the internet here), and some may have changed over the years.

Below is what I've understood from the book "Engineering a Compiler" (2nd Edition).


1. Untyped Languages

Languages that have no types at all, such as assembly.


2. Weakly Typed Languages:

Languages that have a poor type system. The definition here is intentionally ambiguous.


3. Strongly Typed Languages:

Languages where each expression has an unambiguous type. These languages can be further categorised as:

  • A. Statically Typed: when every expression is assigned a type at compile time.
  • B. Dynamically Typed: when some expressions can only be typed at runtime.


What is C++ then?

Well, it's strongly typed for sure. And mostly it is statically typed. But as some expressions can only be typed at runtime, I guess it falls under the 3.B category.

PS1: A note from the book:

A strongly typed language that is statically checkable might, for some reason, be implemented with runtime checking only.

PS2: The third edition was recently released.

I don't own it, so I don't know whether anything has changed in this regard. In general, though, the "Semantic Analysis" chapter has changed both its title and its position in the table of contents.

Paschalis
-1

Let me give you a simple example:

 if ( a + b )

C/C++ allows an implicit conversion from float to int to Boolean.

A strongly-typed language would not allow such an implicit conversion.
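
A runnable C++ sketch of that condition, with hypothetical helper names. One detail worth noting: in C++ the condition is converted to bool directly, without an intermediate int step, and the second helper shows where the two readings differ:

```cpp
// The condition is contextually converted to bool: any nonzero sum is true.
inline bool sum_is_truthy(float a, float b) {
    if (a + b) {
        return true;
    }
    return false;
}

// For comparison only: what the result would be if the sum were first
// truncated to int. This step is NOT what the language does implicitly.
inline bool sum_is_truthy_via_int(float a, float b) {
    return static_cast<int>(a + b) != 0;
}
```

For `a = 0.25f` and `b = 0.5f` the first helper returns true, while the truncating variant returns false.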

user3344003