76

The STL is a critical piece of the C++ world, most implementations derive from the initial efforts by Stepanov and Musser.

My question is given the criticality of the code, and it being one of the primary sources for people to view examples of well written C++ for both awe and learning purposes: Why are the various implementations of the STL so disgusting to look at - convoluted and generally good examples of how not to write C++ code from an aesthetics point of view.

The code examples below would not pass code-review at the places I've worked at for reasons varying from variable naming, layout, macros and uses of operators that require more than a simple glance to figure out what is actually occurring.

template<class _BidIt> inline
bool _Next_permutation(_BidIt _First, _BidIt _Last)
{  // permute and test for pure ascending, using operator<
_BidIt _Next = _Last;
if (_First == _Last || _First == --_Next)
   return (false);

for (; ; )
   {  // find rightmost element smaller than successor
   _BidIt _Next1 = _Next;
   if (_DEBUG_LT(*--_Next, *_Next1))
      {  // swap with rightmost element that's smaller, flip suffix
      _BidIt _Mid = _Last;
      for (; !_DEBUG_LT(*_Next, *--_Mid); )
         ;
      _STD iter_swap(_Next, _Mid);
      _STD reverse(_Next1, _Last);
      return (true);
      }

   if (_Next == _First)
      {  // pure descending, flip all
      _STD reverse(_First, _Last);
      return (false);
      }
   }
}


_Ty operator()()
   {  // return next value
   static _Ty _Zero = 0;   // to quiet diagnostics
   _Ty _Divisor = (_Ty)_Mx;

   _Prev = _Mx ? ((_Ity)_Ax * _Prev + (_Ty)_Cx) % _Divisor
      : ((_Ity)_Ax * _Prev + (_Ty)_Cx);
   if (_Prev < _Zero)
      _Prev += (_Ty)_Mx;
   return (_Prev);
   }

Please note I'm not critiquing the interface, as it is very well designed and applicable. What I'm concerned about is the readability of the implementation details.

A similar questions have been previously posed:

Is there a readable implementation of the STL

Why STL implementation is so unreadable? How C++ could have been improved here?

Note: The code presented above is taken from MSVC 2010 algorithm and queue headers.

Community
  • 1
  • 1
  • 2
    As I read it, the question is not really "Why is the STL so convoluted?" but is more along the lines of "Why do commonly used implementations of the STL have such apparently convoluted source code?" Is that correct? – James McNellis Nov 14 '10 at 22:53
  • @James: That is correct. I've further qualified my question with a statement at the bottom. –  Nov 14 '10 at 22:54
  • 3
    STL isn't exactly an object model that was widely adopted in other languages. But that's not the point of your complaint. This is the kind of crap that library designers have to go through when the language has a preprocessor. Oh, and having to write all of the implementation in a header. – Hans Passant Nov 14 '10 at 22:58
  • 25
    It was hard to write. It should be hard to read. – JUST MY correct OPINION Nov 14 '10 at 23:00
  • 7
    @Hans: They've had two decades to say, ok that its we're redoing this from scratch" - given that c++0x requires changes for move and rvalue semantics wouldn't it be a good time to say lets just re-write the whole thing, use only the language, use all the new good stuff being added etc. –  Nov 14 '10 at 23:01
  • 2
    @sonicoder: no. If that's what you want, use D or something. C++ is not going to totally break backward compatibility in favour of a more modern, cleaner compilation model. – Steve Jessop Nov 14 '10 at 23:11
  • 6
    "given ... it being one of the primary sources for people to view examples of well written C++ for both awe and learning purposes" While using STL can help programmers write simpler and perhaps clearer code, why do you assume it was ever intended for the STL's implementation details to be simple and clear and function as a good example of C++? – TheUndeadFish Nov 14 '10 at 23:13
  • @soni, high time for a new Alexandrescu. Go for it! – Hans Passant Nov 14 '10 at 23:15
  • 1
    @TheUndeadFish: The STL is definitely a source to learn about C++ template/metaprogramming. Its not an assumption, Libraries like Boost et al, at some point all base the experiences on use of the STL and the implementation details of STL. My argument is can it be made such that newcomers can enjoy its value too? Does there always have to be an Everest to climb? –  Nov 14 '10 at 23:17
  • 4
    @sonicoder: Someone undoubtedly owns the copyright to this code. At a bare minimum, please identify from where you have copied the code. – James McNellis Nov 14 '10 at 23:29
  • 1
    @James: I'm fairly certain this is the MSVC implementation. I've spent (too) much time in there. – John Dibling Nov 15 '10 at 00:14
  • 2
    @sonicoder: I find the STL quite readable, much to the contrary of Boost. It's complicated, but at least you have pretty much everything in a single place, whereas Boost requires you to trek among loads of headers... – Matthieu M. Nov 15 '10 at 07:59
  • 1
    @sonicoder: Regarding the indentation, it's called *Whitesmiths style* – Matthieu M. Nov 15 '10 at 08:03

6 Answers6

21

About the variables names, library implementors must use "crazy" naming conventions, such as names starting with an underscore followed by an uppercase letter, because such names are reserved for them. They cannot use "normal" names, because those may have been redefined by a user macro.

Section 17.6.3.3.2 "Global names" §1 states:

Certain sets of names and function signatures are always reserved to the implementation:

  • Each name that contains a double underscore or begins with an underscore followed by an uppercase letter is reserved to the implementation for any use.

  • Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.

(Note that these rules forbid header guards like __MY_FILE_H which I have seen quite often.)

fredoverflow
  • 256,549
  • 94
  • 388
  • 662
  • @peoro, namespaces do not help for variables within a class. – Winston Ewert Nov 14 '10 at 23:03
  • 7
    @peoro namespaces don't protect you from preprocessor macros. `_Names` are reserved for the implementation, so if you have macros like that it's on your head. On the other hand, you shouldn't be forced to guess what an STL implementation might decide to use an identifier internally that you shouldn't make a macro, especially as this can vary from implementation to implementation. – Logan Capaldo Nov 14 '10 at 23:05
  • You are right. I skipped "macro" and thought about user variables/functions, which didn't make sense. – peoro Nov 14 '10 at 23:07
19

Neil Butterworth, now listed as "anon", provided a useful link in his answer to the SO question "Is there a readable implementation of the STL?". Quoting his answer there:

There is a book The C++ Standard Template Library, co-authored by the original STL designers Stepanov & Lee (together with P.J. Plauger and David Musser), which describes a possible implementation, complete with code - see http://www.amazon.co.uk/C-Standard-Template-Library/dp/0134376331.

See also the other answers in that thread.

Anyway, most of the STL code (by STL I here mean the STL-like subset of the C++ standard library) is template code, and as such must be header-only, and since it's used in almost every program it pays to have that code as short as possible.

Thus, the natural trade-off point between conciseness and readability is much farther over on the conciseness end of the scale than with "normal" code.

In addition, the standard library is where the system-independent view of application code is connected to the underlying system, utilizing all kinds of compiler-specific things that you as an application developer should best stay away from.

Donald Duck
  • 8,409
  • 22
  • 75
  • 99
Cheers and hth. - Alf
  • 142,714
  • 15
  • 209
  • 331
  • 1
    concision is one aspect, performance is another (which can also harm readability): people are always complaining of suspected fat and inefficiencies in STL implementations - over the years almost all has been whittled away, but there will have been occasional trade-offs in readability and elegance. – Tony Delroy Nov 15 '10 at 01:28
13

Variable names for the reason that this is standard library code, and it should use reserved names for implementation details in headers. The following should not break the standard libraries:

#define mid
#include <algorithm>

So standard library headers can't use mid as a variable name, hence _Mid. The STL was different - it wasn't part of the language specification, it was defined as "here are some headers, use them as you will"

Your code or mine, on the other hand, would be invalid if it used _Mid as a variable name since that's a reserved name - the implementation is allowed to do:

#define _Mid

if it feels like it.

Layout - meh. They probably have a style guide, they probably follow it, more or less. The fact that it doesn't match my style guide (and hence would fail my code review) is nothing to them.

Operators that are difficult to work out - difficult to whom? Code should be written for the people who maintain it, and GNU/Dinkumware/whoever probably don't want to let people loose on the standard libraries who can't puzzle out *--_Next at a glance. If you use that sort of expression, you get used to it, and if you don't you'll continue finding it hard.

I will give, you, though, that operator() overload is gibberish. [Edit: I get it, it's a linear congruential generator, done very generically, and if the modulus is "0" that means just use the natural wraparound of the arithmetic type.]

Steve Jessop
  • 273,490
  • 39
  • 460
  • 699
  • 2
    But this does break the standard libraries, of course it will. How can they define `std::pair` when `first` gets changed out to nothing? – Peter Alexander Nov 14 '10 at 23:04
  • @Peter: d'oh! I'll think of a better one. – Steve Jessop Nov 14 '10 at 23:06
  • 2
    Aaah I see, so its some kind of anti-piracy/anti-circumvention technique... :) –  Nov 14 '10 at 23:11
  • @sonicoder: yes, or if you've read "The City and the City", or "Neverwhere", or "Harry Potter", it's so that wizards and muggles can coexist without crashing into each other. – Steve Jessop Nov 14 '10 at 23:14
  • "GNU/Dinkumware/whoever probably don't want to let people loose on the standard libraries who can't puzzle out *--_Next at a glance." ... This is against the spirit of opensource software, and sadly, way too similar to microsoft's old (job)security through obscurity model. Made me really sad. – Spundun May 30 '14 at 21:10
  • 1
    FWIW, RMS more or less *invented* (or at least greatly developed) the spirit of open source software, and I'm pretty sure used the idiom in glibc. If GNU considers `*--` a reasonable operator combination and you don't, then *most* people are going to follow GNU, so will be able to read it. Personally I'm not that bothered: I don't think code should be wilfully obscure, of course, but equally I don't think that a component as important as the standard libraries, in a language as tricky as C++, should be put off inventing and using idioms just because beginners find them initially hard. – Steve Jessop Jun 01 '14 at 00:20
4

Implementations vary. libc++ for example, is much easier on the eyes. There's still a bit of underscore noise though. As others have noted, the leading underscores are unfortunately required. Here's the same function in libc++:

template <class _Compare, class _BidirectionalIterator>
bool
__next_permutation(_BidirectionalIterator __first, _BidirectionalIterator __last, _Compare __comp)
{
    _BidirectionalIterator __i = __last;
    if (__first == __last || __first == --__i)
        return false;
    while (true)
    {
        _BidirectionalIterator __ip1 = __i;
        if (__comp(*--__i, *__ip1))
        {
            _BidirectionalIterator __j = __last;
            while (!__comp(*__i, *--__j))
                ;
            swap(*__i, *__j);
            _STD::reverse(__ip1, __last);
            return true;
        }
        if (__i == __first)
        {
            _STD::reverse(__first, __last);
            return false;
        }
    }
}
ergosys
  • 47,835
  • 5
  • 49
  • 70
  • 3
    They are only required for compiler specific keywords and macros not class/struct members. though the convention has been to other prepend and "M_" or "m_" or append an "_" to denote class member as opposed to local or global variable. –  Nov 14 '10 at 23:09
  • Well, I guess there is room for improvement :) – ergosys Nov 14 '10 at 23:13
  • 1
    This code really isn't all that different from the code in the original question, though. It just reformats the indentation, uses `while` instead of `for` for the loops, and cuts out the debug support macros. All of those fall into the "trivial coding standard issues" category and none are particularly important (well, with the exception of the debug support macros, but if you want debugging support, you either need to write two implementations or use macros to conditionally compile it in). – James McNellis Nov 14 '10 at 23:13
  • 1
    I maintain only that it is easier to read, and an example to my point that there are more readable implementations. I get a headache looking at the OP's example, but this one isn't too bad. – ergosys Nov 14 '10 at 23:56
  • 6
    @sonicoder: no, the underscores are required on *everything*, for the reasons @Steve Jessop and others have mentioned: anything non-reserved name could theoretically be redefined as a macro by the library user, which would break the standard library code. So they have to use leading underscores, because those are the names reserved for use by the implementation. – jalf Nov 15 '10 at 02:37
2

I suspect part of the reason is that the code in the STL is highly optimized. The sort of code being implemented has performance being much more important then readability. Because they are so widely used it makes sense to make them as fast as possible.

Winston Ewert
  • 44,070
  • 10
  • 68
  • 83
  • 8
    you can write really optimial code and not make it convoluted, the STL's going around are just variations of the library started 2 decades ago. –  Nov 14 '10 at 23:03
0

To add on what people have said already, the style you see is the GNU style. Ugly? Perhaps, that's in the eye of the beholder. But it's a strictly-defined style, and it does make all code look similar, as opposed to resistant to getting used to.

wilhelmtell
  • 57,473
  • 20
  • 96
  • 131