Why are there no hashtables in the C standard library?

Question

Why is that there is no Hashtable support as part of Standard C Library? Is there any specific reason for this?

it is part of a strategy to increase employment among C developers. — Cheeso, May 25 '11 at 01:14
How to version: https://stackoverflow.com/questions/4384359/quick-way-to-implement-dictionary-in-c — Ciro Santilli OurBigBook.com, Aug 20 '17 at 18:36

score 70 · Answer 1 · edited Apr 19 '23 at 12:10

70

C seems unusual by today's standards because there are no useful data structures defined. None. Not even strings — and if you think a C string is a data structure, well, we'll have to disagree on what a "data structure" is.

If you like C, then think of it as a "blank slate"... your entire application is made of code written by you and libraries you choose to pull in, plus a few fairly primitive standard library functions, with maybe one or two exceptions like qsort. People use C these days to implement things like Python, Ruby, Apache, or the Linux kernel. These are projects that use all of their own data structures anyway, and they wouldn't be likely to use something like the STL.

Many C libraries implement generic hash tables. There are tradeoffs, and you can pick your favorite. Some of them are configurable using callbacks.

Glib has a hash table object (documentation)
Apache Portable Runtime has a hash table (documentation)
Apple's Core Foundation library has a hash table (documentation) Note: Yes you can insert any object as key or value.
UTHash is a hash table library (documentation)
Another hash table library (link)

With all of these libraries that do what you want, what's the point of adding a hash table to the C standard?

edited Apr 19 '23 at 12:10

Hi-Angel

4,933
8
63
86

answered May 25 '11 at 01:26

Dietrich Epp

205,541
37
345
415

9

Hmm, not sure I agree with the "C string is not a data structure" bit (not enough to downvote though) - they have a defined structure and functions for manipulating them. Other than complexity, I'm not sure that's any different to a BTree or a HashTable. In any case, we also have `FILE *` :-) – paxdiablo May 25 '11 at 02:43
+1 for an answer that actually does a decent job of answering the "why" part. – R.. GitHub STOP HELPING ICE May 25 '11 at 03:34
@paxdiablo: I thought about mentioning `FILE*`... that's a good point. – Dietrich Epp May 25 '11 at 05:04
1

Best answer. There's no need to create a standard hash table library because at this level, you really care which one you use, so you should go look at some open source libraries or write your own. – sudo Jul 26 '17 at 10:38
I love `UTHash`; would recommend it if you just want something quickly. It's super easy to learn and works without making you create a bunch of objects. And yeah, the author totally abused macros to make it work, but it's small and simple enough that I don't care. – sudo Jul 26 '17 at 10:50

score 55 · Accepted Answer · edited Oct 04 '17 at 23:51

There is no hashtable in the standard C library because either:

no-one has submitted a proposal to the working group; or
the working group has deemed it unnecessary.

That's the way ISO works. Proposals are put forward and accepted or rejected.

You have to be careful what you add to the standard library since you have two conflicting groups. As a user, you might want every data structure under the sun to be added to the standard to make the language more useful.

But, as a language implementor (as an aside, these are probably the people that tend to make up most of the various working groups so their view is likely to have more impact), you don't really want the hassle of having to implement stuff that may not be used by everyone. All the stuff that was there when C89 appeared was to do with the fact that the primary purpose was to codify existing practice rather than introduce new practices. All iterations of the standards since then have been a little freer in what they can do but backwards compatibility is still an important issue.

Myself, I also have conflicts. I'd love to have all the features of the Java, C++ or Python libraries at my disposal in C. Of course, that would make it so much harder to learn everything for newcomers and, as one commenter stated, probably make it so any code monkey can pump out useful code, reducing my value in the process :-)

And I pretty much have all the data structures I'll ever need, from my long and (mostly) illustrious career. You're not limited to the standard library for this sort of stuff. There are plenty of third-party tools you can get to do the job and (like me) you can also roll your own.

If you want to know why certain decisions were made in each iteration, ISO (and ANSI originally, before ISO took over) usually publish rationale documents. The C89 one from ANSI can be found here. It contains this little beauty in the scope:

This Rationale focuses primarily on additions, clarifications, and changes made to the language as described in the Base Documents. It is not a rationale for the C language as a whole: the Committee was charged with codifying an existing language, not designing a new one. No attempt is made in this Rationale to defend the pre-existing syntax of the language, such as the syntax of declarations or the binding of operators.

I especially enjoy the admission that they're not responsible for any unholy mess that may have predated their attempts to standardise.

But, perhaps the real answer to your question lies in this bit, one of the guiding principles:

Keep the spirit of C. The Committee kept as a major goal to preserve the traditional spirit of C. There are many facets of the spirit of C, but the essence is a community sentiment of the underlying principles upon which the C language is based. Some of the facets of the spirit of C can be summarized in phrases like:

Trust the programmer.
Don't prevent the programmer from doing what needs to be done.
Keep the language small and simple.
Provide only one way to do an operation.
Make it fast, even if it is not guaranteed to be portable.

That third one is probably the main reason why the library wasn't massively expanded with the initial standardisation effort - that, and the fact that such an expansion from a committee would probably have resulted in ANSI C being labeled C2038 rather than C89.

"Keep the language small and simple. Provide only one way to do an operation." In this still C you are talking about? int or or signed or signed int, if-else or ?:, if-else or switch, function-like macros or functions, continue or break or goto or return ;, const int or int const, typedef struct or struct, enum or const or #define, malloc or calloc, NULL or 0 or '\0', *(ptr+n) or ptr[n]. Etc, etc, this is just the top of the iceberg I could go on all day, really. — Lundin, May 25 '11 at 06:22
Also, the committee really made an admirable effort to make the language needlessly complex. When was the last time you used the following headers: complex.h, errno.h, fenv.h, float.h, inttypes.h, iso646.h, locale.h, signal.h, wctype.h. To me, a header file containing a hash table seems about one hundred times more useful than these. — Lundin, May 25 '11 at 06:29
@Lundin: see the section containing "charged with codifying an _existing_ language, not designing a _new_ one". Most of that stuff was already there pre-ANSI. Of those headers, I've often used `errno`, `locale` and `signal` - I don't think I've _ever_ used any of the others but, then again, I'm not the only C coder on the planet :-) Things like `complex`, `fenv`, `inttypes`, `iso646` and `wctype` came later so have nothing to do with the C89 rationale. You would have to look at the rationales for the specific iterations. — paxdiablo, May 25 '11 at 06:45
@Lundin: for example: "A new feature of C99, complex types were added to C as part of the effort to make C suitable and attractive for general numerical programming. Complex arithmetic is used heavily in certain important application areas." - from http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf — paxdiablo, May 25 '11 at 06:49
@paxdiablo What I've heard through rumours from people who were in the C99 committee, complex numbers were only added because there was a single representative some a big important company demanding them, and not for the benefit of the general public. — Lundin, May 25 '11 at 07:02
It's all there in the C99 rationale I provided a link to, starting on page 3: internationalisation, further codification for areas C89 was deficient, minimise incompatibilities with C89 and C++, maintain conceptual simplicity. If you think you can do a better job, feel free to apply for the next WG :-) And, re your rumour comment, see my paragraph beginning "As an aside". — paxdiablo, May 25 '11 at 07:03
Keep in mind I'm not saying their decisions were _right,_ just explaining how the process works. All those libraries would have been a godsend in my early career. Not so much now since I have all my own stuff. — paxdiablo, May 25 '11 at 07:06
Try writing software to design or simulate electronics components without a good library for complex arithmetic — evanflash, Apr 19 '12 at 14:55
"One way to do everything" is where they kinda failed, but they tried. They didn't try with C++... — sudo, Jul 26 '17 at 10:44

caf · Answer 3 · 2015-08-10T03:40:12.040

17

The standard C library doesn't include any large, persistent data structures - neither lists, nor trees, nor stacks, nor hashtables.

It's not really possible to give a definitive answer without asking the authors of the original C library. However, a plausible explanation is that the implementation of such data structures involves various tradeoffs, and only the author of the application is in the correct position to make those tradeoffs.

Note that the POSIX standard C library does specify generic hashtable functions: hcreate(), hsearch() and hdestroy(); and note also that their "one size fits all" nature tends to render them inadequate for most real-world use cases, supporting the argument above.

edited Aug 10 '15 at 03:40

answered May 25 '11 at 01:25

caf

233,326
40
323
462

6

Hm, looks like the POSIX hash tables only let you create one hash table at a time. That sounds *very* useful. – Dietrich Epp May 25 '11 at 01:30
4

The POSIX hash table functions are pretty much useless in real-world code. – R.. GitHub STOP HELPING ICE May 25 '11 at 03:33
5

See `hcreate_r` which does not have this restriction (but is a non-POSIX extension). – jjg Feb 10 '14 at 10:52

Ciro Santilli OurBigBook.com · Answer 4 · 2017-09-07T18:07:59.957

3

Due to the lack of templates

This is a guess, but not having templates in the language like C++ does makes implementing containers very inelegant, as you'd need dozens of definitions to cover all possible types, not to mention user defined types.

There are C strategies to mitigate this like playing around with void *, but they lose compile time type checks.

GLib and gnulib are my recommended implementations at the moment: Quick Way to Implement Dictionary in C

edited Sep 07 '17 at 18:07

answered Aug 20 '17 at 18:36

Ciro Santilli OurBigBook.com

347,512
102
1,199
985

2

Why not just use macros to generate code like how templates do? – wingerse Nov 18 '18 at 16:57
@WingerSendon perhaps that is possible, and I've never tried, but I have the gut feeling that the resulting codebase would be insaner than templates, if that is even possible :-) Glad to be proven wrong. – Ciro Santilli OurBigBook.com Nov 18 '18 at 22:53
1

C does not have template only for those who don't know how to use them. Have a look at openssl hash table implementation, they do it very well using preprocessor. – Yuri Nudelman Jun 09 '20 at 11:59
1

@CiroSantilli新疆再教育营六四事件法轮功郝海东 I find using the pre-processor to do `cat` as Kernighan and Ritchie, 1988, A12.3 p231, is more natural when one [has a lot of code](https://github.com/neil-edelman/table). – Neil Feb 01 '22 at 08:46
1

I've done basic "templates" in C before simply with a `#define TEMPLATE_TYPE int` followed by `#include "templated_binary_tree.h", one for each type you need to support. The header file simply uses whatever `TEMPLATE_TYPE` exists on inclusion to create functions tailored to the type. It's not beautiful but it can be made to work :-) – paxdiablo Feb 01 '23 at 23:27
@paxdiablo as said our friend Hendrix: "With the power, of soul, anything is possible!" https://youtu.be/W-M16K6UlQg?t=297 XD – Ciro Santilli OurBigBook.com Feb 02 '23 at 08:06

Why are there no hashtables in the C standard library?

4 Answers4

Linked