
I am using the json-c library (json-c/json.h) under Ubuntu. I need to parse JSON strings on multiple POSIX threads. I am currently using json_tokener_parse(). Is it thread-safe, or do I need to use something else?

Thanks.

John Zwinck
dbassu

1 Answer


I looked through the code: https://github.com/json-c/json-c/blob/master/json_tokener.c

It appears to be thread-safe with one exception:

#ifdef HAVE_SETLOCALE
  char *oldlocale=NULL, *tmplocale;

  tmplocale = setlocale(LC_NUMERIC, NULL);
  if (tmplocale) oldlocale = strdup(tmplocale);
  setlocale(LC_NUMERIC, "C");
#endif

So if HAVE_SETLOCALE is defined (and it probably will be), setlocale() is called and sets the process-wide LC_NUMERIC to "C", then restores the previous value when parsing finishes. This causes problems if your LC_NUMERIC is not "C" or a compatible locale to begin with, because one thread may "restore" your locale while another is still parsing and expecting the "C" locale to be in effect.

Fortunately it is guaranteed that the locale will be "C" on program start, so you just need to make sure that neither you nor any other library you're using sets LC_NUMERIC (or LC_ALL of course) to a locale incompatible with "C". You could then rebuild the library with HAVE_SETLOCALE undefined if you want, but this probably doesn't matter, as its calls to setlocale() will have no real effect.
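As a rough way to check the condition above before any parser threads are started, something like the following could be used. This is only a sketch: the helper names are hypothetical, and it assumes that "compatible with C" for this purpose means the radix character is ".", which is what matters for the `"%lf"` conversion. It relies only on the standard `localeconv()` call, which should be made before the threads are spawned.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the current LC_NUMERIC uses "." as
   the radix character (as the "C" locale does), 0 otherwise. */
static int numeric_locale_is_c_compatible(void)
{
    const struct lconv *lc = localeconv();
    return lc != NULL && strcmp(lc->decimal_point, ".") == 0;
}

/* Hypothetical helper: call once from main() before creating threads.
   Aborts in debug builds if the numeric locale has been changed.
   Note that assert() is compiled out when NDEBUG is defined. */
static void require_c_numeric_locale(void)
{
    assert(numeric_locale_is_c_compatible());
}
```

If the check fails, some code in the process has changed the locale, and the race described above becomes a real concern.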

John Zwinck
  • Good catch. Doesn't the C library do the equivalent of `setlocale(LC_ALL, "C")` at startup anyway? – Jonathan Leffler Oct 15 '14 at 04:36
  • This could be fixed via proper use of `newlocale` and `uselocale`, but the fact that `LC_NUMERIC` can mess up the radix point for floating point parsing and printing is a huge mess that's hard to solve in a way that's thread-safe, portable, and efficient. – R.. GitHub STOP HELPING ICE Oct 15 '14 at 04:41
  • @JonathanLeffler: aha, good catch by you as well. You're right that `"C"` is guaranteed to be the locale on startup. I'll update my answer to reflect this. – John Zwinck Oct 15 '14 at 04:42
  • @JonathanLeffler: I don't see how the fact that the "C" locale is active by default when you don't call `setlocale` is relevant. The call to `setlocale` is still a problem. – R.. GitHub STOP HELPING ICE Oct 15 '14 at 04:43
  • @R..: the call to `setlocale()` is only a problem if your code does an explicit call to `setlocale()` somewhere with a value other than `"C"` (such as `""`, the 'locale-specific native environment'). If your code does not mess with the locale at all, it isn't necessary to worry about it being called in the library. There could be data races to worry about, though. I didn't say that `setlocale()` was not a problem; I just pointed out that it is not necessary to do `setlocale(LC_NUMERIC, "C");` yourself, as originally suggested, unless you've changed the locale settings somehow. – Jonathan Leffler Oct 15 '14 at 04:48
  • @R..: from here https://github.com/json-c/json-c/commit/a01b659ace168d85a3e9e47848eaaba2bea31078 it looks like `setlocale()` is only used to accommodate one call to `sscanf()` with the format `"%lf"`. This is wasteful anyway; `strtod` could probably be used instead, but that respects the locale too. Easier might be to simply include a number parsing function that ignores locale as required. That is discussed here: http://stackoverflow.com/questions/1994658/locale-independent-strtod-implementation – John Zwinck Oct 15 '14 at 04:48
  • It would be easier, in some respects, if all the functions that are affected by locale had a variant that took some opaque 'locale' pointer (`struct lconv` from `<locale.h>` is not sufficient, in general) as an explicit argument. The functions that are currently defined without a locale would become simple cover functions that arrange to pass the 'current locale' to the functions with an explicit locale. Similar support for time zones (perhaps an aspect of locale, but in some respects rather different) would also be useful. This would allow libraries such as _json-c_ to work better. – Jonathan Leffler Oct 15 '14 at 04:53
  • @JohnZwinck: `strtod` is highly nontrivial to reimplement from scratch, which is the whole reason it's so bad that it uses a locale-specific radix character. You *can* however obtain that character via the `localeconv` function, so it's possible that you could "reformat" the string to use the locale's radix character before calling `strtod`. – R.. GitHub STOP HELPING ICE Oct 15 '14 at 04:55
  • @R..: of course we wouldn't implement `strtod` from scratch. We'd use the one from Ruby linked from that other topic I mentioned, or some other existing implementation. Certainly there are plenty out there, a few of them even trustworthy. :) – John Zwinck Oct 15 '14 at 05:06
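
For reference, the `newlocale`/`uselocale` approach mentioned in the comments might look roughly like this. It is a minimal sketch, not what json-c does: `parse_double_c` is a hypothetical helper, and it assumes a POSIX.1-2008 system such as glibc on Linux (some platforms declare these functions in a different header). `uselocale()` changes the locale of the calling thread only, so other threads are not affected.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: parse a double using "." as the radix character,
   whatever the process-wide LC_NUMERIC is.
   Returns 0 on success, -1 if no number could be parsed. */
static int parse_double_c(const char *text, double *out)
{
    assert(text != NULL && out != NULL);

    /* Build a locale object whose numeric category is "C". */
    locale_t c_loc = newlocale(LC_NUMERIC_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return -1;

    /* Install it for this thread only and remember the previous one. */
    locale_t old = uselocale(c_loc);

    char *end = NULL;
    double value = strtod(text, &end);

    /* Restore the thread's previous locale before freeing ours. */
    uselocale(old);
    freelocale(c_loc);

    if (end == text)
        return -1;   /* no conversion was performed */

    *out = value;
    return 0;
}
```

Creating and freeing a locale object on every call is wasteful; a real implementation would probably create it once and reuse it.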