25

I left a record processing program of mine running for a few minutes under strace.

This showed in those minutes over 200 000 000 calls to stat("/etc/localtime",..) which sounds a bit excessive and unneeded.

The strace output looks like this:

write(1, "C137015 393393093052629137110 47"..., 16384) = 16384
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2225, ...}) = 0
read(0, "\224q\1\207\0\0\202\1\4\203\1\4\204\1\1\205\1\1\206\1\7\207\1\6\211\1\22\212\1\22\213\1"..., 16384) = 16384

Essentially it turned out to be 1 stat() call for every record processed and the culprit turned out to be this quite ordinary line of code

strftime(call->date_time,DATELEN,"%Y%m%d %H%M%S",&tm_buf);

So - how can I avoid strftime() calling stat(/etc/localtime) at every call?

nos
  • 223,662
  • 58
  • 417
  • 506
  • 4
    Do you have any proof that this is actually causing a measurable problem (i.e. significant runtime cost measured by a profiler)? You shouldn't be trying to optimize based on *feelings* (like "sounds a bit excessive"). Get measurements and use facts. – nobody Dec 29 '10 at 14:13
  • 3
    +1 for question about getting to the bottom of glibc bloat. :-) – R.. GitHub STOP HELPING ICE Dec 29 '10 at 15:14
  • Adding `export TZ=:/etc/localtime` to my shell script eliminated the repeated syscalls for me. – shuckc Apr 07 '13 at 14:57
  • And while this question is old: AndrewMedico: Yes, ofcourse. This was measured to have about 10% performance impact. @R.. It also seems this is required in order to e.g. catch time zone changes, and posix mandates that certain functions behave as if tzset() is called – nos Jun 18 '14 at 07:35
  • @nos: `tzset` is required to respond to changes to the `TZ` variable. It's not required to check whether the file on disk changed when the variable did not change. – R.. GitHub STOP HELPING ICE Jun 18 '14 at 07:39
  • @R.. I suspect tzset() in glibc then checks /etc/localtime when TZ isn't present. It's convenient as it responds to system changes of the timezone without having to restart programs, but comes at a cost. – nos Jun 18 '14 at 07:45
  • @nos: That cost is not just performance but consistency. Subsequent calls to `localtime` when the application has not changed the timezone and expects consistent results could be inconsistent. That seems highly undesirable. – R.. GitHub STOP HELPING ICE Jun 19 '14 at 01:40
  • My rep won't allow me to comment on the previous answer, but thought I'd drop this here as it explains quite well [*why* setting `TZ=:/etc/localtime` has the impact it does](https://blog.packagecloud.io/eng/2017/02/21/set-environment-variable-save-thousands-of-system-calls/). That being said, setting `TZ=:/etc/localtime` should solve the repeated calls as standard pull of `/etc/localtime` without having the `TZ` var set will trigger 9 `stat` calls per lookup. – Michael Perret Feb 22 '17 at 03:21

2 Answers2

23

It might be doing that because your timezone isn't set. strftime queries /etc/localtime to find it.

Try setting the TZ environment variable.

Here's a link for that behavior.

alexmuller
  • 2,207
  • 3
  • 23
  • 36
Linus Kleen
  • 33,871
  • 11
  • 91
  • 99
  • 1
    setting TZ=":/etc/localtime" does "fix" this. However that page also states "You should not normally need to set TZ" – nos Dec 29 '10 at 13:10
  • 1
    @nos *In the GNU C library, the default time zone is like the specification 'TZ=:/etc/localtime' [...]*. There's the source of the repeated `stat` calls. – Linus Kleen Dec 29 '10 at 13:14
  • Im not sure that explains the stat() calls, if I explicittly set TZ=:/etc/localtime , the stat() calls disappear. – nos Dec 29 '10 at 20:14
  • Okay. For clarity's sake. What has it been set to before? – Linus Kleen Dec 29 '10 at 20:28
  • TZ was unset, RHEL (or for that matter, most linux installations I've seen) doesn't set TZ – nos Dec 30 '10 at 18:03
  • Can you please explain why setting TZ explicitly will eliminate the stats calls? Wouldn't the system go to the default location anyway, which is symlinked by /etc/localtime as suggested by man page of tzset. – swang Aug 21 '13 at 11:49
  • @swang This post is rather "old". Do you have a reference for this? – Linus Kleen Aug 21 '13 at 20:52
  • 3
    @awang apparently not. If the TZ variable is set, glibc will use the TZ variable and it will read the timezone from a file matching what's set in the TZ variable once(or until tzset() is called) - on the first call in a program that needs that information . If the TZ env variable is not set, glibc will go look at the /etc/localtime file every time (or rather it will check if /etc/localtime has changed since the last time it was read - which is what the stat() calls does) – nos Sep 24 '14 at 15:38
  • 1
    This situation has not changed in six years, despite this recent and somewhat publicized investigation that confirms it: https://blog.packagecloud.io/eng/2017/02/21/set-environment-variable-save-thousands-of-system-calls/ I do not know of any distribution that sets TZ at this time. – Law29 Jul 12 '17 at 12:55
  • 1
    To those interested, I just tried `setenv("TZ", ":/etc/localtime", 0);` at the start of a C program I'm writing, and it appears to work. Comments welcome; I vaguely recall hand-wavy notions about poking the environment from inside a program for purposes other than `exec()` being *bad*, but this appears to work, and now I don't need a wrapper script for my binary. – i336_ Sep 29 '19 at 08:03
1

With GNU libc library, here is another tip to solve this when /etc/localtime is not configured (this is often the case on tiny embedded systems). This consists to set TZ environment variable to the empty string. This makes the library default to the UTC.

Here is an example strace output on some embedded system calling a logging function which calls localtime() and strftime() when neither /etc/localtime nor TZ are set. With glibc 2.23, this triggers several unsuccessful calls to "openat(/etc/localtime)":

[...]
[pid 200113] gettid() = 200113
[pid 200113] openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 200113] openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 200113] gettid() = 200113
[pid 200113] openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 200113] openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 200113] getpid() = 200113
[pid 200113] writev(4, [{iov_base="DUH\1\1\0\0\0", iov_len=8}, {iov_base="=\2\0\261ECU1\0\3\r\261\r\32\265\3431\2SWMCSWMC", iov_len=26}, {iov_base="\0\2\0\0'\0Tue Mar 1 13:49:20 2022 _"..., iov_len=151}], 3) = 185
[pid 200113] gettid() = 200113
[pid 200113] openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 200113] openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 200113] gettid() = 200113
[...]

And here is the same program's output with strace when TZ is set to the empty string:

$ export TZ=
[...]
[pid 201837] gettid() = 201837
[pid 201837] gettid() = 201837
[pid 201837] getpid() = 201837
[pid 201837] writev(4, [{iov_base="DUH\1\1\0\0\0", iov_len=8}, {iov_base="=\2\0\261ECU1\0\3\24m\r7\253\3171\2SWMCSWMC", iov_len=26}, {iov_base="\0\2\0\0'\0Tue Mar 1 13:52:30 2022 _"..., iov_len=151}], 3) = 185
[pid 201837] gettid() = 201837
[pid 201857] <... read resumed>"DUH\1\f\0\0\0\0DUH\1\6\0\0\0\5\0\0\0\0\0DUH\1\6\0\0\0\5"..., 10024) = 37
[pid 201837] gettid( <unfinished ...>
[pid 201857] read(3, <unfinished ...>
[pid 201837] <... gettid resumed>) = 201837
[pid 201837] getpid() = 201837
[pid 201837] writev(4, [{iov_base="DUH\1\1\0\0\0", iov_len=8}, {iov_base="=\3\0\257ECU1\0\3\24m\r7\253\364A\2SWMCSWMC", iov_len=26}, {iov_base="\0\2\0\0'\0Tue Mar 1 13:52:30 2022 _"..., iov_len=149}], 3) = 183
[...]

N.B.: Here is the corresponding glibc's source code snippet in time/tzset.c which sets UTC when TZ is empty:

/* Interpret the TZ envariable. */
static void
internal_function
tzset_internal (int always, int explicit)
{
 static int is_initialized;
 const char *tz;
if (is_initialized && !always)
 return;
 is_initialized = 1;
/* Examine the TZ environment variable. */
 tz = getenv ("TZ");
 if (tz == NULL && !explicit)
 /* Use the site-wide default. This is a file name which means we
 would not see changes to the file if we compare only the file
 name for change. We want to notice file changes if tzset() has
 been called explicitly. Leave TZ as NULL in this case. */
 tz = TZDEFAULT;
 if (tz && *tz == '\0')
 /* User specified the empty string; use UTC explicitly. */
 tz = "Universal";
[...]
Rachid K.
  • 4,490
  • 3
  • 11
  • 30