Say I have a C program that traverses a directory and stores the directory entry metadata in a `struct dirent *` named `dir`. The program accesses the field `dir->d_name` several times. I'm wondering if setting an auxiliary character pointer (e.g. `char *str = dir->d_name`) will make the program faster. I know that dereferencing pointers is a relatively expensive operation. The thing is, if I set an auxiliary variable, I'm still dereferencing a pointer; the only difference is that in one case I'm dereferencing a pointer to a struct, while in the other I'm dereferencing a pointer to a string. So I guess the crucial question here is: how expensive is accessing individual fields of a struct? I imagine that at the machine level, this would involve first dereferencing the pointer to the struct to get its address, and then incrementing that address by the offset of the desired field. In the case of the auxiliary pointer, you'd just dereference it, at which point you already have the start address of the string.
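
To make the address arithmetic concrete, here is a minimal sketch. It relies on the fact that POSIX declares `d_name` as a character array inside `struct dirent`, so `dir->d_name` is the struct's address plus a constant offset rather than a pointer loaded from memory. The helper names `name_via_field` and `name_via_offset` are hypothetical and exist only for illustration.

```c
#include <dirent.h>
#include <stddef.h>

/* Hypothetical helper: the name as reached through the struct pointer.
 * d_name is an array member, so this expression decays to an address
 * (base + constant offset); nothing is loaded from memory here. */
const char *name_via_field(const struct dirent *dir)
{
    return dir->d_name;
}

/* Hypothetical helper: the same address computed by hand. */
const char *name_via_offset(const struct dirent *dir)
{
    return (const char *)dir + offsetof(struct dirent, d_name);
}
```

Both helpers return the same address for any valid `struct dirent *`, so `char *str = dir->d_name` merely stores an address the compiler can already compute from `dir`.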

-
Is this C or C++ ? – Mark Benningfield Jan 25 '18 at 15:44
-
There won't be any performance impact I'd suppose, since the dereferencing is done at compile time. – Jan 25 '18 at 15:44
-
@MarkBenningfield This question is applicable to either language. – ack Jan 25 '18 at 15:45
-
I think the performance difference would be absolutely imperceptible (although you could try to write a program that does each of these operations a few million times and compare the time it took) – Gab Jan 25 '18 at 15:46
-
Generally it doesn't matter, but occasionally [there are surprising exceptions](https://stackoverflow.com/questions/48303172/why-dont-c-compilers-optimize-away-reads-and-writes-to-struct-data-members-as). – nwp Jan 25 '18 at 15:47
-
*"I know that dereferencing pointers is a relatively expensive operation..."* - It's not as simple as that, and it could be cheaper than you think. Never try to predict the optimizations made by aggressively optimizing compilers – WhiZTiM Jan 25 '18 at 15:47
-
Benchmark and stare at the assembly. Not much else could be confidently said, but I'd guess it makes zero difference on any capable optimizer. Also, without a [mcve], there can be no constructive discussion – Passer By Jan 25 '18 at 15:49
-
Not sure, because 1. the CPU's caches (L1-L3) may well hold your pointer, and 2. on a modern CPU, `*(p + offset)` is as fast as `*p`, because the addition is often done by a dedicated hardware component. So accessing `dir->d_name` is as fast as using `str`, which points directly at `d_name`, I think. – superK Jan 25 '18 at 15:53
-
It depends on what variables you are manipulating within the function scope. You would be surprised how much the presence of a `char*` (whose memory address isn't known at compile time) inhibits optimizations. The fact that modifying anything through a `char*` can have side effects on everything else (again, whose memory address isn't known at compile time) prevents the compiler from caching a lot of variables within the affected scope. So you should look at the generated assembly to assess that, and you must equally benchmark - surprises may await you. – WhiZTiM Jan 25 '18 at 15:56
-
Is your program too slow? Otherwise, do not optimize prematurely. There are reasons why under some circumstances I might prefer to create a local variable such as you describe, but they don't involve performance. My first guess would be that any performance gain realized would be very small. – John Bollinger Jan 25 '18 at 16:04
-
If it's really important for your performance: don't theorize, measure instead. Things like these are implementation-dependent - it can be the same speed, faster or slower. No standard tells us anything about what it will be. And yes, things like these _can_ have a baffling performance impact. See [this post](https://stackoverflow.com/q/47683375/2328447) for example. – user2328447 Jan 25 '18 at 16:37
-
'Say I have a C program that traverses a directory': stop right there. Your program is I/O bound, and searching for micro-optimizations in the CPU-bound part of it is completely pointless. The I/O is several orders of magnitude slower. – user207421 Jan 25 '18 at 19:06
1 Answer
The only answer I can give you is: it depends. With a trivial compiler from the 70s (note: only K&R C...), or a hand-made instruction-by-instruction compiler, using a register variable as an auxiliary pointer can yield a real optimization.
With a decent optimizing compiler, things get much tougher. Because of the so-called as-if rule, a conforming compiler is free to reorder or suppress anything, provided the observable behaviour of the program is the same as that of an abstract machine in which all expressions are evaluated as specified by the semantics (ref n1570 draft for C11 5.1.2.3 Program execution). So for low-level optimizations, you can only benchmark different versions of the code against a specific compiler with specific options, or inspect the generated assembly (or machine) code. The problem is that the result can vary between compilers and between options. That's the reason why best practices recommend not caring about low-level optimizations unless profiling has identified a bottleneck.
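
As a starting point for such a comparison, here is a minimal sketch of the two variants from the question, written as separate functions so that their generated code can be compared (for example with `gcc -O2 -S`) or timed. The function names and the work done per entry (summing the lengths of names that do not start with a dot) are assumptions made purely for illustration; they are not from the question.

```c
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Variant A (hypothetical name): reach d_name through the struct pointer each time. */
size_t sum_name_lengths_direct(const char *path)
{
    DIR *d = opendir(path);
    if (d == NULL)
        return 0;                      /* unreadable directory counts as empty */

    size_t total = 0;
    struct dirent *dir;
    while ((dir = readdir(d)) != NULL) {
        if (dir->d_name[0] != '.')     /* skip names starting with a dot */
            total += strlen(dir->d_name);
    }
    closedir(d);
    return total;
}

/* Variant B (hypothetical name): same logic with an auxiliary pointer. */
size_t sum_name_lengths_aux(const char *path)
{
    DIR *d = opendir(path);
    if (d == NULL)
        return 0;

    size_t total = 0;
    struct dirent *dir;
    while ((dir = readdir(d)) != NULL) {
        const char *str = dir->d_name; /* auxiliary pointer from the question */
        if (str[0] != '.')
            total += strlen(str);
    }
    closedir(d);
    return total;
}
```

Both functions return the same value for the same directory. Whether the compiler emits different code for them is something only the assembly output or a measurement on your own compiler and options can tell.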
