Edit: ST does not allow newbies to post more than two links. Sorry for the missing references.
I'm trying to reduce locking overhead in a C application where detecting changes to a global state is performance-critical. Even though I've been reading quite a lot on the topic lately (e.g. a lot from H. Sutter, and many more), I'm still not confident about my implementation. I would like to combine a CAS-like (compare-and-swap) operation with DCL (double-checked locking) to check a cache-line-aligned global variable, thus avoiding false sharing, and to update thread-local data from data shared among multiple threads. My lack of confidence is mainly due to:
- my failing to interpret the GNU documentation on Type-Attributes
- my not being able to find any literature or examples that I could easily translate to C, such as aligning-to-cache-line-and-knowing-the-cache-line-size on ST or 1 (although 1 seems to answer my question somewhat, I'm not confident in my implementation)
- my limited experience with C
My questions:
1. The Type-Attributes documentation states:

   "This attribute specifies a minimum alignment (in bytes) for variables of the specified type. For example, the declarations:

   (please see Type-Attributes documentation for declaration)

   force the compiler to insure (as far as it can) that each variable whose type is struct S or more_aligned_int will be allocated and aligned at least on a 8-byte boundary. On a SPARC, having all variables of type struct S aligned to 8-byte boundaries allows the compiler to use the ldd and std (doubleword load and store) instructions when copying one variable of type struct S to another, thus improving run-time efficiency."

   Does that mean that the beginning of struct S or more_aligned_int will always be aligned to an 8-byte boundary? It does not mean the data will be padded to use exactly 64 bytes, right?
2. Assuming 1. is true, every instance of struct cache_line_aligned (see code Example 1 below) aligns on a 64-byte boundary and utilizes exactly one cache line (assuming cache lines are 64 bytes in length).
3. Using typedef for the type declaration does not alter the semantics of __attribute__ ((aligned (64))) (see code Example 2 below).
4. I do not need to use aligned_malloc when instantiating the struct if the struct is declared with __attribute__ ...
// Example 1
struct cache_line_aligned {
    int version;
    char padding[60];
} __attribute__ ((aligned (64)));
// Example 2
typedef struct {
    int version;
    // place '__attribute__ ((aligned (64)))' after 'int version'
    // or at the end of the declaration
    char padding[60];
} cache_line_aligned2 __attribute__ ((aligned (64)));
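One way I could check my assumptions about Example 1 and Example 2 is to let the compiler report the size and alignment itself. The following is only a minimal sketch, assuming GCC or Clang in C11 mode, a 4-byte int, and 64-byte cache lines. The names CACHE_LINE_SIZE, g_example1, g_example2, is_cache_line_aligned and globals_are_cache_line_aligned are hypothetical helpers I made up for illustration; they are not part of the GNU documentation.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignof (C11) */
#include <stdint.h>   /* uintptr_t */

#define CACHE_LINE_SIZE 64 /* assumption: cache lines are 64 bytes long */

/* Example 1: attribute applied to the struct definition. */
struct cache_line_aligned {
    int version;
    char padding[60]; /* assumes sizeof(int) == 4 */
} __attribute__ ((aligned (64)));

/* Example 2: attribute applied to the typedef name. */
typedef struct {
    int version;
    char padding[60]; /* assumes sizeof(int) == 4 */
} cache_line_aligned2 __attribute__ ((aligned (64)));

/* Compile-time checks: the build fails if one of the assumptions is wrong. */
static_assert(sizeof(struct cache_line_aligned) == CACHE_LINE_SIZE,
              "Example 1 should occupy exactly one cache line");
static_assert(alignof(struct cache_line_aligned) == CACHE_LINE_SIZE,
              "Example 1 should be aligned to a cache line");
static_assert(sizeof(cache_line_aligned2) == CACHE_LINE_SIZE,
              "Example 2 should occupy exactly one cache line");
static_assert(alignof(cache_line_aligned2) == CACHE_LINE_SIZE,
              "Example 2 should be aligned to a cache line");

/* Instances with static storage duration; no aligned allocation call is used. */
static struct cache_line_aligned g_example1;
static cache_line_aligned2 g_example2;

/* Hypothetical helper: returns 1 if p lies on a cache-line boundary, else 0. */
static int is_cache_line_aligned(const void *p) {
    return ((uintptr_t)p % CACHE_LINE_SIZE) == 0;
}

/* Hypothetical helper: returns 1 if both globals start on a cache-line boundary. */
int globals_are_cache_line_aligned(void) {
    return is_cache_line_aligned(&g_example1) &&
           is_cache_line_aligned(&g_example2);
}
```

This only covers objects that the compiler places itself (globals here). It says nothing about memory returned by plain malloc, which is where my question about aligned_malloc comes from.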
And finally a sketch of a function that uses the cache-line aligned approach to efficiently check if global state has been modified by some other thread:
void lazy_update_if_changed(int *t_version, char *t_data) {
    // Assuming 'g_cache_line_aligned' is an instance of
    // 'struct cache_line_aligned' or 'cache_line_aligned2'
    // and variables prefixed with 't_' being thread local
    if (g_cache_line_aligned.version == *t_version) {
        // do nothing and return
    } else {
        // enter critical section (acquire lock e.g. with pthread_mutex_lock)
        *t_version = g_cache_line_aligned.version;
        // read other data that requires locking where changes are notified
        // by modifying 'g_cache_line_aligned.version', e.g. t_data
        // leave critical section
    }
}
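To make the idea concrete, a compilable variant of the sketch might look like the following. It rests on several assumptions that are mine and not confirmed anywhere: C11 <stdatomic.h> and POSIX threads are available, the version counter is an atomic_int of 4 bytes, and the names g_lock, g_data, DATA_SIZE and publish_update are hypothetical. The unlocked check uses an atomic load instead of a plain read, because a plain read that runs concurrently with a write is a data race in C11. Unlike the sketch above, this variant returns an int so a caller can tell whether anything was refreshed.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <string.h>

#define DATA_SIZE 32 /* hypothetical size of the shared payload */

struct cache_line_aligned {
    atomic_int version;
    char padding[60]; /* assumes sizeof(atomic_int) == 4 */
} __attribute__ ((aligned (64)));

static struct cache_line_aligned g_cache_line_aligned; /* version starts at 0 */
static char g_data[DATA_SIZE];                          /* protected by g_lock */
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer side (hypothetical): modify the shared data under the lock,
 * then bump the version so readers notice the change. */
void publish_update(const char *new_data) {
    pthread_mutex_lock(&g_lock);
    strncpy(g_data, new_data, DATA_SIZE - 1);
    g_data[DATA_SIZE - 1] = '\0';
    atomic_fetch_add_explicit(&g_cache_line_aligned.version, 1,
                              memory_order_release);
    pthread_mutex_unlock(&g_lock);
}

/* Reader side: returns 0 if the thread-local copy is current,
 * 1 if it was refreshed. t_data must have room for DATA_SIZE bytes. */
int lazy_update_if_changed(int *t_version, char *t_data) {
    /* Fast path: one atomic load, no lock taken. */
    if (atomic_load_explicit(&g_cache_line_aligned.version,
                             memory_order_acquire) == *t_version) {
        return 0;
    }
    /* Slow path: re-read the version and copy the data under the lock. */
    pthread_mutex_lock(&g_lock);
    *t_version = atomic_load_explicit(&g_cache_line_aligned.version,
                                      memory_order_relaxed);
    memcpy(t_data, g_data, DATA_SIZE);
    pthread_mutex_unlock(&g_lock);
    return 1;
}
```

In this variant the writer only changes the version while holding the lock, so the version re-read inside the critical section always matches the data copied with it.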
Sorry for the long post.
Thank you!