Update: as I should have expected, the community's sound advice in response to this question was to "measure it and see." chibacity posted an answer with some really nice tests that did this for me. Meanwhile, I wrote a test of my own, and the performance difference I saw was so large that I felt compelled to write a blog post about it.
However, I should also acknowledge Hans's explanation that the ThreadStatic attribute is indeed not free and in fact relies on a CLR helper method to work its magic. This makes it far from obvious whether it would be an appropriate optimization to apply in any arbitrary case.
The good news for me is that, in my case, it seems to have made a big improvement.
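For anyone who wants to try a similar comparison, here is a rough sketch of the kind of measurement involved. It is not the test from the blog post or from chibacity's answer; the class and method names (AllocationBenchmark, FillFresh, FillCached, TimeMs) are made up for illustration, and a Stopwatch loop like this only gives a coarse picture compared with a proper benchmarking harness.

```csharp
using System;
using System.Diagnostics;

// Hypothetical micro-benchmark: allocate a fresh 50-element array per call
// versus reusing one array per thread through a [ThreadStatic] field.
public static class AllocationBenchmark
{
    private const int Size = 50;

    [ThreadStatic]
    private static int[] _cached; // null on each thread until first use

    public static int FillFresh(int seed)
    {
        int[] a = new int[Size]; // new allocation on every call
        for (int i = 0; i < a.Length; i++) a[i] = seed + i;
        return a[Size - 1];
    }

    public static int FillCached(int seed)
    {
        // Lazily create the per-thread array, then reuse it.
        int[] a = _cached ?? (_cached = new int[Size]);
        for (int i = 0; i < a.Length; i++) a[i] = seed + i;
        return a[Size - 1];
    }

    public static long TimeMs(Func<int, int> body, int iterations)
    {
        body(0); // warm-up call so JIT compilation is not timed
        long sink = 0; // accumulate results so the work is not optimized away
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) sink += body(i);
        sw.Stop();
        GC.KeepAlive(sink);
        return sw.ElapsedMilliseconds;
    }

    public static void Run()
    {
        const int iterations = 10000000;
        Console.WriteLine("fresh:  {0} ms", TimeMs(FillFresh, iterations));
        Console.WriteLine("cached: {0} ms", TimeMs(FillCached, iterations));
    }
}
```

The numbers will vary with the runtime, build configuration, and GC settings, so it is worth running this in a Release build and on the real workload before drawing conclusions.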
I have a method which (among many other things) instantiates some medium-size arrays (~50 elements) for a few local variables.
After some profiling, I've identified this method as something of a performance bottleneck. It isn't that the method takes an extremely long time to run; rather, it is simply called many times, very quickly (hundreds of thousands to millions of times over a session lasting several hours). So even relatively small improvements to its performance should be worthwhile.
It occurred to me that maybe, instead of allocating a new array on each call, I could use fields marked [ThreadStatic]: whenever the method is called, it will check whether the field is initialized on the current thread and, if not, initialize it. From that point on, all calls on the same thread will have an array ready to go.
(The method initializes every element in the array itself, so having "stale" elements in the array should not be an issue.)
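To make the idea concrete, here is a minimal sketch of what I have in mind. The names (Worker, Process, _scratch) and the element type are placeholders rather than my real code, and the body just fills and sums the array so the example is self-contained.

```csharp
using System;

public static class Worker
{
    private const int BufferSize = 50;

    // One array per thread. The field starts out null on every thread;
    // a field initializer here would only run for the first thread that
    // touches the class, so the null check below is needed.
    [ThreadStatic]
    private static double[] _scratch;

    public static double Process(double seed)
    {
        double[] buffer = _scratch;
        if (buffer == null)
        {
            // First call on this thread: allocate once and keep it.
            buffer = new double[BufferSize];
            _scratch = buffer;
        }

        // Every element is overwritten, so stale values from the
        // previous call on this thread are never read.
        for (int i = 0; i < buffer.Length; i++)
            buffer[i] = seed + i;

        double sum = 0;
        for (int i = 0; i < buffer.Length; i++)
            sum += buffer[i];
        return sum;
    }
}
```

One assumption baked into this sketch is that the method is never re-entered on the same thread (no recursion, no callbacks into itself), since a nested call would overwrite the buffer the outer call is still using.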
My question is simply this: does this seem like a good idea? Are there pitfalls to using the ThreadStatic attribute in this way (i.e., as a performance optimization to mitigate the cost of instantiating new objects for local variables) that I should know about? Is the performance of a ThreadStatic field itself perhaps not great; e.g., is there a lot of extra "stuff" happening in the background, with its own set of costs, to make this feature possible?
It's also quite plausible to me that I'm wrong to even try to optimize something as cheap (?) as a 50-element array—and if that's so, definitely let me know—but the general question still holds.