As discussed in my previous question (How do I deal with a data race in OpenMP?), there are three ways to do an aggregation: reduction, critical, and atomic, as shown in @wolfpack88's answer. However, the performance of the three solutions differs: the reduction is twice as fast as the others.
So my question is: why does this happen, and how can I use critical and atomic to get the same performance?
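For reference, the three variants being compared look roughly like the sketch below. This is a minimal illustration that assumes the aggregation is a plain sum over an array of doubles; the function names (`sum_reduction`, `sum_critical`, `sum_atomic`) are hypothetical and are not the exact code from the linked question.

```c
/* Minimal sketch of the three aggregation variants (assumed: a plain sum).
 * Build with OpenMP enabled, e.g. gcc -fopenmp; without it the pragmas
 * are ignored and the loops simply run serially. */

/* Variant 1: reduction. Each thread accumulates into its own private
 * copy of sum; the copies are combined once at the end of the loop. */
double sum_reduction(const double *a, int n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Variant 2: critical. Every single update of the shared sum goes
 * through a mutual-exclusion region, one thread at a time. */
double sum_critical(const double *a, int n)
{
    double sum = 0.0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp critical
        sum += a[i];
    }
    return sum;
}

/* Variant 3: atomic. Every single update of the shared sum is
 * performed as an atomic read-modify-write. */
double sum_atomic(const double *a, int n)
{
    double sum = 0.0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp atomic
        sum += a[i];
    }
    return sum;
}
```

In this sketch the critical and atomic variants synchronize on the shared variable once per loop iteration, while the reduction variant only combines the per-thread results once per thread.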