I am trying to find out whether, on Ivy Bridge, it's possible to atomically write a 256-bit object that consists of various data types (int, double, float, etc.).

I have had a look at the Intel manual and searched (Ctrl+F) for "32-byte", but the results all discussed 256 bits of the same data type (so 4x doubles or 8x floats, etc.).

I am doing this as part of a lock-free design to ensure data consistency: load all 256 bits of data together, then extract each of the various components separately.

user997112
  • You can do large atomic operations on some versions of Haswell, if the chip supports TSX (http://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions). – Zan Lynx Nov 13 '14 at 19:44
  • @ZanLynx As the Wikipedia article notes, there is a bug in Haswell's implementation of TSX. Its recommended use is now limited to development. –  Nov 14 '14 at 01:52

1 Answer

I did a web search, and it appears that Intel does not guarantee that a 32-byte write is atomic. I found this, which suggests that not even regular 8-byte writes are guaranteed to be atomic.

Intel does provide an 8-byte compare-and-exchange instruction, which is atomic.
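As a rough illustration of what an 8-byte compare-and-exchange can do for mixed types, here is a minimal C++ sketch. It assumes the mixed fields fit in 64 bits, and the names `Pair`, `pack`, `unpack` and `bump_count` are made up for this example. On x86-64, compilers typically implement `std::atomic<std::uint64_t>` compare-exchange with a locked `cmpxchg`, but that is a property of the implementation, not something the code guarantees.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical 8-byte record mixing an int and a float.
struct Pair {
    std::int32_t count;
    float        weight;
};
static_assert(sizeof(Pair) == 8, "Pair must fit in one 64-bit word");

// Reinterpret the record as raw 64 bits (memcpy avoids aliasing problems).
inline std::uint64_t pack(Pair p) {
    std::uint64_t bits;
    std::memcpy(&bits, &p, sizeof bits);
    return bits;
}

inline Pair unpack(std::uint64_t bits) {
    Pair p;
    std::memcpy(&p, &bits, sizeof p);
    return p;
}

// Atomically increment `count` while leaving `weight` untouched.
// Returns the value that was successfully written.
inline Pair bump_count(std::atomic<std::uint64_t>& cell) {
    std::uint64_t old_bits = cell.load(std::memory_order_acquire);
    for (;;) {
        Pair next = unpack(old_bits);
        next.count += 1;
        // On failure, old_bits is refreshed with the current value and we retry.
        if (cell.compare_exchange_weak(old_bits, pack(next),
                                       std::memory_order_acq_rel,
                                       std::memory_order_acquire)) {
            return next;
        }
    }
}
```

This only covers 8 bytes, so it does not solve the 32-byte case; it just shows that the hardware does not care which types are packed into the word.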

Bottom line is that I think you will need to take another approach.

EDIT: I forgot about the x86 LOCK prefix. According to this, byte memory operations are guaranteed to be atomic, while larger operations are not unless the LOCK prefix is used on the read/write instruction.
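One possible "other approach" for reading a consistent 32-byte snapshot is a sequence lock (seqlock). This is a technique I am swapping in for illustration, not something the linked pages describe. The sketch below assumes a single writer thread, and the names `Payload` and `SeqLock` are hypothetical. Readers never block the writer; they retry if a write overlapped their read.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical 256-bit payload mixing several types.
struct Payload {
    std::int32_t id;
    float        ratio;
    double       price;
    double       volume;
    std::int64_t timestamp;
};
static_assert(sizeof(Payload) == 32, "payload must be exactly 256 bits");

class SeqLock {
    std::atomic<std::uint32_t> seq_{0};    // odd while a write is in progress
    std::atomic<std::uint64_t> words_[4]{}; // payload stored as four 64-bit words

public:
    // Assumption: only ONE thread ever calls store().
    void store(const Payload& p) {
        std::uint64_t w[4];
        std::memcpy(w, &p, sizeof p);
        std::uint32_t s = seq_.load(std::memory_order_relaxed);
        seq_.store(s + 1, std::memory_order_relaxed);        // mark write in progress
        std::atomic_thread_fence(std::memory_order_release); // order the mark before the data
        for (int i = 0; i < 4; ++i)
            words_[i].store(w[i], std::memory_order_relaxed);
        seq_.store(s + 2, std::memory_order_release);        // publish
    }

    // Any number of threads may call load(); it retries on a torn read.
    Payload load() const {
        std::uint64_t w[4];
        std::uint32_t before, after;
        do {
            before = seq_.load(std::memory_order_acquire);
            for (int i = 0; i < 4; ++i)
                w[i] = words_[i].load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_acquire); // order the data before the re-check
            after = seq_.load(std::memory_order_relaxed);
        } while (before != after || (before & 1u));
        Payload p;
        std::memcpy(&p, w, sizeof p);
        return p;
    }
};
```

The trade-off is that a reader can spin while writes are frequent, so this suits data that is read much more often than it is written.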

Craig S. Anderson
  • It should be noted that the linked page concerned *unaligned* accesses not being guaranteed to be atomic. –  Nov 14 '14 at 01:57
  • Aligned 8-byte reads and writes (but not read-modify-writes, except with LOCK) are also atomic, as of Pentium. Unaligned reads and writes of size 2, 4 or 8 that don't cross a cache-line boundary are atomic as of P6. It's a moving target; perhaps not all online sources are using the newest specs (not that P6 is particularly new...). – harold Nov 14 '14 at 09:03
  • @PaulA.Clayton I can align the data to a 32-byte boundary. Which two instructions would I require for loading and storing atomically (separately)? – user997112 Nov 14 '14 at 09:25
  • @user997112 I don't think VMOVDQA guarantees atomicity (with respect to other threads, it guarantees atomicity with respect to interrupts). Since Haswell has 256-bit L1 Dcache accesses (2 reads and 1 write), it would *presumably* provide such atomicity (for aligned accesses at least), but Sandy Bridge (and so Ivy Bridge) only has 128-bit L1 Dcache accesses so the processor might not guarantee that a read or write is fully done before the cache line is invalidated (for a read with a write from another core) or changed to F-state (for a write with a read from another core). **NOT AN EXPERT** –  Nov 16 '14 at 21:21