Working with NYU's fork of MIT's xv6 operating system, we found that the kernel crashes when built with GCC 11 or 12, because those compilers use SSE2 instructions by default, even at -O0.
The problem is that I don't know why those instructions crash. The issue first shows up in an entirely innocent struct copy, shown below.
When compiled with -mno-sse under GCC 12.2, the result is:
*np->tf = *proc->tf;
801047c3: 65 a1 04 00 00 00 mov %gs:0x4,%eax
801047c9: 8b 48 18 mov 0x18(%eax),%ecx
801047cc: 8b 45 e0 mov -0x20(%ebp),%eax
801047cf: 8b 40 18 mov 0x18(%eax),%eax
801047d2: 89 c2 mov %eax,%edx
801047d4: 89 cb mov %ecx,%ebx
801047d6: b8 13 00 00 00 mov $0x13,%eax
801047db: 89 d7 mov %edx,%edi
801047dd: 89 de mov %ebx,%esi
801047df: 89 c1 mov %eax,%ecx
801047e1: f3 a5 rep movsl %ds:(%esi),%es:(%edi)
This works fine. When compiled without disabling SSE, the result is:
*np->tf = *proc->tf;
8010479f: 65 a1 04 00 00 00 mov %gs:0x4,%eax
801047a5: 8b 50 18 mov 0x18(%eax),%edx
801047a8: 8b 45 f0 mov -0x10(%ebp),%eax
801047ab: 8b 40 18 mov 0x18(%eax),%eax
801047ae: f3 0f 6f 02 movdqu (%edx),%xmm0
801047b2: 0f 11 00 movups %xmm0,(%eax)
801047b5: f3 0f 6f 42 10 movdqu 0x10(%edx),%xmm0
801047ba: 0f 11 40 10 movups %xmm0,0x10(%eax)
801047be: f3 0f 6f 42 20 movdqu 0x20(%edx),%xmm0
801047c3: 0f 11 40 20 movups %xmm0,0x20(%eax)
801047c7: f3 0f 6f 42 30 movdqu 0x30(%edx),%xmm0
801047cc: 0f 11 40 30 movups %xmm0,0x30(%eax)
801047d0: f3 0f 6f 42 3c movdqu 0x3c(%edx),%xmm0
801047d5: 0f 11 40 3c movups %xmm0,0x3c(%eax)
This traps with an invalid opcode exception (trap 6) at the first SSE instruction, 801047ae:
unexpected trap 6 from cpu 0 eip 801047ae (cr2=0x0)
So, what gives? These are all unaligned-access instructions, so alignment shouldn't be the issue. I've tested under both qemu-system-i386 and qemu-system-x86_64, with the same results. I also tested with -machine accel=kvm -cpu max, again with the same results.