According to Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1 (August 2012 version, but this should not have changed), Section 11.3.1, the buffer must be flushed:
The protocol for evicting the WC buffers is implementation dependent and should not be relied on by software for system memory coherency. When using the WC memory type, software must be sensitive to the fact that the writing of data to system memory is being delayed and must deliberately empty the WC buffers when system memory coherency is required.
If the graphics drivers did not actually flush the write combining buffers, then they were depending on system specific timing and/or buffer sizes (while assuming that subsequent WC writes will be allocated to the buffer, this is not architecturally guaranteed). This may work (or appear to work) on existing systems under ordinary workloads, but it is not architecturally guaranteed to work.
Since a broad range of serializing events will flush the write combining buffers, it is quite possible that the flush operation/event is present but not obvious (as an SFENCE would be). From Intel® 64 and IA-32 Architectures Software Developer’s Manual (version 052, September 2014), Volume 3, Section 11.3 Methods of Caching Available:
If the WC buffer is partially filled, the writes may be delayed until the next occurrence of a serializing event; such as, an SFENCE or MFENCE instruction, CPUID execution, a read or write to uncached memory, an interrupt occurrence, or a LOCK instruction execution.
For example, a write to a GPU register (if mapped to uncached memory) would flush the write combining buffer.