From the CUDA C Programming Guide, section on Asynchronous Concurrent Execution:
A stream is a sequence of commands (possibly issued by different host
threads) that execute in order. Different streams, on the other hand,
may execute their commands out of order with respect to one another or
concurrently; this behavior is not guaranteed and should therefore not
be relied upon for correctness (e.g., inter-kernel communication is
undefined).
If the application relies on how Compute Capability 2.x and 3.0 devices happen to implement streams, then the program violates the definition of streams, and any change to the CUDA driver (e.g., queuing of per-stream requests) or new hardware can break it.
If you need a temporary workaround, I would suggest moving all work to a single user-defined stream. Commands within one stream execute in order, so the ordering the program depends on becomes guaranteed. This may hurt performance, but it is likely the only temporary workaround.
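As a rough illustration of that workaround, here is a minimal sketch that issues a dependent pair of kernels and the copy back to the host into one user-defined stream. The kernels `produce` and `consume`, the helper `run_in_single_stream`, and the `CHECK` macro are hypothetical names made up for this example and are not taken from the original program; only the CUDA runtime calls are real API. Error handling is deliberately simple and does not free resources on failure.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: report a CUDA error and bail out of the calling function.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            return false;                                             \
        }                                                             \
    } while (0)

// Hypothetical producer kernel: writes i into buf[i].
__global__ void produce(int *buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = i;
}

// Hypothetical consumer kernel: depends on the producer's output.
__global__ void consume(const int *in, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2 * in[i];
}

// Issues all dependent work into ONE user-defined stream, so the
// in-order guarantee of a stream provides the required ordering.
bool run_in_single_stream(int n) {
    const size_t bytes = static_cast<size_t>(n) * sizeof(int);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    cudaStream_t stream;
    int *d_in = nullptr;
    int *d_out = nullptr;
    std::vector<int> h_out(n);

    CHECK(cudaStreamCreate(&stream));
    CHECK(cudaMalloc(&d_in, bytes));
    CHECK(cudaMalloc(&d_out, bytes));

    // Same stream for every command: produce, then consume, then copy.
    produce<<<blocks, threads, 0, stream>>>(d_in, n);
    CHECK(cudaGetLastError());
    consume<<<blocks, threads, 0, stream>>>(d_in, d_out, n);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpyAsync(h_out.data(), d_out, bytes,
                          cudaMemcpyDeviceToHost, stream));

    // Wait until everything queued in the stream has finished.
    CHECK(cudaStreamSynchronize(stream));

    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_out));
    CHECK(cudaStreamDestroy(stream));

    // Verify that consume saw the values written by produce.
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != 2 * i) return false;
    }
    return true;
}

int main() {
    const bool ok = run_in_single_stream(1 << 20);
    std::printf("%s\n", ok ? "results consistent" : "mismatch or CUDA error");
    return ok ? 0 : 1;
}
```

If `produce` and `consume` were launched into two different streams with nothing tying them together, the check at the end could fail on some driver or hardware combinations, which is exactly the kind of breakage described above.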