Is it because you are guaranteed to have the exact same state after
running the reducers?
Yes, this is what makes pure reducers the "gold standard". If the output depends only on the input, then the reducer is very easy to test, and it is just as easy to replay actions, keep history, and so on.
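As a rough illustration of that point, here is a minimal sketch of a pure reducer being replayed. The `CounterState` and `CounterAction` shapes and the `counterReducer` name are made up for this example and do not come from any particular library.

```typescript
// Hypothetical state and action shapes, chosen only for illustration.
type CounterState = { readonly count: number };
type CounterAction =
  | { type: "increment" }
  | { type: "add"; amount: number };

// A pure reducer: the result depends only on (state, action).
function counterReducer(state: CounterState, action: CounterAction): CounterState {
  switch (action.type) {
    case "increment":
      return { count: state.count + 1 };
    case "add":
      return { count: state.count + action.amount };
    default:
      return state;
  }
}

const initialState: CounterState = { count: 0 };
const actionLog: CounterAction[] = [
  { type: "increment" },
  { type: "add", amount: 5 },
  { type: "increment" },
];

// Replaying the same log from the same initial state gives the same result.
const firstRun = actionLog.reduce<CounterState>(counterReducer, initialState);
const secondRun = actionLog.reduce<CounterState>(counterReducer, initialState);

// Keeping history is just recording every intermediate state.
const stateHistory: CounterState[] = [initialState];
let current: CounterState = initialState;
for (const action of actionLog) {
  current = counterReducer(current, action);
  stateHistory.push(current);
}
```

Because nothing outside the arguments influences the result, a test only needs an input state, an action, and an expected output.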
If so, surely even side-effectful (i.e. non-pure) reducers could have
this property?
(Not the popular answer.) This is also correct. Non-pure reducers can have these same properties if you are careful. However, that approach is much more error-prone and (conceptually) doesn't make much sense. The idea that (I think) you are getting at is that everything is just input and output. You could shift the line of thinking a bit and treat the internal state of your "non-pure" reducer as one more input to your reducer.
In that sense, you could imagine tracking your application state, your actions, and the internal state of your reducers, and end up with the same playback (and other) properties that pure reducers give you, although you'd need a lot more code to handle that.
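To make the "internal state as one more input" idea concrete, here is a small sketch under assumed names (`Todo`, `AddTodo`, `impureTodos`, `explicitTodos` are all hypothetical). The first reducer keeps an ID counter hidden outside the application state; the second takes that same counter as explicit input and returns it as output.

```typescript
// Hypothetical shapes, for illustration only.
type Todo = { id: number; text: string };
type AddTodo = { type: "addTodo"; text: string };

// Non-pure version: the ID counter is hidden state living outside the store.
let hiddenNextId = 1;
function impureTodos(state: Todo[], action: AddTodo): Todo[] {
  const id = hiddenNextId++; // side effect: mutates the hidden counter
  return [...state, { id, text: action.text }];
}

// Same logic, but the formerly hidden counter is now part of the tracked state.
type Tracked = { todos: Todo[]; nextId: number };
function explicitTodos(state: Tracked, action: AddTodo): Tracked {
  return {
    todos: [...state.todos, { id: state.nextId, text: action.text }],
    nextId: state.nextId + 1,
  };
}

const todoLog: AddTodo[] = [
  { type: "addTodo", text: "a" },
  { type: "addTodo", text: "b" },
];

// Replaying the impure reducer twice gives different IDs the second time,
// unless the hidden counter is saved and restored alongside the state.
const impureFirst = todoLog.reduce<Todo[]>(impureTodos, []);
const impureSecond = todoLog.reduce<Todo[]>(impureTodos, []);

// Replaying the explicit version is repeatable with no extra bookkeeping.
const explicitFirst = todoLog.reduce<Tracked>(explicitTodos, { todos: [], nextId: 1 });
const explicitSecond = todoLog.reduce<Tracked>(explicitTodos, { todos: [], nextId: 1 });
```

Once the counter is passed in and returned like this, the "non-pure" reducer has effectively become a pure one again, with its internal state folded into the state you already track.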
However, here's the rub: now you have your actual application state and your reducer's internal (and hidden) state. Who wants to keep track of two sets of state? That is what really makes the testing, reasoning, and implementation more difficult. There are more "kinds" of things to keep track of, and it is easier to miss/forget key details. In essence, if you already have a large portion of your application dedicated to keeping track of state, why would you want to keep more state hidden in your reducers?
So even setting aside doing things right for the sake of "rightness", it is conceptually simpler for your overall system architecture to keep all of your state in just one place.