My answer will be shorter, because BoltClock's answer and the spec already explain this in detail.
It is a matter of efficiency.
If you consider that, per the spec, "[the used] values of that BODY element's background properties are their initial values", it makes sense to assign the background values to the body tag.
If you use the html tag instead, the UA may need a kind of fallback to display the background, which is time consuming.
Of course, that is nothing for today's computers, but what about mobile devices displaying, let's say, a background image with `background-size: cover`?
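As a rough illustration of the case described above, here is a minimal sketch that sets the canvas background on body and leaves html untouched. The file name `background.jpg` is a placeholder, and the `margin` and `min-height` values are my own assumptions, not something the spec requires.

```css
/* Sketch only: "background.jpg" is a placeholder file name. */
/* No background is set on html, so its background stays at the
   initial values and the UA takes the canvas background from body. */
body {
  margin: 0;
  /* Assumption: keep the page at least as tall as the viewport. */
  min-height: 100vh;
  background: url("background.jpg") no-repeat center center;
  background-size: cover;
}
```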
Furthermore, the order of HTML tags is a logical one: the link tags are defined in the head so that they can be used by the body and its children, which come after. But if you apply a CSS rule to the html tag, the UA may have to go backward... and then forward.
So what are the advantages of specifying the canvas background for the BODY element? Almost nothing for you, but a few microseconds for the user and a few microwatts for the client and the servers, and small streams make big rivers.
The W3C could have decided the opposite, and that would have been fine too. That is simply the purpose of a standard: you do what the browser expects of you.
What? iOS does not respect the rules? (They can; they are Apple.) Don't change the correct way of doing it; use media queries instead.
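For example, a media query can keep the background on body and only adjust it for small screens. This is just a sketch: it assumes the trouble is with a fixed, cover-sized background on small devices, and the 768px breakpoint and both file names are hypothetical.

```css
/* Default rule: the background stays on body. */
body {
  background: url("background-large.jpg") no-repeat center center fixed;
  background-size: cover;
}

/* Hypothetical breakpoint for small screens: swap in a lighter image
   and let the background scroll with the page instead of being fixed. */
@media (max-width: 768px) {
  body {
    background-image: url("background-small.jpg");
    background-attachment: scroll;
  }
}
```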