One possibility would be not to divide each term by the number of items, but to divide the sum once at the end, in order to accumulate less error...
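A minimal sketch of that idea (in Java purely for illustration; the method name is made up, and it deliberately ignores empty input and overflow for now):

```java
// Naive sketch: accumulate first, divide once at the end (one rounding from the division).
static double average(double[] values) {
    double sum = 0.0;
    for (double v : values) {
        sum += v;                    // may overflow to +/-infinity for large inputs
    }
    return sum / values.length;      // single division instead of n divisions
}
```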
Of course MaxValue+MaxValue would overflow in this case. Yes, but your library would cleverly mitigate this problem by detecting the overflow (arranging to trap the exception) and scaling the operands by 1/2 in that case.
At the end, after the division, you would apply the inverse scale by the appropriate power of 2.
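A hedged sketch of that rescaling trick, checking for an infinite intermediate result instead of trapping an actual floating-point exception (which is hard to do portably); the names and the exact strategy are illustrative, and finite inputs are assumed:

```java
// Illustrative sketch: halve the running sum and the input scale whenever the
// accumulation would overflow, then undo the power-of-two scale at the very end.
static double scaledAverage(double[] values) {
    double scale = 1.0;              // power-of-two factor currently applied to the inputs
    double sum = 0.0;                // holds scale * (sum of the values seen so far)
    for (double v : values) {        // assumes every v is finite
        double candidate = sum + scale * v;
        while (Double.isInfinite(candidate)) {
            sum *= 0.5;              // multiplying by 0.5 is exact (power of 2)
            scale *= 0.5;
            candidate = sum + scale * v;
        }
        sum = candidate;
    }
    // Divide first, then apply the inverse scale (a power of 2, so this step is exact).
    return (sum / values.length) / scale;
}
```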
Yes, but the sum of MaxValue, 3, -MaxValue might exhibit very bad accuracy (like answering zero instead of 1).
Ah, no problem, you can have a perfectly exact sum with code like this: Precise sum of floating point numbers. That's easy, modulo the scale protection you'll have to mix in...
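The linked question shows error-free summation algorithms; purely as an illustration of "exact until the final rounding" in Java, one could even accumulate in BigDecimal (every finite double converts to BigDecimal exactly), at a hefty speed cost compared to the tricks in the link:

```java
import java.math.BigDecimal;
import java.math.MathContext;

class ExactAverage {                                     // illustrative name, not a real library
    static double exactAverage(double[] values) {        // assumes finite inputs, non-empty array
        BigDecimal sum = BigDecimal.ZERO;
        for (double v : values) {
            sum = sum.add(new BigDecimal(v));             // exact: no rounding while accumulating
        }
        // Rounding only happens here, in the final division and conversion back to double.
        return sum.divide(new BigDecimal(values.length), MathContext.DECIMAL128).doubleValue();
    }
}
```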
A small collateral effect is that your average is no longer O(n), just a bit more expensive (ahem...). Oh, bad luck, some real-time application might expect O(n), and since you're writing a general-purpose library...
So, in order to balance both expectations, you might choose to sum with some kind of Kahan sum, and sacrifice a bit of accuracy for speed...
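A sketch of that compensated (Kahan) variant, again with made-up names: it stays O(n) with a small constant factor, and is far more accurate than the naive loop without being exact:

```java
// Kahan (compensated) summation: carries the low-order bits lost at each addition.
static double kahanAverage(double[] values) {
    double sum = 0.0;
    double c = 0.0;                  // running compensation for lost low-order bits
    for (double v : values) {
        double y = v - c;            // subtract the error carried from the previous step
        double t = sum + y;          // big + small: low bits of y may be lost here...
        c = (t - sum) - y;           // ...recover them: (t - sum) is what was actually added
        sum = t;
    }
    return sum / values.length;
}
```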
Oh, but why? What are the expectations, exactly? This is the main question you'd have to answer... Do you prefer a library that guarantees the best accuracy possible at the price of speed (think of crlibm vs. libm), the best speed at the price of a few corner cases, exception-free behaviour no matter how ill-behaved the inputs are (your original question), or a mix of the above?
Unfortunately, you can't have them all together...
In any case, as Patricia said, document it.