Oddly enough, at least as things are really intended to work, none of this should directly involve iostreams and/or streambufs at all.
I would think of an iostream as a match-maker class. An iostream has a streambuf which provides a buffered interface to some sort of external source/sink of data. It also has a locale, which handles all the formatting. The iostream is little more than the playground supervisor that keeps those two playing together nicely (so to speak). Since you're dealing with data formatting, all of this is (or should be) handled in the locale.
A locale isn't monolithic though -- it's composed of a number of facets, each devoted to one particular part of data formatting. In this case, the part you probably care about is the codecvt facet, which is used (almost exclusively) to translate between the external and internal representations of data being read from/written to iostreams.
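To make the "locale is a bag of facets" idea a bit more concrete, here is a minimal sketch that pulls the char-to-char codecvt facet out of a stream's locale and asks whether it does any conversion. The names byte_codecvt and default_codecvt_is_noop are mine, purely for illustration.

```cpp
#include <cwchar>   // std::mbstate_t
#include <locale>
#include <sstream>

// The codecvt specialization that narrow (char) file streams use.
using byte_codecvt = std::codecvt<char, char, std::mbstate_t>;

// Returns true if the stream's locale holds a char/char codecvt facet
// that reports it never needs to convert anything.
bool default_codecvt_is_noop() {
    std::ostringstream stream;
    std::locale loc = stream.getloc();            // the locale the stream carries

    if (!std::has_facet<byte_codecvt>(loc))       // facets are looked up by type
        return false;

    const byte_codecvt& cvt = std::use_facet<byte_codecvt>(loc);
    return cvt.always_noconv();                   // the standard facet is an identity conversion
}
```

The standard char-to-char facet is an identity conversion, which is why a plain ofstream writes your bytes unchanged; replacing that facet is the hook for doing something more interesting.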
For better or worse, however, a locale can only contain one codecvt facet of a given type at a time, not a chain of them like you're contemplating. As such, what you really need/want is a wrapper class that provides a codecvt as its external interface, but allows you to chain some arbitrary set of transforms to be done to the data during I/O.
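A rough sketch of such a wrapper might look like the following. It is heavily simplified: I'm assuming every transform in the chain maps one byte to exactly one byte, which real compression or UTF conversion does not, and only the output direction is handled. The names chained_codecvt, transform and encode_with are hypothetical, not from any library.

```cpp
#include <cstddef>
#include <cwchar>      // std::mbstate_t
#include <functional>
#include <locale>
#include <string>
#include <utility>
#include <vector>

// Hypothetical facet: looks like one codecvt from the outside, but runs
// each outgoing byte through a chain of transforms, in order.
// Assumption: every transform is strictly one byte in, one byte out.
class chained_codecvt : public std::codecvt<char, char, std::mbstate_t> {
public:
    using transform = std::function<char(char)>;

    explicit chained_codecvt(std::vector<transform> chain, std::size_t refs = 0)
        : std::codecvt<char, char, std::mbstate_t>(refs), chain_(std::move(chain)) {}

protected:
    // Tell the stream buffer that it must actually call out().
    bool do_always_noconv() const noexcept override { return false; }
    int do_encoding() const noexcept override { return 1; }    // fixed width: 1 byte
    int do_max_length() const noexcept override { return 1; }

    result do_out(state_type&, const intern_type* from, const intern_type* from_end,
                  const intern_type*& from_next, extern_type* to, extern_type* to_end,
                  extern_type*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_next != to_end) {
            char c = *from_next++;
            for (const auto& f : chain_) c = f(c);   // apply the chain in order
            *to_next++ = c;
        }
        // partial means the output buffer filled up before the input ran out
        return from_next == from_end ? ok : partial;
    }

private:
    std::vector<transform> chain_;
};

// Drives the facet directly, the way a file buffer would on output.
std::string encode_with(const chained_codecvt& cvt, const std::string& in) {
    std::mbstate_t state{};
    std::string out(in.size(), '\0');
    const char* from_next = nullptr;
    char* to_next = nullptr;
    cvt.out(state, in.data(), in.data() + in.size(), from_next,
            &out[0], &out[0] + out.size(), to_next);
    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```

To hook it up to a file, you would build a locale with something like std::locale(std::locale::classic(), new chained_codecvt(chain)) and imbue the stream with it before opening the file. Transforms that change the length of the data (which is most of the interesting ones) need real state handling in do_out, do_in and do_unshift, and that's where the complexity starts to pile up.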
For the UTF-to-UTF conversion, Boost.Locale provides a utf_to_utf function, and codecvt wrapper code, so doing this part of the conversion is simple and straightforward.
Lest anybody suggest that such things be done with ICU, I'll add that Boost.Locale is pretty much a wrapper around ICU, so this is more or less the same answer, but in a form that's much more friendly to C++ (whereas ICU by itself is rather Java-like, and all but overtly hostile to C++).
The other side of things is that writing a codecvt facet adds a great deal of complexity to a fairly simple task. A filtering streambuf (for one example) is generally a lot simpler to write. It's still not as easy as you'd like, but not nearly as bad as a codecvt facet. As @Flexo already mentioned, the Boost iostreams library already includes a filtering streambuf that does zip-style (zlib/gzip) compression. Doing roughly the same with lzma (or lzh, arithmetic, etc. compression) is relatively easy, at least assuming you have compression functions that are easy to use (you basically just supply them with a buffer of input, and they supply a buffer of results).
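For comparison, here is a minimal filtering streambuf written against the standard library alone, not the Boost one. It just upper-cases ASCII letters as a stand-in for a real transform; a compression filter would instead collect a block of input and hand it to the compressor. upcase_filterbuf and filter_demo are made-up names for the sake of the example.

```cpp
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical filter: transforms each character, then forwards it to
// another streambuf. Stand-in transform: upper-case ASCII letters.
class upcase_filterbuf : public std::streambuf {
public:
    explicit upcase_filterbuf(std::streambuf* dest) : dest_(dest) {}

protected:
    // No put area is ever set up, so every character written lands here.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        char c = traits_type::to_char_type(ch);
        if (c >= 'a' && c <= 'z') c = static_cast<char>(c - 'a' + 'A');
        return dest_->sputc(c);                  // pass the result downstream
    }

    // Propagate flushes to the wrapped buffer.
    int sync() override { return dest_->pubsync(); }

private:
    std::streambuf* dest_;                       // not owned
};

// Writes text through the filter into a string and returns what came out.
std::string filter_demo(const std::string& text) {
    std::ostringstream sink;
    upcase_filterbuf filter(sink.rdbuf());
    std::ostream out(&filter);
    out << text << std::flush;
    return sink.str();
}
```

Since the filter only needs a pointer to another streambuf, you can stack several of them, which gets you the chaining you were after without touching the locale at all.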