I'm running Hive + Tez on EMR and I'd like some clarity on how Tez interacts with YARN.
I read in this article:
Set tez.am.resource.memory.mb to be the same as yarn.scheduler.minimum-allocation-mb (the YARN minimum container size)
Set hive.tez.container.size to be the same as or a small multiple (1 or 2 times that) of YARN container size yarn.scheduler.minimum-allocation-mb but NEVER more than yarn.scheduler.maximum-allocation-mb. You want to have headroom for multiple containers to be spun up.
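Read literally, I take that advice to mean the checks below. This is only a minimal sketch of my understanding: the function name is something I made up, and the numbers in the usage line are hypothetical, not EMR defaults or values from the article.

```python
def check_tez_sizes(yarn_min_mb, yarn_max_mb, tez_am_mb, hive_container_mb):
    """Return the list of ways the given sizes break the quoted advice.

    An empty list means the sizes are consistent with the advice as I read it.
    All arguments are memory sizes in MB.
    """
    problems = []

    # Advice 1: the Tez AM size should equal the YARN minimum allocation.
    if tez_am_mb != yarn_min_mb:
        problems.append(
            "tez.am.resource.memory.mb should equal "
            "yarn.scheduler.minimum-allocation-mb"
        )

    # Advice 2: the Hive/Tez container size should be 1x or 2x the YARN minimum.
    if hive_container_mb not in (yarn_min_mb, 2 * yarn_min_mb):
        problems.append(
            "hive.tez.container.size should be 1x or 2x "
            "yarn.scheduler.minimum-allocation-mb"
        )

    # Advice 3: never exceed the YARN maximum allocation.
    if hive_container_mb > yarn_max_mb:
        problems.append(
            "hive.tez.container.size must not exceed "
            "yarn.scheduler.maximum-allocation-mb"
        )

    return problems


# Hypothetical values: 1024 MB minimum, 8192 MB maximum.
print(check_tez_sizes(1024, 8192, tez_am_mb=1024, hive_container_mb=2048))
```

With those hypothetical values nothing is flagged, so the two Tez/Hive settings are clearly being compared against the YARN settings rather than replacing them, which is what prompts my question.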
This makes it sound like the Tez containers are configured separately from YARN containers. Is that true? From the general documentation, it seems like Tez is a replacement for YARN containers, which would mean that you set the Tez container size and can ignore the original YARN container size.
In short: do Tez containers run inside YARN containers, or do Tez containers run instead of YARN containers?