I'm following this to run a PySpark job on EMR with custom libraries. The problem arises when my package dependencies conflict with packages preinstalled on EMR, such as aiobotocore. Another possible situation is when two different users try to use the cluster with conflicting package versions.
These issues are typically solved using virtual environments or Docker images. Which is the correct approach to resolve these Python dependency conflicts on EMR?