Can anyone help me figure out why installing QIIME2 through Docker took around 20GB of space on my C drive? Thank you!

Before installing QIIME2, I had 30GB free on my C drive, but only 8GB remained after the installation.

Rowan
  • BTW, I followed up with the QIIME team with the observations from the answer below, and removed 6GB from the installed size. Should be in the next update. https://github.com/qiime2/vm-playbooks/pull/90 – Nick ODell Nov 22 '22 at 16:07

1 Answer

The short answer to that question is: QIIME2 is pretty big. But I'm sure you knew that already, so let's dig into the details.

First, the QIIME image is roughly 12GB when uncompressed. (This raises the question of where the other 8GB went if you lost 20GB in total. I don't have an answer to that.)

Using a tool called dive, I can explore the QIIME image, and see where that disk space is going. There's one entry that stands out in the log:

5.9 GB  |1 QIIME2_RELEASE=2022.8 /bin/sh -c chmod -R a+rwx /opt/conda 

For reference, chmod is a command that changes the permissions on a directory without changing the contents of the directory itself. Yet this command is responsible for roughly half the size of the image. It turns out that this is due to the way Docker works internally: if a layer changes the metadata or permissions of a file, then the entire original file must be included again in that layer.
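To make the layer behaviour concrete, here is a toy model in Python. It is only a sketch of the copy-on-write idea, not Docker's actual implementation: the `FileEntry` type, the helper functions, and the file names and sizes are all made up for illustration. It compares an install step followed by a separate chmod step against doing both in a single step.

```python
from dataclasses import dataclass

GB = 10**9


@dataclass(frozen=True)
class FileEntry:
    size: int  # bytes of file content
    mode: int  # permission bits


def layer_diff(before, after):
    """Entries a new layer has to store: every path whose content OR
    metadata differs from the layers below. The whole file is stored."""
    return {path: e for path, e in after.items() if before.get(path) != e}


def layer_size(layer):
    return sum(e.size for e in layer.values())


def chmod_all(fs, bits):
    """Model of `chmod -R`: only the mode changes, sizes stay the same."""
    return {path: FileEntry(e.size, e.mode | bits) for path, e in fs.items()}


# Hypothetical result of the dependency install step (sizes are invented).
installed = {
    "/opt/conda/pkgs/a": FileEntry(4 * GB, 0o644),
    "/opt/conda/pkgs/b": FileEntry(2 * GB, 0o755),
}

# Variant 1: install in one layer, chmod in the next layer.
install_layer = layer_diff({}, installed)
chmod_layer = layer_diff(installed, chmod_all(installed, 0o777))
two_step = layer_size(install_layer) + layer_size(chmod_layer)

# Variant 2: install and chmod inside the same layer.
one_step = layer_size(layer_diff({}, chmod_all(installed, 0o777)))

print(f"separate chmod layer: {two_step / GB:.1f} GB")
print(f"same layer:           {one_step / GB:.1f} GB")
```

In this model the separate chmod layer stores a second full copy of every file it touches, so the image ends up twice as large as when the permissions are set in the same step that creates the files.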

The remainder is 6GB, which comes mostly from a step where QIIME installs all of its dependencies. That's fairly reasonable for a project packaged with conda.

To summarize, it's an intersection of three factors:

  1. Conda is fairly space-hungry, compared to equivalent pip packages.

  2. QIIME has a lot of features and dependencies.

  3. Every dependency is included twice, because the chmod layer stores a second copy of everything under /opt/conda.

    Edit: this is now fixed in version 2022.11.

Nick ODell