225

For research purposes I'm trying to crawl the public Docker registry ( https://registry.hub.docker.com/ ) and find out 1) how many layers an average image has and 2) the sizes of these layers to get an idea of the distribution.

I have studied the API, the public libraries, and the details on GitHub, but I can't find any method to:

  • retrieve all the public repositories/images (even if those are thousands I still need a starting list to iterate through)
  • find all the layers of an image
  • find the size for a layer (so not an image but for the individual layer).

Can anyone help me find a way to retrieve this information?

EDIT: is anyone able to verify that searching for '*' in Docker registry is returning all the repositories and not just anything that mentions '*' anywhere? https://registry.hub.docker.com/search?q=\*

BenMorel
user134589
  • Regarding "find all the layers of an image": if you do not use the API, you can run `docker history myimage` and you will see the size of each layer. More generally, on an image, you can run `docker history myimage | awk 'NR>1 {print $1}' | xargs docker inspect --format '{{ ((index .ContainerConfig.Cmd ) 0) }}'` to see which commands were issued to create the image – user2915097 Apr 17 '15 at 10:33
  • This is already a great help for step 2 although that requires me to download every image through Docker to my local machine. I guess that is an option but only if I find a way to retrieve a list of 'myimages' to start with (e.g. every image in public registry in step 1). I'll definitely explore this option, thank you! – user134589 Apr 17 '15 at 12:16
  • `https://registry.hub.docker.com/search?q=*` shows 87031 repositories for me. – user2915097 Apr 17 '15 at 13:49

12 Answers

191

Check out dive, a tool written in Go for exploring the layers of a Docker image.

Awesome tool!

Levon
133

You can first find the image ID using:

$ docker images -a

Then find the image's layers and their sizes:

$ docker history --no-trunc <Image ID>

Note: I'm using Docker version 1.13.1

$ docker -v
Docker version 1.13.1, build 092cba3
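
For collecting these sizes programmatically, a minimal Python sketch follows. It assumes a Docker release recent enough to support `docker history --format`, and it assumes the JSON objects emitted by `{{json .}}` carry `ID` and `Size` keys; the function names `parse_size` and `layer_sizes` are hypothetical helpers, not part of any Docker tooling.

```python
import json
import re
import subprocess

# Decimal multipliers: Docker prints human-readable sizes in base-1000 units.
_UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text):
    """Convert a size string such as '188.2 MB' or '1.45GB' to bytes."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(round(float(number) * _UNITS[unit.upper()]))


def layer_sizes(image):
    """Return (ID, size in bytes) pairs for a locally available image.

    Assumption: `docker history --format '{{json .}}'` prints one JSON
    object per history entry with 'ID' and 'Size' keys.
    """
    out = subprocess.run(
        ["docker", "history", "--no-trunc", "--format", "{{json .}}", image],
        check=True, capture_output=True, text=True,
    ).stdout
    rows = [json.loads(line) for line in out.splitlines() if line.strip()]
    return [(row["ID"], parse_size(row["Size"])) for row in rows]
```

Because the sizes are rounded for display, the byte counts recovered this way are approximate.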
Yuci
  • +1 I had to remove `--no-trunc` because the output was unusable on my terminal, but this still gives great info. – Sherwin F Apr 24 '23 at 20:18
102

You can find the layers of an image in the folder /var/lib/docker/aufs/layers, provided the storage driver is configured as aufs (the default option).

Example:

 docker ps -a
 CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                      PORTS               NAMES
 0ca502fa6aae        ubuntu              "/bin/bash"         44 minutes ago      Exited (0) 44 seconds ago                       DockerTest

Now, to view the layers of the containers that were created from the image "ubuntu", go to the /var/lib/docker/aufs/layers directory and cat the file whose name starts with the container ID (here it is 0ca502fa6aae*):

 root@viswesn-vm2:/var/lib/docker/aufs/layers# cat 0ca502fa6aaefc89f690736609b54b2f0fdebfe8452902ca383020e3b0d266f9-init
 d2a0ecffe6fa4ef3de9646a75cc629bbd9da7eead7f767cb810f9808d6b3ecb6
 29460ac934423a55802fcad24856827050697b4a9f33550bd93c82762fb6db8f
 b670fb0c7ecd3d2c401fbfd1fa4d7a872fbada0a4b8c2516d0be18911c6b25d6
 83e4dde6b9cfddf46b75a07ec8d65ad87a748b98cf27de7d5b3298c1f3455ae4

The same layers are shown by running:

root@viswesn-vm2:/var/lib/docker/aufs/layers# docker history ubuntu
IMAGE               CREATED             CREATED BY                                         SIZE                COMMENT
d2a0ecffe6fa        13 days ago         /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B                 
29460ac93442        13 days ago         /bin/sh -c sed -i 's/^#\s*\   (deb.*universe\)$/   1.895 kB            
b670fb0c7ecd        13 days ago         /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic   194.5 kB            
83e4dde6b9cf        13 days ago         /bin/sh -c #(nop) ADD file:c8f078961a543cdefa   188.2 MB 

To view the full layer IDs, pass the --no-trunc option to the history command:

docker history --no-trunc ubuntu
Viswesn
  • This is no longer the case from Docker version 1.10 onwards. The `docker history` command won't show the image layers as they appear in the /var/lib/docker/aufs/layers folder. Read the update [here](https://docs.docker.com/engine/userguide/storagedriver/imagesandcontainers/#copying-makes-containers-efficient). – Ruifeng Ma Mar 27 '17 at 15:32
  • Since Docker version 1.10, with the introduction of content-addressable storage, images and layers are now separated. The `docker history` command no longer reports the actual layer disk storage information on the Docker host. See this [blog](http://windsock.io/explaining-docker-image-ids/) – Ruifeng Ma Mar 28 '17 at 09:47
60

In my opinion, docker history <image> is sufficient. This returns the size of each layer:

$ docker history jenkinsci-jnlp-slave:2019-1-9c
IMAGE        CREATED    CREATED BY                                    SIZE  COMMENT
93f48953d298 42 min ago /bin/sh -c #(nop)  USER jenkins               0B
6305b07d4650 42 min ago /bin/sh -c chown jenkins:jenkins -R /home/je… 1.45GB
030
13

This will inspect the docker image and print the layers:

$ docker image inspect nginx -f '{{.RootFS.Layers}}'
[sha256:d626a8ad97a1f9c1f2c4db3814751ada64f60aed927764a3f994fcd88363b659 sha256:82b81d779f8352b20e52295afc6d0eab7e61c0ec7af96d85b8cda7800285d97d sha256:7ab428981537aa7d0c79bc1acbf208c71e57d9678f7deca4267cc03fba26b9c8]
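
To count layers across several local images at once, the full JSON from `docker image inspect` can be parsed instead of using a template. The sketch below assumes the output is a JSON array whose objects have `Id` and `RootFS.Layers` fields, as in the template above; `count_layers` and `inspect_images` are hypothetical helper names.

```python
import json
import subprocess


def count_layers(inspect_json):
    """Map each image ID to its number of RootFS layers.

    `inspect_json` is the raw text printed by `docker image inspect`.
    """
    images = json.loads(inspect_json)
    return {img["Id"]: len(img["RootFS"]["Layers"]) for img in images}


def inspect_images(*names):
    """Run `docker image inspect` on the given local images and return its output."""
    return subprocess.run(
        ["docker", "image", "inspect", *names],
        check=True, capture_output=True, text=True,
    ).stdout
```

For example, `count_layers(inspect_images("nginx", "ubuntu"))` would return one entry per image, provided both images have already been pulled.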
lvthillo
12

There is a very good answer here: https://stackoverflow.com/a/32455275/165865

Just run the image below:

docker run --rm -v /var/run/docker.sock:/var/run/docker.sock nate/dockviz images -t
Andy
sunnycmf
  • Hi @bummi, sorry, I think this question was originally looking for a solution for the Docker registry, and I found that the solution linked above goes directly to the layers of a Docker image, so I tried to supplement it with another solution (which I think is easier). – sunnycmf Jan 15 '16 at 09:19
3
  1. https://hub.docker.com/search?q=* shows all the images in the entire Docker Hub. It's not possible to get this via the search command, as it doesn't accept wildcards.

  2. As of v1.10 you can find all the layers in an image by pulling it and using these commands:

    docker pull ubuntu
    ID=$(sudo docker inspect -f {{.Id}} ubuntu)
    jq .rootfs.diff_ids /var/lib/docker/image/aufs/imagedb/content/$(echo $ID|tr ':' '/')
    

  3. The size can be found in /var/lib/docker/image/aufs/layerdb/sha256/{LAYERID}/size, although LAYERID is not the same as the diff_ids found with the previous command. To match them, look at /var/lib/docker/image/aufs/layerdb/sha256/{LAYERID}/diff and compare its contents with the previous command's output, so that each diff_id is paired with the correct size.
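
The matching described in the last step can be sketched in Python. This assumes the layerdb layout described above (one directory per LAYERID containing `diff` and `size` files) and read access to /var/lib/docker, which usually means running as root; `sizes_by_diff_id` is a hypothetical helper name.

```python
from pathlib import Path


def sizes_by_diff_id(layerdb_root):
    """Map each diff_id to its size in bytes.

    `layerdb_root` is a directory such as
    /var/lib/docker/image/aufs/layerdb/sha256, where every subdirectory
    holds a `diff` file (the diff_id) and a `size` file (bytes).
    """
    sizes = {}
    for layer_dir in Path(layerdb_root).iterdir():
        diff_file = layer_dir / "diff"
        size_file = layer_dir / "size"
        if diff_file.is_file() and size_file.is_file():
            diff_id = diff_file.read_text().strip()
            sizes[diff_id] = int(size_file.read_text().strip())
    return sizes
```

The resulting dictionary can then be indexed with the diff_ids printed by the `jq` command to get the size of each layer of one image.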

halfer
Piet
3

It is indeed possible to query the manifest or blob info from a Docker registry server without pulling the image to local disk.

You can use the Registry v2 API to fetch the manifest of an image.

GET /v2/<name>/manifests/<reference>

Note that you have to handle the different manifest versions. From a v2 manifest you can read the size and blob digest of each layer directly. For a v1 manifest, you can send a HEAD request to the blob download URL to get the actual layer size.

There is a simple script that handles the above cases and will be continuously maintained.
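
As an illustration of the v2 case only, here is a minimal Python sketch (not the script mentioned above). The Docker Hub endpoints `auth.docker.io` and `registry-1.docker.io`, the anonymous pull-token flow, and the helper names `layer_sizes` and `fetch_manifest` are assumptions for this example; other registries may authenticate differently.

```python
import json
import urllib.request

# Assumed Docker Hub endpoints; adjust for another registry.
AUTH_URL = "https://auth.docker.io/token"
REGISTRY_URL = "https://registry-1.docker.io"
MANIFEST_V2 = "application/vnd.docker.distribution.manifest.v2+json"


def layer_sizes(manifest):
    """Return (digest, size in bytes) pairs from a schema 2 image manifest."""
    if manifest.get("schemaVersion") != 2 or "layers" not in manifest:
        raise ValueError("not a schema 2 image manifest")
    return [(layer["digest"], layer["size"]) for layer in manifest["layers"]]


def fetch_manifest(name, reference="latest"):
    """GET /v2/<name>/manifests/<reference>, e.g. name='library/ubuntu'."""
    token_url = f"{AUTH_URL}?service=registry.docker.io&scope=repository:{name}:pull"
    with urllib.request.urlopen(token_url) as resp:
        token = json.load(resp)["token"]
    request = urllib.request.Request(
        f"{REGISTRY_URL}/v2/{name}/manifests/{reference}",
        headers={"Authorization": f"Bearer {token}", "Accept": MANIFEST_V2},
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

A multi-platform tag may return a manifest list rather than an image manifest, in which case `layer_sizes` raises `ValueError` and a per-platform digest has to be requested instead.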

Kane
2

One more tool: https://github.com/CenturyLinkLabs/dockerfile-from-image

For a GUI, use ImageLayers.io.

Community
resultsway
2

To find all the layers of an image and to find the size for a layer, you can display the manifest from the docker hub registry via the "manifest" experimental feature:

docker manifest inspect ubuntu

The result is a JSON file (only the first lines are shown here):

{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 529,
         "digest": "sha256:10cbddb6cf8568f56584ccb6c866203e68ab8e621bb87038e254f6f27f955bbe",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 529,
         "digest": "sha256:dd375524d7eda25a69f9f9790cd3e28855be7908e04162360dd462794035ebf7",
         "platform": {
            "architecture": "arm",
            "os": "linux",
            "variant": "v7"
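
The output above is a manifest list: each entry points to a per-platform manifest, and the layers themselves are listed in that per-platform manifest. A small Python sketch for picking the digest of one platform is shown below; `digest_for` is a hypothetical helper, and it assumes the JSON has the `manifests` / `platform` structure shown in this output.

```python
def digest_for(manifest_list, architecture, os="linux", variant=None):
    """Return the digest of the first entry matching the platform, or None."""
    for entry in manifest_list.get("manifests", []):
        platform = entry.get("platform", {})
        if (
            platform.get("architecture") == architecture
            and platform.get("os") == os
            and (variant is None or platform.get("variant") == variant)
        ):
            return entry["digest"]
    return None
```

The returned digest can then be inspected with `docker manifest inspect ubuntu@<digest>`, which should list the layers and their sizes for that platform.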
Sandra Rossi
0

Not exactly the original question, but to find the total size of all the images without double-counting shared layers, the following is useful (Ubuntu 18):

sudo du -h -d1  /var/lib/docker/overlay2 | sort -h
Oliver
-5

I've solved this problem by using the search function on Docker's website, where '*' is a valid search that returns 200k repositories, and then crawling each individual page. HTML parsing allows me to extract all the image names on each page.

Piet