An external organization that I work with has given me access to a private (auth token protected) docker registry, and eventually I would like to be able to query this registry, using docker's HTTP API V2, in order to obtain a list of all the repositories and/or images available in the registry.

But before I do that, I'd first like to get some basic practice with constructing these types of API queries on a public registry such as Docker Hub. So I've gone ahead and registered myself with a username and password on Docker Hub, and also consulted the API V2 documentation, which states that one may request an API version check as:

GET /v2/

or request a list of repositories as:

GET /v2/_catalog

Using curl, together with the username and password that I used in order to register my Docker Hub account, I attempt to construct a GET request at the command line:

stachyra> curl -u stachyra:<my_password> -X GET https://index.docker.io/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}
stachyra> curl -u stachyra:<my_password> -X GET https://index.docker.io/v2/_catalog
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"registry","Class":"","Name":"catalog","Action":"*"}]}]}

where of course, in place of <my_password>, I substituted my actual account password.

The response that I had been expecting from this query was a giant json message, listing thousands of repository names, but instead it appears that the API is rejecting my Docker Hub credentials.

Question 1: Do I even have the correct URL (index.docker.io) for the docker hub registry? (I made this assumption in the first place based upon the status information returned by the command line tool docker info, so I have good reason to think it's correct.)

Question 2: Assuming I have the correct URL for the registry service itself, why does my query return an "UNAUTHORIZED" error code? My account credentials work just fine when I attempt to login via the web at hub.docker.com, so what's the difference between the two cases?

stachyra

5 Answers

Do I even have the correct URL

  • "Docker" is a protocol; "DockerHub" is a product that implements the Docker protocol but is not limited to it. The Docker APIs are also implemented by other providers, such as:
    • GitLab (registry.gitlab.com)
    • GitHub CR (ghcr.io)
    • GCP GCR (gcr.io)
    • AWS ECR (public.ecr.aws & <account_id>.dkr.ecr.<region>.amazonaws.com)
    • Azure ACR (<registry_name>.azurecr.io)
  • index.docker.io hosts DockerHub's implementation of the Docker registry API.
  • hub.docker.com hosts the richer, DockerHub-specific APIs.
  • NOTE: DockerHub implements the generic Docker HTTP API V2, but it doesn't implement the _catalog endpoint from that generic API set.

why does my query return an "UNAUTHORIZED" error code?

To use the Docker V2 API, a JWT auth token needs to be generated from https://auth.docker.io/token for each call, and that token has to be sent as a Bearer token in the DockerHub calls to index.docker.io.

When we hit a DockerHub API such as https://index.docker.io/v2/library/alpine/tags/list without a token, it returns 401 along with information about the missing pre-flight auth call. That information is in the www-authenticate response header of the failed request.

e.g.: www-authenticate: Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="repository:library/alpine:pull",error="invalid_token"

This means we need to explicitly call the following API to obtain the auth token:

https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/alpine:pull
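
As a rough illustration of that step, the token URL can be derived mechanically from the www-authenticate header. This is only a sketch: the helper names `parse_bearer_challenge` and `token_url` are made up for this example, and it assumes the header has the simple `key="value"` shape shown above.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header: str) -> dict:
    """Split a 'Bearer key="value",...' challenge into a dict of its parameters."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    # Each parameter has the form key="value"; values may contain ':' and '/'.
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def token_url(challenge: dict) -> str:
    """Build the token endpoint URL from the realm, service and scope."""
    query = urlencode(
        {"service": challenge["service"], "scope": challenge["scope"]},
        safe=":/",  # leave ':' and '/' unescaped so the scope stays readable
    )
    return f"{challenge['realm']}?{query}"


# The header value from the failed request shown above.
header = (
    'Bearer realm="https://auth.docker.io/token",'
    'service="registry.docker.io",'
    'scope="repository:library/alpine:pull",'
    'error="invalid_token"'
)
print(token_url(parse_bearer_challenge(header)))
```

Reading the realm, service and scope from the header instead of hard-coding them means the same code should also work against other registries that use this challenge format.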

The https://auth.docker.io/token endpoint works without any auth for public repos. To access a private repo, we need to add HTTP Basic auth to the request.

https://<username>:<password>@auth.docker.io/token?service=registry.docker.io&scope=repository:<repo>:pull
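
A minimal sketch of the same request in Python, using only the standard library. The function name `build_token_request` is hypothetical. Credentials are sent in an explicit Authorization header rather than embedded in the URL, since urllib does not turn `user:pass@host` into Basic auth on its own. The function only prepares the request; nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import base64
import urllib.request
from urllib.parse import urlencode


def build_token_request(repo, username=None, password=None):
    """Prepare (but do not send) a token request for pulling from `repo`."""
    query = urlencode(
        {"service": "registry.docker.io", "scope": f"repository:{repo}:pull"},
        safe=":/",
    )
    request = urllib.request.Request(f"https://auth.docker.io/token?{query}")
    if username is not None and password is not None:
        # HTTP Basic auth is the base64 encoding of "username:password".
        raw = f"{username}:{password}".encode("utf-8")
        encoded = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {encoded}")
    return request


# Public repo: no credentials needed.
anonymous = build_token_request("library/alpine")
print(anonymous.full_url)
```

To actually fetch a token, the prepared request would be passed to `urllib.request.urlopen` and the JSON body parsed; as far as I know the JWT is returned in a `token` field, but treat that as an assumption and check the response you get.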

NOTE: auth.docker.io will generate a token even if the request is not valid (invalid credentials, scope, or anything else). To validate the token, we can parse the JWT (e.g. at jwt.io) and check the access field in the payload; it should contain the requested scope.
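
The same check can be done locally instead of pasting a token into a website. The sketch below decodes the JWT payload without verifying the signature, which is enough for inspection only. It assumes each entry of the access field carries `name` and `actions` keys; the helpers `jwt_payload`, `grants` and `make_example_token` are names invented for this example, and the token it builds is fake.

```python
import base64
import json


def jwt_payload(token: str) -> dict:
    """Decode the payload (second segment) of a JWT without verifying the signature."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = segments[1]
    # base64url segments have their '=' padding stripped; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def grants(token: str, repo: str, action: str = "pull") -> bool:
    """Return True if the token's access list includes `action` on `repo`."""
    for entry in jwt_payload(token).get("access") or []:
        # Assumed entry shape: {"type": ..., "name": ..., "actions": [...]}
        if entry.get("name") == repo and action in entry.get("actions", []):
            return True
    return False


def make_example_token(payload: dict) -> str:
    """Build an unsigned, fake JWT purely to demonstrate the helpers above."""
    def encode(obj):
        raw = json.dumps(obj).encode("utf-8")
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return ".".join([encode({"alg": "none"}), encode(payload), "signature"])


valid = make_example_token(
    {"access": [{"type": "repository", "name": "library/alpine", "actions": ["pull"]}]}
)
rejected = make_example_token({"access": []})  # what a bad scope or login might yield
print(grants(valid, "library/alpine"), grants(rejected, "library/alpine"))
```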

Kanak Singhal
  • 3,074
  • 1
  • 19
  • 17
  • By using a `GET` request with the correct `parameters` you can get access? How can it be mitigated? – fp007 Jul 28 '23 at 16:40

Here is an example program to read repositories from a registry. I used it as a learning aid with Docker Hub.

#!/bin/bash

set -e

# set username and password
UNAME="username"
UPASS="password"

# get token to be able to talk to Docker Hub
TOKEN=$(curl -s -H "Content-Type: application/json" -X POST -d '{"username": "'"${UNAME}"'", "password": "'"${UPASS}"'"}' https://hub.docker.com/v2/users/login/ | jq -r .token)

# get list of repos for that user account
REPO_LIST=$(curl -s -H "Authorization: JWT ${TOKEN}" \
  "https://hub.docker.com/v2/repositories/${UNAME}/?page_size=10000" | jq -r '.results|.[]|.name')

# build a list of all images & tags
for i in ${REPO_LIST}
do
  # get tags for repo
  IMAGE_TAGS=$(curl -s -H "Authorization: JWT ${TOKEN}" \
    "https://hub.docker.com/v2/repositories/${UNAME}/${i}/tags/?page_size=10000" | jq -r '.results|.[]|.name')

  # build a list of images from tags
  for j in ${IMAGE_TAGS}
  do
    # add each tag to list
    FULL_IMAGE_LIST="${FULL_IMAGE_LIST} ${UNAME}/${i}:${j}"
  done
done

# output list of all docker images
for i in ${FULL_IMAGE_LIST}
do
  echo ${i}
done

(This comes from an article on the Docker site that describes how to use the API.)

In essence...

  • get a token
  • pass the token as a header Authorization: JWT <token> with any API calls you make
  • the API call you want to use to list repositories is https://hub.docker.com/v2/repositories/<username>/
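
For comparison, here is a rough Python sketch of the listing step that follows the `next` links in each response instead of relying on one very large `page_size`. The `results` and `next` fields are the ones visible in Docker Hub's JSON responses; the function names `fetch_json` and `iter_results` are made up for this example, and error handling is left out.

```python
import json
import urllib.request


def fetch_json(url, token=None):
    """GET `url` and parse the JSON body, sending the JWT header if a token is given."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"JWT {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_results(url, fetch=fetch_json):
    """Yield every item in `results`, following `next` links until they run out."""
    while url:
        page = fetch(url)
        yield from page.get("results", [])
        url = page.get("next")  # null/None on the last page ends the loop
```

With a real account, usage would look something like `for repo in iter_results("https://hub.docker.com/v2/repositories/<username>/"): print(repo["name"])`. The `fetch` parameter exists so an authenticated fetcher (for example `lambda u: fetch_json(u, token)`) or a stub for testing can be swapped in.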
William Desportes
starfry
  • Great explanation, thank you! I noticed that by default it has access to only public repositories (like `curl -s -H "Authorization: TOKEN" https://hub.docker.com/v2/repositories/USERNAME/` and `curl -s -H "Authorization: TOKEN" https://hub.docker.com/v2/repositories/USERNAME/?is_private=true` does output the same result). Unfortunately with this parameter `?is_private=true` nothing outputs : do you know if it's possible? Regards! – lboix Mar 09 '20 at 22:25
  • Sorry I don't think that program is very robust, it's just an example. It only lists repositories having tags (which you may not have if you're just testing). I have just tried it and it returns all repos by default. If you added `?is_private=true` then you only get private ones, and similarly if you pass `false`. Add some prints to the program so it outs the values of `i` (repo) and `j` (tag) and you should see what you expect. – starfry Mar 10 '20 at 10:44
  • Thanks for your quick reply! Unfortunately in my case using this URL without this parameter or with the value false works (my only public repo outputs), but when using this parameter with the value true none of my private repos outputs I just see `{"count": 0, "next": null, "previous": null, "results": []}` I will still try to dig here and let you know if I find something. Have a great day! – lboix Mar 10 '20 at 14:38

This site says we cannot :(

Dockerhub hosts a mix of public and private repositories, but does not expose a catalog endpoint to programmatically list them.

hudac

I have modified https://stackoverflow.com/a/60549026/7281491 so that I can list the Docker Hub images of any other user or organization:

#!/bin/bash

set -e

# User to search for
UNAME=${1}


# Put your own Docker Hub token here.
# You can use the pass command or the 1Password CLI to store the PAT.
TOKEN=dckr_pat_XXXXXXXXXXXXXXXXXXXXXXXx


# get list of namespaces accessible by user (not in use right now)
#NAMESPACES=$(curl -s -H "Authorization: JWT ${TOKEN}" https://hub.docker.com/v2/repositories/namespaces/ | jq -r '.namespaces|.[]')

# get list of repos for that user account
REPO_LIST=$(curl -s -H "Authorization: JWT ${TOKEN}" "https://hub.docker.com/v2/repositories/${UNAME}/?page_size=10000" | jq -r '.results|.[]|.name')

# build a list of all images & tags
for i in ${REPO_LIST}
do
  # get tags for repo
  IMAGE_TAGS=$(curl -s -H "Authorization: JWT ${TOKEN}" "https://hub.docker.com/v2/repositories/${UNAME}/${i}/tags/?page_size=10000" | jq -r '.results|.[]|.name')

  # build a list of images from tags
  for j in ${IMAGE_TAGS}
  do
    # add each tag to list
    FULL_IMAGE_LIST="${FULL_IMAGE_LIST} ${UNAME}/${i}:${j}"
  done
done

# output list of all docker images
for i in ${FULL_IMAGE_LIST}
do
  echo ${i}
done

Sample output:

gitlab/gitlab-ce:latest
gitlab/gitlab-ce:nightly
gitlab/gitlab-ce:15.5.9-ce.0
gitlab/gitlab-ce:15.6.6-ce.0
gitlab/gitlab-ce:rc
gitlab/gitlab-ce:15.7.5-ce.0
gitlab/gitlab-ce:15.7.3-ce.0
gitlab/gitlab-ce:15.5.7-ce.0
gitlab/gitlab-ce:15.6.4-ce.0
gitlab/gitlab-ce:15.7.2-ce.0
gitlab/gitlab-ce:15.7.1-ce.0
gitlab/gitlab-ce:15.7.0-ce.0
gitlab/gitlab-ce:15.6.3-ce.0
gitlab/gitlab-ce:15.5.6-ce.0
gitlab/gitlab-ce:15.6.2-ce.0
gitlab/gitlab-ce:15.4.6-ce.0
gitlab/gitlab-ce:15.5.5-ce.0
.....
Omer Sen
    I have also modified to match requested docker image https://gist.github.com/omerfsen/fc2b2b32d4c91dddbaf391aeb385acc9 – Omer Sen Jan 21 '23 at 17:21

Here's Python code to do the very same. It can access both your organization's and your own private repos.

Side note: I have another bunch of code that can access manifests, but only on private/public user repos, not organization-level repos. Does anyone know why that is?

import requests

docker_username = ""
docker_password = ""
docker_organization = ""


auth_url = "https://hub.docker.com/v2/users/login/"
auth_data = {
    "username": docker_username,
    "password": docker_password
}
auth_response = requests.post(auth_url, json=auth_data)
auth_response.raise_for_status()
docker_hub_token = auth_response.json()["token"]

repositories_list = f"https://hub.docker.com/v2/repositories/{docker_username}/?page_size=100"
# repositories_list = f"https://hub.docker.com/v2/repositories/{docker_organization}/?page_size=100"
repos_headers = {
    "Authorization": f"JWT {docker_hub_token}"
}
repos_response = requests.get(repositories_list, headers=repos_headers)
repository_list = repos_response.json()["results"]
for repo in repository_list:
    namespace = repo["namespace"]
    repo_name = repo["name"]
    combined_name = f"{namespace}/{repo_name}"
    print(combined_name)

beeeliu