13

I have several files on a server that I need to download from an Ansible playbook. Because the connection is likely to be interrupted, I would like to check their integrity after download.

I'm considering two approaches:

  1. Store the md5 of those files in ansible as vars
  2. Store the md5 of those files on the server as files with the extension .md5. Such a pair would look like: file.extension and file.extension.md5.

The first approach introduces overhead in maintaining the md5s in Ansible: every time someone adds a new file, they need to make sure they add the md5 in the right place.

Its advantage is that it is already supported: the get_url module has a built-in check through its checksum parameter (checksum=md5:...). E.g.:

- get_url: url=http://example.com/path/file.conf dest=/etc/foo.conf checksum=md5:66dffb5228a211e61d6d7ef4a86f5758

The second approach is more elegant and narrows the responsibility. When someone adds a new file on the server, they make sure to add the .md5 as well and do not even need to touch the Ansible playbooks.

Is there a way to use the checksum approach to match the md5 from a file?

5 Answers

22

If you wish to go with your method of storing the checksum in files on the server, you can definitely use the get_url checksum arg to validate it.

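Download the .md5 file and read it into a var. Note that the file lookup reads from the Ansible control node, not from the managed host, so the .md5 file must have been downloaded there (for example, by a get_url task with delegate_to: localhost):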

- set_fact:
    md5_value: "{{ lookup('file', '/etc/myfile.md5') }}"

And then when you download the file, pass the contents of md5_value to get_url:

- get_url:
    url: http://example.com
    dest: /my/dest/file
    checksum: "md5:{{ md5_value }}"
    force: true

Note that it is vital to specify a path to a file in dest; if you set this to a directory (and have a filename in url), the behavior changes significantly.

Note also that you probably need force: true, which causes a new file to be downloaded on every run. The checksum is only verified when a file is actually downloaded: if the file already exists on your host, the sum of the existing file is not validated, which might not be desirable.

To avoid downloading every time, you could stat the file to see whether it already exists, check what its sum is, and set the force param conditionally.

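# Compute the MD5 of any existing file (checksum_algorithm defaults to sha1)
- stat:
    path: /my/dest/file
    checksum_algorithm: md5
  register: existing_file

# Force a new download only when the existing file does not match
- set_fact:
    force_new_download: "{{ existing_file.stat.checksum != md5_value }}"
  when: existing_file.stat.exists

- get_url:
    url: http://example.com
    dest: /my/dest/file
    checksum: "md5:{{ md5_value }}"
    force: "{{ force_new_download | default(false) }}"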

Also, if you are pulling the sums/artifacts from some sort of web server you can actually get the value of the sum right from the url without having to actually download the file to the host. Here is an example using a Nexus server that would host the artifacts and their sums:

- set_fact:
    md5_value: "{{ item }}"
  with_url: http://my_nexus_server.com:8081/nexus/service/local/artifact/maven/content?g=log4j&a=log4j&v=1.2.9&r=central&e=jar.md5

This could be used in place of using get_url to download the md5 file and then using lookup to read from it.
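
Because the file lookup reads from the control node, a variant that keeps everything on the managed host may be useful. The following is only a sketch under assumptions: the URLs and the /tmp path are hypothetical placeholders, and the .md5 file is assumed to contain either the bare hash or an md5sum-style "hash filename" line. It uses slurp, which returns the remote file's content base64-encoded.

```yaml
- name: Download the checksum file onto the managed host
  get_url:
    url: http://example.com/path/file.conf.md5   # hypothetical URL
    dest: /tmp/file.conf.md5                     # hypothetical path
    force: true   # refetch each run so a stale checksum file is not reused

- name: Read the checksum file back from the managed host
  slurp:
    src: /tmp/file.conf.md5
  register: md5_file

- name: Keep only the hash (md5sum-style files also contain the file name)
  set_fact:
    md5_value: "{{ (md5_file.content | b64decode).split() | first }}"

- name: Download the real file and verify it against the hash
  get_url:
    url: http://example.com/path/file.conf       # hypothetical URL
    dest: /etc/foo.conf
    checksum: "md5:{{ md5_value }}"
```

The split step is what makes the sketch tolerant of both checksum file layouts; if the files on the server only ever contain the bare hash, a trim filter would do the same job.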

ssc
barnesm999
  • This cannot be right. The file lookup plugin only works on localhost; it cannot look up remote files, whereas get_url stores files at the remote dest. Thus the whole concept is actually wrong. – Wang Dec 28 '18 at 15:57
  • At least add `delegate_to: 127.0.0.1` to your get_url task – Wang Dec 28 '18 at 16:07
  • when using `stat`, also set `checksum_algorithm`, added in 2.0 of ansible.builtin – Iron Bishop Mar 31 '22 at 15:44
  • In the [latest docs for `get_url`](https://docs.ansible.com/ansible/latest/collections/ansible/builtin/get_url_module.html) it's written that _If the checksum does not equal destination_checksum, the destination file is deleted._ So it's safe to use it without `force: true`, assuming nobody else writes the file. – Petr Jul 08 '22 at 10:33
3

With the stat module:

# checksum_algorithm defaults to sha1, so request md5 explicitly
- stat:
    path: "path/to/your/file"
    checksum_algorithm: md5
  register: your_file_info

- debug:
    var: your_file_info.stat.checksum
modle13
2

An elegant solution is to combine the following three features provided by Ansible itself:

  1. http://docs.ansible.com/ansible/stat_module.html

    Use the stat module to extract the md5 value and register it in a variable.

  2. http://docs.ansible.com/ansible/copy_module.html

    While using the copy module to copy the file from the server, register the md5 return value in another variable.

  3. http://docs.ansible.com/ansible/playbooks_conditionals.html

    Use a conditional to compare the two variables and report whether the file was copied properly.
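
The three steps above could be sketched roughly as follows. This is an illustration under assumptions, not the answerer's exact playbook: the src and dest paths are hypothetical, copy transfers the file from the control node, and the comparison uses SHA-1 rather than md5 because the copy module's checksum return value is a SHA-1 hash.

```yaml
- name: Copy the file and keep the module's return values
  copy:
    src: files/file.conf        # hypothetical path on the control node
    dest: /etc/foo.conf         # hypothetical destination
  register: copy_result

- name: Compute the SHA-1 checksum of the file that landed on the host
  stat:
    path: /etc/foo.conf
    checksum_algorithm: sha1
  register: dest_info

- name: Fail the play if the two checksums differ
  assert:
    that:
      - dest_info.stat.checksum == copy_result.checksum
```

Using assert instead of a debug task with a when clause makes the play stop on a mismatch rather than just printing a message.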

Community
1

Another solution is to use url lookup (tested on ansible-2.3.1.0):

- name: Download
  get_url:
    url: "http://localhost/file"
    dest: "/tmp/file"
    checksum: "md5:{{ lookup('url', 'http://localhost/file.md5') }}"
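
As far as I can tell from the current get_url documentation, the checksum parameter also accepts a URL in place of the hash, in which case the module fetches the checksum file itself. A minimal sketch, assuming an Ansible version recent enough to support this form and reusing the same placeholder URLs:

```yaml
- name: Download and let get_url fetch the checksum file itself
  get_url:
    url: "http://localhost/file"
    dest: "/tmp/file"
    # Format is <algorithm>:<checksum or URL of a checksum file>
    checksum: "md5:http://localhost/file.md5"
```

This avoids the lookup entirely, so nothing has to be fetched on the control node.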
alius.miles
0

I wrote an Ansible module with the help of https://pypi.org/project/checksumdir

The module can be found here

Example:

- get_checksum:
    path: path/to/directory
    checksum_type: sha256  # one of sha1, md5, sha256, sha512
  register: checksum
Rahul