I want to get the MD5, the SHA-1 and the SHA-256 of all the files on my computer.

The expected output is "the file name","the md5","the size".

main()
{
    liste=`sudo ls -R`
  for l in $liste
  do
    #echo $l
    g=`md5sum $l`
    printf "\"$l\","
    echo $g | awk '{printf("\"%s\",",$1)}'
    ls -lh $l | awk '{printf("\"%s\",",$5)}'
    printf "\n"
done
}
cd /
main

This is not working because it can't use md5sum in another directory.

So I get this error message:

md5sum: rc6.d: No such file or directory
"rc6.d","",ls: cannot access 'rc6.d': No such file or directory

How do I get access to the files?

I tried:

#!/usr/bin/env bash
main()
{
    liste=`sudo find`
  for l in $liste
  do
    #echo $l
    g=`md5sum $l`
    printf "\"$l\","
    echo $g | awk '{printf("\"%s\",",$1)}'
    ls -lh $l | awk '{printf("\"%s\",",$5)}'
    printf "\n"
done
}
cd /
main

But I get this:

find: ‘./mnt/c/ProgramData/Microsoft/Windows NT/MSFax’: Permission denied
find: ‘./mnt/c/ProgramData/Packages’: Permission denied
find: ‘./mnt/c/ProgramData/VMware/VMware USB Arbitration Service’: Permission denied
find: ‘./mnt/c/ProgramData/WindowsHolographicDevices’: Permission denied
find: ‘./mnt/c/System Volume Information’: Permission denied
find: ‘./mnt/c/Users/cypri/AppData/Local/Packages/CanonicalGroupLimited.Ubuntu_79rhkp1fndgsc/LocalState/rootfs’: Permission denied
find: ‘./mnt/c/Users/cypri/AppData/Local/Packages/CanonicalGroupLimited.Ubuntu_79rhkp1fndgsc/LocalState/temp/{05418818-9381-4d3c-9934-ac417ee93067}’: Permission denied
find: ‘./mnt/c/Users/cypri/AppData/Local/Temp/WYU9188.tmp.dir’: Permission denied
find: ‘./mnt/c/Windows/appcompat/Programs’: Permission denied
find: ‘./mnt/c/Windows/CSC’: Permission denied

The best command I have found so far is:

find -type f -readable -printf '%kkB ' -exec md5sum -- {} \;

How do I get the SHA-1 and the MD5 on the same line: "sha1","md5",

I tried: find -type f -readable -printf '%kkB ' -exec md5sum -exec sha1 -- {} \; but it didn't work.

akmot
  • Please [edit] your question and show an example of the existing files and directories and the expected and actual output matching these files. Parsing the output of `ls` might be problematic. Did you check what `echo $l` will show in your loop? I suggest to read the documentation of `find` and its predicate `-type f`. – Bodo Apr 05 '22 at 14:06
  • I replaced ls -R with find and I get access denied. I think md5sum doesn't work with a path as an argument. – akmot Apr 05 '22 at 14:27
  • md5sum doesn't like broken symbolic links. – Yann Droneaud Apr 05 '22 at 14:37
  • If you use `find` instead of `ls`, you can tell it to filter only for real files, _and_ for readable ones only. – Charles Duffy Apr 05 '22 at 14:38
  • ...your current code also has serious problems with names with spaces, which are something else `find` can avoid when used correctly. (`liste=$(sudo find)` is **not** correct, because you can't store a list of arbitrary filenames in a string -- the only correct variable type to store a list of filenames is an _array_; see also [BashPitfalls #1](https://mywiki.wooledge.org/BashPitfalls#for_f_in_.24.28ls_.2A.mp3.29)). – Charles Duffy Apr 05 '22 at 14:39
  • See [Why you shouldn't parse the output of `ls`](https://mywiki.wooledge.org/ParsingLs), and [Using Find](https://mywiki.wooledge.org/UsingFind). And run your code through http://shellcheck.net/, and read the links associated with each warning it throws. – Charles Duffy Apr 05 '22 at 14:39
  • `find /mnt/c -type f -readable -exec md5sum -- {} +` is not a bad place to start, for your stated goal. – Charles Duffy Apr 05 '22 at 14:42
  • @cyp Your code ```liste=`sudo find` ``` will perform word-splitting on find's output, which will be problematic if you have file names with spaces. Use `find`'s `-exec` action instead, or, if you have GNU `find` and `bash`, use `-print0` and `IFS= read -r -d $'\0'`, see e.g. https://stackoverflow.com/a/1120952/10622916. The `Permission denied` error results from the fact that you execute `find` as `root` but all other commands as a normal user. There may be files that are not readable for your normal user. Furthermore, you should use `find`'s predicates to find regular files only. – Bodo Apr 06 '22 at 08:44

3 Answers

1
find / -type f -not \( -path '/dev/*' -or -path '/proc/*' -or -path '/sys/devices/*' \) -print0 |
  xargs -0 bash -c 'paste -d " " <(md5sum "$@") <(sha1sum "$@") <(sha256sum "$@") <(du -lh "$@")' bash |
  tr -s ' ' | cut -d ' ' -f 1,3,5,7-

Here I use docker for demonstration:

docker run --rm -i ubuntu:20.04 bash <<'SCRIPT'
find / -type f -not \( -path '/dev/*' -or -path '/proc/*' -or -path '/sys/devices/*' \) -print0 |
  xargs -0 bash -c 'paste -d " " <(md5sum "$@") <(sha1sum "$@") <(sha256sum "$@") <(du -lh "$@")' bash |
  tr -s ' ' | cut -d ' ' -f 1,3,5,7-
SCRIPT
d41d8cd98f00b204e9800998ecf8427e da39a3ee5e6b4b0d3255bfef95601890afd80709 e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 0    /etc/subgid
3aa8b92d1dd6ddf4daaedc019662f1dc cbd89fb1fa310fc4bc46866081454d5747922cd2 29128d49b590338131373ec431a59c0b5318330050aac9ac61d5098517ac9a25 4.0K /etc/bash.bashrc
f9a1deea3a8fde4f992cc63ff939d923 c2a6d3cf400902bf8e4e97f84be440f607498e7c a49ab4136679f3fa4760385cb2e4e1c060afacf3c9e1a46d6be8717f59339184 4.0K /etc/apt/sources.list
0081b49fa709cfd95827c75297b75ddd 76a4711ebfd6282a9fd9e8fffd8a83ca3fc66baa 4f4226614def131b475bcdaf1e5dda59a1af62fee838cbfc02a01c25e614efc1 4.0K /etc/apt/apt.conf.d/docker-autoremove-suggests
ab6540f7278a05a4b7f9e58afcaa5f46 1a02e2f81f99a2fb621baf0cce0b332694982366 93e7e6d2fdb36b04cb10127e3b0d1b9d19d822327fd959484639bbbd65cce004 4.0K /etc/apt/apt.conf.d/01autoremove
b02a49af378158c2c45158e991f9a987 2897d70bff6558284d8b6c7e806028ecca62d449 49d72455b2eaa50fa4b09b3ecffff753f3b90193266b2999f30e3f696403fde2 4.0K /etc/apt/apt.conf.d/docker-clean
c69ce53f5f0755e5ac4441702e820505 43742ca9cbd8e8c18241a9b38aa302d92b0fa51c 364d5eeac5475b7dddfd629899ea88b91ed8d8e8e319c29bb9dbd6772e87ed55 4.0K /etc/apt/apt.conf.d/01-vendor-ubuntu
7e9d09d5801a42b4926b736b8eeabb73 8d02d7c5507330294f8eba69adc413e35c70225b db749e19baf3b72ca2c157c70c52522cae23d94bc8b2dc5793fd43d427445367 4.0K /etc/apt/apt.conf.d/70debconf
1caa06055a7a74a29c4476dccfb4726a 171bf894becc10c148b20d93056b2060ae113733 151044925acbe5c83434424d1b628e208790ecb2db0aabbe83dc8622495cf846 4.0K /etc/apt/apt.conf.d/docker-gzip-indexes
341190f50b907d798bcb98c9e0d9cb07 abbf3fba2dbc78da2423167d3bfdda895a09e100 5970f921a86106d617c00deb6b2c7e5a37922fda8f2296bf43eebb83a72a2a4a 4.0K /etc/apt/apt.conf.d/docker-no-languages
.
.
.

0d5b70e555bab97581ee2b1d9943c976 11777fb50cde6629882c35ced448a4e0c91115dc 45e22485d11dde5533823fac333c06ed0b6a661672e4d6eef21ed31f34bb1696 68K  /usr/bin/iconv
cb3a8469b0946d8012c0f8d7d051ffeb 1e04edc86fc20a35bbdb37865a492630727a1062 7f2e960e493c4586fbe6b11af0df148bdc93ead01e07598f281525df6edbef34 108K /usr/bin/du
d8d3ce4d7f4b1e3ac3c3e7c9790f22ca 65324834933228e25e4a1b31cd7277214b6208ab 74af07c3d96e5fda903ac3df226b555ea6cad5d3b0aa250f3cea7713128bddf2 44K  /usr/bin/nohup
bc5d246b4aedfd673f7c21fa7755a1cc 2336f2d23914d92d5c1c4019da797a7a8c24dda4 0c2134b4ee553b4fca4c011b5d5e6f0beb9394da7470cb925e5224196034e5e7 24K  /usr/bin/getopt
40cc72ca80c0257ba6f5983af29fd589 f3f087097eab01d63aec5e500d82098eb563aac5 9b7c3f8459d41c2809f695040da541e17c057179c31c4f0be79a156104c8b633 52K  /usr/bin/sha1sum
ac3b723b669e5d18ff9d9e5eecbf8869 f93e9b203d649d17dc1f18171abe6c64eff73b6d 6fbe7ade17043109f971ef60063700e92055e7c155210c2945b0851e96356dee 24K  /usr/bin/tload
57b95ef16bac660b9d04b8fbfba65302 070fa89ff800e77eaece5bdf987b73f050d39488 71d27457a20148a13eda31216cd7bd56324a443295c8e15f958ed768f18061e4 16K  /usr/bin/locale-check
cf277664b1771217d7006acdea006db1 17d380175c89fb145357edd7f1356f6274bfc762 34fbc467b8c624d92abcdf3edcf35ee46032618a6f23b210efab0e6824978126 4.0K /root/.bashrc
d68ce7c7d7d2bb7d48aeb2f137b828e4 8e5d66ea938b5118633a4bd8c1d1e93376cd4e9d bbee58b1e0787bb851e7f7a4d0c187a8122d68eb67e5fa464696310398ac005b 4.0K /root/.profile
d41d8cd98f00b204e9800998ecf8427e da39a3ee5e6b4b0d3255bfef95601890afd80709 e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 0    /.dockerenv

How does it work?

  1. You might want to exclude folders such as /dev and /proc, because reading files like /dev/console and /dev/random would make md5sum run forever. Here we use find -not \( -path ... \) to skip those folders.

  2. find -print0 and xargs -0 help you handle filenames that contain white spaces.

  3. paste is useful to merge files line by line, for example:

    $ paste <(seq 3) <(seq 3)
    1       1
    2       2
    3       3
    
  4. du by default counts hard-linked files only once. We need du "$@" and md5sum "$@" to generate the same number of lines, so we apply du -l; each output then has the same alignment for paste to process.

  5. The trailing bash in xargs bash -c '...' bash becomes $0 when the -c flag is used. It can be any string, but you should always pick a sensible name for $0.
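
The pipeline above splits its fields on spaces, so a path that contains a space shifts the columns. The following is a minimal sketch of one way around that, assuming bash, GNU find and GNU coreutils: it reads NUL-delimited paths one at a time, writes semicolon-separated fields with the path last, and prints none when a value cannot be computed. The function names hash_or_none and hash_tree are hypothetical, and starting three hash programs per file is noticeably slower than the batched pipeline.

```shell
#!/usr/bin/env bash
# Sketch only: hash_or_none and hash_tree are illustrative names, not part of any tool.

# Print the hex digest of file $2 as computed by tool $1, or "none" on failure.
hash_or_none() {
    local out
    if out=$("$1" -- "$2" 2>/dev/null); then
        out=${out#\\}              # GNU tools prefix a backslash for unusual file names
        printf '%s' "${out%% *}"   # keep only the digest, drop the file name
    else
        printf 'none'
    fi
}

# Walk directory $1 and print one line per regular file:
#   md5;sha1;sha256;size-in-bytes;path
hash_tree() {
    local path size
    find "$1" -type f -print0 |
    while IFS= read -r -d '' path; do
        size=$(stat -c %s -- "$path" 2>/dev/null) || size=none
        printf '%s;%s;%s;%s;%s\n' \
            "$(hash_or_none md5sum "$path")" \
            "$(hash_or_none sha1sum "$path")" \
            "$(hash_or_none sha256sum "$path")" \
            "$size" "$path"
    done
}
```

Calling hash_tree /mnt/c would scan that tree. Because the path is the last field, spaces in it do not disturb the earlier columns; a path containing a newline would still break the line-oriented output.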

Weihang Jian
  • Thanks a lot! Why would files like /dev/console and /dev/random, in folders such as /dev and /proc, cause md5sum to run forever? – akmot Apr 12 '22 at 13:07
  • They are not real files, they are character special files or block special files. You can try to `cat /dev/random` to see what would happen, and also Google "device file" for more details. – Weihang Jian Apr 12 '22 at 13:21
  • ``` find .. -type f -not \( -path '/dev/*' -or -path '/proc/*' -or -path '/sys/devices/*' \) -print0 | xargs -0 bash -c 'paste -d " " <(md5sum "$@") <(sha1sum "$@") <(sha256sum "$@") <(du -lh "$@")' bash | tr -s ' ' | cut -d ' ' -f 1,3,5,7- ``` With this script, sometimes it doesn't find the MD5 or the SHA1. Is it possible to print "none" when the script doesn't find a value? – akmot Jul 12 '22 at 08:44
  • Hello, sometimes it puts the SHA1 value in the MD5 slot. How would you prevent that from happening? – akmot Jul 12 '22 at 12:56
  • How can I parse my data so that I get: – akmot Jul 13 '22 at 12:50
  • ;0d5b70e555bab97581ee2b1d9943c976 ;11777fb50cde6629882c35ced448a4e0c91115dc ;45e22485d11dde5533823fac333c06ed0b6a661672e4d6eef21ed31f34bb1696; 68K; ;/usr/bin/iconv; – akmot Jul 13 '22 at 12:51
  • " tr -s ' ' | cut -d ' ' -f 1,3,5,7- " generates errors when parsing paths with spaces in their names, like: /user/bin/hello world/toto.txt – akmot Jul 13 '22 at 12:53
0
find /mnt/c -type f -readable -exec md5sum -- {} +

It's the best answer I have found.

I tried to add %s to get the size of the file too, but I couldn't work out how to put it in the command line.

akmot
  • ``` ls -Ralh | awk '{printf("\"%s\",\"%s\",\n",$9,$5)}' ``` I tried this – akmot Apr 08 '22 at 10:11
  • I had an idea: I want to concatenate the lines of this command with the lines of the command ```find -type f -readable -exec md5sum -- {} +``` – akmot Apr 08 '22 at 10:16
  • Try ... `-exec sh -c 'for file; do md5sum "$file"; stat "$file"; done' _ {} +`, possibly with additional options and/or formatting. – tripleee Apr 10 '22 at 21:04
  • I tested your command line and it takes too long to process all the files. I found this one ```-printf "%k KB"``` but it gives me the sizes of all the files, then the checksum and the file name. The expected output is "the file name","the md5","the size". – akmot Apr 11 '22 at 07:05
  • You have to `stat` the file one way or another. `find` already does that, so if you have e.g. GNU `find` which supports `-printf`, more power to you. Traversing the files is the part which really takes time here, but if you can avoid running `stat` separately, I would guess that can shave off a few per cent of your run time. – tripleee Apr 11 '22 at 07:07
  • Probably ask a new question rather than pile on additional requirements. – tripleee Apr 11 '22 at 07:37
0

If you have GNU find you can use -printf followed by -exec to output the file size followed by the md5sum output on one line.

find /mnt/c -type f -readable -printf '%k kiB\t' -exec md5sum -- {} \;

Because you combine -printf and -exec, md5sum has to be executed separately on each file (\; instead of + as the terminator for -exec), which probably slows you down somewhat.
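
If the per-file process start is the concern, one possible middle ground is to let -exec ... {} + hand batches of files to a small shell loop, as suggested in a comment on another answer. The sketch below assumes GNU stat (for -c %s); the function name csv_md5_size is hypothetical. It prints the "file name","md5","size" layout from the question, with the size in bytes.

```shell
#!/bin/sh
# Sketch only: csv_md5_size is an illustrative name. Assumes GNU stat (-c %s).

# Print "path","md5","size-in-bytes" for every regular file under directory $1.
csv_md5_size() {
    find "$1" -type f -exec sh -c '
        for file; do
            # Hash via stdin so the output never contains the file name.
            sum=$(md5sum < "$file") || continue    # skip files that cannot be read
            size=$(stat -c %s -- "$file") || continue
            printf "\"%s\",\"%s\",\"%s\"\n" "$file" "${sum%% *}" "$size"
        done
    ' sh {} +
}
```

This still runs md5sum and stat once per file, but only one sh per batch. A file name that itself contains a double quote would need extra escaping to remain valid CSV.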

If you really wanted to squeeze performance to the max, perhaps write a simple Perl or Python script to perform both calculations in the same process.

import logging
import os
from hashlib import md5

for dirpath, subdirs, files in os.walk("/mnt/c"):
    for name in files:
        path = os.path.join(dirpath, name)
        try:
            size = os.path.getsize(path)
            digest = md5()
            with open(path, "rb") as contents:
                # Hash in 1 MiB chunks so large files are not read into memory at once.
                for chunk in iter(lambda: contents.read(1 << 20), b""):
                    digest.update(chunk)
            print("%s  %s  %s" % (size, digest.hexdigest(), path))
        except OSError as exc:  # PermissionError is a subclass of OSError
            logging.warning("%s: %s", path, exc)

The result from getsize is the number of bytes in the file; if you want to display human-readable units, you will obviously need to add logic for that. (Adding it isn't hard, but I don't want to guess whether you want 1024-byte kilos or 1000-byte ones, etc.)
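
For a shell-only pipeline, where the byte count might come from stat -c %s or find -printf '%s', GNU coreutils ships numfmt, which can do this conversion for either convention. A minimal sketch, assuming numfmt is installed; the function name human_size is hypothetical:

```shell
#!/bin/sh
# Sketch only: human_size is an illustrative name. Assumes GNU coreutils numfmt.

# Convert a byte count to a human-readable string.
#   $1 = number of bytes
#   $2 = "iec" for 1024-byte units (default) or "si" for 1000-byte units
human_size() {
    # LC_ALL=C keeps the decimal separator a dot regardless of the user locale.
    LC_ALL=C numfmt --to="${2:-iec}" "$1"
}
```

The unit convention is left as a parameter because, as noted above, which one is wanted is a matter of preference.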

tripleee
  • Fortunately, I don't have a Windows system to test this on. I'm guessing there could be more weird and wonderful exceptions from Redmond. – tripleee Apr 11 '22 at 07:31
  • I'm on Linux, and I can't use anything other than bash commands. – akmot Apr 11 '22 at 07:53
  • It's not like `find` or `md5sum` are part of Bash either. But my first solution should definitely fit within your constraints, however poorly defined they are, if you really are on Linux. (`/mnt/c` looks like you are really on Windows, though.) – tripleee Apr 11 '22 at 08:27
  • Most modern Linux distros include at least Perl and Python 2, though my code is for Python 3, and will need some ugly tweaks if you are forced to use the obsolete version 2. With Perl, you also get `find2perl` which would probably allow you to write a quite succinct script, though perhaps not a one-liner. – tripleee Apr 11 '22 at 08:33
  • (Looks like `find2perl` was actually removed from Perl a while back, but you might be on a distro which still has it.) – tripleee Apr 11 '22 at 08:39
  • `find /mnt/c -type f -readable -printf '%k kiB\t' -exec md5sum -- {} \;` is the best command line I have so far. Thanks. – akmot Apr 11 '22 at 09:53