I am reading the lines of a file, converting each one to an md5 hash, and writing the hashes to a second file. I get different results depending on whether I use printf or echo.

printf $line | md5sum | awk '{print $1}' >> md5File.txt

echo $line | md5sum | awk '{print $1}' >> md5File.txt

For printf 00000 becomes dcddb75469b4b4875094e14561e573d8, but for echo 00000 becomes 81b4e43a7bcd862f3ac58b5f8568a668.

I verified that the correct md5 hash for 00000 is dcddb75469b4b4875094e14561e573d8, but I don't understand why echo gives a different result.

Jav_Py
  • echo puts a newline on the end of the string. Use the `-n` option of echo to prevent that – Jerry Jeremiah Mar 02 '21 at 22:46
  • so if I use `echo -n` it would be basically the same as `printf`? Is there a difference between using `echo -n` or `printf`? – Jav_Py Mar 02 '21 at 22:47
  • Not for your specific case. printf is far more flexible and has many more options, but you aren't using any of them. – Jerry Jeremiah Mar 02 '21 at 22:48
  • Ahhh, okay thanks for the help! – Jav_Py Mar 02 '21 at 22:50
  • Use `echo -n`. If you use `printf` and your file contains a line that contains a special sequence such as `%d`, you will get wrong results. – k314159 Mar 02 '21 at 22:59
  • You should nearly always use `printf '%s' "$line"` instead of `printf $line`. `printf '00000\n' | md5sum` will give you the same hash as that of `echo 00000 | md5sum` – M. Nejat Aydin Mar 02 '21 at 23:17
  • Does this answer your question? [Difference between printf and echo in Bash](https://stackoverflow.com/questions/35603323/difference-between-printf-and-echo-in-bash) – Léa Gris Mar 02 '21 at 23:42

1 Answer

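echo $line appends a trailing newline, which you can suppress with the -n option (sometimes; see below). That newline becomes part of the input to md5sum, so you get the hash of 00000 followed by a newline rather than the hash of 00000 alone. So this doesn't work: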

echo $line | md5sum | awk '{print $1}' >> md5File.txt

But in bash this does:

echo -n "$line" | md5sum | awk '{print $1}' >> md5File.txt
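To see the extra byte directly, here is a minimal sketch that counts what each command sends down the pipe. It assumes `wc` is available; the variable names `echo_bytes` and `printf_bytes` are only for illustration.

```shell
# Count the bytes each command actually writes to the pipe.
line=00000

# echo appends a newline, so md5sum would see six bytes: 00000 plus "\n".
echo_bytes=$(echo "$line" | wc -c)

# printf with an explicit %s format writes exactly the five characters.
printf_bytes=$(printf '%s' "$line" | wc -c)

# $(( )) normalizes the padding that some wc implementations add.
printf '%s\n' "echo:   $((echo_bytes)) bytes"
printf '%s\n' "printf: $((printf_bytes)) bytes"
```

The one-byte difference is the whole reason the two hashes disagree.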

But not all versions of echo have an -n option. The POSIX echo documentation says:

If the first operand is -n, or if any of the operands contain a backslash ( '\' ) character, the results are implementation-defined. ... On XSI-conformant systems, if the first operand is -n, it shall be treated as a string, not an option.
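If you want to find out how the echo in your own shell behaves, a rough check might look like the following. The function name `echo_supports_n` is hypothetical, and this is only a sketch: it tests the echo of the shell running the script, not every echo on the system.

```shell
# Hypothetical helper: succeeds if this shell's echo treats -n as an option.
echo_supports_n() {
    # When -n is honored, the only byte written is "x".
    # If echo prints "-n" literally or still adds a newline, the count is larger.
    [ "$(( $(echo -n x | wc -c) ))" -eq 1 ]
}

if echo_supports_n; then
    printf '%s\n' "echo -n suppresses the newline in this shell"
else
    printf '%s\n' "echo -n is not honored here"
fi
```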

Another alternative is using bash's printf command. The printf documentation says:

The printf utility was added to provide functionality that has historically been provided by echo. However, due to irreconcilable differences in the various versions of echo extant, ...

So printf is the reliably portable way to go. Here is a related answer with more details: https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo

But using printf is dangerous if you don't specify a format string, so even though this seems to work:

printf $line | md5sum | awk '{print $1}' >> md5File.txt

It will fail spectacularly when $line contains a percent sign or a backslash, because the first argument to printf is the format string and is interpreted rather than printed literally. A valid sequence such as %d or \n is expanded, and an invalid one makes printf report an error on stderr and write incomplete or empty output to stdout, so the hash is wrong either way. So instead you need:

printf "%s" "$line" | md5sum | awk '{print $1}' >> md5File.txt

The %s tells printf to print the next argument as a plain string (here "$line"), and quoting "$line" keeps the shell from splitting it on whitespace, so you get the right output.
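Putting this together for the original task, one possible sketch of the whole loop is shown below. The helper names `md5_of` and `hash_lines` are made up for this example, and it assumes `md5sum` and `awk` are installed as in the question.

```shell
# Hypothetical helper: print the md5 of exactly the bytes in $1,
# with no trailing newline added and no format characters interpreted.
md5_of() {
    printf '%s' "$1" | md5sum | awk '{print $1}'
}

# A line with a percent sign, a backslash sequence, and spaces is hashed as-is.
tricky='50% off\n now'
md5_of "$tricky"

# Hypothetical helper: hash every line of the file named by $1.
# IFS= and -r keep leading whitespace and backslashes intact while reading.
hash_lines() {
    while IFS= read -r line; do
        md5_of "$line"
    done < "$1"
}
```

With these defined, something like `hash_lines input.txt >> md5File.txt` would write one hash per input line, where `input.txt` is a placeholder for your own file.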

Fun fact: if you did want printf to add a trailing newline (you don't in this case) then you would use:

printf "%s\n" "$line" | md5sum | awk '{print $1}' >> md5File.txt
Jerry Jeremiah
  • What if your `line` contains whitespace? –  Mar 02 '21 at 23:27
  • I see you edited your answer after my comment. –  Mar 02 '21 at 23:32
  • @Roadowl Sure - you were right. I edited it quickly to try to fit in the 5-minute window (but I wasn't quite that fast) and then did a bit more testing for other corner cases before making a comment that I had applied your suggestion. – Jerry Jeremiah Mar 02 '21 at 23:36
  • It's even worse than this; some versions of `echo` will just print `-n` as part of their output (and then go ahead and add a newline anyway). And they may do weird things with any backslashes in the string. `printf "%s" "$line"` is the way to go. – Gordon Davisson Mar 03 '21 at 03:22
  • @GordonDavisson Good catch. I don't have my ksh-based RS6000 running AIX anymore, but if anything could do something unlike the way bash would do it, that would - According to https://www.ibm.com/support/knowledgecenter/ssw_aix_71/e_commands/echo.html `echo -n` probably wouldn't have worked on that box. On the other hand, it might not have had printf either... I've added that warning to the answer. – Jerry Jeremiah Mar 03 '21 at 04:25
  • @JerryJeremiah Even bash's `echo` builtin will print the "-n", depending on certain run- and/or compile-time options. I know this because Mac OS X v10.5.0 shipped with bash compiled to do this by default, and it broke a bunch of my scripts. And that is when I joined the church of `printf`. – Gordon Davisson Mar 03 '21 at 04:42
  • Which are all good reasons just to use `printf` and forget `echo`. Yes, you will be tempted to cheat because typing `echo "stuff"` is shorter than `printf "%s\n" "stuff"`, but don't give in. There is a bash 12-step program to help people free themselves from the `echo` addiction `:)` – David C. Rankin Mar 03 '21 at 04:42