
I'm migrating many bash shell scripts from old versions of Raspbian and Ubuntu to the current Raspbian version. I've made a brand new system installation, including various configuration (text) files that I created for these scripts. I found to my horror that awk-print and awk-printf APPEAR to have changed in the latest version, as evidenced by bash variable-type errors when the values are used. What's going on?

Now that I know the answer, I can explain what happened so others can avoid it. That's why I said, awk-print APPEARS to have changed. It didn't, as I discovered when I checked the version of awk on all three machines. Running:

awk -W version

on all three systems gave the same version, mawk 1.3.3 Nov 1996.

When a text file is small, I find it simplest to cat the file into a variable, grep that variable for a keyword that identifies a particular line (and by extension a particular setting), and use 'tr' and 'awk print' to split the line and assign the value to a variable. Here's an example line from which I want to assign '5' to a variable:

"keyword=5"<line terminator>

That line is one of several read from a text file, so there's at least one line terminator after each line. That line terminator is the key to the problem.

I execute the following commands to read the file, find the line with 'keyword', split the line at '=', and assign the value from that line to bar:

file_contents="$(cat "$filename")"

bar="$(echo -e "$file_contents" | grep "keyword" | tr "=" " " | awk '{print $2}')"

Here's the subtle part. Unbeknownst to me, while I was setting up the new system, the line endings in some of my text files changed from Linux format, with a single terminator per line (\n), to DOS format, with two (\r\n). When I grepped the text file from the keyboard to get the desired line, the value that awk-print assigned to 'bar' therefore ended in a carriage return (\r). That character does NOT appear on screen, because the terminal treats it as a cursor movement rather than a printable character. It's only evident if one executes:

echo ${#bar}

to get the length of the string, or does:

echo -e "$bar"

The hidden terminator shows up as one additional character.
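To make the hidden character visible rather than inferring it from the length, a byte dump helps. The following is a minimal sketch, not part of the original scripts: the value of `bar` is simulated sample data standing in for a value parsed from a CRLF-terminated line.

```shell
# Simulated value (assumption): "5" followed by the stray carriage return.
# Command substitution strips trailing newlines only, so the \r survives.
bar="$(printf '5\r')"

# The length reveals the extra character.
echo "${#bar}"

# od -c dumps each byte, so the carriage return is shown explicitly as \r.
printf '%s' "$bar" | od -c
```

`od -c` is used here because it prints non-printing bytes as escape sequences instead of sending them to the terminal.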

So, the solution to the problem was either to use 'fromdos' to strip the '\r' from each line ending before processing the files, or to remove the unwanted '\r' from each value as it was assigned to a variable. One helpful comment noted that 'cat -vE "$file"' would show every character in the file. Sure enough, the dual terminators were present, shown as '^M$' at the end of each line.
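Where 'fromdos' is not installed, `tr -d '\r'` is a commonly available substitute for the first approach. This sketch is an illustration under assumptions, not the method used above: the file names and contents are made up, and the cleaned copy is written to a temporary directory.

```shell
# Hypothetical sample file with DOS (CRLF) line endings.
tmpdir="$(mktemp -d)"
printf 'keyword=5\r\nother=7\r\n' > "$tmpdir/settings.txt"

# Delete every carriage return, writing a Unix-format copy.
tr -d '\r' < "$tmpdir/settings.txt" > "$tmpdir/settings.unix.txt"

# The original grep/tr/awk pipeline now yields a clean value.
bar="$(grep "keyword" "$tmpdir/settings.unix.txt" | tr "=" " " | awk '{print $2}')"
echo "${#bar}"

rm -rf "$tmpdir"
```

Note that `tr -d '\r'` removes every carriage return in the file, not only those at line ends, which is normally what is wanted for plain configuration files.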

Another helpful comment noted that I was causing multiple sub-processes to run when I parsed each line, slowing execution time, and that a bashism:

${foo//*=/}

could avoid it. That bashism helped parse each line but did not remove the offending '\r'. A second bashism:

${foo//$'\r'/}

removed that '\r'.
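Put together, the two steps might look like the sketch below. It uses the prefix and suffix removal forms (`${foo#*=}`, as suggested in the comments, and `%` for the trailing character) rather than the `//` forms shown above, because these also work in plain sh. The sample line and the helper variable `cr` are assumptions for illustration.

```shell
# Helper holding a single carriage return (hypothetical name).
cr="$(printf '\r')"

# Sample line as it would arrive from a CRLF file, minus the newline.
line="keyword=5$cr"

# Step 1: drop everything up to and including the first '='.
bar="${line#*=}"

# Step 2: drop a trailing carriage return if one is present.
bar="${bar%"$cr"}"

echo "$bar"
echo "${#bar}"
```

No sub-processes are spawned for the parsing itself, and the second expansion is a no-op when the file already has Unix line endings.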

CASE SOLVED

NewtownGuy
  • `print` in `awk` has always added a newline. If you want to print without a newline you have to use `printf()` – Barmar Sep 05 '22 at 22:37
  • How can you tell whether there's a newline? `$()` removes the last newline from the output. – Barmar Sep 05 '22 at 22:39
  • when I run your code w/ `awk 5.1.1` I just get `5` (no `\n` on the end): `typeset -p bar` ==> `declare -- bar="5"`; please update the question with the output from running `typeset -p bar` on both the old and new systems; also please update the question with your `awk` versions from the old and new systems – markp-fuso Sep 05 '22 at 22:41
  • might want to consider a review and rewrite of the old code for simplicity and performance reasons; current code spawns 3 subprocesses to populate `bar` while the same thing can be done with no subprocesses using parameter expansion/substitution, eg, both `bar="${foo//*=/}"` and `bar="${foo#*=}"` leave you with `bar=5` without the expensive overhead of spawning subprocesses – markp-fuso Sep 05 '22 at 23:19
  • Nothing that you describe changed in awk. I suspect your input now contains CRs when it didn't previously, see [why-does-my-tool-output-overwrite-itself-and-how-do-i-fix-it](https://stackoverflow.com/questions/45772525/why-does-my-tool-output-overwrite-itself-and-how-do-i-fix-it). – Ed Morton Sep 06 '22 at 02:14
  • Why are you using a *command substitution* to begin with when `bar="${foo#*=}"` will do without spawning unnecessary subshells? – David C. Rankin Sep 06 '22 at 04:23
  • I know how to do command substitution. – NewtownGuy Sep 06 '22 at 11:59

3 Answers

0
#!/bin/sh -x

echo "value=5" | tr "=" "\n" > temp
echo "1,2p" | ed -s temp

I have come to view Ed as UNIX's answer to the lightsaber.

petrus4
0

I found a format string, '"%c", $2' to use with printf in the current awk, but I have to use '"%s", $2' in the old version. Note '%c' vs '%s'.

The behavior of %c depends on the type of the argument you feed it: if it is numeric, you get the character corresponding to the given ASCII code; if it is a string, you get its first character. For example,

mawk 'BEGIN{printf "%c", 42}' emptyfile

does give output

*

and

mawk 'BEGIN{printf "%c", "HelloWorld"}' emptyfile

does give output

H

Apparently your 2nd field is a digit followed by some junk characters, so it is treated as a string and the second behavior applies. But is taking the first character the correct action in all your use cases? Does it meet the requirement for multi-digit numbers, e.g. 555?

(tested in mawk 1.3.3)

Daweo
0

I found the problem thanks to several of the responses. It's rudimentary, I know, but I grepped a text file to extract a line with a keyword. I used tr to split the line and awk print to extract one argument, a numeric value, from that. That text file, once copied to the new machine, had a CR LF at the end of each line. Originally, each line ended with just a newline character, which worked fine. But with the CR LF pair, every numeric value that I assigned to a variable using awk print ended with a carriage return (\r). This was not obvious onscreen, but it caused every arithmetic statement and numeric IF statement using the value to fail, and it was behind the issues I reported about awk print.

NewtownGuy
  • I've been voted down because I didn't know the answer to my question... It was only when Ed suggested checking the text file for DOS formatting that a solution eventually emerged. Another suggestion, to parse a line using ${foo#*=} was helpful but did NOT solve the problem after all because it left a \r after the value and that was the crux of the problem. It was not until I found another post, to use ${foo//$'\n'/} to remove a newline character, which in my case was \r, that the problem was solved. It all started when I created a text file on linux, copied it to windows, and back to linux. – NewtownGuy Sep 06 '22 at 21:00