
I have a text file with a bunch of data and lines like SID: 1 - SN: 0123456789 scattered all over the file. All lines are terminated with CR/LF (Windows).
In bash I create an array with unique Serial Numbers:

sn=($(cat ./serials |awk '/SN: / { print $3 }' FS=': '|sort -u;))

So far so good, but each array member contains a CR at the end:

echo "${sn[0]}:test"

prints :test56789 instead of 0123456789:test

I can fix it with tr -d '\r' like this:

sn=($(cat ./serials |tr -d '\r'|awk '/SN: / { print $3 }' FS=': '|sort -u;))

but I doubt this is the best approach. Is there a way to remove the CR in the awk command?

SBF
  • You could also use `dos2unix` on the file first, but I can't think of any way to get around having to fix the line endings at some point. – Benjamin W. Jun 27 '22 at 16:22
  • Does this answer your question? [Remove carriage return in Unix](https://stackoverflow.com/questions/800030/remove-carriage-return-in-unix) – miken32 Jun 29 '22 at 03:06
  • Also https://stackoverflow.com/questions/11680815/removing-windows-newlines-on-linux-sed-vs-awk – miken32 Jun 29 '22 at 03:08

2 Answers


Is there a way to remove the CR in the awk command?

Sure, you can do it in awk like this:

awk -F ': *' '{sub(/\r$/, "")} /SN: / {print $3}' serials

Your complete solution to read awk output into a bash array:

readarray -t sn < <(
awk -F ': ' '{sub(/\r$/, "")} /SN: / {print $3}' serials | sort -u)

# check bash array
declare -p sn
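
To try this without the original file, here is a minimal sketch that builds a small CRLF sample first. The temporary file and its contents are made up for illustration, and it assumes bash 4 or later (for readarray) and an awk that understands \r in a regex.

```shell
# Hypothetical sample data with Windows (CR/LF) line endings.
tmp=$(mktemp)
printf 'SID: 1 - SN: 0123456789\r\nsome other data\r\nSID: 2 - SN: 0000000042\r\nSID: 3 - SN: 0123456789\r\n' > "$tmp"

# Strip the trailing CR from every record inside awk, print the
# serial number field, then de-duplicate with sort -u.
readarray -t sn < <(
  awk -F ': ' '{ sub(/\r$/, "") } /SN: / { print $3 }' "$tmp" | sort -u
)
rm -f "$tmp"

# With the CR gone, the suffix is appended instead of overwriting the start.
printf '%s:test\n' "${sn[@]}"
```

The sub() call runs on every line before the /SN: / pattern is tested, so the fields are re-split without the CR and $3 comes out clean.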
anubhava

Three approaches to the same idea, tested and working on mawk, gawk, and nawk:

awk '++_[$!++NF]<--NF' FS='^.* SN: ' OFS= RS='\r?\n'
awk '!_[$!++NF]++<--NF'
awk 'NF>++_[$(NF=NF)]'

There is no risk of decrementing NF below zero, because the increment always precedes the decrement.

RARE Kpop Manifesto