I decided not to add sed commands because I don't know what you might be doing in your $remapper, and didn't want to alter your possible logging.
awk handles record counts, math and output control better than sed, so I used that for a simple single-process wrapper.
$: awk 'NR==FNR{onepct=NR/100} # define how many records is one percent
NR>FNR && 0==FNR%onepct { printf "\r%02d%%", ++pctcnt }
END{printf "\n"} # add a clean newline at the end
' "$original" <( sed -un -f "$remapper" "$original" )
It could certainly be improved, but the awk silently scans through your original file once to get the number of lines and work out what one percent of those would be. It then reads through the output of your sed command as-is, and uses modulo division to print only on lines whose number is an exact multiple of one percent. I used a simple increment for percentage output.
Keeping the operations in memory and generating fewer processes and less overall IO should speed things up if it's a sizeable dataset.
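One possible improvement, as a minimal sketch: when the line count is not a multiple of 100, onepct is fractional, fewer lines pass the modulo test, and the counter can finish short of 100%. Computing the percentage directly from the line number avoids that. The function name show_progress is made up for illustration, and the sketch assumes your sed emits roughly one line per input line.

```shell
# Hypothetical helper: print a percentage for the lines arriving on stdin,
# measured against the number of lines in the file named by $1.
show_progress() {
    awk '
        NR == FNR { total = FNR; next }      # first input: only count its lines
        {
            pct = int(FNR * 100 / total)     # share of the original seen so far
            if (pct > 100) pct = 100         # clamp if the stream is longer than expected
            if (pct != shown) {              # redraw only when the number changes
                printf "\r%02d%%", pct
                shown = pct
            }
        }
        END { printf "\n" }                  # add a clean newline at the end
    ' "$1" -
}
```

It would be used the same way as the pipe form: sed -un -f "$remapper" "$original" | show_progress "$original"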
A simple pipe works just as well; just use a dash as the second input:
$: sed -un -f "$remapper" "$original" |
awk 'NR==FNR{onepct=FNR/100} # define how many records is one percent
NR>FNR && 0==FNR%onepct { printf "\r%02d%%", ++pctcnt } # FNR counts only the sed output lines
END{printf "\n"} # add a clean newline at the end
' "$original" -
If you prefer to handle the record counting with wc, it's even simpler.
awk -v pct=$(($(wc -l<"$original")/100)) '0==NR%pct{printf "\r%02d%%",++pctcnt} END{printf "\n"}' <(sed -unf "$remapper" "$original")
or
sed -unf "$remapper" "$original" |
awk -v pct=$(($(wc -l<"$original")/100)) '0==NR%pct{printf "\r%02d%%",++pctcnt} END{printf "\n"}'
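One caveat with the wc form: shell arithmetic truncates, so an original with fewer than 100 lines yields pct=0, and NR%pct then either aborts awk with a division-by-zero error or never matches, depending on the awk implementation. A minimal guard might look like the sketch below. The helper name one_percent is an assumption for illustration, and it only protects against the zero case; it does not make the counter exact for line counts that are not multiples of 100.

```shell
# Hypothetical helper: print how many lines make up one percent of file $1,
# never less than 1, so that awk's NR % pct cannot divide by zero.
one_percent() {
    lines=$(wc -l < "$1")          # line count of the original
    pct=$(( lines / 100 ))         # integer division, truncates toward zero
    [ "$pct" -gt 0 ] || pct=1      # small files: fall back to a step of one line
    printf '%s\n' "$pct"
}
```

The result can be passed in place of the inline arithmetic: awk -v pct="$(one_percent "$original")" ...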
Of course, to preserve your output you either need the tee or could have awk do it.
sed -unf "$remapper" "$original" | tee "${mapping_file}" |
awk -v pct=$(($(wc -l<"$original")/100)) '0==NR%pct{printf "\r%02d%%",++pctcnt} END{printf "\n"}'
or
sed -unf "$remapper" "$original" |
awk -v pct=$(($(wc -l<"$original")/100)) -v out="$mapping_file" '
{ print > out }
0==NR%pct { printf "\r%02d%%", ++pctcnt }
END { printf "\n"; close(out); } '