In a bash script, I have logfileA.txt, which contains output from wget. I'd like to run grep on it to check for any instances of the words "error" or "fail", etc., as so:
grep -ni --color=never -e "error" -e "fail" logfileA.txt | awk -F: '{print "Line "$1": "$2}'
# grep -n line number, -i ignore case; awk to add better format to the line numbers (https://stackoverflow.com/questions/3968103)
The trouble is, I think the wget output in logfileA.txt is full of characters that may be messing up the input for grep, as I'm not getting reliable matches.
Troubleshooting this, I cannot even cat the contents of the log file reliably. For instance, with cat logfileA.txt, all I get is the last line, which is garbled:
FINISHED --2019-05-29 17:08:52--me@here:/home/n$ 71913592/3871913592]atmed out). Retrying.
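To see what's actually in the file, I tried making non-printing characters visible. This is a small demo (the /tmp/demo_log.txt file and its contents are made up to illustrate; the real input would be logfileA.txt) showing how cat -v renders carriage returns, which wget's progress bar emits, as ^M:

```shell
# Hypothetical demo file simulating wget-style progress lines,
# which overwrite themselves in the terminal using carriage returns:
printf 'progress 49%%\rprogress 100%%\r\nFINISHED\n' > /tmp/demo_log.txt

# cat -v makes control characters visible; carriage returns appear as ^M
cat -v /tmp/demo_log.txt

# Count the carriage returns in the file directly:
tr -cd '\r' < /tmp/demo_log.txt | wc -c
```

If the real logfileA.txt shows ^M sequences under cat -v, that would explain why plain cat appears to print only one overwritten line.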
The contents of logfileA.txt are:
--2019-05-29 15:26:50-- http://somesite.com/somepath/a0_FooBar/BarFile.dat
Reusing existing connection to somesite.com:80.
HTTP request sent, awaiting response... 302 Found
Location: http://cdn.somesite.com/storage/a0_FooBar/BarFile.dat [following]
--2019-05-29 15:26:50-- http://cdn.somesite.com/storage/a0_FooBar/BarFile.dat
Resolving cdn.somesite.com (cdn.somesite.com)... xxx.xxx.xx.xx
Connecting to cdn.somesite.com (cdn.somesite.com)|xxx.xxx.xx.xx|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3871913592 (3.6G) [application/octet-stream]
Saving to: 'a0_FooBar/BarFile.dat'
a0_FooBar/BarFile.dat 0%[ ] 0 --.-KB/s
a0_FooBar/BarFile.dat 0%[ ] 15.47K 70.5KB/s
...
a0_FooBar/BarFile.dat 49%[========> ] 1.80G --.-KB/s in 50m 32s
2019-05-29 16:17:23 (622 KB/s) - Read error at byte 1931163840/3871913592 (Connection timed out). Retrying.
--2019-05-29 16:17:24-- (try: 2) http://cdn.somesite.com/storage/a0_FooBar/BarFile.dat
Connecting to cdn.somesite.com (cdn.somesite.com)|xxx.xxx.xx.xx|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 3871913592 (3.6G), 1940749752 (1.8G) remaining [application/octet-stream]
Saving to: 'a0_FooBar/BarFile.dat'
a0_FooBar/BarFile.dat 49%[+++++++++ ] 1.80G --.-KB/s
...
a0_FooBar/BarFile.dat 100%[+++++++++==========>] 3.61G 1.09MB/s in 34m 44s
2019-05-29 16:52:09 (909 KB/s) - 'a0_FooBar/BarFile.dat' saved [3871913592/3871913592]
FINISHED --2019-05-29 17:08:52--
I assume the problem could be the /s, or ---s, or >s, or ==>s, or |s?
But since the output from wget could vary, how do I anticipate and escape anything problematic for grep?
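One idea I've considered, assuming the culprit is control characters (carriage returns from wget's progress bar) rather than the /, >, or | characters themselves, is to strip them before grep ever sees the input:

```shell
# Sketch: delete carriage returns first, then run the original pipeline.
# (Assumes the problem is \r characters, not characters needing escaping.)
tr -d '\r' < logfileA.txt \
  | grep -ni --color=never -e "error" -e "fail" \
  | awk -F: '{print "Line "$1": "$2}'
```

But I don't know whether that covers everything wget might emit, which is why I'm asking.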
Command:
grep -ni --color=never -e "error" -e "fail" logfileA.txt | awk -F: '{print "Line "$1": "$2}'
Expected output:
Line 17: 2019-05-29 16:17:23 (622 KB/s) - Read error at byte 1931163840/3871913592 (Connection timed out). Retrying.
Also, would an ack one-liner be better at this job? And if so, what/how?