1

Linux Debian Testing 64.

I wish to grep or awk the following...

ExifListAll = (below)

DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0

Starting from the time 12:54:33 in column 3, I want to match the rows within 1 second before and 1 second after it, where column 4 is "On" and column 5 is 1, 2, or 3.

I've tried this so far:

echo "$ExifListAll" | grep -E '2014-07-21.*12:45:3[3-4].*On.*[1-3]'

Can I do this more efficiently with an awk one-liner?

Am I doing this correctly?

echo "$ExifListAll" | awk '$4 == "On" && $5~/1/,$5~/3/'

Thank you.

voiczed
  • Can you provide your desired output? – qwwqwwq Sep 08 '14 at 02:11
  • Don't use range expressions in awk. They make trivial tasks very slightly briefer, but marginally more interesting tasks then need a complete rewrite. Use `/start/{f=1} f; /end/{f=0}` instead of `/start/,/end/`. – Ed Morton Sep 08 '14 at 03:00
  • @qwwqwwq. Desired output: if any one of the above items is used as the starting point, a search of the list above will look 1 second before and 1 second after (column 3), then make sure column 4 is "On". If DSCF3567.JPG is used, then it will find all the items above (lines 1-6). – voiczed Sep 08 '14 at 03:34
  • @Ed Morton. How would you change it to incorporate your suggestion? e.g. echo "$ExifListAll" | awk '$4 == "On" && $5~/1/,$5~/3/' – voiczed Sep 08 '14 at 03:37
  • It depends what you think that statement means but I'd guess maybe `awk '$5~/1/{f=1} f && ($4=="On"); $5~/3/{f=0}'`. – Ed Morton Sep 08 '14 at 03:50
  • @Ed Morton. Thank you. Excuse me, I'm unable to understand why the following is unacceptable, it appears to produce identical results; echo "$ExifListAll" | awk '$4 == "On" && $5~/1/,$5~/3/', or why this is better than the grep example; echo "$ExifListAll" | grep -E '2014-07-21.*12:45:3[3-4].*On.*[1-3]' – voiczed Sep 08 '14 at 04:19
  • I didn't say it was better than the grep example, I just said don't use range expressions in awk, see http://stackoverflow.com/questions/23934486/is-a-start-end-range-expression-ever-useful-in-awk for a discussion on that. – Ed Morton Sep 08 '14 at 12:23
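
To make the flag idiom from the comments above concrete, here is a minimal sketch run against the sample rows from the question. The variable names `sample`, `flag_out`, and `range_out` are made up for this illustration. On this data both forms print the same five rows: each stops at the first row whose column 5 is 3, so the second such row (DSCF3568.RAF) is not printed.

```shell
#!/bin/sh
# Sample rows copied from the question; `sample` is a hypothetical name.
sample='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

# Flag idiom: set f when column 5 contains 1, print "On" rows while f is set,
# and clear f after a row whose column 5 contains 3.
flag_out=$(printf '%s\n' "$sample" |
  awk '$5 ~ /1/ {f=1} f && ($4 == "On"); $5 ~ /3/ {f=0}')

# Range expression from the question, for comparison.
range_out=$(printf '%s\n' "$sample" |
  awk '$4 == "On" && $5 ~ /1/, $5 ~ /3/')

printf '%s\n' "$flag_out"
```

Neither form expresses "within one second of a given time", which is why the answers below match on the time column instead.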

3 Answers

1

grep will work fine for your purposes; you are just having some trouble with the syntax. It is easier to use the pattern \s* to match zero or more whitespace characters between fields (\s is a GNU grep extension). You are using .*, which is greedy and will match any run of characters, including the contents of other fields, so it constrains the match far less than it appears to. A character class matches any one of the characters it contains, so to match 1, 2, or 3 you can write [123] (the range [1-3] is equivalent). Note also that your pattern has 12:45 where the data has 12:54. With those changes, the following does what you appear to intend:

echo "$ExifListAll" | grep -E "2014-07-21\s*12:54:3[34]\s*On\s*[123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[34]\s*On\s*[123]"
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3

Is this not the output you were expecting? 12:54:34 had Off and a 0, which I interpreted from your question as not wanted. If you want both the On and Off states, and want to include the `0` corresponding to 12:54:34 Off 0, then use:

echo "$ExifListAll" | grep -E "2014-07-21\s*12:54:3[34]\s*(On|Off)\s*[0123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[34]\s*(On|Off)\s*[0123]"
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0

Per the comment that lines 1-6 are desired:

cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[234]\s*On\s*[123]"

output:

$ cat grepdat.dat | grep -E "2014-07-21\s*12:54:3[234]\s*On\s*[123]"
DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
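
Since \s is a GNU extension, a portable variant of the last command can be sketched with the POSIX class [[:space:]]. This is only an illustration: the names `sample` and `posix_out` are made up here, and the pattern additionally requires at least one whitespace character between fields and anchors the final digit at the end of the line.

```shell
#!/bin/sh
# Sample rows copied from the question; `sample` is a hypothetical name.
sample='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

# [[:space:]]+ requires one or more blanks between fields;
# the trailing $ ensures column 5 is exactly one of 1, 2, 3.
posix_out=$(printf '%s\n' "$sample" |
  grep -E '2014-07-21[[:space:]]+12:54:3[234][[:space:]]+On[[:space:]]+[123]$')

printf '%s\n' "$posix_out"
```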
David C. Rankin
  • Thank you, unfortunately your code result misses "2014-07-21 12:54:34"... can't figure why. Would it be possible to do math on the last 2 numbers of the time field by plus 1 and minus 1 to achieve 12:54:33, 12:54:34, 12:54:35 ?. Thanks again. – voiczed Sep 08 '14 at 05:01
  • Lines 1-6 in the 1st message I posted are the desired output, if any of the times is used as a starting point. Thanks for your time. You are correct, column 5 '0' is omitted. I'm struggling with how to make this work if the last 2 digits in the time field are unknown but +1 and -1 second need to be included. – voiczed Sep 08 '14 at 07:49
  • Really? That's a simple tweak. See the answer now. Think about what each part of the expression does... You could have tweaked it by simply adding `2` to the first character class making it `[234]`. Take some time and get familiar with the answer, don't just use it to solve a problem, lest the intended learning be lost. – David C. Rankin Sep 08 '14 at 19:16
1

You cannot use a range expression or a flag to retrieve more than one row matching the /end/ pattern, because both stop at the first such row. For a more general solution with awk, you can convert the time to epoch time and then set up the comparison:

mydatetime="2014-07-21 12:54:33"
awk -v expected_time="$(date -d "$mydatetime" +%s)" '
  { t = $2" "$3; gsub(/[:-]/," ",t); t1 = mktime(t) }
  t1 >= expected_time-1 && t1 <= expected_time+1 && $4 =="On" && $5 ~ /^[123]$/
' file.txt

Note:

  1. The -v option sets expected_time to the epoch timestamp of $mydatetime, computed by date.
  2. The entry time ($2" "$3) of each record is converted to the format "YYYY mm dd HH MM SS" and passed to mktime() (a GNU awk extension) to generate an epoch timestamp.
  3. The times are compared, and $4 must be 'On' and $5 must be 1, 2, or 3.
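
If mktime() is not available, the same window test can be sketched in portable awk by converting HH:MM:SS to seconds since midnight. This is a different technique from the epoch comparison above and rests on an assumption: all candidate rows share the target date, so a window that crosses midnight is not handled. The names `sample`, `target_date`, `target_time`, `window_out`, and the awk function `secs` are hypothetical.

```shell
#!/bin/sh
# Sample rows copied from the question; `sample` is a hypothetical name.
sample='DSCF3566.JPG    2014-07-21 12:54:32 On  1
DSCF3566.RAF    2014-07-21 12:54:32 On  1
DSCF3567.JPG    2014-07-21 12:54:33 On  2
DSCF3567.RAF    2014-07-21 12:54:33 On  2
DSCF3568.JPG    2014-07-21 12:54:33 On  3
DSCF3568.RAF    2014-07-21 12:54:33 On  3
DSCF3569.JPG    2014-07-21 12:54:34 Off 0'

target_date='2014-07-21'
target_time='12:54:33'

window_out=$(printf '%s\n' "$sample" |
  awk -v d="$target_date" -v t="$target_time" '
    # Convert HH:MM:SS to seconds since midnight.
    function secs(hms,   p) { split(hms, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }
    BEGIN { want = secs(t) }
    # Same date, state On, column 5 is 1, 2, or 3, and time within 1 second.
    $2 == d && $4 == "On" && $5 ~ /^[123]$/ {
      diff = secs($3) - want
      if (diff >= -1 && diff <= 1) print
    }')

printf '%s\n' "$window_out"
```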

If you know the expected_time exactly, as you mentioned, then just use your grep line; it is much simpler and faster than the awk one.

grep -E '2014-07-21.*12:54:3[2-4].*On.*[1-3]' file.txt
lihao
  • Thanks. If I wanted to go with grep, how can I perform math on the 'seconds' time for -1 second and +1 second from whatever time is used as a starting point ? – voiczed Sep 08 '14 at 07:52
  • I am afraid grep is not the right tool for doing math. You can probably calculate the times in Bash and then feed them into the grep regex as an alternation, i.e. ($time1|$time2|$time3). – lihao Sep 08 '14 at 15:13
0

Thank you all for your suggestions.

I have used an alternative, more direct method based on 'exiftool', which reads all the metadata from images.

I select any image in a directory, then compute the times 1 second before and 1 second after it. I'm not sure yet how to substitute the info provided, but I will sort it out with your help.

DateTimeOrigFirst="$(exiftool -T -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"
DateTimeOrig1SecMinus="$(exiftool -T -globalTimeShift "-0:0:0 0:0:1" -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"
DateTimeOrig1SecPlus="$(exiftool -T -globalTimeShift "+0:0:0 0:0:1" -d '%F %T' -DateTimeOriginal DSCF3567.RAF)"

I can then produce images 1-6 from my 1st example with:

printf %s\\n "$ExifListAll" | tr '\t' ' ' | grep \
-E "$DateTimeOrigFirst|$DateTimeOrig1SecMinus|$DateTimeOrig1SecPlus"

Thanks again.

voiczed