1

I am working on a script that processes a file in which one column is a date concatenated with a string. I want to extract the date part and compare it with today's date. If it is older than today, I want to replace it with today's date. Here is an example.

cat test.txt
aaaa              RR  242       644126              20161030O00001.0000
bbbb              RR  242       644126              20161225O00001.0000
aaaa              RR  242       644126              20161012O00001.0000
aaaa              RR  242       644126              20170129O00001.0000
aaaa              RR  242       644126              20170326O00001.0000
aaaa              RR  242       644126              20170430O00001.0000
aaaa              RR  242       644126              20161015O00001.0000

I want the output below, with the date stamp changed in column 5 of rows 3 and 7. Please help. I am looking for a single command to do this, if possible.

aaaa              RR  242       644126              20161030O00001.0000
bbbb              RR  242       644126              20161225O00001.0000
aaaa              RR  242       644126              20161020O00001.0000
aaaa              RR  242       644126              20170129O00001.0000
aaaa              RR  242       644126              20170326O00001.0000
aaaa              RR  242       644126              20170430O00001.0000
aaaa              RR  242       644126              20161020O00001.0000
Kate
  • So, given that today's date is 2016-10-20, if the date portion of the data corresponds to a date older than 2016-10-20, you want to replace the date with 2016-10-20, but if it is a future date in 2016, or in 2017 or beyond, leave it alone? And dates could possibly be from 2015 or earlier? Is the letter after the date field always `O` as in your example? – Jonathan Leffler Oct 20 '16 at 23:18
  • I tried to salvage the formatting and removed empty lines between every line in the examples. I imagine the asterisks are also not actually part of the data; could you please review, and [edit] your question to clarify? – tripleee Oct 21 '16 at 04:01
  • Splitting the date from the other data would seem like a way to promote your own sanity and simplify further downstream processing. – tripleee Oct 21 '16 at 04:03
  • @tripleee: I left the data alone because the `**` was used to highlight the parts that were changed. No: I'm sure that the `**` you now have on display are markup and not part of the original data. – Jonathan Leffler Oct 21 '16 at 05:06
  • Thanks @triplee for formatting. Asterisks are not part of data. – Kate Oct 21 '16 at 20:33

4 Answers

2

Using plain Awk, and hence not assuming built-in date support:

awk -v refdate="$(date +%Y%m%d)" '{ if ($5 < refdate) $5 = refdate substr($5, 9); print }' test.txt

Given data file and current date 2016-10-20:

aaaa RR 242 644126 20161030O00001.0000
bbbb RR 242 644126 20161225O00001.0000
aaaa RR 242 644126 20161012O00001.0000
aaaa RR 242 644126 20170129O00001.0000
aaaa RR 242 644126 20170326O00001.0000
aaaa RR 242 644126 20170430O00001.0000
aaaa RR 242 644126 20161015O00001.0000

The output is:

aaaa RR 242 644126 20161030O00001.0000
bbbb RR 242 644126 20161225O00001.0000
aaaa RR 242 644126 20161020O00001.0000
aaaa RR 242 644126 20170129O00001.0000
aaaa RR 242 644126 20170326O00001.0000
aaaa RR 242 644126 20170430O00001.0000
aaaa RR 242 644126 20161020O00001.0000
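
One side effect worth knowing: assigning to `$5` makes Awk rebuild the record with single spaces, which is why the output above loses the wide column spacing of the original file. If the spacing matters, a minimal sketch along the following lines edits `$0` in place instead. It assumes the date field is the last field on the line and that the Awk in use supports `match()` (POSIX Awk does). The function name `clamp_dates` and its reference-date argument are illustrative and not part of the original command.

```shell
# clamp_dates REFDATE: read lines on stdin; where the last field starts with
# a YYYYMMDD date earlier than REFDATE, overwrite those 8 characters in place.
# Hypothetical helper; assumes the date field is the last field on the line.
clamp_dates() {
    awk -v refdate="$1" '
    {
        # Locate the last field without assigning to $5, so spacing is kept.
        if (match($0, /[^ \t]+[ \t]*$/) && substr($0, RSTART, 8) < refdate)
            $0 = substr($0, 1, RSTART - 1) refdate substr($0, RSTART + 8)
        print
    }'
}

# Example with a fixed reference date so the result is reproducible.
printf '%s\n' 'aaaa              RR  242       644126              20161012O00001.0000' |
    clamp_dates 20161020
```

In real use the reference date would come from `date +%Y%m%d`, as in the command above.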
Jonathan Leffler
  • I am using this on Solaris. It looks like the syntax is not accepted. – Kate Oct 21 '16 at 20:30
  • `awk -v refdate="$(date +%Y%m%d)" '{ if ($5 < refdate) $5 = refdate substr($5, 9); print}' test.txt` fails with `awk: syntax error near line 1` and `awk: bailing out near line 1` – Kate Oct 21 '16 at 20:30
  • If you're on Solaris, the problem may be that the `$(…)` is still not recognized by the shell you're using (`/bin/sh` on Solaris is a couple of decades behind the times). You may need to use: ```awk -v refdate=`date +%Y%m%d` '{ … }' test.txt```. If that doesn't fix the problem, you'll need to look at the Solaris Awk man page and see whether `substr` is recognized. I think that the back-ticks in lieu of `$(…)` will probably solve the problem, but there are more ways to debug things if it isn't sufficient. – Jonathan Leffler Oct 21 '16 at 20:36
0

If Perl is okay... Note: it is already 2016-10-21 in my part of the world ;)

To get today's date using the Time::Piece module:

$ perl -MTime::Piece -le '$d = localtime->ymd(""); print $d'
20161021


Sample input:

$ cat ip.txt 
aaaa RR 242 644126 20161030O00001.0000
bbbb RR 242 644126 20161225O00001.0000
aaaa RR 242 644126 20161012O00001.0000
aaaa RR 242 644126 20170129O00001.0000
aaaa RR 242 644126 20170326O00001.0000
aaaa RR 242 644126 20170430O00001.0000
aaaa RR 242 644126 20161015O00001.0000

Solution:

$ perl -MTime::Piece -pe 'BEGIN{$d=localtime->ymd("")} s/.* \K\d+/$& < $d ? $d : $&/e' ip.txt
aaaa RR 242 644126 20161030O00001.0000
bbbb RR 242 644126 20161225O00001.0000
aaaa RR 242 644126 20161021O00001.0000
aaaa RR 242 644126 20170129O00001.0000
aaaa RR 242 644126 20170326O00001.0000
aaaa RR 242 644126 20170430O00001.0000
aaaa RR 242 644126 20161021O00001.0000
  • `BEGIN{$d=localtime->ymd("")}` saves today's date in the `$d` variable; this runs only once, at the start of the script
  • `s/.* \K\d+/$& < $d ? $d : $&/e` extracts the date to be replaced, compares it against `$d`, and replaces it with `$d` if the extracted date is earlier than `$d`
    • `.* \K` matches up to the last space in the line; because of `\K`, the text matched up to that point is left out of the match, so it is not replaced
    • `$&` contains the string matched by `\d+`
    • the `e` flag allows code (`$& < $d ? $d : $&`) instead of a string in the replacement section


Using the date command instead of the Time::Piece module:

perl -pe 'BEGIN{chomp($d=`date +%Y%m%d`)} s/.* \K\d+/$& < $d ? $d : $&/e' ip.txt


Graham
Sundeep
0

In GNU awk (see the last comment in the explanation):

$ awk -v a="$(date +%Y%m%d)" '(b=substr($5,1,8)) && sub(/^.{8}/,(b<a?a:b),$5)' file
aaaa RR 242 644126 20161030O00001.0000
bbbb RR 242 644126 20161225O00001.0000
aaaa RR 242 644126 20161021O00001.0000
aaaa RR 242 644126 20170129O00001.0000
aaaa RR 242 644126 20170326O00001.0000
aaaa RR 242 644126 20170430O00001.0000
aaaa RR 242 644126 20161021O00001.0000

Explained:

awk -v a="$(date +%Y%m%d)"  # set the date to a var
'                           # '
(b = substr($5,1,8) ) &&    # read first 8 chars of 5th field to b var
sub(/^.{8}/, (b<a?a:b), $5) # replace 8 first chars with a if b is less than a
                            # to make it compatible with other awks, change
                            # /^.{8}/ to /^......../
' file
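
Applying the portability note, a sketch of the dot-pattern variant might look like the following. The function name `clamp_portable` is hypothetical, and the reference date is passed in as an argument only so that the example is reproducible.

```shell
# clamp_portable REFDATE: variant of the command above that spells the
# 8-character pattern out as dots, so it does not rely on {8} interval support.
# Hypothetical helper name; REFDATE is an argument to keep the example stable.
clamp_portable() {
    awk -v a="$1" '(b = substr($5, 1, 8)) && sub(/^......../, (b < a ? a : b), $5)'
}

printf '%s\n' 'aaaa RR 242 644126 20161015O00001.0000' | clamp_portable 20161021
```

Note that the whole expression is used as a pattern with no action, so a line is printed only when the expression is true; a line without a fifth field makes `b` empty and is therefore not printed.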
James Brown
0

I finally figured it out! Thanks James, Jonathan and others.

Here is my command on Solaris.

$ cat test.txt
aaaa RR 242 644126 20161030O00001.0000
bbbb RR 242 644126 20161012O00001.0000
aaaa RR 242 644126 20161013O00001.0000
aaaa RR 242 644126 20170129O00001.0000
aaaa RR 242 644126 20170326O00001.0000
aaaa RR 242 644126 20170430O00001.0000
aaaa RR 242 644126 20161014O00001.0000

$ /usr/xpg4/bin/awk -v a=`date +%Y%m%d` '(b=substr($5,1,8)) && gsub(/^.{8}/,(b<a?a:b),$5)' test.txt    
aaaa RR 242 644126 20161030O00001.0000
bbbb RR 242 644126 20161022O00001.0000
aaaa RR 242 644126 20161022O00001.0000
aaaa RR 242 644126 20170129O00001.0000
aaaa RR 242 644126 20170326O00001.0000
aaaa RR 242 644126 20170430O00001.0000
aaaa RR 242 644126 20161022O00001.0000
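
For a script that has to run both on Solaris and elsewhere, one possible approach is to pick the Awk binary at run time. The sketch below assumes the XPG4 Awk lives at `/usr/xpg4/bin/awk`, as in the command above, and falls back to plain `awk` otherwise; the function name `pick_awk` is hypothetical. Back-ticks are used instead of `$(…)` so that it also works in older shells.

```shell
# pick_awk: print the name of an awk expected to accept -v and POSIX syntax.
# Assumption: on Solaris the XPG4 awk is at /usr/xpg4/bin/awk; on other
# systems the plain "awk" found on PATH is used.
pick_awk() {
    if [ -x /usr/xpg4/bin/awk ]; then
        echo /usr/xpg4/bin/awk
    else
        echo awk
    fi
}

AWK=`pick_awk`
# Fixed date and the dot pattern keep this example reproducible and portable.
printf '%s\n' 'aaaa RR 242 644126 20161014O00001.0000' |
    "$AWK" -v a=20161022 '(b=substr($5,1,8)) && sub(/^......../,(b<a?a:b),$5)'
```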
Sundeep
Kate