I have the following data in a .csv file that changes as new data is downloaded. The format of the data is always YYYY-MM-DDTHHMMSS; examples below:

2017-12-08T194949
2017-12-08T194952
2017-12-08T195000
2017-12-08T195007
2017-12-08T195007
2017-12-08T195014
2017-12-08T195016
2017-12-08T195016
2017-12-08T195016
2017-12-08T195016
2017-12-08T195021
2017-12-08T195026
2017-12-08T195029
2017-12-08T195030
2017-12-08T195030
2017-12-08T195034
2017-12-08T195051
2017-12-08T195101
2017-12-08T195105
2017-12-08T195135
2017-12-08T195138
2017-12-08T195140
2017-12-08T195144
2017-12-08T195148
2017-12-08T195154
2017-12-08T195204
2017-12-08T195205
2017-12-08T195219
2017-12-08T195223
2017-12-08T195224
2017-12-08T195225

Currently, I define my datestrings using:

lower_bound=$(date -d '1 day ago' "+%Y-%m-%dT%H%M%S")
upper_bound=$(date -d '12 hours ago' "+%Y-%m-%dT%H%M%S")

The lookback window is relative to the system time, and I can set the amount of lookback to any value.

I think I have gotten close with sed/awk as follows:

sed -n "/$lower_bound/,/$upper_bound/p" data.csv
awk -v a="$lower_bound" -v b="$upper_bound" '/a/{flag=1;next}/b/{flag=0}flag' data.csv

Given those lookback strings, the commands above should print the lines whose dates fall between the two variables, $lower_bound and $upper_bound. I have experimented with different lookback times in these variables.

Any ideas as to why the range of dates isn't printing? Any help would be greatly appreciated; thank you in advance.

  • $date was in error - I have supplied more sample data and changed the date definition syntax (see above) –  Dec 09 '17 at 07:34
  • Not optimal, but should be better than what you have: `awk -v a="$lower_bound" -v b="$upper_bound" '$1>=a && $1<=b'`. – gniourf_gniourf Dec 09 '17 at 07:44
  • @gniourf_gniourf - Thank you, this works. For future reference, why do you make the comparison with $1? –  Dec 09 '17 at 07:54
  • I think you need to learn a little bit about `awk`. There are some resources linked in the [tag info page](https://stackoverflow.com/tags/awk/info). – gniourf_gniourf Dec 09 '17 at 08:04
  • `/a/` matches `a` literally as a regex not `a` as a variable. – anubhava Dec 09 '17 at 08:06

1 Answer

The pattern /a/ matches the literal character "a", not the contents of the variable a. To match the string stored in the variable a, use $0 ~ a, so your command should be:

awk -v a="$lower_bound" -v b="$upper_bound" \
    '$0 ~ a {flag=1;next} $0 ~ b {flag=0} flag' data.csv

But even these awk/sed commands will not give you what you want, because they match lines only if the exact datetime bounds happen to exist in your logs. More likely, the exact lower bound will not exist, so flag will never be set.
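To illustrate, here is a minimal sketch that runs the regex-range command on a few of the sample timestamps. The temporary file and the fixed bounds are assumptions made for this example (they are not from the question); the bounds are chosen so that neither appears in the data, which is the usual case when they come from `date`.

```shell
# Hypothetical sample file with a few timestamps from the question.
sample=$(mktemp)
printf '%s\n' \
  2017-12-08T194949 \
  2017-12-08T195000 \
  2017-12-08T195016 \
  2017-12-08T195105 \
  2017-12-08T195225 > "$sample"

# Fixed bounds for illustration only; neither one occurs in the file.
lower_bound='2017-12-08T194955'
upper_bound='2017-12-08T195100'

# Regex range: flag is set only when a line matches the lower bound exactly.
matched=$(awk -v a="$lower_bound" -v b="$upper_bound" \
    '$0 ~ a {flag=1;next} $0 ~ b {flag=0} flag' "$sample")

if [ -z "$matched" ]; then
  echo "regex range printed nothing"
fi

rm -f "$sample"
```

No line equals the lower bound, so the flag is never raised and nothing is printed, even though two timestamps lie inside the window.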

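If you want to print that date range, compare the dates as strings instead, that is, $0 > a and $0 < b. This works because the YYYY-MM-DDTHHMMSS format is fixed width and zero padded, so string order is the same as chronological order. Use >= and <= if the bounds themselves should be included.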

awk -v a="$lower_bound" -v b="$upper_bound" '$0 > a && $0 < b' data.csv
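As a quick check of the string comparison, here is a minimal sketch on a few of the sample timestamps. The temporary file and the fixed bounds are assumptions for this example (not from the question), used so the result does not depend on the system clock.

```shell
# Hypothetical sample file with a few timestamps from the question.
sample=$(mktemp)
printf '%s\n' \
  2017-12-08T194949 \
  2017-12-08T195000 \
  2017-12-08T195016 \
  2017-12-08T195105 \
  2017-12-08T195225 > "$sample"

# Fixed bounds for illustration only; neither one occurs in the file.
lower_bound='2017-12-08T194955'
upper_bound='2017-12-08T195100'

# String comparison: keep lines strictly between the two bounds.
result=$(awk -v a="$lower_bound" -v b="$upper_bound" '$0 > a && $0 < b' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

This prints 2017-12-08T195000 and 2017-12-08T195016, the two timestamps inside the window, even though neither bound appears in the file.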
thanasisp