I have a log file with many lines, and I need to extract the lines from session start to session end using a bash script, for further analysis.

...
...

## TSM-INSTALL SESSION (pid) started at yyyy/mm/dd hh:mm:ss for host (variable) ##
...
...
...
...
...
...
...
## TSM-INSTALL SESSION (pid) ended at yyyy/mm/dd hh:mm:ss for host (variable) ##

...
...

I've googled and found a sed expression to extract the lines

sed '/start_pattern_here/,/end_pattern_here/!d' inputfile

But I'm unable to find a correct regular expression pattern to extract the info.

I'm fairly new to regular expressions. I'm also adding all the expressions (silly ones too) that I've tried inside the script.

sed '/\.* started at \.* $server ##/,/\.* ended at \.* $server ##/!d' file

sed '/## TSM-INSTALL SESSION [0-9]\+ started at [0-9|\\|:]\+ for host $server ##/,/## TSM-INSTALL SESSION [0-9]\+ ended at [0-9|\\|:]\+ for host $server ##/!d' file

sed '/.\{30\}started{34\}$server ##$/,/.\{30\}ended{34\}$server ##$/!d' file

sed '/.## TSM-INSTALL SESSION\{6\}started at\{31\}$server ##$/,/.## TSM-INSTALL SESSION\{6\}ended at\{31\}$server ##$/!d' file

sed '/## TSM-INSTALL SESSION [0-9]+ started at .* $server/,/## TSM-INSTALL SESSION [0-9]+ ended at .* $server/!d' file

sed '/## TSM-INSTALL SESSION \.\.\.\.\. started at \.\.\.\.\.\.\.\.\.\. \.\.\.\.\.\.\.\. for host $server ##/,/## TSM-INSTALL SESSION \.\.\.\.\. ended at \.\.\.\.\.\.\.\.\.\. \.\.\.\.\.\.\.\. for host $server ##/!d' file
Hemanth
  • You need double quotes for the shell to substitute the value of `$server`. Also, `\.` matches a literal dot character, but I think you are trying to use `.` as a metacharacter to match any character. This might work: `sed -n "/started at.*$server/,/ended at.*$server/p" file` – Sundeep Mar 04 '17 at 15:02
  • Inside a script you can call `sed` just like from the command line. The construction `$(..)` is for calling a command inside another command – Walter A Mar 04 '17 at 19:27
  • And if you use `$( )` outside of any other commands in your script, then your script will try to execute whatever you get out of your `sed` command, which you probably don't want either. – Nathaniel Verhaaren Mar 04 '17 at 20:06
  • @Sundeep You are right. I was literally searching for a dot character, and using single quotes instead of double quotes didn't help either. – Hemanth Mar 06 '17 at 11:42
  • @NathanielVerhaaren I want to save the info to a variable for processing, so I used `$( )`. Is there any other method I'm unaware of? – Hemanth Mar 06 '17 at 11:46
  • @mklement0 I'm trying to extract multiple blocks including the comment lines. As I've specified in the question, that can be accomplished using `sed '/start_pattern_here/,/end_pattern_here/!d' inputfile`; I needed help with the regex. The accepted answer provided the expression, whereas Sundeep's comment enabled me to rectify my error. – Hemanth Mar 06 '17 at 11:55
  • @Hemanth: My comment was posted before you accepted an answer. Generally, once you reach 15 points, you can just flag a comment as obsolete if you feel it is no longer relevant. As for the case at hand: The `sed` idiom you first reference would generally extract _multiple_ blocks, but your specific `sed` solution attempts contain variable reference `$server`, which suggests that _maybe_ only a _specific_ block is of interest. The `sed` idiom also _implies_ that the range endpoints are _included_, but it's better to be _explicit_ about this in your question. This helps future readers too. – mklement0 Mar 06 '17 at 13:01
  • @Hemanth: And to reiterate my original point: If there are clarifications to be made, it's important to _directly update the question_, so that they're not buried in comments, which readers tend to ignore. – mklement0 Mar 06 '17 at 13:09
  • @mklement0 thanks for the info.. I'm a novice here.. And now there is a situation where I need a specific block, I'll update the same in the question.. – Hemanth Mar 06 '17 at 15:28
  • @Hemanth: Thanks. And I also encourage you to get rid of the enclosing `$(...)`: in isolation, without a `var=` to the left, they would misbehave, as has been pointed out, and, more importantly, capturing the output in a variable is _incidental_ to your question. Once that is done, I will flag the comments re`$(...)` as obsolete. – mklement0 Mar 06 '17 at 15:30
  • @Hemanth: I apologize for not reading your previous comment more carefully: _Changing the requirements after answers have been posted_ and especially after you've _accepted_ one is what you should _never_ do. I suggest you revert your edit, re-accept eewanco's answer and _post a new question_. – mklement0 Mar 06 '17 at 16:34
  • Possible duplicate of [How to select lines between two marker patterns which may occur multiple times with awk/sed](http://stackoverflow.com/q/17988756/1255289) – miken32 Mar 06 '17 at 21:11
  • @miken32 I've already stumbled upon that question, but I wasn't able to get what I want from it or to comprehend how to get it using sed. I just want the last block. Should I create a new question? – Hemanth Mar 07 '17 at 02:52
  • @mklement0 thanks for the tutorship :) I'll never repeat these mistakes! – Hemanth Mar 07 '17 at 02:54

2 Answers

Why not:

sed "/^## TSM-INSTALL SESSION .* started .* $server ##/,/^## TSM-INSTALL SESSION .* ended .* $server ##/!d" file

You don't need to get fancy with the regexps. All you care about is the leading TSM-INSTALL SESSION, the started or ended, and the hostname, so use .* to mean "whatever in-between".
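
As a minimal sketch of the idea above, the following script builds a small sample log and extracts the block for one host. The log contents, the PIDs, the host names `alpha` and `beta`, and the variable names `logfile` and `session` are all made up for illustration. The sed script is in double quotes so that the shell expands `$server`, and the output is captured with `session=$(...)` rather than a bare `$(...)`.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: host names, PIDs and timestamps are invented for this demo.
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
unrelated line 1
## TSM-INSTALL SESSION 1234 started at 2017/03/04 10:00:00 for host alpha ##
alpha step 1
alpha step 2
## TSM-INSTALL SESSION 1234 ended at 2017/03/04 10:05:00 for host alpha ##
unrelated line 2
## TSM-INSTALL SESSION 5678 started at 2017/03/04 11:00:00 for host beta ##
beta step 1
## TSM-INSTALL SESSION 5678 ended at 2017/03/04 11:02:00 for host beta ##
EOF

server=alpha

# Double quotes let the shell substitute $server before sed sees the script.
# Lines outside the start..end range are deleted; the marker lines are kept.
session=$(sed "/^## TSM-INSTALL SESSION .* started .* $server ##/,/^## TSM-INSTALL SESSION .* ended .* $server ##/!d" "$logfile")

printf '%s\n' "$session"
rm -f "$logfile"
```

If you paste the `sed` line into an interactive bash prompt instead of running it from a script, the `!` inside double quotes may trigger history expansion; the `sed -n "/start/,/end/p"` form avoids that.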

Vercingatorix

If you stick this in a file called file.sed

/^## TSM-INSTALL SESSION ([0-9][0-9]*) started at [0-9][0-9]*\/[0-9][0-9]\/[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] for host ([^)]*) ##/,/^## TSM-INSTALL SESSION ([0-9][0-9]*) ended at [0-9][0-9]*\/[0-9][0-9]\/[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] for host ([^)]*) ##/p

and then call it like

sed -n -f file.sed inputfile 

I think it will do what you want.

The -n option suppresses sed's automatic printing, so only the lines in the matched range (printed by the p command at the end of the expression) are output.
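
To illustrate how `-n` with `p` relates to the `!d` form from the question, here is a small sketch. It uses a deliberately simplified pattern rather than the full one above, and the sample log, the temporary directory, and the file names `input.log` and `range.sed` are assumptions made for the demo. Both invocations should select the same lines.

```shell
#!/usr/bin/env bash
# Hypothetical sample data for the demo.
workdir=$(mktemp -d)
cat > "$workdir/input.log" <<'EOF'
noise before
## TSM-INSTALL SESSION 42 started at 2017/03/04 10:00:00 for host alpha ##
payload
## TSM-INSTALL SESSION 42 ended at 2017/03/04 10:05:00 for host alpha ##
noise after
EOF

# A simplified script file (not the full pattern from the answer).
cat > "$workdir/range.sed" <<'EOF'
/^## TSM-INSTALL SESSION .* started at /,/^## TSM-INSTALL SESSION .* ended at /p
EOF

# -n turns off automatic printing; the p command prints only the range.
with_p=$(sed -n -f "$workdir/range.sed" "$workdir/input.log")

# Equivalent form: keep automatic printing, delete everything outside the range.
with_d=$(sed '/^## TSM-INSTALL SESSION .* started at /,/^## TSM-INSTALL SESSION .* ended at /!d' "$workdir/input.log")

printf '%s\n' "$with_p"
rm -rf "$workdir"
```

One thing to keep in mind with the script-file approach: the file is read by sed directly, so a shell variable such as `$server` is not expanded inside it.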

Benjamin W.
kdubs