I am pretty new to regular expressions. I've been trying to use 'grep' with regular expressions to extract a line from a file. I thought some of you might be able to help:

My file has the lines:

c total length of integration
ntotal= 0
c total number of updates
ntotal= 20
c total number of outputs 
ntotal= 10

How do I extract the first occurrence of 'ntotal= 0'?

I tried grep ^c.*tot.*\n? 'filename', but that did not work.

Any ideas?

Regards,

Abedin

Willem Van Onsem
user3578925
  • [`^ntotal=\s*\d+$`](http://regex101.com/r/nL7dK7) or [`^c\s*total.*$`](http://regex101.com/r/mE5tM1)? – Sam Apr 27 '14 at 18:29
  • Hi Sam, the first case returns nothing. The second case returns the line starting with 'c'. I want the line after the first occurrence of 'c', that is I want the line immediately following the 'c total length of integration' – user3578925 Apr 27 '14 at 18:39
  • [`^c.*\R\K.*$`](http://regex101.com/r/gQ9uC1) is this what you want? Note it **literally** takes the next line, so it won't work with whitespace. Let me know if this needs to be modified, otherwise I can post as an answer w/explanation. – Sam Apr 27 '14 at 18:41
  • That last pattern skipped everything, jumped over to the line after 'ntotal = 10', and printed its content. – user3578925 Apr 27 '14 at 18:47
  • Hmm, I've never looked into the differences of `grep` expressions, so I think there is a problem there. Try making the `.*` lazy with a `?`: [`^c.*?\R\K.*$`](http://regex101.com/r/xU1vN1) – Sam Apr 27 '14 at 18:50
  • Hi again, I tried your last suggestion ^c.*?\R\K.*$ but this time it did not return anything. – user3578925 Apr 27 '14 at 18:55
  • I now see that my post has been edited. Just to clarify something - there is a blank line after all instances of 'ntotal'. – user3578925 Apr 27 '14 at 19:04
  • Ah, I just read up on `grep` expressions and it turns out they [match line-by-line](http://stackoverflow.com/a/12652676/703229). So it is impossible to look at the next line; the best you can do is check whether a line starts with `ntotal=`. – Sam Apr 27 '14 at 19:05

2 Answers

grep has a flag, -m, that stops reading after the given number of matching lines. Thus:

grep -m 1 -P '^ntotal *= *[0-9]+$' < filename

Starting the expression with ^ anchors the match at the start of a line; $ anchors it at the end of the line. The -P flag enables Perl-compatible regular expressions (PCRE). This particular pattern uses no Perl-only syntax, so -E (extended regular expressions) would work as well.

I've added " *" (a space followed by *) on either side of = so that any number of spaces (zero or more) is allowed around the equals sign.

< filename means that you use the content of the file called filename as input.
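As a quick check of the command above, here is a minimal sketch that recreates the sample file from the question and runs the same pattern against it. The temporary file and the variable names `sample` and `first` are illustrative assumptions; -E is used instead of -P only because -P needs a grep built with PCRE support, and the pattern itself is unchanged.

```shell
# Recreate the sample file from the question in a temporary location
# (the blank lines after each ntotal line follow the asker's clarification).
sample=$(mktemp)
printf '%s\n' \
  'c total length of integration' 'ntotal= 0' '' \
  'c total number of updates' 'ntotal= 20' '' \
  'c total number of outputs' 'ntotal= 10' '' > "$sample"

# -m 1 makes grep stop after the first matching line.
first=$(grep -m 1 -E '^ntotal *= *[0-9]+$' "$sample")
echo "$first"

# Clean up the temporary file.
rm -f "$sample"
```

With the sample data, only the first ntotal line is printed; the later ones (20 and 10) are never reached because grep stops after one match.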

Willem Van Onsem
  • Don't think this does exactly what OP wanted, but I think it's the best you can do with `grep` and will give the intended output. – Sam Apr 27 '14 at 19:09
  • But regexes are, especially when it comes to newlines etc., not standardized. For instance C# has special flags to allow new line parsing, `sed` uses special characters, etc. – Willem Van Onsem Apr 27 '14 at 19:11
  • Yea, I had never really worked with `grep` expressions but just read that it goes line-by-line so there really isn't any new line parsing (making OP's direct request somewhat impossible). I gave a +1 as I think it does answer the question, I also learned about the `-m` flag. – Sam Apr 27 '14 at 19:12
  • That worked!!! But I am still wondering if there is a way to tell 'grep' to match next line after a given line. For instance, something like: grep -m 1 -P '^c.*tot.*$\r' – user3578925 Apr 27 '14 at 19:12
  • I am sorry guys. I just saw your comments! Perhaps, it is impossible with 'grep' to parse the next line after a given line. However, I'd like to thank all of you for your help! – user3578925 Apr 27 '14 at 19:15
  • @user3578925: As Sam already pointed out, `grep` goes line-by-line, so you can't match a string that spans two or more lines. If you want to do this, you need to use `sed` or `awk`. – Willem Van Onsem Apr 27 '14 at 19:15
  • Thank you guys! I think I should sit down and learn sed or awk. Kind regards, Abedin. – user3578925 Apr 27 '14 at 19:20

To get the first ntotal you can use awk like this:

awk '/ntotal/ {print $0;exit}' file
ntotal= 0

It searches for ntotal; when a matching line is found, it prints that line and then exits.
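For the follow-up wish raised in the comments (print the line immediately after the first line starting with 'c'), a similar awk approach might look like the sketch below. The flag name `found`, the variable `next_line`, and the temporary sample file are illustrative assumptions, not part of the original answer.

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
printf '%s\n' \
  'c total length of integration' 'ntotal= 0' '' \
  'c total number of updates' 'ntotal= 20' '' \
  'c total number of outputs' 'ntotal= 10' '' > "$sample"

# First rule: if the previous line was a "c ..." header, print this line and stop.
# Second rule: remember that a header line was just seen.
# The print rule comes first so the header itself is never printed.
next_line=$(awk 'found { print; exit } /^c / { found = 1 }' "$sample")
echo "$next_line"

# Clean up the temporary file.
rm -f "$sample"
```

This takes the next line literally, so it would print an empty line if a blank line sat between the 'c' header and its ntotal line.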

Jotne