How would I find values in a file, but only on lines that don't start with #?

Question

I've got a document that looks something like this:

# Document ID 8934
# Last updated 2018-05-06
52 84 12 70 23 2 7 20 1 5
4 2 7 81 32 98 2 0 77 6
(..and so on..)

In other words, it starts off with a few comment lines, then the rest of the document is just a bunch of numbers separated by spaces.

I'm trying to write a regex that gets all digits on all lines that don't start with #, but I can't seem to get it.

I've read over answers such as

Regular Expressions: Is there an AND operator?
Regex: Find a character anywhere in a document but only on lines that begin with a specific word

and pawed through sites such as http://regular-expressions.info, but I still can't get an expression that works (the best I can get is a lengthy version of ^[^#].*

So how can I match digits (or text, or whatever) in a string, but only on lines that don't start with a certain character?

@rkta That just gets all lines that start with a hash. I want to combine that with something like `\d+` to get all digits on lines that don't start with a hash, as my question stated — Grayda, May 06 '18 at 07:51
which is the context of this? Probably, it would be a better idea to first check if the line begins with `#`, and then, use `\d+` on each line that doesn't begin with `#` — alsjflalasjf.dev, May 06 '18 at 08:06
Question to ask yourself is : why use regex when the problem can be solved writing two lines of code? — Yassin Hajaj, May 06 '18 at 08:28
@YassinHajaj part of this is because it's a challenge that'd test my regex knowledge. I've worked around it by using two lines, but now I'm curious about a one-regex answer. — Grayda, May 06 '18 at 08:33
As easy as `(?<!^#.*?)\d+`, use with PyPi Python `regex` module, .NET regex, or in the latest versions of Chrome (JavaScript compatible with ECMAScript 2018 standard). — Wiktor Stribiżew, May 06 '18 at 09:41

The fourth bird · Accepted Answer · 2018-05-06T11:04:57.797

Your regex ^[^#].* uses a negated character class which matches not a # from the start of the string ^ and after that matches any character zero or more times. This would for example also match t test

What you might do is use an alternation to match a whole line ^#.*$ that starts with a # or capture in a group one or more digits (\d+)

Your digits are captured group 1. You could change the (\d+) to for example a character class ([\w+.]+) to match more than only digits.

(?:^#.*$|(\d+))

Details

(?: Non capturing group
- ^#.*$ Match from the start of the line ^ a # followed by any character zero or more times .* until the end of the string $
- | Or
- (\d+) capture one or more digits in a group
) Close non capturing group

score 0 · Answer 2 · answered May 06 '18 at 07:58

0

I think a way simpler method would be to replace the lines with "" first with this regex:

^#.*

And then you can just match all the numbers with this:

-?\d+ (-? is for negative)

answered May 06 '18 at 07:58

Sweeper

213,210
22
193
313

My code eventually used two lines as you suggested (though slightly different, using `^[^#]+` to catch all non-hashed lines then simply `\d+` for the digits), though I was interested to see if there was a "one liner", which there was via @the fourth bird's answer. – Grayda May 06 '18 at 12:37

How would I find values in a file, but only on lines that don't start with #?

2 Answers2