0

I am trying to get syntax parsing working for a custom file format. I want to highlight only numbers, but the numbers can appear in several formats. However, a number should not be highlighted when it is part of a variable name or some other word.

To keep things simple, let's say I am looking for any number of the form [0-9][\.0-9]*, and I want to keep just the number and not its padding. The only problem is that this picks up more cases than I want it to. These are some of the fail cases that I want to avoid: variable123, variable_123name, or _123_. Some of the acceptable cases are |123| or +123+ $123 123% etc... 11 2123 123% 0.12 1.1.3.4.4 12 1 23452 23423| etc...
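To make the over-matching concrete, here is a minimal Python sketch (Python's `re` module is only a stand-in here, not the editor's engine; the name `NAIVE` is made up for illustration). It shows the unanchored pattern picking up digits inside identifiers:

```python
import re

# The unanchored pattern from the question: a digit followed by digits or dots.
NAIVE = re.compile(r"[0-9][\.0-9]*")

sample = "variable123 _123_ |123|"

# All three runs of digits match, including the two inside identifiers.
naive_hits = NAIVE.findall(sample)
print(naive_hits)
```

Only the last hit, the one between the pipes, is a number I actually want highlighted.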

Basically, I want to get rid of the unnecessary variable highlighting while keeping numbers highlighted in the more relaxed case, where the numbers can be in lists surrounded by many other random characters. I have tried lookaheads, using several examples from this site, but have not come up with a good solution. I have only a few fail criteria, and I want the highlighting criteria to be loose. There are a lot of numbers, but the only important cases to exclude are those where the number is embedded in an identifier, typically something like a variable declaration. This means the surrounding string should contain no letters [a-zA-Z] and no underscores, i.e. nothing from [_a-zA-Z]. I tried to use this scheme to eliminate those cases with lookarounds, but that did not solve my problem. Here is a link to my problem (regex101.com).

user2716722

2 Answers

1

update

After re-reading your question, I think this should do it:
https://regex101.com/r/K6fQXy/1

(?<!\w)\d[.\d]*(?!\w)

Formatted

 (?<! \w )
 \d [.\d]* 
 (?! \w )
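As a quick check against the cases in the question, here is a minimal Python sketch. It assumes Python's `re` module as a stand-in for whatever engine you end up with; `NUMBER` and `find_numbers` are names made up for this example.

```python
import re

# A digit run (with optional dots) that is neither preceded nor
# followed by a word character (letter, digit, or underscore).
NUMBER = re.compile(r"(?<!\w)\d[.\d]*(?!\w)")

def find_numbers(text):
    """Return every standalone number found in text."""
    return NUMBER.findall(text)

# Fail cases from the question: nothing should be highlighted.
print(find_numbers("variable123 variable_123name _123_"))

# Pass cases from the question: each number should be found.
print(find_numbers("|123| +123+ $123 123%"))
print(find_numbers("0.12 1.1.3.4.4 23423|"))
```

The two assertions do all the filtering: the lookbehind rejects a number glued to the end of an identifier, and the lookahead rejects one glued to the start.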

Or, if you don't want to match a trailing dot after the digits, use this one:
https://regex101.com/r/K6fQXy/2

(?<!\w)\d(?:\.?\d)*(?!\w)
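The difference between the two patterns only shows up when a dot is not followed by a digit. A small Python sketch (again using `re` as a stand-in engine; the names `LOOSE` and `STRICT` are made up for illustration):

```python
import re

# First pattern: dots and digits may appear in any order after the first digit.
LOOSE = re.compile(r"(?<!\w)\d[.\d]*(?!\w)")

# Second pattern: every dot must be followed by a digit.
STRICT = re.compile(r"(?<!\w)\d(?:\.?\d)*(?!\w)")

loose_trailing = LOOSE.findall("total is 12.")
strict_trailing = STRICT.findall("total is 12.")
print(loose_trailing, strict_trailing)

# A dotted sequence with digits after every dot is matched whole by both.
print(LOOSE.findall("1.1.3.4.4"), STRICT.findall("1.1.3.4.4"))
```

With the first pattern the sentence-ending dot gets highlighted along with the number; with the second it is left alone.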


Just roll your own regex.
Use assertions to qualify the digits to highlight.

https://regex101.com/r/XwDEkj/1

(?<=\|)\d+(?=\|)|(?<=\+)\d+(?=\+)|(?<=\$)\d+|\d+(?=%)

Formatted

     (?<= \| )
     \d+ 
     (?= \| )
  |  
     (?<= \+ )
     \d+ 
     (?= \+ )
  |  
     (?<= \$ )
     \d+ 
  |  
     \d+ 
     (?= % )
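A minimal Python sketch of this alternation, assuming `re` as a stand-in engine (`EXPLICIT` is a name made up for the example):

```python
import re

# One alternative per allowed context: |n|, +n+, $n, n%.
EXPLICIT = re.compile(
    r"(?<=\|)\d+(?=\|)"    # between pipes
    r"|(?<=\+)\d+(?=\+)"   # between plus signs
    r"|(?<=\$)\d+"         # after a dollar sign
    r"|\d+(?=%)"           # before a percent sign
)

hits = EXPLICIT.findall("|123| +456+ $789 12% variable123")
print(hits)

# Caveat: the last alternative has no lookbehind, so an identifier
# that happens to be followed by % still matches.
print(EXPLICIT.findall("variable123%"))
```

Since the lookarounds do not consume the delimiters, adjacent items such as `|1|2|` are each matched. The price is that every allowed context has to be listed by hand, and alternatives with only one assertion remain open on the other side.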
  • I don't know. What engine does vi use? I remember vi from 30 years ago; has it never been updated? –  Jun 16 '17 at 22:12
  • Well, to be exact, it is not actually vi but a spin-off of vi, and the answer is no. It supports only a small set of regular expressions for parsing syntax, and our programming language is developed in-house, so I want to make a parser for it. I'm sorry, though; I misled you a bit with my examples. There are more passing cases to consider. I will add those. – user2716722 Jun 17 '17 at 01:49
  • I appreciate this answer. It helped to clarify a little bit about positive lookahead. I still have a couple of issues, and this is because of how limited my system is. I don't have support for shorthand character classes (\d or \w) or for lookaheads. Is there any easier solution than what your original statement was? Just rolling out every pass case? – user2716722 Jun 19 '17 at 13:26
  • 1
    @user2716722 - https://regex101.com/r/K6fQXy/3 This makes 3 capture groups. If you are writing back control codes to do the highlighting, groups 1 and 3 are written back unchanged, and group 2 is the one you'd want to add control codes to. For example, the replacement would be `\1` + + `\2` + + `\3`. However, since this matches the surrounding characters, it will miss some adjacent matches. This is what happens without the lookaheads/lookbehinds. –  Jun 19 '17 at 19:44
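To illustrate the three-group idea from the last comment, here is a Python sketch that uses only bracket classes, anchors, and groups. The exact pattern behind the regex101 link is not reproduced here, so `FALLBACK` is an assumed reconstruction, and the `<`/`>` markers are hypothetical stand-ins for real control codes.

```python
import re

# Assumed reconstruction: (boundary)(number)(boundary), no \d, \w, or lookarounds.
FALLBACK = re.compile(
    r"(^|[^A-Za-z0-9_])"   # group 1: start of text or a non-word character
    r"([0-9][.0-9]*)"      # group 2: the number to highlight
    r"([^A-Za-z0-9_]|$)"   # group 3: a non-word character or end of text
)

def highlight(text):
    """Wrap group 2 in hypothetical markers; write groups 1 and 3 back unchanged."""
    return FALLBACK.sub(r"\1<\2>\3", text)

print(highlight("variable123 |45|"))

# The space after "1" is consumed as group 3 of the first match, so it
# cannot serve as group 1 for "2": the middle number is missed.
print(highlight("1 2 3"))
```

The second call shows the adjacent-match limitation mentioned in the comment: because the delimiters are consumed rather than merely asserted, every other number in a tightly packed list goes unhighlighted.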
0

If available, use the word boundary matches:

\<[0-9][\.0-9]*\>
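The `\<` and `\>` anchors are Vim/GNU-style syntax. Python's `re` does not support them, so the sketch below uses `\b` as the closest equivalent, purely to show the behaviour on the question's cases (`BOUNDED` is a name made up for the example):

```python
import re

# \b stands in for \< and \> here; it matches at either edge of a word.
BOUNDED = re.compile(r"\b[0-9][.0-9]*\b")

bounded_hits = BOUNDED.findall("variable123 _123_ |123| 0.12 12.")
print(bounded_hits)
```

Digits inside identifiers are skipped because there is no word boundary between a letter or underscore and a digit. A trailing dot is also dropped, since the closing boundary has to sit right after a digit.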
NetMage