
I'm using WP-GeSHi in WordPress, and largely I'm very happy with it. There are, however, a few minor scenarios where the color highlighting is too aggressive when a keyword is:

  1. a variable name (denoted by a leading @)
  2. part of another word (e.g. IN in INSERTED)
  3. the combination (part of a variable name, e.g. JOIN and IN in @JOINBING)
  4. inside square brackets (e.g. [status])

Certain keywords are case sensitive, and others are not. The screenshot below sums up the various cases where this goes wrong:

enter image description here

Now, the code in GeSHi.php is pretty verbose, and I am by no means a PHP expert. I'm not afraid to get my hands a little dirty here, but I'm hoping someone else has made corrections to this code and can provide some pointers. I already implemented a workaround to prevent @@ROWCOUNT from being highlighted incorrectly, but this was easy, since @@ROWCOUNT is defined - I just shuffled the arrays around so that it was found before ROWCOUNT.

What I'd like is for GeSHi to completely ignore keywords that aren't whole words (whether they are prefixed by @ or immediately surrounded by other letters/numbers). JOIN should be grey, but @JOIN and JOINS should not. I'd also like it to ignore keywords that are inside square brackets (after all, this is how we tell Management Studio to not color highlight it, and it's also how we tell the SQL engine to ignore reserved words, keywords, and invalid identifiers).

Aaron Bertrand

2 Answers


You can do this by adding a PARSER_CONTROL entry to the end of the language definition array:

'PARSER_CONTROL' => array(
    'KEYWORDS' => array(
        1 => array( // "1" maps to the main keywords near the start of the array
            'DISALLOWED_BEFORE' => '(?<![\(\w])', // lookbehind: no "(" or word character right before
            'DISALLOWED_AFTER' => '(?![\(\w])'    // lookahead: no "(" or word character right after
        ),
        5 => array( // "5" maps to the shorter keywords like "IN" that are further down
            'DISALLOWED_BEFORE' => '(?<![\(\w])',
            'DISALLOWED_AFTER' => '(?![\(\w])'
        ),
    )
)
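To see what this kind of lookaround does outside of GeSHi, here is a minimal standalone PHP sketch. It assumes GeSHi places the DISALLOWED_BEFORE pattern directly in front of the keyword and the DISALLOWED_AFTER pattern directly behind it. `find_keyword` is a hypothetical helper, not a GeSHi function, and the character classes (which add `@` and the square brackets) are my own assumption for illustration, not the values used in the array above.

```php
<?php
// Hypothetical helper (not part of GeSHi): wraps a keyword between a
// "disallowed before" lookbehind and a "disallowed after" lookahead and
// returns the offsets at which the keyword matches as a whole word.
function find_keyword(string $keyword, string $sql, string $before, string $after): array
{
    $pattern = '/' . $before . '(' . preg_quote($keyword, '/') . ')' . $after . '/i';
    preg_match_all($pattern, $sql, $matches, PREG_OFFSET_CAPTURE);
    // $matches[1] holds [text, offset] pairs for the captured keyword.
    return array_column($matches[1], 1);
}

// Assumed character classes: reject a word character, "@" or "[" before
// the keyword, and a word character or "]" after it.
$before = '(?<![\w@\[])';
$after  = '(?![\w\]])';

$sql = 'SELECT [JOIN], @JOINBING, JOINS FROM a JOIN b';
print_r(find_keyword('JOIN', $sql, $before, $after));
print_r(find_keyword('IN', $sql, $before, $after));
```

With these classes, only the standalone JOIN near the end of the sample string is reported; `[JOIN]`, `@JOINBING` and `JOINS` are skipped, and `IN` is not found at all because every occurrence sits inside a longer word.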

Edit

I've modified your gist to move some of the keywords you added to SYMBOLS back to KEYWORDS (though in their own group and with your custom style), and I updated the PARSER_CONTROL array to match the new keyword array indexes and also to include the default regex that GeSHi generates. Here is the link:

https://gist.github.com/jamend/07e60bf0b9acdfdeee7a

ale
Jonathan Amend
  • So I have one other really minor thing I'd like to fix, and it might be very easy pickings for you. I'd like variables (anything starting with `@`) to be highlighted a certain color. Is this easy or hard? – Aaron Bertrand May 21 '14 at 17:43
  • Genius. Thanks again so much. (And sorry for not stating all of my requirements up front - I thought the question was already overwhelming without those additional, largely unrelated changes.) Now, one last question: why doesn't SO let me give more than 500 rep in an individual bounty? Do you have any idea how long I've been scratching my head about these minor changes? – Aaron Bertrand May 21 '14 at 21:01
  • Hey, I am the author of WP-GeSHi-Highlight. I am sure you have realized that I am just *using* the fabulous GeSHi library in there. Now, did you consider contributing your input to the GeSHi project? Here you go: https://github.com/GeSHi/geshi-1.0 -- the repository is quite stable, but language update pull requests seem to be merged from time to time. And I, on the other hand, merge GeSHi updates, from time to time, into WP-GeSHi-Highlight. Cheers! – Dr. Jan-Philip Gehrcke Jun 18 '15 at 00:59
  • Here you go, I've included the other improvements by Aaron Bertrand. [Improve T-SQL syntax highlighting](https://github.com/GeSHi/geshi-1.0/pull/49) – Jonathan Amend Jun 18 '15 at 01:50

In my opinion, what you are doing would take a lot of time, so I suggest that you install a different plugin:

It has better features and supports more languages, and supports them better, so it would avoid all of these problems.

EDIT:

I tried the same code with the latest version and got the following result:

enter image description here

EDIT:

If you don't want to use another plugin, here is how to change the code:

First open \wp-content\plugins\wp-geshi-highlight\geshi\geshi\tsql.php in your text editor.

Then, locate the array 'KEYWORDS' or search for it.

Add a group 6 at the end of it (after 5) and put your custom keywords in it. For example:

5 => array(
    'ALL', 'AND', 'ANY', 'BETWEEN', 'CROSS', 'EXISTS', 'IN', 'JOIN', 'LIKE', 'NOT', 'NULL',
    'OR', 'OUTER', 'SOME',
),

6 => array(                          // This line has been added by me
    'status'                         // This line has been added by me
)                                    // This line has been added by me

Note: I have just shown array element 5 (already present) and array element 6 (which I have made).

Then, to make it case-sensitive, add the line below to the end of the 'CASE_SENSITIVE' array:

6 => true

The 'CASE_SENSITIVE' array should look like this:

'CASE_SENSITIVE' => array(
        GESHI_COMMENTS => false,
        1 => false,
        2 => false,
        3 => false,
        4 => false,
        5 => false,
        6 => true                         // This line has been added by me
        ),

Now add styling for the custom keywords by adding the line below to the 'KEYWORDS' element of the 'STYLES' array. The start of the 'STYLES' array should look like this:

'STYLES' => array(
        'KEYWORDS' => array(
            1 => 'color: #0000FF;',
            2 => 'color: #FF00FF;',
            3 => 'color: #AF0000;',
            4 => 'color: #AF0000;',
            5 => 'color: #808080;',
            6 => 'color: #0000FF;'          //This line has been added by me
            ),

The guidelines above should solve most of your problems. For the plugin highlighting partial words, the only solution I have found is to update the plugin to the latest version, because that fixes this problem.

prakhar19
  • Thanks, but I've invested a lot of work into other changes to GeSHi to suit my needs, so I'd much rather continue to work at it, rather than swap out plug-ins wholesale. I'm also very afraid of "grass is greener" syndrome - where I'd just be trading these few remaining imperfections for different, perhaps more important ones. – Aaron Bertrand May 15 '14 at 13:28
  • Also, the results you show *do* have similar problems. Why is `status` not highlighted in blue on line 2? Why are the `join` portions of the local variables in the `AND mort IN` line highlighted in grey? As I suspected, I'd be going through a whole lot of hassle migrating 150+ posts to use slightly different ` ` syntax, only to trade my current formatting annoyances for slightly different ones. – Aaron Bertrand May 20 '14 at 13:25
  • The keywords I am complaining about are already in an array. All of the arrays have the styling I want, and all options are set to case insensitive (and I want them that way). I am seeing differences in whether they are seen as keywords because I think different regular expressions are being used depending on the array. And the real problem I need to solve here is to stop those regexes from identifying keywords contained in larger words (or prefixed with @ or surrounded by [square brackets]). I am on the latest version of the GeSHi plug-in and the problem *is* still occurring. – Aaron Bertrand May 20 '14 at 14:14
  • If you don't have any problem in making some messy corrections, then maybe you can define one more array in which you can define '@status' and style it as 'color: #000000;', etc. But, remember that the arrays containing words like '@status' should be defined above the arrays containing keywords like 'status'. – prakhar19 May 20 '14 at 14:36
  • Well yes, I could do that, but I'd have to create an array that contains every single keyword that could be used as a variable (and that's pretty much all of them). I'm quite positive there is a much more elegant way to have the regular expressions simply ignore keywords that are prefixed with `@`, instead of defining every single keyword all over again. I don't want to sound ungrateful, and I appreciate your effort here, but these are not the types of solutions I am looking for. – Aaron Bertrand May 20 '14 at 14:38
  • Sorry, but I couldn't find any solution other than that. Because if you edit any other wp-geshi file, then it would change that rule for every other language and wp-geshi doesn't contain any option for a particular language. – prakhar19 May 20 '14 at 16:11
  • I'm ok with that. I use it for T-SQL 99.999% of the time, and am okay with funky coloring in the other 0.001% of cases (PowerShell is really the only other one any of my blog posts use). – Aaron Bertrand May 20 '14 at 16:30