
This is my first post on stackoverflow, so please be gentle with me...

I am still learning regex, mostly because I have finally discovered how useful it can be, in part through using Sublime Text 2. So this is Perl regex (I believe).

I have searched this and other sites, but I am now genuinely stuck. Maybe I am trying to do something that can't be done.

I would like to find a regex (pattern) that will let me find the function or method or procedure etc that contains a given variable or method call.

I have tried a number of expressions and they seem to get part of the way but not all the way. Particularly when searching in Javascript I pick up multiple function declarations instead of the one nearest to the call/variable that I am looking for.

For example, I am looking for the function that calls the method savedata(). I have learnt from this excellent site that I can use (?s) to make . match newlines as well:

function.*(?=(?s).*?savedata\(\))

However, that will find the first instance of the word function and then all the text up to and including savedata().

If there are multiple procedures, then it will start at the next function and repeat until it gets to savedata() again.

function(?s).*?savedata\(\) does something similar
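
A small sketch of why the lazy quantifier does not help here: the engine still accepts the leftmost position from which a match is possible, and `.*?` only keeps the end of the match as early as it can. The sample text and function names below are made up for illustration, and JavaScript's `s` flag is used in place of `(?s)`.

```javascript
// Hypothetical sample: two functions, only the second one calls SaveData().
const source = [
  "function first() {",
  "    return 1;",
  "}",
  "",
  "function second() {",
  "    SaveData();",
  "}",
].join("\n");

// The `s` (dotAll) flag plays the role of (?s): `.` also matches newlines.
const lazyPattern = /function.*?SaveData\(\)/s;
const lazyMatch = source.match(lazyPattern);

// The match begins at the first "function" in the text, not the nearest one,
// so it spans both function bodies.
console.log(lazyMatch.index);
console.log(lazyMatch[0]);
```

Laziness shortens a match from the right-hand side only; it never moves the starting point to the right.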

I have tried asking it to ignore the second function (I believe) by using something like:

function(?s).*?(?:(?!function).*?)*savedata\(\)

But that doesn't work.
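
A likely reason the attempt above fails: the first `.*?` can already run across any number of `function` keywords, and inside the group the lookahead is checked at one position only, after which the inner `.*?` is free to consume anything. One common way to express "no other function in between" is to repeat the negative lookahead before every single character. The following is a sketch rather than a full solution. It assumes the word `function` never appears in comments, strings, or nested callbacks between the declaration and the call, and it will also pick up a call that sits outside the nearest preceding function. The sample text is made up.

```javascript
// Hypothetical sample: two functions, only the second one calls SaveData().
const source = [
  "function first() {",
  "    return 1;",
  "}",
  "",
  "function second() {",
  "    SaveData();",
  "}",
].join("\n");

// (?:(?!function)[\s\S])*? consumes one character at a time (including
// newlines), and refuses to step onto a position where "function" begins.
const nearestPattern = /function(?:(?!function)[\s\S])*?SaveData\(\)/;
const nearestMatch = source.match(nearestPattern);

// The match now starts at the declaration closest to the call.
console.log(nearestMatch.index === source.indexOf("function second"));
console.log(nearestMatch[0]);
```

Because `[\s\S]` matches newlines by itself, this version does not need `(?s)` or the `s` flag.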

I have done some investigation with lookaheads and lookbehinds, but either I am doing it wrong (highly possible) or they are not the right thing.

In summary (I guess): how do I go backwards from a given word to the nearest occurrence of a different word?

At the moment I am using this to search through some JavaScript files to try to understand the structure/calls etc., but ultimately I am hoping to use it on C# files and some VB.NET files.

Many thanks in advance

Thanks for the swift responses, and sorry for not adding an example block of code, which I will do now (modified, but still sufficient to show the issue).

if I have a simple block of javascript like the following:

    function a_CellClickHandler(gridName, cellId, button){
        var stuffhappenshere;
        var andhere;
        if(somethingorother){
            if (anothertest) {

                event.returnValue=false;
                event.cancelBubble=true;
                return true; 
            }
            else{
                event.returnValue=false;
                event.cancelBubble=true;
                return true;
            }
        } 
    }

    function a_DblClickHandler(gridName, cellId){
        var userRow = rowfromsomewhere;
        var userCell = cellfromsomewhereelse;
        //this will need to save the local data before allowing any inserts to ensure that they are inserted in the correct place
        if (checkforarangeofthings){
            if (differenttest) {
                InsSeqNum = insertnumbervalue;
                InsRowID = arow.getValue()
                blnWasInsert = true;
                blnWasDoubleClick = true;
                SaveData();      
            }
        }
    }

When I run the regexes against this, including the second one that was identified as one that should work, Sublime Text 2 selects everything from the first function through to SaveData().

I would like to be able to get just a_DblClickHandler in this case, not both.

Hopefully this code snippet will add some clarity, and sorry for not posting it originally; I had hoped a standard code file would suffice.

megapode
  • Please show examples of the text to which you're applying the regex. Without that it will be almost impossible for anybody to help you. – Jim Garrison Jan 26 '13 at 21:24
  • Your second pattern, annotated *does something similar*, should do what you want. I suggest you try it again. – Borodin Jan 26 '13 at 21:58
  • thank you to both of you for your quick responses. – megapode Jan 26 '13 at 22:58
  • The main problem you are having is the inappropriate usage of `.*`, and by "inappropriate" I mean _any_ ;-) Seriously though, if you are having problems with a regex, and you are using `.*` (which you shouldn't be to start off with, remember), try rewriting the regex without it. See [this post](http://stackoverflow.com/a/14428793/1961728), especially my answer, for another example of a similar problem with .* usage. See [this post](http://stackoverflow.com/q/5319840) for a fuller description of how `.*` actually works. – robinCTS Jan 27 '13 at 06:06
  • Robin, thanks for this information and the additional links which I'll read through now. I have been trying to use the .*? pattern but I believe that I missed it on this occasion. I have been working through various iterations so may have picked the wrong one up... :) – megapode Jan 27 '13 at 09:55
  • I should have been clearer: When I said `.*` previously, I actually meant any of the three [`.*`, `.*?`, `.*+`]. Whilst `.*?` is a little safer, basically the same problems arise with its inappropriate (ie, any) usage. That first post I linked to previously still fails, even if all the `.*` are replaced with `.*?`. `.*+`, whilst safe, will generally cause the regex to fail. As a tip on the side, when commenting on a question/answer and targeting anybody other than the author start the comment with @ and the persons name so they get notified as well. See the comments to my answer for examples. – robinCTS Jan 28 '13 at 10:50

1 Answer


This regex will find every JavaScript function containing a call to the SaveData method:

(?<=[\r\n])([\t ]*+)function[^\r\n]*+[\r\n]++(?:(?!\1\})[^\r\n]*+[\r\n]++)*?[^\r\n]*?\bSaveData\(\)

It will match all the lines in the function up to, and including, the first line containing the SaveData method.

Caveats:

  • The source code must have well-formed indentation for this to work, as the regex uses matching indentations to detect the end of functions.
  • Will not match a function if it starts on the first line of the file.

Explanation:

(?<=[\r\n])                      Start at the beginning of a line
([\t ]*+)                        Capture the indentation of that line in Capture Group 1
function[^\r\n]*+[\r\n]++        Match the rest of the declaration line of the function
(?:(?!\1\})[^\r\n]*+[\r\n]++)*?  Match more lines (lazily) which are not the last line of the function, until: 
[^\r\n]*?\bSaveData\(\)          Match the first line of the function containing the SaveData method call

Note: The *+ and ++ are possessive quantifiers, only used to speed up execution.
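
As a rough way to try the pattern outside an editor, here is a sketch that ports it to a JavaScript regex literal. JavaScript has no possessive quantifiers, so `*+` and `++` are written as plain `*` and `+`, which should only affect speed as noted above. The sample source is a shortened, made-up version of the code in the question, with a comment on the first line because of the first-line caveat.

```javascript
// Shortened, hypothetical version of the sample from the question.
const source = [
  "// sample file",
  "function a_CellClickHandler(gridName, cellId, button) {",
  "    if (button) {",
  "        return true;",
  "    }",
  "}",
  "",
  "function a_DblClickHandler(gridName, cellId) {",
  "    if (cellId) {",
  "        SaveData();",
  "    }",
  "}",
].join("\n");

// Same structure as the answer's regex, minus the possessive quantifiers:
//   (?<=[\r\n])                  start at the beginning of a line
//   ([\t ]*)                     capture the indentation in group 1
//   function[^\r\n]*[\r\n]+      rest of the declaration line
//   (?:(?!\1\})[^\r\n]*[\r\n]+)*?  lines that do not close the function
//   [^\r\n]*?\bSaveData\(\)      the line containing the call
const pattern =
  /(?<=[\r\n])([\t ]*)function[^\r\n]*[\r\n]+(?:(?!\1\})[^\r\n]*[\r\n]+)*?[^\r\n]*?\bSaveData\(\)/;

const found = source.match(pattern);

// Only the function that actually contains the call is matched.
console.log(found[0]);
```

The first function is skipped because its closing `}` sits at the same indentation as its `function` keyword, which is the point where the lookahead `(?!\1\})` stops the match from continuing into the next function.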

EDIT: Fixed two minor problems with the regex.
EDIT: Fixed another minor problem with the regex.

robinCTS
  • Thank you for your help firstly, your answer shows that a) regexes are often far from simple b) very powerful! However, unfortunately, it doesn't work - in the first instance it is complaining about the '}' without an opening '{' If I try and fix that (with patterns I know) then I get an invalid look behind assertion error. I will do some experimenting now that I realise that it won't just be a simple pattern... (repost as learning the timeouts on edits and use of shift-return :) – megapode Jan 27 '13 at 09:15
  • @megapode - My bad. I didn't test it with Perl. I used EditPad Pro. Seems like `}` needs escaping in Perl. Have fixed it in my answer. Can't you delete the first comment? I can't recall needing any reputation to do so. – robinCTS Jan 27 '13 at 11:09
  • Robin, thanks, I didn't think of it as the } at the end of the line (doh!) that was the one option I didn't try! However, it is still saying that there is an invalid lookbehind assertion, any additional ideas, please? – megapode Jan 27 '13 at 13:09
  • Ok, that actually seems a lot closer. If I take off the look back assertion then it is still matching multiple function blocks. However, if I delete the blank rows between functions then it seems to be working - so maybe I need to play around a little more... Nope, my mistake it's not the blank lines :( – megapode Jan 27 '13 at 13:15
  • @megapode - Ah, yes. You're using Perl version 5. It doesn't allow variable length lookbehinds. Version 6 does. This time I've definitely fixed it in my answer. – robinCTS Jan 27 '13 at 16:41
  • Robin, thanks for your continued updates, it is much appreciated. I have now had a chance to try this against the javascript and it is still picking up duplicate functions before the SaveData() function. It seems to match the first line with function and then all lines until it gets to SaveData... Maybe there is something weird with the implementation of Regex in Sublime Text 2 (using as it is cross-platform and on the whole excellent) – megapode Jan 27 '13 at 20:14
  • (sorry only just noticed your post about use of @) it would appear that it might be an issue in Sublime Text 2 itself. I have just used a different regex tool (Patterns on the Mac) and that works exactly as you describe, which is not consistent with the behaviour in Sublime Text 2 - not yet sure the platform that Patterns is built on - maybe that should be my next check... – megapode Jan 27 '13 at 20:38
  • @megapode - Looks like Sublime Text 2 doesn't allow named backreferences in lookaheads! Tested in the latest portable Windows 32 bit version. Answer has been modified with fix. – robinCTS Jan 28 '13 at 02:47
  • thank you very much!!! This appears to be working. Thank you for your time and dedication on this, it is very much appreciated! I believe I have marked this as answered... – megapode Jan 28 '13 at 07:31