
I need help separating and correctly matching the sub-expressions that follow certain patterns inside a main expression.

I'm working with Regex in C#, and I need to use the `Regex.Matches` function to get the results.

Specifically, I need to correctly match expressions that contain quotation marks, parentheses, or brackets. These are some examples:

  1. Example 1: I have 3 expressions inside quotation marks. I expect to get the 3 expressions split.

    • Main Expression --> "'word1' 'word2' 'word3'"
    • Expected in Pseudocode --> new Collection(['word1','word2','word3'])
  2. Example 2: I have 3 expressions like --> key:word . I expect to get the 3 expressions split.

    • Main Expression --> "KEY1:WORD1 KEY2:WORD2 KEY3:WORD3"
    • Expected in Pseudocode --> new Collection([KEY1:WORD1,KEY2:WORD2,KEY3:WORD3])
  3. Example 3: I have 3 expressions like --> key:'word' . I expect to get the 3 expressions split.

    • Main Expression --> "KEY1:'WORD1' KEY2:'WORD2' KEY3:'WORD3'"
    • Expected in Pseudocode --> new Collection([KEY1:'WORD1',KEY2:'WORD2',KEY3:'WORD3'])
  4. Example 4: I have 3 expressions like --> key:[word] . I expect to get the 3 expressions split.

    • Main Expression --> "KEY1:[WORD1] KEY2:[WORD2] KEY3:[WORD3]"
    • Expected in Pseudocode --> new Collection([KEY1:[WORD1],KEY2:[WORD2],KEY3:[WORD3]])

KEYx could be anything, and WORDx could be anything as well.

This is the code that I'm using, where the `input` variable is the main expression:

    private List<String> getSplitSimpleExpressions(string input)
    {
        string pattern = @".+:\[.+?\]|.+:\'.+\'*$|.+:.+|\'.+\'";
        List<String> Expressions = Regex.Matches(input, pattern).Cast<Match>().Select(x => x.Value).ToList();
        return Expressions;
    }

With this function I get only 1 match for each of the above examples.

This is because the pattern (let's say `\'.+\'`) is greedy: it matches up to the last closing quotation mark ('word1.......word3') instead of the first closing quotation mark of each expression ('word1', 'word2', 'word3').

The same happens with example 3:
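A minimal sketch of that difference, using example 1 as input. The class and helper names (`GreedyVsLazyDemo`, `MatchAll`) are made up for illustration; only `Regex.Matches` comes from the question. It compares the greedy `.+` with the lazy `.+?`, which stops at the first closing quotation mark it can:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // Hypothetical helper: returns every match of the pattern as a string.
    public static string[] MatchAll(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        string input = "'word1' 'word2' 'word3'";

        // Greedy: .+ runs to the last quotation mark, so the whole input is one match.
        string[] greedy = MatchAll(input, @"'.+'");

        // Lazy: .+? stops at the first closing quotation mark after each opening one.
        string[] lazy = MatchAll(input, @"'.+?'");

        Console.WriteLine(greedy.Length);            // 1
        Console.WriteLine(string.Join(" | ", lazy)); // 'word1' | 'word2' | 'word3'
    }
}
```

A negated character class such as `'[^']*'` would be another way to stop at the first closing quotation mark, assuming the quoted text never contains a quotation mark itself.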

  • Expression: "KEY1:'WORD1' KEY2:'WORD2' KEY3:'WORD3'"
  • Pattern: .+:\'.+\'
  • Result (Only 1 result): KEY1:'WORD1.......WORD3'

The same happens with example 4, but with the closing bracket:

  • Expression: "KEY1:[WORD1] KEY2:[WORD2] KEY3:[WORD3]"
  • Pattern: .+:\[.+?\]
  • Result (Only 1 result): KEY1:[WORD1.......WORD3]
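In the bracket case the leading `.+` before the colon is greedy too, so it can run across spaces and swallow the earlier pairs. A possible sketch that restricts both parts with negated character classes follows. It assumes keys contain no whitespace or colons and that the bracketed text contains no `]`; the names `BracketPairs` and `Extract` are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BracketPairs
{
    // Hypothetical helper: extracts every key:[value] pair from the input.
    public static string[] Extract(string input)
    {
        // [^\s:]+  key: one or more characters that are neither whitespace nor ':'
        // \[ ... \] literal brackets around [^\]]*, i.e. anything except ']'
        const string pattern = @"[^\s:]+:\[[^\]]*\]";
        return Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();
    }

    public static void Main()
    {
        foreach (string pair in Extract("KEY1:[WORD1] KEY2:[WORD2] KEY3:[WORD3]"))
            Console.WriteLine(pair); // one pair per line
    }
}
```

Because the value class only excludes `]`, spaces inside the brackets stay part of the match.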

I could split the main expression on blank spaces, but I would have to ignore the blank spaces inside the quotation marks.

So I need to match each of these expressions up to the first closing quotation mark, the first closing bracket, etc. How should I fix this? What do you think? Thanks in advance.
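One way to get the effect of splitting on spaces while keeping spaces inside quotation marks or brackets is to match the tokens rather than the separators. The following is only a sketch under stated assumptions: keys contain no whitespace, colons, quotation marks, or brackets, and quoted or bracketed values do not contain their own closing delimiter. The names `TokenSplitter` and `Split` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenSplitter
{
    // Optional "key:" prefix, then a quoted value, a bracketed value, or a bare word.
    private const string Pattern =
        @"(?:[^\s:'\[\]]+:)?(?:'[^']*'|\[[^\]]*\]|[^\s'\[\]]+)";

    // Hypothetical helper: returns the tokens in the order they appear.
    public static List<string> Split(string input) =>
        Regex.Matches(input, Pattern).Cast<Match>().Select(m => m.Value).ToList();

    public static void Main()
    {
        string[] inputs =
        {
            "'word1' 'word2' 'word3'",
            "KEY1:WORD1 KEY2:WORD2 KEY3:WORD3",
            "KEY1:'WORD1' KEY2:'WORD2' KEY3:'WORD3'",
            "KEY1:[WORD1] KEY2:[WORD2] KEY3:[WORD3]",
        };

        // Each of the four example inputs yields three tokens.
        foreach (string input in inputs)
            Console.WriteLine(string.Join(", ", Split(input)));
    }
}
```

Whitespace between tokens is never part of a match, while whitespace inside `'...'` or `[...]` is kept because those alternatives only stop at their closing delimiter.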

Sebastian
  • This is completely different... I don't want to "replace all the - characters between the quotation marks with, say, a space. ..." I need to match all the expressions in the quotation marks. – Sebastian Oct 20 '17 at 05:38
  • Also, I asked for C# code. – Sebastian Oct 20 '17 at 05:40
  • It seems you may use `@"(?:[^:]+:)?'.*?'"` or `@"(?:[^:\s]+:)?'.*?'"` (if keys can't have whitespace). Or even `@"(?:\w+:)?'.*?'"` if keys can only consist of word chars. – Wiktor Stribiżew Oct 20 '17 at 06:02
