I am having some issues wrapping my head around regular expressions.

I have the following type of string that I need to parse and get data from:

'Account'.Search = "[Id] is NULL OR NAME = ""0%"""
'Account'.Sort = "NAME, Address"
'Bank Objectives'.Search = "[Active Flg] = "Y""
'Bank'.Search = "[Id] is NULL"
'Bank'.Sort = "Transit"
'Bank Goals'.Search = "[Active Flg] = 'Y'"
.......

The following expression seems to work

\s*(?:(?<search>(?:(=['"])?(?:.*)\1|.*)\.Search[^"']*(?:(=['"])?(?:.*)\1|.*))\s*$?(?<sort>(?:(=['"])?(?:.*)\1|.*)\.Sort[^"']*(?:(=['"])?(?:.*)\1|.*))?\s*$?)

And I can get the output like

search: 'Account'.Search = '[Id] is NULL OR NAME = ''0%'''
sort: 'Account'.Sort = "NAME, Address"
search: 'Bank Objectives'.Search = "[Active Flg] = "Y""
search: 'Bank'.Search = "[Id] is NULL"
sort: 'Bank'.Sort = "Transit"
search: 'Bank Goals'.Search = "[Active Flg] = 'Y'"

However if the input string doesn't have line returns, then I start to get

search: 'Account'.Search = '[Id] is NULL OR NAME = ''0%''''Account'.Sort = "NAME, Address"

What I really want from the input string is

Account
  Search = "[Id] is NULL OR NAME = ""0%"""
  Sort = "NAME, Address"
Bank Objectives 
  Search = "[Id] is NULL OR NAME = ""0%"""
  Sort = ""
.....

Also, the (=['"])?(?:.*)\1|.*) part is supposed to match quotes, but it doesn't seem to be working either.

The end goal is to pass the string [Id] IS NULL or Name =""0%""" to an expression editor, let the user make changes, and then update only the search portion of the overall string. So if the user wants the account names that start with bob, I would need to generate

'Account'.Search = "Name like 'bob%'"
'Account'.Sort = "NAME, Address"
'Bank Objectives'.Search = "[Active Flg] = "Y""
'Bank'.Search = "[Id] is NULL"
'Bank'.Sort = "Transit"
'Bank Goals'.Search = "[Active Flg] = 'Y'"
.......

I already have that piece built; the problem is parsing this string, which can have multiple entries. Also, the string could contain just

"Name like 'bob%'"

So it is left to me to figure out that it was actually supposed to be

'Account'.Search = "Name like 'bob%'"

So I also need to know whether the input string matches the regex at all.

Alan Moore
Nadin
  • What the heck are you trying to do? Are you trying to build something like QBE ("Query By Example"), or to dynamically generate a SQL query? – MethodMan Jan 03 '13 at 21:02
  • Please explain more clearly what you are trying to do and what the parsed strings will be used for. – MethodMan Jan 03 '13 at 21:06
  • The strings are encoded by another system, which I don't actually understand. However, I need to be able to pull out part of the string, specifically the search portion of Account (for example), load it into an expression editor, allow the user to make changes, and then update that portion of the string. – Nadin Jan 03 '13 at 21:13
  • Can you provide a cleaner example in your original question, with the results you are expecting? Perhaps you could use some other method of getting the strings out, like a Split() method. – MethodMan Jan 03 '13 at 21:18

1 Answer

It's unlikely you will get a 100% solution using regular expressions. If your syntax is complex and you may receive input that does not conform to it, you want to write a lexer and parser instead. There are plenty of resources on how to do that for .NET.

Here are some starter links:

  • http://stackoverflow.com/questions/1669/learning-to-write-a-compiler
  • http://blogs.microsoft.co.il/blogs/sasha/archive/2010/10/06/writing-a-compiler-in-c-lexical-analysis.aspx
  • http://www.antlr.org/

Update:

If you must use regex, why not break it down into several smaller problems? First, split by newline and parse each line individually. Then, use a set of simpler regexes to break down the string.

For instance, parse each line into [obj].[verb] = [rule] with:
/'(?<obj>[^']+)'\.(?<verb>\w+)\s*=\s*"(?<rule>.+)"/

Then parse the rule out for fields with:
/\[[^\]]+\]/
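
As a rough illustration of the split-then-parse idea, here is a minimal Python sketch (the question targets .NET; Python spells named groups `(?P<name>...)` where .NET uses `(?<name>...)`). The names `parse_entries` and `replace_rule` are hypothetical, and the sketch assumes one entry per line, which the question says is not always true. It also records where each rule sits in the original string, so a single rule can be swapped out after editing. A bare rule with no `'Obj'.Verb =` prefix simply produces no entries, which gives a way to detect that case.

```python
import re

# One entry per line: 'Obj'.Verb = "rule". The greedy .* runs to the last
# quote on the line, so embedded quotes stay inside the rule.
LINE_RE = re.compile(
    r"""\s*'(?P<obj>[^']+)'\.(?P<verb>\w+)\s*=\s*"(?P<rule>.*)"\s*$"""
)
# Bracketed field names such as [Id] or [Active Flg].
FIELD_RE = re.compile(r"\[[^\]]+\]")


def parse_entries(text):
    """Return a dict for every line that looks like 'Obj'.Verb = "rule"."""
    entries = []
    offset = 0  # position of the current line within the whole text
    for line in text.splitlines(keepends=True):
        m = LINE_RE.match(line.rstrip("\r\n"))
        if m:
            start, end = m.span("rule")
            entries.append({
                "obj": m.group("obj"),
                "verb": m.group("verb"),
                "rule": m.group("rule"),
                "span": (offset + start, offset + end),
                "fields": FIELD_RE.findall(m.group("rule")),
            })
        offset += len(line)
    return entries


def replace_rule(text, obj, verb, new_rule):
    """Replace the rule of the first matching entry; return text unchanged if none."""
    for entry in parse_entries(text):
        if entry["obj"] == obj and entry["verb"] == verb:
            start, end = entry["span"]
            return text[:start] + new_rule + text[end:]
    return text
```

Calling `replace_rule(text, "Account", "Search", "Name like 'bob%'")` would rewrite only the Account search rule and leave every other entry as it was.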

saarp
  • oh crap. I was hoping to be done with all that from university. Can anyone suggest a way to just get the "[Id] is NULL OR NAME = ""0%""" part out and the location where it was found in the string? – Nadin Jan 03 '13 at 22:27