I want to determine whether any string from a list of rejectable strings occurs in some text, but ignore an occurrence if it lies within a larger string from a list of allowable strings at that same location in the text.
Simple example:
Text: "The quick red fox jumped over the lazy brown dog in front of the farmer."
rejectableStrings: "fox", "dog", "farmer"
allowableStrings: "quick red fox", "smurfy blue fox", "lazy brown dog", "old green farmer"
So, raise a flag if any of the strings "fox", "dog", or "farmer" is found in the text, but not if that occurrence is contained within one of the allowable strings (at the same location in the text where the rejectable string was found).
Example logic not yet complete:
string status = "allowable";
foreach (string rejectableString in rejectableStrings)
{
    // check if rejectableString is found as a whole word, surrounded by whitespace or the start/end of the string
    // https://stackoverflow.com/a/16213482/56082
    string invalidValuePattern = string.Format(@"(?<!\S){0}(?!\S)", rejectableString);
    if (Regex.IsMatch(text, invalidValuePattern, RegexOptions.IgnoreCase))
    {
        // it is found, so we initially raise the flag and check further
        status = "flagged";
        foreach (string allowableString in allowableStrings)
        {
            // only need to consider allowableString if it contains the rejectableString, otherwise ignore
            if (allowableString.Contains(rejectableString))
            {
                // check if the found occurrence of the rejectableString in text is actually contained within a relevant allowableString
                // *** the area that needs attention ***
                if ('rejectableString occurrence found in text is also contained within the same substring allowableString of text')
                {
                    // this occurrence of rejectableString is actually allowable, change status back to allowable and break out of the allowable foreach
                    status = "allowable";
                    break;
                }
            }
        }
        if (status.Equals("flagged"))
        {
            throw new Exception(rejectableString.ToUpper() + " found in text is not allowed.");
        }
    }
}
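One way the missing condition could be filled in is to compare positions: take the index and length of each whole-word match of the rejectable string, then check whether any occurrence of an allowable string in the text spans that range. The following is only a sketch under that assumption. The class name `RejectableStringChecker` and the methods `FindUncovered` and `IsCovered` are hypothetical and not part of the code above, and it returns the flagged terms instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; class and method names are illustrative only.
public static class RejectableStringChecker
{
    // Returns every rejectable string that has at least one whole-word occurrence
    // in text that is not covered by an occurrence of an allowable string.
    public static List<string> FindUncovered(
        string text, IEnumerable<string> rejectableStrings, IEnumerable<string> allowableStrings)
    {
        var flagged = new List<string>();
        foreach (string rejectable in rejectableStrings)
        {
            // Same whole-word pattern as in the question, with the term escaped.
            string pattern = @"(?<!\S)" + Regex.Escape(rejectable) + @"(?!\S)";
            foreach (Match match in Regex.Matches(text, pattern, RegexOptions.IgnoreCase))
            {
                if (!IsCovered(text, match.Index, match.Length, allowableStrings))
                {
                    flagged.Add(rejectable);
                    break; // one uncovered occurrence is enough to flag this term
                }
            }
        }
        return flagged;
    }

    // True if some occurrence of an allowable string in text spans [start, start + length).
    private static bool IsCovered(
        string text, int start, int length, IEnumerable<string> allowableStrings)
    {
        foreach (string allowable in allowableStrings)
        {
            if (string.IsNullOrEmpty(allowable)) continue;
            int pos = text.IndexOf(allowable, StringComparison.OrdinalIgnoreCase);
            while (pos >= 0)
            {
                if (pos <= start && start + length <= pos + allowable.Length)
                    return true;
                // Keep scanning: a later occurrence of the same allowable string may cover the match.
                pos = text.IndexOf(allowable, pos + 1, StringComparison.OrdinalIgnoreCase);
            }
        }
        return false;
    }
}
```

Checking every occurrence of the rejectable string matters because one occurrence may sit inside an allowable string while another does not. Note also that with the example sentence nothing is flagged: "fox" and "dog" are covered, and "farmer." never matches the whole-word pattern because the trailing period fails the `(?!\S)` lookahead.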
Background if interested: This is for an SQL query validation method for an app where the goal is to reject queries that contain permanent database modification commands, but allow the query to be considered valid if the invalid command found is actually a substring of a temporary table command or some other logical exception that should allow the command within the query. This is a multi-database query validation, not specific to a single database product.
So the real-world examples for rejectable and allowable are:
private string[] rejectableStrings = { "insert", "update", "set", "alter", "create", "delete" };
private string[] allowableStrings = { "insert into #", "create table #",
    "create global temporary table ", "create temporary tablespace ", "offset " };
and the text would be an SQL query.
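For lists like these, a different technique that might be simpler is masking: blank out every occurrence of an allowable string in the query first, then look for rejectable words in what remains. This is a swapped-in alternative, not the approach from the loop above, and the names `SqlTextMasker` and `FindRejected` are hypothetical. It assumes the allowable strings appear in the query exactly as written, apart from letter case, so a doubled space or a line break inside "insert into #" would not be recognized.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; class and method names are illustrative only.
public static class SqlTextMasker
{
    // Returns the rejectable strings still present as whole words
    // after all allowable strings have been replaced by spaces.
    public static List<string> FindRejected(
        string query, IEnumerable<string> rejectableStrings, IEnumerable<string> allowableStrings)
    {
        string masked = query;
        foreach (string allowable in allowableStrings)
        {
            if (string.IsNullOrEmpty(allowable)) continue;
            // Replace each occurrence with spaces of equal length so word boundaries stay intact.
            masked = Regex.Replace(masked, Regex.Escape(allowable),
                m => new string(' ', m.Length), RegexOptions.IgnoreCase);
        }

        var rejected = new List<string>();
        foreach (string rejectable in rejectableStrings)
        {
            string pattern = @"(?<!\S)" + Regex.Escape(rejectable) + @"(?!\S)";
            if (Regex.IsMatch(masked, pattern, RegexOptions.IgnoreCase))
                rejected.Add(rejectable);
        }
        return rejected;
    }
}
```

Called with the two arrays above, a query such as "update users set name = 'x'" would come back with "update" and "set", while a query that only creates and fills a "#" temp table would come back empty. Because the whole-word pattern never matches "set" inside "offset", the "offset " entry only matters if the matching is later changed to plain substring search.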