I have a static method that takes an input string for a search. The method splits this input string on spaces and runs a search algorithm (RavenQueryable) on each part. The search input can include (Dutch) postcodes, and the customer wants to find all of them, regardless of whether they contain a space.

In semi-code - what I had:

// Replace multiple whitespaces in the search-input for a single one
// Split the search-input at a single space
// Use RavenQueryable's SearchMultiple-method on this array of strings

What I want to replace it with:

// Replace multiple whitespaces in the search-input for a single one
// Find a (part of) a postcode regex with a whitespace "[1-9][0-9]{3}[ ][A-Za-z]{2}" or @"[\d][ ][A-Za-z]"
// var string with this postcode without spaces (replaced for "[1-9][0-9]{3}[A-Za-z]{2}" or @"[\d][A-Za-z]")
// Find a postcode regex without a whitespace "[1-9][0-9]{3}[A-Za-z]{2}" or @"[\d][A-Za-z]"
// var string with this postcode with a single whitespace (replaced for "[1-9][0-9]{3}[ ][A-Za-z]{2}" or @"[\d][ ][A-Za-z]")
// Split the search-input at a single space
// Use RavenQueryable's SearchMultiple-method on this array of strings

This way, when the user inputs a postcode (with or without whitespace, it doesn't matter), it will find all occurrences (both with and without whitespace).

As an example:

  • When the user puts in 1234 AB: It gives results for both items with 1234AB and 1234 AB
  • When the user puts in 1234AB: It gives results for both items with 1234AB and 1234 AB

Some code I already have:

public static IRavenQueryable<T> SearchMultiple<T>(this IRavenQueryable<T> self,
    Expression<Func<T, object>> fieldSelector, string queries,
    decimal boost = 1, SearchOptions options = SearchOptions.Or)
{
    if (string.IsNullOrEmpty(queries)) throw new ArgumentNullException("queries");

    queries = Regex.Replace(queries, @"\s{2,}", " ");
    // Postcode code
    var searchValues = queries.Split(' ');

    return self.SearchMultiple(fieldSelector, searchValues, boost, options);
}

So, how do I make this // Postcode code so I replace my "what I had semi-code" for my "what I want to replace it with semi-code"?


EDIT:

  • I know how to get the postcode regex: var postcode = Regex.Match(queries, "[1-9][0-9]{3}[A-Za-z]{2}");
  • I just don't know how to replace a regex match with another regex-based string. I know there is a Regex.Replace, but this replaces the entire match with the chosen string. What I want instead is to replace the string that matches the regex with the same string (but with a space added).

If I only accept whole postcodes (like 1234AB / 1234 AB), I would just use a string-substring to add/replace a space after the 4th character. But since I also want to allow the user to put part of the postcode as a valid search (like 34A / 34 A, which also both need to search for 1234AB and 1234 AB), I can't use a sub-string after the 4th character.

I hope this clears up what I want to achieve and where I'm stuck. Is there some kind of method that replaces a regex match with the same match plus an added character (like a space in my case)? That would be great.


EDIT 2:

Ok, I found a regex for regex replace method here, I just don't know how to apply it to my case.

When I try the following code, it gives an ArgumentException that my regex is incorrect. I almost never use Regex and don't know a lot about it, so any help would be appreciated.

if (string.IsNullOrEmpty(queries)) throw new ArgumentNullException("queries");

queries = Regex.Replace(queries, @"\s{2,}", " ");
const string withSpaceRegex = @"?<decimals>[\d][ ]?<letters>[A-Za-z]";
const string withoutSpaceRegex = @"?<decimals>[\d]?<letters>[A-Za-z]";
const string replacementWithSpace = "${decimals}${letters}";
const string replacementWithoutSpace = "${decimals} ${letters}";
var postcodesWithSpace = Regex.Matches(queries, withSpaceRegex);
var postcodesWithoutSpace = Regex.Matches(queries, withoutSpaceRegex);
queries = postcodesWithSpace.Cast<string>().Aggregate(queries, (current, s) => current
    + " " + Regex.Replace(s, s, replacementWithSpace, RegexOptions.IgnoreCase));
queries = postcodesWithoutSpace.Cast<string>().Aggregate(queries, (current, s) => current
    + " " + Regex.Replace(s, s, replacementWithoutSpace, RegexOptions.IgnoreCase));
var searchValues = queries.Split(' ');

return self.SearchMultiple(fieldSelector, searchValues, boost, options);
Kevin Cruijssen
  • It's great that you're showing what you already have, but what is the question? – khellang Nov 21 '14 at 09:27
  • @khellang Edited. I want to know the code for searching a postcode that matches both with and without spaces – Kevin Cruijssen Nov 21 '14 at 09:29
  • This is not really a "please write the codez for me" type of place. It seems you have the pseudo-code under control, why don't you try writing it out and come back when you have a _real_ problem? ;) – khellang Nov 21 '14 at 09:33
  • @khellang Edited once again. I know how to get the postcode part, I just don't know how to replace it with its counterpart (with space to without space, and vice-versa). – Kevin Cruijssen Nov 21 '14 at 09:47

1 Answer

Ok, some things I came across making this semi-answer below:

  • Since I do a .Split(' ');, I don't need to evaluate the ones already with a space, I just need to add postcodes without a space to the query as the exact same postcode (but instead with a space).
  • Apparently you can assign variable names to your regex parts, and then use them in Regex.Replace.
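To illustrate the second point, here is a minimal sketch of named groups with Regex.Replace. The class and method names (NamedGroupDemo, RemovePostcodeSpace, IsInvalidPattern) are hypothetical and only serve this example. It also shows why the pattern from EDIT 2 of the question threw an ArgumentException: a named group needs the full (?<name>...) form, and without the parentheses the pattern starts with a ? quantifier that has nothing to quantify.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original method.
public static class NamedGroupDemo
{
    // Removes the single space between a run of digits and a run of letters,
    // e.g. "1234 AB" becomes "1234AB" and "34 A" becomes "34A".
    public static string RemovePostcodeSpace(string input)
    {
        // Named groups must be written as (?<name>...), parentheses included.
        const string pattern = @"(?<decimals>[0-9]+)[ ](?<letters>[A-Za-z]+)";

        // ${name} in the replacement string inserts the text captured by that group.
        return Regex.Replace(input, pattern, "${decimals}${letters}");
    }

    // Returns true when the Regex constructor rejects the pattern.
    public static bool IsInvalidPattern(string pattern)
    {
        try
        {
            new Regex(pattern);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Passing the EDIT 2 pattern @"?<decimals>[\d][ ]?<letters>[A-Za-z]" to IsInvalidPattern returns true, while the parenthesised pattern is accepted. Note that the pattern joins any digits followed by a space and letters, so it is only as precise as the question's "part of a postcode" requirement allows.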

So, now I have the following code in my method, which fixes my regex replace problem:

if (string.IsNullOrEmpty(queries)) throw new ArgumentNullException("queries");

var newQueries = Regex.Replace(queries, @"\s{2,}", " ");
var withSpaceRegex = @"(?<decimals>[0-9]+)[ ](?<letters>[A-Za-z]+)";
var replacementWithSpace = "${decimals}${letters}";
var postcodesWithSpace = Regex.Matches(newQueries, withSpaceRegex);
newQueries = postcodesWithSpace.Cast<object>().Aggregate(newQueries, (current, s) => current
    + " " + Regex.Replace(s.ToString(), withSpaceRegex, replacementWithSpace, RegexOptions.IgnoreCase));
var searchValues = newQueries.Split(' ');

return self.SearchMultiple(fieldSelector, searchValues, boost, options);

Some examples:

  • "1234AB" -> "1234AB"
  • "some words 1234AB" -> "some words 1234AB"
  • "1234 AB" -> "1234 AB 1234AB"
  • "some words 1234 AB" -> "some words 1234 AB 1234AB"
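The examples above can be reproduced outside of Raven with a small standalone sketch. It assumes the same pattern as my code, but reads the named groups directly instead of calling Regex.Replace per match. PostcodeQueryDemo and ExpandQuery are hypothetical names used only here.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical standalone version of the query expansion step.
public static class PostcodeQueryDemo
{
    private const string WithSpacePattern = @"(?<decimals>[0-9]+)[ ](?<letters>[A-Za-z]+)";

    // Collapses repeated whitespace, then appends a space-free copy of every
    // "digits space letters" match to the end of the query string.
    public static string ExpandQuery(string queries)
    {
        if (string.IsNullOrEmpty(queries)) throw new ArgumentNullException("queries");

        var collapsed = Regex.Replace(queries, @"\s{2,}", " ");

        // Build the joined variant of each match from its named groups.
        var joined = Regex.Matches(collapsed, WithSpacePattern)
            .Cast<Match>()
            .Select(m => m.Groups["decimals"].Value + m.Groups["letters"].Value);

        return string.Join(" ", new[] { collapsed }.Concat(joined));
    }
}
```

Calling ExpandQuery("some words 1234 AB").Split(' ') then yields the same search values that are passed to SearchMultiple in my method.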

So this does indeed fix my regex. The only problem is we use a SearchOptions.And for the RavenQueryable#SearchMultiple, so now it won't match either anymore.

Basically, the code above fixes my regex replace problem, but now I need to figure out how to have part of the string array used with SearchOptions.Or (the postcodes with and without spaces), and have the rest (including these postcode-ors) used with SearchOptions.And. This is however a completely new problem, which I'll first discuss with a colleague of mine who knows a lot more about Raven than I do, and if he doesn't know a solution either, I'll post a new question.


EDIT: We've decided to convert all postcodes with spaces to their equivalent without spaces, both when we import everything and when we save new ones. So I've just applied the regex above to the import part, and after that we only allow the user to input postcodes without spaces for newly added ones.

Kevin Cruijssen