I want a regex that returns true when the user's input contains a given word, whether in its singular or its plural form.

For example, 'I want to buy a drone' or 'I want to buy drones'.

    @Test
    public void testProductSearchRegexp() {
        String regexp = "(?i).*?\\b%s\\b.*?";

        String query = "I want the drone with FLIR Duo";

        String data1 = "drone";
        String data2 = "FLIR Duo";
        String data3 = "FLIR";
        String data4 = "drones";

        boolean isData1 = query.matches(String.format(regexp, data1));
        boolean isData2 = query.matches(String.format(regexp, data2));
        boolean isData3 = query.matches(String.format(regexp, data3));
        boolean isData4 = query.matches(String.format(regexp, data4));

        assertTrue(isData1);
        assertTrue(isData2);
        assertTrue(isData3);
        assertTrue(isData4);//Test fails here (obviously) 
    }

Your time on this question is much appreciated.

  • What about 'knife' and 'knives'? 'Child' and 'children'? 'Leaf' and 'leaves'? 'Man' and 'men'? – Michael Jul 25 '17 at 12:08
  • So you want to find out if one of the words ends with 's'? Is that a foolproof indication of a plural? – f1sh Jul 25 '17 at 12:09
  • To add to @Michael's list: moose? Would it be meese or mooses? Also deer? – CraigR8806 Jul 25 '17 at 12:10
  • @Michael This is a very valid point. What could be a possible solution? – Umair Ahmed Jul 25 '17 at 12:17
  • @UmairAhmed Some natural language parser library. You can find a list [here](https://stackoverflow.com/questions/870460/is-there-a-good-natural-language-processing-library) – talex Jul 25 '17 at 12:19

2 Answers

English is a language with many exceptions. Checking whether a word ends in 's' is simply not sufficient to determine whether it's plural.

The best way to solve this problem is not to solve it yourself. It has been done before, so take advantage of that. One option is to use a third-party API; the OED has one, for example.

If you were to make a request to their API such as:

/entries/en/mice

You would get back a JSON response containing:

"crossReferenceMarkers": [
    "plural form of mouse"
],

From there it should be easy to parse. Simply checking for the presence of the word 'plural' may be sufficient.
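As a rough sketch of that check, the helper below scans a response body for a "plural form of ..." marker and returns the singular it names. The class and method names are hypothetical, and it assumes the marker text appears exactly as in the excerpt above. It uses a plain regex instead of a JSON parser to stay dependency-free; real code would more likely parse the JSON with a proper library.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any dictionary API client.
final class PluralMarker {

    // Matches a quoted marker such as "plural form of mouse" and captures the singular.
    // Assumption: the response phrases the marker exactly like the excerpt above.
    private static final Pattern PLURAL_FORM_OF =
            Pattern.compile("\"plural form of ([^\"]+)\"");

    private PluralMarker() {}

    /** Returns the singular named in a "plural form of ..." marker, if one is present. */
    static Optional<String> singularOf(String jsonResponse) {
        Matcher m = PLURAL_FORM_OF.matcher(jsonResponse);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }
}
```

With the response shown above, `PluralMarker.singularOf(body)` would yield `mouse`, which can then be compared against the product name. An empty result means the word was not flagged as a plural.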

They even have working Java examples that you can copy and paste.


An advantage of this approach is there's no compile-time dependency. A disadvantage is that you're relying on being able to make HTTP requests. Another is that you're limited by any restrictions they impose. The OED allows up to 3k requests/month and 60 requests/minute on their free plan, which seems pretty reasonable to me.

Michael

Well, something like this is very hard to achieve without external sources. Sure, many plurals end with 's', but there are also a lot of exceptions, like "knife" and "knives" or "cactus" and "cacti". For a small vocabulary you could use a Map to handle these.

    import java.util.HashMap;
    import java.util.Map;

    // Irregular plurals, built once instead of on every call.
    private static final Map<String, String> IRREGULAR_PLURALS = new HashMap<>();
    static {
        IRREGULAR_PLURALS.put("cactus", "cacti");
        IRREGULAR_PLURALS.put("knife", "knives");
        IRREGULAR_PLURALS.put("man", "men");
        /* add all your irregular ones */
    }

    public static String getPlural(String singular) {
        // Regular case: fall back to appending "s" when no irregular form is listed.
        return IRREGULAR_PLURALS.getOrDefault(singular, singular + "s");
    }

Very simple and not very practical, but it gets the job done when you only have a few words.
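To tie this back to the question's test, one possible sketch is to build a single pattern that accepts either form, with the plural coming from a lookup such as `getPlural` above. The class and method names here are hypothetical. It uses `Pattern.quote` so that product names containing regex metacharacters are matched literally, and `find()` instead of `matches()` so the surrounding `.*?` is not needed.

```java
import java.util.regex.Pattern;

// Hypothetical helper combining the question's word-boundary pattern with a plural form.
final class SingularOrPluralMatcher {

    private SingularOrPluralMatcher() {}

    /** True if text contains the singular or the plural as a whole word, ignoring case. */
    static boolean mentions(String text, String singular, String plural) {
        // Pattern.quote keeps characters such as '.' or '+' in product names literal.
        String alternatives = Pattern.quote(singular) + "|" + Pattern.quote(plural);
        Pattern p = Pattern.compile("\\b(?:" + alternatives + ")\\b",
                Pattern.CASE_INSENSITIVE);
        // find() searches anywhere in the text, so no leading or trailing ".*?" is required.
        return p.matcher(text).find();
    }
}
```

For example, `SingularOrPluralMatcher.mentions(query, "drone", getPlural("drone"))` would accept both "I want the drone with FLIR Duo" and "I want to buy drones", while the word boundaries still reject a longer word that merely starts with "drone".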

T.Furholzer