I know there are similar threads on this, but most of them only involve ignoring spaces.

I have to write an app using some poorly written data sheets, so often I have to compare things like this: Packs, packs, Pack(s), pack(s), pack

These should all be considered equal, as they are all a pack. However, none of the people who made these data sheets communicated with each other so now I get to deal with it.

How can I compare strings while ignoring parentheses, spaces, the 's' character, and also making sure everything is lowercase before comparison?

All I have right now is this:

private boolean sCompare(String s1, String s2)
{
    return s1.equalsIgnoreCase(s2);
}

Obviously it isn't much: it only compares the two strings case-insensitively. I'm not sure of the proper approach to get the results I need.

The new comparison function should return true for the examples above, and false when comparing things like: Pack(s) and Case(s), Packs and Case(s), etc.

EDIT: Using help from the best answer, I've created a function that suits my needs:

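private boolean sCompare(String s1, String s2)
{
    // Note: the square brackets make this a character class, so it removes every
    // whitespace character, 'e', '(', 's', ')', '|' and '$', not only a trailing "s" or "(s)".
    String rx = "[\\se(s)|s$]";
    return (s1.toLowerCase().replaceAll(rx,"")).equals(s2.toLowerCase().replaceAll(rx,""));
}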
Connor S
  • How many different root words do you need to compare (eg, pack, case = 2 root words)? – Jason May 13 '19 at 22:42
  • @Jason Right now it would just be case, pack, box, unit, and bag. It might change tho, so I need a function that can handle anything (by simply removing s, (, ), and space) – Connor S May 13 '19 at 22:46
  • I have an idea to do this with regex, I'm gonna see if it works. If it does, I'll post it as an edit – Connor S May 13 '19 at 22:46
  • You could use a regex, but then you'd have two problems :) – Jason May 13 '19 at 22:47
  • Check out this post: https://stackoverflow.com/questions/796412/how-to-turn-plural-words-singular – Thai Doan May 13 '19 at 22:56

3 Answers

This:

public static void main(String[] args) throws Exception {
    String REGEX = "\\(s\\)|s$";

    System.out.println("Packs".replaceAll(REGEX, "")
                              .toLowerCase());
    System.out.println("packs".replaceAll(REGEX, "")
                              .toLowerCase());
    System.out.println("Pack(s)".replaceAll(REGEX, "")
                                .toLowerCase());
    System.out.println("pack(s)".replaceAll(REGEX, "")
                                .toLowerCase());
    System.out.println("pack".replaceAll(REGEX, "")
                             .toLowerCase());
}

Yields:

pack
pack
pack
pack
pack

So this should do it:

private static boolean sCompare(String s1, String s2) {
    return discombobulate(s1).equals(discombobulate(s2));
}

private static String discombobulate(String s) {
    String REGEX = "\\(s\\)|s$";

    return s.replaceAll(REGEX, "")
            .toLowerCase();
}
Not a JD
  • You'll need to tweak this if you want it to work for boxes -> box. Your fix will produce boxes -> boxe – Jason May 13 '19 at 22:49
  • This helps a lot. I will post an edit to the answer with the function I decided to use (a slightly modified version of this) – Connor S May 13 '19 at 22:56
  • Not to mention children, geese and hippopotami. – Dawood ibn Kareem May 13 '19 at 22:58
  • Hopefully product quantities won't be recorded in terms of geese lol. I've added an answer that successfully converts boxes, packs, cases, bags, cans, and some others I might need – Connor S May 13 '19 at 23:05
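The boxes -> boxe problem raised in the comments above could be handled by adding one more rule before the trailing "s" is dropped. The following is only a sketch under stated assumptions: the class name `UnitNames`, the `normalize` helper, and the rule that "es" after x, z, ch, or sh is a plural ending are all illustrative and do not come from the thread. Irregular plurals (children, geese) and singular words that already end in "s" are not handled.

```java
import java.util.Locale;

final class UnitNames {
    // Hypothetical helper (not from the thread): reduce a unit label to a singular-looking key.
    static String normalize(String s) {
        // Lowercase and drop all whitespace first.
        String t = s.toLowerCase(Locale.ROOT).replaceAll("\\s+", "");
        // Drop a parenthesised plural marker at the end: "pack(s)" -> "pack", "box(es)" -> "box".
        t = t.replaceAll("\\(e?s\\)$", "");
        // Assumed heuristic: "es" after x, z, ch or sh is a plural ending: "boxes" -> "box".
        t = t.replaceAll("(x|z|ch|sh)es$", "$1");
        // Otherwise drop a single trailing "s": "packs" -> "pack", "cases" -> "case".
        return t.replaceAll("s$", "");
    }

    static boolean sCompare(String s1, String s2) {
        return normalize(s1).equals(normalize(s2));
    }

    public static void main(String[] args) {
        System.out.println(sCompare("Packs", "pack(s)"));   // true
        System.out.println(sCompare("Boxes", "box"));       // true
        System.out.println(sCompare("Pack(s)", "Case(s)")); // false
    }
}
```

Because each rule is anchored with `$`, an "s" in the middle of a word (as in "case") is left alone, which keeps distinct root words from collapsing into the same key.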

I think this answers your question. To ignore another character, add it to the set and it will be filtered out as well.

import java.util.Set;

// Characters to drop before comparing (Set.of requires Java 9+).
Set<Character> forbiddenChars = Set.of('s', '{', '}', ' ');

String testString = "This Is{ Test} string";

String filteredString = testString
        .toLowerCase()
        .codePoints()
        // Keep only the code points that are not in the forbidden set.
        .filter(character -> !forbiddenChars.contains((char) character))
        .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
        .toString();
System.out.println(filteredString); // thiitettring
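To connect this filtering approach back to the question, here is a minimal sketch that wraps it in a comparison and uses parentheses instead of braces in the ignored set. The class name `CharFilterCompare` and the helpers `strip` and `sameUnit` are hypothetical names chosen for illustration, and the code assumes Java 9 or later for `Set.of`.

```java
import java.util.Locale;
import java.util.Set;

final class CharFilterCompare {
    // Code points to ignore: 's', '(', ')' and space, as listed in the question.
    private static final Set<Integer> IGNORED =
            Set.of((int) 's', (int) '(', (int) ')', (int) ' ');

    // Hypothetical helper: lowercase the input and drop every ignored code point.
    static String strip(String s) {
        return s.toLowerCase(Locale.ROOT)
                .codePoints()
                .filter(cp -> !IGNORED.contains(cp))
                .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
                .toString();
    }

    static boolean sameUnit(String s1, String s2) {
        return strip(s1).equals(strip(s2));
    }

    public static void main(String[] args) {
        System.out.println(strip("Pack(s)"));                // pack
        System.out.println(strip("Case(s)"));                // cae
        System.out.println(sameUnit("Packs", "pack(s)"));    // true
        System.out.println(sameUnit("Pack(s)", "Case(s)"));  // false
    }
}
```

Note that this removes every "s", not just a trailing one, so "Case(s)" is reduced to "cae". That is harmless as long as both sides are filtered the same way and no two root words differ only by an "s".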
Petr M

You can use:

s1.toLowerCase().replaceAll("\\W|s\\)?$", "").equals("pack"); // true

or, if you don't care about any other s character in the string:

s1.toLowerCase().replaceAll("\\W|s", "").equals("pack"); // true

"\W|s\)?$" removes every non-word character and a trailing s (or a trailing "s)") at the end of the string. The toLowerCase() call is needed so that Packs and Pack(s) match as well.

If you know the only s in each word is the final one, you can use the simplified expression "\W|s", which removes every non-word character and every s in the string.
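The difference between the two expressions is easiest to see on a word with an "s" in the middle. The sketch below is illustrative only; the class name `RegexComparison` and the method names `stripTrailing` and `stripEvery` are made up for this example.

```java
import java.util.Locale;

final class RegexComparison {
    // Removes non-word characters and a trailing "s" or "s)".
    static String stripTrailing(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("\\W|s\\)?$", "");
    }

    // Removes non-word characters and every "s" in the string.
    static String stripEvery(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("\\W|s", "");
    }

    public static void main(String[] args) {
        System.out.println(stripTrailing("Pack(s)")); // pack
        System.out.println(stripTrailing("Cases"));   // case
        System.out.println(stripEvery("Pack(s)"));    // pack
        System.out.println(stripEvery("Cases"));      // cae
    }
}
```

Both variants give consistent results when the same one is applied to each side of the comparison; the first simply keeps the normalized word readable.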

Jeremy Then