
I am working on Java code that removes repeated words. The following code works fine when the words are separated by spaces. For example, 1298 Anthony 1298 Anthony becomes: 1298 Anthony

But with any other special character, for example 1298 Anthony.ef 1298 Anthony.ef, the output is: ef. 1298 Anthony.

My method is given below. I want it to work for every special character, especially comma (,), full stop (.), dash (-), and underscore (_). Please help me with this problem.

public static void removeString(){

    String name1 = "1298 Anthony.ef 1298 Anthony.ef";

    String[] strArr = name1.split(" ");
    Set<String> set = new HashSet<String>(Arrays.asList(strArr));

    String[] result = new String[set.size()];
    set.toArray(result);
    StringBuilder res = new StringBuilder();
    for (int i = 0; i < result.length; i++) {
        String string = result[i];
        if(i==result.length-1){
            res.append(string);
        }
        else{
            res.append(string).append(" ");
        }

    }
    System.out.println(res.toString());
    String abc = res.toString();
}
default locale
  • Well, I don't know if I understand you correctly, but you might try something like **name1.split("\\s|\\.|,|\\-")** – pnadczuk Jun 09 '15 at 09:44
  • I have executed your code and it yielded `1298 Anthony.ef`. Is this not what you are after? – npinti Jun 09 '15 at 09:46
  • Rather **name1.split("(\\s|\\p{Punct})+")**. See http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html for the different character classes. – agad Jun 09 '15 at 09:46

1 Answer


You're splitting name1 around spaces. You can try to split name1 around any non-word character:

name1.split("\\W+");

The String.split method accepts a regular expression as its argument. To quote from the docs:

Splits this string around matches of the given regular expression.

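name1.split(" "); splits the string around each single space and returns the array: [1298, Anthony.ef, 1298, Anthony.ef]

name1.split("\\W+"); splits the string around runs of non-word characters (comma, dot, dash, etc.) and returns the array: [1298, Anthony, ef, 1298, Anthony, ef]. As you can see, in this case it was able to split Anthony.ef into separate strings. Note that the underscore counts as a word character in Java regular expressions, so "\\W+" does not split on it; use "[\\W_]+" if underscores should also act as separators.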
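To make the difference between the patterns concrete, here is a minimal sketch that prints the arrays each one produces. The class name `SplitDemo`, the helper `tokens`, and the second sample string are made up for illustration; only `String.split` and `Arrays.toString` come from the JDK.

```java
import java.util.Arrays;

public class SplitDemo {
    // Hypothetical helper: splits the input on the given regex and
    // returns the resulting array formatted as a string.
    static String tokens(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        String name1 = "1298 Anthony.ef 1298 Anthony.ef";
        // Splitting on a single space keeps "Anthony.ef" as one token.
        System.out.println(tokens(name1, " "));     // [1298, Anthony.ef, 1298, Anthony.ef]
        // Splitting on runs of non-word characters also breaks at the dot.
        System.out.println(tokens(name1, "\\W+"));  // [1298, Anthony, ef, 1298, Anthony, ef]

        // Illustrative input with an underscore, a dash and a comma.
        String mixed = "a_b-c,d";
        // The underscore is a word character, so \W+ leaves "a_b" intact.
        System.out.println(tokens(mixed, "\\W+"));    // [a_b, c, d]
        // Adding "_" to the character class treats it as a separator too.
        System.out.println(tokens(mixed, "[\\W_]+")); // [a, b, c, d]
    }
}
```

If underscores should separate words as well, as the question asks, the last pattern is probably the one to use.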

UPDATE: If you want to preserve the word order of the original string, you might want to use LinkedHashSet instead of HashSet. For example:

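import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public static void removeString(){

    String name1 = "1298 Anthony.ef 1298 Anthony.ef";

    // Split around runs of non-word characters (space, dot, comma, dash, ...).
    String[] strArr = name1.split("\\W+");
    // LinkedHashSet drops duplicates while keeping first-seen order.
    Set<String> set = new LinkedHashSet<String>(Arrays.asList(strArr));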

    String[] result = new String[set.size()];
    set.toArray(result);
    StringBuilder res = new StringBuilder();
    for (int i = 0; i < result.length; i++) {
        String string = result[i];
        if(i==result.length-1){
            res.append(string);
        }
        else{
            res.append(string).append(" ");
        }

    }
    System.out.println(res.toString());
    String abc = res.toString();
}

Check out this question: Is there an insertion order preserving Set that also implements List?
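For reference, the same idea can be written more compactly. This is only a sketch, assuming Java 8 or later (for `String.join`); the class name `DedupeWords` and the method `dedupeWords` are hypothetical, and the pattern `[\\W_]+` is a swapped-in variant of `\\W+` that also treats the underscore as a separator.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class DedupeWords {
    // Hypothetical helper: removes repeated words while keeping the
    // order in which each word first appears.
    static String dedupeWords(String input) {
        // Split on runs of non-word characters and underscores.
        Set<String> unique = new LinkedHashSet<>(Arrays.asList(input.split("[\\W_]+")));
        // A leading separator yields an empty first token; drop it if present.
        unique.remove("");
        // Re-join the unique words with single spaces.
        return String.join(" ", unique);
    }

    public static void main(String[] args) {
        System.out.println(dedupeWords("1298 Anthony.ef 1298 Anthony.ef")); // 1298 Anthony ef
        System.out.println(dedupeWords("1298,Anthony_ef-1298"));            // 1298 Anthony ef
    }
}
```

Note that the original separators are not preserved: the words are always joined with single spaces, as in the code above.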

default locale
  • Thanks for your reply. Could you explain a little more? I didn't quite get your point. – Android_Zapier Jun 09 '15 at 09:47
  • It's just "Anthony.ef" will be split into "Anthony" and "ef" – pnadczuk Jun 09 '15 at 09:48
  • @Android_Zapier see [documentation](http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html) – agad Jun 09 '15 at 09:48
  • @Android_Zapier I have included additional explanations. Please refer to tutorial/documentation links provided in the answer. – default locale Jun 09 '15 at 09:54
  • @defaultlocale thanks for your answer here. It works and removes Dot from the string. But the problem is that it is showing the word in reverse direction like: ef Argentina, please help me out here – Android_Zapier Jun 09 '15 at 09:57
  • @Android_Zapier In what order do you want the words to appear? If you want to preserve the order in the original string, you might want to use `LinkedHashSet` instead of `HashSet`. Check out this question: http://stackoverflow.com/questions/8185090/is-there-an-insertion-order-preserving-set-in-java – default locale Jun 09 '15 at 10:02
  • @defaultlocale Thanks for your reply. Could you please add a code snippet for my case? It would help me a lot. – Android_Zapier Jun 09 '15 at 10:04
  • @defaultlocale Kindly help me out with a code snippet for my case. I can't figure it out. – Android_Zapier Jun 09 '15 at 10:16
  • @defaultlocale Thank you so much, it works perfectly as you said. Thanks! – Android_Zapier Jun 09 '15 at 10:21