I have an ArrayList of Strings and it contains records such as:

this is a first sentence
hello my name is Chris 
what's up man what's up man
today is tuesday

I need to clean this list so that the output does not contain repeated content. In the case above, the output should be:

this is a first sentence
hello my name is Chris 
what's up man
today is tuesday

As you can see, the third String has been modified and now contains "what's up man" only once instead of twice. In my list, a String is sometimes correct and sometimes doubled, as shown above.

I want to get rid of it, so I thought about iterating through this list:

for (String s: myList) {

but I cannot find a way of eliminating the duplicates, especially since the length of each string is not fixed. For example, there might be a long record:

this is a very long sentence this is a very long sentence

or sometimes short ones:

single word single word

Is there a native Java function for that?

Chris Martin
user3766930
  • You can split each line into an array of strings with `line.split(" ")`, then add them to a `LinkedHashSet`, and then read them back out. – 4castle Mar 13 '17 at 18:32
  • Not a built-in function, but you can implement this logic: split the line into words, add them to a set, and read them back out. – minigeek Mar 13 '17 at 18:33
  • @4castle haha .. Concurrent comments – minigeek Mar 13 '17 at 18:33
  • hm guys, could you help me with some small code snippet? – user3766930 Mar 13 '17 at 18:34
  • Your question isn't specific enough to be very answerable. We like answers which can be objectively correct, rather than a list of suggestions where every solution is equally correct. Google "how to remove duplicates from a list". – 4castle Mar 13 '17 at 18:38
  • @user3766930 I added my solution, let me know ;) I tested it and it works. – minigeek Mar 13 '17 at 18:53
  • Please [check](https://stackoverflow.com/a/44363194/1352919) this example code. I hope this will work for you! – Fakhar Jun 05 '17 at 06:52

6 Answers

Assuming the String is repeated exactly twice, with a single space in between as in your examples, the following code removes the repetition:

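for (int i = 0; i < myList.size(); i++) {
    String s = myList.get(i);
    int mid = s.length() / 2;
    // A doubled entry "X X" has odd length with a single space in the middle;
    // this guard also skips empty and one-character strings.
    if (s.length() < 3 || s.length() % 2 == 0 || s.charAt(mid) != ' ') {
        continue;
    }
    String fs = s.substring(0, mid);   // first half
    String ls = s.substring(mid + 1);  // second half, after the middle space
    if (fs.equals(ls)) {
        myList.set(i, fs);
    }
}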

The code splits each entry of the list into two substrings at the halfway point. If both halves are equal, it replaces the original element with just one half, thus removing the repetition.

I was testing the code and did not see @Brendan Robert's answer. This code follows the same logic as his answer.

airos

I would suggest using regular expressions. I was able to remove duplicates using this pattern: \b([\w\s']+) \1\b

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    static String [] phrases = {
            "this is a first sentence",
            "hello my name is Chris",
            "what's up man what's up man",
            "today is tuesday",
            "this is a very long sentence this is a very long sentence",
            "single word single word",
            "hey hey"
    };
    public static void main(String[] args) throws Exception {
        String duplicatePattern = "\\b([\\w\\s']+) \\1\\b";
        Pattern p = Pattern.compile(duplicatePattern);
        for (String phrase : phrases) {
            Matcher m = p.matcher(phrase);
            if (m.matches()) {
                System.out.println(m.group(1));
            } else {
                System.out.println(phrase);
            }
        }
    }
}

Results:

this is a first sentence
hello my name is Chris
what's up man
today is tuesday
this is a very long sentence
single word
hey
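
To apply the same pattern to the list from the question rather than just printing the results, one option is `List.replaceAll` (Java 8+). This is only a minimal sketch: the class name `RegexListCleaner` and the helpers `dedupe` and `clean` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class RegexListCleaner {
    // Same pattern as above: a phrase, one space, then the same phrase again.
    private static final Pattern DUPLICATE =
            Pattern.compile("\\b([\\w\\s']+) \\1\\b");

    // Returns the phrase once if the whole string is "X X", otherwise the input.
    static String dedupe(String phrase) {
        Matcher m = DUPLICATE.matcher(phrase);
        return m.matches() ? m.group(1) : phrase;
    }

    // Rewrites every element of the list in place.
    static void clean(List<String> list) {
        list.replaceAll(RegexListCleaner::dedupe);
    }

    public static void main(String[] args) {
        List<String> myList = new ArrayList<>(Arrays.asList(
                "this is a first sentence",
                "what's up man what's up man",
                "hey hey"));
        clean(myList);
        System.out.println(myList);
        // prints [this is a first sentence, what's up man, hey]
    }
}
```

Because the character class only allows word characters, whitespace and apostrophes, entries containing other punctuation would be left unchanged by this sketch.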
vhula

Assumptions:

  1. Uppercase words are treated as equal to their lowercase counterparts.

import java.util.HashSet;
import java.util.Set;

String fullString = "lol lol";
// Split on whitespace; "\\W+" would also break words at apostrophes ("what's").
String[] words = fullString.split("\\s+");
StringBuilder stringBuilder = new StringBuilder();
Set<String> wordsHashSet = new HashSet<>();

for (String word : words) {
    // add() returns false when the lowercased word was already seen
    if (!wordsHashSet.add(word.toLowerCase())) continue;

    stringBuilder.append(word).append(" ");
}
String nonDuplicateString = stringBuilder.toString().trim();
Veneet Reddy

Simple logic: split the string on spaces (" "), add each word to a LinkedHashSet, then read the set back out with toString() and strip the "[", "]" and "," characters.

String s = "I want to walk my dog I want to walk my dog";
Set<String> temp = new LinkedHashSet<>();  // keeps first-seen order, drops repeats
String[] arr = s.split(" ");

for (String ss : arr) {
    temp.add(ss);
}

// toString() yields "[I, want, to, walk, my, dog]"; strip the brackets and commas
String newl = temp.toString()
        .replace("[", "")
        .replace("]", "")
        .replace(",", "");

System.out.println(newl);

Output: I want to walk my dog
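
A possible variant that avoids post-processing the `toString()` output, assuming Java 8's `String.join` is available. The class name `WordDeduper` and the method `uniqueWords` are hypothetical names used only for this sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper class; the names are illustrative only.
class WordDeduper {
    // Keeps the first occurrence of each word, in the original order.
    static String uniqueWords(String s) {
        Set<String> seen = new LinkedHashSet<>(Arrays.asList(s.split(" ")));
        // Join the set directly, so commas or brackets inside words survive.
        return String.join(" ", seen);
    }

    public static void main(String[] args) {
        System.out.println(uniqueWords("I want to walk my dog I want to walk my dog"));
        // prints I want to walk my dog
    }
}
```

Like the code above, this works word by word, so it also drops words that are legitimately repeated: "this is what it is" would become "this is what it".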

minigeek

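// Doing it in Java 8: drops consecutive duplicate words, ignoring case.
// Requires java.util.Arrays and java.util.stream.Collectors.

String str1 = "I am am am a good Good coder";
String[] arrStr = str1.split(" ");
// One-element array so the lambda can update the last word that was kept
String[] element = new String[1];
// The lambda parameter must not be named str1, which would clash with the local variable
return Arrays.stream(arrStr).filter(word -> {
    if (!word.equalsIgnoreCase(element[0])) {
        element[0] = word;
        return true;
    }
    return false;
}).collect(Collectors.joining(" "));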

It depends on your situation. Assuming the string is repeated at most twice (never three or more times), you could take the length of the entire string, find the halfway point, and compare each character after the halfway point with the character at the matching index in the first half. If the string can be repeated more than twice, you will need a more complicated algorithm: first determine how many times the string is repeated, then find the starting index of each repeat and truncate everything from the beginning of the first repeat onward. If you can provide more context about the scenarios you expect to handle, we can start putting together some ideas.
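
The more general case described above (a phrase repeated any number of times) might be sketched as follows. It assumes the copies are separated by exactly one space, as in the question's examples. The class name `RepeatCollapser` and the methods `collapse` and `isRepetition` are hypothetical names introduced for illustration, not an established API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
class RepeatCollapser {

    // Returns the shortest phrase X such that s is X repeated two or more times,
    // joined by single spaces. Any other string is returned unchanged.
    static String collapse(String s) {
        int n = s.length();
        // Try candidate phrase lengths from shortest to longest.
        for (int len = 1; len <= (n - 1) / 2; len++) {
            // k copies of length len plus (k - 1) spaces: n = k * len + (k - 1),
            // so (n + 1) must be divisible by (len + 1).
            if ((n + 1) % (len + 1) == 0 && isRepetition(s, len)) {
                return s.substring(0, len);
            }
        }
        return s;
    }

    // Compares each index after the first copy with the matching index in it.
    static boolean isRepetition(String s, int len) {
        int period = len + 1; // one copy plus its separating space
        for (int i = len; i < s.length(); i++) {
            int offset = i % period;
            if (offset == len) {
                if (s.charAt(i) != ' ') return false;      // separator position
            } else if (s.charAt(i) != s.charAt(offset)) {
                return false;                              // mismatch with first copy
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> myList = new ArrayList<>(Arrays.asList(
                "this is a first sentence",
                "what's up man what's up man",
                "hey hey hey",
                "today is tuesday"));
        myList.replaceAll(RepeatCollapser::collapse);
        System.out.println(myList);
        // prints [this is a first sentence, what's up man, hey, today is tuesday]
    }
}
```

Note that this sketch always collapses to the shortest repeating phrase, so "a a a a" becomes "a" rather than "a a". Whether that is the desired behavior depends on the scenarios you need to handle.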