7

I have two strings with me:

s1="MICROSOFT"
s2="APPLESOFT"

I need to compare the strings and remove the duplicate part (always towards the end) from the second string. So I should get "MICROSOFT" and "APPLE" as output.

I have compared both the strings character by character.

               String s1 = "MICROSOFT";
               String s2 = "APPLESOFT";

               for(int j=0; j<s1.length(); j++)
               {
                   char c1 = s1.charAt(j);
                   char c2 = s2.charAt(j);

                   if(c1==c2)
                       System.out.println("Match found!!!");
                   else
                       System.out.println("No match found!");
               }

It should check the strings and if the two strings have same characters until the end of string, then I need to remove that redundant part, SOFT in this case, from the second string. But I can't think of how to proceed from here.

There can be more duplicates...but we have to remove only those which are continuously identical. if i have APPWWSOFT and APPLESOFT, i should get APPLE again in the second string since we got LE different than WW in between

Can you guys please help me out here?

Snow Leopard
  • 347
  • 3
  • 7
  • 18
  • could the duplicate part be anywhere in the string or is it always at the end? Eg might you want to remove "SOFT" from "MICSOFTRO" and "APPSOFTLE"? – The Cat Sep 27 '12 at 11:23
  • also could there be more duplicates? like APPAPPLESOFT and APPMICROSOFT should remove APP and SOFT? also, the duplicate can be only one character? or there's always more than 1? – Th0rndike Sep 27 '12 at 11:24
  • this question shouldn't get so many upvotes, i think. @GauravOjha are you asking about 2nd string's diff from 1st string? if so, this question should get negative votes and closed. – Juvanis Sep 27 '12 at 11:41
  • @The Cat: its always in the end – Snow Leopard Sep 27 '12 at 12:16
  • @Th0rndike: yes there can be more duplicates...but we have to remove only those which are continuously identical. if i have APPWWSOFT and APPLESOFT, i should get APPLE again in the second string since we got LE different than WW in betwen. – Snow Leopard Sep 27 '12 at 12:21
  • @VincenzoSanchez: with all due respect Vincenzo, I dont think this is a problem of just subtraction. But again, I'm merely a beginner, may be you're right, but i dont see a way through it. – Snow Leopard Sep 27 '12 at 12:26

7 Answers7

4

Search and read about Longest Common Subsequence, you can find efficient algorithms to find out the LCS of two input strings. After finding the LCS of the input strings, it is easy to manipulate the inputs. For example, in your case an LCS algorithm will find "SOFT" as the LCS of these two strings, then you might check whether the LCS is in the final part of the 2nd input and then remove it easily. I hope this idea helps.

An example LCS code in Java is here, try it: http://introcs.cs.princeton.edu/java/96optimization/LCS.java.html

Example scenario (pseudocode):

input1: "MISROSOFT";
input2: "APPLESOFT";

execute LCS(input1, input2);
store the result in lcs, now lcs = "SOFT";

iterate over the characters of input2,
if a character exists in lcs then remove it from input2.
Juvanis
  • 25,802
  • 5
  • 69
  • 87
  • He wants to remove any similar characters from the 2 strings. So, I don't think longest common subsequence can be applied ... only if you apply it repeatedly ... – Razvan Sep 27 '12 at 11:30
  • Check this out - http://stackoverflow.com/questions/2929557/java-longest-common-subsequence – Cid Oct 03 '12 at 07:37
2

As far as I understand, you want to remove any identical characters from the two strings. By identical I mean: same position and same character(code). I think the following linear complexity solution is the simplest:

 StringBuilder sb1 = new StringBuilder();
 StringBuilder sb2 = new StringBuilder(); //if you want to remove the identical char 
                                          //only from one string you don't need the 2nd sb
 char c;
 for(int i = 0; i<Math.min(s1.length,s2.length);i++){
     if((c = s1.charAt(i)) != s2.charAt(i)){
           sb1.append(c);
     }
 }
 return sb1.toString();
Razvan
  • 9,925
  • 6
  • 38
  • 51
2

Try this algo- Create characters sequences of your first string and find it in second string.

performance -
Average case = (s1.length()-1)sq

public class SeqFind {
    public static String searchReplace(String s1,String s2) {
        String s3;
        boolean brk=false;
        for(int j=s1.length();j>0&&!brk;j--){
        for (int i = j-4; i > 0; i--) {
            String string = s1.substring( i,j);
            if(s2.contains(string)){
                System.out.println(s2+" - "+string+" "+s2.replace( string,""));
                brk=true;
                break;
            }
        }
    }
        return s3;      
    }
    public static void main(String[] args) {
        String s1 = "MICROSOFT";
        String s2 = "APPLESOFT";
        String s3 = searchReplace(s1,s2);
    }
}

Out put - APPLESOFT - SOFT - APPLE

Subhrajyoti Majumder
  • 40,646
  • 13
  • 77
  • 103
  • Thanks you for your answer. If I understand right this algorithm is finding the first matched pattern between the two strings and removing it from the second string. Right? It's almost what I want, but i want the matched pattern to be seen from the end. So if i take two strings like "APPWWSOFT" and "APPLESOFT", I woud get "APPLESOFT - APP LESOFT" as the output now, however, i need "APPLESOFT - SOFT - APPLE" as the output. – Snow Leopard Sep 27 '12 at 12:56
  • code updated and this will start searching from end and atleast 4 word should be matched – Subhrajyoti Majumder Sep 27 '12 at 16:15
  • Thanks a lot Quoi (Subhrajyoti)!! It works as a charm..but I have solved my problem myself and posted the solution below. Nothing gives you more pleasure than your own working code :) Appreciate your help! – Snow Leopard Sep 28 '12 at 08:00
1
   public class Match {

public static void main(String[] args)
{
    String s1="MICROSOFT";
    String s2="APPLESOFT";
    String[] s=new String[10];
    String s3;
    int j=0,k=0;
    for(int i=s2.length();i>0;i--)
    {
        s[j]=s2.substring(k,s2.length());
        if(s1.contains(s[j]))
        {
            s3=s2.substring(0,j);
                                 System.out.println(s1+""+s3);

            System.exit(0);

        }
        else
        {
            System.out.println("");
        }
                                j++;
                                k++;
    }


}

     }

I have edited the code you can give it an another try.

kanhai shah
  • 1,193
  • 10
  • 14
  • I have solved my problem Kanhai and posted the solution, but I'll check yours too and let you know if it works. Thanks!! – Snow Leopard Sep 27 '12 at 15:17
0

try this, not tested thou

 String s1 = "MICROSOFT";
         String s2 = "APPLESOFT";
         String s3="";
         for(int j=0; j<s1.length(); j++)
         {
             if(s1.charAt(j)==s2.charAt(j)){
                 s3+=s1.charAt(j);
             }
         }
         System.out.println(s1.replace(s3, " ") + " \n"+ s2.replace(s3, " "));
PermGenError
  • 45,977
  • 8
  • 87
  • 106
  • 2
    I think you mean `!=` instead of `==` and you won't need the replace. Replace wouldn't work if the characters are not together. – Peter Lawrey Sep 27 '12 at 11:26
  • @PeterLawrey i compare the chars and if they are equal then append it to s3. as i said not tested :P – PermGenError Sep 27 '12 at 11:27
  • @Chaitanya.. If you are trying to append.. Then `StirngBuffer` would be a better choice.. (And rather than appending the matching character, you can append non-matching character.. Will save you from `replace` code.) – Rohit Jain Sep 27 '12 at 11:34
  • @RohitJain yes, performance wise. but i was only showin a sample :P – PermGenError Sep 27 '12 at 11:35
  • @chaitanya10.. I understand.. I just quoted when I thought it should be quoted.. :) – Rohit Jain Sep 27 '12 at 11:37
  • @chaitanya10: thanks for the help dude. But there can be more duplicates...but we have to remove only those which are continuously identical. if i have APPWWSOFT and APPLESOFT, i should get APPLE again in the second string since we got LE different than WW in between. – Snow Leopard Sep 27 '12 at 12:33
0

You should rather use StringBuffer if you want your String to be modified..

And in this case, you can have one extra StringBuffer, in which you can keep on appending non-matching character: -

    StringBuffer s1 = new StringBuffer("MICROSOFT");
    StringBuffer s2 = new StringBuffer("APPLESOFT");
    StringBuffer s3 = new StringBuffer();

    for(int j=0; j<s1.length(); j++)
    {
        char c1 = s1.charAt(j);
        char c2 = s2.charAt(j);

        if(c1==c2) {
            System.out.println("Match found!!!");
        } else {
            System.out.println("No match found!");
            s3.append(c1);
        }
    }
    s1 = s3;
    System.out.println(s1);    // Prints "MICRO"
Rohit Jain
  • 209,639
  • 45
  • 409
  • 525
  • This code (as well as the OP's) will **NOT** find/remove anything in "MICROSOFT" and " APPLESOFT" pair – Germann Arlington Sep 27 '12 at 11:40
  • @GermannArlington.. This code is not removing anything from `MICROSOFT` or `APPLESOFT`.. But it is creating a new `StringBuffer` of non-matching character.. Which is later on assigned to the original `s3'.. It prints `MICRO` non-matching character which OP wants.. – Rohit Jain Sep 27 '12 at 11:51
  • It will **NOT** print anything useful (matches) for **my** two given strings. Give that a try. – Germann Arlington Sep 27 '12 at 11:56
  • I'm sorry, but which `strings` you are talking about??.. OP didn't wanted the `Matching` string, he wanted to remove them.. So, I used this approach.. I would welcome your advice on any improvement over the code.. – Rohit Jain Sep 27 '12 at 12:01
  • 1
    So, the OP wanted to **remove** ANY? substrings from the second string? They did **NOT** want to remove **matching** part of the strings? **Your** code will **NOT** find any match to remove in two string that I suggested in my first comment: "MICROSOFT" and " APPLESOFT". **Your** code will print "MICROSOFT" (full word with nothing removed) because you are doing positional match **ONLY** **AND** it will **blow up** when the s1 is longer than s2 as well. Do you need any more **constructive** advice? – Germann Arlington Sep 27 '12 at 12:09
0

I have solved my problem after racking some brains off. Please feel free to correct/improve/refine my code. The code not only works for "MICROSOFT" and "APPLESOFT" inputs, but also for inputs like "APPWWSOFT" and "APPLESOFT" (i needed to remove the continuous duplicates from the end - SOFT in both the above inputs). I'm in the learning stage and I'll appreciate any valuable inputs.

public class test
    {           
        public static void main(String[] args)
        {
            String s1 = "MICROSOFT";
            String s2 = "APPLESOFT";

            int counter1=0;
            int counter2=0;

            String[] test = new String[100];
            test[0]="";

            for(int j=0; j<s1.length(); j++)
            {
                char c1 = s1.charAt(j);
                char c2 = s2.charAt(j);

                if(c1==c2)
                {
                    if(counter1==counter2)
                    {
                        //System.out.println("Match found!!!");
                        test[0]=test[0]+c2;
                        counter2++;
                        //System.out.println("Counter 2: "+counter2);
                    }
                    else
                        test[0]="";
                }
               else
               {
                   //System.out.print("No match found!");
                   //System.out.println("Counter 2: "+counter2);
                   counter2=counter1+1;
                   test[0]="";
               }

               counter1++;
               //System.out.println("Counter 1: "+counter1);
                           }

             System.out.println(test[0]);
             System.out.println(s2.replaceAll(test[0]," "));
        }
    }
Snow Leopard
  • 347
  • 3
  • 7
  • 18