I'm trying to filter a string and keep only certain phrases, as part of an amateur syntax checker for code that I'm developing. For example:

String line = "<html><head><title>HELLO WORLD</title></head><body>Hello WorldMy name is Ricardo i hope you are all doing good</body></html>";

String[] splitted = line.split("<html>|</html>|<head>|</head>|<title>|</title>|<body>|</body>");

for (String split : splitted) {
    System.out.println(split);
}

I want to extract all the tokens such as <html>, </html>, <title> and </title>, but with the code above I get exactly the opposite: the tokens I want are filtered out and only the text between them is printed.

Thanks in advance! I've been stressing out all day trying to figure it out.

2 Answers

If you want to extract certain phrases from a string, you can use Java's regex classes (Pattern and Matcher). Create a regex that matches the desired tokens and use it like this:

Pattern pattern = Pattern.compile("Your Regex");    // needs java.util.regex.Pattern
Matcher matcher = pattern.matcher("Source String"); // needs java.util.regex.Matcher

while (matcher.find()) {                 // true while another match is found
    System.out.println(matcher.group()); // prints the matched token
}

You are currently using split(regex), which splits the string around every match of the regex, so the delimiters themselves (<html>, </html>, etc.) are dropped from the result.
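
As a rough sketch of how that could look for the string in the question: the class name TagExtractor, the method extractTags and the concrete pattern below are illustrative assumptions, not something from the original post. The pattern only covers the four tag names the question lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagExtractor {
    // Assumed pattern: '<', an optional '/', one of the tag names
    // from the question, then '>'.
    private static final Pattern TAG =
            Pattern.compile("</?(?:html|head|title|body)>");

    // Returns every tag token found in source, in order of appearance.
    public static List<String> extractTags(String source) {
        List<String> tags = new ArrayList<>();
        Matcher matcher = TAG.matcher(source);
        while (matcher.find()) {
            tags.add(matcher.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String line = "<html><head><title>HELLO WORLD</title></head>"
                + "<body>Hello World</body></html>";
        for (String tag : extractTags(line)) {
            System.out.println(tag);
        }
    }
}
```

Unlike split, find keeps the matched tokens and skips the text between them, which is the behaviour the question asks for.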

Tarun

Try the following code snippet.

String line = "<html><head><title>HELLO WORLD</title></head><body>Hello WorldMy name is Ricardo i hope you are all doing good</body></html>";
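// Needs: import java.util.ArrayList; import java.util.Iterator;
ArrayList<StringBuffer> list = new ArrayList<StringBuffer>();
for (int i = 0; i < line.length(); i++)
{
  if (line.charAt(i) == '<')
  {
    // Collect everything from '<' up to (not including) the next '>'.
    StringBuffer str = new StringBuffer();
    // The bounds check stops the scan if a '<' is never closed.
    while (i < line.length() && line.charAt(i) != '>')
    {
      str.append(line.charAt(i));
      i++;
    }
    str.append('>');
    list.add(str);
  }
}

// Print each collected tag on its own line.
Iterator<StringBuffer> itr = list.iterator();
while (itr.hasNext())
  System.out.println(itr.next());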

Instead of adding the tags to an ArrayList, you can replace that step with your own logic.
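
Since the question mentions a syntax checker, one possible piece of "your own logic" is a balance check. This is only a sketch: TagBalanceChecker and isBalanced are hypothetical names, and it assumes plain tags with no attributes, self-closing tags or comments. Instead of collecting tags into a list, it pushes each opening tag onto a stack and pops it when the matching closing tag appears.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TagBalanceChecker {
    // Returns true if every opening tag is closed in the right order.
    // Assumes simple tags such as <body> and </body> only.
    public static boolean isBalanced(String source) {
        Deque<String> open = new ArrayDeque<>();
        int start = source.indexOf('<');
        while (start != -1) {
            int end = source.indexOf('>', start);
            if (end == -1) {
                return false; // '<' without a closing '>'
            }
            String name = source.substring(start + 1, end);
            if (name.startsWith("/")) {
                // Closing tag: must match the most recently opened tag.
                if (open.isEmpty() || !open.pop().equals(name.substring(1))) {
                    return false;
                }
            } else {
                open.push(name);
            }
            start = source.indexOf('<', end + 1);
        }
        return open.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("<html><body>Hello</body></html>"));
        System.out.println(isBalanced("<html><body>Hello</html></body>"));
    }
}
```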

Hope I helped with your code.

Varun Jain