
I'm a beginner in Java, and I still find it hard to understand how a regex works. I don't know how to write a regex that finds and removes certain special characters from a string.

"!@#$%¨&*()_-+={[}]º|\,.:;?/° (I need to find these characters and remove them from a String)

Thanks in advance

F4bioo
  • maybe this is what you really want http://stackoverflow.com/questions/14361556/remove-all-special-characters-in-java? – Scary Wombat Mar 28 '17 at 02:19
  • I looked at this, but my string will contain Asian-language values, not a-z/A-Z letters. Thanks @ScaryWombat – F4bioo Mar 28 '17 at 02:21
  • I'm voting to close this question as off-topic because this is not a regular expression writing service. There are existing questions here, like [this one](http://stackoverflow.com/q/14361556/62576), that should give you a place to start. (If you need to make adjustments to it for other characters, you can try to do so, and then ask a specific question about any problems you run into while doing so.) – Ken White Mar 28 '17 at 02:22
  • OK, sorry. But my string will not contain A-Z letters; it will contain Asian characters. That's why I need this. @KenWhite – F4bioo Mar 28 '17 at 02:25
  • Read my comment again. *If you need to make adjustments...* seems like it's pretty clear. *Make some kind of effort to do something yourself; this is **not** a regex writing service.* Regular expressions can work with Unicode values as well, also in ranges. There are many existing questions here about doing so, and plenty of tutorials on the web you can locate. *Make an effort of your own first*, and then if you run into problems you can post a **specific question** about the problem you've encountered. – Ken White Mar 28 '17 at 02:26
  • For example: (あり@がと!う = Thank you). How should I use this [^ a-zA-Z0-9] when it matches every character outside the Latin alphabet and digits? – F4bioo Mar 28 '17 at 02:30
  • You should make an effort on your own before asking your question. We will be very happy to see your attempt, and if it fails, help you find out why. See [How do I ask a good question?](http://stackoverflow.com/help/how-to-ask) – Ole V.V. Mar 28 '17 at 02:55
  • I think in regex you need the `[abc]` construct, the one that matches a, b or c, only fill in the characters you need to match instead of a, b and c. Some of the characters you mention will need escaping with a \ (backslash). Remember to type it as two backslashes in a `String` literal in Java, for example `"[!\\+]"`. – Ole V.V. Mar 28 '17 at 02:59
  • Tip: there are online regex engines that make development of your regex more convenient, for example https://regex101.com. Try it out. – Ole V.V. Mar 28 '17 at 03:02
  • I don't know if this is a good practice but it is working. myString.replaceAll("[!@#$%¨&*()_\\-+={\\[}\\]º|\\\\,.:;?/°]", ""); – F4bioo Mar 28 '17 at 03:47
  • If you are using a narrow (UTF-16) target, you can't just dump all the characters in a class. That's because they could contain surrogates, which are prohibited in a mixed class. If you know there aren't any surrogates (supplementary-plane characters), you can just dump the chars in a class. Or, you could use the \uXXXX notation. `[\u0021-\u0026\u0028-\u002F\u003A-\u003B\u003D\u003F-\u0040\u005B-\u005D\u005F\u007B-\u007D\u00A8\u00B0\u00BA]` –  Mar 28 '17 at 05:03
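For Java specifically, here is a minimal sketch of the \uXXXX form from the last comment. The class and method names are hypothetical. One thing to watch: inside a Java string literal each escape has to be written as `\\uXXXX`, so that the regex engine receives it; a single-backslash `\u005B` would be turned into a raw `[` by the compiler before the regex is ever parsed.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original thread.
final class UnicodeEscapeClass {
    // The character class from the comment above. The doubled backslashes
    // hand the \uXXXX escapes to the regex engine instead of the compiler.
    private static final Pattern SPECIAL = Pattern.compile(
            "[\\u0021-\\u0026\\u0028-\\u002F\\u003A-\\u003B\\u003D"
            + "\\u003F-\\u0040\\u005B-\\u005D\\u005F\\u007B-\\u007D"
            + "\\u00A8\\u00B0\\u00BA]");

    static String strip(String input) {
        return SPECIAL.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // '@' (U+0040) and '!' (U+0021) are in the class, the kana are not,
        // so strip("あり@がと!う") returns "ありがとう".
        System.out.println(strip("あり@がと!う"));
    }
}
```

The ranges deliberately skip U+0027 (') and U+003C/U+003E (< and >), so those characters are left in place.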

2 Answers


I don't know if this is good practice, but it works.

// Removes every character listed in the class; -, [, ] and \ are escaped inside it.
private String check(String answer) {
    return answer.replaceAll("[!@#$%¨&*()_\\-+={\\[}\\]º|\\\\,.:;?/°]", "");
}

String answer = "Lore!m ips@um dol$or si%t amet, co¨nsectetur adi&piscing el*it. Mo(rbi pla)cerat, tu_rpis s_it am+et acc=umsan ve{nenatis, ma[gna r}isus ulla]mcorper an|te, ne\\c por,ttitor lac.us n:unc se;d el?it. Nul/la tristi°que posºuere felis, in ullamcorper sapien dignissim sit amet.";

Log.i("ans", check(answer));

Lorem ipsum dolor sit amet consectetur adipiscing elit Morbi placerat turpis 
sit amet accumsan venenatis magna risus ullamcorper ante nec porttitor lacus 
nunc sed elit Nulla tristique posuere felis in ullamcorper sapien dignissim 
sit amet
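Since the question is about strings with Asian characters, here is a small sketch that applies the same character class to the Japanese example from the comments. The class name is hypothetical, and precompiling the pattern is just one option (it avoids recompiling the regex on every call); it is not something the original code requires.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper around the character class used in check() above.
final class SpecialCharStripper {
    // Compiled once and reused; same class as in the replaceAll call.
    private static final Pattern SPECIAL =
            Pattern.compile("[!@#$%¨&*()_\\-+={\\[}\\]º|\\\\,.:;?/°]");

    static String strip(String input) {
        return SPECIAL.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Only '@' and '!' are listed in the class, so the kana survive:
        // strip("あり@がと!う") returns "ありがとう".
        System.out.println(strip("あり@がと!う"));
    }
}
```

Because the class lists characters to remove rather than characters to keep, anything not listed (Latin, kana, kanji, digits, spaces) passes through unchanged.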
F4bioo

You may use a regex to remove all punctuation (\p{P}) and symbols (\p{S}), plus the º sign, which Unicode classifies as a letter and so has to be listed explicitly:

String result = s.replaceAll("[\\p{S}\\p{P}º]+", "");

or use \p{Punct} (one of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~). By default this class covers ASCII punctuation only, so ¨ and ° have to be added alongside º:

String result = s.replaceAll("[\\p{Punct}¨°º]+", "");

See a Java demo:

import java.util.*;
import java.lang.*;
import java.io.*;

class Ideone
{
    public static void main (String[] args) throws java.lang.Exception
    {
        String answer = "Lore!m ips@um dol$or si%t amet, co¨nsectetur adi&piscing el*it. Mo(rbi pla)cerat, tu_rpis s_it am+et acc=umsan ve{nenatis, ma[gna r}isus ulla]mcorper an|te, ne\\c por,ttitor lac.us n:unc se;d el?it. Nul/la tristi°que posºuere felis, in ullamcorper sapien dignissim sit amet.";
        System.out.println(check(answer));
    }
    private static String check(String answer) {
        return answer.replaceAll("[\\p{S}\\p{P}º]+", "");
    }
}

Output:

Lorem ipsum dolor sit amet consectetur adipiscing elit Morbi placerat turpis sit amet accumsan venenatis magna risus ullamcorper ante nec porttitor lacus nunc sed elit Nulla tristique posuere felis in ullamcorper sapien dignissim sit amet
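For text in Asian languages, the practical difference between the two approaches is how they treat full-width and CJK punctuation. The sketch below compares them; the class name, method names, and sample string are made up for illustration. It assumes default regex flags (no `Pattern.UNICODE_CHARACTER_CLASS`), under which \p{Punct} is ASCII-only while \p{P} and \p{S} follow Unicode general categories.

```java
import java.util.regex.Pattern;

// Hypothetical comparison helper, not part of the original answer.
final class PunctuationScope {
    // Unicode punctuation and symbols, plus º (a letter in Unicode).
    private static final Pattern UNICODE =
            Pattern.compile("[\\p{S}\\p{P}º]+");
    // ASCII punctuation only, plus the three non-ASCII signs from the question.
    private static final Pattern ASCII_ONLY =
            Pattern.compile("[\\p{Punct}¨°º]+");

    static String stripUnicode(String input) {
        return UNICODE.matcher(input).replaceAll("");
    }

    static String stripAsciiOnly(String input) {
        return ASCII_ONLY.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String fullWidth = "あり＠がと！う。"; // full-width @ and !, ideographic full stop
        System.out.println(stripUnicode(fullWidth));   // full-width marks are removed
        System.out.println(stripAsciiOnly(fullWidth)); // string is left unchanged
    }
}
```

Note that both variants are broader than the list in the question: they also remove characters such as ' " < > ^ ~ and the backtick, which may or may not be what you want.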
Wiktor Stribiżew