You can do it from a Unicode point of view:
String s = "This is sample CCNA program. it contains CCNP™. And it contains digits 123456789.";
String res = s.replaceAll("[^\\p{L}\\p{M}\\p{P}\\p{Nd}\\s]+", "");
System.out.println(res);
will print out:
This is sample CCNA program. it contains CCNP. And it contains digits 123456789.
\\p{...} is a Unicode property.
\\p{L} matches any letter from any language.
\\p{M} matches a character intended to be combined with another character (e.g. accents, umlauts, enclosing boxes).
\\p{P} matches any kind of punctuation character.
\\p{Nd} matches a digit zero through nine in any script except ideographic scripts.
So this regex removes (replaces with the empty string) every character that is not a letter (including combining marks), a punctuation character, a decimal digit, or a whitespace character (\\s).
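To see which kinds of characters the class above keeps and which it drops, here is a minimal sketch that wraps the same regex in a precompiled Pattern. The class name UnicodeFilter and the method name clean are made up for illustration, and the sample inputs are my own, not from the question.

```java
import java.util.regex.Pattern;

public class UnicodeFilter {
    // Same character class as above: anything that is NOT a letter, mark,
    // punctuation character, decimal digit, or whitespace is matched.
    private static final Pattern UNWANTED =
            Pattern.compile("[^\\p{L}\\p{M}\\p{P}\\p{Nd}\\s]+");

    // Hypothetical helper: strips every matched run from the input.
    static String clean(String input) {
        return UNWANTED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // The trademark sign is a symbol, so it is removed.
        System.out.println(clean("CCNP\u2122."));
        // '+' and '=' are math symbols, so only the letters and spaces stay.
        System.out.println(clean("a + b = c"));
        // The currency sign is a symbol; the digit is kept.
        System.out.println(clean("$5"));
        // Accented letters and Arabic-Indic digits are letters and decimal digits, so they are kept.
        System.out.println(clean("caf\u00E9 \u0661\u0662\u0663"));
    }
}
```

Compiling the pattern once avoids recompiling it on every call, which is what String.replaceAll does internally. Note that symbols such as +, = and $ fall under \\p{S}, not \\p{P}, so they are stripped too; add \\p{S} or the specific characters to the class if you want to keep them.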