113

I have a string with lots of special characters. I want to remove all those, but keep alphabetical characters.

How can I do this?

MR AND
Tanu

10 Answers

211

That depends on what you mean. If you just want to get rid of them, do this:
(Update: apparently you want to keep digits as well; in that case, use the second line of each pair.)

String alphaOnly = input.replaceAll("[^a-zA-Z]+","");
String alphaAndDigits = input.replaceAll("[^a-zA-Z0-9]+","");

or the equivalent:

String alphaOnly = input.replaceAll("[^\\p{Alpha}]+","");
String alphaAndDigits = input.replaceAll("[^\\p{Alpha}\\p{Digit}]+","");

(All of these can be significantly improved by precompiling the regex pattern and storing it in a constant)
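A minimal sketch of the precompiled variant mentioned above. The class and method names (SpecialCharStripper, alphaOnly, alphaAndDigits) are made up for illustration; only the two patterns come from the answer.

```java
import java.util.regex.Pattern;

public class SpecialCharStripper {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern NON_ALPHA = Pattern.compile("[^a-zA-Z]+");
    private static final Pattern NON_ALNUM = Pattern.compile("[^a-zA-Z0-9]+");

    // Keeps ASCII letters only.
    public static String alphaOnly(String input) {
        return NON_ALPHA.matcher(input).replaceAll("");
    }

    // Keeps ASCII letters and digits.
    public static String alphaAndDigits(String input) {
        return NON_ALNUM.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(alphaOnly("He!!o, W0rld #42"));      // HeoWrld
        System.out.println(alphaAndDigits("He!!o, W0rld #42")); // HeoW0rld42
    }
}
```

String.replaceAll compiles its pattern on every call, so the saving matters mostly when the same replacement runs many times.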

Or, with Guava:

private static final CharMatcher ALNUM =
  CharMatcher.inRange('a', 'z').or(CharMatcher.inRange('A', 'Z'))
  .or(CharMatcher.inRange('0', '9')).precomputed();
// ...
String alphaAndDigits = ALNUM.retainFrom(input);

But if you want to turn accented characters into something sensible that's still ascii, look at these questions:
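One common JDK-only approach, not necessarily the one those questions recommend, is to decompose the string with java.text.Normalizer and drop the combining marks before filtering. This is a sketch under that assumption; AccentFolding and toAsciiAlnum are hypothetical names.

```java
import java.text.Normalizer;

public class AccentFolding {
    public static String toAsciiAlnum(String input) {
        // NFD splits an accented letter into its base letter plus combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Drop the combining marks (Unicode category M).
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        // Finally remove everything that is not an ASCII letter or digit.
        return withoutMarks.replaceAll("[^a-zA-Z0-9]+", "");
    }

    public static void main(String[] args) {
        // Input is "Crème brûlée #1", written with escapes to avoid encoding issues.
        System.out.println(toAsciiAlnum("Cr\u00e8me br\u00fbl\u00e9e #1")); // Cremebrulee1
    }
}
```

Letters that have no decomposition (for example ø or ß) are not folded by this sketch; the final filter simply removes them.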

Community
Sean Patrick Floyd
  • When I use this function, it removes all the numbers as well, but I don't want the numbers removed. I just want to remove special characters. Please suggest something. – Tanu Nov 26 '10 at 11:28
  • well you said you only wanted the alphabet. But I'll update my answer in a minute – Sean Patrick Floyd Nov 26 '10 at 11:46
  • I want to concatenate strings, but under some conditions: 1. If there is only one result, no concatenation is required. 2. If there is more than one result, concatenate the strings in the following form, for example: stack+over+flow – Tanu Nov 26 '10 at 11:52
  • 2
    @Tanu that's a different question. Make it a new one – Pekka Nov 26 '10 at 11:58
  • What if I don't want spaces to be removed? or say all spaces like tabs, newlines collapsed as only one space? – damned Mar 05 '12 at 10:23
  • @damned a) that's a completely different question. feel free to ask it. b) have a look at Guava's CharMatcher class. It does all that and more: http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/base/CharMatcher.html – Sean Patrick Floyd Mar 05 '12 at 10:33
  • It misses pi symbol (π) – Bugs Happen Nov 15 '16 at 12:43
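For non-ASCII letters such as π, one option is to use Unicode categories instead of ASCII ranges. A sketch, assuming "letter" means Unicode category L and "digit" means category Nd; the class and method names are illustrative only.

```java
public class UnicodeLetters {
    // Keeps letters from any script plus decimal digits; removes everything else.
    public static String lettersAndDigits(String input) {
        return input.replaceAll("[^\\p{L}\\p{Nd}]+", "");
    }

    public static void main(String[] args) {
        // Input is "π ≈ 3.14!", written with escapes to avoid encoding issues.
        String cleaned = lettersAndDigits("\u03c0 \u2248 3.14!");
        System.out.println(cleaned.equals("\u03c0314")); // true
    }
}
```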
85

I am using this.

s = s.replaceAll("\\W", ""); 

It removes every non-word character from the string. Note that the underscore counts as a word character, so it is kept.

Here:

\w : A word character, short for [a-zA-Z_0-9]

\W : A non-word character, i.e. [^\w]
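A short sketch of what \W does and does not remove. The class and method names are made up for the example; the second method shows one way to drop the underscore too.

```java
public class NonWordDemo {
    // Removes everything except [a-zA-Z_0-9]; the underscore survives.
    public static String stripNonWord(String s) {
        return s.replaceAll("\\W", "");
    }

    // Same, but also removes the underscore.
    public static String stripNonWordAndUnderscore(String s) {
        return s.replaceAll("[\\W_]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("snake_case & kebab-case!"));              // snake_casekebabcase
        System.out.println(stripNonWordAndUnderscore("snake_case & kebab-case!")); // snakecasekebabcase
    }
}
```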

Dhiral Pandya
15

You can use the following call to keep only alphanumeric characters:

s = s.replaceAll("[^a-zA-Z0-9]", "");

To keep only alphabetical characters, use this instead:

s = s.replaceAll("[^a-zA-Z]", "");
Dhrumil Shah - dhuma1981
8

To replace one specific special character, escape it in the regex:

replaceAll("\\your special character","new character");

For example, to replace every occurrence of * with a space:

s = s.replaceAll("\\*", " ");

Note: written this way, the statement handles only one kind of special character at a time.
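Two possible ways around that limitation, sketched with hypothetical names: String.replace treats its argument literally (so no escaping is needed), and a character class handles several characters in one pass.

```java
public class ReplaceSpecific {
    // String.replace is literal and replaces every occurrence; no regex escaping needed.
    public static String starsToSpaces(String input) {
        return input.replace("*", " ");
    }

    // Inside a character class, *, # and $ are taken literally.
    public static String severalToSpaces(String input) {
        return input.replaceAll("[*#$]", " ");
    }

    public static void main(String[] args) {
        System.out.println(starsToSpaces("a*b*c"));     // a b c
        System.out.println(severalToSpaces("a*b#c$d")); // a b c d
    }
}
```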

krishnamurthy
  • Definitely what I was looking for when I saw the question title "How to replace special characters in a string?" thanks! – Mr.Drew Apr 02 '19 at 12:38
7

Following the example in Andrzej Doyle's answer, I think a better solution is to use org.apache.commons.lang3.StringUtils.stripAccents():

package bla.bla.utility;

import org.apache.commons.lang3.StringUtils;

public class UriUtility {
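    public static String normalizeUri(String s) {
        // Strip diacritics, e.g. "é" becomes "e"
        String r = StringUtils.stripAccents(s);
        // Use underscores instead of spaces
        r = r.replace(" ", "_");
        // Drop everything except dots, ASCII letters, digits and underscores
        r = r.replaceAll("[^\\.A-Za-z0-9_]", "");
        return r;
    }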
}
Marco Sulla
2
string Output = Regex.Replace(Input, @"([^ a-zA-Z0-9&, _]|^\s)", "");

This C# call removes every character except letters, digits, underscore, space, comma, and ampersand (the ^\s alternative also strips a single leading whitespace character). To remove spaces, commas, and ampersands as well, use the following regular expression instead:

string Output = Regex.Replace(Input, @"[^a-zA-Z0-9_]", "");

Here Input is the string to clean.

Mike Clark
2

Here is the JavaScript expression I used to strip an explicit list of special characters from a string and lowercase the result:

const cleaned = name.replace(/[&\/\\#,+()$~%!.„'":*‚^_¤?<>|@ª{«»§}©®™ ]/g, '').toLowerCase();
0

You can look up the Unicode code point of the junk character with the Character Map tool on a Windows PC and write it as a \u escape, e.g. \u00a9 for the copyright symbol. You can then use that escape in your string wherever the junk character appears: rather than removing the character, replace it with the proper Unicode escape.
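A small sketch of using such an escape in Java source. The class name, method name, and the "(c)" replacement are assumptions chosen for the example.

```java
public class UnicodeEscapeReplace {
    // The copyright sign is written as the escape \u00a9, so the source file stays pure ASCII.
    public static String replaceCopyright(String input) {
        return input.replace("\u00a9", "(c)");
    }

    public static void main(String[] args) {
        System.out.println(replaceCopyright("\u00a9 2010 Example")); // (c) 2010 Example
    }
}
```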

Muneesh
0

You can use basic regular expressions on strings to find all special characters, or use the Pattern and Matcher classes to search, modify, or delete user-defined strings. This link has some simple, easy-to-understand examples of regular expressions: http://www.vogella.de/articles/JavaRegularExpressions/article.html
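As a rough illustration of the Pattern and Matcher route, the sketch below collects every special character found in a string. Here "special" is assumed to mean anything other than an ASCII letter, digit, or whitespace; the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpecialCharFinder {
    // Assumption: "special" = not an ASCII letter, digit, or whitespace character.
    private static final Pattern SPECIAL = Pattern.compile("[^a-zA-Z0-9\\s]");

    // Returns each special character in order of appearance.
    public static List<String> findSpecials(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = SPECIAL.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findSpecials("a+b = c!")); // [+, =, !]
    }
}
```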

Matt Ball
Madhu Nandan
0

To keep spaces as well, add a space to the character class: "[^a-zA-Z0-9 ]"
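A sketch that keeps whitespace while removing special characters, and also collapses runs of tabs, newlines, and spaces into a single space (as asked in one of the comments earlier on this page). The class and method names are illustrative.

```java
public class KeepSpaces {
    public static String clean(String input) {
        return input
                .replaceAll("[^a-zA-Z0-9\\s]", "") // drop everything but letters, digits, whitespace
                .replaceAll("\\s+", " ")           // collapse whitespace runs into one space
                .trim();                           // remove leading and trailing space
    }

    public static void main(String[] args) {
        System.out.println(clean("Hello,\t world!\n 42")); // Hello world 42
    }
}
```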

Muhammad Ahsan