How to replace special characters in a string?

Question

I have a string with lots of special characters. I want to remove all those, but keep alphabetical characters.

How can I do this?

See this thread: http://stackoverflow.com/questions/3438854/replace-unicode-control-characters-existing-solution – Cyril Gandon, Nov 26 '10 at 07:44

score 211 · Answer 1 · edited May 23 '17 at 12:18

That depends on what you mean. If you just want to get rid of them, do this:
(Update: apparently you want to keep digits as well; in that case, use the second line of each pair.)

String alphaOnly = input.replaceAll("[^a-zA-Z]+","");
String alphaAndDigits = input.replaceAll("[^a-zA-Z0-9]+","");

or the equivalent:

String alphaOnly = input.replaceAll("[^\\p{Alpha}]+","");
String alphaAndDigits = input.replaceAll("[^\\p{Alpha}\\p{Digit}]+","");

(All of these can be significantly improved by precompiling the regex pattern and storing it in a constant)
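As a rough illustration of the precompiling remark above, the following sketch compiles each pattern once, stores it in a constant, and reuses it. The class and method names (`Sanitizer`, `alphaOnly`, `alphaAndDigits`) are made up for this example and are not part of the original answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class Sanitizer {
    // Compiled once and reused, instead of being recompiled on every replaceAll call.
    private static final Pattern NON_ALPHA = Pattern.compile("[^a-zA-Z]+");
    private static final Pattern NON_ALNUM = Pattern.compile("[^a-zA-Z0-9]+");

    // Keeps ASCII letters only.
    static String alphaOnly(String input) {
        return NON_ALPHA.matcher(input).replaceAll("");
    }

    // Keeps ASCII letters and digits.
    static String alphaAndDigits(String input) {
        return NON_ALNUM.matcher(input).replaceAll("");
    }
}
```

With this in place, `Sanitizer.alphaAndDigits("a1-b!C")` gives `a1bC`, the same result as the inline `replaceAll` call.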

Or, with Guava:

private static final CharMatcher ALNUM =
  CharMatcher.inRange('a', 'z').or(CharMatcher.inRange('A', 'Z'))
  .or(CharMatcher.inRange('0', '9')).precomputed();
// ...
String alphaAndDigits = ALNUM.retainFrom(input);

But if you want to turn accented characters into something sensible that's still ASCII, look at these questions:

edited May 23 '17 at 12:18

Community


answered Nov 26 '10 at 07:44

Sean Patrick Floyd


When I use this function, it removes all the numbers as well, but I don't want the numbers removed. I just want to remove special characters. Please suggest something. – Tanu Nov 26 '10 at 11:28
Well, you said you only wanted the alphabet. But I'll update my answer in a minute. – Sean Patrick Floyd Nov 26 '10 at 11:46
I want to concatenate strings, but with some conditions: 1. If there is only one result, no concatenation is required. 2. If there is more than one result, concatenate the strings in the following form, for example: stack+over+flow – Tanu Nov 26 '10 at 11:52
@Tanu that's a different question. Make it a new one – Pekka Nov 26 '10 at 11:58
What if I don't want spaces to be removed? or say all spaces like tabs, newlines collapsed as only one space? – damned Mar 05 '12 at 10:23
@damned a) that's a completely different question. feel free to ask it. b) have a look at Guava's CharMatcher class. It does all that and more: http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/base/CharMatcher.html – Sean Patrick Floyd Mar 05 '12 at 10:33
It misses pi symbol (π) – Bugs Happen Nov 15 '16 at 12:43
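On the last comment: the ASCII ranges in the answer drop every non-ASCII letter, π included. If letters from any script should survive, one possible variant, sketched here under that assumption, uses the Unicode categories `\p{L}` (letter) and `\p{Nd}` (decimal digit). The class name `UnicodeSanitizer` is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper; assumes "alphabetical" should mean "any Unicode letter".
final class UnicodeSanitizer {
    // \p{L} matches any Unicode letter, \p{Nd} any Unicode decimal digit.
    private static final Pattern NOT_LETTER_OR_DIGIT =
            Pattern.compile("[^\\p{L}\\p{Nd}]+");

    static String lettersAndDigits(String input) {
        return NOT_LETTER_OR_DIGIT.matcher(input).replaceAll("");
    }
}
```

For example, `UnicodeSanitizer.lettersAndDigits("\u03c0 = 3.14!")` keeps the π and the digits and drops the spaces, the equals sign, the dot, and the exclamation mark.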

Dhiral Pandya · Answer 2 · 2013-04-09T12:40:02.643

85

I am using this.

s = s.replaceAll("\\W", "");

It removes all special characters from the string. Note that the underscore counts as a word character, so it is kept.

Here

\w : A word character, short for [a-zA-Z_0-9]

\W : A non-word character, short for [^\w]
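Two consequences of that definition are easy to miss, so here is a small sketch (the class and method names are invented for illustration). By default `\w` covers ASCII only, so accented letters are removed, and the underscore is kept unless it is excluded explicitly.

```java
// Hypothetical demo class; not part of the original answer.
final class WordCharDemo {
    // Removes everything that is not in [a-zA-Z_0-9].
    static String stripNonWord(String s) {
        return s.replaceAll("\\W", "");
    }

    // Same, but also removes the underscore.
    static String stripNonAlnum(String s) {
        return s.replaceAll("[\\W_]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("user_name: <Bob>!"));  // user_nameBob
        System.out.println(stripNonAlnum("user_name: <Bob>!")); // usernameBob
    }
}
```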

edited Apr 09 '13 at 12:40

answered Feb 28 '13 at 10:13

Dhiral Pandya


Does not work for . How to remove '<', '>','\' characters? – Manoj May 12 '16 at 10:15

score 15 · Answer 3 · answered Nov 29 '14 at 05:48

You can use the following call to keep only alphanumeric characters.

String alphanumeric = input.replaceAll("[^a-zA-Z0-9]", "");

And if you want to keep only alphabetical characters, use this:

String alphaOnly = input.replaceAll("[^a-zA-Z]", "");

answered Nov 29 '14 at 05:48

Dhrumil Shah - dhuma1981


For space use `replaceAll("[^a-zA-Z0-9 ]", "");` – Qamar Jan 22 '18 at 07:55

score 8 · Answer 4 · answered Aug 09 '18 at 04:33

Replace any single special character with

replaceAll("\\your special character", "new character");

For example, to replace all occurrences of * with a space:

replaceAll("\\*", " ");

Note: this statement replaces only one kind of special character at a time.
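A single call can cover several special characters if they are listed in a character class, and `Pattern.quote` avoids hand-escaping a literal. The sketch below shows both. The class and method names are hypothetical, and the choice of `*`, `?` and `+` is only an example.

```java
import java.util.regex.Pattern;

// Hypothetical helper; illustrates two ways around manual escaping.
final class LiteralReplace {
    // Inside a character class, '*', '?' and '+' are literal characters,
    // so one pass replaces each of them with a space.
    static String starQuestionPlusToSpace(String s) {
        return s.replaceAll("[*?+]", " ");
    }

    // Pattern.quote turns an arbitrary literal into a regex that matches it verbatim.
    static String removeLiteral(String s, String literal) {
        return s.replaceAll(Pattern.quote(literal), "");
    }
}
```

For plain literals without any regex, `String.replace(CharSequence, CharSequence)` also replaces every occurrence and needs no escaping at all.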

answered Aug 09 '18 at 04:33

krishnamurthy


Definitely what I was looking for when I saw the question title "How to replace special characters in a string?" thanks! – Mr.Drew Apr 02 '19 at 12:38

Marco Sulla · Answer 5 · 2019-05-13T20:42:21.763

Following the example of Andrzej Doyle's answer, I think a better solution is to use org.apache.commons.lang3.StringUtils.stripAccents():

package bla.bla.utility;

import org.apache.commons.lang3.StringUtils;

public class UriUtility {
    public static String normalizeUri(String s) {
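        // Turn accented letters into their unaccented base letters.
        String r = StringUtils.stripAccents(s);
        // Use underscores in place of spaces.
        r = r.replace(" ", "_");
        // Drop everything except dots, ASCII letters, digits and underscores.
        r = r.replaceAll("[^\\.A-Za-z0-9_]", "");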
        return r;
    }
}
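If adding Commons Lang is not an option, a similar effect can be approximated with the JDK's `java.text.Normalizer`. This is a sketch of that alternative technique, not the implementation behind `StringUtils.stripAccents()`. The class name `AccentStripper` is made up, and letters that have no canonical decomposition (for example `ø`) pass through unchanged.

```java
import java.text.Normalizer;

// Hypothetical JDK-only helper; not a drop-in replacement for StringUtils.stripAccents().
final class AccentStripper {
    static String stripAccents(String s) {
        // NFD splits a letter such as "é" into "e" plus a combining accent mark.
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // \p{M} matches the combining marks, which are then removed.
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

The result can then be passed through the same `replace` and `replaceAll` steps as in `normalizeUri` above.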

score 2 · Answer 6 · answered Mar 03 '15 at 05:30

string Output = Regex.Replace(Input, @"[^ a-zA-Z0-9&,_]", "");

This C# call removes all special characters except space, comma, ampersand, and underscore. To remove commas and ampersands as well, use the following regular expression.

string Output = Regex.Replace(Input, @"[^ a-zA-Z0-9_]", "");

Here Input is the string whose characters we want to replace.

Kachailo Dmytro · Answer 7 · 2021-02-07T20:41:20.803

2

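Here is the JavaScript expression I used to remove the special characters listed in the character class (including spaces) from a string and convert the result to lower case:

const cleanedName = name.replace(/[&\/\\#,+()$~%!.„'":*‚^_¤?<>|@ª{«»§}©®™ ]/g, '').toLowerCase();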

edited Feb 07 '21 at 20:41

answered Oct 05 '20 at 21:47

Kachailo Dmytro


Can you explain the regex a bit? – stdunbar Oct 06 '20 at 02:33
I recommend that you add notes to your answer to explain your code. Please read more about [how to write good answers](https://stackoverflow.com/help/how-to-answer). – Joe Ferndz Oct 06 '20 at 03:40

score 0 · Answer 8 · answered Aug 26 '14 at 08:05

You can look up the Unicode code point of the unwanted character with the Character Map tool on a Windows PC and write it as a \u escape, e.g. \u00a9 for the copyright symbol. You can then refer to that particular character in your string by its escape: rather than removing the character, replace it with its proper Unicode escape.
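One reading of this suggestion, sketched below, is to refer to the character by its `\u` escape in Java source so that the code does not depend on the file encoding. The class name, the method name, and the replacement text `(c)` are assumptions made for the example.

```java
// Hypothetical example; the replacement text "(c)" is an arbitrary choice.
final class EscapeDemo {
    static String replaceCopyright(String s) {
        // The first argument is the copyright sign, written as a Unicode escape.
        return s.replace("\u00a9", "(c)");
    }
}
```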

score 0 · Answer 9 · edited Feb 23 '12 at 16:48

You can use basic regular expressions on strings to find all special characters, or use the Pattern and Matcher classes to search, modify, or delete user-defined strings. This link has some simple and easy-to-understand examples of regular expressions: http://www.vogella.de/articles/JavaRegularExpressions/article.html
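To make the Pattern and Matcher route concrete, here is a minimal sketch that lists the special characters found in a string. It assumes that "special" means anything other than ASCII letters, digits, and whitespace. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the definition of "special" is an assumption.
final class SpecialCharFinder {
    private static final Pattern SPECIAL = Pattern.compile("[^a-zA-Z0-9\\s]");

    // Returns every special character in order of appearance.
    static List<String> findSpecial(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = SPECIAL.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }
}
```

Calling `SpecialCharFinder.findSpecial("a+b = c!")` returns a list containing `+`, `=` and `!`.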

edited Feb 23 '12 at 16:48

Matt Ball


answered Nov 26 '10 at 09:36

Madhu Nandan


score 0 · Answer 10 · answered Apr 16 '18 at 14:52

To keep spaces, use this pattern: "[^a-z A-Z 0-9]"
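An earlier comment also asked about collapsing tabs and newlines into a single space. A possible sketch combining both ideas follows. The class and method names are invented, and it assumes that all whitespace, not just the space character, should be kept before collapsing.

```java
// Hypothetical helper; assumes tabs and newlines should end up as single spaces.
final class SpaceKeeper {
    static String keepAlnumAndSpaces(String s) {
        return s.replaceAll("[^a-zA-Z0-9\\s]", "") // drop everything except letters, digits, whitespace
                .replaceAll("\\s+", " ")           // collapse each run of whitespace to one space
                .trim();                           // remove leading and trailing space
    }
}
```

For instance, `SpaceKeeper.keepAlnumAndSpaces("Hello,\t world!\n 42")` gives `Hello world 42`.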

answered Apr 16 '18 at 14:52

Muhammad Ahsan

