75

I am trying to remove all non-alphanumeric characters from a string.

I tried using replace() with a regex as follows:

var answer = answerEditText.text.toString()
Log.d("debug", answer)
answer = answer.replace("[^A-Za-z0-9 ]", "").toLowerCase()
Log.d("debug", answer)

D/debug: Test. ,replace

D/debug: test. ,replace

Why are the punctuation characters still present? How can I get only the alphanumeric characters?

Distwo
  • I think you want `replaceAll`? – user94559 Aug 29 '17 at 02:13
  • `String.replace` searches for a literal string, while `String.replaceAll` searches for a regular expression. – user94559 Aug 29 '17 at 02:13
  • You have to create a regex object. Otherwise you're just replacing occurrences of the literal string `[^A-Za-z0-9 ]` which is obviously not in your input. – hasen Aug 29 '17 at 02:14
  • 7
    Although my suggestion (`replaceAll`) would work in Java, Kotlin has its own `String` class, which does not contain a definition for `replaceAll`. So please disregard my suggestion. – user94559 Aug 29 '17 at 02:24

8 Answers

102

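You need to create a Regex object. When the first argument to replace() is a String, it is treated as literal text, so the pattern is never interpreted as a regex: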

var answer = "Test. ,replace"
println(answer)
answer = answer.replace("[^A-Za-z0-9 ]", "") // doesn't work
println(answer)
val re = Regex("[^A-Za-z0-9 ]")
answer = re.replace(answer, "") // works
println(answer)

Try it online: https://try.kotlinlang.org/#/UserProjects/ttqm0r6lisi743f2dltveid1u9/2olerk6jvb10l03q6bkk1lapjn

hasen
  • 10
    Going with `val answer = answerEditText.text.toString().replace("[^A-Za-z0-9 ]".toRegex(), "").toLowerCase()` – Distwo Aug 29 '17 at 02:23
  • 2
    Alternatively `(?i)[^\\w\\d ]` - case insensitive checks using \w and \d instead of manually typing the matches – Zoe May 10 '18 at 10:00
  • @AbhijitSarkar, Swift also has extension functions. :) – CoolMind Jul 25 '18 at 16:21
  • @CoolMind Many languages do. My comment was with regards to the answer above, not in general applicable to all the languages that exist. – Abhijit Sarkar Jul 25 '18 at 17:03
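Putting the fix above together with the lower-casing from the question, a minimal sketch might look like the following. The function name normalizeAnswer is hypothetical, and lowercase() assumes Kotlin 1.5 or later (older versions would use toLowerCase(), as in the question).

```kotlin
// Hypothetical helper: strips everything except ASCII letters, digits and
// spaces, then lower-cases the result. The pattern is the one from the question.
fun normalizeAnswer(raw: String): String =
    raw.replace(Regex("[^A-Za-z0-9 ]"), "").lowercase()

// For comparison: the String overload looks for the literal text
// "[^A-Za-z0-9 ]", which does not occur in typical input.
fun literalReplace(raw: String): String =
    raw.replace("[^A-Za-z0-9 ]", "")
```

With the question's input, normalizeAnswer("Test. ,replace") returns "test replace", while literalReplace("Test. ,replace") returns the input unchanged.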
90

The standard library of Kotlin is beautiful like this. Just use String.filter combined with Char.isLetterOrDigit, like this:

val stringToFilter = "A1.2-b3_4C"
val lettersAndDigitsOnly = stringToFilter.filter { it.isLetterOrDigit() }
println(lettersAndDigitsOnly) // prints "A12b34C"
Mad Scientist Moses
  • Fun fact. The Pi symbol does not get filtered out with this filter. It is seen as a letter since it's a letter in the greek alphabet. TIL – jeepGirl90 Aug 08 '23 at 19:24
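To illustrate the comment above: isLetterOrDigit() is Unicode-aware, so letters outside the ASCII range survive the filter. If only ASCII letters and digits should be kept, one possible sketch is shown here. Both helper names are hypothetical and not part of the standard library.

```kotlin
// Hypothetical helper: true only for a-z, A-Z and 0-9.
fun Char.isAsciiLetterOrDigit(): Boolean =
    this in 'a'..'z' || this in 'A'..'Z' || this in '0'..'9'

// Hypothetical helper: keeps only ASCII letters and digits.
fun String.asciiAlphanumericOnly(): String =
    filter { it.isAsciiLetterOrDigit() }

// Unicode-aware variant from the answer, for comparison.
fun String.unicodeAlphanumericOnly(): String =
    filter { it.isLetterOrDigit() }
```

For the input "A1.2-b3_4C\u03C0" (ending in the Greek letter pi), asciiAlphanumericOnly() gives "A12b34C", whereas unicodeAlphanumericOnly() keeps the pi.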
25

You need to create a Regex, which you can do by calling toRegex() on the pattern string before passing it to replace:

var string = "AsAABa3a35e8tfyz.a6ax7xe" // var, because it is reassigned below
string = string.replace("[^\\d.]".toRegex(), "")

result: 3358.67

If you need to keep word characters (\w, which covers ASCII letters, digits, and the underscore) and spaces:

var string = "Test in@@ #Kot$#lin   FaaFS@@#$%^&StraßeFe.__525448=="
string = string.replace("[^\\w\\d ]".toRegex(), "")
println(string)

result: Test in Kotlin   FaaFSStraeFe__525448

Simson
Fakhar
  • 1
    The question was about *I am trying to remove all non alphanumeric characters from a string.* This regexp does something different – Simson Nov 12 '19 at 14:10
  • 1
    Please check my edited answer, but I am not sure how are you considering some special characters like à ß? In my answer it will remove them too... thanks – Fakhar Nov 13 '19 at 05:14
  • 2
    Better! I added a line about `toRegx()` as a intro – Simson Nov 13 '19 at 06:14
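On the question of characters such as à and ß raised in the comments: on Kotlin/JVM, \w matches only ASCII word characters by default, which is why ß is dropped in the result above. A sketch of a Unicode-aware alternative follows, assuming the JVM regex engine and its \p{L} (letter) and \p{N} (number) classes. The names used here are hypothetical.

```kotlin
// Assumption: Kotlin/JVM, where \p{L} and \p{N} match Unicode letters and numbers.
val NOT_LETTER_DIGIT_OR_SPACE = Regex("[^\\p{L}\\p{N} ]")

// Hypothetical helper: keeps Unicode letters, numbers and spaces.
fun String.keepLettersDigitsAndSpaces(): String =
    replace(NOT_LETTER_DIGIT_OR_SPACE, "")

// ASCII-only pattern from the answer, for comparison.
fun String.keepAsciiWordCharsAndSpaces(): String =
    replace(Regex("[^\\w\\d ]"), "")
```

Applied to "Straße 42!", the first helper keeps the ß and returns "Straße 42", while the second returns "Strae 42".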
19

I find this to be much more succinct and maintainable. Perhaps the previous answers were written before these extensions were added?

val alphaNumericString = someString.toCharArray()
   .filter { it.isLetterOrDigit() }
   .joinToString(separator = "")
Kyle Luce
  • 1
    I'd be interested in seeing what the performance of this was once reduced to bytecode – Tom Jun 17 '19 at 23:13
  • This seems wasteful as it creates two temporary arrays, and doesn't really increase maintainability in any real sense. It just "feels" nice because it's "functional". – hasen Jul 29 '20 at 02:23
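Regarding the intermediate collections mentioned in the comments: String.filter can be called on the string directly, which skips the toCharArray() and joinToString() steps. The sketch below only shows that the two forms give the same result. The function names are hypothetical, and no performance measurements are implied.

```kotlin
// The approach from the answer: char array -> filtered list -> joined string.
fun viaCharArray(s: String): String =
    s.toCharArray()
        .filter { it.isLetterOrDigit() }
        .joinToString(separator = "")

// Filtering the String directly; no explicit array or list is created here.
fun viaStringFilter(s: String): String =
    s.filter { it.isLetterOrDigit() }
```

Both functions return "A12b34C" for the input "A1.2-b3_4C".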
13

I think this is the easiest way (note that it keeps digits only):

fun String.toNumericString() = this.filter { it.isDigit() }
Artem Botnev
11
fun String.digitsOnly(): String {
    val regex = Regex("[^0-9]")
    return regex.replace(this, "")
}

// Note: this pattern also keeps spaces.
fun String.alphaNumericOnly(): String {
    val regex = Regex("[^A-Za-z0-9 ]")
    return regex.replace(this, "")
}

Usage:

val alphaNumeric = "my string #$".alphaNumericOnly()
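One possible variation, if these extensions are called often: the Regex objects can be created once at top level instead of on every call. This is only a sketch, and the constant names are hypothetical.

```kotlin
// Compiled once and reused by the extension functions below.
val NON_DIGIT = Regex("[^0-9]")
val NON_ALPHANUMERIC_OR_SPACE = Regex("[^A-Za-z0-9 ]")

fun String.digitsOnly(): String = replace(NON_DIGIT, "")

// Keeps ASCII letters, digits and spaces.
fun String.alphaNumericOnly(): String = replace(NON_ALPHANUMERIC_OR_SPACE, "")
```

Because spaces are kept, "my string #!".alphaNumericOnly() returns "my string " with its trailing space, and "Tel: +1 (555) 010-9999".digitsOnly() returns "15550109999".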
2

You can also do this without a regex, for example:

val ranges = ('0'..'9') + ('a'..'z') + ('A'..'Z')
val escaped = "1! at __ 2? at 345..0986 ZOk".filter { it in ranges }
2

Kotlin treats the first argument as a literal string to substitute, not as a regex, so you need to help it choose the overload that takes a Regex as its first argument.

Use the Regex type explicitly instead of a String:

"[^A-Za-z0-9 ]".toRegex()

or name the regex parameter explicitly:

answer.replace(regex = "[^A-Za-z0-9 ]", "") // does not compile: a String is not a Regex

With the named parameter, Kotlin will not compile the call unless you pass a real Regex rather than a String.
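For completeness, a version of the named-parameter call that does compile might look like this. The wrapper function name is hypothetical; regex and replacement are the parameter names of the standard library's replace overload that takes a Regex.

```kotlin
// Hypothetical wrapper: naming both parameters makes the Regex overload explicit.
fun stripNonAlphanumeric(answer: String): String =
    answer.replace(regex = "[^A-Za-z0-9 ]".toRegex(), replacement = "")
```

For the question's input, stripNonAlphanumeric("Test. ,replace") returns "Test replace".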

Maksim Kostromin