229

I have a Java String object. I need to extract only digits from it. I'll give an example:

"123-456-789" I want "123456789"

Is there a library function that extracts only digits?

Thanks for the answers. Before I try these, I need to know: do I have to install any additional libraries?

codaddict
user488469

15 Answers

612

You can use regex and delete non-digits.

str = str.replaceAll("\\D+","");
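If the replacement runs many times, one option is to compile the pattern once and reuse it, since `String.replaceAll` compiles its regex on every call. A minimal sketch, assuming a hypothetical helper class named `DigitsOnly` (the class and method names are illustrative, not part of any library):

```java
import java.util.regex.Pattern;

public final class DigitsOnly {
    // Compiled once and reused; \D+ matches each run of non-digit characters.
    private static final Pattern NON_DIGITS = Pattern.compile("\\D+");

    private DigitsOnly() {}

    // Returns the input with every non-digit character removed.
    public static String of(String input) {
        return NON_DIGITS.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(DigitsOnly.of("123-456-789")); // prints 123456789
    }
}
```

This uses only `java.util.regex` from the JDK, so no additional library is needed.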
Matt
codaddict
  • 6
    nice short code. A linear search might be faster but i think yours makes more sense. – kasten Oct 27 '10 at 07:46
  • 1
    @BjornS oh please: this is a standard solution, easy, understandable and fast enough for most purposes. It may not be a best practice (although I'd argue about that) but it certainly doesn't deserve a downvote. – Sean Patrick Floyd Oct 27 '10 at 12:39
  • 1
    @seanizer ok, When does standard become best practise or maintainable? I down voted this not because I couldn't understand this or because it isn't standard but rather that it perpetuates bad code. As I said I've done the same myself but if at all possible I try to avoid it. If I shouldn't down vote for this then what should I down vote for? I'm sorry if I came across as unpleasant. – BjornS Oct 27 '10 at 13:11
  • 23
    I guess you can downvote anything you like to downvote (no sarcasm intended). But my personal opinion is: when great developers (and we have lots of them here) share some of their advice for free, then I'm going to honor that, and I only downvote stuff that's really awful (check my profile, my current ratio is 14xx up against 17 down). But that's my personal philosophy and you are free to have your own. – Sean Patrick Floyd Oct 27 '10 at 13:19
  • 89
    This won't work if your number has a decimal point; it removes the decimal point too. `str = str.replaceAll("[^\\.0123456789]","");` – Aravindan R Jan 10 '12 at 22:21
  • 2
    Although the regex is supremely simple and clean to look at, it suffers from performance issues and should only be used where you have a one-off strip (like a form submit). If you are processing a lot of data, this is not the way to go. – Brill Pappin Dec 19 '12 at 21:36
  • 2
    and if you need to exclude anything, like a decimal point, `(?!\\.)` – azerafati Apr 17 '14 at 10:43
  • 1
    This is exactly what regular expressions are good at, and if you think about what it does internally, it does not have any obvious inefficiencies. I believe that the detractors just have a knee-jerk reaction to regexes, stemming from seeing too many abuses of regexes in other cases. – Svante Jun 03 '15 at 10:48
  • If you don't want to remove the decimal points and other characters, you need to use a [^0-9] character class in the replaceAll and add the characters you want to keep. – luke cross Feb 11 '20 at 20:54
  • @AravindanR your solution fails if the number has a negative sign in front of it. Can you please suggest what modification should be made to your solution? – vikramvi Jan 06 '21 at 11:55
  • str.replaceAll("[^0-9.-]","") if your number is negative (the hyphen goes last in the class so that it is read as a literal minus sign, not as a range) – vikramvi Jan 06 '21 at 11:58
50

Here's a more verbose solution. Less elegant, but probably faster:

public static String stripNonDigits(
            final CharSequence input /* inspired by seh's comment */){
    final StringBuilder sb = new StringBuilder(
            input.length() /* also inspired by seh's comment */);
    for(int i = 0; i < input.length(); i++){
        final char c = input.charAt(i);
        if(c > 47 && c < 58){
            sb.append(c);
        }
    }
    return sb.toString();
}

Test Code:

public static void main(final String[] args){
    final String input = "0-123-abc-456-xyz-789";
    final String result = stripNonDigits(input);
    System.out.println(result);
}

Output:

0123456789

BTW: I did not use Character.isDigit(ch) because it accepts many other characters besides 0-9.
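To illustrate the difference, here is a small sketch comparing `Character.isDigit` with an explicit ASCII range check. The class name `IsDigitDemo` and the helper `isAsciiDigit` are hypothetical names chosen for this example; the test character is the Arabic-Indic digit three.

```java
public final class IsDigitDemo {
    private IsDigitDemo() {}

    // True only for the ten ASCII digits '0' through '9'.
    public static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        char arabicIndicThree = '\u0663'; // ARABIC-INDIC DIGIT THREE
        System.out.println(Character.isDigit(arabicIndicThree)); // prints true
        System.out.println(isAsciiDigit(arabicIndicThree));      // prints false
        System.out.println(isAsciiDigit('7'));                   // prints true
    }
}
```

Writing the bounds as `'0'` and `'9'` instead of 47 and 58 also avoids having to remember the ASCII codes.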

Sean Patrick Floyd
  • 4
    You should provide a size to the `StringBuilder` constructor (such as `input.length()`) to ensure that it won't need to reallocate. You don't need to demand a `String` here; `CharSequence` suffices. Also, you can separate the allocation of the `StringBuilder` from the collection of non-digits by writing a separate function that accepts a `CharSequence` as input and an `Appendable` instance as an output accumulator. – seh Nov 02 '10 at 00:18
  • 1
    @seh Sounds interesting but rather than commenting why not create your own answer with the extensions? – RedYeti Jul 02 '12 at 14:34
  • 4
    @RedYeti Letting this answer remain and adding a comment is more honourable since Sean receives upvotes then. It's also a lot quicker to critique others' code than rewrite it if you're in a hurry. Don't punish seh for making a valuable contribution, he didn't have to add those useful tidbits, and your response makes him less likely to do so next time. – KomodoDave Apr 20 '13 at 14:18
  • 2
    I'm not "punishing" anyone - that's a complete misinterpretation of what I was saying to @seh. My point was that his comments added so much which was worthwhile and in fact changed so much that I felt it warranted an answer of its own. I'm sure Sean Patrick Floyd isn't concerned with kudos, only with helping others, and would be perfectly happy with seh providing his own answer. I was merely encouraging seh since I felt his contribution deserved greater visibility. How it's possible to read my comment as anything else completely puzzles me but I apologise to seh if it somehow did. – RedYeti Apr 22 '13 at 10:50
  • 1
    I like how these discussions pick up after lying dormant for a while. Perhaps the best thing to do here is for me to edit Sean's answer, augmenting it with my suggestions. That way, Sean will continue to receive the credit unless the answer transitions to community wiki status. – seh Apr 22 '13 at 22:27
  • Hey - good to see seh's answer partly coded up but the more interesting part to me was suggestion of using the CharSequence and Appendable instance. (And thanks to @KomodoDave for firing this up again!) – RedYeti Apr 23 '13 at 11:13
  • 1
    @RedYeti *sigh* ok, added the CharSequence, too – Sean Patrick Floyd Apr 23 '13 at 13:44
  • @RedYeti Thank you, I understand the tone of your original comment now. It's concise enough to interpret either way hence my misapprehension, my apologies. – KomodoDave Apr 24 '13 at 11:57
  • @KomodoDave Hey just glad it was all cleared up! Though it did kinda spiral - sorry to be any bother Sean Patrick Floyd! It can be hard to get a particular tone to come across in comments - will be more careful with phrasing in future :) – RedYeti Apr 25 '13 at 13:11
  • I will confess to not having memorized the ASCII table, and not being completely sure about the inner workings of Java, I was a little put off by the numeric bounds. No offense intended. Point is, why not define something like `int lower = ((int)'0'-1);` and `int upper = ((int)'9'+1);` before the **for** loop. Then in your **if** statement use those variables. I think it would make it slightly easier to read. – Raystorm Oct 02 '13 at 22:59
  • @Raystorm these days I'd go for Emil's Guava answer anyway. Should perform well enough and is certainly more readable – Sean Patrick Floyd Oct 03 '13 at 10:01
  • 1
    I agree the Guava answer is very short and readable. However, since I'm not actually using Google Guava in my own work I like the "pure" java answer. I just thought I would add my own 2 cents on how to make the java code more readable. – Raystorm Oct 03 '13 at 22:22
  • codaddict's solution worked, but it also left dots in the result. This one worked and helped me a lot. Thanks! – marson Jul 28 '14 at 13:00
  • After reading the comments I ran a performance test on a very long String containing few digits. The results: the Guava approach is 2.5-3 times slower, the regular expression with \D+ is 3-3.5 times slower, and the regular expression with only \D is 25+ times slower. – Perlos Jul 12 '16 at 08:40
  • This code using a CharBuffer and a while loop (without accessing array positions) is 23% faster, and using a char[] instead of a StringBuilder gives a further 3% boost. – Perlos Jul 12 '16 at 08:48
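seh's suggestion of separating the output accumulator from the filtering could be sketched roughly as follows. This is only one reading of that comment: the class name `DigitAppender`, the method name `appendDigits`, and the choice to wrap `IOException` in `UncheckedIOException` are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

public final class DigitAppender {
    private DigitAppender() {}

    // Appends the ASCII digits of input to out and returns out,
    // so the caller decides where the digits are collected.
    public static <A extends Appendable> A appendDigits(CharSequence input, A out) {
        try {
            for (int i = 0; i < input.length(); i++) {
                char c = input.charAt(i);
                if (c >= '0' && c <= '9') {
                    out.append(c);
                }
            }
        } catch (IOException e) {
            // Appendable.append declares IOException; rethrow it unchecked.
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        StringBuilder sb = appendDigits("0-123-abc-456-xyz-789", new StringBuilder());
        System.out.println(sb); // prints 0123456789
    }
}
```

Because `StringBuilder`, `StringWriter` and `CharBuffer` all implement `Appendable`, the same method can fill any of them.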
22
public String extractDigits(String src) {
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < src.length(); i++) {
        char c = src.charAt(i);
        if (Character.isDigit(c)) {
            builder.append(c);
        }
    }
    return builder.toString();
}
dogbane
  • I thought of using Character.isDigit() myself, but it also accepts some characters that are not 0-9 (see docs: http://download.oracle.com/javase/6/docs/api/java/lang/Character.html#isDigit%28char%29 ) – Sean Patrick Floyd Oct 27 '10 at 07:58
22

Using Google Guava:

CharMatcher.inRange('0','9').retainFrom("123-456-789")

UPDATE:

Using a precomputed CharMatcher can further improve performance:

CharMatcher ASCII_DIGITS = CharMatcher.inRange('0', '9').precomputed();
ASCII_DIGITS.retainFrom("123-456-789");
Derek Mahar
Emil
20
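input.replaceAll("[^0-9.]","")

This keeps the decimal point and removes everything else that is not a digit. (Inside a character class the dot needs no escaping, and adding `?` or `!` to the class would keep those characters as well.)

E.g. if the input is 445.3kg, the output will be 445.3.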
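If a minus sign should survive too, the character class can be extended. The following is a minimal sketch, assuming a hypothetical helper `keepNumericChars`; it is a blunt character filter rather than a number parser, so any hyphen or dot in the input is kept, wherever it appears.

```java
public final class NumericChars {
    private NumericChars() {}

    // Keeps ASCII digits, '.' and '-'; removes every other character.
    // The hyphen is placed last in the class so it is a literal, not a range.
    public static String keepNumericChars(String input) {
        return input.replaceAll("[^0-9.-]", "");
    }

    public static void main(String[] args) {
        System.out.println(keepNumericChars("445.3kg"));     // prints 445.3
        System.out.println(keepNumericChars("-12.5 C"));     // prints -12.5
        System.out.println(keepNumericChars("123-456-789")); // prints 123-456-789
    }
}
```

The last line shows the limitation: separators that happen to be hyphens are retained, so this only suits inputs that hold a single number.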

trooper
user3679646
11

Using Google Guava:

CharMatcher.DIGIT.retainFrom("123-456-789");

CharMatcher is plug-able and quite interesting to use, for instance you can do the following:

String input = "My phone number is 123-456-789!";
String output = CharMatcher.is('-').or(CharMatcher.DIGIT).retainFrom(input);

output == 123-456-789

BjornS
  • Very nice solution (+1), but it suffers from the same problem as others: lots of characters qualify as unicode digits, not only the ascii digits. This code will retain all of these characters: http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5Cp%7Bdigit%7D – Sean Patrick Floyd Oct 27 '10 at 08:41
  • @seanizer: Then will this be better CharMatcher.inRange('1','9').retainFrom("123-456-789") – Emil Oct 27 '10 at 10:32
  • @Emil more like CharMatcher.inRange('0','9'), but: yes – Sean Patrick Floyd Oct 27 '10 at 10:33
  • inRange is what lies behind CharMatcher.DIGIT; http://pastie.org/1252471 It simply takes into account additional Unicode number ranges. I would still consider these as digits, since in reality they are; they are simply not ASCII encoded. – BjornS Oct 27 '10 at 11:31
  • You can also use CharMatcher.JAVA_DIGIT for the same purpose, that will only accept digits as per Character.isDigit – BjornS Oct 27 '10 at 11:34
8
public class FindDigitFromString {

    public static void main(String[] args) {
        String s = "  Hi How Are You 11  ";
        // Replace every run of non-digit characters, matched by "[^0-9]+", with nothing.
        String s1 = s.replaceAll("[^0-9]+", "");
        System.out.println(s1);
    }
}

Output: 11

Robert Moskal
ruchin khare
7

Use a regular expression to match the digits.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "123-456-789";
String regex = "(\\d+)";
Matcher matcher = Pattern.compile(regex).matcher(str);
// Print each run of digits in turn; together they form 123456789.
while (matcher.find()) {
    System.out.print(matcher.group());
}
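To get the digits back as a String instead of printing them, the matches can be collected in a `StringBuilder`. A minimal sketch of that variation, assuming a hypothetical class `DigitRuns` with a method `join`:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class DigitRuns {
    // \d+ matches each run of consecutive digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    private DigitRuns() {}

    // Concatenates every run of digits found in the input.
    public static String join(CharSequence input) {
        StringBuilder sb = new StringBuilder(input.length());
        Matcher matcher = DIGITS.matcher(input);
        while (matcher.find()) {
            sb.append(matcher.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(DigitRuns.join("123-456-789")); // prints 123456789
    }
}
```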
Raghunandan
5

Inspired by Sean Patrick Floyd's code, I rewrote it slightly for the best performance I could get.

// Requires: import java.nio.CharBuffer;
public static String stripNonDigitsV2( CharSequence input ) {
    if (input == null)
        return null;
    if ( input.length() == 0 )
        return "";

    char[] result = new char[input.length()];
    int cursor = 0;
    CharBuffer buffer = CharBuffer.wrap( input );

    while ( buffer.hasRemaining() ) {
        char chr = buffer.get();
        if ( chr > 47 && chr < 58 )
            result[cursor++] = chr;
    }

    return new String( result, 0, cursor );
}

I ran a performance test on a very long String containing few digits. Relative to the code above, the results were:

  • Original code is 25.5% slower
  • Guava approach is 2.5-3 times slower
  • Regular expression with \D+ is 3-3.5 times slower
  • Regular expression with only \D is 25+ times slower

Note that it depends on how long the string is. With a string that contains only 6 digits, Guava is 50% slower and the regexp is 1 times slower.

Perlos
5

Using Kotlin and Lambda expressions you can do it like this:

val digitStr = str.filter { it.isDigit() }
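For comparison, a similar filter can be written in Java 8 or later with streams. This is a sketch with hypothetical names (`StreamDigits`, `digitsOf`), and it deliberately differs from the Kotlin line in one respect: it keeps only the ASCII digits 0-9, whereas Kotlin's `isDigit()` also accepts other Unicode digits.

```java
public final class StreamDigits {
    private StreamDigits() {}

    // Streams the chars of the input, keeps ASCII digits, and rebuilds a String.
    public static String digitsOf(String input) {
        return input.chars()
                .filter(c -> c >= '0' && c <= '9')
                .collect(StringBuilder::new,
                         StringBuilder::appendCodePoint,
                         StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(digitsOf("123-456-789")); // prints 123456789
    }
}
```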
Joel Broström
3

You can use str.replaceAll("[^0-9]", "");

sendon1982
2

I have adapted the code for phone numbers such as +9 (987) 124124.

Unicode characters occupy 4 bytes.

public static String stripNonDigitsV2( CharSequence input ) {
    if (input == null)
        return null;
    if ( input.length() == 0 )
        return "";

    char[] result = new char[input.length()];
    int cursor = 0;
    CharBuffer buffer = CharBuffer.wrap( input );
    int i=0;
    while ( i< buffer.length()  ) { //buffer.hasRemaining()
        char chr = buffer.get(i);
        if (chr=='u'){
            i=i+5;
            chr=buffer.get(i);
        }

        if ( chr > 39 && chr < 58 )
            result[cursor++] = chr;
        i=i+1;
    }

    return new String( result, 0, cursor );
}
Kairat Koibagarov
2

Code:

public class saasa {

    public static void main(String[] args) {
        // TODO Auto-generated method stub
        String t="123-456-789";
        t=t.replaceAll("-", "");
        System.out.println(t);
    }
}
muneebShabbir
0
import java.util.*;
public class FindDigits{

 public static void main(String []args){
    FindDigits h=new  FindDigits();
    h.checkStringIsNumerical();
 }

 void checkStringIsNumerical(){
    String h="hello 123 for the rest of the 98475wt355";
     for(int i=0;i<h.length();i++)  {
      if(h.charAt(i)!=' '){
       System.out.println("Is this '"+h.charAt(i)+"' is a digit?:"+Character.isDigit(h.charAt(i)));
       }
    }
 }

void checkStringIsNumerical2(){
    String h="hello 123 for 2the rest of the 98475wt355";
     for(int i=0;i<h.length();i++)  {
         char chr=h.charAt(i);
      if(chr!=' '){
       if(Character.isDigit(chr)){
          System.out.print(chr) ;
       }
       }
    }
 }
}
0

I use this:

Kotlin

str.replace("[^0-9/]".toRegex(), "")