0

I have a String:

String s = "12 text var2 14 8v 1";

I need to get only numbers from this string like:

12 14 1.

But I don't need words like:

var2 and 8v.

I tried this:

s = s.replaceAll("[^\\d.]", "");
msrd0
dshil

5 Answers

2

If you really want to use String.replaceAll for this, there's a workaround:

//            | one or more non-digits
//            |   | followed by one or more digits
//            |   |   | followed by one or more non-digits
//            |   |   |    | or the end of the input      
//            |   |   |    |     | replace with single white space
s.replaceAll("\\D+\\d+(\\D+|$)", " ");

Output

12 14 1

However, this solution is ugly and might break with different inputs.

I recommend matching the numbers you want instead, and gathering them by iterating over the input.

Something along the lines of:

//                           | word boundary
//                           |  | one or more digits
//                           |  |    | word boundary
Pattern p = Pattern.compile("\\b\\d+\\b");
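
The line above only compiles the pattern; collecting the numbers still requires iterating over the matches. A minimal sketch of how that iteration might look, assuming a hypothetical class NumberExtractor and helper extractNumbers (neither is part of the original answer):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberExtractor {
    // One or more digits with a word boundary on each side.
    private static final Pattern STANDALONE_DIGITS = Pattern.compile("\\b\\d+\\b");

    // Hypothetical helper: collects every standalone number in order of appearance.
    static List<String> extractNumbers(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = STANDALONE_DIGITS.matcher(input);
        while (m.find()) {
            numbers.add(m.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        String s = "12 text var2 14 8v 1";
        System.out.println(String.join(" ", extractNumbers(s))); // prints "12 14 1"
    }
}
```

Keep in mind that \b also sees a boundary next to punctuation, so an input such as "1.5" would yield 1 and 5 as two separate matches.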
Mena
  • could you please explain how the first example might break? – Michał Schielmann Aug 26 '14 at 09:43
  • @MichałSchielmann well I just edited my answer, because adding a non-digit+ followed by digit+ item at the end of the input would break the solution - i.e. if input was `"12 text var2 14 8v 1 a1"`. I don't have a clearer picture of further edge cases, only that it looks tricky. – Mena Aug 26 '14 at 09:50
0

The key here is word boundaries (\b). This seems to work:

String s = "x4 12 text var2 14 8v 1 1a";
s = s.replaceAll("\\b[\\d.]*[^ \\d.]+[\\d.]*\\b", "").replaceAll("  +", " ").trim();
System.out.println(s); // "12 14 1"

What that does is look for word boundaries on either side of anything that has at least one non-digit, non-decimal-point, non-space character in it, and remove the entire match. You may need to add more than just spaces to the negated character class in the middle, depending on your input. Then I collapse and trim the extraneous spaces.
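
As an illustration of widening the negated character class, here is a sketch that excludes all whitespace (\s) instead of only the space character. With the space-only class, a tab would itself count as a "non-digit" character, so a number following a tab could be swallowed into a match. The class name WordBoundaryFilter and the helper keepNumbers are hypothetical, not from the answer above:

```java
class WordBoundaryFilter {
    // Hypothetical helper: same approach as the snippet above, but the negated
    // class excludes any whitespace (\s), and runs of whitespace are collapsed.
    static String keepNumbers(String input) {
        return input
                .replaceAll("\\b[\\d.]*[^\\s\\d.]+[\\d.]*\\b", "")
                .replaceAll("\\s+", " ")
                .trim();
    }

    public static void main(String[] args) {
        System.out.println(keepNumbers("x4 12 text var2 14 8v 1 1a")); // prints "12 14 1"
        System.out.println(keepNumbers("12\t14 8v"));                  // prints "12 14"
    }
}
```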

T.J. Crowder
0

Another solution, using Guava and Apache Commons Lang:

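// Requires Guava and Apache Commons Lang on the classpath (lang3 package assumed here).
import com.google.common.base.CharMatcher;
import com.google.common.base.Splitter;
import org.apache.commons.lang3.StringUtils;

String s = "12 text var2 14 8v 1";
// Newer Guava releases expose this matcher as CharMatcher.breakingWhitespace().
Iterable<String> split = Splitter.on(CharMatcher.BREAKING_WHITESPACE).split(s);

for (String string : split) {
    // StringUtils has no isNumber method; isNumeric is true when every character is a digit.
    if (StringUtils.isNumeric(string)) {
        System.out.println(string);
    }
}

// Prints 12, 14 and 1, each on its own line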
Balicanta
0

You can use the Scanner class to scan every word in the sentence, and pass each word to a method that checks whether it is a number.

// Requires: import java.util.Scanner;

static boolean isNumber(String a) {
    try {
        Integer.parseInt(a);
    } catch (NumberFormatException e) {
        return false; // not parseable as an int, e.g. "text" or "8v"
    }
    return true; // an int was successfully parsed
}

public static void main(String[] args) {
    String s = "12 text var2 14 8v 1";
    Scanner in = new Scanner(s);
    StringBuilder result = new StringBuilder();

    while (in.hasNext()) { // scan every whitespace-separated word
        String a = in.next();
        if (isNumber(a)) { // keep the word only if it is a number
            result.append(a).append(' ');
        }
    }

    // trim() removes the trailing space and is safe even when nothing matched
    System.out.println(result.toString().trim()); // prints "12 14 1"
}
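
The same tokenize-and-check idea can also be written more compactly with streams. This is only a sketch under assumptions: the class TokenFilter and helper digitsOnlyTokens are hypothetical names, and the check uses a digits-only regex instead of Integer.parseInt, so it rejects signed values such as "-5" and accepts digit runs too long to fit in an int.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class TokenFilter {
    // Hypothetical helper: keeps only the whitespace-separated tokens
    // that consist entirely of digits, joined by single spaces.
    static String digitsOnlyTokens(String input) {
        return Arrays.stream(input.trim().split("\\s+"))
                .filter(token -> token.matches("\\d+"))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(digitsOnlyTokens("12 text var2 14 8v 1")); // prints "12 14 1"
    }
}
```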

gkrls
-1

Try this regex:

([0-9]*[^0-9\s]+[0-9]*\s*)

This matches every token that optionally begins with digits ([0-9]*), contains one or more characters that are neither digits nor whitespace ([^0-9\s]+), and optionally ends with digits followed by whitespace ([0-9]*\s*). In other words, it matches everything except the standalone numbers. It works for all kinds of characters, including special characters.

Using it this way would result in what you need:

String myString = "12 text var2 14 8v 1";
myString = myString.replaceAll("([0-9]*[^0-9\\s]+[0-9]*\\s*)", "");
System.out.println(myString);

Output:

12 14 1
Michał Schielmann