2

I would like to know how to parse several double values out of a string that also contains other text, for instance: String s = "text 3.454 sometext5.567568more_text".

The standard method (Double.parseDouble) is unsuitable on its own. I've tried going through the string with the isDigit method, but how do I handle the other characters and the decimal point (.)?

Thanks.

Joachim Sauer
Helgus

4 Answers

7

You could search for the following regex:

Pattern.compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?")

and then use Double.parseDouble() on each match.

NPE
  • you can also use `\\d` instead of [0-9] – ratchet freak Feb 21 '12 at 16:53
  • Here's another good link should you choose to use a regex: http://www.regular-expressions.info/floatingpoint.html – paulsm4 Feb 21 '12 at 17:01
  • Regex is the way to go if you can spare a few cpu clock cycles. – vinnybad Feb 21 '12 at 18:43
  • It doesn't parse the doubles correctly. For instance, given a string like "sdf9.99e.23" it parses 9.99 or, with my modification, 9.99e23. But it should throw an exception or return false (or something else, just not a double value); it shouldn't return a double in such a case. – Helgus Feb 29 '12 at 13:38
3

Find the doubles with a suitable regular expression, as in this code or in the other answers, and add each match to a list as you iterate. The resulting myDoubles list is then ready to use anywhere else in your code.

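import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The wrapper class name is arbitrary; it is only here so the snippet compiles.
public class ParseDoubles
{
    public static void main ( String[] args )
    {
        String input = "text 3.454 sometext5.567568more_text";
        ArrayList < Double > myDoubles = new ArrayList < Double >();
        // Optional sign, optional integer part, optional '.', digits, optional exponent
        Matcher matcher = Pattern.compile( "[-+]?\\d*\\.?\\d+([eE][-+]?\\d+)?" ).matcher( input );

        // Parse every match and collect it
        while ( matcher.find() )
        {
            double element = Double.parseDouble( matcher.group() );
            myDoubles.add( element );
        }

        for ( double element: myDoubles )
            System.out.println( element );
    }
}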
Juvanis
  • I think it's just what I wanted! – Helgus Feb 21 '12 at 17:22
  • It doesn't parse the doubles correctly. For instance, given a string like "sdf9.99e.23" it parses 9.99 or, with my modification, 9.99e23. But it should throw an exception or return false; it shouldn't return a double in such a case. – Helgus Feb 28 '12 at 14:16
  • Because 9.99e is a double; as you know, "e" is used for the scientific notation of doubles. If you don't want to parse them, remove [eE] from the regex in the code. – Juvanis Feb 28 '12 at 14:40
  • The problem is that I do want to parse them. As you know, after eE there should be an integer (optionally with + or -), so the correct form is 9.99e1. But in the string we have 9.99e.1 (for example), and our parser shouldn't parse it. In other words: how do I check that after [eE] comes a digit or [+-] and no other character? I think that makes my question clear. Thanks for the answers! – Helgus Feb 28 '12 at 15:10
  • @Helgus The input "sdf9.99e.23" contains 2 doubles: 9.99 and 0.23. You will see that these match the regex and are the output. My answer is valid for your original question, which doesn't specify what you are describing now; you should ask that as another question. By the way, the regex in my answer is the general regex for doubles. – Juvanis Feb 28 '12 at 17:40
  • The input "sdf9.99e.23" contains no doubles, because if we have [eE], it MUST be followed by [+-] or just [0-9]. So I need some kind of "if" in the regex. In pseudo-code it would look like this: `if(char[i]==(e|E)) then if(char[i+1] == ('+'|'-')) else return null ` – Helgus Feb 29 '12 at 08:05
  • @Helgus: according to that logic the sample text `text 3.454 sometext5.567568more_text` contains no double values either, because in a double a digit must be followed by *neither* a ` ` nor an `m`. What makes the `e` special here? – Joachim Sauer Mar 01 '12 at 08:14
  • @JoachimSauer An [eE] after a digit means that it must be followed by [+-] and a digit, or by a digit, which specifies the exponent. For example: 9.99e23 = 9.99 * 10^23 (^ = power) – Helgus Mar 01 '12 at 13:08
  • @Helgus: yes, I understand that. But an `X`, for example, isn't valid after a digit in a double number either, and when the input is `123X` you simply ignore it and say that the input contains `123`. What is the **rule** that makes you ignore the `X` but not the `e`... – Joachim Sauer Mar 01 '12 at 13:34
  • @JoachimSauer The rule is that e is the marker for the scientific notation of a number. So if we have [eE] after a digit, we must check whether it is followed by [+-] and digits, or just digits; if not, we must return no value. This rule applies only to the [eE] character. – Helgus Mar 01 '12 at 14:07
1
  1. Parse out the substring (surrounded by whitespace)

  2. Use Double.parseDouble() to get the numeric value

Here's one example, using "split()" to parse (there are many alternatives):

// http://pages.cs.wisc.edu/~hasti/cs302/examples/Parsing/parseString.html

String phrase = "the music made   it   hard      to        concentrate";
String delims = "[ ]+";
String[] tokens = phrase.split(delims);
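Putting the two steps together might look roughly like the sketch below. The class and method names (TokenParser, parseTokens) are made up for illustration, and it assumes that any token which is not a number on its own should simply be skipped. Note that with the string from the question it only finds 3.454, because 5.567568 is not surrounded by whitespace.

```java
import java.util.ArrayList;
import java.util.List;

class TokenParser {
    // Hypothetical helper: split on whitespace, keep the tokens that parse as doubles.
    static List<Double> parseTokens(String input) {
        List<Double> result = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            try {
                result.add(Double.parseDouble(token));
            } catch (NumberFormatException e) {
                // Not a number on its own (e.g. "text" or
                // "sometext5.567568more_text"), so skip it.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [3.454]
        System.out.println(parseTokens("text 3.454 sometext5.567568more_text"));
    }
}
```

Be aware that Double.parseDouble() also accepts tokens such as "NaN" and "Infinity", so filter those out first if they can appear as ordinary words in your input.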

Here's a second alternative:

Java: how to parse double from regex

Community
paulsm4
0

You'll need to think about what sort of algorithm you'd want to use to do this, as it's not entirely obvious. If a substring is asdf.1asdf, should that be parsed as the decimal value 0.1 or simply 1?

Also, can some of the embedded numbers be negative? If not, this greatly simplifies the search space.

I think that aix is on the right track with using a regex, since once you come up with an algorithm this sounds like the kind of job for a state machine (scan through the input until you find a digit or optionally a - or ., then look for the next "illegal" character and parse the substring normally).
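To make the scanning idea concrete, here is one possible hand-written scanner. It is only a sketch under my own assumptions: the names (DoubleScanner, scan) are invented, a leading - counts as a sign, a . is only consumed when digits follow it (so asdf.1asdf yields 0.1), and exponents are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

class DoubleScanner {
    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Hypothetical helper: collects runs shaped like [-]digits[.digits] or [-].digits.
    static List<Double> scan(String s) {
        List<Double> out = new ArrayList<>();
        int n = s.length();
        int i = 0;
        while (i < n) {
            int j = i;
            if (s.charAt(j) == '-') j++;              // optional sign
            int intDigits = 0;
            while (j < n && isAsciiDigit(s.charAt(j))) { j++; intDigits++; }
            int fracDigits = 0;
            if (j < n && s.charAt(j) == '.') {
                int k = j + 1;
                while (k < n && isAsciiDigit(s.charAt(k))) { k++; fracDigits++; }
                if (fracDigits > 0) j = k;            // take the '.' only if digits follow
            }
            if (intDigits + fracDigits > 0) {
                out.add(Double.parseDouble(s.substring(i, j)));
                i = j;                                // continue after the number
            } else {
                i++;                                  // no number starts here
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [3.454, 5.567568]
        System.out.println(scan("text 3.454 sometext5.567568more_text"));
    }
}
```

Whether a lone - in front of a digit should count as a sign (as in x-7) is exactly the kind of decision you have to make for your own data.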

It's the edge cases that you have to think about though. For example, without negative numbers you can almost use s.split("[^0-9.]") and keep only the non-empty elements. However, period characters that aren't part of a number will trip you up. Whatever solution you go with, think about whether any situations could break it.
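A rough version of the split idea, assuming no negative numbers and no exponents, could look like this. The names (SplitParser, parseBySplit) are invented, and I have added a + to the pattern so that runs of non-numeric characters produce fewer empty pieces. Stray periods are handled by letting Double.parseDouble() reject them.

```java
import java.util.ArrayList;
import java.util.List;

class SplitParser {
    // Hypothetical helper: split on anything that is not a digit or a period,
    // then keep the pieces that really are numbers.
    static List<Double> parseBySplit(String s) {
        List<Double> result = new ArrayList<>();
        for (String piece : s.split("[^0-9.]+")) {
            if (piece.isEmpty()) continue;            // drop empty elements
            try {
                result.add(Double.parseDouble(piece));
            } catch (NumberFormatException e) {
                // Pieces such as "." or "1.2.3" are not valid doubles; skip them.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseBySplit("text 3.454 sometext5.567568more_text"));
    }
}
```

This still loses signs (-5 comes out as 5.0) and silently drops things like version numbers, so it only fits if your input never contains them.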

Andrzej Doyle