
I need to parse a duration string of the form 98d 01h 23m 45s into milliseconds.

I was hoping there was an equivalent of SimpleDateFormat for durations like this, but I couldn't find anything. Would anyone recommend for or against trying to use SDF for this purpose?

My current plan is to use regex to match against numbers and do something like

Pattern p = Pattern.compile("(\\d+)");
Matcher m = p.matcher("98d 01h 23m 45s");

if (m.find()) {
    int days = Integer.parseInt(m.group());
}
// etc. for hours, minutes, seconds

and then use TimeUnit to put it all together and convert to milliseconds.

I guess my question is: this seems like overkill, so can it be done more easily? Lots of questions about dates and timestamps turned up, but this is a little different, maybe.

skynet
  • Your approach looks fine so far. Careful with `Integer.parseInt()` and leading zeroes, though. Something like `... 08h ...` will be interpreted as octal and fail to parse. Strip off any leading zeroes. – Philipp Reichart Jun 13 '12 at 19:26
  • @PhilippReichart isn't it safe the way I am using it above, since `Integer.parseInt("08")` returns `8`? – skynet Jun 13 '12 at 19:34
  • Perhaps you're thinking of leading spaces. `System.out.println(Integer.parseInt("009"));` prints "9" on my machine. The `h` should not be passed to `parseInt`, however. – Gene Jun 13 '12 at 19:37
  • My bad. I always assumed `Integer.parseInt()` would not accept leading zeroes except for octal numbers. It's not mentioned in the javadocs but seems to work fine. Thanks for the hint :) – Philipp Reichart Jun 13 '12 at 21:01
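
To make the point from the comments concrete, here is a small sketch (the class and method names are made up for illustration). As far as the JDK behaviour goes, `Integer.parseInt(String)` always reads base 10, so zero-padded fields are fine; it is `Integer.decode`, not `parseInt`, that treats a leading `0` as an octal prefix.

```java
final class LeadingZeroDemo {

    /** Parses a zero-padded decimal field such as "08". */
    static int parseField(String digits) {
        // parseInt(String) always uses radix 10, so leading zeros are harmless.
        return Integer.parseInt(digits);
    }

    /** Returns true if Integer.decode refuses the given text. */
    static boolean decodeRejects(String digits) {
        try {
            Integer.decode(digits); // a leading 0 selects octal here
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseField("08"));      // decimal 8
        System.out.println(Integer.decode("010")); // octal 10, i.e. 8
        System.out.println(decodeRejects("08"));   // true: 8 is not an octal digit
    }
}
```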

4 Answers


Check out PeriodFormatter and PeriodParser from the Joda-Time library.

You can also use PeriodFormatterBuilder to build a parser for your strings, like this:

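import java.util.Locale;
import org.joda.time.DateTime;
import org.joda.time.MutablePeriod;
import org.joda.time.format.PeriodFormatterBuilder;
import org.joda.time.format.PeriodParser;

String periodString = "98d 01h 23m 45s";

// Each field is a number followed by its unit suffix.
PeriodParser parser = new PeriodFormatterBuilder()
   .appendDays().appendSuffix("d ")
   .appendHours().appendSuffix("h ")
   .appendMinutes().appendSuffix("m ")
   .appendSeconds().appendSuffix("s")   // no trailing space: the input ends at "s"
   .toParser();

MutablePeriod period = new MutablePeriod();
// parseInto returns the new position, or a negative value on failure.
parser.parseInto(period, periodString, 0, Locale.getDefault());

// A period has no fixed length, so it is resolved against a start instant.
long millis = period.toDurationFrom(new DateTime(0)).getMillis();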

Now, all this (especially the toDurationFrom(...) part) may look tricky, but I really advise you to look into JodaTime if you're dealing with periods and durations in Java.

Also look at this answer about obtaining milliseconds from JodaTime period for additional clarification.

npe
  • Thank you for the very detailed response! Unfortunately I am going to use a solution that doesn't require adding additional dependencies, but I am sure someone will use your code. – skynet Jun 14 '12 at 14:06

Using a Pattern is a reasonable way to go. But why not use a single one to get all four fields?

Pattern p = Pattern.compile("(\\d+)d\\s+(\\d+)h\\s+(\\d+)m\\s+(\\d+)s");

Then use the indexed group fetch.

EDIT:

Building off of your idea, I ultimately wrote the following method

import java.text.ParseException;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static final Pattern DURATION_PATTERN = Pattern
        .compile("(\\d+)d\\s+(\\d+)h\\s+(\\d+)m\\s+(\\d+)s");

/**
 * Parses a duration string of the form "98d 01h 23m 45s" into milliseconds.
 *
 * @throws ParseException if no duration of that form is found
 */
public static long parseDuration(String duration) throws ParseException {
    Matcher m = DURATION_PATTERN.matcher(duration);

    // find() also accepts a duration embedded in other text;
    // use matches() instead to require that the whole string fits.
    if (!m.find()) {
        throw new ParseException("Cannot parse duration " + duration, 0);
    }

    // Zero-padded fields such as "01" are read as plain decimal numbers.
    long days = Long.parseLong(m.group(1));
    long hours = Long.parseLong(m.group(2));
    long minutes = Long.parseLong(m.group(3));
    long seconds = Long.parseLong(m.group(4));

    return TimeUnit.DAYS.toMillis(days)
            + TimeUnit.HOURS.toMillis(hours)
            + TimeUnit.MINUTES.toMillis(minutes)
            + TimeUnit.SECONDS.toMillis(seconds);
}
skynet
Gene

The new java.time.Duration class in Java 8 lets you parse durations out of the box:

Duration.parse("P98DT01H23M45S").toMillis();

The format is slightly different, so your string would need adjusting prior to parsing.
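
One way the adjustment might look, as a rough sketch: it assumes all four fields (days, hours, minutes, seconds) are always present and separated by whitespace, and the class and method names are invented for the example. `Duration.parse` accepts the unit letters in lower case, so only the `P` prefix and the `T` separator have to be added.

```java
import java.time.Duration;

final class IsoAdjust {

    /** Assumes the input always has the form "<d>d <h>h <m>m <s>s". */
    static long toMillis(String text) {
        String iso = "P" + text.trim()
                .replaceFirst("\\s+", "T")  // first gap separates the day part from the time part
                .replaceAll("\\s+", "");    // remaining gaps are simply dropped
        return Duration.parse(iso).toMillis();
    }

    public static void main(String[] args) {
        // "98d 01h 23m 45s" becomes "P98dT01h23m45s" before parsing
        System.out.println(toMillis("98d 01h 23m 45s")); // 8472225000
    }
}
```

Strings without a day field would need `PT` as the prefix instead, which this sketch does not handle.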

Andrejs
  • I was just about to post a new SO question for parsing "00:07:21.5958786" and returning an integer (in millis or in nanos). Thanks! By the way, the documentation moved so it's worth noting that you recommend java.time.Duration, not other Duration classes. – Daniel May 10 '14 at 13:45

I suggest using java.time.Duration, which is modelled on the ISO-8601 standard and was introduced in Java 8 as part of the JSR-310 implementation.

As per the ISO-8601 standard, a duration of 1 day and 2 hours is represented as P1DT2H, whereas a duration of 2 hours is represented as PT2H. After converting your string into ISO-8601 format, you can parse it using the Duration#parse method. Note that the suffixes "D", "H", "M" and "S" for days, hours, minutes and seconds are accepted in upper or lower case (thanks to Ole V.V. for this information).
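
A quick illustration of the notation itself, before applying it to the string from the question (the class name is arbitrary):

```java
import java.time.Duration;

final class IsoNotationDemo {
    public static void main(String[] args) {
        Duration dayAndTwoHours = Duration.parse("P1DT2H");
        Duration twoHours = Duration.parse("PT2H");

        System.out.println(dayAndTwoHours.toHours()); // 26, a day counts as 24 hours
        System.out.println(twoHours.toMinutes());     // 120

        // Lower-case designators are accepted as well.
        System.out.println(Duration.parse("p1dt2h").equals(dayAndTwoHours)); // true

        // Duration prints itself using hours, minutes and seconds only.
        System.out.println(dayAndTwoHours); // PT26H
    }
}
```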

Demo:

import java.time.Duration;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) {
        // Test
        Stream.of(
                "98d 01h 23m 45s",
                "01d 02h 03m 04s",
                "02h 03m 04s",
                "03m 04s",
                "04s"
        ).forEach(s -> System.out.println(durationMillis(s)));
    }

    static long durationMillis(String s) {
        if (Pattern.compile("\\d+d\\s").matcher(s).find()) {
            int idxSpace = s.indexOf(" ");
            s = "P" + s.substring(0, idxSpace) + "T" + s.substring(idxSpace + 1);
        } else
            s = "PT" + s;
        s = s.replace(" ", "");
        return Duration.parse(s).toMillis();
    }
}

A sample run:

8472225000
93784000
7384000
184000
4000

Another way to solve it is by retrieving the value of each part (day, hour, minute, and second) and adding them to Duration.ZERO:

static long durationMillis(String s) {
    long millis = 0;
    Matcher matcher = Pattern.compile("(?:(?:(?:0*(\\d+)d\\s)?0*(\\d+)h\\s)?0*(\\d+)m\\s)?0*(\\d+)s").matcher(s);
    if (matcher.find()) {
        int days = matcher.group(1) != null ? Integer.parseInt(matcher.group(1)) : 0;
        int hours = matcher.group(2) != null ? Integer.parseInt(matcher.group(2)) : 0;
        int minutes = matcher.group(3) != null ? Integer.parseInt(matcher.group(3)) : 0;
        int seconds = matcher.group(4) != null ? Integer.parseInt(matcher.group(4)) : 0;
        millis = Duration.ZERO.plusDays(days).plusHours(hours).plusMinutes(minutes).plusSeconds(seconds).toMillis();
    }
    return millis;
}

Learn more about the modern date-time API* from Trail: Date Time.

Explanation of the regex:

  • (?:: Start of non-capturing group
    • (?:: Start of non-capturing group
      • (?:: Start of non-capturing group
        • 0*: Any number of zeros
        • (\\d+)d\\s: One or more digits (group#1) followed by d and a whitespace.
      • )?: End of non-capturing group. The ? makes it optional.
      • 0*: Any number of zeros
      • (\\d+)h\\s: One or more digits (group#2) followed by h and a whitespace.
    • )?: End of non-capturing group. The ? makes it optional.
    • 0*: Any number of zeros
    • (\\d+)m\\s: One or more digits (group#3) followed by m and a whitespace.
  • )?: End of non-capturing group. The ? makes it optional.
  • 0*: Any number of zeros
  • (\\d+)s: One or more digits (group#4) followed by s
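
To see the optional groups in action, here is a small sketch using the same regex (class and method names are hypothetical). It deliberately swaps `find()` for `matches()`, which is my own variation rather than part of the code above: with `find()`, an input such as "98d 45s" would quietly match only the "45s" tail, whereas `matches()` rejects it.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OptionalGroupDemo {
    private static final Pattern DURATION = Pattern.compile(
            "(?:(?:(?:0*(\\d+)d\\s)?0*(\\d+)h\\s)?0*(\\d+)m\\s)?0*(\\d+)s");

    /** Returns {days, hours, minutes, seconds} as captured; null marks an absent field. */
    static String[] fields(String s) {
        Matcher m = DURATION.matcher(s);
        if (!m.matches()) { // the whole string has to fit the pattern
            throw new IllegalArgumentException("Cannot parse duration: " + s);
        }
        return new String[] {m.group(1), m.group(2), m.group(3), m.group(4)};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fields("98d 01h 23m 45s"))); // [98, 1, 23, 45]
        System.out.println(Arrays.toString(fields("03m 04s")));         // [null, null, 3, 4]
    }
}
```

The null entries are why the method above checks each group against null before calling `Integer.parseInt`.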

* If for any reason you have to stick to Java 6 or Java 7, you can use ThreeTen-Backport, which backports most of the java.time functionality to Java 6 & 7. If you are working on an Android project and your Android API level is still not compliant with Java 8, check Java 8+ APIs available through desugaring and How to use ThreeTenABP in Android Project.

Arvind Kumar Avinash