I want to be able to parse a period of time (days, hours, minutes) plus an optional string, so the input looks like this: <time>(white_spaces)<optional_string>. As I know regex is the right tool for such things, so I came up with this expression:

Pattern.compile("((?<days>\\d+)d)?((?<hours>\\d+)h)?((?<minutes>\\d+)m)?\\s+?(?<reason>.+)?");

Basically it works as expected. However, in this expression all the time groups (days, hours, minutes) are optional, and I want the input to contain at least the minutes group. If hours or days are specified, though, minutes are not required. Also, all combinations of time groups (d+h, h+m, d+m, d+h+m) are possible. So how can I correct my expression? Or is there another way to parse a period of time?

EDIT: examples of inputs:

12h64m - correct

12d43m dsd - correct

- empty string - not correct

12m - correct

12d32h43m - correct

sdsds - not correct - no "time group specified"

Basil Bourque
peter Schiza
  • You should include some examples that you wish to match. – Gurmanjot Singh Feb 24 '18 at 15:21
  • I updated question and added some examples. – peter Schiza Feb 24 '18 at 15:25
  • I think your validation rules need to be separate from your regex. The regex will let you parse the input string, but then you'll need separate code to say that at least one of the groups has to have a value. – asherber Feb 24 '18 at 15:27
  • Yeah, i had that in mind but thought there might be a way to do this with regex. I will, then, stay with my expression and after the match check if any of the "time groups" has value. – peter Schiza Feb 24 '18 at 15:32
  • You could add a lookahead at the beginning of your expression to check if any time group is defined. `(?=\\d+[dhm])` should do the job. – PJProudhon Feb 24 '18 at 15:32
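For what it's worth, the lookahead idea from the last comment could be sketched roughly like this. The class name `TimeGroupLookahead` is made up for illustration, and wrapping the whitespace and reason into one optional group is my own assumption, not part of the original expression. It uses `matches()`, so the whole input has to fit the pattern.

```java
import java.util.regex.Pattern;

final class TimeGroupLookahead {
    // The expression from the question with the suggested lookahead in front.
    // Assumption: the whitespace + reason tail is one optional group, so a bare
    // "12m" is accepted and a reason always needs separating whitespace.
    static final Pattern PATTERN = Pattern.compile(
            "(?=\\d+[dhm])"
            + "((?<days>\\d+)d)?((?<hours>\\d+)h)?((?<minutes>\\d+)m)?"
            + "(\\s+(?<reason>.+))?");

    // True when the entire input is a time token, optionally followed by a reason.
    static boolean isValid(String input) {
        return PATTERN.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"12h64m", "12d43m dsd", "", "12m", "12d32h43m", "sdsds"};
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" -> " + isValid(sample));
        }
    }
}
```

The lookahead only demands that the input starts with digits followed by d, h or m, so the empty string and `sdsds` are rejected while every combination of time groups still goes through. Checking the named groups for null after the match, as suggested in the earlier comment, would be an equally reasonable route.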

2 Answers

tl;dr

Duration.parse( 
    "P"
    .concat( "12d32h43m".replace( "d" , "DT" ) )
    .toUpperCase()
).toHoursPart()

8

You said:

As i know regex is the right tool for such things

Nope. No need for regex.

ISO 8601

Your input string format is close to the format formally defined by the ISO 8601 standard: PnYnMnDTnHnMnS

The P marks the beginning. The T separates any years-months-days from any hours-minutes-seconds.

Convert your input to conform with the standard.

String input =  "P".concat( "12d32h43m".replace( "d" , "DT" ) ).toUpperCase() ;

P12DT32H43M

java.time

Java has a class for that, Duration (and Period). No need for regex.

You can interrogate for each part. Call to…Part methods added in Java 9. For Java 8, see this Question and this Question.

long daysPart = d.toDaysPart() ;
int hoursPart = d.toHoursPart() ;
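
On Java 8, where the to…Part methods are not available, the same parts can be derived from the totals with a bit of modulo arithmetic. A minimal sketch, assuming a non-negative duration; the class and method names here are hypothetical helpers, not part of java.time:

```java
import java.time.Duration;

final class DurationPartsJava8 {
    // Whole 24-hour chunks in the duration.
    static long daysPart(Duration d) {
        return d.toDays();
    }

    // Hours left over after removing whole days.
    static int hoursPart(Duration d) {
        return (int) (d.toHours() % 24);
    }

    // Minutes left over after removing whole hours.
    static int minutesPart(Duration d) {
        return (int) (d.toMinutes() % 60);
    }

    public static void main(String[] args) {
        Duration d = Duration.parse("P12DT32H43M");
        System.out.println(daysPart(d) + " days, " + hoursPart(d) + " hours, "
                + minutesPart(d) + " minutes");
    }
}
```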

Entire example:

String input = "P".concat( "12d32h43m".replace( "d" , "DT" ) ).toUpperCase();
Duration d = Duration.parse( input );
long daysPart = d.toDaysPart();
int hoursPart = d.toHoursPart();
long hoursTotal = d.toHours(); // Total elapsed hours of entire duration.

Dump to console. Notice the math. Your input of 32 hours is recalculated to be 8, and days went from 12 to 13 (an extra 24-hour chunk = a day).

System.out.println( "input: " + input );
System.out.println( "d.toString(): " + d );
System.out.println( "daysPart: " + daysPart );  // 13, not the 12 days seen in the input string. 24 hours were taken from the excessive `32h` of the input string, leaving 8 in the hours part.
System.out.println( "hoursPart: " + hoursPart );
System.out.println( "hoursTotal: " + hoursTotal );  // ( ( 13 * 24 ) + 8 ) = ( 312 + 8 ) = 320 

input: P12DT32H43M

d.toString(): PT320H43M

daysPart: 13

hoursPart: 8

hoursTotal: 320

Duration versus Period

Use Duration for hours-minutes-seconds values. Use Period for years-months-days values.

  • In a Duration, “days” are understood to be generic chunks of 24-hours unrelated to dates and the calendar.
  • If you want dates, you should be using the Period class.
  • If you want both, think again. It usually does not make sense to mix the two concepts, though this may seem counter-intuitive at first thought. But if you insist, see the PeriodDuration class available in the ThreeTen-Extra project.
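
To make the difference concrete, here is a small sketch that adds "one day" to the same moment both ways. The date and zone are example values I picked because New York switched to daylight saving time on 2018-03-11; the class name is hypothetical.

```java
import java.time.Duration;
import java.time.Period;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class DurationVersusPeriod {
    // Noon on the day before the 2018 spring-forward change in New York
    // (example values, chosen only to show the effect).
    static final ZonedDateTime START =
            ZonedDateTime.of(2018, 3, 10, 12, 0, 0, 0, ZoneId.of("America/New_York"));

    // Calendar day: same wall-clock time on the next date.
    static ZonedDateTime plusCalendarDay() {
        return START.plus(Period.ofDays(1));
    }

    // Generic day: exactly 24 hours later on the timeline.
    static ZonedDateTime plusGenericDay() {
        return START.plus(Duration.ofDays(1));
    }

    public static void main(String[] args) {
        System.out.println("start:    " + START);
        System.out.println("Period:   " + plusCalendarDay());
        System.out.println("Duration: " + plusGenericDay());
    }
}
```

The Period result stays at 12:00 on the next date, while the Duration result lands at 13:00, because that calendar day was only 23 hours long.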

About java.time

The java.time framework is built into Java 8 and later. These classes supplant the troublesome old legacy date-time classes such as java.util.Date, Calendar, & SimpleDateFormat.

The Joda-Time project, now in maintenance mode, advises migration to the java.time classes.

To learn more, see the Oracle Tutorial. And search Stack Overflow for many examples and explanations. Specification is JSR 310.

Using a JDBC driver compliant with JDBC 4.2 or later, you may exchange java.time objects directly with your database. No need for strings nor java.sql.* classes.

Where to obtain the java.time classes?

The ThreeTen-Extra project extends java.time with additional classes. This project is a proving ground for possible future additions to java.time. You may find some useful classes here such as Interval, YearWeek, YearQuarter, and more.

Basil Bourque
  • To parse I believe you need to put `P` in front and `T` between the days and the hours (if either hours or minutes are present). Then one nice trait is that it does require at least one part present, whether it be the days, the hours or the minutes. I am doing something similar in the edit in [this answer](https://stackoverflow.com/a/48772289/5772882) and [the accompanying ideone demo](https://ideone.com/fCnuim). – Ole V.V. Feb 24 '18 at 18:13
  • @OleV.V. You are correct, my Answer was faulty. Fixed now. Thank you. – Basil Bourque Feb 24 '18 at 20:55
  • Your code converts to uppercase, which will work, but doesn't seem necessary as the [`Duration.parse()` API](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/time/Duration.html#parse(java.lang.CharSequence)) (at least as of Java 17) appears to allow either case. – Garret Wilson Sep 04 '22 at 14:50
  • Note also that `.concat( "12d32h43m".replace( "d" , "DT" ) )`, while it works for the input provided, will fail for day-only designations such as `P2D`, an example given in the [API docs](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/time/Duration.html#parse(java.lang.CharSequence)), because it will yield `P2DT`. – Garret Wilson Sep 04 '22 at 16:44
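
Following up on the last two comments, one way to sidestep the day-only problem is to insert the `T` only when an hours or minutes part actually follows the days. This is a rough sketch, not the answer's code: `toIso8601` is a hypothetical helper, and it assumes the input is just the time token (no trailing reason string).

```java
import java.time.Duration;
import java.util.Locale;

final class RobustIsoConversion {
    // Converts "12d32h43m", "2d" or "43m" to ISO 8601 duration text.
    // Assumption: input holds only the time token, in d/h/m order.
    static String toIso8601(String input) {
        String s = input.trim().toUpperCase(Locale.ROOT);
        int dayIndex = s.indexOf('D');
        String datePart = dayIndex >= 0 ? s.substring(0, dayIndex + 1) : "";
        String timePart = dayIndex >= 0 ? s.substring(dayIndex + 1) : s;
        // Add the "T" separator only if there is a time part to separate.
        return "P" + datePart + (timePart.isEmpty() ? "" : "T" + timePart);
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"12d32h43m", "2d", "43m"}) {
            String iso = toIso8601(sample);
            System.out.println(sample + " -> " + iso + " -> " + Duration.parse(iso));
        }
    }
}
```

An empty input becomes a bare `P`, which `Duration.parse` rejects with a `DateTimeParseException`, so the "at least one part" rule from the question falls out of the parser for free.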

((?<minutes>\\d+)m)? means the minutes group is optional. But you want it to be mandatory, so remove the trailing question mark.
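
A quick sketch of what that change does in practice. The class name is hypothetical, and grouping the whitespace and reason into one optional tail is an assumption of mine rather than part of the question's expression. Note that with this change minutes are required in every case, so an input like `12d` on its own no longer matches.

```java
import java.util.regex.Pattern;

final class MandatoryMinutes {
    // Same expression as in the question, minus the "?" after the minutes group.
    // Assumption: whitespace + reason form a single optional tail.
    static final Pattern PATTERN = Pattern.compile(
            "((?<days>\\d+)d)?((?<hours>\\d+)h)?((?<minutes>\\d+)m)"
            + "(\\s+(?<reason>.+))?");

    static boolean isValid(String input) {
        return PATTERN.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"12m", "12d43m dsd", "12d", "", "sdsds"}) {
            System.out.println("\"" + sample + "\" -> " + isValid(sample));
        }
    }
}
```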

user unknown