I wrote a regular expression to match durations.
The regular expression is:
([0-9]+ (?:[y|Y]ears?|[y|Y]rs?|[m|M]o?nths?|[d|D]a?ys?) ?)+
You can check this on this regex tool.
Test Cases that matched
- This October I will complete 24 years. Right now I am 3 months short means 23 years 9 mnths 19 days.
- ATL is servering Research work from last 10 years 23 months 19 dys.
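For reference, here is a minimal sketch of how the expression above can be run against these test cases in Java. The class name DurationRegexDemo and the helper findDurations are hypothetical, introduced only for illustration. The expression itself is copied unchanged from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the regular expression comes from the question.
class DurationRegexDemo {
    static final Pattern DURATION = Pattern.compile(
        "([0-9]+ (?:[y|Y]ears?|[y|Y]rs?|[m|M]o?nths?|[d|D]a?ys?) ?)+");

    // Returns every full match in the text, with the optional trailing space trimmed.
    static List<String> findDurations(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = DURATION.matcher(text);
        while (m.find()) {
            found.add(m.group().trim());
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints [24 years, 3 months, 23 years 9 mnths 19 days]
        System.out.println(findDurations(
            "This October I will complete 24 years. Right now I am 3 months short means 23 years 9 mnths 19 days."));
        // Prints [10 years 23 months 19 dys]
        System.out.println(findDurations(
            "ATL is servering Research work from last 10 years 23 months 19 dys."));
        // Prints [] because the number is spelled out in words
        System.out.println(findDurations("I am twenty three years old."));
    }
}
```

One side note on the expression: inside a character class the "|" is a literal character, so "[y|Y]" also accepts a "|". Writing "[yY]" would express the same intent without that side effect.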
Test cases that should match, but do not
- I am twenty three years old.
- There was a disaster came exactly twenty two years twelve months thirty days back.
Doubts
- How can I detect numbers written as English words? See the 3rd and 4th test cases.
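One possible direction, sketched under assumptions: allow a spelled-out number wherever the digits are allowed. The sketch below only covers one to ninety-nine. The names WordDurationSketch, WORD_NUMBER, UNIT_NAME and findDurations are hypothetical, and the CASE_INSENSITIVE flag stands in for the "[y|Y]"-style classes. It is an illustration of the idea, not a complete number parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: durations whose numbers are digits or words (one..ninety-nine).
class WordDurationSketch {
    static final String UNITS = "(?:one|two|three|four|five|six|seven|eight|nine)";
    static final String TEENS =
        "(?:ten|eleven|twelve|thirteen|fourteen|fifteen|sixteen|seventeen|eighteen|nineteen)";
    static final String TENS = "(?:twenty|thirty|forty|fifty|sixty|seventy|eighty|ninety)";

    // "twenty three", "twenty-three", "thirty", "twelve", "nine"
    static final String WORD_NUMBER =
        "(?:" + TENS + "(?:[- ]" + UNITS + ")?|" + TEENS + "|" + UNITS + ")";

    // Same unit spellings as the original expression.
    static final String UNIT_NAME = "(?:years?|yrs?|mo?nths?|da?ys?)";

    static final Pattern DURATION = Pattern.compile(
        "\\b(?:(?:[0-9]+|" + WORD_NUMBER + ") " + UNIT_NAME + "\\b ?)+",
        Pattern.CASE_INSENSITIVE);

    // Returns every full match in the text, with the optional trailing space trimmed.
    static List<String> findDurations(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = DURATION.matcher(text);
        while (m.find()) {
            found.add(m.group().trim());
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints [twenty three years]
        System.out.println(findDurations("I am twenty three years old."));
        // Prints [twenty two years twelve months thirty days]
        System.out.println(findDurations(
            "There was a disaster came exactly twenty two years twelve months thirty days back."));
    }
}
```

Larger numbers (hundreds, thousands) would need further alternatives in WORD_NUMBER, which this sketch deliberately leaves out.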
EDITED 1
I added the reFourDigits variable to handle cases like "Twelve hundred twenty", but it fails to match them. Please help me with that. Below are all the details of the problem.
public static final String reDigit = "(?:[O|o]ne|[t|T]wo|[t|T]hree|[f|F]our|[f|F]ive|[s|S]ix|[s|S]even|[e|E]ight|[n|N]ine)";
public static final String reTeen = "(?:[t|T]wenty|[t|T]hirty|[f|F]orty|[f|F]ifty|[s|S]ixty|[s|S]eventy|[e|E]ighty|[n|N]inety)";
public static final String re10_19 = "(?:[t|T]en|[e|E]leven|[t|T]welve|[t|T]hirteen|[f|F]ourteen|[f|F]ifteen|[s|S]ixteen|[s|S]eventeen|[e|E]ighteen|[n|N]ineteen)";
public static final String reTwoDigits = "(?:(?:" + reTeen + "[- ])?" + reDigit + "|" + re10_19 + "|" + reTeen + ")";
public static final String reThreeDigits = "(?:(?:" + reDigit + " hundred (?:and)?)?" + reTwoDigits + "|" + reDigit + " hundred)";
public static final String reFourDigits = "(?:" + reTwoDigits + " hundred (?:and)? " + reTwoDigits + ")";
public static final String reSixDigits = "(?:(?:" + reThreeDigits + " thousand (?:and )?)?" + reThreeDigits + "|" + reThreeDigits + " thousand|" + reFourDigits + ")";
public static final String reTwelveDigits = "(?:(?:" + reSixDigits + " million (?:and )?)?" + reSixDigits + "|" + reSixDigits + " million)";
The pattern is:
String patternString = "\\b( ?(?:[,0-9]+|"+Constants.reTwelveDigits+") ?)\\b";
When I run it on "There are twenty hundred twenty two apples", it finds two strings, "twenty" and "twenty two", instead of "twenty hundred twenty two".
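A sketch that reproduces the behaviour and tries one possible adjustment. Two things in the constants above appear to contribute. First, " hundred (?:and)? " needs two spaces after "hundred" when "and" is absent. Second, reFourDigits is the last alternative in reSixDigits, so the shorter reThreeDigits branch succeeds first and the engine never tries the longer one. The class NumberWordsSketch, the method names, and the "revised" switch are hypothetical. The word lists are the ones from the question, with CASE_INSENSITIVE standing in for the "[t|T]"-style classes to keep the lines short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch comparing the question's pattern with an adjusted variant.
class NumberWordsSketch {
    static final String DIGIT = "(?:one|two|three|four|five|six|seven|eight|nine)";
    static final String TENS = "(?:twenty|thirty|forty|fifty|sixty|seventy|eighty|ninety)";
    static final String TEENS =
        "(?:ten|eleven|twelve|thirteen|fourteen|fifteen|sixteen|seventeen|eighteen|nineteen)";
    static final String TWO =
        "(?:(?:" + TENS + "[- ])?" + DIGIT + "|" + TEENS + "|" + TENS + ")";
    // Kept as in the question (reThreeDigits).
    static final String THREE =
        "(?:(?:" + DIGIT + " hundred (?:and)?)?" + TWO + "|" + DIGIT + " hundred)";

    // revised == false rebuilds the question's pattern; true applies the two adjustments.
    static Pattern build(boolean revised) {
        String four = revised
            ? "(?:" + TWO + " hundred (?:and )?" + TWO + ")"   // single space after "hundred"
            : "(?:" + TWO + " hundred (?:and)? " + TWO + ")";  // as in the question
        String six = revised
            // the four-digit alternative is tried first
            ? "(?:" + four + "|(?:" + THREE + " thousand (?:and )?)?" + THREE
                + "|" + THREE + " thousand)"
            // as in the question: the four-digit alternative is tried last
            : "(?:(?:" + THREE + " thousand (?:and )?)?" + THREE
                + "|" + THREE + " thousand|" + four + ")";
        String twelve =
            "(?:(?:" + six + " million (?:and )?)?" + six + "|" + six + " million)";
        return Pattern.compile(
            "\\b( ?(?:[,0-9]+|" + twelve + ") ?)\\b", Pattern.CASE_INSENSITIVE);
    }

    // Collects group 1 of every match, trimmed of the optional surrounding spaces.
    static List<String> findAll(Pattern p, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            out.add(m.group(1).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "There are twenty hundred twenty two apples";
        System.out.println(findAll(build(false), text)); // [twenty, twenty two]
        System.out.println(findAll(build(true), text));  // [twenty hundred twenty two]
    }
}
```

Java's regex engine tries alternatives from left to right and keeps the first one that lets the whole pattern match, so placing the longer alternative first is what lets it win here. This is only one way to arrange the alternatives, and other inputs may need further adjustments.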