
Sorry if this is a noob question but I'm not very comfortable with regex and (as of now) this is a little beyond my understanding.

My dilemma is that we have a variety of ID badges that get scanned into an Android application, and I'm trying to parse out some dates.

For example, some dates are represented like so:

"ISS20141231"   format = yyyyMMdd   desired output = "20141231"
"ISS12312014"   format = MMddyyyy   desired output = "12312014"
"ISS12-31-2014" format = MM-dd-yyyy desired output = "12312014"

Currently I have a regex pattern:

Pattern p = Pattern.compile("ISS(\\d{8})");
Matcher m = p.matcher(scanData);

This worked fine for the first two examples, but I recently realized that we also occasionally have dates that use dashes (or slashes) as separators.

Is there an efficient means for extracting these dates without having to write multiple patterns and loop through each one checking for a match?

Possibly something similar to: "ISS([\d{8} (\d{2}\w\d{2}\w\d{4}) (\d{4}\w\d{2}\w\d{2})])"

Thanks!!

[EDIT] Just to make things a little clearer: the substring ("ISSMMddyyyy") is part of a much larger string and could be located anywhere within it, so the regex must search the original (200+ byte) string for a match.

Logic1
  • According to [this question](http://stackoverflow.com/q/277547/5743988), one does not simply remove hyphens from a string with regex. It would have to be done as @anubhava has done, with replace statements. – 4castle May 27 '16 at 22:12

3 Answers

You can do two replacements: strip the ISS prefix first, then remove any / or - separators:

str = str.replaceFirst("^ISS", "").replaceAll("[/-]", "");
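
As a minimal sketch, the two-step replace could be wrapped like this. It assumes the scanned value starts with the ISS prefix: because of the `^` anchor, it will not locate the date inside a longer string. The class and method names are hypothetical, chosen only for illustration.

```java
public class IssPrefixStrip {
    // Hypothetical helper: assumes the input begins with "ISS".
    static String stripIssDate(String str) {
        // Drop the leading "ISS", then remove every '/' or '-' separator.
        return str.replaceFirst("^ISS", "").replaceAll("[/-]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripIssDate("ISS12-31-2014")); // 12312014
        System.out.println(stripIssDate("ISS20141231"));   // 20141231
    }
}
```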
anubhava

If that date string is actually a substring of a larger string, and so you need the regex in order to also search for that pattern, you could modify your regex to be:

ISS([\\d\\-/]{8,10})

And then when retrieving the capture group, strip the hyphens and slashes.

String dateStr = m.group(1).replaceAll("[/\\-]", "");
4castle
  • Brilliant solution! Thanks! – Logic1 May 27 '16 at 22:52
  • Just a slight correction for other users: the part of the pattern \\d-/ was being treated as a range (from \d to /), which wasn't right. It needed to be ISS([\\d\\-/]{8,10}); notice the backslash before the dash (\-). Otherwise this was perfect. – Logic1 May 27 '16 at 23:21
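
Putting the corrected pattern and the separator stripping together, a rough sketch might look like the following. It uses `Matcher.find()` so the ISS token can sit anywhere in the scanned string. The class name, method name, and the choice to return `null` when nothing matches are assumptions made for illustration, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BadgeDateExtractor {
    // "ISS" followed by 8 to 10 characters that are digits, '-' or '/'.
    private static final Pattern ISS_DATE =
            Pattern.compile("ISS([\\d\\-/]{8,10})");

    // Hypothetical helper: returns the digits of the date, or null if no match.
    static String extractDate(String scanData) {
        Matcher m = ISS_DATE.matcher(scanData);
        if (!m.find()) {
            return null;
        }
        // Strip the separators from the captured group.
        return m.group(1).replaceAll("[/\\-]", "");
    }

    public static void main(String[] args) {
        System.out.println(extractDate("AB01ISS12-31-2014XY")); // 12312014
        System.out.println(extractDate("AB01ISS20141231XY"));   // 20141231
    }
}
```

Note that the character class is deliberately loose: it captures up to 10 digit or separator characters, so if an unseparated date is immediately followed by more digits, those would be captured as well.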

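Or, to use only a regex: Search: ISS([0-9]+)([-./])([0-9]+)([-./])([0-9]+) Replace: $1$3$5 (each digit group needs a + quantifier to match more than one digit, and Java's replaceAll refers to numbered groups as $1 rather than ${1}).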
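
A hedged sketch of this regex-only approach in Java follows. It assumes `+` quantifiers on the digit groups, so that each group matches a run of digits, and Java's `$1` replacement syntax. The class and method names are hypothetical.

```java
public class IssRegexReplace {
    // Hypothetical helper: rewrites a separated ISS date in place.
    static String normalize(String s) {
        // Groups 1, 3 and 5 are the digit runs; groups 2 and 4 are the separators.
        return s.replaceAll(
                "ISS([0-9]+)([-./])([0-9]+)([-./])([0-9]+)",
                "$1$3$5");
    }

    public static void main(String[] args) {
        System.out.println(normalize("ISS12-31-2014")); // 12312014
        System.out.println(normalize("ISS2014/12/31")); // 20141231
    }
}
```

Because both separators are mandatory in this pattern, an unseparated value such as ISS20141231 is left unchanged, and any text surrounding the match stays in the result.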

GordR