(DISCLAIMER: This is a homework exercise)

I have a file with several records, each containing a credit card number. I have to retrieve the records that belong to an AMEX card, which begin with 34, 35 or 36 and have 15 digits overall, like this:

XXXX-XXXX-XXXX-XXX

Separated by hyphens every 4 digits.

I tried using grep over that file to detect those numbers, but to no avail. This is the original command I used:

grep -E "[34-36][0-9]{2}[\-]([0-9]{4}[\-]){2}[0-9]{3}" MYFILE.txt

I assumed it would work, because I think the regular expression means:

"A number between 34 and 36, followed by 2 numbers from 0 to 9, followed by a hyphen, followed by 4 numbers from 0 to 9 followed by a hyphen (this twice) followed by 3 numbers from 0 to 9".

But this command retrieved nothing. I do have this record:

3435-9999-8765-333

in the file, so I should at least retrieve that.

What's funny is that if I change the regular expression to just [34-36][0-9]{2}, I get a match (obviously), and if I add the hyphen ([34-36][0-9]{2}[\-]) I still get a match, but if I add anything after it (like [34-36][0-9]{2}[\-][0-9], [34-36][0-9]{2}[\-]a while changing the record to have an "a" after the first two digits) I stop retrieving anything.

What am I doing wrong? Is there a problem with the hyphen in the regular expression? Which would be the right one?

Thanks.

samgak
Heathcliff
  • I don't think `[34-36]` is correct; I think you'll need `(34|35|36)`. I don't use character classes too often, but I think a range goes from one single character to another single character, e.g. `a-z` or `0-9`; otherwise the class is a list of characters. Here's a thread on it: http://stackoverflow.com/questions/3148240/regex-why-doesnt-01-12-range-work-as-expected. – chris85 Apr 27 '15 at 03:09
  • Or `3[4-6][0-9]....` Recall that any list or range of characters (numbers) inside `[ ...]` takes one position of input. Good luck. – shellter Apr 27 '15 at 03:15
  • `[34-36]` means any of (3, any character between 4 and 3, and 6). Regex flavors will usually trip on the out-of-order range `4-3`, though. While simple here because you only need three numbers, matching a range of numbers in regex can get to be quite the mess. – Regular Jo Apr 27 '15 at 04:01
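
The point the comments make, that a bracket expression stands for exactly one character of input, can be checked with a small sketch like the one below. The two-digit test values are made up for illustration and are not from the question.

```shell
# A bracket expression consumes exactly one character of input,
# so "3[4-6]" reads: a 3, then one of 4, 5 or 6.
printf '%s\n' 33 34 35 36 37 43 | grep -E '^3[4-6]$'
# prints 34, 35 and 36, one per line

# The alternation form from the first comment selects the same lines.
printf '%s\n' 33 34 35 36 37 43 | grep -E '^(34|35|36)$'
# prints 34, 35 and 36, one per line
```

Either form should work for this exercise; `3[4-6]` is shorter, while `(34|35|36)` spells out the allowed prefixes.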

2 Answers

You should anchor the expression at the start of the line with the circumflex ^ (if your lines start with the card number; otherwise delete the ^ from the expression). There is no need to escape the hyphen within the character class (e.g. [\-] should be [-]). With that, try:

grep -E "^3[4-6][0-9]{2}[-]([0-9]{4}[-]){2}[0-9]{3}" MYFILE.txt
David C. Rankin
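
A minimal way to try an anchored pattern of this shape without touching MYFILE.txt is sketched below. It assumes `mktemp` is available; the temporary file and its second and third records are hypothetical, and only the first record comes from the question. The hyphen is written bare here, which matches the same text as `[-]`.

```shell
# Hypothetical sample data; only the first record is from the question.
sample=$(mktemp)
printf '%s\n' \
  '3435-9999-8765-333' \
  '3735-9999-8765-333' \
  '3635-9999-8765-33' > "$sample"

# A 3, then one of 4-6, two more digits, a hyphen, two groups of
# "four digits plus hyphen", and three final digits.
grep -E '^3[4-6][0-9]{2}-([0-9]{4}-){2}[0-9]{3}' "$sample"
# prints only 3435-9999-8765-333

rm -f "$sample"
```

The second record is rejected because it starts with 37, and the third because its last group has only two digits.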

Try the following regex:

[3][4-6][0-9]{2}-([0-9]{4}-){2}[0-9]{3}
Jahid
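
One thing to keep in mind with an unanchored pattern like this is that it also matches when extra characters surround the number. As a sketch, assuming each line holds nothing but the card number (the question does not say so), the standard `-x` option of grep requires the whole line to match. The second input line is a made-up record with one digit too many.

```shell
# Hypothetical input: the second line has one digit too many.
printf '%s\n' '3435-9999-8765-333' '3435-9999-8765-3334' |
  grep -E -x '3[4-6][0-9]{2}-([0-9]{4}-){2}[0-9]{3}'
# prints only 3435-9999-8765-333
```

Without `-x`, both lines would be printed, because the pattern is satisfied by the first 18 characters of the longer line.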
Dilshan