I have a folder that I scan in a PHP page, putting all of the file names (and directory names) into an array called $struct, which I then have to analyze to check two things:

1) whether the filename matches a pattern of 5 to 8 lowercase letters followed by an underscore and one or two digits, and ends with the .jpg extension, for which I tried using

$pattern = '~[a-z]{5,8}_\d{1,2}\.jpg~';

2) get the number from the file name and pass it as an int, for which I tried using

$getnumber = '~(?<=_)\d+(?=\.)~';

The first pattern seems to work, even when called upon with

if ((preg_match($pattern, $entry) === 0) && (!in_array($entry, $wrongformat)))

but trying to get the numbers of the files with

$temp[] = preg_grep($getnumber, $struct);
print_r($temp);

returns 1 in every element. How can I get the actual number inside the file name?

This is only the tip of the iceberg: it needs to be coordinated with other systems, so something simple like explode is not usable in the current setup, for various reasons (chief of which is that Boss said so). My current regex experience is 4 hours of Google-fu, so ANY help would be appreciated.

mihneaadr
  • Could you give examples of file names with numbers, and perhaps even tell us what numbers you want to extract? Given the first pattern it seems to be something like `ABCDEF12.JPG`? – KIKO Software Jun 14 '19 at 19:04
  • FWIW: [How to get a substring between two strings in PHP?](https://stackoverflow.com/questions/5696412/how-to-get-a-substring-between-two-strings-in-php) Could this be a dupe? – ficuscr Jun 14 '19 at 19:06
  • I suspect you need https://3v4l.org/NFs8K. Please let me know if it is what you need. – Wiktor Stribiżew Jun 14 '19 at 20:08

2 Answers

1

You could do it with a single expression by using a capturing group, allowing you to extract that subpattern. If you have more than one capture group, it will index them in the order that they are defined. Function reference.

You'll also need to anchor your pattern. Without anchors, a filename with 11 lowercase letters would still match, because part of that string contains between 5 and 8 letters.

$pattern = '/^[a-z]{5,8}_(\d{1,2})\.jpg$/';

if (preg_match($pattern, $entry, $matches)) {
    $number = $matches[1];
}

Demo
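
To tie this back to the question, here is a minimal sketch of applying the anchored pattern to every entry of $struct and storing the number as an int. The sample contents of $struct are made up for illustration, and the choice to key the result by file name is an assumption, not something the question specifies.

```php
<?php
// Hypothetical sample of what scanning the folder might produce.
$struct = ['.', '..', 'thumbs', 'abcde_7.jpg', 'abcdef_12.jpg',
           'abcdefghijk_3.jpg', 'abcde_123.jpg'];

// Anchored pattern from the answer: group 1 captures the one or two digits.
$pattern = '/^[a-z]{5,8}_(\d{1,2})\.jpg$/';

$numbers = [];      // file name => number as int
$wrongformat = [];  // entries that do not match the pattern

foreach ($struct as $entry) {
    if (preg_match($pattern, $entry, $matches) === 1) {
        // $matches[1] is a string such as "12"; cast it to get an int.
        $numbers[$entry] = (int) $matches[1];
    } else {
        // Directories, '.' and '..' also land here in this sketch.
        $wrongformat[] = $entry;
    }
}

print_r($numbers);
```

With this sample only abcde_7.jpg and abcdef_12.jpg end up in $numbers. The 11-letter name is rejected because of the anchors, whereas the unanchored pattern from the question would have accepted it.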

msg
  • Worked wonderfully. Furthermore, I've modified it to be ``` '/^([a-z]{5,8})_(\d{1,2})\.([a-z]{3})$/' ``` so I could capture the ID, number and extension in one go. So far it has worked properly. Is it safe to assume that $matches[1] = ID, [2] = number and [3] = extension in this case? – mihneaadr Jun 19 '19 at 13:35
  • @mihneaadr Yes, otherwise the expression wouldn't match. `$matches[0]` will also contain the full pattern match. In your case it doesn't matter, because the expression is anchored and `$matches[0]` will have the same value as `$entry`, but if you removed the starting anchor it would also match a longer filename, as I said earlier. – msg Jun 19 '19 at 14:12
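
The group numbering discussed in these comments can be sketched as follows. The file name is a hypothetical example, and unpacking $matches with array destructuring is just one way to name the parts (it assumes PHP 7.1 or later).

```php
<?php
// Three-group variant of the pattern, as posted in the comment above.
$pattern = '/^([a-z]{5,8})_(\d{1,2})\.([a-z]{3})$/';
$entry = 'abcdef_12.jpg'; // hypothetical file name

if (preg_match($pattern, $entry, $matches) === 1) {
    // Index 0 is the full match; groups follow in the order
    // in which their opening parentheses appear in the pattern.
    [$full, $id, $number, $extension] = $matches;
    $number = (int) $number;

    var_dump($full === $entry);              // bool(true), since the pattern is anchored
    echo "$id / $number / $extension\n";     // abcdef / 12 / jpg
}
```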
0

Here, we just use a simple capturing group in your original expression,

([a-z]{5,8}_\d{1,2})\.jpg

and our desired output is in $1.

Demo

Test

$re = '/([a-z]{5,8}_\d{1,2})\.jpg/m';
$str = 'aaaaa_11.jpg
aaaaa_11.jpeg';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

var_dump($matches[0][1]);

Output

string(8) "aaaaa_11"
Emma