multi-byte function to replace preg_match_all?

Question

I'm looking for a multi-byte function to replace preg_match_all(). I need one that will give me an array of matched strings, like the $matches argument from preg_match(). The function mb_ereg_match() doesn't seem to do it -- it only gives me a boolean indicating if there were any matches.

Looking at the mb_* functions page, I don't offhand see anything that replaces the functionality of preg_match(). What do I use?

Edit: I'm an idiot. I originally posted this question asking for a replacement for preg_match, which of course is ereg_match. However, both of those only return the first result. What I wanted was a replacement for preg_match_all, which returns all match texts. But anyways, the u modifier works in my case for preg_match_all, as hakre pointed out.

http://stackoverflow.com/questions/1766485/are-the-php-preg-functions-multibyte-safe — Griwes, Oct 06 '11 at 14:21
I note you say that `ereg_match()` is a replacement for `preg_match()`. Be aware that PHP's `ereg_` functions are deprecated, and should be avoided. — Spudley, Oct 06 '11 at 16:15

hakre · Accepted Answer · 2021-12-25T13:19:56.770

17

Have you taken a look into mb_ereg?

Additionally, you can pass a UTF-8 encoded string into preg_match using the u modifier, which might be the kind of multi-byte support you need. The other option, if your string is in another encoding, is to convert it to UTF-8, run the match, and then convert the results back to the original encoding.

See as well an answer to a related question: Are the PHP preg_functions multibyte safe?
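As a rough sketch of the two options above: the first part calls `preg_match_all` with the `u` modifier on a UTF-8 string, and the second wraps the convert-match-convert-back idea in a helper. The helper name `match_all_in_encoding` and the sample strings are made up for illustration, and the sketch assumes the mbstring extension is available for `mb_convert_encoding`.

```php
<?php
// Option 1: the subject is already UTF-8, so the u modifier is enough.
$subject = 'naïve café, déjà vu';
preg_match_all('/\p{L}+/u', $subject, $matches);
// $matches[0] holds every full match: ['naïve', 'café', 'déjà', 'vu']

// Option 2: the subject is in some other encoding.
// Hypothetical helper (not a PHP built-in): convert to UTF-8, match with
// the u modifier, then convert each match back to the original encoding.
function match_all_in_encoding(string $pattern, string $subject, string $encoding): array
{
    $utf8 = mb_convert_encoding($subject, 'UTF-8', $encoding);

    // preg_match_all returns false on error, otherwise the match count.
    if (preg_match_all($pattern, $utf8, $found) === false) {
        return [];
    }

    return array_map(
        function ($match) use ($encoding) {
            return mb_convert_encoding($match, $encoding, 'UTF-8');
        },
        $found[0]
    );
}
```

The pattern passed to the helper is itself written in UTF-8 and should carry the `u` modifier, for example `match_all_in_encoding('/\p{L}+/u', $latin1String, 'ISO-8859-1')`.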

edited Dec 25 '21 at 13:19

answered Oct 06 '11 at 14:33


Can you point me to some documentation on the `u` modifier? That's part of the regex? – user151841 Oct 06 '11 at 14:58
Actually it looks like the 4th answer down on that related question has some info about the `u` modifier. – user151841 Oct 06 '11 at 14:59
So I tried it out, and it only seems to return the first match :P Unless I'm doing it wrong. – user151841 Oct 06 '11 at 15:08
You should add your code to your question, so it's actually clear what you tried so far. Take care that the input string is UTF-8 encoded if you're using `preg_match` with the `u` modifier. Then I might be able to spot your error. – hakre Oct 06 '11 at 15:10
Sorry, what I meant was that `mb_ereg` returns only the first match string (apparently). – user151841 Oct 06 '11 at 15:14
I'm an idiot. I'm looking for a replacement for `preg_match_all`! :P – user151841 Oct 06 '11 at 15:30
LOL ;), okay. What is the encoding/charset of your string? I ask, because if you have this in UTF-8, you don't need any replacement. If not, you need to create a replacement function on your own that consists of `mb_ereg...` functions, doing one match after the other. – hakre Oct 06 '11 at 15:33
The `u` modifier is the correct answer. Avoid the `ereg_` (and `mb_ereg_`) functions because they have been deprecated. – Spudley Oct 06 '11 at 16:16
@hakre I'm not looking to do replacement, but to pull multiple matches out of a large string. – user151841 Oct 06 '11 at 16:29
Find the next match after the offset of the last match + the length of the last match (both 0 at start). Loop until nothing is found any longer. Store matches inside an array. – hakre Oct 06 '11 at 16:34
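The loop described in the last comment could look roughly like the sketch below. It uses the `mb_ereg_search_init()` / `mb_ereg_search_regs()` pair, which keeps track of the search position internally, so the offset bookkeeping does not have to be done by hand. The function name `mb_ereg_match_all` is hypothetical (it is not part of mbstring), and the sketch assumes the mbstring extension is installed and that the pattern and subject are both in the given encoding.

```php
<?php
// Hypothetical helper, not part of the mbstring API: collects every
// full match of $pattern in $subject, one match after the other.
function mb_ereg_match_all(string $pattern, string $subject, string $encoding = 'UTF-8'): array
{
    // Remember the current regex encoding so it can be restored afterwards.
    $previous = mb_regex_encoding();
    mb_regex_encoding($encoding);

    $matches = [];

    // Set up the subject and pattern for a sequence of searches.
    if (mb_ereg_search_init($subject, $pattern)) {
        // Each call continues after the end of the previous match
        // and returns false once nothing more is found.
        while (($regs = mb_ereg_search_regs()) !== false) {
            if ($regs[0] === '') {
                // An empty match might not advance the search position,
                // so this sketch stops rather than risk looping forever.
                break;
            }
            $matches[] = $regs[0];
        }
    }

    mb_regex_encoding($previous);

    return $matches;
}

$words = mb_ereg_match_all('[a-zéè]+', 'café thé crème');
```

Only the full match (`$regs[0]`) is stored here; the remaining entries of `$regs` hold the capture groups, if the pattern has any.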

MarcoP · Answer 2 · 2020-10-21T08:14:43.343

3

PHP: preg_grep manual

$matches = preg_grep('/(needles|to|find)/u', $inputArray);

Returns an array indexed using the keys from the input array.

Note the /u modifier, which makes the pattern and the subject strings be treated as UTF-8 and so enables multibyte support.
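A small illustration of what that looks like in practice, with a made-up input array. Note that `preg_grep` returns the whole array elements that match, not the matched substrings, so it filters a list rather than extracting match texts the way `preg_match_all` does.

```php
<?php
// Sample data, invented for this example.
$inputArray = ['föo', 'bär', 'bäz'];

// With /u the dot matches one whole character, so 'ä' counts as one.
$withU = preg_grep('/b.r/u', $inputArray);
// $withU is [1 => 'bär'], and the original key 1 is preserved.

// Without /u the dot matches a single byte; 'ä' is two bytes in UTF-8,
// so 'b.r' no longer fits 'bär'.
$withoutU = preg_grep('/b.r/', $inputArray);
// $withoutU is []
```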

Hope it helps others.

edited Oct 21 '20 at 08:14

answered Feb 14 '14 at 15:36

