I'm currently working on a PHP project with a search engine that requires me to break a search query down into an array of substrings.
I want to support double-quoting so that the user can search for exact phrases: "super secret stuff" should only return results with that exact string in their name, while the same string without quotes returns results for "super" or "secret" or "stuff".
It should be possible to escape these quotes.
Let me give you an example:

keyword Monday "Larrys \"House\"" epic

Should return an array of

[keyword, Monday, Larrys "House", epic]

I would also like to support escaping the backslashes themselves, if this isn't too difficult to implement: e.g. \\" still counts as a quote, while \\\" doesn't, and so on.

Any help is appreciated!

RoiEX

1 Answer

There is no need for a regex here, as PHP provides very good CSV parsing routines. Using str_getcsv with a space as the delimiter:

$input = 'keyword Monday "Larrys \"House\"" epic';

$arr = str_getcsv($input, " ");

// unescape each element: \" becomes " and \\ becomes \
$arr = array_map(function ($elem) {
    return str_replace('\\\\', '\\', str_replace('\\"', '"', $elem));
}, $arr);


// print resulting array
print_r($arr);

Output:

Array
(
    [0] => keyword
    [1] => Monday
    [2] => Larrys "House"
    [3] => epic
)
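
The same approach can be wrapped in a small helper. This is only a sketch: the function name `splitSearchQuery` is hypothetical and not part of the answer above. It passes the enclosure and escape characters to str_getcsv explicitly, skips the empty fields that repeated spaces produce, and uses strtr so both replacements happen in one pass over each element.

```php
<?php
// Hypothetical helper; the name is illustrative and not from the answer above.
function splitSearchQuery(string $query): array
{
    // Space as delimiter, double quote as enclosure, backslash as escape.
    $parts = str_getcsv($query, ' ', '"', '\\');

    $terms = [];
    foreach ($parts as $part) {
        // Repeated spaces (or an empty query) yield empty/null fields; skip them.
        if ($part === null || $part === '') {
            continue;
        }
        // strtr() with an array scans the string once and never
        // re-scans text it has already replaced.
        $terms[] = strtr($part, ['\\\\' => '\\', '\\"' => '"']);
    }
    return $terms;
}

print_r(splitSearchQuery('keyword Monday "Larrys \"House\"" epic'));
```

Whether empty fields should be dropped is a design choice; for a search box it seems reasonable, since an empty term matches nothing useful.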
RoiEX
anubhava
  • Could you provide a way to remove the escaping doublequotes like shown in the example? – RoiEX Oct 10 '16 at 19:32
  • Check my edited answer now. – anubhava Oct 10 '16 at 19:38
  • Looks better, but I would replace `$arr[$i] = str_replace('\\', '', $arr[$i]);` with `$arr[$i] = str_replace('\\\\', '\\', $arr[$i]);$arr[$i] = str_replace('\\"', '"', $arr[$i]);`. Not sure if this works as expected, though. – RoiEX Oct 10 '16 at 19:42
  • Anyways... This is (more or less) exactly what I needed. Thanks! – RoiEX Oct 10 '16 at 19:43
  • Nice use of `str_getcsv`! I'd add that a `foreach` instead of `for` makes more sense to me here. – Steve Oct 10 '16 at 19:46
  • Actually, we can avoid looping using `array_map` (edited). – anubhava Oct 10 '16 at 19:50
  • @anubhava - even better! – Steve Oct 10 '16 at 19:52
  • As I see it, this replaces every backslash with nothing... It works for most cases, but `\\ ` / escaped `\\\\ ` should be replaced with `\ ` / escaped `\\ `, and `\"` with `"`. – RoiEX Oct 10 '16 at 19:59
  • @RoiEX: You can try: `$arr = array_map(function($elem) { return preg_replace('/\\\\(?=.)/', '', $elem); }, $arr);` – anubhava Oct 10 '16 at 20:05
  • Almost... `\\\\\\\\\\\\\\\\\\\\\\\\\\ ` is still being replaced with `\ `, but this doesn't matter. Feel free to edit your answer though. It's much better than the answer in the duplicate question. – RoiEX Oct 10 '16 at 20:10
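
For the backslash rules discussed in these comments (`\\` becomes `\`, `\"` becomes `"`, any run length), a hand-written scanner is one alternative that does not depend on CSV parsing behaviour. The sketch below is an assumption-laden illustration, not the answer's method: the name `tokenizeQuery` is hypothetical, a backslash before any other character is kept as-is, and an unterminated quote simply runs to the end of the input.

```php
<?php
// Hypothetical tokenizer; names and edge-case choices are illustrative only.
function tokenizeQuery(string $query): array
{
    $terms = [];
    $current = '';
    $inQuotes = false;
    $length = strlen($query);

    for ($i = 0; $i < $length; $i++) {
        $char = $query[$i];

        if ($char === '\\' && $i + 1 < $length
            && ($query[$i + 1] === '"' || $query[$i + 1] === '\\')) {
            // Escaped quote or backslash: keep the next character literally.
            $current .= $query[$i + 1];
            $i++;
        } elseif ($char === '"') {
            // An unescaped quote toggles exact-phrase mode.
            $inQuotes = !$inQuotes;
        } elseif ($char === ' ' && !$inQuotes) {
            // A space outside quotes ends the current term.
            if ($current !== '') {
                $terms[] = $current;
                $current = '';
            }
        } else {
            $current .= $char;
        }
    }

    // Flush the last term (also covers an unterminated quote).
    if ($current !== '') {
        $terms[] = $current;
    }
    return $terms;
}

print_r(tokenizeQuery('keyword Monday "Larrys \"House\"" epic'));
```

Because pairs of backslashes are consumed left to right, a quote preceded by an even number of backslashes still acts as a quote, while one preceded by an odd number is taken literally, which is the behaviour asked for in the question.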