3

I need some help with a RegEx. The concept is simple, but the actual solution is well beyond anything I know how to figure out. If anyone could explain how I could achieve my desired effect (and provide an explanation with any example code) it'd be much appreciated!


Basically, imagine a database table that stores the following string:

'My name is $1. I wonder who $2 is.'

First, bear in mind that the dollar sign-number format IS set in stone. That's not just for this example--that's how these wildcards will actually be stored. I would like an input like the following to be able to return the above string.

'My name is John. I wonder who Sarah is.'

How would I create a query that searches with wildcards in this format, and then returns the applicable rows? I imagine a regular expression would be the best way. Bear in mind that, theoretically, any number of wildcards should be acceptable.

Right now, this is the part of my existing query that drags the content out of the database. The concatenation, et cetera, is there because in a single database cell, there are multiple strings concatenated by a vertical bar.

AND CONCAT('|', content, '|')
    LIKE CONCAT('%|', '" . mysql_real_escape_string($in) . "', '|%')

I need to modify ^this line to work with the variables that are a part of the query, while keeping the current effect (vertical bars, etc) in place. If the RegEx also takes into account the bars, then the CONCAT() functions can be removed.

Here is an example string with concatenation as it might appear in the database:

Hello, my name is $1.|Hello, I'm $1.|$1 is my name!

The query should be able to match with any of those chunks in the string, and then return that row if there is a match. The variables $1 should be treated as wildcards. Vertical bars will always delimit chunks.

Nathanael
  • 6,893
  • 5
  • 33
  • 54
  • if every independent string has only one pattern, why don't you just add a pattern index column that makes the search easier? then you can join the string and the pattern together, and create a regex match in the PHP side? – Taha Paksu May 17 '12 at 00:00

4 Answers4

2

For MySQL, this article is a nice guide which should help you. The Regexp would be "(\$)(\d+)". Here's a query I ripped off the article:

SELECT * FROM posts WHERE content REGEXP '(\\$)(\\d+)';

After retrieving data, use this handy function:

function ParseData($query,$data) {
    $matches=array();
    while(preg_match("/(\\$)(\\d+)/",$query,$matches)) {
        if (array_key_exists(substr($matches[0],1),$data))
            $query=str_replace($matches[0],"'".mysql_real_escape_string($data[substr($matches[0],1)])."'",$query);
        else
            $query=str_replace($matches[0],"''",$query);
    }
    return $query;
}

Usage:

$query="$1 went to $2's house";
$data=array(
    '1' => 'Bob',
    '2' => 'Joe'
);
echo ParseData($query,$data); //Returns "Bob went to Joe's house
Connor Peet
  • 6,065
  • 3
  • 22
  • 32
  • I edited my question to better show what I need. Looking over your examples, I don't think it's exactly what I want. – Nathanael May 17 '12 at 21:38
0

If you aren't sticky about using the $1 and $2 and could change them around a bit, you could take a look at this:

http://php.net/manual/en/function.sprintf.php

E.G.

<?php
$num = 5;
$location = 'tree';

$format = 'There are %d monkeys in the %s';
printf($format, $num, $location);
?>
Kevin Green
  • 1,137
  • 11
  • 21
0

If you want to find entries in the database, then you can use a LIKE statement:

SELECT statement FROM myTable WHERE statement LIKE '%$1%'

Which will find all statements that include $1. I'm assuming that the first number to replace will always be $1 - it doesn't matter, in that case, that the total number of wildcards is arbitrary, as we're just looking for the first one.

The PHP replacement is a little trickier. You could probably do something like:

$count = 1;
while (strpos($statement, "$" . $count)) {
    $statement = str_replace("$" . $count, $array[$count], $statement);
}

(I've not tested that, so there might be typos in there, but it should be enough to give the general idea.)

The one downside is that it will fail if you have more than ten parameters in your string to replace - the first runthrough will replace the first two characters of $10, as it's looking for $1.

andrewsi
  • 10,807
  • 132
  • 35
  • 51
0

I asked a different, but similar, question, and I think the solution applies to this question just as well.

https://stackoverflow.com/a/10763476/1382779

Community
  • 1
  • 1
Nathanael
  • 6,893
  • 5
  • 33
  • 54