0

my regexp foo isn't the best but the following doesn't work and I'm looking for a little advice:

$query = $_GET['indicator'];
// given: 239,240  or 240,239  or 238,239,240  
//   and: 239, 240 or 240, 239 or 238, 239, 240
//   and: 239
// we need to check for start+number+puntuation(1), punctuation+number+end(2), punctuation+number+punctuation(3)
//                                                  space+number+end(4),       space+number+punctuation(6)
//                      start+number+end(6)
$sql = 'SELECT * from reports WHERE dataSetIDs REGEXP \'';
$sql .= '^'.$query.'[:punct:]|';            // 1
$sql .= '[:punct:]'.$query.'$|';            // 2
$sql .= '[:punct:]'.$query.'[:punct:]|';    // 3
$sql .= '[:space:]'.$query.'$|';            // 4
$sql .= '[:space:]'.$query.'[:punct:]|';    // 5
$sql .= '^'.$query.'$\'';                   // 6
$result = mysql_query($sql);

The query is looking at a specific cell which has CSV formatted set of numbers for which the formatting isn't set (i.e. could have spaces after the comma or not). I'm getting the rows with just one entry... but not the rows that have the number in the csv.

Any pointers greatly appreciated.

Cheers,

Dom

annoyingmouse
  • 5,524
  • 1
  • 27
  • 45
  • -1 for SQL-injection hole: see http://stackoverflow.com/questions/332365/xkcd-sql-injection-please-explain (but out of votes) – Johan May 28 '11 at 20:51

2 Answers2

1
  1. Why you do not explode the csv ?

  2. You can canonize the csv content in the database and your query input:

    1. remove all useless space
    2. use the same delimiter ',' as an example
    3. add a delimier at start and end ex:

      200, 300 => ,200,300,

    4. use a regexp like ",$id," which is simpler

VGE
  • 4,171
  • 18
  • 17
  • Hi @VGE, even with sanity-checking the field in the database I'm stuck with looking at cases where the delimiter doesn't exist at the beginning, the end or both which is where I'm getting stuck I think. Say I was looking for 23 I'd find it in 23 and 234,456 if I used the LIKE condition which is why I was looking at MySQLs REGEXP conditional. DB isn't set in stone but is a bit of a bother to alter with other people working on it as well... – annoyingmouse Feb 08 '11 at 13:49
  • Use a `update x set csv = ',' + csv where not csv like ',%'` to sanitize csv which does not contain a starting delimiter and `update x set csv = csv + ',' where not csv like '%,'` to sanitize. – VGE Feb 08 '11 at 14:52
1

Clocked it:

$query = mysql_real_escape_string($_GET['indicator']);
$sql = 'SELECT title from reports WHERE dataSetIDs REGEXP \'(^|[:,:]|[: :])'.$query.'([:,:]|[: :]|$)\'';
$result = mysql_query($sql);

Thanks for your help @VGE.

Cheers,

Dom

Johan
  • 74,508
  • 24
  • 191
  • 319
annoyingmouse
  • 5,524
  • 1
  • 27
  • 45