-5

I was trying to think in a simple way to avoid SQL Injection without using any libraries. Just a plain old sql string that I may concatenate parameters as I see fit and pass the whole query to the DB. No PDO, no prepared queries, no escaped strings.

I think the reasons why I don't want to use these aproaches are not relevant to this question, once it would turn it into a viewpoint debate, which isn't what SO is about.

In my view the only really problematic case is of names search as it would have to pass it to "LIKE '%:name%', once I can easily check validity for integer, dates and other kinds of parameters. So I came around this function here:

function strHasSQLReservedWords($str) {
    $str = strtoupper($str);
    $keywords = array("ABSOLUTE", "ACTION", "ADD", "AFTER", "ALL", "ALLOCATE", "ALTER", "AND", "ANY", "ARE", "ARRAY", "AS", "ASC", "ASENSITIVE", "ASSERTION", "ASYMMETRIC", "AT", "ATOMIC", "AUTHORIZATION", "AVG", "BEFORE", "BEGIN", "BETWEEN", "BIGINT", "BINARY", "BIT", "BIT_LENGTH", "BLOB", "BOOLEAN", "BOTH", "BREADTH", "BY", "CALL", "CALLED", "CASCADE", "CASCADED", "CASE", "CAST", "CATALOG", "CHAR", "CHARACTER", "CHARACTER_LENGTH", "CHAR_LENGTH", "CHECK", "CLOB", "CLOSE", "COALESCE", "COLLATE", "COLLATION", "COLUMN", "COMMIT", "CONDITION", "CONNECT", "CONNECTION", "CONSTRAINT", "CONSTRAINTS", "CONSTRUCTOR", "CONTAINS", "CONTINUE", "CONVERT", "CORRESPONDING", "COUNT", "CREATE", "CROSS", "CUBE", "CURRENT", "CURRENT_DATE", "CURRENT_DEFAULT_TRANSFORM_GROUP", "CURRENT_PATH", "CURRENT_ROLE", "CURRENT_TIME", "CURRENT_TIMESTAMP", "CURRENT_TRANSFORM_GROUP_FOR_TYPE", "CURRENT_USER", "CURSOR", "CYCLE", "DATA", "DATE", "DAY", "DEALLOCATE", "DEC", "DECIMAL", "DECLARE", "DEFAULT", "DEFERRABLE", "DEFERRED", "DELETE", "DEPTH", "DEREF", "DESC", "DESCRIBE", "DESCRIPTOR", "DETERMINISTIC", "DIAGNOSTICS", "DISCONNECT", "DISTINCT", "DO", "DOMAIN", "DOUBLE", "DROP", "DYNAMIC", "EACH", "ELEMENT", "ELSE", "ELSEIF", "END", "EQUALS", "ESCAPE", "EXCEPT", "EXCEPTION", "EXEC", "EXECUTE", "EXISTS", "EXIT", "EXTERNAL", "EXTRACT", "FALSE", "FETCH", "FILTER", "FIRST", "FLOAT", "FOR", "FOREIGN", "FOUND", "FREE", "FROM", "FULL", "FUNCTION", "GENERAL", "GET", "GLOBAL", "GO", "GOTO", "GRANT", "GROUP", "GROUPING", "HANDLER", "HAVING", "HOLD", "HOUR", "IDENTITY", "IF", "IMMEDIATE", "IN", "INDICATOR", "INITIALLY", "INNER", "INOUT", "INPUT", "INSENSITIVE", "INSERT", "INT", "INTEGER", "INTERSECT", "INTERVAL", "INTO", "IS", "ISOLATION", "ITERATE", "JOIN", "KEY", "LANGUAGE", "LARGE", "LAST", "LATERAL", "LEADING", "LEAVE", "LEFT", "LEVEL", "LIKE", "LOCAL", "LOCALTIME", "LOCALTIMESTAMP", "LOCATOR", "LOOP", "LOWER", "MAP", "MATCH", "MAX", "MEMBER", "MERGE", "METHOD", "MIN", "MINUTE", "MODIFIES", "MODULE", "MONTH", "MULTISET", "NAMES", "NATIONAL", "NATURAL", "NCHAR", "NCLOB", "NEW", "NEXT", "NO", "NONE", "NOT", "NULL", "NULLIF", "NUMERIC", "OBJECT", "OCTET_LENGTH", "OF", "OLD", "ON", "ONLY", "OPEN", "OPTION", "OR", "ORDER", "ORDINALITY", "OUT", "OUTER", "OUTPUT", "OVER", "OVERLAPS", "PAD", "PARAMETER", "PARTIAL", "PARTITION", "PATH", "POSITION", "PRECISION", "PREPARE", "PRESERVE", "PRIMARY", "PRIOR", "PRIVILEGES", "PROCEDURE", "PUBLIC", "RANGE", "READ", "READS", "REAL", "RECURSIVE", "REF", "REFERENCES", "REFERENCING", "RELATIVE", "RELEASE", "REPEAT", "RESIGNAL", "RESTRICT", "RESULT", "RETURN", "RETURNS", "REVOKE", "RIGHT", "ROLE", "ROLLBACK", "ROLLUP", "ROUTINE", "ROW", "ROWS", "SAVEPOINT", "SCHEMA", "SCOPE", "SCROLL", "SEARCH", "SECOND", "SECTION", "SELECT", "SENSITIVE", "SESSION", "SESSION_USER", "SET", "SETS", "SIGNAL", "SIMILAR", "SIZE", "SMALLINT", "SOME", "SPACE", "SPECIFIC", "SPECIFICTYPE", "SQL", "SQLCODE", "SQLERROR", "SQLEXCEPTION", "SQLSTATE", "SQLWARNING", "START", "STATE", "STATIC", "SUBMULTISET", "SUBSTRING", "SUM", "SYMMETRIC", "SYSTEM", "SYSTEM_USER", "TABLE", "TABLESAMPLE", "TEMPORARY", "THEN", "TIME", "TIMESTAMP", "TIMEZONE_HOUR", "TIMEZONE_MINUTE", "TO", "TRAILING", "TRANSACTION", "TRANSLATE", "TRANSLATION", "TREAT", "TRIGGER", "TRIM", "TRUE", "UNDER", "UNDO", "UNION", "UNIQUE", "UNKNOWN", "UNNEST", "UNTIL", "UPDATE", "UPPER", "USAGE", "USER", "USING", "VALUE", "VALUES", "VARCHAR", "VARYING", "VIEW", "WHEN", "WHENEVER", "WHERE", "WHILE", "WINDOW", "WITH", "WITHIN", "WITHOUT", "WORK", "WRITE", "YEAR", "ZONE");
    foreach($keywords as $keyword)
        if (strpos($str, $keyword) >=0) 
            return true;
    return false;
}

so I would use something like:

$name = // get name from request;
$sql = 'select * from mytable where name = :name';
if (!strHasSQLReservedWords($name)) {
    $sql = str_replace(":name", $name, $sql);
    $result = $DBProviderSQLExecuteFunction($sql);
}

Anyone sees a way a SQL injection could pass through this validation for name parameters ? if yes, is there any way to enhance it so I can use it this way ?

please forget for now names like O'Hara as they could be easily validated.

Edit

Well I don't want to say my reasons 'cause if I told them to you I'll start a discussion about their validity, and probably a moderator will intervene as that's not what SO is for. But if you insist I can open a private talk and explain them to you. But I really don't think that's relevant.

About apostrophes, I'm not very worried with them because this system is used in a country where apostrophes aren't common. I checked my whole DB (26.733 names) and there isn't one single name that has an apostrophe. That's because names in this system came directly from A3 tokens and there are no special characters in the name field. Just plain letters. So I'm really confident I won't have problems with it, and I can fix any if they came.

So... about bug fixing my function, what if I divide the name by spaces and compare whole words with the reserved words list ? I can exclude MAX function as it won't have any major drawback AFAIU. Is there another function you see that can be a whole name ? Nobody's called OR or AND. :D

So based on that I enhanced the function to this:

function strHasSQLReservedWords($str) {
    $str = preg_replace('/\s+/', ' ',$row['message']);
    $arr = explode(" ", strtoupper($str));
    //"MAX",
    $keywords = array("ABSOLUTE", "ACTION", "ADD", "AFTER", "ALL", "ALLOCATE", "ALTER", "AND", "ANY", "ARE", "ARRAY", "AS", "ASC", "ASENSITIVE", "ASSERTION", "ASYMMETRIC", "AT", "ATOMIC", "AUTHORIZATION", "AVG", "BEFORE", "BEGIN", "BETWEEN", "BIGINT", "BINARY", "BIT", "BIT_LENGTH", "BLOB", "BOOLEAN", "BOTH", "BREADTH", "BY", "CALL", "CALLED", "CASCADE", "CASCADED", "CASE", "CAST", "CATALOG", "CHAR", "CHARACTER", "CHARACTER_LENGTH", "CHAR_LENGTH", "CHECK", "CLOB", "CLOSE", "COALESCE", "COLLATE", "COLLATION", "COLUMN", "COMMIT", "CONDITION", "CONNECT", "CONNECTION", "CONSTRAINT", "CONSTRAINTS", "CONSTRUCTOR", "CONTAINS", "CONTINUE", "CONVERT", "CORRESPONDING", "COUNT", "CREATE", "CROSS", "CUBE", "CURRENT", "CURRENT_DATE", "CURRENT_DEFAULT_TRANSFORM_GROUP", "CURRENT_PATH", "CURRENT_ROLE", "CURRENT_TIME", "CURRENT_TIMESTAMP", "CURRENT_TRANSFORM_GROUP_FOR_TYPE", "CURRENT_USER", "CURSOR", "CYCLE", "DATA", "DATE", "DAY", "DEALLOCATE", "DEC", "DECIMAL", "DECLARE", "DEFAULT", "DEFERRABLE", "DEFERRED", "DELETE", "DEPTH", "DEREF", "DESC", "DESCRIBE", "DESCRIPTOR", "DETERMINISTIC", "DIAGNOSTICS", "DISCONNECT", "DISTINCT", "DO", "DOMAIN", "DOUBLE", "DROP", "DYNAMIC", "EACH", "ELEMENT", "ELSE", "ELSEIF", "END", "EQUALS", "ESCAPE", "EXCEPT", "EXCEPTION", "EXEC", "EXECUTE", "EXISTS", "EXIT", "EXTERNAL", "EXTRACT", "FALSE", "FETCH", "FILTER", "FIRST", "FLOAT", "FOR", "FOREIGN", "FOUND", "FREE", "FROM", "FULL", "FUNCTION", "GENERAL", "GET", "GLOBAL", "GO", "GOTO", "GRANT", "GROUP", "GROUPING", "HANDLER", "HAVING", "HOLD", "HOUR", "IDENTITY", "IF", "IMMEDIATE", "IN", "INDICATOR", "INITIALLY", "INNER", "INOUT", "INPUT", "INSENSITIVE", "INSERT", "INT", "INTEGER", "INTERSECT", "INTERVAL", "INTO", "IS", "ISOLATION", "ITERATE", "JOIN", "KEY", "LANGUAGE", "LARGE", "LAST", "LATERAL", "LEADING", "LEAVE", "LEFT", "LEVEL", "LIKE", "LOCAL", "LOCALTIME", "LOCALTIMESTAMP", "LOCATOR", "LOOP", "LOWER", "MAP", "MATCH",  "MEMBER", "MERGE", "METHOD", "MIN", "MINUTE", "MODIFIES", "MODULE", "MONTH", "MULTISET", "NAMES", "NATIONAL", "NATURAL", "NCHAR", "NCLOB", "NEW", "NEXT", "NO", "NONE", "NOT", "NULL", "NULLIF", "NUMERIC", "OBJECT", "OCTET_LENGTH", "OF", "OLD", "ON", "ONLY", "OPEN", "OPTION", "OR", "ORDER", "ORDINALITY", "OUT", "OUTER", "OUTPUT", "OVER", "OVERLAPS", "PAD", "PARAMETER", "PARTIAL", "PARTITION", "PATH", "POSITION", "PRECISION", "PREPARE", "PRESERVE", "PRIMARY", "PRIOR", "PRIVILEGES", "PROCEDURE", "PUBLIC", "RANGE", "READ", "READS", "REAL", "RECURSIVE", "REF", "REFERENCES", "REFERENCING", "RELATIVE", "RELEASE", "REPEAT", "RESIGNAL", "RESTRICT", "RESULT", "RETURN", "RETURNS", "REVOKE", "RIGHT", "ROLE", "ROLLBACK", "ROLLUP", "ROUTINE", "ROW", "ROWS", "SAVEPOINT", "SCHEMA", "SCOPE", "SCROLL", "SEARCH", "SECOND", "SECTION", "SELECT", "SENSITIVE", "SESSION", "SESSION_USER", "SET", "SETS", "SIGNAL", "SIMILAR", "SIZE", "SMALLINT", "SOME", "SPACE", "SPECIFIC", "SPECIFICTYPE", "SQL", "SQLCODE", "SQLERROR", "SQLEXCEPTION", "SQLSTATE", "SQLWARNING", "START", "STATE", "STATIC", "SUBMULTISET", "SUBSTRING", "SUM", "SYMMETRIC", "SYSTEM", "SYSTEM_USER", "TABLE", "TABLESAMPLE", "TEMPORARY", "THEN", "TIME", "TIMESTAMP", "TIMEZONE_HOUR", "TIMEZONE_MINUTE", "TO", "TRAILING", "TRANSACTION", "TRANSLATE", "TRANSLATION", "TREAT", "TRIGGER", "TRIM", "TRUE", "UNDER", "UNDO", "UNION", "UNIQUE", "UNKNOWN", "UNNEST", "UNTIL", "UPDATE", "UPPER", "USAGE", "USER", "USING", "VALUE", "VALUES", "VARCHAR", "VARYING", "VIEW", "WHEN", "WHENEVER", "WHERE", "WHILE", "WINDOW", "WITH", "WITHIN", "WITHOUT", "WORK", "WRITE", "YEAR", "ZONE");
    foreach($keywords as $keyword)
        foreach($arr as $item)
            if ($item == $keyword)
                return true;
    return false;
}

I also thinking in saving all the queries that where entered by users and that matched any keyword to pinpoint some edge cases.

Nelson Teixeira
  • 6,297
  • 5
  • 36
  • 73
  • Why do you want to use string replacement for `:name` rather than a normal parameterized query method? And PDO just makes more sense since it works for all databases, no rewriting when you decide to switch from MySQL to MSSQL or Oracles. Its a no-brainer. – developerwjk Dec 07 '15 at 20:33
  • Forgetting sql injection for a minute, run your code with a name like O'Hara. – Dan Bracuk Dec 07 '15 at 20:33
  • ı edited the question – Nelson Teixeira Dec 07 '15 at 20:38
  • 2
    Lol. `"please forget for now names like O'Hara as they could be easily validated."` Wrong. If a user is writing up something in Word and pastes in your textbox, and they paste some UTF8 curly quote or single quote, you won't be able to tell that its a single quote without having a full list of every UTF8 character that can transform into a single-quote when transformed from UTF8 to ASCII, so you can't validate this. – developerwjk Dec 07 '15 at 20:43
  • so you think the only problem of my approach are apostrophes? besides them do you see any other problems? – Nelson Teixeira Dec 07 '15 at 20:50
  • You're assuming the problem is adding to or deleting from the database .Its not, unless you're using `*_multi_query` ... the bigger problem is tricking a select statement to return more rows. – developerwjk Dec 07 '15 at 21:06
  • So you can easily validate a person’s last names (sic!) but you can’t handle a value properly by using parameterization (or quoting and escaping it properly)? Seriously, parameterization as provided by the libraries is such a convenient way of handling values properly. – Gumbo Dec 08 '15 at 06:33

1 Answers1

1

I do not really see a way of successful SQL injections, but I wonder what if apostrophe is inside the text or a name contains one of the keywords. Technically speaking I believe your way would work for the purpose you want it to work, but I believe this unorthodox approach is undesirable. Apostrophes are not easy to validate, see developerwjk's comment.

Lajos Arpad
  • 64,414
  • 37
  • 100
  • 175
  • so you think the only problem of my approach are apostrophes? – Nelson Teixeira Dec 07 '15 at 20:50
  • 1
    I see a way of successful injection: `or 1=1` ... because SQL injection these days is more a problem of making the query return something when it shouldn't, or more than it should, not of dropping the database. – developerwjk Dec 07 '15 at 21:09
  • Also the method above blacklists anything with a partial match to the keywords. So "MAX" is banned. If my name is Maximilian, I can't use the site. So much easier to just do it right. – developerwjk Dec 07 '15 at 21:11
  • @developerwjk, good point, however, he can add OR to the list. Assuming that all SQL reserved words are present in that array, including || and && where they are supported, the array can be a successful protection, but if someone happens to be called O'Reily or Backhand, then both the ' and the and will cause undesired behaviors. So, while his initial idea still allows successful injection, after some bug fixing it will eventually reach to a level that they will not be possible, but this approach should not be advised, yet, I assume that the op has his reasons, whatever they are. – Lajos Arpad Dec 07 '15 at 23:23
  • @NelsonTeixeira, not only. What if a name contains and SQL-reserved word? Also, see Developerwjk's points, he is right both about ' and the incompleteness of your list. – Lajos Arpad Dec 07 '15 at 23:24