-2

Say we have a typical prepared statement query like so:

$db->query("SELECT * FROM users WHERE id = :id");
$db->bind(':id', "x ' OR 'x' = 'x");
$db->execute();

Question

What is the mechanism that works behind the scenes to ensure SQL injection is avoided?

Namely, I understand that SQL and Data have to be merged back at some point for data retrieval process to work. What is the mechanism of this merge, and which subsystem is responsible for the merge?

Program & Data Separation

I hear and it has been noted that the inherent separation between Program and Data remove the possibility of such and injection. This is not getting to me because I can write a program that simply joins SQL & Data back together.

To drive the point home, here's some code to illustrate this point.

//receive SQL statement and data *separately*
function setSQL($sql) {$this->sql = $sql;}

function setData($data) {$this->data = $data;}

function executeStatement()
{
    //simplified code that shows that we can 
    //blindly replace named parameter with supplied 
    //data inside SQL and execute it
    db_execute(str_replace(':id', $this->data, $this->sql));
}

Are we protected? Well, do we have a program? yes. Do we have the Data sent separately? yes... do we still have injection attack? yes....? did I miss something? The part I am missing is at the heart of the question.

Data has to be joined with the SQL or the Program somehow, and I seek to find how it is done at code/pseudocode level, and what subsystem is responsible for merging this data together.

I am looking for "Fill in this gap in my knowledge" type of answer, where the mechanism of joining data back is exposed enough to cover up this gap.

Dennis
  • 7,907
  • 11
  • 65
  • 115
  • 4
    "*Just Separating SQL from parameters does not necessarily make the resulting query safer.*" .... uh, yes it does. The parameters get sent to the server separately and marked as data, so their contents are **never** executed as SQL even if they contain SQL-like code. That's the whole point, and that's what makes it safe. The query is executed; the data isn't. As long as your query string is static, it cannot possibly be injection attacked. – Simba Jun 10 '16 at 16:24
  • And also read http://stackoverflow.com/questions/12859942/why-shouldnt-i-use-mysql-functions-in-php/14110189#14110189 – RiggsFolly Jun 10 '16 at 16:24
  • please see the note inside the question.. the "already have an answer" did not seem to give me the level of detail I sought. – Dennis Jun 10 '16 at 16:27
  • The prepare basically creates a stored proceedure on the MYSQL Server compiled and optimized thats the program in this context – RiggsFolly Jun 10 '16 at 16:30
  • @Dennis there are additional links in the answer which give a much clearer picture, i.e. http://stackoverflow.com/questions/60174/how-can-i-prevent-sql-injection-in-php/8255054#8255054 – Jay Blanchard Jun 10 '16 at 16:31
  • @Jay I guess I am asking a somewhat different question, it took me a bit to get closer to the point. I am seeking more details on how things are implemented.. i.e. see my latest edit starting with `Placeholders`. The link you gave and the material I found so far does not delve deeply into specifics enough .... I am seeking a deeper understanding of how this is implemented in practice. I don't need to see "bare code" but I need to understand things like who writes/generates the code & how it works at figuring out what type of parameter is being used. – Dennis Jun 10 '16 at 17:11
  • For example, is my understanding that "someone actually creates C++ code" correct? DId someone actually write a program for PHP PDO for example that converts SQL into C? How do they take care of deciding what type the passed parameters are? Those kind of questions is what I seek. – Dennis Jun 10 '16 at 17:13
  • The PHP Dev Team writes the code and determines the inner workings of PHP. Is that what you're after? Aside from that, the questions you're asking may be too broad and may be more operational in nature, rather than a code based solution to a question, which is what Stack Overflow is designed for. – Jay Blanchard Jun 10 '16 at 17:23
  • 1
    For the moment I will answer your questions right here in the comments and later will decide whether to reopen this question and answer, or update the other answer. – Your Common Sense Jun 10 '16 at 17:33
  • @Jay. Thanks. It can be too broad.... to answer you -- what I'm after is the mechanism and some questions about how the mechanism works. For example, I invented an idea that the mechanism is "someone wrote the code to take SQL statement and turn it into code that expects typed parameters and then feeds them the typed parameters". It appears to be the case and is plausible that it is, but I don't actually know with certainty that it is. But assuming that it is, I'd have more questions about it such as under my "Question 1" – Dennis Jun 10 '16 at 17:33
  • 1
    When no type is specified, then string type is applied by default – Your Common Sense Jun 10 '16 at 17:34
  • The program is the query itself, written in SQL language. There is no translation to C++ or whatever else programming language. Database server reads and interprets sql just like php interpreter reads and interprets php. Does it make the process clearer for you? – Your Common Sense Jun 10 '16 at 17:39
  • In response to your edit, that's much too broad. You should really only be asking one question at a time. The short answers are 1) the database schema determines the type and 2) a driver is written that converts SQL strings into commands against a database. The database has some kind of interface which other systems can send and receive data through. – Mike Cluck Jun 10 '16 at 19:04
  • @Your, in that case I'm back to square 1. If we *separate* SQL from parameters, and we *keep* SQL as SQL, what stops us from "joining the parameters back together with SQL" ? I could see this work if SQL was translated to a C++ program. But when SQL remains SQL, what makes parametrized queries inherently different from concatenation like so: `"SELECT * from .. id = " . apply_some_filter($id);` – Dennis Jun 10 '16 at 19:29
  • I guess .. does the database *itself* allow for parametrised queries. i.e. can I do an SQL statement inside an SQL client to where I supply the SQL and the parameters separately? For example, is there MySQL syntax for such things. Are PDO's and MySQL's parameter separation is just a front to SQL server prepared statement facility? – Dennis Jun 10 '16 at 19:46
  • Yes, pretty much this way. Save for PDO emulation mode – Your Common Sense Jun 11 '16 at 03:20
  • Thanks all. I believe that the answer I was looking for is "Prepared Statements at PHP level are essentially a 1:1 mapping to Stored Procedures at SQL Server Level". The `Program` is created inside SQL Server, and the data is passed to it from PHP. So the mechanism I was looking for was 1) PHP does pass-through for SQL and data in separate streams, and 2) all the magic happens inside SQL Server using normal stored procedures + plugged in data. And when PHP or SQL are not given a column data type, string data type is assumed. – Dennis Jun 13 '16 at 15:39
  • I added my vote to reopen because my OCD tells me it is not a duplicate of the question as it is currently marked. – Dennis Jun 13 '16 at 16:13
  • 1.Involving whatever stored procedures is too much an exaggeration. There are NO stored procedures. 2.The program is the SQL query itself. 3.Your example is a failed one, because you are still creating a program on the fly. The program should be always the same, as written by the programmer. This is the point, and your example violating it. Imagine a PHP script. Can you change its logic assigning a value to one of its variables, by means of adding PHP code to it? No, unless you are using eval(). SAME HERE. Either you have your program intact, or have an injection. – Your Common Sense Jun 13 '16 at 16:21
  • 1
    All you need is to understand the program/data separation. At the moment you don't. – Your Common Sense Jun 13 '16 at 16:23

1 Answers1

1

What you need to understand is the difference between the code and the data.

The code is a program you write while the data is a data this program is using. They never interfere.

Imagine a typical PHP script, like

echo "Hello ". $name;

No matter what $name may contain, this contents will never change your program. It will always remain the same. Even if $name contains something like

<?php unlink(__FILE__);

it will do no harm. Because it's just data which is not interpreted as a program. So this code will be simply echoed out to the browser and can be seen in the source code.

Exactly the same thing with [native] prepared statements.

You have a program (an SQL query) and you have some data. And the latter is only used by the program but never can alter it.

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345