30

I know the advantages of using PreparedStatement, which are

  • query is rewritten and compiled by the database server
  • protection against SQL injection

But I want to know when we use it instead of Statement?

dimo414
  • 47,227
  • 18
  • 148
  • 244
Johanna
  • 27,036
  • 42
  • 89
  • 117
  • possible duplicate of [PreparedStatements and performance](http://stackoverflow.com/questions/687550/preparedstatements-and-performance) – sarnold Mar 05 '12 at 01:08

9 Answers9

31
  1. Query is rewritten and compiled by the database server

    If you don't use a prepared statement, the database server will have to parse, and compute an execution plan for the statement each time you run it. If you find that you'll run the same statement multiple times (with different parameters) then its worth preparing the statement once and reusing that prepared statement. If you are querying the database adhoc then there is probably little benefit to this.

  2. Protected against SQL injection

    This is an advantage you almost always want hence a good reason to use a PreparedStatement everytime. Its a consequence of having to parameterize the query but it does make running it a lot safer. The only time I can think of that this would not be useful is if you were allowing adhoc database queries; You might simply use the Statement object if you were prototyping the application and its quicker for you, or if the query contains no parameters.

dimo414
  • 47,227
  • 18
  • 148
  • 244
Martin Booth
  • 8,485
  • 31
  • 31
19

Ask Tom's opinion:

The use of a Statement in JDBC should be 100% localized to being used for DDL (ALTER, CREATE, GRANT, etc) as these are the only statement types that cannot accept BIND VARIABLES.

PreparedStatements or CallableStatements should be used for EVERY OTHER type of statement (DML, Queries). As these are the statement types that accept bind variables.

This is a fact, a rule, a law -- use prepared statements EVERYWHERE. Use STATEMENTS almost no where.

He's specifically talking about Oracle but the same principle applies to any database that caches execution plans.

Database apps that scale and prevent SQL injection attacks at the same time? What's the downside?

mattjames
  • 1,636
  • 1
  • 12
  • 14
10

I would turn this round: in a publicly distributed app, you should generally always use prepared statements unless you have a really compelling reason not to, and you should always supply parameters "properly" to the prepared statement, and not by splicing them into the query string.

Why? Well, basically because of the reasons you gave (or at least, the second one)...

Neil Coffey
  • 21,615
  • 7
  • 62
  • 83
  • 3
    Note: Performance of a PreparedStatement may be abysmal unless you do _lots_ of operations with it. This is database driver dependent. – Thorbjørn Ravn Andersen Jan 20 '10 at 08:10
  • Thanks, that's an interesting point. Out of interest, do you have an example of a specific DB/driver where this is the case? From tests I've done with MySQL, there doesn't appear to be anything in it performance-wise. Don't just remember with SQL Server, though don't remember prepared statements being particularly bad. – Neil Coffey Jan 21 '10 at 05:42
5

PreparedStatements should be used very carefully in WHERE clauses.

Suppose that a table is defined as:

create table t (int o, k varchar(100), v varchar(100))

(e.g. "o: object-ID (foreign key), k: attribute-key, v: attribute-value").

Furthermore there is a (non-unique) index on v.

create index ixt on t ( v )

Suppose that this table contains 200 million rows inserted like:

for (i = 0; i < 100*1000*1000; i++) {
  insert into t (o,k,v) values (i,'k1','v1');
  insert into t (o,k,v) values (i,'k2', Convert(i, varchar));
}

("Thus, every object o has attributes k1=v1 and k2=o")

Then you should not build queries like:

select o,p,v from t as tx, t as ty where tx.o=ty.o and tx.k=? and tx.v=? and ty.k=? and ty.v=?

("find objects that have two given attributes")

My experience with ORACLE and MSSQL is, that those queries might need many minutes to return. This is true even if no row matches the where clause. It depends on wether the SQL-Server decides to lookup tx.v or ty.v first.

One shoud put the values for the columns k and v directy into the statement. I think this is because the SQL-Servers take the values into account when computing the execution plan.

A query look like this returns always after milliseconds:

select o,p,v from t as tx, t as ty where tx.o=ty.o and tx.k='k1' and tx.v='v1' and ty.k='k2' and ty.v='1234'

("The SQL-Server will always search first for v='1234' and then for v='v1' ")

Regards
Wolfgang

Wolfgang
  • 51
  • 1
  • 2
2

Statement: Each time the sql query is running,this sql statement is sent to the DBMS where it is compiled. So, it increases the server loads and decreases the performance.

connection con=null; 
  String sql="select * from employee where id=5";
Statement st=conn.createStatement();

PreparedStatement: Unlike Statement PreparedStatement is given a sql query as a parameter when it is created.

connection con=null; 
String sql="select * from employee where id=?";
PreparedStatement ps=conn.prepareStatement(sql);

This sql statement is sent to Database where it is compiled. So,in preparedStatement compiled happens only once but in statement compiled happens each time Statement is called.

stumbler
  • 647
  • 3
  • 11
  • 26
2

You can always use PreparedStatement instead of Statment( select, insert , update, delete ). Better performance and protected against SQL injection.

But, don't use it with a dynamic request like a request with WHERE variable IN [ hundreds possibilities ] :

  1. It's counter-productive, you lost performance and memory because you cache every time a new request, and PreparedStatement are not just for SQL injection, it's about performance. In this case, Statement will not be slower.

  2. Your pool have a limit of PreparedStatment ( -1 defaut but you must limit it ), and you will reach this limit ! and if you have no limit or very large limit you have some risk of memory leak, and in extreme case OutofMemory errors. So if it's for your small personnal project used by 3 users it's not dramatic, but you don't want that if you're in a big company and that you're app is used by thousand people and million request.

Some reading. IBM : Periodical OutOfMemory errors with prepared statement caching

amdev
  • 3,010
  • 3
  • 35
  • 47
  • WHERE variable IN [ hundreds possibilities ] can be passed an array object (depends on your DB. Also this tends to be abstracted away by Hibernate, etc) – Andrew May 30 '19 at 20:12
  • hi @amdev, the page for the provided link is not available now, better to update. – Agent47 Feb 09 '21 at 07:47
  • Honnestly, I don't know how JDBC depending of the DB optimize the array object. Maybe it's just creating a new prepared statement for each array behind the scene. Maybe it depends of the driver and the database. – amdev Jun 22 '21 at 00:02
1

It's simply a Java DESIGN MISTAKE tie "prepared statement" with "parameterized query / bind variables".

Databases does have API to accept "bind variables" in SQL code that just run once time.

It's a big resource wasting force use "prepared statement" everywhere, just to protect from SQL injection. Why not Java just let developers use databases in correct way?

It could be as follows:
Statement Interface - Multiples commands could be run. Not accept bind variables. One execution of SQL command. No SQL injection protection.
PreparedStatement Interface - One command could be run. Accept bind variables. Multiple executions of SQL command. SQL injection protection.
(MISSING IN JAVA!) RunOnceStatement - One command could be run. Accept bind variables. One execution of SQL command. SQL injection protection.

For exemple, in Postgres performance could be better, by driver mapping to:
Statement Interface - PQExec()
PreparedStatement Interface - PQPrepare() / PQExecPrepare() / ...
(MISSING IN JAVA!) RunOnceStatement - PQExecParams()

Using prepared statement in SQL code that runs just once is a BIG performance problem: more processing in database, waste database memory, by maintaining plans that will not called later. Cache plans get so crowed that actual SQL commands that are executed multiple times could be deleted from cache.

Marcos
  • 11
  • 1
0

Besides preventing SQL injection, formatting portability (which you can't get from Statement), performance is the obvious reason. However, PreparedStatement doesn't come without any penalty. For example, it is generally slower than Statement if running only once, as there is some overhead. So the general idea is PreparedStatement should be used when you are performing the same query many many times. However, how much overhead is very database server implementation-specific, so exactly when to choose PreparedStatement over Statement, from performance consideration, should really be based on your actual experience/experiments of a specific database server.

bryantsai
  • 3,405
  • 1
  • 30
  • 30
0

I was facing the same issue ,Then I break the query into multiple common table expression (cte) and now it's working fine for me.

SELECT DISTINCT 
            1 AS RecordSource, --CMD
            EA.EmployerKey,
            EA.AccountID, 
            ISNULL(EA.SourceGroupNumber,'NA') AS SourceGroupNumber,
            EA.SourceSubGroupNumber,
            EA.PurchaserOrgNumber,
            ISNULL(EA.GroupName,'NA') AS GroupName, 
            EA.MemberSourceCode,
            CASE WHEN ESA.EmailAddress = '' OR ESA.EmailAddress IS NULL THEN 'UNKNOWN@UNKNOWN.COM' 
            ELSE ESA.EmailAddress END AS EmailAddress,
            CASE WHEN ESA.FirstName='' OR ESA.FirstName IS NULL THEN 'NA' ELSE ESA.FirstName END AS FirstName,
            CASE WHEN ESA.MiddleName='' OR ESA.MiddleName IS NULL THEN 'NA' ELSE  ESA.MiddleName END AS MiddleInitial,
            CASE WHEN ESA.LastName='' OR ESA.LastName IS NULL THEN 'NA' ELSE ESA.LastName END AS LastName,
            CASE WHEN ESA.ContactName='' OR ESA.ContactName IS NULL THEN 'NA' ELSE ESA.ContactName END AS ContactName,
            ISNULL(ESA.PhoneNumber,'NA') As PhoneNumber,
            'NA' AS MobilePhoneNumber,
            'GROUP ADMINISTRATOR' AS ContactType,
            ESA.StateCode
            
        INTO DCB_A1.Temp_abc

        FROM
            (SELECT  FE.EmployerKey, DE.AccountID, DE.SourceGroupNumber, DE.SourceSubGroupNumber,DE.PurchaserOrgNumber,
            DE.GroupName, DE.MemberSourceCode
            
            FROM DCB_A1.FctESF FE 
            
            INNER JOIN DCB_A1.DimESF DE 
            ON FE.EmployerKey = DE.EmployerKey

            LEFT JOIN DCB_A1.DimMKBU DM 
            ON FE.MBUKey = DM.MarketingBusinessUnitKey
                
            where DE.IsActiveVersion = 1 AND DE.PurchaserOrgTerminationReasonCode <> '171'
            --and (MBUStateCoverage <> '' and MBUStateCoverage <> 'UNK')
            
            GROUP BY FE.EmployerKey, DE.AccountID, DE.SourceGroupNumber,DE.SourceSubGroupNumber,DE.PurchaserOrgNumber,
             DE.GroupName, DE.MemberSourceCode
            )EA
            INNER JOIN
            (SELECT  DE.AccountID,DE.SourceGroupNumber,
            DE.MemberSourceCode,DE.GroupName,
            ISNULL(DC.ContactValue,'') AS EmailAddress, 
            ISNULL(DN.ContactText2,'') AS FirstName,
            ISNULL(DN.ContactText3,'') AS MiddleName,
            ISNULL(DN.ContactText4,'') AS LastName,
            ISNULL(DN.ContactValue,'') AS ContactName,
            ISNULL(DP.ContactValue,'') AS PhoneNumber,
            SUBSTRING(MBUStateCoverage, 1, 2) AS StateCode 
            
            FROM DCB_A1.FctESF FE 
            
            INNER JOIN DCB_A1.DimESF DE 
            ON FE.EmployerKey = DE.EmployerKey
            
            LEFT JOIN DCA_A1.DimECSF DC 
            ON DE.MemberSourceCode = DC.MemberSourceCode
            AND DE.PurchaserOrgNumber = DC.PurchaserOrgNumber
            AND DE.PurchaserOrgTypeCode = DC.PurchaserOrgTypeCode  
            AND DC.ContactType = 3  
            --and DC.ContactValue <> '' -- Do we need this?
            
            LEFT JOIN DCA_A1.DimECSF DN 
            ON DE.MemberSourceCode = DN.MemberSourceCode
            AND DE.PurchaserOrgNumber = DN.PurchaserOrgNumber
            AND DE.PurchaserOrgTypeCode = DN.PurchaserOrgTypeCode  
            AND DN.ContactType = 4
            AND DN.IsActiveVersion = 1
            
            LEFT JOIN DCA_A1.DimECSF DP 
            ON DE.MemberSourceCode = DP.MemberSourceCode
            AND DE.PurchaserOrgNumber = DP.PurchaserOrgNumber
            AND DE.PurchaserOrgTypeCode = DP.PurchaserOrgTypeCode  
            AND DP.ContactType = 1  
            
            LEFT JOIN DCB_A1.DimMKBU DM 
            ON FE.MBUKey = DM.MarketingBusinessUnitKey
                
            WHERE DE.IsActiveVersion = 2
            --and DE.AccountID =2
            --AND (DE.CustomerStatusCode = 'A' 
            -- OR DE.IsEmployerActive = 0)
            ----and CustomerStatusCode = 'A'
            --AND (MBUStateCoverage <> '' and MBUStateCoverage <> 'UNK')
            
            GROUP BY DE.AccountID,DE.SourceGroupNumber,
            DE.MemberSourceCode,DE.GroupName,
            ISNULL(DC.ContactValue,''), ISNULL(DN.ContactValue,''),
            ISNULL(DN.ContactText2,''),ISNULL(DN.ContactText3,''),
            ISNULL(DN.ContactText4,''),ISNULL(DP.ContactValue,'') ,SUBSTRING(MBUStateCoverage, 1, 2)
            )ESA

            ON EA.AccountID = ESA.AccountID ;

[Amazon][JDBC](11220) Parameters cannot be used with normal Statement objects, use PreparedStatements instead.`

****Below check what I done with this code and it iss now working fine.****


    with ESA AS
(SELECT  DE.AccountID,DE.SourceGroupNumber,
            DE.MemberSourceCode,DE.GroupName,
            ISNULL(DC.ContactValue,'') AS EmailAddress, 
            ISNULL(DN.ContactText2,'') AS FirstName,
            ISNULL(DN.ContactText3,'') AS MiddleName,
            ISNULL(DN.ContactText4,'') AS LastName,
            ISNULL(DN.ContactValue,'') AS ContactName,
            ISNULL(DP.ContactValue,'') AS PhoneNumber,
            SUBSTRING(MBUStateCoverage, 1, 2) AS StateCode 
            
            FROM DCB_A1.FctESF FE 
            
            INNER JOIN DCB_A1.DimESF DE 
            ON FE.EmployerKey = DE.EmployerKey
            
            LEFT JOIN DCA_A1.DimECSF DC 
            ON DE.MemberSourceCode = DC.MemberSourceCode
            AND DE.PurchaserOrgNumber = DC.PurchaserOrgNumber
            AND DE.PurchaserOrgTypeCode = DC.PurchaserOrgTypeCode  
            AND DC.ContactType = 3  
            --and DC.ContactValue <> '' -- Do we need this?
            
            LEFT JOIN DCA_A1.DimECSF DN 
            ON DE.MemberSourceCode = DN.MemberSourceCode
            AND DE.PurchaserOrgNumber = DN.PurchaserOrgNumber
            AND DE.PurchaserOrgTypeCode = DN.PurchaserOrgTypeCode  
            AND DN.ContactType = 4
            AND DN.IsActiveVersion = 1
            
            LEFT JOIN DCA_A1.DimECSF DP 
            ON DE.MemberSourceCode = DP.MemberSourceCode
            AND DE.PurchaserOrgNumber = DP.PurchaserOrgNumber
            AND DE.PurchaserOrgTypeCode = DP.PurchaserOrgTypeCode  
            AND DP.ContactType = 1  
            
            LEFT JOIN DCB_A1.DimMKBU DM 
            ON FE.MBUKey = DM.MarketingBusinessUnitKey
                
            WHERE DE.IsActiveVersion = 2
            --and DE.AccountID =2
            --AND (DE.CustomerStatusCode = 'A' 
            -- OR DE.IsEmployerActive = 0)
            ----and CustomerStatusCode = 'A'
            --AND (MBUStateCoverage <> '' and MBUStateCoverage <> 'UNK')
            
            GROUP BY DE.AccountID,DE.SourceGroupNumber,
            DE.MemberSourceCode,DE.GroupName,
            ISNULL(DC.ContactValue,''), ISNULL(DN.ContactValue,''),
            ISNULL(DN.ContactText2,''),ISNULL(DN.ContactText3,''),
            ISNULL(DN.ContactText4,''),ISNULL(DP.ContactValue,'') ,SUBSTRING(MBUStateCoverage, 1, 2)
),  EMK AS (
       SELECT DISTINCT 
            1 AS RecordSource, --CMD
            EA.EmployerKey,
            EA.AccountID, 
            ISNULL(EA.SourceGroupNumber,'NA') AS SourceGroupNumber,
            EA.SourceSubGroupNumber,
            EA.PurchaserOrgNumber,
            ISNULL(EA.GroupName,'NA') AS GroupName, 
            EA.MemberSourceCode
        FROM
            (SELECT  FE.EmployerKey, DE.AccountID, DE.SourceGroupNumber, DE.SourceSubGroupNumber,DE.PurchaserOrgNumber,
            DE.GroupName, DE.MemberSourceCode
            
            FROM DCB_A1.FctESF FE 
            
            INNER JOIN DCB_A1.DimESF DE 
            ON FE.EmployerKey = DE.EmployerKey

            LEFT JOIN DCB_A1.DimMKBU DM 
            ON FE.MBUKey = DM.MarketingBusinessUnitKey
                
            where DE.IsActiveVersion = 1 AND DE.PurchaserOrgTerminationReasonCode <> '171'
            --and (MBUStateCoverage <> '' and MBUStateCoverage <> 'UNK')
            
            GROUP BY FE.EmployerKey, DE.AccountID, DE.SourceGroupNumber,DE.SourceSubGroupNumber,DE.PurchaserOrgNumber,
             DE.GroupName, DE.MemberSourceCode)EA
    )
    Select Distinct 
    RecordSource
    ,EMK.EmployerKey
    ,EMK.AccountID
    ,EMK.SourceGroupNumber
    ,EMK.SourceSubGroupNumber
    ,EMK.PurchaserOrgNumber
    ,EMK.GroupName
    ,EMK.MemberSourceCode
    ,CASE WHEN ESA.EmailAddress = '' OR ESA.EmailAddress IS NULL THEN 'UNKNOWN@UNKNOWN.COM' 
            ELSE ESA.EmailAddress END AS EmailAddress
    ,CASE WHEN ESA.FirstName='' OR ESA.FirstName IS NULL THEN 'NA' ELSE ESA.FirstName END AS FirstName
    ,CASE WHEN ESA.MiddleName='' OR ESA.MiddleName IS NULL THEN 'NA' ELSE  ESA.MiddleName END AS MiddleInitial
    ,CASE WHEN ESA.LastName='' OR ESA.LastName IS NULL THEN 'NA' ELSE ESA.LastName END AS LastName
    ,CASE WHEN ESA.ContactName='' OR ESA.ContactName IS NULL THEN 'NA' ELSE ESA.ContactName END AS ContactName
    ,ISNULL(ESA.PhoneNumber,'NA') As PhoneNumber
    ,'NA' AS MobilePhoneNumber
    ,'GROUP ADMINISTRATOR' AS ContactType
    ,ESA.StateCode
    INTO DCB_A1.Temp_abc
    FROM EMK 
    INNER JOIN ESA
    ON EMK.AccountID = ESA.AccountID ;
Ajeet Verma
  • 1,021
  • 1
  • 7
  • 25