8

I've got this query

UPDATE linkeddb...table SET field1 = 'Y' WHERE column1 = '1234'

This takes 23 seconds to select and update one row.

But if I use openquery (which I don't want to) then it only takes half a second.

The reason I don't want to use openquery is that I want to add parameters to my query securely and be safe from SQL injection.

Does anyone know of any reason for it to be running so slowly?

Jamie Taylor
  • Any clues from the query execution plan? Or you could set up SQL Profiler to watch the database and see what openquery is doing differently. – Rup Dec 08 '10 at 17:42

4 Answers

9

Here's a thought as an alternative. Create a stored procedure on the remote server to perform the update and then call that procedure from your local instance.

/* On remote server (table and column names are the OP's placeholders) */
create procedure UpdateTable
    @field1 char(1),
    @column1 varchar(50)
as
    /* brackets because "table" is a reserved word */
    update [table]
        set field1 = @field1
        where column1 = @column1
go

/* On local server */
exec linkeddb...UpdateTable @field1 = 'Y', @column1 = '1234'
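
Because the values travel as procedure parameters instead of being concatenated into a SQL string, this also gives you the injection safety you're after. A minimal sketch of a parameterized call (the local variable names are made up):

/* On local server -- the values stay parameterized end to end */
declare @f char(1), @c varchar(50)
set @f = 'Y'
set @c = '1234'
exec linkeddb...UpdateTable @field1 = @f, @column1 = @c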
Joe Stefanelli
  • @Joe - is a stored procedure basically just a function/method for a database? – orokusaki Dec 08 '10 at 17:28
  • @orokusaki: Yes, you can think of it that way. – Joe Stefanelli Dec 08 '10 at 17:29
  • @Joe This will run only slightly faster, due to removing the parse and compile time. @orokusaki It is still better than your method as it lets the database protect itself from SQL injection. You can also set permissions so the user can't update the table directly and can only do it via the SP (a permissions sketch follows these comments). – stevenrcfox Dec 08 '10 at 18:09
  • 1
    @Overflow: Actually, I expect a more significant improvement as the workload for performing the update is shifted to the remote server's SQL engine. In the OP's current version, the local SQL engine is trying to update the remote resource. – Joe Stefanelli Dec 08 '10 at 18:12
  • @Joe, Does this behaviour depend on specific rdbms or drivers? I ran both versions looking at the data over the network with wireshark, both resulted in query sent to server, data sent back. What should I expect to see if the local sql engine was running the query? I'd have imagined either getting data to local cache and sending bulk updated data back, or row by row retrieval and update. – stevenrcfox Dec 09 '10 at 11:33
  • This would help mitigate potential collation issues. In addition, you would only need to give EXEC permissions to the one SP for the appropriate login instead of direct table permissions. – Cade Roux Dec 10 '10 at 04:47
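
A minimal sketch of the permissions idea from the comments above ("app_user" is a hypothetical database user):

/* On the remote server: allow the login to run the procedure,
   but deny direct updates against the table */
grant execute on UpdateTable to app_user
deny update on [table] to app_user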
4

If you're looking for the why, here's a possibility from Linchi Shea's Blog:

To create the best query plans when you are using a table on a linked server, the query processor must have data distribution statistics from the linked server. Users that have limited permissions on any columns of the table might not have sufficient permissions to obtain all the useful statistics, and might receive a less efficient query plan and experience poor performance. If the linked server is an instance of SQL Server, to obtain all available statistics, the user must own the table or be a member of the sysadmin fixed server role, the db_owner fixed database role, or the db_ddladmin fixed database role on the linked server.

(Because of Linchi's post, this clarification has been added to the latest Books Online SQL Server documentation).

In other words, if the linked server is set up with a user that has limited permissions, then SQL can't retrieve accurate statistics for the table and might choose a poor method for executing a query, including retrieving all rows.

Here's a related SO question about linked server query performance. Their conclusion was: use OpenQuery for best performance.
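
For comparison, the pass-through form that performs well is roughly the following (a sketch using the OP's placeholder names; note that OPENQUERY only accepts a literal string, which is exactly why it is awkward to parameterize safely):

/* The remote server compiles and runs the inner query itself */
update openquery(linkeddb, 'select field1 from [table] where column1 = ''1234''')
    set field1 = 'Y'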

Update: some additional excellent posts about linked server performance from Linchi's blog.

BradC
2

Is column1 the primary key? Probably not. Try selecting the records to update by the primary key (WHERE PK_field = xxx); otherwise all records may be read just to find the rows to update.
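
One quick way to check is to list the indexes on the remote table (a sketch; "table" is just the OP's placeholder name):

/* Run this on the remote server */
exec sp_helpindex 'table'

If column1 does not appear in any index, a remote scan of every row is the only option available.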

Arvo
  • I'm pretty sure that column1 is the primary key, but looking at the execution plan in SQL Query Analyser it shows that the remote scan is taking the longest, so it seems it is going through all 40,000 records. – Jamie Taylor Nov 18 '10 at 12:20
  • 2
    Hmm, if your PK is (n)varchar, then you may have a collation issue (I mean SQL may not use the index because it doesn't know the collation). I have no experience with non-integer PK fields, though. – Arvo Nov 18 '10 at 13:11
  • @Jamie, I second Arvo's theory of collation being a potential issue (a possible mitigation is sketched after these comments) – stevenrcfox Dec 08 '10 at 18:10
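
If collation really is the culprit, one possible mitigation (a sketch; only appropriate when both servers genuinely use the same collation and character set) is to mark the linked server as collation compatible so string comparisons can be evaluated on the remote side:

/* Only set this if the collations really do match on both servers */
exec sp_serveroption 'linkeddb', 'collation compatible', 'true'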
1

Is column1 a varchar field? Is that why you are surrounding the value 1234 with single quotation marks? Or is that simply a typo in your question?

Tim
  • If I run the query in Query Analyser without the quotation marks I get an error, and I believe column1 is a varchar field – Jamie Taylor Nov 18 '10 at 12:54