
What is the fastest way of updating a number of rows in a table (roughly 1 million), using the technologies in the tags?

The statements look something like this:

UPDATE MyTable SET MyColumn = 2 WHERE MyTable.ID = 1
UPDATE MyTable SET MyColumn = 3 WHERE MyTable.ID = 3
UPDATE MyTable SET MyColumn = 5 WHERE MyTable.ID = 7
UPDATE MyTable SET MyColumn = 1 WHERE MyTable.ID = 45
UPDATE MyTable SET MyColumn = 0 WHERE MyTable.ID = 234

And so on. Additional description:

  • No pattern to the numbers.
  • The source is a flat file that contains changes to existing records.
  • The file contains modifications to existing records, and I have to reflect those changes in the database. Note that I already know the IDs; that is not a problem.

Another possible description: given a list of IDs, each paired with a value, what is the fastest way to update the records whose primary key (MyTable.ID) matches an ID, setting MyColumn to the paired value? To be clearer, I am given a (source) file from which I extract the data for importing (updating) into the database. The source file looks like this:

2,   1
3,   3
5,   7
1,   45
0,   234

I will accept an answer using pure T-SQL, or something from an ORM or any other provider.

JJ_Jason
  • What is the source of your updates? It doesn't look like there is a pattern. I.e., how do you know what value to set for a given ID? – David Peden Mar 17 '12 at 03:00
  • No pattern. That's the question, basically. The source is a flat file, the code imports (updates) data. – JJ_Jason Mar 17 '12 at 03:01
  • 3
    Is the question how to do this in C#? It is kind of confusing what language you expect the answer to be in (C# or TSQL). – M.Babcock Mar 17 '12 at 03:01
  • @JJ_Jason Your subject says "hardcoded set" so, again, what is the source of your updates? A text file, a CSV, a table...? – David Peden Mar 17 '12 at 03:03
  • @M.Babcock Does it really matter. Both? – JJ_Jason Mar 17 '12 at 03:03
  • 1
    @JJ_Jason - If it doesn't matter then your question is likely too broad to only have _one answer_ (a requirement of SO). – M.Babcock Mar 17 '12 at 03:05
  • Ok then, in C# and/or TSQL. I'm open for suggestions. – JJ_Jason Mar 17 '12 at 03:06
  • @JJ_Jason - Pick a language or your question will likely end up closed. A suggestion... TSQL. – M.Babcock Mar 17 '12 at 03:07
  • @M.Babcock Tagging is not enough? Ok then, C# it is. I saw answers with different approaches before, so I don't know what the fuss is about. – JJ_Jason Mar 17 '12 at 03:08
  • @JJ_Jason - No because tagging more than 1 language tag (especially ones that are unrelated) makes your question not specific enough to be answered. – M.Babcock Mar 17 '12 at 03:11
  • I guess if I don't get the satisfactory response, I just duplicate the question with the other language? – JJ_Jason Mar 17 '12 at 03:19
  • @JJ_Jason - Duplicate questions are discouraged on the Stack Exchange network and will be handled through moderation if necessary. Perhaps you should figure out what you want before asking... – M.Babcock Mar 17 '12 at 03:33
  • I believe that my question is perfectly understandable. What your problem is, is not. – JJ_Jason Mar 17 '12 at 03:35
  • @JJ_Jason - I did not say that your question was not _understandable_ more that it does not fit the generally accepted format of SO. Your question cannot be answered with a single answer because it asks for the _fastest_ way to do something in 2 _very_ **different** languages. Questions on SO are expected to have _a_ correct answer. This one has at least 2... – M.Babcock Mar 17 '12 at 03:43
  • So how would I, get both those answers? I said fastest... Time is measurable. One answer - best answer. Get it now? – JJ_Jason Mar 17 '12 at 03:47
  • 1
    @JJ_Jason - Try measuring it yourself? This is definitely not something you should be dependent on someone else to do for you... – M.Babcock Mar 17 '12 at 03:48
  • My God are you hard. I have a clock, I just need two, three, fifteen, one, different approaches? Bulk update, linq2sql, entity, a stroke of lightning... Anything using C# and/or TSQL. – JJ_Jason Mar 17 '12 at 03:51
  • 3
    Keep in mind that **tags describe the *question*, not the [possible] answers**. And beyond the debate about [tag:c#] and [tag:tsql], where in the world does [tag:winforms] come into this? – Cody Gray - on strike Mar 17 '12 at 03:55
  • 1
    Thank you again, for the at least 0 correct answers. – JJ_Jason Mar 17 '12 at 04:17
  • Nice moderating team... I disagree with your desc of my question. It can be answered, pretty well. So it will remain closed, cause there's nothing more to it, I guess, then updating it this way. – JJ_Jason Mar 17 '12 at 15:29

2 Answers


It depends on the format of your update file. If it's a delimited file or an Excel spreadsheet, it's pretty simple: use the SQL Server bulk import tool to import the file into a temp table, then update via a join. If your file is not in a format that is simple to import, put it in a format that is simple to import.

UPDATE
    Table
SET
    Table.col1 = other_table.col1,
    Table.col2 = other_table.col2
FROM
    Table
INNER JOIN
    other_table
ON
    Table.id = other_table.id

https://stackoverflow.com/questions/2334712/sql-server-update-from-select
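The import step itself might look something like the sketch below, assuming the spreadsheet is first saved as a comma-delimited file. The file path and the #Updates temp table name are placeholders of my own choosing, and the columns follow the value-then-ID layout shown in the question:

-- staging table in file column order: value first, then ID
CREATE TABLE #Updates (MyColumn int, ID int)

BULK INSERT #Updates
FROM 'C:\updates.csv'   -- placeholder path to the exported file
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')

#Updates then plays the role of other_table in the UPDATE above, with #Updates.ID joined to MyTable.ID and #Updates.MyColumn supplying the new value.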

Brian Mieras
  • It's a spreadsheet; the format is not handy, but it could be made *importable*. I would like to avoid bulk insert. The FROM clause, can it be constructed as a series of unions, to skip it? Any other approach? (BTW, best *one* answer so far) – JJ_Jason Mar 17 '12 at 03:23
  • @JJ_Jason Please update your question to include any relevant information, we don't have to guess stuff like what your update file actually is. – yannis Mar 17 '12 at 03:35
  • Why does it matter? All the info I can get from the file is IDs and the values to update the records with. I get a series of IDs accompanied by new values, and I have to update the existing records (through their IDs) to the new values. – JJ_Jason Mar 17 '12 at 03:38
  • 2
    @jj_jason If you want help, let the community judge if it matters. You are leaving out relevant, if not important, details. – David Peden Mar 17 '12 at 03:40
  • 1
    You are nitpicking, and badgering. – JJ_Jason Mar 17 '12 at 03:43
  • Please point me to the details I left out; I will gladly give them. I just don't know what is missing. – JJ_Jason Mar 17 '12 at 03:48
  • Could you tell us why you would like to avoid the bulk import? If you want to avoid it because you want to do these updates remotely, something like what Ken suggested doing with the batch updates is best. These are the kind of things that M.Babcock is pointing to. If you don't like an answer, give us the details of why not. It will help people with similar problems in the future find solutions quickly. The more specific you can be on your situation, the better. – Brian Mieras Mar 17 '12 at 04:07

Brian's solution, in some shape or form, is the way to go. For that amount of data, if you have the disk space to work with, a full import followed by the UPDATE FROM query is the simplest and fastest method.

If you don't have the disk space, then you'll need to work in batches: import 5000 or so, update, delete the imported rows, and repeat. That's a full T-SQL approach. You could use the BCP utility or the BULK INSERT statement to import the data.
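A rough sketch of one pass of that loop, assuming the spreadsheet has been exported as a comma-delimited file (the path, the #Chunk staging table name and the row range are placeholders; the next pass would use rows 5001-10000, and so on):

CREATE TABLE #Chunk (MyColumn int, ID int)

-- one pass: pull only rows 1 to 5000 of the file into the staging table
BULK INSERT #Chunk
FROM 'C:\updates.csv'   -- placeholder path
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 1, LASTROW = 5000)

-- apply this batch to the target table via a join on the primary key
UPDATE t
SET t.MyColumn = c.MyColumn
FROM MyTable t
INNER JOIN #Chunk c ON t.ID = c.ID

TRUNCATE TABLE #Chunk   -- "delete imported", then repeat with the next row range

Wrapping this in a WHILE loop needs a bit of dynamic SQL, because FIRSTROW and LASTROW only accept constants, but each pass keeps the staging table small.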

For a C#-based solution, you can use the .NET SqlBulkCopy class and provide it an IDataReader source, either a custom reader or one from an OdbcConnection.

This is also a scenario you might consider building an SSIS package for.

Ken
  • Disk space is not an issue, only speed; and simplicity would be nice. Brian's answer seems best for now. – JJ_Jason Mar 17 '12 at 03:44
  • How does one use SqlBulkCopy to perform updates? – JJ_Jason Mar 17 '12 at 04:03
  • Sorry. For the C# solution, SqlBulkCopy would be used in place of BCP or BULK INSERT for the import step. Then, you could execute the UPDATE FROM query from an SqlConnection. This is definitely not the most straightforward option, but I've found it useful when complex business rules were thrown in the mix. – Ken Mar 17 '12 at 04:12
  • I believe I will go with the solution Brian posted. Thanks for the effort. – JJ_Jason Mar 17 '12 at 04:18