1

Consider having a table like this:

CREATE TABLE Product (
    Id int PRIMARY KEY CLUSTERED,
    InvoicesStr varchar(max)
)

which InvoicesStr is concatenated Ids of the invoices containing this Product. I know it is not a proper design, but I just brought it up to demonstrate the problem I want to describe.

So the table data would be something like this:

Product

Id  | InvoicesStr
----|-------------------------------------
1   | 4,5,6,7,34,6,78,967,3,534,
2   | 454,767,344,567,89676,4435,3,434,

After selling millions of products the InvoicesStr would contain a very large string. Consider a situation in which for a row, this column contains a very big string, say a 1GB string.

I want to know about the performance for such an update query:

UPDATE Product
SET InvoiceStr = InvoiceStr + '584,'
WHERE Id = 100

Is the performance of this query dependent on the size of InvoiceStr? Or is SQL Server smart enough to just append the new string and not replace it completely?

mehrandvd
  • 8,806
  • 12
  • 64
  • 111
  • please read https://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad and consider another approach – nbk Jul 02 '22 at 22:42
  • 3
    There are undocumented (or at least minimally documented) functions like `.write()` that could help to at least minimally log messing with large strings. But all in all, why would you want to go through all that trouble, just to go against the very principle of database design? Why do you have a 1GB string with many values, when they should have been tiny rows? Sql server wasn't meant for this, it was designed for relational databases. – HoneyBadger Jul 02 '22 at 23:54
  • 1
    "is SQL Server smart enough to just append the new string and not replace it completely?" in most languages strings are immutable, so appending to a string means creating a new string. It's not a matter of smart, it's a matter of what a datatype allows. Appending and creating a new string are the same thing, in reality – HoneyBadger Jul 02 '22 at 23:57
  • @HoneyBadger See `.WRITE` https://learn.microsoft.com/en-us/sql/t-sql/queries/update-transact-sql?view=sql-server-ver16#updating-lobs – Charlieface Jul 03 '22 at 01:30
  • @Charlieface slightly more documented than I remember, thanks. Still, I would call that minimally documented – HoneyBadger Jul 03 '22 at 01:49

1 Answers1

4

You can use the little-known .WRITE syntax to append or modify text/data in a max column.

This does an efficient append or replace (minimally logged if possible), and is useful for modifying large values. Note that SQL Server modifies only whole 8k pages, so the minimum amount of modified data would be 8k (unless the existing data exactly filled a page).

For example

UPDATE Product
SET InvoicesStr.WRITE('100,', NULL, NULL)
WHERE Id = 2;

db<>fiddle

In reality, there is usually little reason to actually use this syntax, because you would not normally have such a denormalized design. And if you were storing something like pictures or audio, you would just replace the whole value.

Charlieface
  • 52,284
  • 6
  • 19
  • 43