0

Im trying to add 10 Million products to my products table.

That was my attempt so far:

INSERT INTO Artikel (Hersteller, Artikelnummer, Artikelnamen, Artikelbeschreibung, Preis)
VALUES (
    
    (SELECT TOP 1 Hersteller FROM Artikel ORDER BY NEWID()) ,
    (SELECT FLOOR(RAND() * (100000000000-101 + 1)) + 101 ) ,
    (SELECT REPLACE(NEWID(),'-','')),
    (SELECT REPLACE(NEWID(),'-','')),
    (SELECT ROUND(RAND(CHECKSUM(NEWID())) * (9999), 2))
)
GO 10000000

But this takes forever. After ~45 minutes my query was nowhere near 200K values. Are there any faster/more efficient solutions?

Mike
  • 11
  • 1
  • 4
    Well yeah -- craft a query that returns 10 million rows, instead of inserting 10 million rows one by one. Look up things like tally/number tables. – Jeroen Mostert Nov 22 '22 at 15:15
  • @JeroenMostert Thats not really what I want. I need 10 Million unique rows. Selecting from a different table isn't really feasible for that. – Mike Nov 22 '22 at 15:21
  • for the first field you can use directly NEWID() instead of a query with top and order by – Joe Taras Nov 22 '22 at 15:22
  • I'm not really sure why you are selecting a "random" row from the table you are inserting into (`Artikel`). That `TOP (1)` query is also going to get *slower and slower* each iteration. – Thom A Nov 22 '22 at 15:23
  • But you must consider to change point of view in according wirh @Jeroen comment. Otherwise if you have a file with this sample data you can import with a BULK INSERT operation – Joe Taras Nov 22 '22 at 15:23

1 Answers1

1

What you actually want is unclear, however, you could likely get some very good performance by doing this in only a couple of batches. I don't understand why you are getting a value(of Hersteller) from your table (Artikel) only to insert it into the table again, but I've incorporated that anyway.

This does the INSERT in 2 batches of 5,000,000:

WITH N AS(
    SELECT N
    FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
    FROM N N1, N N2, N N3, N N4, N N5, N N6, N N7), --10,000,000 rows
Dataset AS(
    SELECT TOP (5000000)
           A.Hersteller
    FROM dbo.Artikel A
         CROSS JOIN Tally T)
INSERT INTO dbo.Artikel
SELECT D.Hersteller,
       FLOOR(RAND() * (100000000000-101 + 1)) + 101, --This'll be the same for every row, is that intended?
       REPLACE(NEWID(),'-',''),
       REPLACE(NEWID(),'-',''),
       ROUND(RAND(CHECKSUM(NEWID())) * (9999), 2) --This'll be the same for every row, is that intended?
FROM Dataset D;
GO 2

Note my comment about RAND, and that it'll produce the same value on every row (within the batch). If that isn't desired then see this post about making a random number per row: How do I generate a random number for each row in a T-SQL select?

Thom A
  • 88,727
  • 11
  • 45
  • 75