I have 600 million records in my database, and I want to load them all with this query:
var query = (from p in behzad.test
             select p).ToArray();
Can I load that?
Highly improbable... There are multiple "walls" you'll encounter.
First... Let's say that each record is just an ID. Just to store 600 million IDs in the most compact way possible, you need 600 million * 4 bytes = 2.4 GB. But I don't think your objects are that small. Then there are the references to those objects... each reference is 8 bytes on a 64-bit runtime...
http://codeblog.jonskeet.uk/2011/04/05/of-memory-and-strings/ here Skeet calculated that the minimum memory used by an object at 64 bits is 24 bytes, so 14.4 GB (this doesn't include a reference to the object; the new object[size] is allocated before the GC.GetTotalMemory(true) call). In that space you can fit up to two int fields without making the object any bigger (the ID and one other int field, for example; on the same page, search for the table with TwoInt32s).
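Skeet's measurement technique can be reproduced with a small sketch like this (the TwoInt32s class name follows his post; the exact per-object number depends on the runtime and platform):

```csharp
using System;

class TwoInt32s
{
    public int A, B;
}

class Program
{
    const int Size = 1_000_000;

    static void Main()
    {
        // Allocate the reference array first, so only the objects
        // themselves are counted in the before/after delta.
        var array = new TwoInt32s[Size];
        long before = GC.GetTotalMemory(true);
        for (int i = 0; i < Size; i++)
            array[i] = new TwoInt32s();
        long after = GC.GetTotalMemory(true);

        // On a 64-bit runtime this prints roughly 24 bytes per object.
        Console.WriteLine("Bytes per object: {0}", (after - before) / (double)Size);
        GC.KeepAlive(array);
    }
}
```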
Then there is the problem that LINQ-to-SQL duplicates the records as they are loaded (one copy goes to its object tracker). You can disable this with ObjectTrackingEnabled: https://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.objecttrackingenabled.aspx
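Disabling tracking is a one-line change, done before executing any query (assuming behzad is your DataContext, as in the question):

```csharp
// Read-only use: don't keep a second, tracked copy of every entity.
// Must be set before the first query runs on this DataContext.
behzad.ObjectTrackingEnabled = false;

var query = (from p in behzad.test
             select p).ToArray();
```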
Then there was a limit of a maximum of 2 GB of memory for each array... with references of 8 bytes (64 bits), that would be around 260 million records per array. That limit was removed in .NET 4.5 at 64 bits with gcAllowVeryLargeObjects: https://msdn.microsoft.com/en-us/library/hh285054.aspx
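Note that arrays bigger than 2 GB require an explicit opt-in in the application config, and only work in a 64-bit process (this is the gcAllowVeryLargeObjects element documented at the link above):

```xml
<!-- app.config -->
<configuration>
  <runtime>
    <!-- Allow arrays larger than 2 GB on 64-bit platforms (.NET 4.5+). -->
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>
```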
Then there is another problem with ToArray(): to build the array it grows its buffer by doubling, using powers of 2 (4 elements, 8 elements, 16, 32, 64, 128... 1024... 1024*1024 elements, ...). So before building an array of 600 million references (that alone is 4.8 GB, but it is already included in the 24 bytes/element, so we don't count it), it will build a temporary array of 512 million elements (4 GB), then one of 1024 million elements (8 GB, of which 3.2 GB are wasted), and then it will resize the 1024-million-element array down to 600 million.
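The growth sequence is easy to verify with a sketch (this mimics the doubling strategy of the internal buffer; it is an illustration of the algorithm, not the actual ToArray() source):

```csharp
using System;

class Program
{
    static void Main()
    {
        const long target = 600_000_000;   // number of records
        const long refSize = 8;            // bytes per reference at 64 bits

        long capacity = 4;                 // buffer sizes are powers of 2
        while (capacity < target)
        {
            capacity *= 2;
            Console.WriteLine("Buffer: {0:N0} elements = {1:N1} GB",
                capacity, capacity * refSize / (1024.0 * 1024 * 1024));
        }
        // The last doubling allocates a 1,073,741,824-element buffer
        // (8 GB of references) before it is copied down to a
        // 600,000,000-element array.
    }
}
```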
So just before the end of the ToArray() you will have the objects and the temporary arrays in memory at the same time. So at minimum you'll need 14.4 + 8.0 + 5.4 = 27.8 GB... for two ints... And this is a "theoretical minimum". I would increase it by at least 50-100% to get a "practical minimum". And you are still only handling two ints per record :-)
Now... Alexei suggested using structs. Structs have the advantage that they don't require a reference, and their size is nearly exactly the size of their fields.
You could change the query to:
var query = (from p in behzad.test select p)
    .Select(p => new MyStruct { ID = p.ID, Foo = p.Foo })
    .ToArray();
where MyStruct is a struct holding the two int fields:
struct MyStruct
{
    public int ID;
    public int Foo;
}
Now each element is 8 bytes big, so you only pay for the arrays themselves: the temporary 1024-million-element array (8.6 GB) plus the final 600-million-element array (4.8 GB), for a total of 13.4 GB.
Note that using a reference type like a string inside your element is still a big no-no (it would eat far too much memory).