Let's assume that I have a big file (500GB+) and a record declaration Sample that represents a row in that file:
    data Sample = Sample
      { field1 :: Int
      , field2 :: Int
      }
Now, what data structure is suitable for processing (filter/map/fold) a collection of these Sample values? Don Stewart has answered here that the data should not be treated as a list ([Sample]) but as a Vector. My question is: how does representing it as a Vector solve the problem? Won't representing the file contents as a vector of Sample also occupy around 500GB?
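To make the kind of processing I mean concrete, here is a minimal sketch over a plain list. The row format (two space-separated integers per line), parseSample, and sumField2WherePositive are assumptions made up for illustration; the real file format is not specified here, and this is not meant as a solution to the memory question.

```haskell
import Data.List (foldl')
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

data Sample = Sample
  { field1 :: Int
  , field2 :: Int
  } deriving (Show, Eq)

-- Hypothetical row format: two space-separated integers per line.
-- Lines that do not match are rejected with Nothing.
parseSample :: String -> Maybe Sample
parseSample line = case words line of
  [a, b] -> Sample <$> readMaybe a <*> readMaybe b
  _      -> Nothing

-- filter, then map, then a strict left fold over the samples.
sumField2WherePositive :: [Sample] -> Int
sumField2WherePositive =
  foldl' (+) 0 . map field2 . filter ((> 0) . field1)

main :: IO ()
main = do
  -- A tiny in-memory stand-in for the lines of the real file.
  let rows = ["1 10", "-2 20", "3 30", "garbage"]
  print (sumField2WherePositive (mapMaybe parseSample rows))  -- prints 40
```

With the real 500GB+ file, rows would come from the file contents rather than from a literal list, which is exactly where the choice of data structure starts to matter.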
What is the recommended method for solving this type of problem?