
I am trying to read a large dataset (DNA sequences, 13GB) in R using the readFastq function. Some datasets open fine, but others (10GB) do not. In addition, the matrices of 12GB (or more) that I need to generate are not processed either. My computer has 16GB of RAM, and memory.limit in R is set to 36000. How can I fix this issue? R throws the following error:

Error: Input/Output
  file(s):
    sar326-2021_R17_S6_R1_001.fastq
  message: 'Calloc' could not allocate memory (250000000 of 1 bytes)
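
A minimal sketch of the kind of call involved, assuming readFastq() comes from the Bioconductor package ShortRead (the file name is the one from the error above):

    library(ShortRead)

    # This tries to load every read of the 13GB file into memory at once,
    # which is what triggers the Calloc allocation failure on a 16GB machine
    reads <- readFastq("sar326-2021_R17_S6_R1_001.fastq")
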
user438383
  • You simply don't seem to have sufficient physical RAM. `read.Fastq` appears to return some kind of sparse object? Maybe some of the files are not sufficiently sparse. – Roland Dec 06 '21 at 13:58
  • Yes, thank you! It seems I do not have enough memory, despite my setup: > sessionInfo() R version 4.1.1 (2021-08-10) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 19042) – Vani Maguire Dec 06 '21 at 16:39

1 Answer


Use memory.limit(). You can raise the default with memory.limit(size=2500), where size is in MB. You need to be running the 64-bit version of R to take real advantage of this.
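
For example, checking and raising the limit looks like this (a sketch; memory.limit() applies on Windows, and size is given in MB):

    # Current limit in MB
    memory.limit()

    # Raise the limit, e.g. to 32000 MB (needs 64-bit R on Windows)
    memory.limit(size = 32000)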

Rfanatic
  • Hi, thank you! Yes, I have already set memory.limit(size=36000), but it still doesn't work, and I am using the 64-bit version of R too. > memory.limit() [1] 16231 > memory.size() [1] 403.39 – Vani Maguire Dec 06 '21 at 13:07
  • @Vani Maguire One solution may be to simply split up the query into smaller chunks – Rfanatic Dec 06 '21 at 13:30
  • I think that is not possible because it is not a regular data frame but millions of reads from a sequencing platform, so I should not split it! – Vani Maguire Dec 06 '21 at 16:40
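
For completeness, FASTQ files can be processed in chunks without loading everything at once. A minimal sketch, assuming readFastq comes from ShortRead, whose FastqStreamer() yields records in blocks (the chunk size and the per-chunk processing are placeholders):

    library(ShortRead)

    # Stream the file in blocks of 1e6 reads instead of reading it whole
    strm <- FastqStreamer("sar326-2021_R17_S6_R1_001.fastq", n = 1e6)

    repeat {
        fq <- yield(strm)            # next chunk of reads (a ShortReadQ object)
        if (length(fq) == 0) break   # end of file

        # ... process this chunk here (placeholder), e.g. accumulate the
        # per-chunk pieces of the matrix instead of building one 12GB object ...
    }

    close(strm)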