
I have a text file that has 185405149 lines and a header. I am reading in this file within this bash script:

#!/bin/bash
#PBS -N R_Job
#PBS -l walltime=4:00:00
#PBS -l vmem=20gb

module load R/4.2.1

cd filepath/

R --save -q -f script.R

Part of the script is below:

# import the gtex data 
gtex_data <- read.table("/filepath/file.txt", header=TRUE)

I get the error:

Error: cannot allocate vector of size 2.0 Gb
Execution halted

It's got nothing to do with the directory/filepath. I suspect it's to do with memory. Even after zipping the file (e.g. to file.txt.gz) and using the command:

gtex_data <- read.table(gzfile("/filepath/file.txt.gz"), header=TRUE) 

It doesn't read the data.

I've tried with a smaller file, e.g. taking just the first 100 lines of file.txt, saving them under a new name, and loading that, and it works fine.
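
For reference, the smaller test file was made along these lines (the small_file.txt name is just illustrative):

head -n 101 file.txt > small_file.txt    # header line plus the first 100 data lines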

I've even tried increasing vmem (see below). I'm not sure what else to do and would be grateful for advice/help.
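
The vmem change was just a larger request in the PBS header of the job script, e.g. (the 40gb figure is only an example of what I tried):

#PBS -l vmem=40gb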

I've also checked the size of the file.

ls -lh file.txt
-rw-r--r-- 1 ... 107M Oct 26 16:50 file.txt

AvniKaur
    2GB looks suspiciously like a limit of a 32-bit app, or filesystem without *largefiles* support. – Mark Setchell Oct 26 '22 at 22:25
  • Perhaps [reading in large text files in r](https://stackoverflow.com/questions/11782084/reading-in-large-text-files-in-r) or [Reading large data files in R • Bart Aelterman - INBO Tutorials](https://inbo.github.io/tutorials/tutorials/r_large_data_files_handling/) – glenn jackman Oct 27 '22 at 01:29
  • Are you processing stuff in a `grid-engine-client` like environment, or? – Chris Oct 27 '22 at 02:24

0 Answers