
I am trying to merge two files using pandas, one of which is very large (6 GB). Whenever I try it, I get a MemoryError, as my RAM (8 GB) is probably too small to handle it. Any ideas on how I could fix this? My code is:

    import pandas as pd

    broad_matched = pd.read_csv("FILE A", delim_whitespace=True)
    broad_matched2 = broad_matched[~(broad_matched['P'] >= 0.05)]
    SNPs = pd.read_csv("FILE B",
                       sep='\t',
                       names=["#CHROM", "POS1", "POS", "rsID", "E", "F"])
    broad_matched2 = broad_matched2.drop(columns=['LOG.OR._SE', 'ID', 'REF', 'ALT',
                                                  'ERRCODE', 'Z_STAT', 'OR', 'OBS_CT',
                                                  'TEST', 'FIRTH.', 'A1', '#CHROM'])
    Table1 = pd.merge(broad_matched2, SNPs, on='POS', how='inner').dropna()
    Table1.to_csv(r'D:/Table1', index=False)
miguel

1 Answer


You should take a look at this post. The solution involves using dask DataFrames, which process data in partitions instead of loading everything into memory at once.
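If installing dask isn't an option, the same underlying idea, streaming the large file in pieces rather than loading it whole, can be sketched with plain pandas using `chunksize`. This is a minimal, self-contained sketch: the file names and tiny DataFrames below are hypothetical stand-ins for FILE A and the SNPs table, and the column names are simplified.

```python
import pandas as pd

# Hypothetical small lookup table (stands in for the SNPs file),
# which fits comfortably in memory.
snps = pd.DataFrame({"POS": [10, 20, 30], "rsID": ["rs1", "rs2", "rs3"]})

# Write a small stand-in for the large file so the sketch is runnable.
big = pd.DataFrame({"POS": [10, 15, 20, 30, 40],
                    "P":   [0.01, 0.2, 0.03, 0.04, 0.5]})
big.to_csv("file_a.csv", index=False)

# Stream the large file in chunks, filter and merge each chunk,
# and keep only the (much smaller) merged pieces.
pieces = []
for chunk in pd.read_csv("file_a.csv", chunksize=2):
    chunk = chunk[chunk["P"] < 0.05]  # drop non-significant rows early
    pieces.append(chunk.merge(snps, on="POS", how="inner"))

table1 = pd.concat(pieces, ignore_index=True)
```

With a real 6 GB file you would pick a much larger `chunksize` (e.g. a few hundred thousand rows) and could also pass `usecols=` to `read_csv` so the dropped columns are never loaded in the first place.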

Sabri B