
I have over 20 SAS (sas7bdat) files, all with the same columns, that I want to read in Python. I need an iterative process to read all the files and rbind them into one big df. This is what I have so far, but it throws an error saying there are no objects to concatenate.

import pyreadstat
import glob
import os
import pandas as pd

path = r'C:\Users\myfolder'  # or unix / linux / mac path
all_files = glob.glob(os.path.join(path , "/*.sas7bdat"))

li = []

for filename in all_files:
    reader = pyreadstat.read_file_in_chunks(pyreadstat.read_sas7bdat, filename, chunksize=10000, usecols=cols)
    for df, meta in reader:
        li.append(df)
    frame = pd.concat(li, axis=0)

I found this answer on reading in CSV files helpful: Import multiple CSV files into pandas and concatenate into one DataFrame

AAA
  • What is wrong with the posted code? Error? Undesired result? Look into `pandas.concat` to row bind a list of DataFrames. Maybe `pandas.concat(df for df, meta in reader)`? (a sketch of this appears after these comments) – Parfait Jan 22 '23 at 04:15
  • Maybe initialize an empty list to hold all dataframes, append each dataframe to that list inside the loop over all your files, and then pass that list to `pd.concat()`. How huge are your files? – AlexK Jan 22 '23 at 08:26
  • @AlexK Tried it, it won't work. All files put together are around 25 GB. – AAA Jan 23 '23 at 19:41
  • Do you have that much RAM? Can't test your code, but have you tried to debug? Are you able to read each individual file? What does `li` contain before you pass it to `pd.concat`? Also, your last line should be outside the outer loop. – AlexK Jan 23 '23 at 22:01
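
Parfait's comment above suggests passing a generator expression straight to `pandas.concat`; extended across all the files, a minimal sketch (assuming `all_files` and `cols` are defined as in the question) looks like this:

import itertools

import pandas as pd
import pyreadstat

# chain the chunk generators of every file into one stream of DataFrames
chunks = itertools.chain.from_iterable(
    (df for df, meta in pyreadstat.read_file_in_chunks(
        pyreadstat.read_sas7bdat, filename, chunksize=10000, usecols=cols))
    for filename in all_files
)
frame = pd.concat(chunks, axis=0, ignore_index=True)

`pandas.concat` accepts any iterable of DataFrames, so the intermediate list is not strictly required, although peak memory at the final concatenation is about the same either way.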

1 Answer


So if one has SAS data files that are too big and plans to append all of them into one df, then:

# reading in chunks (chunksize) avoids the RAM from crashing...
li = []
for filename in all_files:
    # cols: the list of columns to read (as in the question)
    reader = pyreadstat.read_file_in_chunks(
        pyreadstat.read_sas7bdat, filename, chunksize=10000, usecols=cols
    )
    for df, meta in reader:
        li.append(df)
# concatenate once, outside the loop over the files
frame = pd.concat(li, axis=0, ignore_index=True)
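
For completeness, here is a minimal self-contained sketch of the whole pipeline; the folder path and the `cols` list are placeholders to replace with your own. Note the glob pattern has no leading slash: with "/*.sas7bdat", `os.path.join` drops the folder part, which is the most likely reason `all_files` came back empty and `pd.concat` raised "No objects to concatenate" in the original code.

import glob
import os

import pandas as pd
import pyreadstat

path = r'C:\Users\myfolder'   # folder holding the .sas7bdat files
cols = ['col_a', 'col_b']     # placeholder: the columns you actually need

# no leading slash in the pattern, so os.path.join keeps the folder part
all_files = glob.glob(os.path.join(path, '*.sas7bdat'))

li = []
for filename in all_files:
    # read each file in chunks of 10,000 rows, keeping only the needed columns
    reader = pyreadstat.read_file_in_chunks(
        pyreadstat.read_sas7bdat, filename, chunksize=10000, usecols=cols
    )
    for df, meta in reader:
        li.append(df)

frame = pd.concat(li, axis=0, ignore_index=True)

Keep in mind that all chunks still accumulate in `li`, so the final `pd.concat` has to hold the full result (roughly 25 GB here) in RAM; chunked reading by itself does not cap peak memory.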
AAA