Converting from SPSS to Pandas...result gives "b'var_name'" for all variables

Question

I'm trying to convert an SPSS file to Pandas, which is working fine. However, all variables present as "b'variable_name'". It puts a 'b' in front of each variable and single quotes around the original variable name. Is there a way to do this and keep the original variable name?

I've tried to rename the variables, but the quotations throw off the code...and besides...there are a lot of variables, so this is tedious and not ideal.

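import pandas as pd
import savReaderWriter as s  # assumed import; the alias `s` is inferred from the call below

df = pd.DataFrame(list(s.SavReader(r'C:\Users\Nick\Desktop\GitProjects\Data\M2.sav', returnHeader=True,
                                   recodeSysmisTo='NaN', ioUtf8=True, rawMode=True)))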
df.head(10)

# Create a new variable called 'header' from the first row of the dataset
header = df.iloc[0]
# Replace the dataframe with a new one which does not contain the first row
df = df[1:]
# Rename the dataframe's column values with the header variable
M2 = df.rename(columns = header)
M2.head(10)

Here is the resulting dataframe. It's fine, but I need to get rid of the 'b' and the single quotes around each variable name.

score 1 · Accepted Answer · answered Oct 14 '19 at 01:49

For a quick fix, do this:

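# In Python 3, str() on bytes keeps the b'...' wrapper, so decode the bytes instead
header = [h.decode('utf-8') if isinstance(h, bytes) else h for h in df.iloc[0]]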

The b'' prefix means that all your header names are bytes, not strings. This is probably due to the function used to read the .sav file.
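To make the bytes-versus-string point concrete, here is a minimal sketch of the whole header cleanup. It uses a small hand-built dataframe as a stand-in for the one read from the .sav file, and the helper name `decode_name` and the UTF-8 encoding are assumptions for illustration. Note that in Python 3, `str(b'age')` returns the text `"b'age'"`, so the names need an explicit `.decode()`.

```python
import pandas as pd

# Hypothetical stand-in for the frame built from SavReader:
# row 0 holds the header as bytes, the remaining rows hold data.
df = pd.DataFrame([[b'age', b'income'], [34, 52000.0], [41, 61000.0]])

def decode_name(name, encoding='utf-8'):
    """Return a str for bytes input; leave any other value unchanged."""
    return name.decode(encoding) if isinstance(name, bytes) else name

# Decode each header cell, drop the header row, and assign the clean names.
header = [decode_name(h) for h in df.iloc[0]]
M2 = df[1:].reset_index(drop=True)
M2.columns = header

print(list(M2.columns))
```

Because every column is handled in one list comprehension, this avoids renaming the variables one by one, however many there are.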

Florian Bernard