I have two CSV files with different numbers of rows.
test1.csv
num,sam
1,1.2
2,1.13
3,0.99
test2.csv
num,sam
1,1.2
2,1.1
3,0.99
4,1.02
I would like to read the sam columns and append them to an empty dataframe. When I read test1.csv, I extract the base file name, test1, and want to put the sam column under the matching column header in the empty dataframe.
import os
import pandas as pd

big_df = pd.DataFrame(columns=['test1', 'test2'])
pwd = os.getcwd()
for file in os.listdir(pwd):
    filename = os.fsdecode(file)
    if filename.endswith(".csv"):
        prog = filename.split('.')[0]  # test1 test2
        df = pd.read_csv(filename, usecols=['sam'])
        # The read dataframe has one column
        # Move/append that column to the big_df where column == prog
        big_df[prog] = df
print(big_df)
But big_df is missing the fourth row of test2.csv:
   test1  test2
0   1.20   1.20
1   1.13   1.10
2   0.99   0.99
I expect to see:
   test1  test2
0   1.20   1.20
1   1.13   1.10
2   0.99   0.99
3    NaN   1.02
How can I fix that?
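One possible direction, as a minimal sketch rather than a definitive fix: assigning a column into an existing dataframe aligns the new values to the rows that dataframe already has, so any extra rows are dropped. Collecting each sam column first and combining them with `pd.concat(..., axis=1)` keeps the union of the row labels instead. The helper name `collect_sam_columns` and the temporary directory are made up for this illustration; the temporary directory stands in for the current working directory and is filled with copies of the two sample files.

```python
import tempfile
from pathlib import Path

import pandas as pd


def collect_sam_columns(directory):
    """Return one DataFrame with a 'sam' column per CSV file, named after the file."""
    columns = {}
    for path in sorted(Path(directory).glob("*.csv")):
        # path.stem is the base name without the extension, e.g. "test1"
        columns[path.stem] = pd.read_csv(path, usecols=["sam"])["sam"]
    # axis=1 places the Series side by side; the default outer join keeps the
    # union of the row labels, so shorter columns are padded with NaN.
    return pd.concat(columns, axis=1)


# Recreate the two sample files in a temporary directory (hypothetical stand-in
# for os.getcwd() in the question).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "test1.csv").write_text("num,sam\n1,1.2\n2,1.13\n3,0.99\n")
    Path(tmp, "test2.csv").write_text("num,sam\n1,1.2\n2,1.1\n3,0.99\n4,1.02\n")
    big_df = collect_sam_columns(tmp)

print(big_df)
```

With this approach the column names come from the file names, so the empty dataframe with predefined columns is no longer needed. Note that `pd.concat` raises a `ValueError` if the directory contains no CSV files.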