I have some CSV files with exactly the same structure of stock quotes (timeframe is one day):
date,open,high,low,close
2001-10-15 00:00:00 UTC,56.11,59.8,55.0,57.9
2001-10-22 00:00:00 UTC,57.9,63.63,56.88,62.18
I want to merge them all into one DataFrame with only close price columns for each stock. The problem is different files has different history depth (they started from different dates in different years). I want to align them all by date in one DataFrame. I'm trying to run the following code, but I have nonsense in the resulted df:
files = ['FB', 'MSFT', 'GM', 'IBM']
stock_d = {}
for file in files: #reading all files into one dictionary:
stock_d[file] = pd.read_csv(file + '.csv', parse_dates=['date'])
date_column = pd.Series() #the column with all dates from all CSV
for stock in stock_d:
date_column = date_column.append(stock_d[stock]['date'])
date_column = date_column.drop_duplicates().sort_values(ignore_index=True) #keeping only unique values, then sorting by date
df = pd.DataFrame(date_column, columns=['date']) #creating final DataFrame
for stock in stock_d:
stock_df = stock_d[stock] #this is one of CSV files, for example FB.csv
df[stock] = [stock_df.iloc[stock_df.index[stock_df['date'] == date]]['close'] for date in date_column] #for each date in date_column adding close price to resulting DF, or should be None if date not found
print(df.tail()) #something strange here - Series objects in every column
The idea is first to extract all dates from each file, then to distribute close prices among according columns and dates. But obviously I'm doing something wrong. Can you help me please?