Writing a function to merge multiple stock data frames by column name into one using multiple csv files

Question

I have written the following code to merge a specific column in multiple csv files into a single data frame over a specified time period. Unfortunately, the following code doesn’t return the same column for all stocks, just the first one on my list. Any help here would be much appreciated.

A sample of the data frame is below:

tickers_list = ["ABBN.SW", "ABDN.L", "ADS.DE", "1299.HK", "AMC.AX"]

def merge_df_by_TI(col_name, sdate, edate, *tickers_list):
    mult_df = pd.DataFrame()
    for x in tickers_list:
        df = get_stock_df_from_csv
        mask = (df.index >= sdate) & (df.index <= edate)
        mult_df[x] = df[mask][col_name]
    return df

To call the function I used the following:

mult_df = merge_df_by_TI('col_name', S_DATE, E_DATE, *tickers_list)
mult_df

The get_stock_df_from_csv function is defined as:

def get_stock_df_from_csv(ticker):
    try:
        df = pd.read_csv(PATH + ticker + '.csv', index_col=0)
    except FileNotFoundError:
        print("File Does Not Exist")
    else:
        return df

When I run the function I get the date column and the desired column of the first ticker in the list but not the same column for the other tickers in the list as expected.

I have checked the files for the other tickers, and the columns are present in each of them. I have also checked that the date range is present and that the file path defined in the get_stock_df_from_csv function is correct (the function is basically pd.read_csv with the file path defined).

Please create a [minimal reproducible example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). You need to include the definition of your `get_stock_df_from_csv` function and sample of each of your input DataFrames using `df.head().to_dict()` — not_speshal, Aug 01 '23 at 12:40
Please provide enough code so others can better understand or reproduce the problem. — Community, Aug 01 '23 at 18:59

score 0 · Answer 1 · answered Aug 01 '23 at 12:23

It is hard to help without seeing a sample of the data, but in your loop you are not appending each DataFrame to mult_df. You also have not shown what is behind get_stock_df_from_csv. See the updated code:

def merge_df_by_TI(col_name, sdate, edate, *tickers_list):
    mult_df = pd.DataFrame()
    for x in tickers_list:
        df = get_stock_df_from_csv(x)  # load this ticker's CSV
        mask = (df.index >= sdate) & (df.index <= edate)
        # name the column after the ticker so the columns stay distinguishable
        current_stock_df = df.loc[mask, col_name].rename(x)
        mult_df = pd.concat([mult_df, current_stock_df], axis=1)
    return mult_df
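
As a rough, self-contained illustration of the same idea (not your actual data), the sketch below swaps the CSV reader for a hypothetical in-memory dictionary, FAKE_DATA, and a hypothetical loader, load_stock. The tickers, dates, and prices are invented for the example. It collects one renamed Series per ticker and calls pd.concat once at the end rather than inside the loop:

```python
import pandas as pd

# Hypothetical stand-in for the CSV files: ticker -> DataFrame indexed by date.
dates = pd.date_range("2023-01-02", periods=5, freq="D")
FAKE_DATA = {
    "AAA": pd.DataFrame(
        {"Open": [1.0, 2.0, 3.0, 4.0, 5.0], "Close": [1.5, 2.5, 3.5, 4.5, 5.5]},
        index=dates,
    ),
    "BBB": pd.DataFrame(
        {"Open": [10.0, 20.0, 30.0, 40.0, 50.0], "Close": [11.0, 21.0, 31.0, 41.0, 51.0]},
        index=dates,
    ),
}


def load_stock(ticker):
    """Hypothetical replacement for get_stock_df_from_csv."""
    return FAKE_DATA[ticker]


def merge_column(col_name, sdate, edate, *tickers):
    """Collect one column per ticker, restricted to [sdate, edate]."""
    columns = []
    for ticker in tickers:
        df = load_stock(ticker)
        mask = (df.index >= sdate) & (df.index <= edate)
        # Rename the Series so each output column is labelled with its ticker.
        columns.append(df.loc[mask, col_name].rename(ticker))
    # Align all Series on the date index, one column per ticker.
    return pd.concat(columns, axis=1)


merged = merge_column(
    "Close", pd.Timestamp("2023-01-03"), pd.Timestamp("2023-01-05"), "AAA", "BBB"
)
print(merged)
```

Building a list and concatenating once avoids repeatedly copying the growing frame, and renaming each Series keeps the columns distinguishable when every ticker contributes the same column name.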

gtomer

Thank you for your comments, very helpful. I have edited the original post with the required changes - could you please take another look? I tried the code you kindly sent me but unfortunately, that didn’t work. There was an error in the original code which I corrected: mult_df[x] = df[mask][col_name]. Any help will be greatly appreciated as this is quite new for me. Thank you – user22319714 Aug 01 '23 at 15:51
The error occurs because you are calling the function with a col_name that does not exist. Try, for example: merge_df_by_TI('Open', ...) – gtomer Aug 02 '23 at 07:01
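
One way to confirm this diagnosis is to check for the column before indexing. The sketch below uses a made-up DataFrame, and the helper name select_column is hypothetical. It raises a KeyError that lists the columns actually present:

```python
import pandas as pd


def select_column(df, col_name):
    """Return df[col_name], or raise a KeyError listing the available columns."""
    if col_name not in df.columns:
        raise KeyError(
            f"{col_name!r} not found; available columns: {list(df.columns)}"
        )
    return df[col_name]


# Invented sample data, standing in for one ticker's CSV contents.
sample = pd.DataFrame({"Open": [1.0, 2.0], "Close": [1.5, 2.5]})

opens = select_column(sample, "Open")  # succeeds: the column exists

try:
    select_column(sample, "col_name")  # the literal placeholder string
except KeyError as exc:
    message = str(exc)
    print(message)
```

Passing the literal string 'col_name' fails here because no column has that name, whereas a real column name such as 'Open' succeeds.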
