
I am fairly new to Python and would like some help generating a new column called Ticker when reading in multiple csv files. As the Yahoo! Finance API is deprecated, I am reading in csv data from Yahoo! Finance for 'GOOG', 'IBM' and 'AAPL'. The following code reads the individual csv files into one DataFrame; however, it is hard to tell which rows belong to which stock.

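import glob
import pandas as pd

path = 
allFiles = glob.glob(path + "/*.csv")
list_ = []
for file in allFiles:
    # read each csv, using its first row as the header
    df = pd.read_csv(file, index_col=None, header=0)
    list_.append(df)
# stack all files into a single DataFrame
frame = pd.concat(list_)
frame.head()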

Is it possible to create a column called Ticker that holds the name of the source csv file for every observation of each stock? For example, GOOG.csv is the file name for Google, IBM.csv is the file name for IBM, and so on.

This would make it easier to identify which stock is which.

oceanbeach96

1 Answer


According to this previous post, I believe you have two clear options. Either (1) pass names=[...] to the original read_csv call so that the column is named after the stock, or (2) set the column names on the DataFrame after reading the csv.

Approach (1) might involve replacing your current read with the following code snippet:

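df = pd.read_csv(file, names=[file[len(path)+1:-4]], header=0, index_col=None)

Here I assume the ticker is everything after the slash that follows path, up to but not including the .csv extension. header=0 is kept from your original call so that the file's existing header row is replaced by the new name rather than read in as data.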
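The slice above depends on the exact length of path and on a four-character extension. As a minimal sketch of a less fragile way to get the same string, the standard library's os.path functions can strip the directory and the extension. The helper name ticker_from_path and the example path are hypothetical, chosen only for illustration.

```python
import os


def ticker_from_path(file_path):
    """Return the file name without its directory or extension."""
    file_name = os.path.basename(file_path)   # "data/GOOG.csv" -> "GOOG.csv"
    stem, _extension = os.path.splitext(file_name)  # "GOOG.csv" -> ("GOOG", ".csv")
    return stem


# Hypothetical path, used only to show the call.
print(ticker_from_path("data/GOOG.csv"))  # prints GOOG
```

This gives the same result as file[len(path)+1:-4] when the file sits directly under path, but it does not break if path ends with a trailing slash.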

Approach (2) might be accomplished by adding the following line of code after reading the csv but before appending the DataFrame to the list:

df.columns = [file[len(path)+1:-4]]

I have assumed in this response that you only have/want one column of data for each csv. If you want to keep multiple columns, specify one name per column in the list; assigning to df.columns raises an error when the number of names does not match the number of columns.
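If the csv files have several columns and the goal is a separate Ticker column rather than renamed columns, a different approach may fit better: assign the file's base name to a new column on each DataFrame before concatenating. This is only a sketch under stated assumptions, not a tested fix for your data. The function name load_with_ticker, the Date and Close columns, and the numbers in the demo files are all placeholders invented for illustration.

```python
import glob
import os
import tempfile

import pandas as pd


def load_with_ticker(folder):
    """Read every csv in `folder` and tag its rows with the file's base name."""
    frames = []
    for file_path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        # "some/dir/GOOG.csv" -> "GOOG"
        ticker = os.path.splitext(os.path.basename(file_path))[0]
        df = pd.read_csv(file_path, header=0)
        df["Ticker"] = ticker  # same value on every row of this file
        frames.append(df)
    # ignore_index gives the combined frame a fresh 0..n-1 index
    return pd.concat(frames, ignore_index=True)


# Demo on throwaway files; the columns and values are made-up placeholders.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("GOOG", "IBM"):
        with open(os.path.join(demo_dir, name + ".csv"), "w") as handle:
            handle.write("Date,Close\n2018-01-02,1.0\n2018-01-03,2.0\n")
    frame = load_with_ticker(demo_dir)

print(frame.head())
```

With the Ticker column in place, something like frame[frame["Ticker"] == "IBM"] selects the rows for a single stock.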

mmurray