
I have 1000+ text files. Each has dates (which I have made the index) and stock prices (which are column 0). I have written code to find an individual file's moving average of the price, and the rolling difference between the price and that moving average. I would like to write code to do this for every file. I have to load them in groups because loading them all at once uses too much memory.

I imagine I would have to use a for loop to iterate through the files and compute the metrics for each one. But how would I do that? How can I load a group of files into one variable, then write a loop to find the moving average and the difference from the price for each one?

Edit: I am using numpy, pandas, and matplotlib. I'd also like to be able to find the stocks for which the difference from the moving average is the greatest.
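For reference, what I have for a single file is roughly like this (the 20-row window is just an example):

import pandas as pd

# Dates become the index; the stock price is column 0.
df = pd.read_csv("ABC.txt", index_col=0, parse_dates=True)
price = df.iloc[:, 0]

# Moving average of the price, and the rolling difference between the two.
moving_avg = price.rolling(window=20).mean()
diff = price - moving_avg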

Any help would be greatly appreciated

dergky
  • If you edit some code into the question, we can see what you have tried and you will get more precise answers. – EugeneProut Feb 12 '20 at 21:34
  • Can you include some of the code you've tried? For example, this question shows how to loop over multiple files: https://stackoverflow.com/questions/53100599/applying-the-same-operations-on-multiple-csv-file-in-pandas – linamnt Feb 12 '20 at 21:36
  • Well, I haven't tried much when it comes to loading them all at once; frankly, I don't know where to start. Can I iterate through a folder on my computer? Like, separate the files into different folders, then use a for loop to load all of them at once? I'm new to Python, so bear with me please. Thanks – dergky Feb 12 '20 at 21:37
  • @linamnt Thanks for showing me that question, but the one problem is that the files are all named by ticker (ABC, etc.), rather than all having the same name plus a number. So unless I can rename all the files to add a number after each one and take the ticker out, I can't do that. – dergky Feb 12 '20 at 21:41
  • I think the answer Hal posted below may be a good start, and then when you need to read, let's say, 50 files at a time, you can split your list of filenames into chunks using this example (a rough sketch follows these comments): https://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks – linamnt Feb 12 '20 at 21:54
  • Stack Overflow is not a substitute for guides, tutorials, or documentation. Do you have a **specific question**? See: [ask], [tour], [help/on-topic]. – AMC Feb 13 '20 at 01:39
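A rough sketch of that chunking idea, assuming all the files sit in one folder (the path and the chunk size of 50 are placeholders):

import glob

import pandas as pd

filenames = glob.glob("C:\\your_path\\*.csv")
chunk_size = 50  # how many files to hold in memory at once

# Work through the filenames in groups of chunk_size to keep memory use down.
for start in range(0, len(filenames), chunk_size):
    chunk = filenames[start:start + chunk_size]
    frames = {name: pd.read_csv(name, index_col=0, parse_dates=True) for name in chunk}
    # ... compute the moving average / difference for each frame in `frames`,
    # keep only the summary numbers, and let the raw data go out of scope.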

2 Answers


If you are looking to just iterate over all of your input files in a given folder, you might want to try os.listdir() to get a list of filenames, which you can then process sequentially. If your files are spread over several layers of folders, you could use os.walk() to traverse the directories. You can find info on both methods here: https://docs.python.org/3/library/os.html
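For example, a minimal sketch of the os.listdir() route (the folder path, the .csv extension, and the 20-row window below are placeholders):

import os

import pandas as pd

folder = "C:\\your_path"  # wherever the ticker files live

# List the folder; each file is named by its ticker (ABC.csv, XYZ.csv, ...).
for filename in os.listdir(folder):
    if not filename.lower().endswith(".csv"):
        continue
    ticker = os.path.splitext(filename)[0]
    df = pd.read_csv(os.path.join(folder, filename), index_col=0, parse_dates=True)
    price = df.iloc[:, 0]
    moving_avg = price.rolling(window=20).mean()
    diff = price - moving_avg
    # ... keep whatever summary you need for this ticker, e.g. diff.iloc[-1]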

Hal Jarrett

How large are these 1000 files? If they are only a couple of MB each (just guessing), merge all the files into one single file and you can do whatever you want with it.

import glob
import os

import pandas as pd

# os.chdir("C:\\Users\\Excel\\Desktop\\test\\")

# Every CSV in the folder, whatever the ticker in its name.
filelist = glob.glob("C:\\your_path\\*.csv")

# Read each file and combine them all into a single DataFrame.
frames = []
for filename in filelist:
    print(filename)
    namedf = pd.read_csv(filename, skiprows=0, index_col=0)
    frames.append(namedf)

results = pd.concat(frames)
results.to_csv('C:\\your_path\\CombinedFile.csv')
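Once everything is in one frame, the metrics can be computed in a single pass. A rough sketch, assuming a ticker column is added to each file while combining (so the rolling windows never run from one stock into the next) and a 20-row window as an example:

import pandas as pd

combined = pd.read_csv("C:\\your_path\\CombinedFile.csv", index_col=0, parse_dates=True)

# Assumes each file was tagged while combining, e.g. inside the loop above:
#     namedf["ticker"] = os.path.basename(filename).split(".")[0]
price_col = combined.columns[0]  # the price column (column 0 in the original files)

# Rolling mean per ticker, so windows never span two different stocks.
combined["moving_avg"] = (
    combined.groupby("ticker")[price_col]
    .transform(lambda s: s.rolling(window=20).mean())
)
combined["diff"] = combined[price_col] - combined["moving_avg"]

# Stocks whose latest absolute gap from the moving average is largest.
largest_gap = combined.groupby("ticker")["diff"].last().abs().sort_values(ascending=False)
print(largest_gap.head(10))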
ASH
  • I got the following: OSError: Initializing from file failed. But thank you, I'll play with it and see if I can get it. – dergky Feb 14 '20 at 00:53
  • That error doesn't make sense. Also, it works perfectly fine for me. Run it line by line to see where the error is coming in. I think you'll figure it out very soon. – ASH Feb 14 '20 at 01:51
  • [This question](https://stackoverflow.com/questions/50552404/oserror-initializing-from-file-failed-on-csv-in-pandas) addresses some of the file issues that can raise that error in Pandas and how to fix them, largely related to either nonstandard characters in filenames or permissions issues. Worth a look to see if it's related. – Hal Jarrett Feb 14 '20 at 02:13
  • Hmmm, just thinking... maybe you have some special characters in some of your CSV files. Can you create 3 really basic CSV files and test the code on those? It should work fine. Then you know there is something funky about the specific files you are working with. There are all kinds of text-cleaning exercises you can do as you import various data sets. Finally, you may need to apply a special encoding when reading the CSV files, like encoding='utf-8'. See the following link for some guidance: https://stackoverflow.com/questions/904041/reading-a-utf8-csv-file-with-python – ASH Feb 14 '20 at 03:19
  • Hey guys, thanks for your input, I really appreciate it. I figured out a way to append each file to a list and then loop through each one (roughly like the sketch below), and it seems to work. I have another question which I will post soon, but I just wanted to say I really appreciate the help. Cheers. – dergky Feb 14 '20 at 17:28
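What that looks like, roughly (the path, the 20-row window, and ranking by the latest gap are assumptions):

import glob
import os

import pandas as pd

# Load every file once into a list of (ticker, DataFrame) pairs.
data = []
for filename in glob.glob("C:\\your_path\\*.csv"):
    ticker = os.path.splitext(os.path.basename(filename))[0]
    data.append((ticker, pd.read_csv(filename, index_col=0, parse_dates=True)))

# Loop over the loaded frames and compute the metrics for each one.
latest_gap = {}
for ticker, df in data:
    price = df.iloc[:, 0]
    moving_avg = price.rolling(window=20).mean()
    diff = price - moving_avg
    latest_gap[ticker] = diff.iloc[-1]

# Stocks whose most recent price sits furthest from its moving average.
ranked = pd.Series(latest_gap).abs().sort_values(ascending=False)
print(ranked.head(10))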