
I have a Python script which takes a filename as a command-line argument and processes that file. However, I have thousands of files to process, and I would like to run the script on every file without having to type the filename as the argument each time.

For example, `process.py file1` will do exactly what I want.

However, I want to run process.py on a folder containing thousands of files (file1, file2, file3, etc.).

I have found out that it can be done simply in Bash:

for f in *; do python myscript.py $f; done

However, I am on Windows and don't want to install something like Cygwin. What would a piece of code for the Windows command line look like that emulates what the above Bash code accomplishes?

– cars0245
  • Do you want a solution that is purely Python (i.e. done inside your Python script) or one that utilizes Windows batch abilities? – PTBNL Jun 17 '15 at 15:32
  • BTW, in bash it is better to use `python myscript.py "$f"`, otherwise it would break if there was a space in a filename. – cdarke Jun 17 '15 at 15:44
  • I am open to solutions that work either way. Ideally, it would be done within the Python script, and then I could pass an argument for the folder location to process when I run the script. However, if it is more easily done with Windows batch, that is fine. – cars0245 Jun 17 '15 at 15:46

4 Answers

for %%f in (*) do (
    python myscript.py "%%f"
)

I think that'll work (note that %%f is the batch-file form; use a single %f if you type it directly at the prompt) -- I don't have a Windows box handy at the moment to try it.

This link might help: How to loop through files matching wildcard in batch file

– Shaun
import os
import subprocess

for f in os.listdir('.'):
    if os.path.isfile(f):  # skip subdirectories
        subprocess.call(["python", "myscript.py", f])

This solution will work on every platform, provided the python executable is on the PATH.

Also, if you want to recursively process files in nested subdirectories, you can use os.walk() instead of os.listdir()+os.path.isfile().
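A minimal sketch of that recursive variant (assumptions: myscript.py lives in the current directory and can handle relative paths; sys.executable is used so the loop does not depend on python being on the PATH):

import os
import subprocess
import sys

# walk the tree rooted at '.', visiting every nested subdirectory
for dirpath, dirnames, filenames in os.walk('.'):
    for name in filenames:
        path = os.path.join(dirpath, name)
        # sys.executable is the interpreter running this loop
        subprocess.call([sys.executable, 'myscript.py', path])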

– fferri

Since you have python, why not use that?

import subprocess
import glob
import sys
import os.path

for fname in glob.iglob(os.path.join('some-directory-name', '*')):
    # run myscript.py on each file with the same interpreter, one at a time
    proc = subprocess.Popen([sys.executable, 'myscript.py', fname])
    proc.wait()

What's more, it's portable.

– cdarke
  • This looks good, but isn't working for me. I replaced 'some-directory-name' with an absolute reference to the folder containing the files, and 'myscript.py' with an absolute reference to the script I want to run. However, when I run the code it seems to freeze up and slow down my computer without ever giving me the output it is supposed to. Am I missing something? I'm attempting to run the code in the IPython QtConsole. – cars0245 Jun 17 '15 at 16:36
  • Sounds like it is working! If you have thousands of files then you are going to get thousands of child processes. I don't know your environment; there could be a buffering issue explaining the lack of output - IDLE, for example, will not display output from a child process. I suggest you test this with a small number of files (say around 5) to prove the principle. Really, your design of running a child process for every file is suspect. In reality the Python script should be altered to access the files itself, rather than being run like this. – cdarke Jun 17 '15 at 18:39
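To illustrate that last suggestion, a minimal sketch (assumption: myscript.py wraps its per-file work in a function; process_file is a hypothetical name) that does the work in-process instead of starting one interpreter per file:

import glob
import os.path

# hypothetical import: assumes myscript.py exposes its per-file work
# as a function named process_file
from myscript import process_file

for fname in glob.iglob(os.path.join('some-directory-name', '*')):
    process_file(fname)  # runs in this interpreter; no child process per file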

For each file in the current directory:

for %f in (*) do C:\Python34\python.exe "%f"

Update: Note the quotes around %f. You need them if your filenames contain spaces. You can also put any path and executable after the `do`.

If we imagine your files look like:

./process.py
./myScripts/file1.py
./myScripts/file2.py
./myScripts/file3.py
...

In your example, the command would simply be:

for %f in (.\myScripts\*) do process.py "%f"

This would invoke:

process.py ".\myScripts\file1.py"
process.py ".\myScripts\file2.py"
process.py ".\myScripts\file3.py"
– SDekov