I've been trying to build a collection of exhaustive word lists for as many languages as possible and I ended up using LibreOffice's spell checking .dic and .aff files. The .dic file contains base forms of words and .aff contains rules to morph them. I found an existing .sh tool to combine these files into a .txt word list.

Now, because I'm doing this for many languages, I'd like to automate running this tool on the .dic and .aff files of each language. I wrote a little Python script for this:

import os
import subprocess

# langs is a list of language directory names, e.g. "dutch"
for lang in langs:
    # Take the first .dic and .aff file found in the language directory
    dic_path = os.path.join(lang, [filename for filename in os.listdir(lang) if filename.endswith(".dic")][0])
    aff_path = os.path.join(lang, [filename for filename in os.listdir(lang) if filename.endswith(".aff")][0])

    command = [os.path.join("tools", "unmunch.sh"), dic_path, aff_path]
    outpath = os.path.join(lang, f"{lang}_words.txt")
    # Redirect the tool's stdout to the output file
    with open(outpath, "w") as f:
        subprocess.run(command, stdout=f, shell=True)

The problem is that the file at the outpath remains empty. In contrast, this different command does write to the desired file:

command = ["type", dic_path]
with open(outpath, "w") as f:
    subprocess.run(command, stdout=f, shell=True)

After trying this I executed the tool in cmd and found that it opens a new cmd window to run. This is different to what I experienced when running it in Git Bash which I usually use. In Git Bash I used the command:

tools/unmunch.sh dutch/dutch.dic dutch/dutch.aff >dutch/dutch_words.txt

And it worked. Whilst in cmd, running:

tools\unmunch.sh dutch\dutch.dic dutch\dutch.aff >dutch\dutch_words.txt

opens a new cmd window and writes the output there, instead of to the dutch\dutch_words.txt file. I assume this is what's happening when using subprocess in python, but I have no idea how to prevent this as I'm very unfamiliar with .sh files. Can anyone help me get the output written to a desired path?

  • Do you want to `.communicate()` with the process or just run it? Don't you want `subprocess.run()`? – KamilCuk Mar 14 '22 at 12:00
  • This is a known issue, see [Issue 30082](https://bugs.python.org/issue30082) on the Python issue tracker. This [thread](https://stackoverflow.com/questions/1016384/cross-platform-subprocess-with-hidden-window) may also be relevant. – metatoaster Mar 14 '22 at 12:04
  • @KamilCuk I'm new to using subprocess but I don't think there's a practical difference for what I'm trying to do is there? I have changed it now because I think it looks nicer, but it acts the same. – Luuk Verheijen Mar 15 '22 at 01:42
  • @metatoaster As far as I can tell those pages are about hiding the window, not about preventing a new console from starting altogether. My apologies if I didn't make that clear enough in my question – Luuk Verheijen Mar 15 '22 at 01:43
  • This [answer in another thread](https://stackoverflow.com/a/12555130/) provides additional hints - it ultimately depends on the particular Windows program being executed. – metatoaster Mar 15 '22 at 01:55
  • Alternatively, [this may also be a useful reference thread](https://stackoverflow.com/questions/4277963/how-to-call-cmd-without-opening-a-window) regarding `cmd` and its window. – metatoaster Mar 15 '22 at 02:08
  • @metatoaster Thank you for the help, I agree the cause likely lies in the .sh script I downloaded. I do still find it strange that a new console isn't opened when I use Git Bash instead of cmd – Luuk Verheijen Mar 15 '22 at 11:25

1 Answer

My own solution:

As mentioned, the new window did not open when I ran the tool from Git Bash. In the end, all I needed to do was add Git Bash to the PATH environment variable and put "bash" at the start of the command, like so:

command = ["bash", os.path.join("tools", "unmunch.sh"), dic_path, aff_path]

With this change no new window opens, so the output is written to the desired output file.
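
For reference, here is a minimal sketch of how the whole loop body could look with this fix applied. It assumes that a `bash` executable is reachable through PATH and that each language directory holds exactly one .dic and one .aff file. The helper names `find_by_extension`, `build_command` and `write_word_list` are hypothetical, introduced only for this illustration. The sketch also drops `shell=True`, which I believe is unnecessary once `bash` is invoked explicitly.

```python
import os
import subprocess


def find_by_extension(directory, extension):
    """Return the path of the first file in `directory` ending with `extension`."""
    for filename in sorted(os.listdir(directory)):
        if filename.endswith(extension):
            return os.path.join(directory, filename)
    raise FileNotFoundError(f"no {extension} file in {directory}")


def build_command(lang, script=os.path.join("tools", "unmunch.sh")):
    """Build the argument list, with "bash" first so the script is run by bash directly."""
    return [
        "bash",
        script,
        find_by_extension(lang, ".dic"),
        find_by_extension(lang, ".aff"),
    ]


def write_word_list(lang, script=os.path.join("tools", "unmunch.sh")):
    """Run the script for one language directory and save its stdout to a .txt file."""
    name = os.path.basename(os.path.normpath(lang))
    outpath = os.path.join(lang, f"{name}_words.txt")
    with open(outpath, "w") as f:
        # The argument list goes straight to bash, so no shell=True is needed.
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(build_command(lang, script), stdout=f, check=True)
    return outpath
```

Calling `write_word_list("dutch")` from the project root would then correspond to the Git Bash command in the question. If more than one `bash` is installed on the machine, passing the full path to the Git Bash executable instead of the bare name `"bash"` may be the safer choice.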