9

I have several hundred .rtf files that need to be converted to .txt.

I have tried reading and writing the contents of the files into a new text file, but this seems rather tedious.

Is there an easier way to do this with Python 3?

The data in the .rtf files is formatted as a table, and I need to convert it into one long list in the .txt file.

m4148
  • Are you looking only for a change of file extension? If so, doing it in bash/cmd is probably easiest. It can be done in Python as well, of course; there are plenty of examples around of how to list/loop over files in a directory, and of how to rename files with Python. – ahed87 Dec 07 '17 at 16:12
  • Possible duplicate of [Is there a Python module for converting RTF to plain text?](https://stackoverflow.com/questions/1337446/is-there-a-python-module-for-converting-rtf-to-plain-text) – Lycopersicum Dec 07 '17 at 19:29
  • Did you find an answer to this? – a06e Nov 16 '18 at 13:56
  • @becko I did not, I ended up just doing it manually. – m4148 Dec 07 '18 at 18:49
  • Check out https://www.gnu.org/software/unrtf/. It's not Python, though. – a06e Dec 07 '18 at 19:15

2 Answers

10

I found the striprtf package, and it helped me. Sample usage from the docs:

from striprtf.striprtf import rtf_to_text
rtf = "some rtf encoded string"
text = rtf_to_text(rtf)
print(text)
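
To apply the same call to a whole folder, something along these lines might work. This is only a sketch, not part of the striprtf docs: the names `convert_directory` and `flatten_table` are hypothetical, and so are the UTF-8 read with `errors="replace"` and the assumption that table cells come out separated by `|` (check your own output and adjust `sep`). The converter is a parameter so that `rtf_to_text` is only imported when nothing else is supplied.

```python
from pathlib import Path


def flatten_table(text, sep="|"):
    """Turn delimited table rows into one cell per line, dropping empty cells."""
    cells = []
    for row in text.splitlines():
        for cell in row.split(sep):
            cell = cell.strip()
            if cell:
                cells.append(cell)
    return "\n".join(cells)


def convert_directory(src_dir, converter=None, sep="|"):
    """Write a .txt file next to every .rtf file in src_dir; return the new paths."""
    if converter is None:
        # Third-party dependency (pip install striprtf), imported lazily.
        from striprtf.striprtf import rtf_to_text
        converter = rtf_to_text

    written = []
    for rtf_path in sorted(Path(src_dir).iterdir()):
        if not rtf_path.is_file() or rtf_path.suffix.lower() != ".rtf":
            continue
        # Assumption: reading as UTF-8 with replacement is good enough here.
        raw = rtf_path.read_text(encoding="utf-8", errors="replace")
        txt_path = rtf_path.with_suffix(".txt")
        txt_path.write_text(flatten_table(converter(raw), sep), encoding="utf-8")
        written.append(txt_path)
    return written
```

Calling `convert_directory("path/to/rtf_files")` would then leave the original .rtf files in place and write one .txt file per input, with one table cell per line.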
Dos
-1
import os


def convert_rtf_to_txt(directory):
    # Note: this copies each file's raw contents; RTF markup is not stripped.
    for file in os.listdir(directory):
        path = os.path.join(directory, file)
        # Skip subdirectories and anything that is not an .rtf file.
        if not os.path.isfile(path):
            continue

        filename, extension = os.path.splitext(file)
        if extension.lower() != ".rtf":
            continue

        with open(path, "r") as rtf_file:
            rtf_content = rtf_file.read()

        new_name = f"{filename}.txt"
        with open(os.path.join(directory, new_name), "w") as txt_file:
            txt_file.write(rtf_content)

        # Delete the original only after the .txt copy has been written.
        os.remove(path)

    print("RTF to TXT conversion complete.")


directory_path = "D:\\rtf files"
convert_rtf_to_txt(directory_path)

This code goes through all RTF files in the specified directory, reads the content of each one, writes that content to a TXT file with the same base name, and finally removes the original RTF file. The content is written unchanged, so any RTF markup is still present in the TXT file.

  • Your code looks like it works great for changing the extensions of a set of files! Unfortunately, it doesn't address the question since it doesn't convert the content of the files. RTF has a specific format that the question asker is looking to convert to plain text. See the Wikipedia page https://en.wikipedia.org/wiki/Rich_Text_Format – cjm Jun 27 '23 at 13:19