
I have a folder containing several .txt files I want converted into strings.

I want to convert each of them to strings, the output being either 1 file containing a single line of text for each file or a combination of all source files into 1 file where each source file is just 1 line of text.

Is there a way to do this with glob or fnmatch using the following code:

open("data.txt").read().replace('\n', '')
Mad Physicist
gulfy
  • you want the *contents* of each `.txt` file written into a line in a resulting output file? – chickity china chinese chicken Jan 26 '19 at 02:39
  • To use `glob.glob()`, just put the code in your question inside a `for filename in glob.glob("*.txt"):` loop and use `filename` instead of hardcoding `"data.txt"` in the call to `open()`. Also note you might want to change the `\n` character to something else instead of deleting it, to preserve where the newlines once occurred. Of course the replacement character would need to be something that couldn't be in the actual text of the files... – martineau Jan 26 '19 at 02:42
  • I'm curious, what's the difference between the two options? "...the output being either 1 file containing a single line of text for each file or a combination of all source files into 1 file where each source file is just 1 line of text." – Mad Physicist Jan 26 '19 at 02:59
  • @Mad Physicist re: the difference between multiple 1-line files and a single file: it was a convenience for whoever wanted to answer with some code, since I can easily join files or split a file. Each of these files starts with an identical string and ends with a unique one. – gulfy Jan 26 '19 at 03:11
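
martineau's suggestion above (swap each newline for a marker instead of deleting it) could be sketched roughly like this. The marker `\x1f` (the ASCII unit separator) and the helper names `flatten` and `restore` are assumptions made for illustration, not something from the thread; any character that cannot occur in the files would work.

```python
SEP = "\x1f"  # assumed marker: ASCII unit separator, unlikely in plain text


def flatten(text, sep=SEP):
    """Collapse text to a single line, marking where the newlines were."""
    if sep in text:
        # The round trip only works if the marker is not already in the text.
        raise ValueError("separator already occurs in the text")
    return text.replace("\n", sep)


def restore(line, sep=SEP):
    """Undo flatten(): turn the markers back into newlines."""
    return line.replace(sep, "\n")
```

Because the marker records where each newline was, `restore(flatten(text))` gives back the original text, which plain deletion cannot do.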

1 Answer


Using your code, this creates "1 file containing a single line of text for each file":

import glob
import os

myfolder = 'folder'  # name of your folder containing the `.txt` files

with open('data.txt', 'w') as outfile:
    for txtfile in glob.glob(os.path.join(myfolder, "*.txt")):
        with open(txtfile, 'r') as f:
            # drop the newlines inside the file, then end its line with one
            outfile.write(f.read().replace('\n', '') + '\n')
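
If this needs to be reused, the same loop might be wrapped in a function along these lines. This is only a sketch: the function name `combine_txt_files`, the UTF-8 encoding, the sorted file order, and the check that skips the output file are all assumptions added here, not part of the original answer.

```python
from pathlib import Path


def combine_txt_files(folder, output):
    """Write each .txt file in `folder` to `output` as one line; return the count."""
    out_path = Path(output).resolve()
    # Sort for a reproducible line order, and skip the output file itself
    # in case it lives in the same folder as the sources.
    sources = [
        p for p in sorted(Path(folder).glob("*.txt"))
        if p.resolve() != out_path
    ]
    with open(out_path, "w", encoding="utf-8") as outfile:
        for path in sources:
            text = path.read_text(encoding="utf-8")  # assumed encoding
            # Remove the newlines inside the file, then terminate its line.
            outfile.write(text.replace("\n", "") + "\n")
    return len(sources)
```

Calling `combine_txt_files('folder', 'data.txt')` would then produce one output line per source file, and line N of the output corresponds to the Nth file in sorted order.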
  • It's probably a better idea to open the output file once outside the loop. You definitely don't need the + mode. – Mad Physicist Jan 26 '19 at 02:53
  • You should also be careful to close the input files, probably using a with block. – Mad Physicist Jan 26 '19 at 02:55
  • I was going to suggest putting the last with block around the for loop, and opening the file for `w`. That way you don't have to hold all the lines in memory and do expensive list expansions. – Mad Physicist Jan 26 '19 at 03:06
  • thanks, still trying to understand your suggestions - so more like that? – chickity china chinese chicken Jan 26 '19 at 03:12
  • Yes. +1. Last nit: you don't need data anymore. – Mad Physicist Jan 26 '19 at 03:14
  • @davedwards: this did the trick nice and neat. My next goal is to associate each of these strings with its corresponding value in a table. I think I have that covered but we'll see how that goes. – gulfy Jan 26 '19 at 03:14
  • thanks for your feedback guys. glad it helped @gulfy. post another question (possibly referencing this one) if you need more help getting on. cheers. – chickity china chinese chicken Jan 26 '19 at 03:16
  • Those who found this page/post helpful may find the code in Bill Bell's answer to this question (https://stackoverflow.com/questions/41913147/combine-a-folder-of-text-files-into-a-csv-with-each-content-in-a-cell) useful for related tasks. – gulfy Jan 26 '19 at 05:14