I have a batch of raw text files. Each file begins with Date>>month.day year News garbage, where garbage is a whole lot of text I don't need and varies in length. The words Date>> and News always appear in the same place and do not change.
I want to copy month day year and insert this data into a CSV file, with a new line for every file in the format day month year.
How do I copy month day year into separate variables?
I tried to split a string after a known word and before a known word. I'm familiar with string[x:y], but I basically want to change x and y from numbers into actual words (i.e. string[Date>>:News]). This is what I have so far:
import re, os, sys, fnmatch, csv
folder = raw_input('Drag and drop the folder > ')
for filename in os.listdir(folder):
    # First, avoid system files
    if filename.startswith("."):
        pass
    else:
        # Tell the script the file is in this directory and can be written
        file = open(folder+'/'+filename, "r+")
        filecontents = file.read()
        thestring = str(filecontents)
        print thestring[9:20]
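To make the string[Date>>:News] idea concrete: str.find can turn each marker word into an index, which then feeds an ordinary slice. This is only a rough sketch. The helper name between is made up for illustration, and the sample string is the first line of the example file.

```python
def between(text, start_marker, end_marker):
    """Return the text between two marker words, with outer whitespace removed."""
    start = text.find(start_marker)
    if start == -1:
        raise ValueError("start marker not found: %r" % start_marker)
    # Move past the start marker so it is not part of the result
    start += len(start_marker)
    # Look for the end marker only after the start marker
    end = text.find(end_marker, start)
    if end == -1:
        raise ValueError("end marker not found: %r" % end_marker)
    return text[start:end].strip()


# First line of the example file (assumed to be representative)
header = "Date>>January 2. 2012 News 122"
date_text = between(header, "Date>>", "News")
```

Here date_text ends up holding the date portion as one string, which would still need to be split into month, day and year.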
An example text file:
Date>>January 2. 2012 News 122
5 different news agencies have reported the story of a man washing his dog.
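To show the kind of result I'm after, here is a rough sketch in Python 3 of pulling month, day and year into separate variables with a regular expression and writing them as one CSV row in day, month, year order. The pattern is an assumption based on the example above (a month name, a day, a four-digit year, then News, with dots or spaces in between). DATE_PATTERN and parse_date are hypothetical names, and an in-memory buffer stands in for the real CSV file.

```python
import csv
import io
import re

# Assumed header layout: Date>>Month D. YYYY News (also accepts Month.D YYYY)
DATE_PATTERN = re.compile(
    r"Date>>\s*([A-Za-z]+)[\s.]+(\d{1,2})[\s.,]+(\d{4})\s+News"
)


def parse_date(text):
    """Return (day, month, year) as strings, or None if the header is missing."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    month, day, year = match.groups()
    return day, month, year


sample = (
    "Date>>January 2. 2012 News 122\n"
    "5 different news agencies have reported the story of a man washing his dog."
)
day, month, year = parse_date(sample)

# Write one row per file; a StringIO buffer stands in for the output file here
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow([day, month, year])
csv_text = buffer.getvalue()
```

In the real script the writer would wrap a file opened once before the loop over os.listdir(folder), with one writerow call per input file.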