I have a file named "lorem.txt" (a test file for now) that contains two lines:
hello:world
goodbye:now
I want to remove the text occurring before the ':' on each line, but I cannot load the whole file into memory (the real file will be much larger). The result should be:
world
now
as the final file (overwriting the previous data).
So far I have:
import re
with open('lorem.txt', 'r+') as myfile:  # Open lorem.txt for reading and writing
    for myline in myfile:  # For each line, read to a string,
        re.sub(r'^.*?:', ' ', myline)  # Remove text? Regex is fine but it's refusing
The regex itself is correct, but nothing is actually written to the file!
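One likely reason nothing changes: `re.sub` returns a new string and never touches the file, and the snippet above discards that return value. A minimal sketch of one way around this, assuming it is acceptable to write to a temporary file next to the original and then swap it into place (the function name `strip_prefixes` is made up for illustration). It reads one line at a time, so the whole file is never held in memory, and it replaces the matched prefix with an empty string rather than a space so the output is exactly `world` and `now`:

```python
import os
import re
import tempfile


def strip_prefixes(path):
    """Rewrite `path` so each line keeps only the text after its first ':'."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory as the original,
    # so the final os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        # newline='' keeps the original line endings untouched.
        with os.fdopen(fd, 'w', newline='') as dst, open(path, 'r', newline='') as src:
            for line in src:  # only one line is in memory at a time
                # re.sub returns the new string; it has to be written out explicitly.
                dst.write(re.sub(r'^.*?:', '', line))
        # Swap the rewritten file in over the original.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling `strip_prefixes('lorem.txt')` would then overwrite the test file with the stripped lines. Lines that contain no ':' are left unchanged, because the pattern does not match them.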
Edit: I have done it in PowerShell instead; see below. Leaving this open in case anyone has an answer for Python.
# Loops through all .txt files in a folder, removes the text occurring before a ':' on each line, and writes the result to a numbered output file.
$files = gci -file *.txt # Get all files ending with .txt, put them in an array named files
$n = 0 # Output file counter (already an integer, no conversion needed)
foreach($file in $files){ # For every file in this array, do the following
    New-Item -ItemType File .\$n.txt # Create one new output file per input file
    foreach($line in Get-Content $file){ # For every line in the current file, do the following
        Add-Content .\$n.txt ($line.split(':')[1]) # Append the segment after the first ':' to the output file
    }
    $n++ # Increment 'n' once per input file
}