
I have a file named "lorem.txt" (a test file for now) that contains 2 lines:

hello:world
goodbye:now

I want to remove everything up to and including the ':' on each line, but I cannot load the whole file into memory (the real file will be much larger). The result should be:

world
now

This should be the final content of the file, overwriting the previous data.

So far I have:

import re
with open ('lorem.txt', 'r+') as myfile:  # Open lorem.txt for reading and writing
    for myline in myfile:              # For each line, read to a string,
        re.sub(r'^.*?:',' ', myline) # Remove text? Regex is fine but it's refusing

The regex is correct, but nothing is actually written to the file!
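For context on why nothing changes: `re.sub` returns a new string and never touches the file, so the result has to be written back explicitly. Because every output line is no longer than its input line, one possible approach is to keep separate read and write offsets in the same file and truncate at the end. The following is only a minimal sketch under that assumption; the function name `strip_prefix_in_place` and the `sep` parameter are hypothetical, it favors clarity over speed, and an interruption midway would leave the file partially rewritten.

```python
def strip_prefix_in_place(path, sep=b":"):
    """Drop everything up to and including the first `sep` on each line.

    Hypothetical helper: rewrites `path` in place without a temp file
    and without loading the whole file into memory.
    """
    with open(path, "r+b") as f:
        read_pos = 0   # offset of the next unread line
        write_pos = 0  # offset where the next output line goes
        while True:
            f.seek(read_pos)
            line = f.readline()
            if not line:
                break  # end of file
            read_pos = f.tell()
            # Keep the text after the first separator; lines without
            # a separator are copied unchanged.
            _, found, tail = line.partition(sep)
            out = tail if found else line
            # len(out) <= len(line), so write_pos never passes read_pos
            # and unread data is never overwritten.
            f.seek(write_pos)
            f.write(out)
            write_pos = f.tell()
        # Cut off the leftover bytes at the end of the file.
        f.truncate(write_pos)
```

It would be called as `strip_prefix_in_place("lorem.txt")`. Working in binary mode keeps the byte offsets exact, which text-mode `seek`/`tell` does not guarantee.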

Edit: I have done it in PowerShell instead (see below). I'm leaving this open in case anyone has an answer for Python.

# This loops through all .txt files in a folder, removing the text before the first ':' on each line, then writing the result to an output file.
$files = gci -file *.txt # Get all files ending with .txt, put them in an array named files

$n = 0 # Counter used to name the output files (already an integer, so no cast is needed)

foreach($file in $files){ # For every file in this array, do the following
New-Item ".\$n.txt" # Create one new output file for each file it processes
foreach($line in Get-Content $file.FullName){ # For every line in the current file, do the following
Add-Content ".\$n.txt" ($line.Split(':', 2)[1]) # Append the text after the first ':' to the output file
}
$n++ # Increment 'n' once per file
}
GhostDog98
  • Does this answer your question? [How to search and replace text in a file?](https://stackoverflow.com/questions/17140886/how-to-search-and-replace-text-in-a-file) – akane Sep 15 '20 at 01:26
  • I can't _really_ use a temp file, as the text files I'll be processing are a few hundred gigabytes... – GhostDog98 Sep 15 '20 at 01:57
  • probably you can consider using `sed/perl/grep` for this kind of work – Onyambu Sep 15 '20 at 02:14
