
I have a text file of about 3 GB. I need to delete some strings from it, but I'm not sure my method is good. So far I have:
- read each string from the file
- found the strings that need to be deleted
- built 2 arrays: strings to keep and strings to delete

What should the next steps be? This task looks easy for small files, but a giant file raises more issues.

Paul K.
  • Show us what you've tried that isn't working. For a file that size I would go through it line by line and put what you want to save into a temporary file. You probably don't want to try bringing the entire file into memory. – Dave May 04 '18 at 17:23
  • I wouldn't use PHP for that. You can do this very efficiently with sed under linux - see: https://stackoverflow.com/questions/5410757/delete-lines-in-a-text-file-that-contain-a-specific-string – maio290 May 04 '18 at 17:38
  • @Dave, you are right. My last step was saving the needed strings in an array. I limited it to 50 strings for testing, because I don't know what to expect from 1000+ strings in memory, and I'm looking for ways to optimize this. – Paul K. May 06 '18 at 22:05

1 Answer

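if ($fh = fopen("file.txt", "r")) {
    $left = '';

    while (!feof($fh)) { // read the file in chunks
        $temp = fread($fh, 8192); // fread() requires a length in bytes; 8192 is an arbitrary chunk size
        $fgetslines = explode("\n", $temp);
        // Prepend the incomplete line carried over from the previous chunk.
        $fgetslines[0] = $left . $fgetslines[0];

        // Unless this is the final chunk, the last element may be cut mid-line,
        // so hold it back and complete it with the next chunk.
        $left = feof($fh) ? '' : array_pop($fgetslines);
        foreach ($fgetslines as $k => $line) {
            // This is where you can build your check for the strings you want to remove:
            // an if statement or a switch, whichever makes sense with your current logic.
            // Write the lines you want to keep to a temp file, then
            // overwrite your original file with that temp file.
        }
    }
    fclose($fh); // close only when fopen() succeeded
}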

I think this is what you're looking for.
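
If reading in chunks and stitching partial lines together feels fragile, a simpler variant of the same idea is to let fgets() hand back one line at a time and copy the lines you keep into a temporary file. The following is only a minimal sketch under some assumptions: removeLines() is a hypothetical helper name, the ".tmp" suffix for the temporary file is arbitrary, and the callback that decides which lines to drop stands in for whatever check you actually need.

```php
<?php
// Hypothetical helper: copies $path line by line into a temporary file,
// skipping every line for which $shouldDelete returns true, then replaces
// the original file. Only one line is held in memory at a time.
function removeLines(string $path, callable $shouldDelete): int
{
    $tmpPath = $path . '.tmp'; // assumed temp-file name, same directory

    $in = fopen($path, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open $path for reading");
    }
    $out = fopen($tmpPath, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open $tmpPath for writing");
    }

    $removed = 0;
    // fgets() returns one line (including its trailing newline) or false at EOF.
    while (($line = fgets($in)) !== false) {
        // The callback sees the line without its line ending.
        if ($shouldDelete(rtrim($line, "\r\n"))) {
            $removed++;
            continue;
        }
        fwrite($out, $line);
    }

    fclose($in);
    fclose($out);

    // Swap the filtered copy into place.
    if (!rename($tmpPath, $path)) {
        throw new RuntimeException("Cannot replace $path");
    }
    return $removed;
}

// Example: drop every line containing the placeholder text "DELETE ME".
// $removed = removeLines('file.txt', function (string $line): bool {
//     return strpos($line, 'DELETE ME') !== false;
// });
```

Memory use stays roughly constant regardless of file size, because neither the lines to keep nor the lines to delete are ever collected in an array. The trade-off is that you need enough free disk space for a second copy of the file while the script runs.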

  • I didn't know about fread before. But I think handling a 3 GB file as an array is a bad idea. Also, the way fread and while are used here doesn't look correct, and the script will report that $lines doesn't exist. My main question is how best to work with a 3 GB file without killing the software. – Paul K. May 06 '18 at 22:33