I have multiple data files from which I sometimes want to delete unnecessary extra lines, say from line 8000 to the last line. I want to do that for all the data files, which are stored as .txt files. I used to do it manually, but that is tedious to do frequently. I have combined and modified some code that I found online, but I have two issues with it:

  1. I don't know how to set the range so that the code deletes all lines after line 8000 until the end of the data. Therefore I have assumed that the number of lines to be deleted is some large number (1000). But that's not a good solution, since the total number of lines in a data file might be more than 9000.
  2. I cannot copy or move the created temp file, which contains the new data without the extra lines, to replace the original file :( I found out that the universally unique identifier (UUID) of the file in the temp folder is not the same as the one in "outfile". For instance, the UUID for outfile in Matlab is tp58460076_aaad_4621_b2ac_c7036febc3f0, while in the temp folder it is tp30ade3f8_3abc_4e00_a2ef_26d67c5f836e. I have tried both copyfile and movefile in Matlab.

If someone has an alternative solution or knows what's going wrong with this code, please help!

close all
clear
clc
% Specify the folder where the files live.
myFolder = 'C:\Users\Emma\Data';
% Check to make sure that folder actually exists.  Warn user if it doesn't.
if ~isdir(myFolder)
  errorMessage = sprintf('Error: The following folder does not exist:\n%s', myFolder);
  uiwait(warndlg(errorMessage));
  return;
end
% Get a list of all files in the folder with the desired file name pattern.
filePattern = fullfile(myFolder, '*.txt'); % Change to whatever pattern you need.
theFiles = dir(filePattern);

for k = 1 : length(theFiles)
  baseFileName = theFiles(k).name;
  fullFileName = fullfile(myFolder, baseFileName);
  fprintf(1, 'Now reading %s\n', fullFileName);
%*****************************
first_line_to_delete = 8003;
num_lines_to_delete = 1000;
infilename = fullFileName;
[pathstr, file, ext] = fileparts(infilename);
backfile = fullfile(pathstr, [file '.bak']);
if strcmp(infilename, backfile)
  error('I refuse to edit a backup file! Nothing has been changed.');
end

outfile = tempname;   %a temporary file in TMP directory
fin = fopen(infilename, 'r');
if fin < 0; error('Input file does not exist'); end 
fout = fopen(tempname, 'w');
if fout < 0
  fclose(fin);
  error('Could not open temporary output file');
end

%read lines before the one to be deleted and write them to output
for K = 1 : first_line_to_delete - 1;
  inline = fgets(fin);
  if ~ischar(inline); break; end;  %end of file?
  fwrite(fout, inline);
end
for K = 1 : num_lines_to_delete;
  if ~ischar(inline); break; end   %in case EOF
  inline = fgets(fin);   %and do nothing with it
end
%copy all remaining input lines to output file
while ischar(inline)
  inline = fgets(fin);
  if ischar(inline)   %not if we hit EOF
    fwrite(fout, inline);
  end
end
fclose(fin);
fclose(fout);

%we did the copying and have a file with the desired
%result. Now put it in the proper place
[status,message,messageId]  = copyfile(infilename, backfile, 'f'); %Emma mod
if ~status
  if strcmp(infilename, backfile)
    fprintf(2, 'Good thing your programmer is paranoid about people overriding\nsanity checks, because something went wrong and you nearly lost your file!\n');
  else
    delete(backfile);
  end
  error('Could not rename file to .bak, file left untouched');
else
  [status,message,messageId]  = copyfile(fullfile(outfile, [myFolder, infilename]), 'f');
  if ~status
    error( ['Could not rename temp file to original name, original moved to ', backfile]);
  end
end
%**************************
end
Emma
  • http://stackoverflow.com/questions/19017994/how-do-i-limit-or-truncate-text-file-by-number-of-lines. Otherwise, just delete the `for K = ...` loop and the subsequent `while ischar...` loop entirely. – beaker May 13 '17 at 20:56
  • @beaker Thank you very much! A shell script of one line replaced this lengthy Matlab code :) Luckily I have access to some terminal! – Emma May 15 '17 at 12:19

1 Answer

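The answer to this question is the following command, run in the folder that holds the data files. With GNU sed it edits every .txt file in place and deletes everything from line 8003 through the last line (`$`):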

sed -i '8003,$ d' *.txt
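
For anyone who wants to try this before touching real data, here is a minimal sketch that runs the same idea in a scratch directory. The file names, the cutoff of 11 (standing in for 8003), and the `.bak` suffix are my own assumptions for the demo, not part of the original command. Attaching a suffix to `-i` keeps a backup copy of each file, and as far as I know that attached form is accepted by both GNU and BSD sed.

```shell
# Work in a throwaway directory so no real data is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two sample files: one longer than the cutoff, one shorter.
seq 1 20 > long.txt
seq 1 5 > short.txt

# Hypothetical cutoff for the demo; the question uses 8003.
first_line_to_delete=11

# Delete from the cutoff through the last line ($) of each file.
# -i.bak edits in place and keeps the original as <name>.bak.
sed -i.bak "${first_line_to_delete},\$d" *.txt

# Show the resulting line counts.
wc -l long.txt short.txt
```

Files shorter than the cutoff are left unchanged, because no line falls inside the address range. If the backups are not wanted, they can be removed afterwards with `rm *.txt.bak`.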
Emma