I have a large text file (~3GB). While creating it, I made a mistake, and the first few characters of the first line are wrong. The rest of the first line is correct, and so are all the other lines. My question is quite simple:

How do I remove the first n characters of such a large file? I don't want to delete the whole first line, only the first n characters. My requirements are:

  1. Without replacing all the remaining lines.
  2. Without reading the whole file.
  3. Ideally, using unix shell commands.

I have tried sed, but this replaces the first line and copies all the rest... Maybe I could use the "quit" option? Of course, I could create the file again, without making that mistake...

vefthym
  • possible duplicate of [How can I remove the first line of a text file using bash/sed script?](http://stackoverflow.com/questions/339483/how-can-i-remove-the-first-line-of-a-text-file-using-bash-sed-script) – Ciro Santilli OurBigBook.com Oct 16 '14 at 08:55
  • @CiroSantilli how is that a duplicate??? I don't want to delete the first line... – vefthym Oct 16 '14 at 08:56
  • Same principle: how to efficiently remove from front of file. Answer: not possible in Linux AFAIK. – Ciro Santilli OurBigBook.com Oct 16 '14 at 08:57
  • http://stackoverflow.com/questions/18072180/truncating-the-first-100mb-of-a-file-in-linux – Ciro Santilli OurBigBook.com Oct 16 '14 at 09:02
  • Related: http://stackoverflow.com/questions/18072180/truncating-the-first-100mb-of-a-file-in-linux Summary: removing data from the beginning of a file without modifying the rest is hard. – Joni Oct 16 '14 at 09:02
  • If you are worried about having enough eyes seeing this and providing a good answer, you can wait a couple of days and offer a bounty. Editing over and over again won't help much. – fedorqui Oct 16 '14 at 11:41
  • @fedorqui I'll just wait a few hours. I didn't edit the question to get better answers, just to make it clearer for future reference :) – vefthym Oct 16 '14 at 11:46

1 Answer

You can use:

sed -i.bak -r '1s/^.{10}//' file

This creates a backup, file.bak, and removes the first 10 characters from the first line. Note that -i alone can also be used to edit in place without a backup.

Test

Original file:

$ cat a
1234567890some bad data and here we are
blablabla
yeah

Let's run the command:

$ sed -i.bak -r '1s/^.{10}//' a
$ cat a
some bad data and here we are
blablabla
yeah
$ cat a.bak
1234567890some bad data and here we are
blablabla
yeah
fedorqui
  • +1 Good answer, thanks. However, it deletes the first 10 characters and then copies the rest of the file. – vefthym Oct 16 '14 at 09:40
  • Well, you can also remove these characters in the first line like this: `sed -r '1{s/^.{10}//;q}' file > new_file`, which redirects to another file. Then use `tail -n +2 file >> new_file` to get everything from the 2nd line and finally `mv new_file file` to replace the original one. – fedorqui Oct 16 '14 at 09:44
  • Good point, too! However, the functionality I would want ideally would be: delete the first n characters, stop. What you describe is more like: delete the first n characters, copy the rest of the file, stop. Right? – vefthym Oct 16 '14 at 09:48
  • Well, I don't think there is such a thing. Note that deleting content from the beginning of the file implies moving the rest of it. – fedorqui Oct 16 '14 at 09:50
  • @vefthym Whether you use MS Word or the vi editor, if you delete some characters from the beginning, the subsequent characters are automatically moved to the front. – cppcoder Oct 16 '14 at 09:56
  • You are both right, the point in my question about editors was irrelevant. So, if there is no way to do exactly what I want, I guess @fedorqui's answer is the best I could get. – vefthym Oct 16 '14 at 10:10
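
For reference, the three-step approach from fedorqui's comment above (sed with q, then tail, then mv) might look like the sketch below. It assumes GNU sed, as the -r flag in the answer suggests. The file names demo.txt and demo.new, the sample content, and n=10 are hypothetical values chosen for illustration. As the comments point out, this still copies the rest of the file; it only stops sed from reading past the first line.

```shell
#!/bin/sh
# Sketch only: demo.txt, demo.new and n=10 are hypothetical, not from the question.
set -e
n=10

# Build a small sample file with 10 bad characters at the start of line 1.
printf '1234567890some bad data and here we are\nblablabla\nyeah\n' > demo.txt

# 1. Strip the first n characters of line 1, then quit so sed reads no further.
sed -r "1{s/^.{$n}//;q}" demo.txt > demo.new

# 2. Append everything from the second line onward (the rest is still copied).
tail -n +2 demo.txt >> demo.new

# 3. Replace the original file with the corrected copy.
mv demo.new demo.txt

cat demo.txt
```

Unlike sed -i.bak, this leaves no backup behind, so it may be worth checking the new file before running the final mv on real data.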