14

I have a nearly 3 GB file that I would like to add two lines to the top of. Every time I try to add them manually, vim and vi freeze up on save (I let each try for about 10 minutes). I was hoping there would be a way to prepend to the top of the file, the same way you would append to the bottom. Everything I have seen so far, however, involves a temporary file, which I worry would be slow given the file size. I was hoping for something like:

grep -top lineIwant >> fileIwant

Does anyone know a good way to append to the top of the file?

Stephopolis
  • 1,765
  • 9
  • 36
  • 65
  • Re: "The only things I have seen so far however include a temporary file, which I feel would be slow due to the file size": you will need to read in the entire file, and write everything out, in any case, since you're "moving" every byte to a new position in the file. So you really might as well create a temporary file. – ruakh Feb 22 '13 at 20:37
  • 1
    "appending to the top" is usually known as "prepending", and there are a few other [questions on this topic](http://stackoverflow.com/questions/2690823/prepending-to-a-multi-gigabyte-file) – that other guy Feb 22 '13 at 20:53

5 Answers

17

Try

cat file_with_new_lines file > newfile
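A complete sketch of that workflow, assuming the two new lines live in a file (hypothetically named header.txt here) and the original is replaced only after the copy succeeds:

```shell
printf 'original contents\n' > file                      # stand-in for the 3 GB file
printf 'first new line\nsecond new line\n' > header.txt  # the lines to prepend
cat header.txt file > newfile && mv newfile file         # copy, then swap in the result
head -n3 file
```

Note this still reads and writes the whole file once; as the comments point out, that is unavoidable since every byte moves to a new offset.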
Fredrik Pihl
  • 44,604
  • 7
  • 83
  • 130
7

I did some benchmarking to compare sed with in-place edit to cat (both approaches suggested in other answers here).

~3GB bigfile filled with dots:

$ head -n3 bigfile
................................................................................
................................................................................
................................................................................

$ du -b bigfile
3025635308      bigfile

A file newlines containing the two lines to insert at the top of bigfile:

$ cat newlines
some data
some other data

$ du -b newlines
26      newlines

Benchmark results using dumbbench v0.08:

cat:

$ dumbbench -- sh -c "cat newlines bigfile > bigfile.new"
cmd: Ran 21 iterations (0 outliers).
cmd: Rounded run time per iteration: 2.2107e+01 +/- 5.9e-02 (0.3%)

sed with redirection:

$ dumbbench -- sh -c "sed '1i some data\nsome other data' bigfile > bigfile.new"
cmd: Ran 23 iterations (3 outliers).
cmd: Rounded run time per iteration: 2.4714e+01 +/- 5.3e-02 (0.2%)

sed with in-place edit:

$ dumbbench -- sh -c "sed -i '1i some data\nsome other data' bigfile"
cmd: Ran 27 iterations (7 outliers).
cmd: Rounded run time per iteration: 4.464e+01 +/- 1.9e-01 (0.4%)

So sed is much slower when doing an in-place edit on large files: about 80.6% slower than sed with output redirection, and roughly twice as slow as cat, probably because it moves an intermediary temp file to the location of the original file afterwards. Using I/O redirection, sed is only 11.8% slower than cat.

Based on these results I would use cat as suggested in this answer.
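If dumbbench isn't available, a rough single-run comparison can be sketched with bash's time on small stand-in files (assumes GNU sed; a 1 MB stand-in is only a sanity check, so repeat on the real 3 GB file for meaningful figures):

```shell
# Create small stand-ins for newlines and bigfile.
printf 'some data\nsome other data\n' > newlines
head -c 1000000 /dev/zero | tr '\0' '.' > bigfile    # 1 MB of dots
# Time both approaches (timing output goes to stderr, no averaging).
time cat newlines bigfile > bigfile.cat
time sed '1i some data\nsome other data' bigfile > bigfile.sed
# Both should produce byte-identical results.
cmp bigfile.cat bigfile.sed && echo "outputs match"
```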

Community
  • 1
  • 1
speakr
  • 4,141
  • 1
  • 22
  • 28
2

Try one of the following.

Using sed:

sed -i '1i NewLine' file

Or using ed:

ed -s file <<EOF
1i
NewLine
.
w
q
EOF
Gilles Quénot
  • 173,512
  • 41
  • 224
  • 223
  • This is the correct way to solve the problem. Add the -i to edit the file. `sed -i '1i newline' /path/to/file` – Ken Feb 22 '13 at 20:40
  • 2
    Note to OP: this still uses a temporary file, but it at least hides the details from you. It won't be any faster, though. – chepner Feb 22 '13 at 21:15
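A quick way to see the temporary file at work (a sketch; assumes GNU sed and GNU stat): sed -i writes its output to a temp file and renames it over the original, so the file's inode changes after an in-place edit:

```shell
printf 'one\ntwo\n' > f
before=$(stat -c %i f)        # inode before the edit
sed -i '1i zero' f
after=$(stat -c %i f)         # inode after the edit
[ "$before" != "$after" ] && echo "inode changed: sed wrote a new file"
```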
2

The speed of such an operation depends greatly on the underlying file system. To my knowledge there isn't a FS optimized for this particular operation. Most FS organize files using full disk blocks, except for the last one, which may be only partially filled by the end of the file. Indeed, a file of size N takes N/S blocks (integer division), where S is the block size, plus one more block for the remaining part of the file (of size N%S, % being the remainder operator) if N is not divisible by S.

Usually, these blocks are referenced by their indices on the disk (or partition), and these indices are stored within the FS metadata, attached to the file entry which allocates them.

From this description, you can see that it could be possible to prepend content whose size is an exact multiple of the block size just by updating the metadata with the new list of blocks used by the file. However, if the prepended content doesn't fill a whole number of blocks, all of the existing data would have to be shifted by the excess amount.

Some FS may allow partially used blocks anywhere in a file's block list (not only as the last entry), but this is not a trivial thing to implement.
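The block arithmetic above can be sketched in the shell (the 4096-byte block size is an assumption; real values depend on the filesystem, and stat -c is the GNU form):

```shell
head -c 5000 /dev/zero > sample.bin   # a 5000-byte example file
N=$(stat -c %s sample.bin)            # file size in bytes
S=4096                                # assumed filesystem block size
echo "full blocks: $((N / S)), bytes in last partial block: $((N % S))"
# → full blocks: 1, bytes in last partial block: 904
```

Prepending, say, 26 bytes (as in the benchmark above) would leave every block boundary misaligned by 26 bytes, which is why the whole file has to be rewritten.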


At a higher level, even if that operation is supported by the FS driver, it is still possible that programs don't use the feature.

For the instance of the problem you are trying to solve, the best approach is probably a program capable of concatenating the new content and the existing file into a new file.

Community
  • 1
  • 1
didierc
  • 14,572
  • 3
  • 32
  • 52
  • 1
    I guess my answer isn't much of use for your problem, but it might help understanding the reason behind the symptoms. – didierc Feb 22 '13 at 21:14
0

Given a file with this content:

cat file
   Unix
   linux

you can append two lines after the first line in one go with:

sed -i '1a C\njava' file

cat file
   Unix
   C
   java
   linux

Use i if you want to INSERT before the addressed line, a to append after it, and c to Replace (change) it.

loganaayahee
  • 809
  • 2
  • 8
  • 13