
I have a 23GB file and I would like to edit its 23rd line, but the server has only 200 MB of RAM available. I also cannot make a full copy of the file, because only 20GB of disk space is left.

How can I do this? I tried head, tail and sed, but they all seem to create a temporary file. Is it possible to do it without a temporary file?

Paul
Amrida D
  • Did you try sed with edit in place? – Atropo Jun 09 '15 at 08:54
  • Most (all?) seds, GNU sed among them, use a temporary file for `-i`. – Wintermute Jun 09 '15 at 09:00
  • Is that a temporary file or a backup file? – Atropo Jun 09 '15 at 09:01
  • It reads from one file, writes to the other, then removes the original (unless it's instructed to keep a backup) and renames the new file to the old name. – Wintermute Jun 09 '15 at 09:03
  • Perhaps you could adapt the trick from [here](https://superuser.com/questions/378230/how-can-i-compress-a-file-on-linux-in-place-without-using-additional-disk-space), as in `sed '23s/foo/bar/' filename | dd of=filename conv=notrunc` (and truncate to the new length afterwards if the resulting file is shorter). **Test that before you use it on live data. You do have a backup, right? Right?!** – Wintermute Jun 09 '15 at 09:07
  • You rent a virtual machine, copy the 23GB file to it, edit it, check it, delete the original, and then copy the edited file back to the original location. Alternatively you buy some more storage hardware. – Paul Jun 09 '15 at 09:09
  • Also, `sed '23 {s/foo/bar/; q}'`, where q means quit, could be considered. – josifoski Jun 09 '15 at 09:12
  • You can see this [question](http://unix.stackexchange.com/questions/11067/is-there-a-way-to-modify-a-file-in-place) that uses `dd` as in the @Wintermute example – Atropo Jun 09 '15 at 09:14
  • @josifoski a quit will stop the output of the lines after line 23, so the new file will only have 23 lines in the end. Not an option in this case. – NeronLeVelu Jun 09 '15 at 09:20
  • Hmm...come to think of it, does the `dd` trick have a chance of working if the transformation makes the file larger? Another thought: Compress the file, then `zcat` it through `sed`, i.e., `gzip file; zcat file.gz | sed '23s/foo/bar/' > file`. You might have enough space left for that. – Wintermute Jun 09 '15 at 09:22
  • copying part of file on usb, deleting that part of bigfile, using sed, some hdd memory will be freed or using that overwrite.c idea or best using better hardware – josifoski Jun 09 '15 at 09:41
  • Use c/c++. Seek to the position, write whatever you want. close the file ? – 123 Jun 09 '15 at 15:20
  • See http://stackoverflow.com/a/17331179/1745001 for another use of `dd` (your only hope). – Ed Morton Jun 09 '15 at 19:15
  • The answer to the following question might help you: http://stackoverflow.com/questions/8353536/emacs-to-read-large-files-14gb/8353828#8353828 – danizmax Jun 11 '15 at 12:41
  • Will the revised file have a 23rd line that is bigger, smaller, or the same size as the original? You can write code to handle any of the three cases (same size is easiest). Whether there's a standard tool suitable is considerably more debatable. You definitely need a back-up of some sort. Are you sure you can't compress some other files on the system and give yourself enough space to work in? Can't you get new disk space for the machine? (I'm guessing, given the mention of 200MiB memory, that it is an antique on its last legs and not expandable any more.) – Jonathan Leffler Jun 16 '15 at 01:02

2 Answers


The solution is to edit the file with a hex editor. Hex editors are built to handle huge files, even whole disks and partitions.

You may find hexedit (ncurses based) or ghex (Gnome/Gtk based) useful. They are common utilities, so you will most probably find them in your distribution's repository.

All hex editors I have used show a twin-panel view: the left panel shows the bytes of the file in hex, and the right panel shows an ASCII representation where that is possible.

In order to find and edit your 23rd line:

sed -n '23{p;q}' my_huge_dump.sql : Will print the contents of this line (the q makes sed stop after line 23 instead of reading the rest of the file)
sed -n '23{p;q}' my_huge_dump.sql | od -A n -t x1 : Will print the contents of this line in hexadecimal format.

or open the file with less -N my_huge_dump.sql and view the contents of line 23. (-N in less enables line numbering)
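Knowing the byte offset of the line can also help, since most hex editors offer some way to jump to a given position. The following is a minimal sketch, not part of the original answer: `line_offset` and `line_length` are hypothetical helper names, and the sketch assumes LF line endings and that line N ends with a newline.

```shell
# Byte offset at which line N starts: the total size of lines 1..N-1.
# head exits after N-1 lines, so the rest of the file is never read.
line_offset() {
    file=$1; n=$2
    echo "$(( $(head -n "$((n - 1))" "$file" | wc -c) ))"
}

# Length in bytes of line N, not counting its trailing newline.
# sed quits right after printing line N.
line_length() {
    file=$1; n=$2
    echo "$(( $(sed -n "${n}{p;q;}" "$file" | wc -c) - 1 ))"
}
```

For example, `line_offset my_huge_dump.sql 23` prints the position of the first byte of line 23, and `line_length my_huge_dump.sql 23` prints how many bytes you can overwrite without changing the file size.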

Now, knowing the content of the 23rd line:

  • If the text of this line is reasonably unique and different from the surrounding lines, you can spot it in the right (ASCII) panel and navigate to it with the arrow keys. In hexedit you use the Tab key to move between the hex and ASCII panels. In gHex you can use your mouse as well. You may also search for the string you're interested in: move to the ASCII panel and press / in hexedit, or use the menu in gHex.
  • If the line you want to edit has similar contents to other lines and you can't find it in the ascii panel, then you must count the "newline" separators to find the 23rd line. New lines (LF) are represented as 0A in hex. In the ASCII panel, new lines are represented as dots .

Then assuming you found the line you want to edit, you have the following options:

  • Hopefully, the new content of the 23rd line is shorter than or equal in length to the existing content (so you won't need to grow the file and move everything after the line). In this case, you have to enter the Fill-mode, i.e. the mode in which you overwrite existing content by typing over the old text. This is the default mode in both gHex and hexedit. Move to the location you want to edit and start typing. Pressing Backspace will undo your changes. If the new content is shorter than the existing content, you may pad the rest of the line with spaces so that the length of the file does not change.
  • If the new content is longer than the existing content of this line, then you have to enter the Insert mode. You can do that using the menu in gHex. In hexedit you have to use the Esc+I keybinding. Then start typing and the new characters will be inserted at the current location.
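The first option (overwrite with padding) can also be scripted with `dd conv=notrunc`, as suggested in the comments on the question. This is only a sketch under assumptions: `overwrite_line` is a hypothetical helper name, the file uses LF line endings, line N ends with a newline, and trailing spaces on that line are harmless for whatever reads the file. Try it on a small test file first.

```shell
# Overwrite line N of FILE with NEW, padding with spaces up to the old
# length. Refuses to run when NEW is longer than the existing line.
overwrite_line() {
    file=$1; n=$2; new=$3
    # Offset of line N = size of the N-1 lines before it
    offset=$(( $(head -n "$((n - 1))" "$file" | wc -c) ))
    # Length of line N in bytes, without its newline
    old_len=$(( $(sed -n "${n}{p;q;}" "$file" | wc -c) - 1 ))
    new_len=$(( $(printf '%s' "$new" | wc -c) ))
    if [ "$new_len" -gt "$old_len" ]; then
        echo "new text is longer than line $n; refusing" >&2
        return 1
    fi
    pad=$((old_len - new_len))
    {
        printf '%s' "$new"
        # Emit $pad spaces so the newline and all later bytes stay put
        if [ "$pad" -gt 0 ]; then printf "%${pad}s" ''; fi
    } | dd of="$file" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}
```

With `conv=notrunc`, dd writes only the bytes it receives at the given offset and leaves the rest of the file untouched, so no temporary file is created and the memory use is tiny.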

In the first case, saving the file is practically instantaneous, because the edit happens in place and only the changed bytes are written. In the latter case, I'm not sure how the growth in size and the moving of the following bytes will be handled, but I hope the filesystem uses a larger non-contiguous block to move some of the contents rather than moving the whole file.

If you're happy with your changes, save the file:

  • Use the menu in gHex
  • Use Ctrl+X in hexedit and answer (Y)es when asked whether to save the changes.

Always make sure you have a backup in place!
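When there is not enough disk space for a full copy, one partial safeguard for a same-length, in-place edit is to save just the bytes of the line before touching it. This is a sketch of that idea, not a substitute for a real backup: `save_line` and `restore_line` are hypothetical helper names, and restoring only works if the edit did not change the length of the line.

```shell
# Save the original bytes of line N (including its newline) to BACKUP.
save_line() {
    file=$1; n=$2; backup=$3
    sed -n "${n}{p;q;}" "$file" > "$backup"
}

# Write the saved bytes back at the start of line N. Only valid if the
# edit kept the line length unchanged, i.e. an in-place overwrite.
restore_line() {
    file=$1; n=$2; backup=$3
    offset=$(( $(head -n "$((n - 1))" "$file" | wc -c) ))
    dd if="$backup" of="$file" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}
```

A typical sequence would be `save_line my_huge_dump.sql 23 line23.bak`, then the edit, then `restore_line my_huge_dump.sql 23 line23.bak` if the result turns out to be wrong.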

EDIT: I found out that gHex isn't suitable for your situation, since it tries to load the whole file in memory. hexedit will serve you fine. However, if you want a graphical editor like gHex, but with partial file loading capabilities, you may try wxHexEditor. Check also the Comparison of Hex editors page in Wikipedia.

henfiber

Liquid Studio Community Edition contains a Large File Editor which can open and edit terabyte-sized files on low-spec machines, and it's free.

It requires enough disk space to copy the file (when writing it back out), but hardly requires any memory.


Sprotty