
I have a database dump file that is over 5 GB in size, and I want to make a quick edit to its `CREATE DATABASE` and `USE` commands. The dump is provided to me.

I've been using vim to do this from the command line, but the file takes a while to load. With less, I can start reading it almost immediately. Is there a way to edit the file without waiting several minutes for the whole file to load in vim? This could be a parameter passed to vim, or a different common way to edit files from the command line.

I'm looking for a general solution that I can apply to other large files too, so I would like a Linux command that lets me edit the top of a file quickly.

Luboš Turek
James Oravec
  • @Meninx, the other question you suggest specifically states that they don't need to edit the file. This question does need edit. – James Oravec Feb 06 '17 at 15:29
  • Does your edit change the total size of the region edited? If so, is it towards the top or bottom of the file? – Charles Duffy Feb 06 '17 at 15:31
  • @CharlesDuffy, yes, it changes the total size, towards the top of the file. – James Oravec Feb 06 '17 at 15:32
  • Then a full rewrite of all content past what you're editing is unavoidable. POSIX semantics don't have an "insert some extra blocks here" operation, or a "shift everything down by X bytes" operation. – Charles Duffy Feb 06 '17 at 15:32
  • ...now, if you want to make *opening* the file faster, that's doable (perhaps by using an editor that supports `mmap`). But saving it is necessarily going to be a pain. – Charles Duffy Feb 06 '17 at 15:35
  • Does your edit increase or reduce the file size? – Leon Feb 06 '17 at 15:39
  • Is your file format sensitive to white space or does it support certain kind of comments (I mean - can you add certain content to your file without affecting its semantics)? – Leon Feb 06 '17 at 15:42
  • @CharlesDuffy, do you know of a CLI editor that supports `mmap` and comes installed by default on RHEL or CentOS? – James Oravec Feb 06 '17 at 15:43
  • @Leon, ...ahh -- in the case of the edit making it smaller, you're proposing just adding whitespace or comments? Good call. – Charles Duffy Feb 06 '17 at 16:02
  • @VenomFangs, I honestly don't know. I suppose I could look into whether emacs can be configured to only mmap what it's editing and operate in-place -- I mean, it can be configured for any number of things -- but it'd be easier just to do some magic with dd to copy out and directly operate on the chunk you care about, and then dd the rest of the source file onto the end when you're done. It'll be a big slow rewrite, but you're doomed to that anyhow. – Charles Duffy Feb 06 '17 at 16:03
  • @CharlesDuffy Yes. And in case of increased file size one can, provided that this is going to be done on a sufficiently new system, [`fallocate`](http://stackoverflow.com/a/37884191/6394138) a new block near the location of the edit and then pad the extra bytes with whitespace or comments. – Leon Feb 06 '17 at 16:09
  • @Leon, heh -- yay for non-POSIX semantics. That's a great answer, and I think it's unfortunate that the question it's on is closed. Perhaps if the OP there hadn't tagged in a bunch of languages it wasn't topical for, it wouldn't have gotten into audiences that it wasn't interesting to. I've edited the tagging over there and nominated to reopen. – Charles Duffy Feb 06 '17 at 16:26
  • `ex` (or rather `vim` in `ex` mode) might be useful. – chepner Feb 06 '17 at 16:30
  • @CharlesDuffy Thanks. I never thought that reopening that question made any sense (perhaps I had a newbie mentality then as well as possibly lacked the required privileges). I added my reopen vote to yours. – Leon Feb 06 '17 at 17:09
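
For the case raised in the comments where the edit makes the header smaller, the padding idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the demo file, the database names `old_database` and `db2`, and the variable names are all hypothetical, and it assumes the format tolerates extra spaces after the header (SQL does). The replacement is padded to the same byte length and written over the start of the file with `dd conv=notrunc`, so nothing after the header is rewritten.

```shell
#!/usr/bin/env bash
set -euo pipefail
export LC_ALL=C   # make ${#var} count bytes, not multibyte characters

# Hypothetical stand-in for the real multi-gigabyte dump.
dump=$(mktemp)
printf 'CREATE DATABASE old_database;\nUSE old_database;\nINSERT INTO t VALUES (1);\n' > "$dump"
orig_size=$(wc -c < "$dump")

# Assumed header contents; the real dump's header must be checked first.
old_header=$'CREATE DATABASE old_database;\nUSE old_database;\n'
new_header=$'CREATE DATABASE db2;\nUSE db2;\n'

# Refuse to continue unless the file really starts with the expected header.
if ! cmp -s <(head -c "${#old_header}" "$dump") <(printf '%s' "$old_header"); then
  echo "unexpected header, not touching the file" >&2
  exit 1
fi

# The in-place trick only works when the new header is not longer than the old one.
pad=$(( ${#old_header} - ${#new_header} ))
if (( pad < 0 )); then
  echo "new header is longer than the old one" >&2
  exit 1
fi

# Write the new header plus $pad spaces over the old bytes.
# conv=notrunc keeps everything after the written region untouched.
printf '%s%*s' "$new_header" "$pad" '' | dd of="$dump" conv=notrunc 2>/dev/null

head -n 2 "$dump"
```

Because only the first few dozen bytes are written, this takes the same time on a 5 GB file as on a tiny one. It does not help when the new header is longer than the old one; that case needs either the `fallocate` approach or a full rewrite.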

1 Answer


You can use cat:

cat file_with_create_cmd db_dump > new_dump

If you want to use that in a subsequent command instead of writing it to a file, you may use process substitution:

process_dump <(cat file_with_create_cmd db_dump)
hek2mgl
  • I was thinking about this approach too. I figured I would need to `split`, then edit, then concatenate, but I haven't looked into how long the split would take. Is that what you have in mind, or something else? – James Oravec Feb 06 '17 at 15:42
  • You just want to add the create commands *on top* of the dump? – hek2mgl Feb 06 '17 at 15:45
  • @VenomFangs, `split` is overkill. If you're only modifying content inside the first 1kb, use `dd` to copy out only the first 1kb (trivial!), edit that, and then dd everything *but* the first 1kb on top. That way the only part you're writing more than once is the segment that you're editing (being written first in its initial state, then a second time by your editor after changes have been made). – Charles Duffy Feb 06 '17 at 17:46
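
The `dd` suggestion in the last comment could look something like the sketch below. It is a rough illustration, not a tested recipe for any particular dump: the demo file, the names `old_database` and `new_db`, and all variable names are hypothetical. The edit is done with `sed` so the script runs unattended (GNU sed, as shipped on RHEL and CentOS, is assumed, since it leaves a chunk that ends mid-line without a trailing newline); opening the small piece in vim would work just as well. The remainder is appended with `tail -c` rather than a second `dd`, which is a substitution for convenience with the same effect.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the real dump, built large enough to exceed 1 KiB.
src=$(mktemp)
{
  printf 'CREATE DATABASE old_database;\nUSE old_database;\n'
  for ((i = 1; i <= 200; i++)); do
    printf 'INSERT INTO t VALUES (%d);\n' "$i"
  done
} > "$src"

head_part=$(mktemp)   # will hold only the first KiB
out=$(mktemp)         # the edited dump
head_bytes=1024       # assumes the lines to edit sit inside the first KiB

# 1. Copy out only the first KiB; this is instant regardless of file size.
dd if="$src" of="$head_part" bs="$head_bytes" count=1 2>/dev/null

# 2. Edit the small piece. Interactively this could be: vim "$head_part"
#    The substitution is limited to the first two lines (the header).
sed '1,2s/old_database/new_db/' "$head_part" > "$out"

# 3. Append everything after the first KiB of the original.
#    tail -c +N starts output at byte N (1-based), hence the +1.
tail -c "+$((head_bytes + 1))" "$src" >> "$out"

head -n 2 "$out"
```

The bulk of the file is still copied once in step 3, which is unavoidable when the edit changes the size, but the editor only ever has to load the first kilobyte.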