I tried opening a huge (~2GB) file in VIM but it choked. I don't actually need to edit the file, just jump around efficiently.
How can I go about working with very large files in VIM?
I had a 12GB file to edit today. The vim LargeFile plugin did not work for me. It still used up all my memory and then printed an error message :-(. I could not use hexedit for it either, as it cannot insert anything, only overwrite. Here is an alternative approach:
You split the file, edit the parts and then recombine it. You still need twice the disk space though.
Grep for something surrounding the line you would like to edit:
grep -n 'something' HUGEFILE | head -n 1
Extract that range of the file. Say the lines you want to edit are at line 4 and 5. Then do:
sed -n -e '4,5p' -e '5q' HUGEFILE > SMALLPART
The `-n` option is required to suppress sed's default behaviour of printing everything; `4,5p` prints lines 4 and 5, and `5q` aborts sed after processing line 5.
Edit SMALLPART using your favourite editor.
Combine the file:
(head -n 3 HUGEFILE; cat SMALLPART; sed -e '1,5d' HUGEFILE) > HUGEFILE.new
HUGEFILE.new will now be your edited file, and you can delete the original HUGEFILE.
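As a quick sanity check (assuming your edit did not add or remove lines), the old and new files should report the same line count:
wc -l HUGEFILE HUGEFILE.new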
This has been a recurring question for many years. (The numbers keep changing, but the concept is the same: how do I view or edit files that are larger than memory?)
Obviously, `more` or `less` are good approaches to merely reading the files; `less` even offers vi-like keybindings for scrolling and searching.
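For example (a sketch; the line number and search pattern are placeholders), `less` can jump straight to a given line:
less +1000000g HUGEFILE
Once inside, / searches forward, n repeats the last search, and G jumps to the end of the file, just as in vi.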
A Freshmeat search on "large files" suggests that two editors would be particularly suited to your needs.
One would be: lfhex ... a large file hex editor (which depends on Qt). That one, obviously, entails using a GUI.
Another would seem to be suited to console use: hed ... and it claims to have a vim-like interface (including an ex mode?).
I'm sure I've seen other editors for Linux/UNIX that were able to page through files without loading their entirety into memory. However, I don't recall any of their names. I'm making this response a "wiki" entry to encourage others to add their links to such editors. (Yes, I am familiar with ways to work around the issue using `split` and `cat`; but I'm thinking of editors, especially console/curses editors, which can dispense with that and save us the time/latency and disk-space overhead that such approaches entail.)
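For reference, here is a minimal sketch of that split/cat workaround (the file name and prefix are placeholders):
split -l 1000000 HUGEFILE part_    # million-line pieces: part_aa, part_ab, ...
vi part_aa                         # edit whichever piece contains your lines
cat part_* > HUGEFILE.new          # stitch the pieces back together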
Since you don't need to actually edit the file: view (or vim -R) works reasonably well on large files.
I wrote a little script based on Florian's answer that uses nano (my favorite editor):
#!/bin/sh
# hfnano: edit a small slice of a huge file in nano, without opening the whole file.
if [ "$#" -ne 3 ]; then
    echo "Usage: $0 hugeFilePath startLine endLine" >&2
    exit 1
fi
# Pull out lines startLine..endLine, quitting sed as soon as we have them.
sed -n -e "$2,$3p" -e "$3q" "$1" > hfnano_temporary_file
nano hfnano_temporary_file
# Reassemble: everything before the slice, the edited slice, everything after it.
(head -n "$(($2 - 1))" "$1"; cat hfnano_temporary_file; sed -e "1,$3d" "$1") > hfnano_temporary_file2
cat hfnano_temporary_file2 > "$1"
rm hfnano_temporary_file hfnano_temporary_file2
Use it like this:
sh hfnano yourHugeFile 3 8
In that example, nano opens lines 3 through 8; you can edit them, and when you save and quit, those lines in the huge file are automatically overwritten with your saved lines.
I had the same problem, but it was a 300GB MySQL dump and I wanted to get rid of the `DROP` statements and change `CREATE TABLE` to `CREATE TABLE IF NOT EXISTS`, so I didn't want to run two invocations of `sed`. I wrote this quick Ruby script to dupe the file with those changes:
#!/usr/bin/env ruby

# Pattern (as a regex source string) => replacement. Each rule fires at most once.
matchers = {
  %q/^CREATE TABLE `foo`/ => %q/CREATE TABLE IF NOT EXISTS `foo`/,
  %q/^DROP TABLE IF EXISTS `foo`;.*$/ => "-- DROP TABLE IF EXISTS `foo`;"
}

matchers.each_pair { |m, r|
  STDERR.puts "%s: %s" % [ m, r ]
}

# Stream stdin line by line, so the whole dump never has to fit in memory.
STDIN.each { |line|
  line.chomp!
  unless matchers.empty?
    matchers.each_pair { |m, r|
      re = /#{m}/
      next if line[re].nil?
      line.sub!(re, r)
      STDERR.puts "Matched: #{m} -> #{r}"
      # Drop the rule once it has fired; breaking right away makes the
      # delete-during-iteration safe.
      matchers.delete(m)
      break
    }
  end
  puts line
}
Invoked like
./mreplace.rb < foo.sql > foo_two.sql
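(For comparison: a single sed invocation with two expressions would also make both changes in one pass. This is only a sketch, and unlike the script above it rewrites every occurrence rather than just the first:)
sed -e 's/^DROP TABLE IF EXISTS `foo`;.*$/-- DROP TABLE IF EXISTS `foo`;/' \
    -e 's/^CREATE TABLE `foo`/CREATE TABLE IF NOT EXISTS `foo`/' foo.sql > foo_two.sql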
For huge one-liners, `cut` can print a range of characters (this prints characters 1 to 99):
cut -c 1-99 filename
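The same idea works for any character window; for example, to pull a 100-character slice out of the middle of a single enormous line:
cut -c 1000001-1000100 filename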
It's already late, but if you just want to navigate through the file without editing it, `cat` can do the job too.
% cat filename | less
Or, more simply:
% less filename
emacs works very well with files into the hundreds of megabytes; I've used it on log files without too much trouble.
But generally, when I have some kind of analysis task, I find writing a Perl script a better choice.
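For example (just a sketch; the pattern and file names are placeholders), a Perl one-liner streams the file line by line, so memory use stays flat regardless of file size:
perl -ne 'print if /ERROR/' huge.log > errors.log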
Old thread, but nevertheless (pun intended :)).
$ less filename
less works efficiently if you don't want to edit and just want to look around, which is the case when examining huge log files. Searching in less works like in vi. Best part: it's available by default on most distros, so it won't be a problem in production environments either.
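For example, you can have less open the file positioned at the first match of a pattern (the pattern here is only an illustration):
less +/ERROR filename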