I came up with some code that does the job quite efficiently.
Firstly, we can use the IO#each_line method. Say we need line 3,000,000:
#!/usr/bin/ruby -w
file = File.open(File.join(__dir__, 'hello.txt'))
final = nil
read_upto = 3_000_000 - 1
file.each_line.with_index do |l, i|
  if i == read_upto
    final = l
    break
  end
end
file.close
p final
Running it with the time shell builtin:
(I have a big hello.txt file in which every line reads "#!/usr/bin/ruby -w #lineno", where lineno is the line number.)
$ time ruby p.rb
"#!/usr/bin/ruby -w #3000000\n"
real 0m1.298s
user 0m1.240s
sys 0m0.043s
The first line can be read the same way: just set read_upto to 0.
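The loop above can also be wrapped in a small method. This is only a sketch: nth_line is a hypothetical name of mine, not part of Ruby, and the demo writes a tiny temporary file instead of reading hello.txt. It uses File.foreach, which closes the file for us even when we return early.

```ruby
require 'tempfile'

# Hypothetical helper (not a Ruby built-in): returns the n-th line (1-based)
# of the file at path, or nil if the file has fewer than n lines.
def nth_line(path, n)
  # File.foreach opens the file, yields it line by line and closes it
  # again, even when we leave the block early with return.
  File.foreach(path).with_index(1) do |line, i|
    return line if i == n
  end
  nil
end

# Small demo on a throwaway five-line file
Tempfile.create('demo') do |f|
  f.write((1..5).map { |i| "line #{i}\n" }.join)
  f.flush
  p nth_line(f.path, 3) # "line 3\n"
  p nth_line(f.path, 9) # nil
end
```

Only one line is held in memory at a time, so this should behave like the each_line version for big files.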
Secondly, extending anothermh's answer:
#!/usr/bin/ruby -w
enum = IO.foreach(File.join(__dir__, 'hello.txt'))
# Getting the first line
p enum.first
# Getting the 100th line
# This can still use a lot of memory for large indexes,
# because take(n) builds an array of the first n lines
p enum.take(100)[-1]
# The time consuming but memory efficient way
# reading the 3,000,000th line
# While loops are fastest
index, i = 3_000_000 - 1, 0
enum.next && i += 1 while i < index
p enum.next # reading the 3,000,000th line
Running with time:
$ time ruby p.rb
"#!/usr/bin/ruby -w #1\n"
"#!/usr/bin/ruby -w #100\n"
"#!/usr/bin/ruby -w #3000000\n"
real 0m2.341s
user 0m2.274s
sys 0m0.050s
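If the array built by take is a concern, a lazy enumerator is another option. This is a sketch I have not benchmarked against the loops above, and lazy_nth_line is a hypothetical name. It relies on Enumerator::Lazy#drop and Enumerable#first, both in Ruby core.

```ruby
# Hypothetical helper: n-th line (1-based) without building an array of
# all the preceding lines. Returns nil if the file is too short.
def lazy_nth_line(path, n)
  # lazy.drop skips n - 1 lines one at a time; first stops the iteration
  # as soon as the wanted line has been yielded.
  File.foreach(path).lazy.drop(n - 1).first
end
```

For example, lazy_nth_line(File.join(__dir__, 'hello.txt'), 100) would stand in for enum.take(100)[-1].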
There are other options, such as IO#readpartial and IO#sysread, but IO.foreach and IO#each_line are the easiest to work with and quite fast.
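For the curious, here is a rough idea of what a chunk-based version could look like. It is only a sketch and I have not timed it. It uses IO#read with a length (in the same spirit as readpartial or sysread, but it returns nil at end of file instead of raising). The name nth_line_chunked and the 64 KiB default chunk size are my own assumptions. The file is opened in binary mode, so the returned string is binary encoded.

```ruby
# Hypothetical helper: finds the n-th line (1-based) by reading fixed-size
# chunks and counting newlines. Returns nil if the file is too short.
def nth_line_chunked(path, n, chunk_size = 64 * 1024)
  to_skip = n - 1 # newlines that come before the wanted line
  line = nil      # collects the wanted line once we have reached it
  File.open(path, 'rb') do |f|
    while (chunk = f.read(chunk_size))
      pos = 0
      # Skip whole lines by jumping from newline to newline.
      while to_skip > 0 && (nl = chunk.index("\n", pos))
        to_skip -= 1
        pos = nl + 1
      end
      next if to_skip > 0 # the wanted line starts in a later chunk

      line ||= String.new
      nl = chunk.index("\n", pos)
      # The line ends in this chunk: return it including its newline.
      return line << chunk[pos..nl] if nl

      # Otherwise keep the tail and continue with the next chunk.
      line << chunk[pos..-1]
    end
  end
  # Reached EOF: a last line without a trailing newline is still returned.
  line.nil? || line.empty? ? nil : line
end
```

This is clearly more code than the each_line version, which is why I would stick with the simpler methods unless profiling says otherwise.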
Hope this helps!