I'm using rubyzip to unzip some files during a rake task, but I'm finding that the memory used is not released after the files are unzipped. I have to reboot the server to reclaim it. Is anyone else having similar issues? Are there any workarounds?

I'm unzipping with the same code as the example on GitHub:

https://github.com/rubyzip/rubyzip

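require 'zip'

Zip::File.open('foo.zip') do |zip_file|
  # Handle entries one by one
  zip_file.each do |entry|
    # Extract to file/directory/symlink
    puts "Extracting #{entry.name}"
    # dest_file was undefined in the snippet as posted; the 'extracted'
    # directory here is an assumed, already existing output location
    dest_file = File.join('extracted', entry.name)
    entry.extract(dest_file)
  end
end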

Any suggestions would be greatly appreciated!

Jeff Locke
  • is this related? http://stackoverflow.com/questions/27660966/why-does-ruby-release-memory-only-sometimes – BenjiBoyWick Feb 17 '15 at 12:55
  • if you want to deliver zip files to users in, e.g., a Rails application, consider the zip_tricks gem; a lot of the memory problems can possibly be eliminated – BenKoshy Feb 06 '21 at 22:26

1 Answer

If you have to operate on the data, you can stream it directly from the zip archive and process one row at a time. Using this code, I have no problems with memory usage.

require 'csv'
require 'zip'

# The block form closes the archive when the block exits
Zip::File.open('foo.zip') do |zip_file|
  entry = zip_file.entries.first
  puts "Extracting #{entry.name}"
  CSV.parse(entry.get_input_stream, headers: true) do |row|
    # do something with row
    p row
  end
end

EDIT:

You can also iterate over the stream line by line, so you are not limited to parsing CSV:

entry.get_input_stream.each do |line|
  p line
end
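
For an archive that holds many files rather than a single CSV, the same streaming idea can be applied to each entry in turn. The following is only a sketch under a few assumptions: every entry of interest ends in .json and contains one JSON document, and the helper names parse_json_stream and each_json_entry are made up for illustration and are not part of rubyzip. Each entry is read and parsed on its own, so at most one entry's content is held at a time; the actual memory behaviour has not been measured here.

```ruby
require 'json'

# Hypothetical helper (not part of rubyzip): reads one IO-like stream
# completely and parses it as a single JSON document.
def parse_json_stream(io)
  JSON.parse(io.read)
end

# Hypothetical helper: walks the archive and yields the name and parsed
# content of every entry whose name ends in .json, one entry at a time.
def each_json_entry(zip_path)
  require 'zip' # rubyzip, loaded here so parse_json_stream works without it

  # The block form closes the archive when the block exits.
  Zip::File.open(zip_path) do |zip_file|
    zip_file.each do |entry|
      # Assumption: only entries named *.json are of interest.
      next unless entry.name.end_with?('.json')

      # The block form of get_input_stream closes the stream afterwards.
      entry.get_input_stream do |io|
        yield entry.name, parse_json_stream(io)
      end
    end
  end
end
```

It could be called as each_json_entry('foo.zip') { |name, data| p name, data }, where foo.zip stands in for the real archive.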
Andrea
  • not sure if this will work for me. the zip file contains many underlying json files. don't think i can loop through the files with your suggested code – Jeff Locke Feb 17 '15 at 16:42
  • @JeffLocke please, check out my edit. Reading the code of csv.rb I found out that you can just iterate over the stream, so you are not limited to CSV files. Maybe you can give it a try, I am not 100% sure about the memory usage, but it should work fine. – Andrea Feb 17 '15 at 17:16