270

I need an easy way to take a tar file and convert it into a string (and vice versa). Is there a way to do this in Ruby? My best attempt was this:

file = File.open("path-to-file.tar.gz")
contents = ""
file.each {|line|
  contents << line
}

I thought that would be enough to convert it to a string, but then when I try to write it back out like this...

newFile = File.open("test.tar.gz", "w")
newFile.write(contents)

It isn't the same file. Doing ls -l shows the files are of different sizes, although they are pretty close (and opening the file reveals most of the contents intact). Is there a small mistake I'm making or an entirely different (but workable) way to accomplish this?

David Moles
Chris Bunch
  • 3
    That's a gzipped tar file (I hope). There are no "lines". Pls clarify what you're trying to achieve. – Brent.Longborough Sep 25 '08 at 01:26
  • are you trying to look at the compressed data or uncompressed content? – David Nehme Sep 25 '08 at 01:48
  • so chars in a compressed data stream will have roughly 1 in 256 chance of landing on "\n" defining end of a line, and that's ok if it doesn't expect "\r" too, see my answer below – Purfideas Sep 25 '08 at 01:53
  • This question should be re-titled as "Convert *binary* file to string", since `IO.read` would be the preferred answer otherwise. – Ian Nov 12 '14 at 17:44

9 Answers

402

First, you should open the file as a binary file. Then you can read the entire file in, in one command.

file = File.open("path-to-file.tar.gz", "rb")
contents = file.read

That will get you the entire file in a string.

After that, you probably want to file.close. If you don’t do that, file won’t be closed until it is garbage-collected, so it would be a slight waste of system resources while it is open.
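A minimal round-trip sketch of the steps above, assuming a small scratch file in a temporary directory stands in for the real archive (the file names and sample bytes are made up for illustration). It reads with "rb", closes the handle explicitly, and writes back with "wb" so that no newline translation can happen in either direction:

```ruby
require 'tmpdir'
require 'fileutils'

dir    = Dir.mktmpdir                      # scratch directory (illustrative)
source = File.join(dir, 'sample.bin')      # hypothetical stand-in for the .tar.gz
copy   = File.join(dir, 'copy.bin')

# Sample bytes containing CR, LF and NUL, which text mode may alter on Windows.
original = [0x1F, 0x8B, 0x08, 0x00, 0x0D, 0x0A, 0x0A, 0x0D, 0x00, 0xFF].pack('C*')
File.open(source, 'wb') { |f| f.write(original) }

# Read the whole file in binary mode, closing the handle even if read raises.
file = File.open(source, 'rb')
begin
  contents = file.read
ensure
  file.close
end

# Write the string back out, again in binary mode.
out = File.open(copy, 'wb')
begin
  out.write(contents)
ensure
  out.close
end

sizes_match = File.size(source) == File.size(copy)
FileUtils.remove_entry(dir)                # clean up the scratch directory
```

The explicit begin/ensure pairs do by hand what the block form of File.open does automatically.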

Rory O'Kane
David Nehme
  • 23
    The binary flag is only relevant on Windows, and this leaves the file descriptor open. File.read(...) is better. – Daniel Huckstep Dec 20 '11 at 22:59
  • Is there anything wrong with so many people looking this up and copy pasting it as a one-liner solution (like so many things on stackoverflow)? After all, it works, and the name for these functions were just an arbitrary choice of the ruby library designers. If only we had some language with synonyms... that still somehow knows exactly what we want in edge cases/ambiguous instances. Then I would just `contents = (contents of file "path to file.txt" as string)`. – masterxilo Dec 04 '14 at 20:58
  • 2
    This should be done in `begin {..open..} ensure {..close..} end` blocks – shadowbq Aug 04 '15 at 20:25
  • @Nobu If the file is malicious, could opening it up as a binary cause problems? – Arian Faurtosh Aug 19 '15 at 01:02
  • 3
    @ArianFaurtosh No, it's another method of reading the file -- it doesn't mean that it will be treated as an executable and run! That would be a horrifying side-effect for a simple 'read' method. – Matthew Read Oct 23 '15 at 16:29
  • The best one-liner (that does not leave the file descriptor open) IMHO would be `contents = File.open('path-to-file.tar.gz', 'rb', &:read)` – David Feb 01 '19 at 21:22
  • 2
    @David couldn't you simply do the following one-liner? `contents = File.binread('path-to-file.tar.gz')` See [apidock](https://apidock.com/ruby/v2_6_3/IO/binread/class). `File` is a subclass of `IO`. – vas May 17 '19 at 18:15
249

If you need binary mode, you'll need to do it the hard way:

s = File.open(filename, 'rb') { |f| f.read }

If not, shorter and sweeter is:

s = IO.read(filename)
  • In ruby 1.9.3+, IO.read will give you a string marked with the encoding in Encoding.default_external. I _think_ (?) the bytes will all be as they were in the file, so it's not exactly "not binary-safe", but you'll have to tag it with the binary encoding if that's what you want. – jrochkind Sep 22 '14 at 15:07
  • 1
    If shortness and sweetness is of the essence, the ampersand-symbol proc trick gives `s = File.open(filename, 'rb', &:read)` – Epigene Dec 04 '19 at 14:31
116

To avoid leaving the file open, it is best to pass a block to File.open. This way, the file will be closed after the block executes.

contents = File.open('path-to-file.tar.gz', 'rb') { |f| f.read }
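As a rough check of that claim, the sketch below keeps a reference to the file object purely so it can be inspected afterwards. The scratch file and its contents are assumptions made for the demonstration; the second half simulates a failure to show that the handle is closed in that case too.

```ruby
require 'tmpdir'
require 'fileutils'

dir  = Dir.mktmpdir                        # scratch directory (illustrative)
path = File.join(dir, 'demo.bin')          # hypothetical file name
File.binwrite(path, [0, 10, 13, 255].pack('C*'))

handle = nil
contents = File.open(path, 'rb') do |f|
  handle = f        # kept only so we can inspect it after the block
  f.read
end
closed_after_block = handle.closed?

# The block form also closes the file when the block raises.
closed_after_error = nil
begin
  File.open(path, 'rb') do |f|
    handle = f
    raise IOError, 'simulated failure'
  end
rescue IOError
  closed_after_error = handle.closed?
end

FileUtils.remove_entry(dir)                # clean up the scratch directory
```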
Aaron Hinni
  • 10
    This is a better answer than David Nehme's because file descriptors are a finite system resource and exhausting them is a common problem that can easily be avoided. – Jeff McCune Jan 31 '12 at 00:15
20

Ruby has binary reading built in:

data = IO.binread("path/filename")

or, on Ruby versions older than 1.9.2:

data = IO.read("path/file")
bardzo
17

How about some open/close safety?

string = File.open('file.txt', 'rb') { |file| file.read }
Stu Thompson
Alex
  • why not an explicit .close? Such as in the OP file.close when done? – Joshua Apr 13 '12 at 18:05
  • 3
    File.open() {|file| block} automatically closes when the block terminates. http://ruby-doc.org/core-1.9.3/File.html#method-c-open – Alex May 15 '12 at 01:09
  • 15
    This is identical to [Aaron Hinni's answer](http://stackoverflow.com/a/131096/215168) that was posted in 2008 (except not using OP's file and variable names)... – Abe Voelker Jun 26 '12 at 13:20
16

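On OS X these are the same for me... could this maybe be an extra "\r" on Windows?

In any case, you may be better off with:

# Read and write in binary mode so no newline translation can occur.
contents = File.binread("e.tgz")
# The block form closes (and flushes) the output file automatically.
File.open("ee.tgz", "wb") { |newFile| newFile.write(contents) }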
Purfideas
6

You can probably encode the tar file in Base64. Base64 will give you a pure ASCII representation of the file that you can store in a plain text file. Then you can retrieve the tar file by decoding the text back.

You do something like:

require 'base64'

file_contents = Base64.encode64(tar_file_data)

Have a look at the Base64 Rubydocs to get a better idea.
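A short sketch of the full round trip, assuming a handful of made-up bytes stand in for the real tar data (on recent Ruby versions the base64 library ships as a bundled gem rather than a default one, so it may need to be listed in a Gemfile):

```ruby
require 'base64'

# Stand-in for the archive's bytes; in practice this would come from File.binread.
tar_file_data = [0x1F, 0x8B, 0x08, 0x00, 0x0A, 0x0D, 0xFF].pack('C*')

file_contents = Base64.encode64(tar_file_data)   # ASCII text, safe for a plain text file
restored      = Base64.decode64(file_contents)   # back to the original bytes

ascii_only    = file_contents.ascii_only?
round_trip_ok = restored == tar_file_data
```

Base64.encode64 inserts a newline after every 60 output characters; Base64.strict_encode64 produces a single line instead, if that suits the storage format better.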

  • Great, this looks like it'll work too! I'll have to check it out if for some reason reading the binary contents goes sour. – Chris Bunch Sep 25 '08 at 02:02
4

Ruby 1.9+ has IO.binread (see @bardzo's answer) and also supports passing the encoding as an option to IO.read:

  • Ruby 1.9

    data = File.read(name, {:encoding => 'BINARY'})
    
  • Ruby 2+

    data = File.read(name, encoding: 'BINARY')
    

(Note in both cases that 'BINARY' is an alias for 'ASCII-8BIT'.)
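To see what those options actually change, the sketch below compares the encoding tag on the returned strings. It assumes a small scratch file with arbitrary bytes, and that Encoding.default_internal is unset (the default), so no transcoding takes place:

```ruby
require 'tmpdir'
require 'fileutils'

dir  = Dir.mktmpdir                        # scratch directory (illustrative)
name = File.join(dir, 'sample.bin')        # hypothetical file name
File.binwrite(name, [0x1F, 0x8B, 0xFF, 0x00].pack('C*'))

via_option  = File.read(name, encoding: 'BINARY')
via_binread = File.binread(name)
via_default = File.read(name)              # tagged with Encoding.default_external

same_encoding = via_option.encoding == via_binread.encoding
alias_holds   = Encoding.find('BINARY') == Encoding::ASCII_8BIT
default_tag   = via_default.encoding == Encoding.default_external

FileUtils.remove_entry(dir)                # clean up the scratch directory
```

The encoding option only controls how the string is tagged. On Windows, a plain File.read may additionally apply text-mode newline handling, which is why File.binread (or an explicit "rb" mode) is the safer choice for archives.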

David Moles
-2

If you encode the tar file as Base64 and store it in a plain text file, you can use

File.open("my_tar.txt").each {|line| puts line}

or

File.new("name_file.txt", "r").each {|line| puts line}

to print each (text) line to the console.
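Going one step further than printing, the lines of such a text file can be decoded back into the original bytes. This is only a sketch: the file name and sample data are invented, and it relies on Base64.encode64 wrapping its output so that each line is independently decodable. File.foreach is used because it closes the file once iteration finishes.

```ruby
require 'base64'
require 'tmpdir'
require 'fileutils'

dir       = Dir.mktmpdir                   # scratch directory (illustrative)
text_path = File.join(dir, 'my_tar.txt')   # hypothetical Base64 text file

original = (0..255).to_a.pack('C*')        # stand-in for the tar file's bytes
File.write(text_path, Base64.encode64(original))

decoded = String.new                       # empty binary (ASCII-8BIT) buffer
File.foreach(text_path) do |line|
  decoded << Base64.decode64(line)         # each wrapped line decodes on its own
end

FileUtils.remove_entry(dir)                # clean up the scratch directory
```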

Joshua Pinter
Boris