9

Here is my YAML file, 'test.yml':

---
alpha: 100.0
beta: 200.0
gama: 300.0
--- 3
...

The first document is a hash.

The second document is an integer.

I am trying to load these to a Ruby program as a hash and an integer.

Here is my current attempt:

require 'yaml'

variables = YAML.load_file('test.yml')
puts variables.inspect
the Tin Man
labelcd6

3 Answers

31

To access multiple YAML documents in a single file, use the load_stream method (as already mentioned by "matt" in a comment to one of the other answers):

YAML.load_stream(File.read('test.yml')) do |document|
  puts document
end
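If you want the hash and the integer as separate values, as the question asks, a minimal sketch is to call load_stream without a block and destructure the returned array. The variable names `settings` and `count` are only illustrative, and the stream is inlined here instead of being read from 'test.yml':

```ruby
require 'yaml'

# Same content as the question's test.yml, inlined so the sketch is self-contained.
stream = <<~EOT
  ---
  alpha: 100.0
  beta: 200.0
  gama: 300.0
  --- 3
  ...
EOT

# Without a block, load_stream returns an array with one Ruby object per document.
settings, count = YAML.load_stream(stream)

p settings  # the first document, loaded as a Hash
p count     # the second document, loaded as an Integer
```

With a real file, `YAML.load_stream(File.read('test.yml'))` should behave the same way.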
Andrew Marshall
Markus Miller
  • 1
    It looks like `#load_stream` accepts a YAML-encoded string rather than a filename. – Eric Walker Oct 23 '20 at 20:49
  • Yes, this should be `YAML.load_stream(File.read('test.yml'))` or the call with the block. With a filename only, `load_stream` will return `["test.yml"]`, not an array containing both parsed documents from the stream. – Olivier Lacan Jan 04 '21 at 19:40
5

The Tin Man is correct that the OP should not be using multiple documents for his specific problem; however, the situation of multiple documents in a YAML stream does occur in practice, for example when multiple YAML documents are appended to a single file, so it's worth knowing how to deal with it.

require 'yaml'

yaml = <<EOT
---
alpha: 100.0
beta: 200.0
gama: 300.0
---
int: 3
...
EOT

loop do
  puts YAML.load(yaml)
  break if !yaml.sub!(/(?<!#)(-{3}.+?(?=-{3})\n?){1}/m,'')
  break if yaml.empty?
end

# >> {"alpha"=>100.0, "beta"=>200.0, "gama"=>300.0}
# >> {"int"=>3}
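For the "documents appended to a single file" case, here is a rough sketch of the same idea using load_stream instead of a regex. The `log` buffer and its `event`/`code` keys are made up for illustration; a string stands in for the file:

```ruby
require 'yaml'

# Hypothetical log: each entry is dumped as its own YAML document and appended.
log = +''
log << YAML.dump({ 'event' => 'start', 'code' => 1 })
log << YAML.dump({ 'event' => 'stop',  'code' => 2 })

# Each dump begins with '---', so the buffer is a valid multi-document stream.
entries = YAML.load_stream(log)

entries.each { |entry| puts entry.inspect }
```

This avoids having to split the stream by hand, at the cost of reading the whole stream into memory first.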
the Tin Man
Penn Taylor
    You should just use [`load_stream`](http://ruby-doc.org/stdlib-2.1.0/libdoc/psych/rdoc/Psych.html#method-c-load_stream) rather than do this yourself (note the `Psych` constant is the same as `YAML` in current Rubies). – matt Jan 20 '14 at 23:03
  • 2
    It's unfortunate that the ruby-doc page for [YAML](http://ruby-doc.org/stdlib-2.1.0/libdoc/yaml/rdoc/YAML.html) doesn't provide a link to the Psych documentation. It doesn't even mention that there is more to the public interface than `load`, `dump`, and `to_yaml`. – Penn Taylor Jan 21 '14 at 03:08
  • 1
    @matt You should post your comment as an answer instead, because it is actually the correct answer to the question and it's buried under a zero-voted answer. – Justin Force Sep 12 '14 at 20:08
  • With Ruby 2.4.1 and psych 2.2.2, I notice that using `%YAML 1.2` as a "document header" makes Psych raise a `Psych::SyntaxError`. – FilBot3 Feb 16 '18 at 14:43
-4

Don't use multiple documents; they're not a substitute for defining individual elements in your data:

require 'yaml'

yaml = <<EOT
---
hash:
  alpha: 100.0
  beta: 200.0
  gama: 300.0
int: 3
EOT

YAML.load(yaml)
# => {"hash"=>{"alpha"=>100.0, "beta"=>200.0, "gama"=>300.0}, "int"=>3}

You can access the contents by assigning YAML.load(yaml) to a variable:

data = YAML.load(yaml)
data['hash'] # => {"alpha"=>100.0, "beta"=>200.0, "gama"=>300.0}
data['int'] # => 3

Think of it this way: you're asking YAML for an object of some sort after it parses the YAML file. You need to be able to extract specific values from that object, so make it easy on yourself and define an array or a hash that contains what you want, in a way that works the way your brain does, within YAML's limitations and specs.

If I'm going to be creating a complicated structure, I do it in Ruby first and have YAML dump the format for me:

require 'yaml'

data = {
  "hash" => {
    "alpha" => 100.0,
    "beta" => 200.0,
    "gama" => 300.0
  },
  "int" => 3
}

puts data.to_yaml

# >> ---
# >> hash:
# >>   alpha: 100.0
# >>   beta: 200.0
# >>   gama: 300.0
# >> int: 3

I can put the Ruby code into a script and run it, redirecting it to a YAML file:

ruby test.rb > test.yaml
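If you'd rather skip the shell redirect, a sketch of doing the same thing from inside Ruby is below. It writes to a temporary directory so nothing is left behind; the path and the `reloaded` variable are just for illustration:

```ruby
require 'yaml'
require 'tmpdir'

data = {
  'hash' => { 'alpha' => 100.0, 'beta' => 200.0, 'gama' => 300.0 },
  'int' => 3
}

reloaded = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 'test.yaml')

  # Write the dumped YAML straight to the file, then read it back.
  File.write(path, data.to_yaml)
  reloaded = YAML.load_file(path)
end

# Compare the reloaded structure with the original.
puts reloaded == data
```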

Then I can expand the structure:

require 'yaml'

data = {
  "hash" => {
    "alpha" => 100.0,
    "beta" => 200.0,
    "gama" => 300.0
  },
  "int" => 3,
  "array" => %w[a b c]
}

puts data.to_yaml

# >> ---
# >> hash:
# >>   alpha: 100.0
# >>   beta: 200.0
# >>   gama: 300.0
# >> int: 3
# >> array:
# >> - a
# >> - b
# >> - c

Testing it round-trip:

require 'yaml'

yaml = <<EOT
---
hash:
  alpha: 100.0
  beta: 200.0
  gama: 300.0
int: 3
array:
- a
- b
- c
EOT

YAML.load(yaml)
# => {"hash"=>{"alpha"=>100.0, "beta"=>200.0, "gama"=>300.0}, "int"=>3, "array"=>["a", "b", "c"]}

Iteratively do that until you're comfortable with YAML syntax, then you can build/tweak your YAML file by hand.

Now, here's how smart it is. The YAML spec supports anchors and aliases, which let us define a node once with & (the anchor) and then reuse it as many times as we like with * (the alias). Creating those by hand is a pain when your document gets large, but the YAML driver is smart and will output them for you:

require 'yaml'

FOO = ['foo']
BAR = ['bar']

foobar = [FOO, BAR]

data = {
  "foo" => FOO,
  'bar' => BAR,
  'foobar' => foobar,
}

puts data.to_yaml


# >> ---
# >> foo: &1
# >> - foo
# >> bar: &2
# >> - bar
# >> foobar:
# >> - *1
# >> - *2

foo: &1 defines ["foo"], and gets reused at the bottom as *1.
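One thing to watch when loading such a document back: as far as I know, `safe_load` rejects aliases unless you opt in, and recent Psych versions apply the same default to plain `load`. A sketch of the round trip, assuming a Psych version that accepts the `aliases:` keyword (3.1 or later, I believe):

```ruby
require 'yaml'

foo = ['foo']
bar = ['bar']
data = { 'foo' => foo, 'bar' => bar, 'foobar' => [foo, bar] }

# The dumped text contains anchors and aliases, as in the output above.
yaml_text = data.to_yaml

# Aliases are opt-in for safe_load; aliases: true allows them to be resolved.
loaded = YAML.safe_load(yaml_text, aliases: true)

p loaded['foobar']
```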

The "Yaml Cookbook" at the YamlForRuby site is a great reference for working with YAML.

the Tin Man
  • 4
    There are reasonable use cases where you would want to use multiple documents. For example a stream of documents. – Markus Miller Feb 16 '17 at 21:16
  • 1
    Yes, but in the OP's case using multiple documents wasn't the right approach. Answers aren't supposed to just solve the problem as asked, they're also supposed to educate. We see a lot of [XY Problem](http://xyproblem.info) type questions so pointing to a more appropriate way of handling the issue is important too. – the Tin Man Feb 16 '17 at 21:43
  • 2
    Sure, then it might be useful to try to make the answer a bit more precise by pointing out that multiple documents might not be the correct approach in this specific case. I'm mainly thinking about the first sentence, which sounds very drastic. – Markus Miller Feb 16 '17 at 21:44