I'd like to check if a string is valid YAML. I'd like to do this from within my Ruby code with a gem or library. I only have this begin/rescue clause, but it doesn't get rescued properly:

def valid_yaml_string?(config_text)
  require 'open-uri'
  file = open("https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration")
  hard_failing_bad_yaml = file.read
  config_text = hard_failing_bad_yaml
  begin
    YAML.load config_text
    return true
  rescue
    return false
  end
end

Unfortunately, I get this error instead of `false`:

irb(main):089:0> valid_yaml_string?("b")
Psych::SyntaxError: (<unknown>): mapping values are not allowed in this context at line 6 column 19
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:203:in `parse'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:203:in `parse_stream'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:151:in `parse'
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/lib/ruby/1.9.1/psych.rb:127:in `load'
from (irb):83:in `valid_yaml_string?'
from (irb):89
from /home/kentos/.rvm/rubies/ruby-1.9.3-p374/bin/irb:12:in `<main>'
the Tin Man
Ninjaxor

1 Answer

Using a cleaned-up version of your code:

require 'yaml'
require 'open-uri'

URL = "https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration"

def valid_yaml_string?(yaml)
  !!YAML.load(yaml)
rescue Exception => e
  STDERR.puts e.message
  return false
end

puts valid_yaml_string?(open(URL).read)

I get:

(<unknown>): mapping values are not allowed in this context at line 6 column 19
false

when I run it.

The reason is that the data you are getting from that URL isn't YAML at all; it's HTML:

open('https://github.com/TheNotary/the_notarys_linux_mint_postinstall_configuration').read[0, 100]
=> "  \n\n\n<!DOCTYPE html>\n<html>\n  <head prefix=\"og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# githubog:"

If you only want a true/false response indicating whether it's parsable YAML, remove this line:

STDERR.puts e.message
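
As a rough sketch of that quieter version, assuming you'd rather catch only the parser's own error class than everything, something like the following might do. The method name `parsable_yaml?` is made up for illustration, and it returns `true` whenever parsing succeeds rather than coercing the parsed value with `!!`:

```ruby
require 'yaml'

# Hypothetical helper: true if the text parses as YAML, false otherwise.
# Naming Psych::SyntaxError explicitly means the parse error is caught
# no matter where that class sits in the exception hierarchy.
def parsable_yaml?(text)
  YAML.load(text)
  true
rescue Psych::SyntaxError
  false
end

puts parsable_yaml?("name: value")  # well-formed mapping
puts parsable_yaml?("a: b: c")      # nested mapping on one line is a syntax error
```

Other exceptions (for example a network failure while fetching the text) would still propagate, which is probably what you want from a validity check.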

Unfortunately, going beyond that and determining if the string is a YAML string gets harder. You can do some sniffing, looking for some hints:

yaml[/^---/m]

will search for the YAML "document" marker, but a YAML file doesn't have to use those, nor do they have to be at the start of the file. We can add that in to tighten up the test:

!!YAML.load(yaml) && !!yaml[/^---/m]

But even that leaves some holes, so adding a test to see what the parser returns can help even more. YAML could return an Integer (a Fixnum on older Rubies), a String, an Array or a Hash, but if you already know what to expect, you can check what YAML actually returns. For instance:

YAML.load(({}).to_yaml).class
=> Hash
YAML.load(({}).to_yaml).instance_of?(Hash)
=> true
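
To make the range of return types a bit more concrete, here is a small sketch that loads a few made-up snippets and reports the class of each result. The sample strings are assumptions chosen for illustration, not anything from the question:

```ruby
require 'yaml'

# Made-up sample inputs: a scalar number, a bare word, a sequence, a mapping.
samples = ["42", "hello", "- a\n- b", "key: value"]

# Load each snippet and record the class of whatever the parser returns.
classes = samples.map { |text| YAML.load(text).class }

samples.zip(classes).each do |text, klass|
  puts "#{text.inspect} loads as #{klass}"
end
```

The numeric sample should come back as an Integer on current Rubies (Fixnum on old ones); the others load as a String, an Array and a Hash respectively.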

So, you could look for a Hash:

parsed_yaml = YAML.load(yaml)
!!yaml[/^---/m] && parsed_yaml.instance_of?(Hash)

Replace Hash with whatever type you think you should get.
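
Pulling the rescue and the type check together, a minimal sketch might look like this. The name `yaml_of_type?` and its `expected` parameter are hypothetical, and the `---` marker test is left out on purpose, since documents aren't required to have one:

```ruby
require 'yaml'

# Hypothetical helper: true only if the text parses as YAML *and* the
# parsed value is an instance of the expected class (Hash by default).
def yaml_of_type?(text, expected = Hash)
  YAML.load(text).instance_of?(expected)
rescue Psych::SyntaxError
  # Unparsable input is simply "not the YAML we wanted".
  false
end

puts yaml_of_type?("key: value")        # mapping, checked against Hash
puts yaml_of_type?("- a\n- b")          # sequence, checked against Hash
puts yaml_of_type?("- a\n- b", Array)   # sequence, checked against Array
```

An HTML page like the one in the question would fail this check either way: it raises a syntax error, or parses to something other than a Hash.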

There might be even better ways to sniff it out, but those are what I'd try first.

the Tin Man
  • In IRB try `!true` then try `!!true`, then try `!('foo'=='foo')` followed by `!!('foo'=='foo')`. – the Tin Man May 02 '13 at 01:53
  • Thanks Tin Man! Well put. This was a severe hole in Google I was stumbling on (I was forgetting to put in the rescue expression). About `!!`, I think that must be useful in cases like `!!nil` when you want a method returning either true or false, not nils or strings? Very tidy, thanks again. – Ninjaxor May 02 '13 at 02:04
  • Yes, `!!` is a nice shorthand that forces true/false responses. It works in any language that uses `!` for a logical NOT. – the Tin Man May 02 '13 at 02:15
  • Uff! what an explanation,perfect,awesome,like it :) – Arup Rakshit May 02 '13 at 04:44
  • Nice answer Tin Man but I think maybe your last example needs updating `!!yaml[/^---/m] && parsed_yaml.instance_of?(Hash)` – corysimmons Dec 06 '13 at 15:41
  • Note to everyone: [**Do not rescue Exception**](http://stackoverflow.com/questions/10048173/why-is-it-bad-style-to-rescue-exception-e-in-ruby) - I edited the answer to not do this any more... – averell Jul 22 '14 at 15:34
  • It's considered bad form to edit people's code. Suggest changes instead. The change you made broke the code; Rescuing Exception handles the raised exception, whereas removing `Exception` results in the parse error being propagated, causing the code to crash. – the Tin Man Jul 22 '14 at 17:26
  • Be aware that `json` is often valid `yaml`, so the Ruby YAML parser will also report valid YAML for a JSON-encoded string (this doesn't make the author's response invalid in any way; it's a YAML spec thing) – Thomas Fankhauser Feb 19 '20 at 08:22