I've been reading through tutorial after tutorial, but nothing seems to be working out for me. The goal is to take an XML document with elements and attributes and insert the data into a database. Each element/attribute will be a column in the database, and each entry is a row. Here is the made-up XML doc that I've been working with:

<?xml version="1.0"?>
<library>
  <NAME><![CDATA[Favorite Books]]></NAME>
  <book ISBN="11342343">
    <title>To Kill A Mockingbird</title>
    <description><![CDATA[Description#1]]></description>
    <author>Harper Lee</author>
  </book>
  <book ISBN="989894781234">
    <title>Catcher in the Rye</title>
    <description><![CDATA[This is an extremely intense description.]]></description>
    <author>J. D. Salinger</author>
  </book>
  <book ISBN="123456789">
    <title>Murphy's Gambit</title>
    <description><![CDATA[Daughter finds her dad!]]></description>
    <author>Syne Mitchell</author>
  </book>
</library>

So I want to have a table with 3 entries (one per book), each entry having an ISBN, Title, Description, and Author. That's the basics. (The CDATA is completely optional I suppose. If that's part of my problem, by all means let's get rid of it...)

End goal is a bit more complicated: multiple libraries with multiple books. The tables are related to each other, so I can reference the Library table from my Book table, and vice versa. I'm completely lost, definitely a rookie, but I have a good working computer knowledge and am willing to test and try things out.

I'm using Rails 3.2.6 with the default SQLite3 database (3.6.20). I've installed REXML, ROXML, LibXML, etc and read through APIs and walkthroughs, but things just aren't working out. There has to be an easy way to turn the XML doc into a Library object (with a .name method) with Book objects (having .title, .author, .isbn, and .description methods).

Any help is truly appreciated!

Update!

Okay, next question. I've been fooling around with the logic behind this, and wanted to know the best way to do the following...

Suppose I have this new and improved XML file.

<?xml version="1.0"?>
<RandomTag>
  <library name='Favorite Books'>
    <book ISBN="11342343">
      <title>TKAM</title>
      <description>Desc1</description>
      <author>H Lee</author>
    </book>
    <book ISBN="989894781234">
      <title>Catcher in the Rye</title>
      <description>Desc2</description>
      <author>JD S</author>
    </book>
  </library>
  <library name='Other Books'>
    <book ISBN="123456789">
      <title>Murphy's Gambit</title>
      <description>Desc3</description>
      <author>Syne M</author>
    </book>
  </library>
</RandomTag>

So now we have two libraries, the first titled "Favorite Books" and having 2 books, the second titled "Other Books" and having a single book.

What is the best way for each book to know which library it belongs to? Originally, I created a Library table and a Book table. Each Book object had a library_id field, which referenced the correct Library. Thus, each table could be filled in correctly, and I could use syntax like "@library.books.each do |b| b.title". This, however, only worked when I had one library.

I tried nesting the Book loop you gave me inside a similar Library loop, but the .css method finds every single match, regardless of where it resides. Is there a .css method that finds UNTIL a specific point?

To rephrase, I'd like to be able to import each book into its respective library. I can't add any fields to the XML file.

Thanks again.

XML Slayer

1 Answer

I did something similar using the Nokogiri library.

# xml_data is a String holding the XML document
doc = Nokogiri::XML(xml_data)

doc.css('book').each do |node|
  children = node.children

  Book.create(
    :isbn => node['ISBN'],
    :title => children.css('title').inner_text,
    :description => children.css('description').inner_text,
    :author => children.css('author').inner_text
  )
end

Update

You could create a quick test by doing this:

First install the nokogiri gem:

gem install nokogiri

Then create a file called test_xml.rb with the contents:

require 'nokogiri'

doc = Nokogiri::XML('<?xml version="1.0"?>
  <library>
    <NAME><![CDATA[Favorite Books]]></NAME>
    <book ISBN="11342343">
      <title>To Kill A Mockingbird</title>
      <description><![CDATA[Description#1]]></description>
      <author>Harper Lee</author>
    </book>
    <book ISBN="989894781234">
      <title>Catcher in the Rye</title>
      <description><![CDATA[This is an extremely intense description.]]></description>
      <author>J. D. Salinger</author>
    </book>
    <book ISBN="123456789">
      <title>Murphy\'s Gambit</title>
      <description><![CDATA[Daughter finds her dad!]]></description>
      <author>Syne Mitchell</author>
    </book>
  </library>')

doc.css('book').each do |node|
  children = node.children

  book = {
    "isbn" => node['ISBN'], 
    "title" => children.css('title').inner_text, 
    "description" => children.css('description').inner_text, 
    "author" => children.css('author').inner_text
  }

  puts book
end

And finally run:

ruby test_xml.rb

I suspect you weren't escaping the single quote in Murphy's Gambit when you pasted your XML into a single-quoted Ruby string. Without the backslash, the apostrophe ends the string early.

Mike Neumegen
  • I've played around with Nokogiri as well, to no avail, but I don't believe I've used this exact syntax. I'll give it a shot and report back... – XML Slayer Jul 07 '12 at 19:22
  • Okay, this isn't giving me an error, so I guess we're on the right track. It doesn't do anything though? I'm typing this in the rails console, and when I enter 'end', it just reports back '0'. doc.root also returns 'nil', so I'm thinking something didn't go as planned... Any ideas? – XML Slayer Jul 09 '12 at 15:54
  • Thanks a lot! I had been trying to use methods like book.title to manually create a new Book and insert it into the table. Didn't realize you could just do Book.create(book) within the foreach loop. Thanks for your patience, I'm on my own now! – XML Slayer Jul 10 '12 at 16:46