I want to extract the search form from this webpage and render it on the "static_pages/home" page of my Rails app: Codepen Example of "static_pages/home"

Steps taken:

  1. I created the following Ruby script to verify that I could actually extract the form:

    require 'nokogiri'
    require 'open-uri'
    
    url = 'http://websoc.reg.uci.edu/perl/WebSoc'
    data = Nokogiri::HTML(URI.open(url)) # URI.open: plain open(url) no longer fetches URLs on Ruby 3+
    
    form = data.xpath('//form[@action="http://websoc.reg.uci.edu/perl/WebSoc"]')
    puts form 
    
  2. Shifting over to Rails, I added Nokogiri and OpenURI to my Gemfile and ran Bundler to install the gems.

  3. I created a StaticPages controller:

    require 'nokogiri'
    require 'open-uri'

    class StaticPagesController < ApplicationController
      def home
        url = 'http://websoc.reg.uci.edu/perl/WebSoc'
        # URI.open: plain open(url) no longer fetches URLs on Ruby 3+
        data = Nokogiri::HTML(URI.open(url))
        @form = data.xpath('//form[@action="http://websoc.reg.uci.edu/perl/WebSoc"]')
      end
    end
    
  4. And an accompanying view:

    <h1>StaticPages#home</h1>
    <p>Find me in app/views/static_pages/home.html.erb</p>
    <%= @form %>
    

The HTML code is successfully extracted but it is rendered as text instead of HTML. It seems like either:

@form = data.xpath('//form[@action="http://websoc.reg.uci.edu/perl/WebSoc"]')

or

<%= @form %>

converts the extracted HTML to text. How can I insert the HTML content I have extracted as HTML and not as text?

My research has suggested using Net::HTTP.

jkarimi
  • 1
    Isn't it because Rails automatically escapes HTML in `<%= @form %>`? How about using `<%= @form.html_safe %>`? (Sorry, I don't know the proper way to write it in your Rails version.) – gh640 Nov 02 '14 at 01:09
  • This will help you https://cbabhusal.wordpress.com/2015/08/28/ruby-on-rails-why-do-we-need-to-html_safe-string-why-html-tags-not-rendered/ – Shiva Aug 29 '15 at 01:26

2 Answers

Simply putting <%= @form.html_safe %> in the view will raise an error. This is because @form is a Nokogiri::XML::NodeSet, not a String, and html_safe is a String method. To correct this:

  1. Go to the StaticPages controller and change:

    @form = data.xpath('//form[@action="http://websoc.reg.uci.edu/perl/WebSoc"]')

    to

    @form = data.xpath('//form[@action="http://websoc.reg.uci.edu/perl/WebSoc"]').to_html

  2. Now @form holds the form's HTML as a String instead of a NodeSet. To render it as HTML in the view, we need to change:

    <%= @form %>
    

    to

    <%= @form.html_safe %>
    

By default, Rails HTML-escapes the output of <%= @form %> as a security precaution, so that malicious markup cannot be injected into your page. Calling @form.html_safe marks the string as trusted and tells Rails not to escape it, which allows the contents of @form to render in the view as HTML.
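
To see why the form showed up as text, here is a minimal sketch of the escaping step outside Rails. It uses `ERB::Util.html_escape` from Ruby's standard library as an approximation of what `<%= %>` does to an ordinary String. Rails itself goes through its own ActiveSupport machinery, so treat this as an illustration only. The `form_html` value is a hypothetical stand-in, not the real WebSoc form.

```ruby
require 'erb'

# Hypothetical stand-in for the markup extracted from the remote page.
form_html = '<form action="/search"><input name="q"></form>'

# Approximation of the default <%= ... %> behaviour: the special characters
# are turned into entities, so a browser displays the markup as literal text.
escaped = ERB::Util.html_escape(form_html)

puts escaped   # entity-encoded, shown by a browser as text
puts form_html # raw markup, roughly what ends up in the page after html_safe
```

The first line printed contains no `<` or `>` characters at all, which is why the browser has nothing to interpret as a form.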

the Tin Man
jkarimi

Your question says that you get text, but what data.xpath(...) actually returns is a Nokogiri::XML::NodeSet.

"How do I scrape HTML between two HTML comments using Nokogiri?" is a similar question about scraping nodes. Once you have the markup as a string, say html_string, you can use html_string.html_safe.

mohameddiaa27