I have a script that inserts URLs into existing XHTML pages. The URLs have tracking codes with ampersands, and Nokogiri automatically replaces them with the escaped version, &amp;. I understand why, but the escaped URL means that the tracking doesn't work, as the tracking code has been changed.
I've checked out "How to save unescaped & in nokogiri xml?", "How can i put a string with an ampersand in an xml file with Nokogiri?", and "Preventing Nokogiri from escaping characters?", but I'm not quite sure how using the builder or CDATA would work in the context of what I'm trying to do.
Here's a simplified version of what I am currently doing (with main_link being pulled from an external source):
require "nokogiri"

doc = Nokogiri::XML(File.read("file.xhtml"))
link = doc.css("a")[0] # the actual file may contain multiple links, not just one
main_link = "http://www.url.com/"
tag = "?blah&blah=blahblah"
link["href"] = main_link + tag
new_content = doc.to_xml
File.open("new_file.xhtml", "w") { |f| f.write(new_content) }
#=> <a href="http://www.url.com/?blah&amp;blah=blahblah">link</a>
I've done this, which works:
content = File.read("file.xhtml")
content.gsub!("&amp;", "&")
File.open("updated_file.xhtml", 'w') { |file| file.write(content) }
#=> <a href="http://www.url.com/?blah&blah=blahblah">link</a>
but I'd like to avoid reopening/resaving files, since I'm working with a lot at one time and want to be as efficient as possible.
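One way to skip the second read/write pass might be to run the substitution on the string that doc.to_xml returns, before it is written out the first time. The following is only a minimal sketch of that idea in plain Ruby: unescape_href_ampersands is a hypothetical helper (not a Nokogiri method), the serialized string is a hand-written stand-in for the to_xml output, and it assumes href values are serialized with double quotes. Limiting the replacement to href attributes leaves escaped ampersands in body text alone, though the result is still no longer well-formed XML.

```ruby
# Hypothetical helper (not part of Nokogiri): turns "&amp;" back into "&",
# but only inside double-quoted href="..." attribute values of a string
# that has already been serialized.
def unescape_href_ampersands(xml)
  xml.gsub(/href="[^"]*"/) { |attr| attr.gsub("&amp;", "&") }
end

# Stand-in for the string that doc.to_xml returns in the script above.
serialized = '<p>Q&amp;A: <a href="http://www.url.com/?blah&amp;blah=blahblah">link</a></p>'

new_content = unescape_href_ampersands(serialized)
puts new_content
```

In the script above, this would amount to passing doc.to_xml through the helper and writing the result once, so each file is opened for reading and for writing only a single time.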
Is this doable with Nokogiri? Should I be looking elsewhere to accomplish this?