After using REXML as the XML parser for my dissertation for a while now, I suddenly came up against a big problem with it. Basically, I was trying to bring back elements of a large XML document based on their attributes. The XPath to do this is something like the following, where "song" is the element and "genre_id" is the attribute:
//song[@genre_id=30]
That should return all elements named "song" with an attribute named "genre_id" that equals 30. Except it didn't.
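As a rough sketch of what I expected to happen, here is the same kind of query run against a tiny document with no namespace declared. The sample XML and the "title" attribute are made up for illustration, and I have quoted the value as a string to keep the comparison simple:

```ruby
require 'rexml/document'

# Hypothetical sample data; the real document is much larger.
xml = <<~XML
  <library>
    <song title="First" genre_id="30"/>
    <song title="Second" genre_id="12"/>
    <song title="Third" genre_id="30"/>
  </library>
XML

doc = REXML::Document.new(xml)

# Select every <song> element whose genre_id attribute is "30".
matches = REXML::XPath.match(doc, "//song[@genre_id='30']")

# Collect the title attribute of each matching element, in document order.
titles = matches.map { |song| song.attributes['title'] }
puts titles.inspect
```

With no namespace involved, the query picks out only the songs whose genre_id matches.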
Now, since REXML is the default XML parser included in the Ruby standard library, I figured it would be well tested and this must be a bug in my code. I spent about 3 hours working out that, in actual fact, the problem lies with REXML.
The long and short of it is this: querying on attributes is broken when your XML document declares a default namespace (an xmlns="..." attribute with no prefix). The query will always return nil.
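If you want to check whether your own copy of REXML is affected, something like the following should do it. This is only a sketch: the helper name attribute_query_works? is my own invention, and the test documents are minimal stand-ins. On an affected version the second line should report false; a REXML release that already includes a fix should report true for both.

```ruby
require 'rexml/document'

# Hypothetical helper: returns true when an attribute predicate
# finds the <item> element in the given XML string.
def attribute_query_works?(xml)
  doc = REXML::Document.new(xml)
  !doc.root.elements["item[@name='foo']"].nil?
end

# The same document, once without and once with a default namespace.
plain      = '<doc><item name="foo"/></doc>'
namespaced = '<doc xmlns="ns"><item name="foo"/></doc>'

plain_ok      = attribute_query_works?(plain)
namespaced_ok = attribute_query_works?(namespaced)

puts "without default namespace: #{plain_ok}"
puts "with default namespace:    #{namespaced_ok}"
```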
Luckily enough for me (and anyone here looking for a solution to the problem), Sam Ruby has solved the problem with a monkey patch. A monkey patch in Ruby is code loaded at runtime that extends or modifies existing classes without altering their source code. Put the following in a Ruby (.rb) file, anywhere you like:
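require 'rexml/document'
# Only apply the patch if this copy of REXML fails the attribute query.
doc = REXML::Document.new '<doc xmlns="ns"><item name="foo"/></doc>'
if not doc.root.elements["item[@name='foo']"]
  class REXML::Element
    # Look up an attribute by name, treating the default namespace as "no prefix".
    def attribute( name, namespace=nil )
      prefix = nil
      # On Ruby 1.9+ use namespaces.key(namespace); Hash#index was removed in Ruby 3.0.
      prefix = namespaces.index(namespace) if namespace
      prefix = nil if prefix == 'xmlns'
      attributes.get_attribute( "#{prefix ? prefix + ':' : ''}#{name}" )
    end
  end
end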
Then issue the following in a terminal, replacing /home/rubys/bin/monkey_patches with the path to your own file (minus the .rb extension):
export RUBYOPT='-rubygems -r/home/rubys/bin/monkey_patches'
That should solve the problem.
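For anyone new to monkey patching, the mechanism the fix relies on is simply that Ruby lets you reopen a class that already exists and redefine its methods. A toy sketch, where Greeter is a made-up class standing in for library code such as REXML::Element:

```ruby
# Greeter is a hypothetical class standing in for code from a library.
class Greeter
  def greet(name)
    "Hello #{name}"
  end
end

# Reopening the class replaces greet at runtime; the original
# definition above is left untouched in its source file.
class Greeter
  def greet(name)
    "Hello, #{name}!"
  end
end

greeting = Greeter.new.greet('world')
puts greeting
```

Every caller of greet now gets the new behaviour, which is exactly why a patch like this can fix REXML for the whole program once it has been loaded.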
