[Greasemonkey] Question: XPath on XMLHttpRequests

Edward Lee edilee at gmail.com
Fri Apr 15 23:29:17 EDT 2005

Does anybody know how to evaluate an XPath expression on the HTML page
returned from a (GM_)XMLHttpRequest? I've looked around but couldn't
find much, because the page comes back as text/html and so doesn't have a
DOM tree to use. I'll show what I came up with, which more or less does
what I wanted, but if there is an easier way, please tell me how.
:) [Should I post/ask this elsewhere? This mailing list has increased
in activity quite a bit in the last month or so, but I figured the
ability to evaluate XPath on XMLHttpRequests could be quite useful to
other scripts.]

You can try this out by going to http://ed.agadak.net/jsshell.php and
copy/pasting the code below. A copy is at
http://ed.agadak.net/greasemonkey/xpathXML.txt in case the wrapping goes
all crazy below :p

// standard XMLHttpRequest with async set to false
var req = new XMLHttpRequest();
req.open('get', 'http://ed.agadak.net/jsshell.php', false);
// send the request; the synchronous call blocks until the response arrives
req.send(null);

// create a range because DOMParser will most likely fail [and doesn't
// take text/html]
var range = document.createRange();
// range needs to have a node selected, but I'm not sure if it matters
// what it's set to for this
// http://www.mozilla.org/docs/dom/domref/dom_range_ref.html
range.selectNode(document.body);
// create a document fragment with the response of the xmlhttprequest
// it seems to strip out html, head, body tags and puts everything
// there as a child of the fragment
// putting the div seems to fill in for where the body tag should be..
// more on edAttr later
var frag = range.createContextualFragment('<div edAttr>' +
req.responseText + '</div>');

// create a container for the document fragment
var newDiv = document.createElement('div');
// make it so it's not shown when appending to the document
newDiv.style.display = 'none';
// document.evaluate or even XPathEvaluator.evaluate doesn't want to
// take frag as the root, so make it part of the document
newDiv.appendChild(frag);
document.body.appendChild(newDiv);

// even if doing an evaluate on frag for a root, it will find all elements
// edAttr check on the div is to make sure that it's only from the
// body of xmlhttprequest
document.evaluate('//*[ancestor::div[@edAttr]]', document, null,
XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength;
// should give a value of 4 because of the a, br, a, br on the page

// probably document.body.removeChild(newDiv) at some point to clean things up
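
The steps above can be wrapped into one function that also does the
cleanup. This is only a rough sketch, not a tested Greasemonkey
recipe: the function name nodesFromHtml and its doc/html parameters are
made up for illustration, it assumes doc is a browser document with a
body, and it uses the number 7 in place of
XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, which has that value.

```javascript
// Hypothetical helper, not part of the original post.
// doc:  a browser document with a body element (assumption)
// html: markup taken from the response text
// Returns an array of the elements that came from `html`.
function nodesFromHtml(doc, html) {
  var SNAPSHOT = 7; // numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE

  var range = doc.createRange();
  range.selectNode(doc.body); // give the range a context node
  var frag = range.createContextualFragment('<div edAttr>' + html + '</div>');

  var holder = doc.createElement('div');
  holder.style.display = 'none'; // keep the imported markup invisible
  holder.appendChild(frag);
  doc.body.appendChild(holder);

  var nodes = [];
  try {
    // same expression as above: only elements under the marked div
    var result = doc.evaluate('//*[ancestor::div[@edAttr]]', doc, null,
                              SNAPSHOT, null);
    for (var i = 0; i < result.snapshotLength; i++) {
      nodes.push(result.snapshotItem(i));
    }
  } finally {
    doc.body.removeChild(holder); // clean up even if evaluate throws
  }
  return nodes;
}
```

In the shell this would be called as nodesFromHtml(document,
req.responseText). The nodes are copied into an array before the
container is removed, so they can still be inspected afterwards.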

Hopefully there's an easier (and more robust) way to do this, but I
learned quite a bit with the above and the other things I looked at
trying to get it to work. If there isn't a direct way, maybe someone
else has a better way of doing this.

