On 4/12/05, Scott R. Turner <srt at aero.org> wrote:
> I didn't know about textContent... What text does it include -- only
> text that will be rendered on the page?

First of all, I meant document.body.textContent, not document.textContent.

Second, document.body.textContent on http://www.google.com/ currently returns:

Google <!-- function qs(el) {if (window.RegExp &&
window.encodeURIComponent) {var
(el.href.indexOf("q=")!=-1) {el.href=el.href.replace(new
RegExp("q=[^&$]*"),"q="+qe);} else {el.href+="&q="+qe;}}return 1;} //
--> Web Images Groups News Froogle LocalNew! Desktop more » Advanced
Search Preferences Language Tools Advertising Programs - Business
Solutions - About Google(c)2005 Google - Searching 8,058,044,651 web

It appears to include comments (including commented-out script),
free-standing text, link text, and alt text of images.  (The inclusion
of alt text can be confirmed by evaluating document.body.textContent
on http://www.cnn.com/, which contains an image whose alt text is
"Click here to skip to main content" -- text which appears nowhere
else in the page.)

So all in all, textContent is maybe not the best idea.  Or at least
using it would require an explanation of exactly what text you were
getting.
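
If the script noise is the main objection, one option might be to walk
the tree by hand instead of relying on textContent.  Here is a minimal
sketch, assuming we only want text nodes and want to drop script/style
elements and comment nodes.  collectText and SKIPPED are names I made
up for illustration; they are not part of the DOM.

```javascript
// Sketch only: collect text from a DOM subtree, skipping script/style
// elements and comment nodes. collectText and SKIPPED are hypothetical
// names chosen for this example.
var TEXT_NODE = 3;     // same value as Node.TEXT_NODE
var COMMENT_NODE = 8;  // same value as Node.COMMENT_NODE
var SKIPPED = { SCRIPT: true, STYLE: true };

function collectText(node) {
  if (node.nodeType === TEXT_NODE) {
    return node.nodeValue;
  }
  if (node.nodeType === COMMENT_NODE) {
    return "";
  }
  // nodeName can be lower case in XHTML documents, so normalize it.
  if (SKIPPED[String(node.nodeName).toUpperCase()]) {
    return "";
  }
  var out = "";
  var kids = node.childNodes || [];
  for (var i = 0; i < kids.length; i++) {
    out += collectText(kids[i]);
  }
  return out;
}
```

In a page you would call collectText(document.body).  Note that this
still says nothing about whether the text is actually rendered: text
inside elements hidden by CSS is collected like any other.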


