| Module htmllib :: Class HTMLParser |
|
ParserBase -> SGMLParser -> HTMLParser
This is the basic HTML parser class.
It supports all entity names required by the XHTML 1.0 Recommendation. It also defines handlers for all HTML 2.0 and many HTML 3.0 and 3.2 elements.

| Method Summary | |
|---|---|
| __init__(self, formatter, verbose=0) | Creates an instance of the HTMLParser class. |
| anchor_bgn(self, href, name, type) | This method is called at the start of an anchor region. |
| anchor_end(self) | This method is called at the end of an anchor region. |
| ddpop(self, bl) | |
| do_base(self, attrs) | |
| do_br(self, attrs) | |
| do_dd(self, attrs) | |
| do_dt(self, attrs) | |
| do_hr(self, attrs) | |
| do_img(self, attrs) | |
| do_isindex(self, attrs) | |
| do_li(self, attrs) | |
| do_link(self, attrs) | |
| do_meta(self, attrs) | |
| do_nextid(self, attrs) | |
| do_p(self, attrs) | |
| do_plaintext(self, attrs) | |
| end_a(self) | |
| end_address(self) | |
| end_b(self) | |
| end_blockquote(self) | |
| end_body(self) | |
| end_cite(self) | |
| end_code(self) | |
| end_dir(self) | |
| end_dl(self) | |
| end_em(self) | |
| end_h1(self) | |
| end_h2(self) | |
| end_h3(self) | |
| end_h4(self) | |
| end_h5(self) | |
| end_h6(self) | |
| end_head(self) | |
| end_html(self) | |
| end_i(self) | |
| end_kbd(self) | |
| end_listing(self) | |
| end_menu(self) | |
| end_ol(self) | |
| end_pre(self) | |
| end_samp(self) | |
| end_strong(self) | |
| end_title(self) | |
| end_tt(self) | |
| end_ul(self) | |
| end_var(self) | |
| end_xmp(self) | |
| error(self, message) | |
| handle_data(self, data) | |
| handle_image(self, src, alt, *args) | This method is called to handle images. |
| reset(self) | Reset this instance. |
| save_bgn(self) | Begins saving character data in a buffer instead of sending it to the formatter object. |
| save_end(self) | Ends buffering character data and returns all data saved since the preceding call to the save_bgn() method. |
| start_a(self, attrs) | |
| start_address(self, attrs) | |
| start_b(self, attrs) | |
| start_blockquote(self, attrs) | |
| start_body(self, attrs) | |
| start_cite(self, attrs) | |
| start_code(self, attrs) | |
| start_dir(self, attrs) | |
| start_dl(self, attrs) | |
| start_em(self, attrs) | |
| start_h1(self, attrs) | |
| start_h2(self, attrs) | |
| start_h3(self, attrs) | |
| start_h4(self, attrs) | |
| start_h5(self, attrs) | |
| start_h6(self, attrs) | |
| start_head(self, attrs) | |
| start_html(self, attrs) | |
| start_i(self, attrs) | |
| start_kbd(self, attrs) | |
| start_listing(self, attrs) | |
| start_menu(self, attrs) | |
| start_ol(self, attrs) | |
| start_pre(self, attrs) | |
| start_samp(self, attrs) | |
| start_strong(self, attrs) | |
| start_title(self, attrs) | |
| start_tt(self, attrs) | |
| start_ul(self, attrs) | |
| start_var(self, attrs) | |
| start_xmp(self, attrs) | |
| unknown_endtag(self, tag) | |
| unknown_starttag(self, tag, attrs) | |
| Inherited from SGMLParser | |
Handle the remaining data. | |
Feed some data to the parser. | |
| |
| |
| |
| |
| |
Handle character reference, no need to override. | |
| |
| |
| |
Handle entity references. | |
| |
| |
| |
| |
| |
| |
Enter literal mode (CDATA). | |
Enter literal mode (CDATA) till EOF. | |
| |
| |
| Inherited from ParserBase | |
Return current line number and offset. | |
| |
| |
| |
| |
| |
| Class Variable Summary | |
|---|---|
| dict | entitydefs = {'zwnj': '&#8204;', 'aring': '\xe5', 'gt': ... |
| Method Details |
|---|
| __init__(self, formatter, verbose=0): Creates an instance of the HTMLParser class. |
| anchor_bgn(self, href, name, type): This method is called at the start of an anchor region. The arguments correspond to the attributes of the <A> tag with the same names. The default implementation maintains a list of hyperlinks (defined by the HREF attribute for <A> tags) within the document. The list of hyperlinks is available as the data attribute anchorlist. |
| anchor_end(self): This method is called at the end of an anchor region. The default implementation adds a textual footnote marker using an index into the list of hyperlinks created by the anchor_bgn() method. |
| handle_image(self, src, alt, *args): This method is called to handle images. The default implementation simply passes the alt value to the handle_data() method. |
| reset(self): Reset this instance. Loses all unprocessed data. |
| save_bgn(self): Begins saving character data in a buffer instead of sending it to the formatter object. Retrieve the stored data via the save_end() method. Use of the save_bgn() / save_end() pair may not be nested. |
| save_end(self): Ends buffering character data and returns all data saved since the preceding call to the save_bgn() method. If the nofill flag is false, whitespace is collapsed to single spaces. A call to this method without a preceding call to the save_bgn() method will raise a TypeError exception. |
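
The anchor and buffering behaviour described in the method details above can be sketched as follows. htmllib exists only in Python 2, so this minimal sketch builds on Python 3's html.parser.HTMLParser as a stand-in. The class name SketchParser, the attributes anchor, savedata, title and output (a plain list standing in for the formatter object), and the tag dispatch in handle_starttag()/handle_endtag() are assumptions made for illustration, not the documented implementation.

```python
from html.parser import HTMLParser  # Python 3 stand-in; htmllib itself is Python 2 only


class SketchParser(HTMLParser):
    """Hypothetical class mimicking the documented anchor and save behaviour."""

    def __init__(self):
        super().__init__()
        self.anchor = None     # href of the anchor currently open, if any
        self.anchorlist = []   # every HREF seen so far, in document order
        self.savedata = None   # None means "not buffering"
        self.nofill = False    # when false, save_end() collapses whitespace
        self.title = None      # filled from the buffered <title> text
        self.output = []       # assumption: stands in for the formatter object

    def anchor_bgn(self, href, name, type):
        # Remember the hyperlink target so anchor_end() can reference it.
        self.anchor = href
        if href:
            self.anchorlist.append(href)

    def anchor_end(self):
        if self.anchor:
            # Footnote marker: 1-based index into anchorlist.
            self.handle_data("[%d]" % len(self.anchorlist))
            self.anchor = None

    def save_bgn(self):
        # Start diverting character data into the buffer.
        self.savedata = ""

    def save_end(self):
        if self.savedata is None:
            raise TypeError("save_end() called without save_bgn()")
        data, self.savedata = self.savedata, None
        if not self.nofill:
            data = " ".join(data.split())  # collapse runs of whitespace
        return data

    def handle_data(self, data):
        if self.savedata is not None:
            self.savedata += data
        else:
            self.output.append(data)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.anchor_bgn(attrs.get("href", ""),
                            attrs.get("name", ""),
                            attrs.get("type", ""))
        elif tag == "title":
            self.save_bgn()

    def handle_endtag(self, tag):
        if tag == "a":
            self.anchor_end()
        elif tag == "title":
            self.title = self.save_end()


parser = SketchParser()
parser.feed('<title>  An   example\n page </title>'
            '<p>See <a href="http://example.org/">this site</a>.</p>')
parser.close()
print(parser.title)
print(parser.anchorlist)
print("".join(parser.output))
```

In this sketch the buffered title comes back with its whitespace collapsed, the link text is followed by the footnote marker [1], and calling save_end() on a fresh SketchParser raises TypeError because no buffer was opened.
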
| Class Variable Details |
|---|
entitydefs
|
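
The entity table can be inspected directly. The sketch below assumes that the entitydefs class variable mirrors the standard entity table, which Python 3 exposes as html.entities (the renamed htmlentitydefs module); it is shown in Python 3 because htmllib itself is Python 2 only. Under Python 3 every value is a single character, whereas the Python 2 table shown above stores entities outside Latin-1 as numeric character references.

```python
# Python 3 sketch: html.entities is the renamed htmlentitydefs module.
from html.entities import entitydefs, name2codepoint

# Look up a few of the entity names that appear in the summary above.
for name in ("gt", "aring", "zwnj"):
    print(name, name2codepoint[name], repr(entitydefs[name]))
```
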
| Generated by Epydoc 2.1 on Sun Apr 22 21:30:28 2007 | http://epydoc.sf.net |