pyzor.digest
- class pyzor.digest.DataDigester(msg, spec=None)
Bases: object
The main workhorse class.
- atomic_num_lines = 4
- digest
- classmethod digest_payloads(msg)
- email_ptrn = <_sre.SRE_Pattern object>
- handle_atomic(lines)
Digest all of the given lines.
- handle_line(line)
- handle_pieced(lines, spec)
Digest the given lines as directed by the spec.
- longstr_ptrn = <_sre.SRE_Pattern object>
- min_line_length = 8
- classmethod normalize(s)
- static normalize_html_part(s)
- classmethod should_handle_line(s)
- unwanted_txt_repl = ''
- url_ptrn = <_sre.SRE_Pattern object>
- value
- ws_ptrn = <_sre.SRE_Pattern object>
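The attribute names above suggest a pipeline: normalize each line, drop lines that are too short, then hash either everything or selected pieces. The following is a minimal sketch of that idea, not pyzor's actual code. The regular expressions, the use of SHA-1, the default spec, and the reading of spec as (percentage offset, line count) pairs are all assumptions made for illustration. The names normalize, should_handle_line, and sketch_digest here are free functions, not the class methods documented above.

```python
import hashlib
import re

# Assumed patterns, for illustration only; the compiled patterns used by
# pyzor are not shown in this reference.
EMAIL_PTRN = re.compile(r"\S+@\S+")
URL_PTRN = re.compile(r"[a-z]+:\S+", re.IGNORECASE)
LONGSTR_PTRN = re.compile(r"\S{10,}")
WS_PTRN = re.compile(r"\s")

MIN_LINE_LENGTH = 8   # mirrors min_line_length above
ATOMIC_NUM_LINES = 4  # mirrors atomic_num_lines above


def normalize(s):
    """Remove long tokens, addresses and URLs, then all whitespace."""
    for ptrn in (LONGSTR_PTRN, EMAIL_PTRN, URL_PTRN):
        s = ptrn.sub("", s)
    return WS_PTRN.sub("", s)


def should_handle_line(s):
    """Keep only lines that are still long enough after normalizing."""
    return len(s) >= MIN_LINE_LENGTH


def sketch_digest(text, spec=((20, 3), (60, 3))):
    """Hash the normalized lines of text (hypothetical helper).

    spec is assumed to be (percentage offset, line count) pairs.
    """
    lines = []
    for line in text.splitlines():
        norm = normalize(line)
        if should_handle_line(norm):
            lines.append(norm.encode("utf8", "ignore"))

    digest = hashlib.sha1()  # assumed hash function
    if len(lines) <= ATOMIC_NUM_LINES:
        # Atomic case: every kept line goes into the hash.
        for line in lines:
            digest.update(line)
    else:
        # Pieced case: hash a few lines starting at each relative offset.
        for offset, length in spec:
            start = offset * len(lines) // 100
            for line in lines[start:start + length]:
                digest.update(line)
    return digest.hexdigest()
```

Under these assumptions, two messages that differ only in whitespace, addresses, or very short lines produce the same digest, which is the property a fuzzy message digest is after.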
- class pyzor.digest.HTMLStripper(collector)
Bases: HTMLParser.HTMLParser
Strip all tags from the HTML.
- handle_data(data)
Keep track of the data.
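A tag stripper of this kind can be sketched with the standard library alone. The example below is an illustrative stand-in, not the pyzor class: SketchHTMLStripper is a hypothetical name, the collector is assumed to be a plain list, and the Python 3 module html.parser is used in place of the Python 2 HTMLParser module named above.

```python
from html.parser import HTMLParser


class SketchHTMLStripper(HTMLParser):
    """Collect text nodes and ignore all tags (illustrative stand-in)."""

    def __init__(self, collector):
        super().__init__()
        # Assumption: collector is a list that receives the text pieces.
        self.collector = collector

    def handle_data(self, data):
        # HTMLParser calls this for the text found between tags.
        data = data.strip()
        if data:
            self.collector.append(data)


collector = []
stripper = SketchHTMLStripper(collector)
stripper.feed("<p>Hello <b>world</b></p>")
stripper.close()
print(" ".join(collector))
```

Because only handle_data is overridden, start and end tags fall through to the parser's default no-op handlers and never reach the collector.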
- class pyzor.digest.PrintingDataDigester(msg, spec=None)
Bases: pyzor.digest.DataDigester
Extends DataDigester: prints out what we’re digesting.
- handle_line(line)
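The pattern described here, overriding handle_line to print each line and then deferring to the parent, can be illustrated in isolation. The classes below are hypothetical stand-ins that take a list of byte strings rather than a message, and SHA-1 is an assumed hash. They show the override pattern only and say nothing about how pyzor implements it.

```python
import hashlib


class TinyDigester:
    """Hypothetical stand-in for a digester: hashes lines one by one."""

    def __init__(self, lines):
        self.digest = hashlib.sha1()  # assumed hash function
        for line in lines:
            self.handle_line(line)
        self.value = self.digest.hexdigest()

    def handle_line(self, line):
        self.digest.update(line)


class TinyPrintingDigester(TinyDigester):
    """Print each line before handing it to the parent for hashing."""

    def handle_line(self, line):
        print(line.decode("utf8"))
        super().handle_line(line)
```

Since the subclass only adds a side effect, both classes yield the same value for the same input, which makes the printing variant useful for debugging what goes into a digest.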
- pyzor.digest.get_digest(msg)
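A possible way to call get_digest is sketched below. This reference does not state what type msg must be; the example assumes it is an email.message.Message object parsed with the standard email module. The wrapper pyzor_digest is a hypothetical helper that returns None when pyzor is not installed.

```python
import email


def pyzor_digest(raw_text):
    """Return the digest for a raw message, or None if pyzor is missing."""
    try:
        from pyzor.digest import get_digest
    except ImportError:
        return None
    # Assumption: get_digest accepts an email.message.Message object.
    msg = email.message_from_string(raw_text)
    return get_digest(msg)


RAW = (
    "From: alice@example.com\n"
    "Subject: greetings\n"
    "\n"
    "This is the body of a short test message.\n"
)

print(pyzor_digest(RAW))
```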