Email6 Working Notes

RFC 3454 may be relevant at some point. It describes how to ‘prepare’ Unicode strings so that you can compare them for equality and have some chance that they will be equal (modulo spelling) even if entered by different methods and/or different people. Python has the stringprep module for working with this protocol.
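As a rough illustration of what ‘preparing’ means, the sketch below applies two of the RFC 3454 steps using the stdlib stringprep tables: mapping (tables B.1 and B.2) followed by NFKC normalization. The function name prepare is made up for this note, and the prohibited-character and bidirectional checks that a real profile requires are left out, so treat it as a simplification rather than a conforming implementation.

```python
import stringprep
import unicodedata


def prepare(text):
    """Simplified stringprep: map (B.1, B.2), then normalize with NFKC.

    Hypothetical helper for illustration only. It skips the prohibited
    output and bidi checks, and it normalizes with the interpreter's current
    Unicode database rather than the Unicode 3.2 data that RFC 3454 assumes.
    """
    mapped = []
    for ch in text:
        # Table B.1: characters "commonly mapped to nothing" are dropped.
        if stringprep.in_table_b1(ch):
            continue
        # Table B.2: case folding intended for use with NFKC.
        mapped.append(stringprep.map_table_b2(ch))
    return unicodedata.normalize("NFKC", "".join(mapped))


# Differently typed inputs end up as the same prepared string.
same_case = prepare("HELLO") == prepare("hello")
no_soft_hyphen = prepare("co\u00adop")      # soft hyphen is in table B.1
fullwidth = prepare("\uff21\uff22")         # fullwidth "AB"
```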

It might be interesting to have the parser generate a “spam score” based on how many RFC rules the message breaks or bends. I’d give the items scores and base the score magnitude on the occurrence of the defects/bends in spam/ham data samples. issue 1155362 (closed) made me think of this; I’ll try to add other relevant issues and notes to this paragraph: issue 1162477 (closed), issue 1672568 (closed).
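One way the idea could look, sketched against the defects list that the present-day email parser already records on each message part. The function name defect_score and the weights are invented placeholders; real weights would have to come from the spam/ham samples mentioned above, which this sketch does not have.

```python
import email
from email import policy

# Hypothetical weights keyed by defect class name. These numbers are
# placeholders, not values derived from any spam/ham corpus.
DEFECT_WEIGHTS = {
    "NoBoundaryInMultipartDefect": 2.0,
    "MissingHeaderBodySeparatorDefect": 1.5,
}
DEFAULT_WEIGHT = 1.0  # assumed weight for any defect not listed above


def defect_score(raw_message):
    """Sum a weight for every defect the parser recorded on any part."""
    msg = email.message_from_string(raw_message, policy=policy.default)
    score = 0.0
    for part in msg.walk():
        for defect in part.defects:
            score += DEFECT_WEIGHTS.get(type(defect).__name__, DEFAULT_WEIGHT)
    return score


clean = "Subject: hi\n\nbody\n"
# Claims to be multipart but supplies no boundary parameter.
broken = "Content-Type: multipart/mixed\n\nbody\n"

clean_score = defect_score(clean)
broken_score = defect_score(broken)
```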

Notice of the partial funding of the grant proposal.

Exarkun has suggested an API involving attributes for headers (msg.subject) and keyword arguments to set them (Message(subject='hello')). Extended:

Message(from=AddressHeader, to=AddressHeader, body=text, attachments=(list-of-Messages))
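A minimal sketch of how such an API might behave, built on top of the current EmailMessage class. Everything here is an assumption made for illustration: the class, the _header_name helper, and in particular the spelling from_, which is needed because from is a reserved word in Python and cannot be used as a keyword argument. Attachments are left out.

```python
from email.message import EmailMessage


class Message(EmailMessage):
    """Hypothetical wrapper: headers as keyword arguments and attributes."""

    def __init__(self, body=None, **headers):
        super().__init__()
        for name, value in headers.items():
            self[self._header_name(name)] = value
        if body is not None:
            self.set_content(body)

    @staticmethod
    def _header_name(attr):
        # 'from_' -> 'From', 'reply_to' -> 'Reply-To'
        words = attr.rstrip("_").split("_")
        return "-".join(word.capitalize() for word in words)

    def __getattr__(self, attr):
        # Only reached when normal attribute lookup fails. Private names are
        # refused so that internal lookups never recurse into header access.
        if attr.startswith("_"):
            raise AttributeError(attr)
        value = self.get(self._header_name(attr))
        if value is None:
            raise AttributeError(attr)
        return value


msg = Message(from_="alice@example.com", to="bob@example.com",
              subject="hello", body="Hi Bob\n")
```

One design wrinkle this exposes: header names that collide with existing attributes or methods (keys, policy, and so on) would not be reachable as attributes, so msg['...'] access would still be needed as a fallback.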

Gregg Lind wants a helper that takes a filename and turns it into an attachable Message, using mimetypes to figure out the type.

Message(..., attachments=[MIMEPart.from_filename(name) for name in attachment_filenames])
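The helper could be as small as the following sketch, written as a plain function over the existing MIMEPart class rather than as the from_filename classmethod proposed above. The name part_from_filename and the fallback to application/octet-stream are assumptions of this sketch.

```python
import mimetypes
from email.message import MIMEPart
from pathlib import Path


def part_from_filename(name):
    """Read a file and wrap it in a MIMEPart with a guessed content type."""
    path = Path(name)
    ctype, encoding = mimetypes.guess_type(path.name)
    if ctype is None or encoding is not None:
        # Unknown or compressed: fall back to a generic binary type.
        ctype = "application/octet-stream"
    maintype, subtype = ctype.split("/", 1)

    part = MIMEPart()
    # Passing bytes plus an explicit type stores the data base64-encoded and,
    # because a filename is given, marks the part as an attachment.
    part.set_content(path.read_bytes(), maintype=maintype, subtype=subtype,
                     filename=path.name)
    return part
```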

RFC 1428 may make interesting background reading.

May need to fish out the most recent RFCs for ISO-2022. Found a note that DICOM uses “the full ISO-2022 standard”, which is apparently harder to parse than the subsets. That’s presumably the codec module’s responsibility and not ours, though. RFC 2237 covers ISO-2022-JP-1. RFC 1554 covers ISO-2022-JP-2.
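For what it is worth, the stdlib codecs already handle the escape-sequence switching for the JP subsets named above, which supports leaving this to the codec layer. A quick check, with the sample text and the roundtrip helper being choices made for this sketch:

```python
# Codec names as registered in the Python standard library.
CODECS = ("iso2022_jp", "iso2022_jp_1", "iso2022_jp_2")

SAMPLE = "\u65e5\u672c\u8a9e"  # three kanji, chosen only as test input


def roundtrip(text, codec):
    """Encode and decode again; return the encoded bytes for inspection."""
    encoded = text.encode(codec)
    if encoded.decode(codec) != text:
        raise ValueError(f"{codec} did not round-trip")
    return encoded


encoded_forms = {codec: roundtrip(SAMPLE, codec) for codec in CODECS}
```

The encoded bytes are 7-bit clean and start with an ESC byte that selects the character set, which is the part a header or body parser would otherwise have to understand.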

http.client’s parse_headers function goes out of its way to cater to some idiosyncrasies of the email parser (and one of the problems is the inability to handle bytes). This should be fixed.

It also has a special subclass that defines a method, getallmatchingheaders, which is used in exactly one place and which I think is probably redundant with get_all (which is used elsewhere in the module) even now, much less with email6.
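To make the comparison concrete, here is what get_all already returns for a repeated header on the object that http.client.parse_headers produces. The header block is invented sample data, and getallmatchingheaders is deliberately not called here.

```python
import http.client
import io

# Invented sample response headers, terminated by a blank line.
RAW_HEADERS = (
    b"Set-Cookie: a=1\r\n"
    b"Set-Cookie: b=2\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
)

headers = http.client.parse_headers(io.BytesIO(RAW_HEADERS))

# get_all is inherited from email.message.Message: it returns every value
# for the named header, in order, matching the name case-insensitively.
cookies = headers.get_all("set-cookie")
missing = headers.get_all("X-Not-There")  # None when the header is absent
```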