IMAP Sucks

OK, first let me say that IMAP rocks. IMAP is great. It is awesome. It rules. It makes it really easy to write a mail reader with lots of features that would otherwise be hard. It handles much of that messy RFC parsing for you.

But this article isn't about how great IMAP is. This article is about why IMAP sucks. So how can something so great suck so much? Well, the idea is great, but the implementation is riddled with inconsistencies that are useless annoyances for the developer tasked with writing an IMAP client.

BODYSTRUCTURE

Far and away the worst nightmare of IMAP is the format for the BODYSTRUCTURE data. This really useful FETCH item returns the MIME structure of a document. Unfortunately, they condense this down into a format that is harder to parse than the original email MIME structure.

For some bizarre reason, they decided that lists with one item are a special case deserving special attention. If a body has only one part, instead of presenting it as a list of body parts with only one item e.g. ((TEXT PLAIN ...)), they make it special, making it explicitly one item (TEXT PLAIN ...). On some human level, this almost feels sensible, since it streamlines the look for a human reader. It's just one of many choices where the protocol designers chose to do something that seems "intuitive".

But computers are not humans. All they've accomplished here is to ensure that the program has to special-case this situation, instead of handling it just like any other list. Just another block of code that is absolutely unnecessary and useless.

But they weren't even consistent about that. If you have a multipart that only has one part, then in that case you do get a list with only one element. Granted, you shouldn't normally have a multipart with only one part, but since it can happen out there in the real world, my code has to handle it.
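To make the complaint concrete, here is a rough Python sketch of the extra branch this forces on a client. It assumes the BODYSTRUCTURE response has already been parsed into nested Python lists of strings and numbers; the function name child_parts and the sample data are made up for illustration.

```python
from itertools import takewhile


def child_parts(body):
    """Return the immediate parts of a parsed BODYSTRUCTURE node as a list.

    Assumption: `body` is a nested Python list produced by some earlier
    parsing step (hypothetical here), not raw protocol text.
    """
    if isinstance(body[0], list):
        # Multipart: one or more part lists, then the subtype string
        # ("MIXED", "ALTERNATIVE", ...) and optional extension data.
        return list(takewhile(lambda item: isinstance(item, list), body))
    # Single part: the node is not wrapped in a list, so wrap it ourselves.
    return [body]


text = ["TEXT", "PLAIN", None, None, None, "7BIT", 12, 1]
html = ["TEXT", "HTML", None, None, None, "7BIT", 40, 2]

print(child_parts(text))                         # bare single part
print(child_parts([text, "MIXED"]))              # multipart with one part
print(child_parts([text, html, "ALTERNATIVE"]))  # ordinary multipart
```

The first two calls describe the same logical situation (one part), yet they arrive in two different shapes, and the `isinstance` check exists only to paper over that.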

MIME documents can contain other kinds of MIME documents, hence there's an idea of sub-lists and containers. But again, instead of going simple, they went hard. There are two different kinds of containers - a multipart container, and a message container. They fall into completely different places in the BODYSTRUCTURE format, and have to be handled completely differently, even though they serve the same purpose. More code to handle things that could have been handled together.

And the top level container in the format is always a message/rfc822 structure. But instead of explicitly stating that, the top level is a special case where you simply know in advance that it's a message/rfc822 because what else could it be. Once again, they made the structure seem more human-intuitive, but at the same time, harder for a computer to parse.

The whole point of parsing the body structure is to figure out what sections it has, so the email client can then request those sections. Because of the nightmare described above, there is no simple relationship between the BODYSTRUCTURE results and the section paths needed to retrieve sections. The list structure of this result should have corresponded exactly to the path structure of the section numbers, but it does not.

I do have to slightly concede the point that message/rfc822 containers are special - they always contain exactly one element. But I think the failure throughout all of this is the refusal to simply accept that and move on. IMAP attempts to "skip over" the rfc822 container layer so that you don't always have the extra path element that aesthetically serves no purpose. In other words, every message would have section 1, the message, and then sub-parts to section 1 if it's multipart. IMAP seems to find never having a top-level section 2 distasteful, but I say so what. Why make my code more complicated to serve some arbitrary designer aesthetic?

Even if you decided this was desirable, it's only a partial explanation. A single exception for skipping over the message/rfc822 layer of the true structure would be acceptable, and everything else could still have been much more regularly formatted.

Somewhat related, but not exactly the same, is the subject of section numbers and paths. Why didn't the designers allow the separator to be used as a leading path element? ".1" should mean the same as "1" (the first section). That way, when assembling paths using a recursive routine that goes through sublists, I wouldn't need a special case to check whether the recursion is at the top level.
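Here is roughly what that recursive routine ends up looking like, as a Python sketch. It assumes the BODYSTRUCTURE has already been parsed into nested Python lists, and that a message/rfc822 part carries its embedded body structure at index 8 (after type, subtype, parameters, id, description, encoding, size and envelope). The names is_multipart and section_paths and the sample data are made up for illustration.

```python
def is_multipart(node):
    # A multipart node begins with nested part lists;
    # a single part begins with its type string.
    return isinstance(node[0], list)


def section_paths(node, path=""):
    """Yield (section, "TYPE/SUBTYPE") for each non-multipart part under node."""
    if is_multipart(node):
        number = 0
        for child in node:
            if not isinstance(child, list):
                break  # reached the multipart subtype string
            number += 1
            # Special case 1: no leading separator at the top level.
            child_path = f"{path}.{number}" if path else str(number)
            yield from section_paths(child, child_path)
        return
    # Special case 2: a top-level single part is still section 1.
    if not path:
        path = "1"
    yield path, f"{node[0]}/{node[1]}".upper()
    if node[0].upper() == "MESSAGE" and node[1].upper() == "RFC822":
        inner = node[8]  # body structure of the embedded message
        # Special case 3: the rfc822 layer adds no path element of its own
        # when the embedded body is multipart.
        inner_path = path if is_multipart(inner) else f"{path}.1"
        yield from section_paths(inner, inner_path)


text = ["TEXT", "PLAIN", None, None, None, "7BIT", 12, 1]
html = ["TEXT", "HTML", None, None, None, "7BIT", 40, 2]
forwarded = ["MESSAGE", "RFC822", None, None, None, "7BIT", 200, None,
             [text, html, "ALTERNATIVE"], 9]
message = [text, forwarded, "MIXED"]

for section, mime_type in section_paths(message):
    print(section, mime_type)
```

Three special cases for what ought to be a plain tree walk. If ".1" were a legal spelling of "1", and single parts were wrapped like everything else, the first two would disappear.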

General Parsing Complications

IMAP has five different kinds of objects. It has plain objects: a word like hello, without any interfering special characters, can be an object. If you need to include a space or some other special character, you can quote the text with double quotes, as in "hello there". Some optional codes are included in square brackets, e.g. [UIDVALIDITY 123456789], and that's one whole object. There are also sub-lists, which sit inside matching parentheses. So for special characters we have the double quote, square brackets, parentheses, and we also have carriage return and linefeed. There's one more kind of quote, which can handle quoting any character by providing a string length followed by that number of characters, which can be anything and don't have to be parsed at all. It looks like this: {31}<CR><LF>hello<CR><LF>there (dude named "tom")
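To show how much machinery those five kinds of objects need, here is a minimal Python sketch of a tokenizer for them. It is a simplification under stated assumptions, not a complete IMAP parser: it works on a complete byte string already in memory, the atom rule is cruder than the real grammar, and the names parse_objects, ATOM and QUOTED are my own. A length-prefixed literal is read as a byte count in braces, then CR LF, then exactly that many raw bytes. The sample line is contrived to contain all five kinds and is not a real server response.

```python
import re

ATOM = re.compile(rb'[^\s()\[\]{}"]+')      # plain object (simplified)
QUOTED = re.compile(rb'"((?:[^"\\]|\\.)*)"')  # double-quoted string


def parse_objects(data, pos=0, closer=b""):
    """Parse objects from data[pos:] until `closer` or the end of the data.

    Returns (items, next_pos). Lists become Python lists; bracketed codes
    become ("code", [...]) tuples; everything else becomes bytes.
    """
    items = []
    while pos < len(data):
        ch = data[pos:pos + 1]
        if ch in b" \r\n":
            pos += 1                       # separator or end of line
        elif closer and ch == closer:
            return items, pos + 1          # end of the current list or code
        elif ch == b"(":
            sub, pos = parse_objects(data, pos + 1, b")")
            items.append(sub)
        elif ch == b"[":
            sub, pos = parse_objects(data, pos + 1, b"]")
            items.append(("code", sub))
        elif ch == b'"':
            match = QUOTED.match(data, pos)
            if match is None:
                raise ValueError(f"unterminated quoted string at {pos}")
            # Drop the backslash from escaped characters.
            items.append(re.sub(rb"\\(.)", rb"\1", match.group(1)))
            pos = match.end()
        elif ch == b"{":
            end = data.index(b"}", pos)
            size = int(data[pos + 1:end])
            start = end + 3                # skip "}" and CR LF
            items.append(data[start:start + size])  # raw bytes, unparsed
            pos = start + size
        else:
            match = ATOM.match(data, pos)
            if match is None:
                raise ValueError(f"unexpected {ch!r} at {pos}")
            items.append(match.group())
            pos = match.end()
    return items, pos


line = (b'* OK [UIDVALIDITY 123456789] hello "hello there" (a (b c)) '
        b'{31}\r\nhello\r\nthere (dude named "tom")\r\n')
items, _ = parse_objects(line)
for item in items:
    print(item)
```

Notice that the literal branch is the only one that never has to look at the content it is reading.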

Why, oh WHY didn't they just stick with string length, and lists, and dump everything else? Yes, maybe it's not a bad idea to have the IMAP protocol be somewhat human-readable. But the server and client have to go through massive contortions on every object figuring out how to format or parse it. If every object were string-length encoded, the server and client would have a very simple job.
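For comparison, here is a Python sketch of what a "string length and lists only" encoding could look like. To be clear, this is a hypothetical wire format invented for illustration, not anything IMAP defines; it simply reuses the brace-count notation for every string and parentheses for lists. The names encode and decode are made up.

```python
def encode(obj):
    """Encode bytes as {n}CRLF<bytes>, and a list as ( ... ) of encoded items."""
    if isinstance(obj, bytes):
        return b"{" + str(len(obj)).encode("ascii") + b"}\r\n" + obj
    return b"(" + b"".join(encode(item) for item in obj) + b")"


def decode(data, pos=0):
    """Decode one object starting at data[pos]. Returns (object, next_pos)."""
    ch = data[pos:pos + 1]
    if ch == b"(":
        items = []
        pos += 1
        while data[pos:pos + 1] != b")":
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos + 1
    if ch == b"{":
        end = data.index(b"}", pos)
        size = int(data[pos + 1:end])
        start = end + 3                    # skip "}" and CR LF
        return data[start:start + size], start + size
    raise ValueError(f"unexpected {ch!r} at {pos}")


value = [b"hello", [b'there (dude named "tom")', b""], b"a\r\nb"]
wire = encode(value)
print(wire)
print(decode(wire)[0] == value)
```

Two branches, no escaping rules, and no character inside a string ever needs to be inspected. The price is that the wire format is less pleasant to read by eye.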

But even within its own convoluted rules for having useless extra ways of quoting things, IMAP can't resist the temptation to be yet more inconsistent. Server responses like OK and BAD are followed by a text string that can contain any special characters except linefeed and carriage return, and does not need to be quoted. The string is terminated by EOL. Great. Up until this bit of bad news, you could at least parse out the IMAP objects without regard to content, even though it was needlessly complicated. Now you actually have to look at the content while you're parsing out the objects, and provide special handling for these responses. Still more useless extra code.
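In practice this means peeking at the second word of each line before deciding how to parse the rest. A minimal Python sketch of that special handling follows, assuming one complete response line at a time; the name split_status and the sample lines are made up, and besides OK and BAD the pattern also covers the NO, BYE and PREAUTH status responses, which carry the same kind of free text.

```python
import re

# tag, status word, optional [response code], then free text up to end of line
STATUS = re.compile(
    r"(\S+) (OK|NO|BAD|BYE|PREAUTH) (?:\[([^\]]*)\] )?(.*)",
    re.IGNORECASE,
)


def split_status(line):
    """Split a status response into (tag, status, code, text).

    Returns None for any other kind of line, which must then go through
    the ordinary object parser instead.
    """
    match = STATUS.fullmatch(line.rstrip("\r\n"))
    if match is None:
        return None
    tag, status, code, text = match.groups()
    return tag, status.upper(), code, text


print(split_status('a001 NO [ALERT] Mailbox "foo" (busy)\r\n'))
print(split_status("* OK ready (really)\r\n"))
print(split_status("* 3 EXISTS\r\n"))
```

The trailing text in the first two lines contains quotes and parentheses that the object parser would choke on or misread, which is exactly why the content has to be inspected first.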

Carriage returns

IMAP adds carriage returns to plain text email whose lines are separated only by linefeeds. They claim that carriage-return linefeed is the internet standard for line separators. Maybe that's a justified claim, and maybe it's the insane ravings of one man. I don't really care one way or the other. No other mail transports felt the need to mess with perfectly legal and legitimate email data. IMAP shouldn't either. IMAP should keep its greasy little fingers off the payload unless it's absolutely necessary.

envelope FROM

There's no way in IMAP to get the envelope From (SMTP MAIL FROM) data. I don't want to get into RFC wars over whether or not this information is needed, or useful, or trustworthy. They could've provided a means to get this info to those who wanted it, and they didn't. Enough said.

mbox format concurrent access

IMAP "doesn't allow" concurrent access to mbox format files. I have no idea why. As far as I know, a server could implement this. It would be tricky in spots, but it's doable.

arbitrary header lines

The only thing IMAP lets you store in messages when you save them is flags. I'd like to be able to add arbitrary header lines. For example, a deleted date would allow me to expunge deletions a fixed time after deletion without the use of external indexes.

Unseen versus unread

IMAP does not allow you to distinguish between messages that are marked unread, and messages that have no status line at all.

Sending mail

IMAP has no provision for sending email. This is a problem because authenticated mail sending is the way of the future, and IMAP therefore requires users to authenticate a second time to send mail, even though they are already authenticated (or it encourages applications to cache passwords in memory, which is a REALLY REALLY bad idea).

No Automatic Updates in EXAMINE

When you have a mailbox SELECTed, IMAP automatically notifies you of newly received messages. But when you EXAMINE a mailbox, you receive no such notifications. This is particularly a problem because of mailbox types that do not have an implementation for concurrent access.

OK, BAD, NO - inconsistent meaning

Apparently there is no consistent mapping from BAD/NO to whether or not the failed command changed state.

Synchronization

Is synchronization of multiple connections really NOT required!?!?!?
Tom Fine's Home Send Me Email