Developing a C# POP3 client: First Release on CodePlex

Finally a first release of Opo.Net on CodePlex! Yeah! :-)

With this release it's possible to connect to a POP3 server, receive messages and convert them into Opo.Net.MailMessage instances. Sending mail messages via SMTP is the next step. I think that will be a lot easier now because some classes can be reused for the opposite direction.

A very simple example for using the Pop3Client:

Pop3Client pop3 = new Pop3Client("pop.example.org", 110, "accountName", "password");
pop3.Connect();
pop3.Login();
string mimeData = pop3.GetMessage(1); // receive the first message on the server
pop3.Logout();
pop3.Disconnect(); 
IMailMessageConverter converter = new MimeMailMessageConverter();
IMailMessage message = converter.ConvertFrom(mimeData);

Console.WriteLine("Subject: " + message.Subject);
Console.WriteLine("From: " + message.From.ToString());
Console.WriteLine("To: " + message.To.ToString());
Console.WriteLine("");
Console.WriteLine(message.Body);

This will output something like this:

Subject: Test message
From: "Example Email 1" <email1@example.org>
To: "Example Email 2" <email2@example.org>, "Example Email 3" <email3@example.org>

This is the message body.

Posted by: Dave
Posted on: 11/7/2008 at 10:17 PM

RFC 2822 compliant date parser

You may have read my series of posts on writing a POP3 client in C#. Normally you'll receive mail messages in MIME format. Dates in MIME messages conform to RFC 2822 (http://www.ietf.org/rfc/rfc2822.txt, section 3.3):

Example: Mon, 01 Jan 2001 00:00:00 +0100 (the parser below treats the day name, the seconds and the time zone offset as optional)

It's also possible that the time zone is given as a name like "GMT" or "PST" instead of a numeric offset (+0100). These names are obsolete but still used occasionally. The goal is to parse these dates and store them as UTC. Here is the code with some additional comments:

public DateTime ParseRfc2822Date(string date)
{
	// replace alphabetical time zones with numerical
	date = date.ToLower();
	date = date.Replace("bst", "+0100");
	date = date.Replace("gmt", "-0000");
	date = date.Replace("edt", "-0400");
	date = date.Replace("est", "-0500");
	date = date.Replace("cdt", "-0500");
	date = date.Replace("cst", "-0600");
	date = date.Replace("mdt", "-0600");
	date = date.Replace("mst", "-0700");
	date = date.Replace("pdt", "-0700");
	date = date.Replace("pst", "-0800");

	DateTime parsedDateTime = DateTime.MinValue;

	// Regular expression that matches RFC 2822 compliant dates, contains two groups "DateTime" and "TimeZone"
	string pattern = @"(?:(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun), )?";
	pattern += @"(?<DateTime>\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{4} \d{2}\:\d{2}(?:\:\d{2})?)";
	pattern += @"(?: (?<TimeZone>[\+-]\d{4}))?";
	Regex r = new Regex(pattern, RegexOptions.IgnoreCase);
	Match m = r.Match(date);
	if (m.Success)
	{
		// Remove a leading "0" from the day so that all dates match the same pattern ("d MMM yyyy")
		string dateTime = m.Groups["DateTime"].Value.TrimStart('0');
		
		// Parse the date and time parts (without time zone); "HH" is the 24-hour clock
		parsedDateTime = DateTime.ParseExact(dateTime, new string[] { "d MMM yyyy HH:mm", "d MMM yyyy HH:mm:ss" }, CultureInfo.InvariantCulture, DateTimeStyles.None);

		// If time zone is declared, set the offset
		string timeZone = m.Groups["TimeZone"].Value;
		if (timeZone.Length == 5)
		{
			// Create new TimeSpan representing the time zone offset to UTC;
			// the sign applies to the minutes as well (e.g. "-0530")
			int hour = Int32.Parse(timeZone.Substring(0, 3));
			int minute = Int32.Parse(timeZone.Substring(3));
			if (timeZone[0] == '-') minute = -minute;
			TimeSpan offset = new TimeSpan(hour, minute, 0);

			// Set the offset using DateTimeOffset
			parsedDateTime = new DateTimeOffset(parsedDateTime, offset).UtcDateTime;
		}
	}
	return parsedDateTime;
}

You can download this code as part of the Opo.Net project on CodePlex.
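One detail of the offset handling is easy to get wrong: the sign of a zone like "-0530" belongs to the hours and the minutes. The following is only a minimal sketch of that step in isolation. The names Rfc2822OffsetSketch, ParseOffset and ToUtc are made up for this illustration and are not part of Opo.Net, and the sketch assumes the zone is already in the numeric five-character form.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only (not part of Opo.Net).
public static class Rfc2822OffsetSketch
{
    // Converts a numeric zone such as "+0100" or "-0530" into a TimeSpan.
    // Assumes exactly five characters: a sign followed by four digits.
    public static TimeSpan ParseOffset(string zone)
    {
        int sign = zone[0] == '-' ? -1 : 1;
        int hours = int.Parse(zone.Substring(1, 2), CultureInfo.InvariantCulture);
        int minutes = int.Parse(zone.Substring(3, 2), CultureInfo.InvariantCulture);

        // Apply the sign to both components so that "-0530" becomes -05:30.
        return new TimeSpan(sign * hours, sign * minutes, 0);
    }

    // Interprets "local" as wall-clock time in the given zone and returns UTC.
    public static DateTime ToUtc(DateTime local, string zone)
    {
        DateTime unspecified = DateTime.SpecifyKind(local, DateTimeKind.Unspecified);
        return new DateTimeOffset(unspecified, ParseOffset(zone)).UtcDateTime;
    }
}
```

With this helper, midnight on 1 Jan 2001 at "+0100" maps to 23:00 UTC on the previous day, and the same wall-clock time at "-0530" maps to 05:30 UTC.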


Posted by: Dave
Posted on: 11/2/2008 at 5:54 PM

Developing a C# POP3 Client: Part 05 - MIME Messages

This is number five in a series of posts on developing a POP3 client in C#.

MIME - Multipurpose Internet Mail Extensions

With the POP3 client developed so far, we are able to download mail messages from the server. We receive these messages as plain text, which is not very comfortable to read and definitely not suitable for displaying in an application. At first these messages may look a bit chaotic, but in fact they are structured in a sort of object-oriented way.

I'll now give a short overview of the structure of MIME messages. If you'd like more detailed information, I recommend reading the MIME article on Wikipedia and the RFCs related to MIME, starting with RFC 2045.

A MIME message consists of one or several parts called entities. Each entity has some headers and a body which can hold either some content, like the message text or an attachment, or other entities.

There's only one header which occurs in every entity: Content-Type. This can be any kind of Internet Media Type like text/html, image/gif or audio/mpeg. There are two types you may not be familiar with but which are important for MIME messages. These are multipart/mixed and multipart/alternative.

Multipart/mixed defines an entity which contains other entities of various content and content types. Multipart/alternative is used to offer alternative views of the same content, for example as a plain text message and as an HTML formatted message.

The sample MIME message shown in the figure below is a message with the content type "multipart/mixed" which contains a multipart/alternative part with a plain text and the HTML view and two attachments:

[Figure: MIME message structure]

So I said that MIME is sort of object-oriented. Each box in the figure is an object with some headers and either some content or a collection of other objects of the same type (maybe not the same content type, but always with headers and either some content or a collection of further objects of the same type). :-)
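To make the object idea a bit more concrete, here is a rough sketch of how such an entity could be modelled as a recursive type. The class name MimeEntitySketch and its members are assumptions made for this illustration only; they are not the classes used in Opo.Net.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of a MIME entity, for illustration only.
public class MimeEntitySketch
{
    public MimeEntitySketch(string contentType)
    {
        Headers["Content-Type"] = contentType;
    }

    // Header names are compared case-insensitively.
    public Dictionary<string, string> Headers { get; } =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Raw (still encoded) content of a non-multipart entity.
    public string Body { get; set; }

    // Child entities of a multipart entity; empty for leaf entities.
    public List<MimeEntitySketch> Parts { get; } = new List<MimeEntitySketch>();

    public bool IsMultipart
    {
        get { return Headers["Content-Type"].StartsWith("multipart/", StringComparison.OrdinalIgnoreCase); }
    }

    // Counts the entities that carry actual content (texts, attachments).
    public int CountLeaves()
    {
        if (Parts.Count == 0) return 1;
        int total = 0;
        foreach (MimeEntitySketch part in Parts) total += part.CountLeaves();
        return total;
    }
}
```

The sample message from the figure would then be a multipart/mixed entity whose Parts hold one multipart/alternative entity (with a text/plain and a text/html child) plus the two attachments, which gives four leaves in total.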

MIME Headers

The headers section of a MIME entity is composed as follows (example):
From: <sender@example.org>
To: "Recipient" <recipient@example.org>
Subject: Here comes the subject of the email
Date: Fri, 16 May 2008 20:23:48 +0100
MIME-Version: 1.0
Content-Type: multipart/mixed;
      boundary="-----=_NextPart_000_001A_123849.12A98DE"
X-Priority: 3
...

It's always the name of the header followed by a colon and a value. Maybe you have noticed the indentation of the seventh line. The Content-Type header can carry additional parameters, so after the media type there's a semicolon and the parameter (here the boundary) continues on the next line, which starts with some whitespace.

So parsing the headers should really not be very difficult.
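As a rough idea of what that could look like, the following sketch reads a header block line by line and appends indented continuation lines to the previous header. The names MimeHeaderSketch and Parse are invented for this example, and the sketch makes a simplifying assumption: if a header name occurs more than once, only the last value is kept.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical header parser, for illustration only.
public static class MimeHeaderSketch
{
    public static Dictionary<string, string> Parse(string headerBlock)
    {
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string currentName = null;

        foreach (string rawLine in headerBlock.Split('\n'))
        {
            string line = rawLine.TrimEnd('\r');

            // An empty line separates the headers from the body.
            if (line.Length == 0) break;

            // A line starting with whitespace continues the previous header.
            if ((line[0] == ' ' || line[0] == '\t') && currentName != null)
            {
                headers[currentName] += " " + line.Trim();
                continue;
            }

            int colon = line.IndexOf(':');
            if (colon <= 0) continue; // not a valid header line, skip it

            currentName = line.Substring(0, colon).Trim();
            // Simplification: a repeated header overwrites the earlier value.
            headers[currentName] = line.Substring(colon + 1).Trim();
        }
        return headers;
    }
}
```

For the example above, the Content-Type entry would end up as a single value containing both the media type and the boundary parameter, which can then be split at the semicolon.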

Multipart/...

Entities with a Content-Type of "multipart/..." must define a boundary string which is used to separate the different parts of the multipart entity. It can consist of almost any characters, but it must not occur anywhere else in the message. In the body, each boundary is written on its own line with a preceding "--", and the last one, which closes the multipart entity, also has a trailing "--". So for two parts there are three boundary lines: one before the first part, one between the two parts and one after the second part.

...
Content-Type: multipart/mixed;
      boundary="-----=_NextPart_000_001A_123849.12A98DE"
...

-------=_NextPart_000_001A_123849.12A98DE
Content-Type: multipart/alternative;
      boundary="-----=_NextPart_001_001B_159832.124F987"

-------=_NextPart_001_001B_159832.124F987
Content-Type: text/plain;
      charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

This is the plain text content of the message.

-------=_NextPart_001_001B_159832.124F987
Content-Type: text/html;
      charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<html><body>
<p>This is the html text content of the message.</p>
</body></html>
-------=_NextPart_001_001B_159832.124F987--

-------=_NextPart_000_001A_123849.12A98DE
Content-Type: image/gif;
      name="image.gif"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
      filename="image.gif"

Nt2YcPes8V1lfZc7GvoeAMEBrhps15YA9K6EABRe8VivewAIvwIAbO08iC/n4raleCZIXtuFi7wB
Aj8QFwx+8JXjSN8K0DB5JT5ecpk/QOK15bgleD7rjOtcOTBPOQCE7geaD3zoO1f6AAjw8AsIAADL
HQPSmc4anN/26QAo7wkKAAACVP2xHbc2ym8bAAB0HdXlDTvIQxB1rDNH67slLG99LnconL3ub996
/3fRq371rG+9618P+9jLfva0r73tb4/73Ot+97zvve9/D/zgC3/4xC++8Y+P/ORDPAEAOw==

-------=_NextPart_000_001A_123849.12A98DE--
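Splitting a multipart body at its boundary lines could be sketched like this. MultipartSketch and SplitParts are made-up names for this illustration. The sketch assumes that the boundary value has already been read from the Content-Type header and that the closing boundary line ends with "--" as described in RFC 2046; text before the first boundary and after the closing one is ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical multipart splitter, for illustration only.
public static class MultipartSketch
{
    // Returns the raw text (headers and body) of each part of a multipart body.
    public static List<string> SplitParts(string body, string boundary)
    {
        string delimiter = "--" + boundary;
        var parts = new List<string>();
        StringBuilder current = null; // null while we are outside of any part

        foreach (string rawLine in body.Split('\n'))
        {
            string line = rawLine.TrimEnd('\r', ' ', '\t');

            if (line == delimiter + "--")
            {
                // Closing boundary: store the last part and stop.
                if (current != null) parts.Add(current.ToString());
                current = null;
                break;
            }
            if (line == delimiter)
            {
                // Boundary between parts: store the previous part, start a new one.
                if (current != null) parts.Add(current.ToString());
                current = new StringBuilder();
                continue;
            }
            if (current != null) current.Append(line).Append("\r\n");
        }

        // Be tolerant of messages where the closing boundary is missing.
        if (current != null) parts.Add(current.ToString());
        return parts;
    }
}
```

Each returned string is itself a MIME entity, so a nested multipart/alternative part can be split again by calling the same method with the inner boundary.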

Content-Transfer-Encoding

quoted-printable

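Quoted-printable is an encoding that uses only printable ASCII characters. Bytes that are not printable ASCII, as well as the "=" sign itself, are encoded as "=" followed by two hexadecimal digits which represent the byte's numeric value. For more details have a look at the Quoted-printable article on Wikipedia. To decode quoted-printable text we basically have to search it for =XX, where XX is a hexadecimal number, and replace it with the corresponding character. Note that this simple approach treats every encoded byte as one character, which is fine for single-byte charsets like iso-8859-1, and that it does not remove soft line breaks (a "=" at the end of a line):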

//... 
Regex hexRegex = new Regex(@"(\=([0-9A-F][0-9A-F]))", RegexOptions.IgnoreCase); 
content = hexRegex.Replace(content, new MatchEvaluator(HexMatchEvaluator)); 
//... 
private static string HexMatchEvaluator(Match m) 
{ 
    int dec = Convert.ToInt32(m.Groups[2].Value, 16); 
    char character = Convert.ToChar(dec); 
    return character.ToString(); 
} 
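For multi-byte charsets such as UTF-8, and for long lines that were wrapped with a "=" at the end of the line, a byte-oriented variant may be more robust. The following is only a sketch of that idea; QuotedPrintableSketch and Decode are hypothetical names, and the caller is assumed to pass the charset taken from the Content-Type header.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical byte-oriented quoted-printable decoder, for illustration only.
public static class QuotedPrintableSketch
{
    public static string Decode(string input, Encoding charset)
    {
        using (var bytes = new MemoryStream())
        {
            for (int i = 0; i < input.Length; i++)
            {
                char c = input[i];
                if (c == '=' && i + 1 < input.Length)
                {
                    // Soft line break: "=" directly followed by CRLF or LF is dropped.
                    if (input[i + 1] == '\r' && i + 2 < input.Length && input[i + 2] == '\n')
                    {
                        i += 2;
                        continue;
                    }
                    if (input[i + 1] == '\n')
                    {
                        i += 1;
                        continue;
                    }
                    // "=XX": one encoded byte.
                    if (i + 2 < input.Length && Uri.IsHexDigit(input[i + 1]) && Uri.IsHexDigit(input[i + 2]))
                    {
                        bytes.WriteByte(Convert.ToByte(input.Substring(i + 1, 2), 16));
                        i += 2;
                        continue;
                    }
                }
                // Everything else is expected to be plain ASCII and is copied as is.
                bytes.WriteByte((byte)c);
            }
            // Only now are the collected bytes turned into characters.
            return charset.GetString(bytes.ToArray());
        }
    }
}
```

Collecting bytes first and converting them in one go means that a character encoded as several =XX groups is decoded as a single character.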

base64

Base64 encoding is mostly used for attachments. For more details on how it works, read the Base64 article on Wikipedia. I'll just show you the code to decode a base64 encoded string. The easiest way may be the following:

public static byte[] Base64Decode(string s)
{
    // The result is the raw binary content of the attachment (e.g. the bytes of a GIF file).
    // Convert.FromBase64String ignores white space such as the line breaks in the encoded text.
    return Convert.FromBase64String(s);
}

Parsing MIME messages

These are the basics you have to know about MIME messages. Now we can begin to parse the messages and bring them into a usable format, and this will be the topic of my next post. Just have a little patience. :-)


Posted by: Dave
Posted on: 5/16/2008 at 10:49 PM