How did we ever survive without e-mail? I use e-mail for just about everything. My users send me an e-mail when they have a program request. My programs send me e-mail when there's a problem, and they send me diagnostics. I use e-mail to make plans with friends. My wife sends me e-mail with lists of items to pick up from the store on my way home from work. E-mail even provides the main form of communication between me, my copy editor, and the newsletter coordinator while I'm writing this newsletter for you. And of course, it's used to distribute it, too!
As a programmer, I find that it's never enough to know how useful something is. I have to know how it works. How can I use it in my programs? This week, I decided to start a series of articles that explore how e-mail works from a programmer's perspective. In this first installment, I write about the format of e-mail messages. I show you some code that creates some messages and sends them with the Send MIME Mail (QtmmSendMail) API.
Originally, e-mail was a text-only medium. You could send only plain 7-bit US-ASCII characters in a message. Incorporating fonts, colors, pictures, or attachments was impossible. That's how the e-mail standard was designed, and it's how all the e-mail programs were written.
Rather than explain the plain-text format in detail, I show you what it looks like and then give you a brief description of what it means. If you need the full details, please refer to the official standard, which is documented in RFC 2822.
Here's a sample of what a standard plain-text e-mail message looks like:
From: Scott Klement <sklement@iseriesnetwork.com>
To: Faithful Reader <freader@example.com>
Date: Thu, 27 Jul 2006 10:21:06 -0500
Subject: Testing a text-only message

Dear Faithful,

This demonstrates a simple e-mail message. It's text-only (no fonts, colors,
pictures, attachments), which is all that was allowed in the original e-mail
standard.

Notice that none of the lines of this e-mail exceeds 78 characters, which is
what is strongly recommended (but not required) by RFC 2822.

Have a nice day,
-- Scott
At the start of the message is a group of keywords. These lines consist of a keyword (From, To, Date, and Subject) followed by a colon, followed by the value of that keyword. You can think of these keywords as variables to be set in the programs that process the message. The group of keywords is collectively known as the "header" of the e-mail message. The first blank line in the message denotes the end of the headers and the start of the text message itself.
The header can contain other keywords as well, including ones that you make up yourself:
From: Scott Klement <sklement@iseriesnetwork.com>
To: Faithful Reader <freader@example.com>
Date: Thu, 27 Jul 2006 10:21:06 -0500
Subject: Testing a text-only message
X-AS400-UserID: KLEMSCOT
X-Virus-Scanned: by ClamAV
X-Scotts-Cool: Yeah
The headers that start with X- are those that I made up. If the message should be bounced due to an error, I can see those keywords, and they might help me diagnose the error. They're also useful if you write a program to process the message, because you can embed information intended for that program.
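To make the header layout concrete, here is a minimal sketch in Python (chosen only because it is compact; it is not part of the RPG code in this article). It splits a message at the first blank line and collects the keywords, including the made-up X- ones. The function name split_message is hypothetical, and the sketch assumes CRLF line endings, no continuation lines, and no repeated keywords.

```python
# Sample message; the header ends at the first blank line.
RAW = (
    "From: Scott Klement <sklement@iseriesnetwork.com>\r\n"
    "To: Faithful Reader <freader@example.com>\r\n"
    "Date: Thu, 27 Jul 2006 10:21:06 -0500\r\n"
    "Subject: Testing a text-only message\r\n"
    "X-AS400-UserID: KLEMSCOT\r\n"
    "X-Scotts-Cool: Yeah\r\n"
    "\r\n"
    "Dear Faithful,\r\n"
)


def split_message(raw):
    """Return (headers, body); headers maps each keyword to its value."""
    head, _, body = raw.partition("\r\n\r\n")  # first blank line ends the header
    headers = {}
    for line in head.split("\r\n"):
        keyword, _, value = line.partition(":")  # split at the first colon only
        headers[keyword.strip()] = value.strip()
    return headers, body


headers, body = split_message(RAW)

# Keywords that start with X- are the user-defined ones.
custom = {k: v for k, v in headers.items() if k.startswith("X-")}
```

A program that processes bounced or automated mail could look at the custom dictionary to pick up the information that was embedded for it.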
Message headers can be continued on the next line if necessary. To do so, start the next line with a space or with a tab character. For example:
From: Scott Klement <sklement@iseriesnetwork.com>
To: Faithful Reader <freader@example.com>,
    Bob the Builder <canhefixit@yeshecan.com>,
    Mickey Mouse <mmouse@disney.com>,
    George Washington <gwash@washington.edu>
Date: Thu, 27 Jul 2006 10:21:06 -0500
Subject: Testing a text-only message
In the preceding example, the value of the To keyword didn't fit nicely on one line, so I wrapped it onto additional lines. Because each of those lines starts with a space, the mail reader understands that it's a continuation of the previous line.
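A program that reads headers has to undo this wrapping before it looks at the values. The following Python sketch (Python rather than RPG, purely for brevity) shows one way to do that: it removes every line break that is immediately followed by a space or tab. The function name unfold is hypothetical.

```python
import re


def unfold(header_text):
    """Join continuation lines (those starting with a space or tab)
    onto the line before them by deleting the preceding line break."""
    return re.sub(r"\r?\n(?=[ \t])", "", header_text)


folded = (
    "To: Faithful Reader <freader@example.com>,\r\n"
    " Bob the Builder <canhefixit@yeshecan.com>\r\n"
    "Subject: Testing a text-only message\r\n"
)

unfolded = unfold(folded)
lines = unfolded.split("\r\n")
```

Only the line break is removed; the leading space of the continuation line stays in the value, so the addresses remain separated.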
The message header is transmitted with the e-mail message anywhere it's sent across the Internet. Any program that touches the message adds its own headers as needed and typically leaves the other headers intact. When e-mail programs display the message to the user, they display only those headers that they think the user wants to see (e.g., the date and subject) and hide the others.
The plain-text e-mail format is the most efficient, if least flexible, way to format an e-mail message. Many people prefer plain-text messages. I recommend that you use this format unless you need a feature that it doesn't offer.
When you do need a feature that plain text doesn't offer, such as fonts, colors, pictures, or attachments, the Multipurpose Internet Mail Extensions (MIME) standard addresses that need. MIME provides keywords that can be used to break a message into multiple parts and tell a mail reader what type of data is stored in each part. The official standards for MIME are described in RFCs 2045, 2046, 2047, 2048, and 2049.
To provide more advanced text formatting, MIME lets you put HTML tags in your message, as long as you tell the mail reader that the message is in HTML format instead of plain text. For example, here's an HTML e-mail message in MIME format:
From: Scott Klement <sklement@iseriesnetwork.com>
To: Faithful Reader <freader@example.com>
Date: Thu, 27 Jul 2006 10:21:06 -0500
Subject: Testing an HTML message
MIME-Version: 1.0
Content-type: text/html

<html>
<head>
<style type="text/css">
.red {font-size: 14pt; font-family: Arial; color: #ff0000;}
.normal {font-size: 14pt; font-family: Arial;}
.emphasis {font-style: italic;
font-weight: bold;
font-size: 14pt;
font-family: Arial;}
</style>
</head>
<body>
<p class="normal">Dear Faithful,</p>
<p class="normal">This demonstrates a MIME message. In the header, it
identifies itself as MIME with the MIME-Version keyword. It tells the mail
reader that it uses HTML for the text.</p>
<p class="normal">HTML provides capabilities that text-only e-mail cannot.</p>
<p class="emphasis">For example, I can provide emphasis!</p>
<p class="red">I can even use color!</p>
<p class="normal">I still keep my message text wrapped at 78 characters,
but due to the nature of HTML, it'll all be stuck together when it's
displayed. I need to use HTML paragraph and line break tags to make the
formatting look nice.</p>
<p class="normal">Have a nice day,<br />
<i>Scott Klement</i></p>
</body>
</html>
As you can see, the header of the message is almost identical to the original. The MIME-Version keyword has been added, and this tells the e-mail reader that it's a MIME message. The Content-Type keyword is another keyword defined in the MIME standard, and it describes the format of the message, in this case, telling the mail reader that the message is in HTML format.
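For comparison, here is a small sketch of how a program might produce those two MIME keywords. It uses Python's standard email package rather than RPG, simply because the library generates the headers for you; the HTML text is a shortened stand-in for the sample above.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "Scott Klement <sklement@iseriesnetwork.com>"
msg["To"] = "Faithful Reader <freader@example.com>"
msg["Subject"] = "Testing an HTML message"

# subtype="html" makes the library emit Content-Type: text/html;
# it also adds the MIME-Version keyword automatically.
msg.set_content(
    "<html><body><p>Dear Faithful,</p></body></html>\n",
    subtype="html",
)

wire_text = msg.as_string()  # headers, a blank line, then the HTML
```

Note that the library also adds a charset parameter and a Content-Transfer-Encoding keyword of its own, which the hand-written sample doesn't include.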
Not everyone wants to read their mail in HTML format, however. Some people prefer HTML, and others prefer plain text. MIME makes it possible to let people choose which format they prefer. For example:
From: Scott Klement <sklement@iseriesnetwork.com>
To: Faithful Reader <freader@example.com>
Date: Thu, 27 Jul 2006 10:21:06 -0500
Subject: Testing both alternatives
MIME-Version: 1.0
Content-type: multipart/alternative; boundary="--=_ScottsBoundaryYeah"

Your mail reader doesn't support MIME!

----=_ScottsBoundaryYeah
Content-type: text/plain

Dear Faithful,

For your convenience, I've sent you this message in both text and HTML
format. You are currently reading the text version.

Thanks,
Scott

----=_ScottsBoundaryYeah
Content-type: text/html

<html>
<head>
<style type="text/css">
.normal {font-size: 14pt; font-family: Arial;}
.emphasis {font-style: italic;
font-weight: bold;
font-size: 14pt;
font-family: Arial;}
</style>
</head>
<body>
<p class="emphasis">Dear Faithful,</p>
<p class="normal">For your convenience, I've sent you this message in both
text and HTML format. You are currently reading the HTML version.</p>
<p class="normal">Thanks,<br />
Scott</p>
</body>
</html>
----=_ScottsBoundaryYeah--
In the preceding example, I specified a Content-Type of multipart/alternative. This tells the mail reader that I list several different message parts, and each one is an alternative version of the others.
The Content-Type keyword also specifies a boundary string. The mail reader looks for this boundary string to determine where each alternative starts and ends.
At the top of the message text is a sentence that tells the recipient that his mail reader doesn't support MIME. You see, a MIME mail reader looks for a boundary string to start each alternative part of the message, so it doesn't display any text that appears before the first boundary string. An old e-mail reader that's not MIME compatible doesn't understand that it has to look for a boundary, so it displays everything after the message header, including the sentence about MIME compatibility.
When a mail reader looks for a boundary string, it expects the boundary string to be prefixed by two hyphens to indicate the start of a "sub-message" (or "message part"), and it expects the same boundary string to be both prefixed and suffixed by two hyphens to indicate the end of the e-mail message.
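The boundary rules are easy to see in code. Here is a simplified Python sketch (not RPG, and not how any particular mail reader is implemented) that walks the lines of a message body and splits it into parts. The function name split_parts is hypothetical, and the sketch assumes CRLF line endings and delimiter lines with no trailing spaces.

```python
def split_parts(body, boundary):
    """Return the raw text of each message part in a multipart body."""
    start = "--" + boundary        # two hyphens + boundary opens a part
    end = "--" + boundary + "--"   # hyphens on both sides ends the message
    parts = []
    current = None                 # None until the first boundary is seen
    for line in body.split("\r\n"):
        if line == start or line == end:
            if current is not None:
                parts.append("\r\n".join(current))
            if line == end:
                break
            current = []
        elif current is not None:
            current.append(line)
        # Lines seen while current is None are the preamble: ignored.
    return parts


BODY = (
    "Your mail reader doesn't support MIME!\r\n"
    "----=_ScottsBoundaryYeah\r\n"
    "Content-type: text/plain\r\n"
    "\r\n"
    "Dear Faithful,\r\n"
    "----=_ScottsBoundaryYeah\r\n"
    "Content-type: text/html\r\n"
    "\r\n"
    "<p>Dear Faithful,</p>\r\n"
    "----=_ScottsBoundaryYeah--\r\n"
)

parts = split_parts(BODY, "--=_ScottsBoundaryYeah")
```

Because the boundary string in this sample itself begins with two hyphens, the delimiter lines in the message begin with four.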
As the programmer, you can use anything that you like for a boundary string (I used "--=_ScottsBoundaryYeah" in this example). However, you should try to pick something that has little chance of occurring in the text of a typical e-mail message.
After each boundary string is another section of keywords that apply only to one message part. In the preceding example, the first part has a Content-Type keyword that identifies it as a plain-text message. The second part has a Content-Type that identifies it as an HTML message. Because these parts are specified as "alternatives" to one another, the e-mail reader program displays only one of them. Typically, users can configure their software to specify whether they prefer text or HTML.
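As an illustration of that choice, the following sketch builds a two-part multipart/alternative message with Python's standard email package and then picks a part according to a preference, roughly the way a mail reader might. It is Python rather than RPG only for brevity; preferred_body is a hypothetical helper name.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Testing both alternatives"

# The first call creates the text/plain part; add_alternative() converts
# the message to multipart/alternative and appends the text/html part.
msg.set_content("You are currently reading the text version.\n")
msg.add_alternative(
    "<p>You are currently reading the HTML version.</p>\n",
    subtype="html",
)


def preferred_body(message, prefer_html):
    """Return the single part a reader with this preference would show."""
    order = ("html", "plain") if prefer_html else ("plain", "html")
    return message.get_body(preferencelist=order)


html_choice = preferred_body(msg, prefer_html=True)
text_choice = preferred_body(msg, prefer_html=False)
```

The library picks its own boundary string, so you don't have to invent one when you build a message this way.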
With multipart/alternative, the mail reader always displays only one part of the message, because the parts are alternatives to one another. What if you want several message parts to be displayed together? That's what multipart/related is for.
For example, if I want to display a picture in an HTML message, I can create one message part that contains the HTML, and another part that contains the picture, and I can mark them as "related" so that they're displayed together. Here's an example of that:
From: Scott Klement <sklement@iseriesnetwork.com>
To: Faithful Reader <freader@example.com>
Date: Thu, 27 Jul 2006 10:21:06 -0500
Subject: Testing HTML with a picture
MIME-Version: 1.0
Content-Type: multipart/related; boundary="--=_ScottsNiftyBoundary"

This is a multipart message in MIME format.

----=_ScottsNiftyBoundary
Content-Type: text/html

<html>
<body bgcolor="#ffffff" text="#000000">
<b>Hey, this is a really silly e-mail message.<br>
<br>
<img alt="" src="cid:sklogo.gif@iseriesnetwork.com"
height="54" width="44"><i>Scott Klement</i><br>
</b>
</body>
</html>
----=_ScottsNiftyBoundary
Content-Type: image/gif; name="sklogo.gif"
Content-Transfer-Encoding: base64
Content-ID: <sklogo.gif@iseriesnetwork.com>
Content-Disposition: inline; filename="sklogo.gif"

R0lGODlhrgDTALMAAAAAAAAAjAAA/wCAAAD/AACAgAD8/IAAAPwAAIAAgP0A/oCAAP//AKTI
8Pr6+gAAACH5BAEAAAwALAAAAACuANMAAAT+kMlJq7046827/2AojmRpImiqrqnpvnAsT2xt
33Ou7+ft/zWecDgEGo8ronI5QjqfKKZ0WoFan9Qs8crFar+xLsJBLpvPDjF43YSi3/C3lU3f
.
.
several more lines of encoded data go here
.
.
4baRrS1Fcctb3bKVt13w7RqAywfhvoS4azWuS4ELq1y4QqG50NVtBAAAOw==
----=_ScottsNiftyBoundary--
Because I used multipart/related in this message, each message part is related to the others. In the HTML of the preceding example, I refer to an image (i.e., a picture) using an <img> tag. The target of that tag is a string that starts with "cid:". CID stands for "content-id" and refers to the contents of another message part. In the second message part, you see a Content-ID keyword that matches the one referenced by the <img> tag.
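Here is a sketch of how the cid link might be produced programmatically, using Python's standard email package (not RPG, and not the code used to create the sample above). The image bytes are a placeholder, not a real picture; the Content-ID value is the one from the sample.

```python
from email.message import EmailMessage

CONTENT_ID = "<sklogo.gif@iseriesnetwork.com>"
fake_gif = b"GIF89a" + bytes(10)  # placeholder bytes standing in for the image

msg = EmailMessage()
msg["Subject"] = "Testing HTML with a picture"

# The src attribute uses "cid:" plus the Content-ID without angle brackets.
html = '<html><body><img alt="" src="cid:%s"></body></html>\n' % CONTENT_ID[1:-1]
msg.set_content(html, subtype="html")

# add_related() converts the message to multipart/related and appends the
# image part, base64 encoded, with the matching Content-ID keyword.
msg.add_related(
    fake_gif,
    maintype="image",
    subtype="gif",
    cid=CONTENT_ID,
    disposition="inline",
    filename="sklogo.gif",
)

html_part, image_part = msg.iter_parts()
```

The mail reader resolves the src value by finding the part whose Content-ID matches, which is why the two strings have to agree exactly.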
You also might have noticed that the content of the second message part is not text. It's a sequence of seemingly random letters and numbers. In fact, this is what's known as base64 encoded data (I discuss base64 a bit more momentarily). There's a keyword at the top of the second message part called Content-Transfer-Encoding, and it tells the e-mail software that the data is base64 encoded.
As in the preceding sample message, any data that isn't plain text must be encoded with a transfer encoding such as base64.
Limiting the message to plain text causes a problem! Not everything is text. In the preceding example, I wanted to attach a picture, and pictures aren't made up of US-ASCII text characters. Other document types, including PDF, Excel, Word, zip, and MP3 (to name a few!), aren't text documents but sometimes need to be sent through e-mail. On the iSeries, files containing packed or binary numbers, Save Files, and files containing Binary Large Objects (BLOBs) are examples of non-text files.
An even bigger problem is that of globalization! US-ASCII characters might be adequate for the United States but can't be expected to work in other countries where they have different character sets.
The MIME standard provides two methods of "encoding" data so that it can be converted to legal US-ASCII text, then sent over the Internet and converted back to its original form. These encoding methods are called "quoted-printable" and "base64." For complete details about how these encoding schemes work, please see RFC 2045.
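To give a feel for what the two encodings produce, here is a short Python sketch using the standard base64 and quopri modules (Python is used only for illustration; it isn't the RPG service program mentioned in this article). The sample inputs are made up for the demonstration.

```python
import base64
import quopri

# base64: every 3 bytes of input become 4 US-ASCII characters.
gif_signature = b"GIF89a"  # the first six bytes of a GIF89a file
b64_text = base64.b64encode(gif_signature).decode("ascii")
b64_back = base64.b64decode(b64_text)

# quoted-printable: US-ASCII bytes pass through unchanged; other bytes
# become "=" followed by two hex digits.
latin1_word = b"caf\xe9"  # "cafe" with an accented e (byte 0xE9)
qp_text = quopri.encodestring(latin1_word).decode("ascii")
qp_back = quopri.decodestring(qp_text.encode("ascii"))

# For message bodies, encodebytes() also breaks the output into short lines.
wrapped = base64.encodebytes(bytes(100)).decode("ascii")
longest_line = max(len(line) for line in wrapped.splitlines())
```

Here b64_text is "R0lGODlh", the same characters that begin the encoded picture in the multipart/related sample, and qp_text is "caf=E9". Quoted-printable suits data that is mostly text, while base64 suits binary data.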
I have developed a service program (in RPG) that you can use to encode data into base64 encoding, as well as decode the same data. It's available as open-source software from my Web site at the following link:
http://www.scottklement.com/base64
The following RPG code builds the plain-text sample message and writes it to a temporary file in the IFS:
      /copy IFSIO_H

     D CRLF            c                   x'0d25'
     D filename        s             50A   varying
     D fd              s             10I 0
     D header          s           2000A   varying
     D body            s          32767A   varying
     D fromName        s            100A   varying
     D fromAddr        s            300A   varying
     D toName          s            100A   varying
     D toAddr          s            300A   varying
     D subject         s             80A   varying

      /free

         fromName = 'Scott Klement';
         fromAddr = 'sklement@iseriesnetwork.com';
         toName   = 'Faithful Reader';
         toAddr   = 'freader@example.com';
         subject  = 'Testing a text-only message';

         // Get a unique temporary file name in the IFS
         filename = %str(tmpnam(*omit));
         unlink(filename);

         // Create the file, tagged with CCSID 819, then close it
         fd = open( filename
                  : O_CREAT+O_EXCL+O_WRONLY+O_CCSID
                  : M_RDWR
                  : 819 );
         if (fd = -1);
            ReportError();
         endif;
         callp close(fd);

         // Reopen it in text mode so data is translated as it's written
         fd = open( filename : O_WRONLY + O_TEXTDATA );
         if (fd = -1);
            ReportError();
         endif;

         // Header keywords; the final CRLF adds the blank line
         // that ends the header
         header =
           'From: ' + fromName + ' <' + fromAddr + '>' + CRLF
          +'To: ' + toName + ' <' + toAddr + '>' + CRLF
          +'Date: ' + maildate() + CRLF
          +'Subject: ' + subject + CRLF
          + CRLF;

         // Message text (apostrophes inside a literal must be doubled)
         body =
           'Dear Faithful,' + CRLF
          + CRLF
          +'This demonstrates a simple e-mail message. It''s text-only '
          +'(no fonts, colors,' + CRLF
          +'pictures, or attachments) which is all that was allowed '
          +'in the original e-mail' + CRLF
          +'standard.' + CRLF
          + CRLF
          +'You''ll also notice that none of the lines of this e-mail '
          +'exceeds 78 characters,' + CRLF
          +'which is what is strongly recommended (but not required) '
          +'by RFC 2822.' + CRLF
          + CRLF
          +'Have a nice day,' + CRLF
          +'-- Scott' + CRLF;

         // Skip the 2-byte length prefix of each VARYING field
         callp write(fd: %addr(header)+2: %len(header));
         callp write(fd: %addr(body)+2: %len(body));
         callp close(fd);
If you've never used the IFS APIs before, check out the links that I provide at the end of this article; they point to articles that describe how to use them. For now, what you need to understand is that the preceding code formats an e-mail message into some variables in my RPG program. One long statement assigns the header keywords to the header variable, and another assigns the message text to the body variable. These variables are then written to a file in the IFS. Because the file is created with CCSID 819 and then reopened with the O_TEXTDATA flag, the IFS APIs automatically convert the data to ASCII as it's written.
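If RPG isn't your first language, the same steps can be sketched in a few lines of Python. This is only an illustration of the message layout, not a translation of the RPG program: build_message is a hypothetical name, and the standard format_datetime function stands in for the maildate() routine, which isn't shown in this article.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

CRLF = "\r\n"


def build_message(from_hdr, to_hdr, subject, when, body_lines):
    """Return the bytes of a plain-text message: header, blank line, body."""
    header = CRLF.join([
        "From: " + from_hdr,
        "To: " + to_hdr,
        "Date: " + format_datetime(when),  # e.g. Thu, 27 Jul 2006 10:21:06 -0500
        "Subject: " + subject,
    ]) + CRLF + CRLF                        # the extra CRLF is the blank line
    body = CRLF.join(body_lines) + CRLF
    # CCSID 819 is ISO 8859-1, so encode to that character set.
    return (header + body).encode("iso-8859-1")


sent_at = datetime(2006, 7, 27, 10, 21, 6, tzinfo=timezone(timedelta(hours=-5)))

raw = build_message(
    "Scott Klement <sklement@iseriesnetwork.com>",
    "Faithful Reader <freader@example.com>",
    "Testing a text-only message",
    sent_at,
    ["Dear Faithful,", "", "Have a nice day,", "-- Scott"],
)

longest = max(len(line) for line in raw.split(b"\r\n"))
```

Checking the longest line is a cheap way to confirm that a generated message stays within the recommended 78 characters.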
Now that the message has been written to a file, it can be sent. IBM provides an API called Send MIME Mail (QtmmSendMail), and it sends an e-mail message using the iSeries SMTP server. The following code uses that API to send the temporary file that I just created:
      /copy sendmail_h

     D recip           ds                  likeds(ADDTO0100)
     D                                     dim(1)
     D NullError       ds
     D   BytesProv                   10I 0 inz(0)
     D   BytesAvail                  10I 0 inz(0)
         .
         .
         recip(1).NextOffset = %size(ADDTO0100);
         recip(1).AddrFormat = 'ADDR0100';
         recip(1).DistType   = ADDR_NORMAL;
         recip(1).Reserved   = 0;
         recip(1).SmtpAddr   = toAddr;
         recip(1).AddrLen    = %len(toAddr);

         QtmmSendMail( FileName
                     : %len(FileName)
                     : fromAddr
                     : %len(fromAddr)
                     : recip
                     : %elem(recip)
                     : NullError );
Naturally, to try this code, you must have the SMTP server configured and running on your iSeries. Otherwise, it won't get very far! I described how to configure e-mail on the iSeries and how to send simple messages with the SNDDST command in the following thread of the iSeries Network forums:
http://www.iseriesnetwork.com/isnetforums/showthread.php?t=43930
You can download the code samples for this article from the following link:
http://www.pentontech.com/IBMContent/Documents/article/52911_93_Email1.zip
Later, I wrote a new and improved tutorial about RPG and the IFS for iSeries NEWS magazine. If you're a ProVIP member of the iSeries Network, you can read those articles at the following links:
http://www.iseriesnetwork.com/article.cfm?id=19312
http://www.iseriesnetwork.com/article.cfm?id=19473
http://www.iseriesnetwork.com/article.cfm?id=19626
http://www.iseriesnetwork.com/article.cfm?id=19751
http://www.iseriesnetwork.com/article.cfm?id=20050
http://www.iseriesnetwork.com/article.cfm?id=20141
http://www.iseriesnetwork.com/article.cfm?id=20235