MIME Content-Transfer-Encoding Header and Encoding Methods
(Page 3 of 3)
In contrast, the base64 encoding is more often used for raw binary data that is not in human-readable form anyway, such as graphical images, audio, video and application files. The idea behind it is simple: each 8-bit byte of the data to be sent can have any value from 0 to 255, which the 7-bit ASCII limits of RFC 822 do not allow. So, why not rearrange the bits so that the data fits within those limits?
This is done by processing the data to be sent three bytes at a time. There are 24 bits in each three-byte block, which are carved into 4 sets of 6 bits each. Each 6-bit group has a value from 0 to 63, and is represented by a single ASCII character as presented in Table 249.
For example, suppose the first three bytes of the data to be sent were the decimal values 212, 39 and 247. These cannot all be expressed in 7-bit ASCII. In binary form they are:
11010100 00100111 11110111
We can divide these into four 6-bit groups:
110101 - 00 0010 - 0111 11 - 110111
This yields the four values 53, 2, 31 and 55. Thus, the values 212, 39 and 247 would be encoded as the four ASCII characters 1Cf3. The conceptual steps of this process are shown in Figure 303.
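The arithmetic of this worked example can be sketched in Python. This is a minimal illustration rather than a complete encoder: the helper name encode_block and the constant ALPHABET are names chosen here for illustration, and the alphabet string is assumed to be the standard base64 character set (A-Z, a-z, 0-9, + and /) in value order from 0 to 63.

```python
import base64

# Standard base64 alphabet: the value 0-63 is the index of its character.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def encode_block(b0: int, b1: int, b2: int) -> str:
    """Encode exactly three bytes as four base64 characters (illustrative helper)."""
    # Join the three bytes into one 24-bit integer.
    block = (b0 << 16) | (b1 << 8) | b2
    # Carve out four 6-bit groups, most significant group first.
    groups = [(block >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[g] for g in groups)

print(encode_block(212, 39, 247))  # prints 1Cf3
# Cross-check against the standard library encoder.
print(base64.b64encode(bytes([212, 39, 247])).decode("ascii"))  # prints 1Cf3
```

The shifts of 18, 12, 6 and 0 bits pick out the same four groups shown in the binary layout above, with values 53, 2, 31 and 55.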
This 3-to-4 encoding is done for all the data. The converted ASCII characters are then placed into the body of the entity instead of the raw binary data, no more than 76 characters to a line. I showed how this is done in the second body part in the example of Table 248 (except I didn't put 76 characters per line, to keep the line lengths short). One final character is involved in this scheme, the equal sign (=), which is used as a padding character when the data does not divide evenly into three-byte blocks and the last block has only one or two bytes.
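To make the padding and line-length rules concrete, here is a short Python sketch using the standard library's base64 module. The helper name encode_body is hypothetical, and joining lines with CRLF is an assumption made here because that is the usual line ending in mail messages.

```python
import base64

def encode_body(data: bytes, width: int = 76) -> str:
    """Base64-encode data and split it into lines of at most `width` characters."""
    encoded = base64.b64encode(data).decode("ascii")
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return "\r\n".join(lines)

# Padding depends on how many bytes are left over in the final block.
print(base64.b64encode(bytes([212, 39, 247])).decode("ascii"))  # 1Cf3 (full block)
print(base64.b64encode(bytes([212, 39])).decode("ascii"))       # 1Cc= (two bytes)
print(base64.b64encode(bytes([212])).decode("ascii"))           # 1A== (one byte)

# 100 bytes become 136 characters: one line of 76 and one of 60.
body = encode_body(bytes(range(100)))
print(body)
```

A final block of two bytes yields three characters plus one equal sign, and a final block of one byte yields two characters plus two equal signs, so the encoded length is always a multiple of four.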
Since base64 characters are regular ASCII, they appear to SMTP like a regular text message. Of course the data looks like gibberish to us, but that's not a problem since it will be converted back to its regular form and displayed to the recipient as an image, movie, audio or whatever.
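Both claims can be checked with a few lines of Python. This is only a sketch: the sample bytes are arbitrary values chosen here, several of them above 127 so that they could not be sent as 7-bit ASCII directly.

```python
import base64

raw = bytes([212, 39, 247, 0, 255, 128])  # arbitrary binary data
encoded = base64.b64encode(raw)

# Every encoded byte fits in 7 bits, so it looks like ordinary text.
print(all(b < 128 for b in encoded))
# Decoding restores the original bytes exactly.
print(base64.b64decode(encoded) == raw)
```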
The main drawback of the base64 method? It produces about 33% more data than sending the binary data directly, using something like FTP. The reason is that three 8-bit bytes of binary data are sent as four ASCII characters, but of course, each ASCII character is represented using 8 bits itself. So there is 1/3 extra overhead when using base64, plus a little more for the line breaks. In most cases this is not a big deal, but it can be significant when downloading very large e-mail messages over a slow Internet connection.
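The overhead is easy to estimate. The following Python sketch assumes the encoded size is counted in characters, with padding included and line breaks left out; the function name encoded_length is chosen here for illustration.

```python
import base64

def encoded_length(n: int) -> int:
    """Base64 characters needed for n input bytes (padding included, no line breaks)."""
    # Each block of up to three bytes becomes four characters.
    return 4 * ((n + 2) // 3)

for n in (3, 1000, 1_000_000):
    size = encoded_length(n)
    print(n, size, f"{size / n - 1:.1%} overhead")

# Cross-check against the standard library encoder.
print(len(base64.b64encode(bytes(1000))) == encoded_length(1000))
```

If each 76-character line also carries a two-byte line ending, the total grows by a further factor of 78/76, which brings the overall increase to roughly 37%.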
Note that RFC 2045 also allows two other kinds of values for the encoding: ietf-token and x-token. These are placeholders rather than actual encodings; they are included to allow new encoding types to be defined in the future.
The TCP/IP Guide (http://www.TCPIPGuide.com)
Version 3.0 - Version Date: September 20, 2005
© Copyright 2001-2005 Charles M. Kozierok. All Rights Reserved.