FTP Data Representation: Data Types, Data Structures and Format Control
(Page 1 of 3)
The most general way of designing the File Transfer Protocol would have been to make it treat all files as black boxes. The file would be represented as just a set of bytes. FTP would pay no attention to what the file contained, and would simply move the file, one byte at a time, from one place to another. In this, it would seem to be very similar to the copy command that is implemented on most file systems, which likewise creates a copy without looking into the file to see what it contains.
So what would be the problem with that, you may wonder? Well, for some types of files, this is exactly what we want, but for others, it introduces a problem. The reason is that certain types of files use different representations on different systems. If you copy a file from one place to another on the same computer using a copy command, there is no problem: the same representation for files is used everywhere within that computer. But when you copy it to a computer that uses a different representation, you may encounter difficulties.
The most common example of this is a type of file that may surprise you: simple text files. All ASCII text files use the ASCII character set, but they differ in the control characters used to mark the end of a line of text. On UNIX, a line feed (LF) character is used; on older Apple computers (classic Mac OS), a carriage return (CR); and on Windows machines, both (CR+LF).
If FTP treated every file as a black box, a text file moved from one type of system to another would arrive exactly as it was sent. Moving a text file from a UNIX system to a PC as just a set of bytes would mean that programs on the PC would not properly recognize its end-of-line markers. Avoiding this predicament requires that FTP move past the idea that all files are just bytes and incorporate some intelligence to handle different types of files. The FTP standard recognizes this by allowing the specification of certain details about the file's internal representation prior to transfer.
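The end-of-line problem described above can be sketched in a few lines of Python. This is only an illustration, not how any particular FTP client is written: the helper names `to_network_ascii` and `from_network_ascii` are hypothetical, and the sketch assumes that the sender converts text to CR+LF line endings for the trip across the network (as the FTP standard's ASCII type does) and that each input uses a single line-end convention throughout.

```python
CRLF = b"\r\n"  # line ending assumed for text while it is on the wire


def to_network_ascii(data: bytes, local_eol: bytes) -> bytes:
    """Sender side: replace the local line ending with CR+LF.

    Hypothetical helper; assumes `data` uses `local_eol` consistently.
    """
    return CRLF.join(data.split(local_eol))


def from_network_ascii(data: bytes, local_eol: bytes) -> bytes:
    """Receiver side: replace CR+LF with the receiver's own line ending."""
    return local_eol.join(data.split(CRLF))


# A text file as stored on a UNIX system (LF line endings).
unix_text = b"first line\nsecond line\n"

# A black-box copy would deliver unix_text unchanged, LF and all.
# A type-aware transfer converts on the way out and again on the way in.
on_the_wire = to_network_ascii(unix_text, b"\n")
windows_copy = from_network_ascii(on_the_wire, b"\r\n")
classic_mac_copy = from_network_ascii(on_the_wire, b"\r")
```

Each receiver ends up with the same text in its own convention, which is exactly what a byte-for-byte copy cannot provide.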
The first piece of information that can be given about a file is its data type, which dictates the overall representation of the file. There are four different data types specified in the FTP standard: ASCII, EBCDIC, Image and Local.
In practice, the two data types most often used are ASCII and Image. The ASCII type is used for text files, and allows them to be moved between systems with line-end codes converted automatically. The Image type is used for generic binary files, such as graphical images, ZIP files and other data that is represented in a universal manner. Because it carries generic binary data unchanged, it is also often called the binary type.
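As a rough illustration of how a client tells the server which data type to use, the sketch below builds the text of the FTP TYPE command. The one-letter codes (A, E, I and L, with a byte size after L) come from the FTP standard, RFC 959; the function name `type_command`, the lookup table and the default byte size of 8 are assumptions made for this example.

```python
# One-letter codes used by the FTP TYPE command (RFC 959).
TYPE_CODES = {
    "ascii": "A",   # text, line ends converted in transit
    "ebcdic": "E",  # text in the EBCDIC character set
    "image": "I",   # generic binary data, sent unchanged
    "local": "L",   # data in a caller-specified logical byte size
}


def type_command(data_type: str, byte_size: int = 8) -> str:
    """Return the TYPE command line for a data type name.

    Hypothetical helper: `byte_size` is only used for the local type.
    """
    code = TYPE_CODES[data_type.lower()]
    if code == "L":
        return f"TYPE L {byte_size}"
    return f"TYPE {code}"


text_cmd = type_command("ascii")    # for text files
binary_cmd = type_command("image")  # for ZIP files, pictures and so on
```

Client libraries usually hide this step. In Python's standard `ftplib`, for example, `FTP.retrlines` requests an ASCII transfer and `FTP.retrbinary` requests an Image transfer before the file data begins to flow.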
The TCP/IP Guide (http://www.TCPIPGuide.com)
Version 3.0 - Version Date: September 20, 2005
© Copyright 2001-2005 Charles M. Kozierok. All Rights Reserved.