What is the most widely used TCP/IP and Internet service? If you answered FTP, you're right. (If you didn't choose FTP, this answer may come as a bit of a surprise, but FTP remains the most widely used service, although the World Wide Web is quickly catching up.) FTP's popularity is easy to understand. The FTP software is supplied with every version of UNIX and Linux; it's easy to install, configure, and use; and it gives users access to a wealth of information with very little effort.
Earlier chapters of this book have mentioned FTP, and most user-oriented books deal with using FTP in some detail. If all you want to use FTP for is connecting to another machine and transferring files, then you don't have to do much more than enable the FTP service on your system. Much more interesting to many is turning your Linux machine into an FTP site, where others can call in and obtain files you make available. That's the primary focus of this chapter: setting up an FTP site on your Linux machine. The chapter begins, though, with a quick look at using FTP and the way FTP runs on TCP. This information should help you understand how FTP works and what it does with TCP/IP.
The File Transfer Protocol (FTP) is one protocol in the TCP/IP family used to transfer files between machines running TCP/IP (FTP-like programs are also available for some other protocols). The File Transfer Protocol enables you to transfer files back and forth and manage directories. FTP is not designed to give you access to another machine to execute programs, but it is the best utility for file manipulation. To use FTP, both ends of a connection must be running a program that provides FTP services. The end that starts the connection (the client) calls the other end (the server) and establishes the FTP protocol through a set of handshaking instructions.
Usually, when you connect to a remote system via FTP, you must log in. In order to log in, you must be a valid user with a username and password for that remote machine. Because it is impossible to provide logins for everyone who wants to access a machine that allows anyone to gain access, many systems use anonymous FTP instead. Anonymous FTP allows anyone to log in to the system with the login name of ftp, guest, or anonymous and either no password or a login name for their local system.
Using FTP to connect to a remote site is easy. You have access to the remote machine either through the Internet (directly or through a service provider) or through a wide or local area network if the remote machine is directly reachable. To use FTP, you start the FTP client software and provide the name of the remote system to which you want to connect. For example, assuming you can get to the remote machine through a LAN or the Internet (which knows about the remote machine thanks to DNS), you issue the following command:
ftp chatton.com
This command instructs your FTP software to try to connect to the remote machine chatton.com and establish an FTP session.
When the connection is completed (and assuming that the remote system allows FTP logins), the remote prompts for a userID. If anonymous FTP is supported on the system, a message usually tells you exactly that. The following login is shown for the Linux FTP archive site sunsite.unc.edu:
ftp sunsite.unc.edu
331 Guest login ok, send your complete e-mail address as password.
Enter username (default: anonymous): anonymous
Enter password [tparker@tpci.com]:
|FTP| Open
230- WELCOME to UNC and SUN's anonymous ftp server
230- University of North Carolina
230- Office FOR Information Technology
230- SunSITE.unc.edu
230 Guest login ok, access restrictions apply.
FTP>
After the login process is completed, you see the prompt FTP> indicating that the remote system is ready to accept commands.
When you log on to some systems, you may see a short message that may contain instructions for downloading files, any restrictions that are placed on you as an anonymous FTP user, or information about the location of useful files. For example, you may see messages like the following (taken from the Linux FTP site):
To get a binary file, type: BINARY and then: GET "File.Name" newfilename
To get a text file, type: ASCII and then: GET "File.Name" newfilename
Names MUST match upper, lower case exactly. Use the "quotes" as shown.
To get a directory, type: DIR. To change directory, type: CD "Dir.Name"
To read a short text file, type: GET "File.Name" TT
For more, type HELP or see FAQ in gopher.
To quit, type EXIT or Control-Z.
230- If you email to info@sunsite.unc.edu you will be sent help information
230- about how to use the different services sunsite provides.
230- We use the Wuarchive experimental ftpd. if you "get" <directory>.tar.Z
230- or <file>.Z it will compress and/or tar it on the fly. Using ".gz" instead
230- of ".Z" will use the GNU zip (/pub/gnu/gzip*) instead, a superior
230- compression method.
Once you are on the remote system, you can use familiar Linux commands to display file contents and move around directories. To display the contents of a directory, for example, use the command ls (some systems support the DOS equivalent DIR). To change to a subdirectory, use the cd command. To return to the parent directory (the one above the current directory), use the command cd .. (cd followed by two periods). As you can see, these commands are the same ones you would use on your local machine, except that you are now navigating on the remote system. To change directories on your local machine, you can use the lcd command.
FTP has no keyboard shortcuts (such as pressing the Tab key to fill in names that match). You have to type in the name of files or directories in their entirety (and do so correctly). If you misspell a file or directory name, you will get error messages and have to try again. If you are performing the FTP session through an X window, you can cut and paste lines from earlier in your session. Users of gpm can cut and paste from character-based screens.
Transferring files is the whole point of FTP, so you need to know how to retrieve a file from the remote system, as well as how to put a new file there. When you have moved through the remote system's directories and found a file you want to move back to your local system, use the get command. Place the filename after the command, for example:
get "soundcard_driver"
This command transfers the file soundcard_driver from the remote machine to the current directory on your local machine. When you issue a get command, the remote system transfers data to your local machine and displays a status message when it is completed. There is no indication of progress when a large file is being transferred, so be patient. Many versions of FTP support a command called hash that displays a pound sign after every 1024 bytes have been transferred. This command gives you a visual indication of the progress of the transfer.
FTP> get "file1.txt"
200 PORT command successful.
150 BINARY data connection for FILE1.TXT (27534 bytes)
226 BINARY Transfer complete.
27534 bytes received in 2.35 seconds (12 Kbytes/s).
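If you script transfers instead of typing them, you can get the same kind of progress display yourself. The following is a minimal sketch in Python, not the way any particular FTP client implements hash. The class name HashProgress and its parameters are made up for this illustration. It writes each received block to a file and emits one pound sign for every 1024 bytes received so far.

```python
import io


class HashProgress:
    """Save received blocks and emit one '#' per block_size bytes received.

    Hypothetical helper for illustration; an instance can be passed as the
    callback argument of ftplib.FTP.retrbinary().
    """

    def __init__(self, out_file, marks, block_size=1024):
        self.out_file = out_file      # binary file object receiving the data
        self.marks = marks            # text stream receiving the '#' marks
        self.block_size = block_size
        self.received = 0             # total bytes seen so far
        self.printed = 0              # number of '#' marks already written

    def __call__(self, chunk):
        self.out_file.write(chunk)
        self.received += len(chunk)
        due = self.received // self.block_size
        # Write only the marks that have become due since the last block.
        self.marks.write("#" * (due - self.printed))
        self.printed = due
```

With the standard ftplib module, an instance could be used as HashProgress(local_file, sys.stdout) in a call such as ftp.retrbinary("RETR file1.txt", progress), assuming ftp is an already logged-in ftplib.FTP object.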
If you want to transfer a file the other way (from your machine to the remote, assuming you are allowed to write to the remote machine's filesystem), use the put command in the same way. The following command transfers the file comments from your current directory on the local machine (you can specify full pathnames) to the current directory on the remote machine (unless you change the path).
put "comments"
The commands get (download) and put (upload) are always relative to your home machine. You are telling your system to get a file from the remote and put it on your local machine, or to put a file from your local machine onto the remote machine. (This process is the exact opposite of Telnet, which has everything relative to the remote machine. It is important to remember which command moves in which direction, or you could overwrite files accidentally.)
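The same local-machine point of view applies if you drive FTP from a program. The sketch below uses Python's standard ftplib module; the function names download and upload are invented for this example, and the ftp argument is assumed to be an ftplib.FTP object that is already connected and logged in. RETR and STOR are the protocol commands that the client's get and put send to the server.

```python
from ftplib import FTP


def download(ftp: FTP, remote_name: str, local_path: str) -> None:
    """Equivalent of 'get': copy a remote file to the local machine."""
    with open(local_path, "wb") as handle:
        # retrbinary calls handle.write once for each block received.
        ftp.retrbinary(f"RETR {remote_name}", handle.write)


def upload(ftp: FTP, local_path: str, remote_name: str) -> None:
    """Equivalent of 'put': copy a local file to the remote machine."""
    with open(local_path, "rb") as handle:
        # storbinary reads the local file and sends it to the server.
        ftp.storbinary(f"STOR {remote_name}", handle)
```

A session might begin with ftp = FTP("chatton.com") followed by ftp.login(), which logs in anonymously when no user name is given, and then call download(ftp, "soundcard_driver", "soundcard_driver").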
The quotation marks around the filenames in the preceding examples are optional for most versions of FTP, but they do prevent shell expansion of characters, so they can be recommended. For most files, the quotation marks are not needed, but using them is a good habit to get into.
Some FTP versions provide a wildcard capability using the commands mget and mput. Both the FTP get and put commands usually transfer only one file at a time, which must be specified completely (no wildcards). The mget and mput commands enable you to use wildcards. For example, to transfer all the files with a .doc extension, you could issue the following command:
mget *.doc
You have to try the mget and mput commands to see if they work on your FTP version. (Some FTP get and put commands allow wildcards, too, so you can try wildcards in a command line to see if they work.)
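To see what a wildcard such as *.doc selects, it can help to model the matching step on its own. This is only a sketch of the idea, not how any given FTP client implements mget. The function name select_files is made up, and the list of names is assumed to come from a directory listing (for example, the nlst method of Python's ftplib.FTP).

```python
from fnmatch import fnmatchcase


def select_files(names, pattern):
    """Return the names that match a shell-style wildcard.

    Matching is case-sensitive, as file names are on Linux.
    """
    return [name for name in names if fnmatchcase(name, pattern)]
```

Each selected name could then be fetched one at a time, which is in effect what a multiple-file transfer does.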
FTP allows file transfers in several formats, which are usually system-dependent. The majority of systems (including Linux systems) have only two modes: ASCII and binary. Some mainframe installations add support for EBCDIC, while many sites have a local type that is designed for fast transfers between local network machines (the local type may use 32- or 64-bit words).
The difference between the binary and ASCII modes is simple. Text transfers use ASCII characters separated by carriage return and newline characters. Binary mode allows transfer of characters with no conversion or formatting. Binary mode is faster than text and also allows for the transfer of all 8-bit byte values (necessary for non-text files). FTP cannot transfer file permissions, as these are not specified as part of the protocol.
Linux's FTP provides two modes of file transfer: ASCII and binary. Some systems automatically switch between the two when they recognize that a file is in binary format, but you shouldn't count on the switching unless you've tested it before and know it works. To be certain, it is a good idea to set the mode manually. By default, most FTP versions start up in ASCII mode, although a few start in binary.
To set FTP in binary transfer mode (for any executable file or file with special characters embedded for spreadsheets, word processors, graphics, and so on), type the following command:
binary
You can toggle back to ASCII mode with the command ascii. Because you will most likely be checking remote sites for new binaries or libraries of source code, it is a good idea to use binary mode for most transfers. If you transfer a binary file in ASCII mode, the line-ending conversion will probably corrupt it, and it will not be executable on your system.
ASCII mode includes only the valid ASCII characters and not the Ctrl-key sequences used within binaries. Transferring an ASCII file in binary mode does not affect the contents except in very rare instances. When transferring files between two Linux (or any UNIX) systems, using binary mode will handle all file types properly, but transfers between a Linux and non-UNIX machine can cause problems with some types of files. ASCII mode is only suitable for transferring straight text files.
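The damage ASCII mode can do to a binary file is easier to see with a small model. The sketch below is a simplification that assumes the only thing ASCII mode does between two UNIX machines is translate line endings (newline to carriage return plus newline on the way out, and back again on arrival). Real clients and servers differ in detail, and the function names are invented for this illustration.

```python
def ascii_mode_to_wire(data: bytes) -> bytes:
    """Sender side: put every line ending into CR LF form."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def ascii_mode_from_wire(data: bytes) -> bytes:
    """Receiver side on a UNIX machine: turn CR LF back into a bare newline."""
    return data.replace(b"\r\n", b"\n")


# A plain text file survives the round trip unchanged.
text = b"line one\nline two\n"
text_after = ascii_mode_from_wire(ascii_mode_to_wire(text))

# Binary data that happens to contain a CR LF pair loses a byte.
blob = b"\x7fELF\r\n\x00\n"
blob_after = ascii_mode_from_wire(ascii_mode_to_wire(blob))
```

Under these assumptions text_after equals text, while blob_after is one byte shorter than blob, which is enough to ruin an executable or an archive.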
To quit FTP, type the command quit or exit. Both will close your session on the remote machine, and then terminate FTP on your local machine. Users have a number of commands available within most versions of FTP, the most frequently used of which are the following:
ascii | Switches to ASCII transfer mode |
binary | Switches to binary transfer mode |
cd | Changes directory on the server |
close | Terminates the connection |
del | Deletes a file on the server |
dir | Displays the server directory |
get | Fetches a file from the server |
hash | Displays a pound character for each block transmitted |
help | Displays help |
lcd | Changes directory on the client |
mget | Fetches several files from the server |
mput | Sends several files to the server |
open | Connects to a server |
put | Sends a file to the server |
pwd | Displays the current server directory |
quote | Supplies an FTP command directly |
quit | Terminates the FTP session |
For most versions, FTP commands are case-sensitive. If you type commands in uppercase, FTP will display error messages. Some versions perform a translation for you, so it doesn't matter which case you use. Because Linux uses lowercase as its primary character set for everything else, you should probably use lowercase with all versions of FTP, too.
The File Transfer Protocol uses two TCP channels: TCP port 20 is used for data, and port 21 is for commands. Both these channels must be enabled on your Linux system for FTP to function. The use of two channels makes FTP different from most other file transfer programs. By using two channels, FTP allows simultaneous transfer of commands and data. FTP works in the foreground and does not use spoolers or queues.
FTP uses a server daemon that runs continuously and a separate program that is executed on the client. On Linux systems, the server daemon is called ftpd. The client program is ftp.
During the establishment of a connection between a client and server, and whenever a user issues a command to FTP, the two machines transfer a series of commands. These commands are exclusive to FTP, and are known as the internal protocol. FTP's internal protocol commands are three- or four-character ASCII sequences (such as PWD or USER) terminated by a carriage return and newline, some of which require parameters. One primary advantage of using ASCII characters for commands is that users can observe the command flow and understand it easily, which helps in a debugging process. Also, a knowledgeable user can use the ASCII commands directly to communicate with the FTP server component without invoking the client portion (in other words, communicating with ftpd without using ftp on a local machine). This procedure is seldom done, however, except when debugging (or showing off).
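Because the internal protocol is plain ASCII, both directions are easy to model. The following sketch builds a command line the way it travels on the command channel and splits a server reply into its three-digit code and text. The function names are made up for this example, and the reply handling is deliberately simple: it assumes that continuation lines carry the code followed by a hyphen and that the last line carries the code followed by a space, as in the login banner shown earlier in this chapter.

```python
def format_command(verb, argument=None):
    """Return one command line as bytes, terminated by CR LF."""
    line = verb.upper() if argument is None else f"{verb.upper()} {argument}"
    return (line + "\r\n").encode("ascii")


def parse_reply(lines):
    """Return (code, text) for the lines that make up one server reply."""
    code = lines[0][:3]
    text = []
    for line in lines:
        # Drop the 'NNN-' or 'NNN ' prefix where it is present.
        if line.startswith(code) and line[3:4] in ("-", " "):
            line = line[4:]
        text.append(line.strip())
    return int(code), "\n".join(text)
```

For example, format_command("user", "tparker") produces the bytes for USER tparker followed by a carriage return and newline.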
After logging in to a remote machine using FTP, you are not actually on the remote machine. You are still logically on the client, so all instructions for file transfers and directory movement must be with respect to your local machine and not the remote one. The process followed by FTP when a connection is established is as follows:
A debugging option is available from the FTP command line by adding -d to the command. This option displays the command channel instructions. Instructions from the client are shown with an arrow as the first character; instructions from the server have three digits in front of them. A PORT in the command line indicates the address of the data channel on which the client is waiting for the server's reply. If no PORT is specified, channel 20 (the default value) is used. Unfortunately, the progress of data transfers cannot be followed in the debugging mode. The following is a sample session with the debug option enabled:
$ ftp -d tpci_hpws4
Connected to tpci_hpws4.
220 tpci_hpws4 FTP server (Version 1.7.109.2 Tue Jul 28 23:32:34 GMT 1992) ready.
Name (tpci_hpws4:tparker):
---> USER tparker
331 Password required for tparker.
Password:
---> PASS qwerty5
230 User tparker logged in.
---> SYST
215 UNIX Type: L8
Remote system type is UNIX.
---> Type I
200 Type set to I.
Using binary mode to transfer files.
ftp> ls
---> PORT 47,80,10,28,4,175
200 PORT command successful.
---> TYPE A
200 Type set to A.
---> LIST
150 Opening ASCII mode data connection for /bin/ls.
total 4
-rw-r----- 1 tparker tpci 2803 Apr 29 10:46 file1
-rw-rw-r-- 1 tparker tpci 1286 Apr 14 10:46 file5_draft
-rwxr----- 2 tparker tpci 15635 Mar 14 23:23 test_comp_1
-rw-r----- 1 tparker tpci 52 Apr 22 12:19 xyzzy
Transfer complete.
---> TYPE I
200 Type set to I.
ftp> <Ctrl-d>
$
You may have noticed in the preceding code how the mode changed from binary to ASCII to send the directory listing, and then back to binary (the system default value).
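The six numbers in the PORT line of the debug session are worth decoding. In the FTP protocol, the first four are the bytes of the client's IP address, and the last two are the high and low bytes of the TCP port on which the client is listening for the data connection. A short sketch of the arithmetic follows; the function name decode_port is made up for this example.

```python
def decode_port(argument):
    """Turn 'h1,h2,h3,h4,p1,p2' from a PORT command into (host, port)."""
    parts = [int(piece) for piece in argument.split(",")]
    if len(parts) != 6 or any(not 0 <= value <= 255 for value in parts):
        raise ValueError("PORT argument must be six numbers from 0 to 255")
    host = ".".join(str(value) for value in parts[:4])
    port = parts[4] * 256 + parts[5]   # high byte times 256, plus low byte
    return host, port
```

For the session above, decode_port("47,80,10,28,4,175") gives the address 47.80.10.28 and port 1199 (4 times 256, plus 175).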
Whether you decide to provide an anonymous FTP site or a user-login FTP system, you need to perform some basic configuration steps to get the FTP daemon active and get the directory system and file permissions properly set to prevent users from destroying files or accessing files they shouldn't. The process can start with choosing an FTP site name. You don't really need a site name, although it can be easier for others to access your machine (especially anonymously) if you have one. The FTP site name is in the following format:
ftp.domain_name.domain_type
In this syntax, domain_name is the domain name (or an alias) of the FTP server's domain, and domain_type is the usual DNS extension. For example, you could have an FTP site name like the following example:
ftp.tpci.com
This name shows that this is the anonymous FTP access for anyone accessing the tpci.com domain. It is usually a bad idea to name your FTP site with a specific machine name, such as the following:
ftp.merlin.tpci.com
This name makes it difficult to move the FTP server to another machine in the future. Instead, use an alias to point to the actual machine on which the FTP server sits. This is not a problem if you are a single machine connected to the Internet through a service provider, for example, but is often necessary with a larger network. The alias is easy to set up if you use DNS. Set the alias in the DNS databases with a line like the following:
ftp.tpci.com. IN CNAME merlin.tpci.com.
This line points anyone accessing the machine ftp.tpci.com to the real machine merlin.tpci.com. If the machine merlin has to be taken out of its FTP server role for any reason, a change in the machine name on this line points the ftp.tpci.com access to the new server. (A change in the alias performed over DNS can take a while to become active, as the change must be propagated through all the DNS databases.) The period following the domain name is very important because it prevents expansion of the name to include the domain again (which would result in merlin.tpci.com.tpci.com).
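The effect of the trailing period can be shown with a few lines of code. This is a simplified model of how a name server treats names in its database files, not a description of any particular DNS implementation: a name that ends in a period is taken as complete, and any other name has the zone's domain appended. The function name qualify and the origin argument are assumptions made for this illustration.

```python
def qualify(name, origin):
    """Expand a database-file name against the zone's domain (origin).

    origin is expected to be fully qualified, for example 'tpci.com.'.
    """
    if name.endswith("."):
        return name                 # already complete, left alone
    return f"{name}.{origin}"       # relative name, domain is appended


with_dot = qualify("merlin.tpci.com.", "tpci.com.")
without_dot = qualify("merlin.tpci.com", "tpci.com.")
```

Here with_dot stays merlin.tpci.com. while without_dot becomes merlin.tpci.com.tpci.com., which is the mistake the period prevents.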
The FTP daemon, ftpd, must be started on the FTP server (some Linux versions use the daemon wu.ftpd as the server). The daemon is usually handled by inetd instead of the rc startup files, so ftpd is only active when someone needs it. This approach is best for all but the most heavily loaded FTP sites. When ftpd is started using inetd, the inetd daemon watches the TCP command port (channel 21) for an incoming connection request, and then spawns ftpd.
Make sure that inetd can start the ftpd daemon by checking the inetd configuration file (usually /etc/inetd.conf) for a line that looks like the following:
ftp stream tcp nowait root /usr/etc/ftpd ftpd -l
If the line doesn't exist, add it to the file. With most Linux systems, the line is already in the file although it may be commented out. Remove the comment symbol if this is the case. The FTP entry essentially specifies to inetd that FTP is to use TCP and that it should spawn ftpd every time a new connection is made to the FTP port. In the preceding example, the ftpd daemon is started with the -l option, which enables logging. You can ignore this option if you want.
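If you look after several machines, a small script can do this check for you. The sketch below takes the text of the inetd configuration file and reports whether an FTP entry is active, commented out, or missing. The function name find_ftp_entry is made up, and the test for an FTP entry (a service name of ftp with tcp as the protocol field) is an assumption based on the sample line above.

```python
def find_ftp_entry(conf_text):
    """Return (status, line), where status is 'active', 'commented', or 'missing'."""
    commented = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        is_comment = line.startswith("#")
        fields = line.lstrip("#").split()
        # A service entry has at least six fields; the third is the protocol.
        if len(fields) >= 6 and fields[0] == "ftp" and fields[2] == "tcp":
            if not is_comment:
                return "active", line
            commented = commented or line
    if commented:
        return "commented", commented
    return "missing", None
```

You could pass it the result of reading the configuration file, and remove the comment symbol by hand if the status comes back as commented.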
There are several ftpd daemon options that you can add to the /etc/inetd.conf line to control ftpd's behavior. The most commonly used options are as follows:
-d | This option adds debugging information to the syslog. |
-l | This option activates logging of sessions (only failed and successful logins, not debug information). If the -l option is specified twice, all commands are logged, too. If it is specified three times, the sizes of all get and put file transfers are logged as well. |
-t | This option sets the timeout period before ftpd terminates after a session is concluded (default is 15 minutes). The value is specified in seconds after the -t option. |
-T | This option sets the maximum timeout period (in seconds) that a client can request. The default is two hours. This enables a client to alter the normal default timeout for some reason. |
-u | This option sets the umask value for files uploaded to the local system. The default umask is 022. Clients can request a different umask value. |
If you are going to set up a user-based FTP service, where each person accessing your system has a valid login name and password, then you must create an account for each user in the /etc/passwd file. If you are not allowing anonymous FTP access, do not create a generic login that anyone can use.
To set up an anonymous FTP server, you must create a login for the anonymous user ID. This is done in the normal process of adding a user to the /etc/passwd file. The login name is whatever you want people to use when they access your system, such as "anonymous" or "ftp". You need to select a login directory for the anonymous users that can be protected from the rest of the filesystem. A typical /etc/passwd entry looks like the following:
ftp:*:400:51:Anonymous FTP access:/usr/ftp:/bin/false
This entry sets up the anonymous user with a login of ftp. The asterisk password prevents anyone gaining access to the account. The user ID number (400) is, of course, unique to the entire system. For better security, it is a good idea to create a separate group just for the anonymous FTP access (edit the /etc/group file to add a new group), then set the ftp user to that group. Only the anonymous FTP user should belong to that group, as it can be used to set file permissions to restrict access and make your system more secure. The login directory in the example above is /usr/ftp, although you could choose any directory as long as it belongs to the anonymous FTP user (for security reasons, again). The startup program shown in the preceding example is /bin/false, which helps protect your system from access to accounts and utilities that do not have a strong password protection.
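The seven colon-separated fields of the entry can be checked mechanically. The following sketch parses one /etc/passwd line and lists anything that departs from the example above. The names PasswdEntry, parse_passwd_line, and anonymous_ftp_warnings are invented for this illustration, and the three checks are only the points made in this section, not a complete security audit.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    name: str
    password: str
    uid: int
    gid: int
    comment: str
    home: str
    shell: str


def parse_passwd_line(line):
    """Split one /etc/passwd line into its seven fields."""
    name, password, uid, gid, comment, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, password, int(uid), int(gid), comment, home, shell)


def anonymous_ftp_warnings(entry):
    """Return a list of ways the entry differs from the recommended setup."""
    warnings = []
    if entry.password != "*":
        warnings.append("password field is not '*'")
    if entry.uid == 0:
        warnings.append("user ID 0 gives root privileges")
    if entry.shell != "/bin/false":
        warnings.append("startup program is not /bin/false")
    return warnings
```

The sample entry shown above produces an empty list of warnings.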
As you will see in the next section, "Setting Permissions," you can try to make the entire anonymous FTP subdirectory structure a filesystem unto itself, with no allowance for the anonymous user to get anywhere other than /usr/ftp (or whatever directory you use for anonymous access). For this reason, you need to create a mini-filesystem just for the anonymous FTP access, which holds the usual directory names and basic files anyone logging in needs. Part of the procedure is summarized in a checklist at the end of this chapter.
The process for setting up the directories your anonymous FTP login needs is simple, requiring you to create a number of directories and copy files into them. Here are the basic procedures:
The copies of the /etc/passwd and /etc/group files are placed in the ~ftp/etc directory so that FTP uses them instead of the actual files in /etc. Edit these copies to remove all passwords and replace them with an asterisk, preventing access to those accounts through anonymous FTP. Remove all entries in both copies (~ftp/etc/passwd and ~ftp/etc/group, never the real files in /etc) that are user names or groups (in other words, used by a valid user or group on your system), as well as most other entries except those used by the anonymous FTP login (usually just anonymous and bin).
You can use the ~ftp/pub directory structure to store the files you want to allow anonymous users to access. Copy them into this directory. You can create subdirectories as you need them for organizational purposes. It may be useful to create an upload directory somewhere in the ~ftp/pub directory structure that has write permission, so that users can upload files to you only into this upload area.
You can use the chroot command to help protect your system. The chroot command makes the root directory appear to be something other than / on a filesystem. For example, when chroot has been set for the anonymous FTP login, any time the anonymous user types a cd command, the path is always resolved relative to their home directory. In other words, when they type cd /bin, they will really be changing to /usr/ftp/bin if the root has been set to /usr/ftp. This helps prevent access to any other areas of the filesystem than the FTP directory structure. The changes are effective only for the user ID the chroot command was run for.
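The way paths are reinterpreted under a changed root can be modeled in a few lines. This sketch only illustrates the name mapping; the real confinement is done by the operating system, and the model ignores symbolic links. The function name path_inside_chroot and its arguments are made up for this example.

```python
import posixpath


def path_inside_chroot(new_root, requested, cwd="/"):
    """Return the real path reached when a confined user asks for `requested`.

    new_root is the directory that appears as / to the user, and cwd is
    the user's current directory as the user sees it.
    """
    # Resolve the request the way the confined user sees the filesystem.
    # normpath discards any '..' that would climb above '/'.
    virtual = posixpath.normpath(posixpath.join(cwd, requested))
    # Everything the user can name lives underneath new_root.
    return posixpath.normpath(new_root + virtual)
```

With the root set to /usr/ftp, a request for /bin maps to /usr/ftp/bin, and an attempt such as ../../etc/passwd from /pub still ends up at /usr/ftp/etc/passwd rather than the real /etc/passwd.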
If you do create an upload area, you may want to set the permissions to allow execute and write, but not read (to prevent another user downloading the files someone else has uploaded).
To set the permissions for files and directories used by your anonymous FTP users, use the following procedure. If the directories or files do not already exist, copy or create them as necessary:
In general, you should have your FTP directories set so that all permissions for directories under ~ftp prevent write access by user, group, and other. Make sure the directories and files under ~ftp are set to allow the anonymous login to read them. (The directories need execute permission to allow the anonymous users to enter them and obtain directory listings.) This set of permissions provides pretty good security.
You can set the ownership of files and directories with the chown command. This command
chown root ~ftp/dev
sets the owner of ~ftp/dev to root, for example. All directories in the ~ftp directory structure should have the permissions set with the chmod command. This command
chmod 555 dir_name
sets read-execute permission only for the directory, for example. The exception to this rule is the upload directory, which can have write permission, as noted earlier.
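Once the permissions are set, it is worth confirming that no directory was missed. The sketch below walks a directory tree and reports every directory that still has a write bit set for user, group, or other, so that only the upload directory should appear in the result. The function name writable_directories is made up, and the argument is assumed to be the top of your anonymous FTP directory structure.

```python
import os
import stat


def writable_directories(top):
    """Return (path, mode string) for each directory under top with a write bit set."""
    any_write = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    found = []
    for dirpath, _dirnames, _filenames in os.walk(top):
        mode = os.stat(dirpath).st_mode
        if mode & any_write:
            # stat.filemode renders the mode the way ls -l does.
            found.append((dirpath, stat.filemode(mode)))
    return found
```

A directory set with chmod 555 shows up in ls -l as dr-xr-xr-x and is not reported, while an upload area set to allow write and execute but not read for group and other would be listed as drwx-wx-wx.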
Before you let anyone else onto your Linux FTP system, log in to it yourself and try to access files you shouldn't be able to, try to move into directories that are outside of the ~ftp structure, and try to write files where you shouldn't be able to. This provides a useful test of the permissions and directory structure. Spend an hour or so trying to read, write, copy, and move files, then try some su commands to try to log in as someone else (such as root or a valid system user). Make sure your system is buttoned up: if it isn't, someone else will find the holes and exploit them.
It is a useful idea to set up a mailbox for the FTP administrator so that users on other systems who need help or information can send mail to you. Create a user and mailbox for a login such as ftp-admin and alias the mailbox to yourself or another person (or just log in as ftp-admin occasionally to check the mail).
Because this book covers system administration, it won't go into much detail about how to organize your directory structure, but a few useful tips may help you. To begin, decide what you want to store on your FTP directories and organize the structure logically. For example, if you are making available programs you have written, set up separate directories for each. A README file in each directory will help show browsers what is contained therein. A master README or INSTRUCTIONS file in the ~ftp directory can help explain how your site is set up and its contents (the uppercase letters draw a user's attention to the files immediately).
The FTP system discussed earlier, supplied with practically every Linux distribution, requires a bit of work to make it secure. Even then, it is still vulnerable to very experienced hackers. A better alternative is available if you are paranoid about your system's security: WU FTP. Developed at Washington University, WU FTP adds some extra features to the standard FTP system:
If these features sound useful, you can obtain a copy of the source code of WU FTP from several sites, although the primary site is wuarchive.wustl.edu. Check for the file /packages/wuarchive-ftpd/wu-ftpd-X.X.tar.Z (where X.X is the latest version number). You will get the source code, which needs to be compiled on your Linux system.
WU FTP uses a number of environment variables to control the service, and the accompanying documentation helps you set it up properly. Setting up WU FTP is much more complex than standard FTP, and the extra security, which is useful, may be unnecessary for many FTP site machines you may set up at home or work (unless you have sensitive information).
The information in this chapter enables you to set your system up as a full anonymous FTP site, or just for users you want to gain access. The process is simple, although you have to take care to ensure that the file permissions are properly set. Once your FTP site is up, you can let others on the Internet or your local area network know you are running, and the type of material you store on your system. Then sit back and share!