Unix Crash Course!

The following guide will attempt to present a simple, though comprehensive, account of the basic server-side programs and commands that we hope will be of use to both the casual and the more advanced user. For more extensive reference, see the List of Unix utilities on Wikipedia and the Unix manual pages.

Even if you are not technically-oriented and have never worked with a shell, you can learn to work progressively with the different commands that, when used together, will give you a level of control that graphical user interfaces, by design, cannot provide.

An SSH (Secure Shell) client program must be installed on your workstation in order to connect to the server and issue commands. The "shell" is simply a program that receives the commands that you type. Different shell programs are available, and each provides different features (such as auto-completion); the choice is mostly a matter of taste. Free, easy-to-use SSH clients are available for Windows, Mac, Unix and other platforms.

Table of contents

Moving across directories
File management
Image manipulation
Sifting through text with grep
Process management
File permissions
Archives
Locating files with find
Downloading files
Editing text
Browsing the web

Use Autocompletion (TAB key)

Good command-line interfaces are designed to reduce unnecessary typing to a minimum. The most useful device for achieving this goal is autocompletion. You can use autocompletion by simply hitting TAB (with most keyboards) in the middle of entering your command. If you are in the process of entering a command name, the command will be auto-completed based on the available programs. If you are entering an argument, autocompletion will use the files in the current directory - a convenient feature (especially when dealing with filenames containing spaces).

Where different completion options exist, the program will display the possibilities.

Moving across directories

The cd (change directory) command will set the working directory to the specified directory.

  $ cd new_directory
  $ cd www/logs

If the name of the directory itself contains spaces (or other special characters), you need to either enclose the directory name in quotes, or precede the space with a backslash (\) character:

  $ cd "new directory with spaces"
  $ cd new\ directory\ with\ spaces

So far, we have assumed that the new directory is a subdirectory of the current (working) directory. If you want to move to a directory under a specific location, you can use an absolute path name:

  $ cd /home/myself/www/mystuff
  $ cd /usr/share/doc

For convenience, ~ and $HOME both expand to your home directory. cd without arguments will also move you to your home directory.

  $ cd $HOME/www/logs
  $ cd ~/www/logs
  $ cd

The pwd command displays the current (or working) directory. You will usually not need this command since most shell programs already display the working directory in the prompt (the text shown left of the cursor when you are entering a new command).

  $ pwd
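
As a minimal sketch of the commands above, the following session creates a scratch directory, moves into a subdirectory whose name contains spaces, and returns. The variable names scratch and here are made up for illustration, and cd - (switch back to the previous working directory) is a standard shell feature not covered above.

```shell
# Create a throwaway directory tree to experiment in (hypothetical names).
scratch=$(mktemp -d)
mkdir -p "$scratch/new directory with spaces"

cd "$scratch/new directory with spaces"   # quotes keep the spaces intact
here=$(basename "$(pwd)")                 # last component of the working directory
echo "now in: $here"

cd - >/dev/null                           # back to the previous directory
rm -r "$scratch"                          # clean up the scratch tree
```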

File management

To list the contents of the working directory, you can use the ls command. ls has an almost humorous number of options (documented in its manual page), but you will most likely use only -l (detailed listing), -d (list a directory itself rather than its contents), -R (recurse into all subdirectories), and -a (show hidden files; i.e., files that start with a dot).

  $ ls
  $ ls -la
  $ ls -ld SomeDir
  $ ls -R SomeDir

cat can be used to display the entire contents of a text file to the terminal:

  $ cat ~/ToDo.txt

It is also common to concatenate files together using cat:

  $ cat ToAppend.txt >> TargetFile.txt
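
The difference between joining files into a new file and appending to an existing one can be sketched as follows. The file names Part1.txt, Part2.txt and Combined.txt are hypothetical and live in a temporary directory.

```shell
scratch=$(mktemp -d)
printf 'first\n'  > "$scratch/Part1.txt"
printf 'second\n' > "$scratch/Part2.txt"

# Join two files into a new one; ">" overwrites the target if it exists.
cat "$scratch/Part1.txt" "$scratch/Part2.txt" > "$scratch/Combined.txt"

# Append one more copy of Part1.txt; ">>" adds to the end of the target.
cat "$scratch/Part1.txt" >> "$scratch/Combined.txt"

combined=$(cat "$scratch/Combined.txt")
echo "$combined"
rm -r "$scratch"
```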

The cp (copy) command will copy one or more files. To create a copy of SourceFile named NewFile, you would use:

  $ cp SourceFile NewFile

cp can also copy a file into a given directory:

  $ cp Image.jpg ~/www/MyImages/

Entire directories can be copied in the same way with the -R (recursive) option. We can add the -p (preserve) option to preserve file attributes such as permissions and modification times:

  $ cp -Rp Images ~/www/MyImages/

The mv (move) command will move (i.e., rename) files or directories:

  $ mv SomeFile TargetFile
  $ mv SomeFile ~/MyDir/
  $ mv SomeDir ~/MyDir/

The touch utility is useful for creating empty files (or updating the modification time on an existing file):

  $ touch EmptyFile

The mkdir command will create a new directory:

  $ mkdir NewDir

The rm (remove) command will delete single files, or entire directories if used with the -R option:

  $ rm EmptyFile
  $ rm -iR NewDir

Unless the -i option is used, rm will never prompt for confirmation before deleting files. rm does not perform any backup. The only way to recover data deleted by mistake is to recover from backup. You can request backup restores from us (or perform the recovery yourself by logging into a backup server).

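Occasionally, files with a dash (-) as the first character in the file name are created, usually by mistake. rm would interpret such a name as a set of options, and neither quotes nor a backslash prevent this, since the shell removes them before rm sees the argument. Instead, prefix the name with ./ or place -- (end of options) before it:

  $ rm ./-StupidFile
  $ rm -- -StupidFile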
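
Two forms that rm reliably accepts for names starting with a dash can be tried safely in a scratch directory. This is only a sketch; -OtherFile and the variable names are made up for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"

touch ./-StupidFile ./-OtherFile   # create two files whose names start with "-"

rm -- -StupidFile                  # "--" marks the end of the options
rm ./-OtherFile                    # a path prefix hides the leading dash

remaining=$(ls -A | wc -l | tr -d ' ')
echo "files left: $remaining"

cd - >/dev/null
rm -r "$scratch"
```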

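tail is used to display the last few lines of a text file. With the -f (follow) option, it keeps running and prints new lines as they are appended, which is useful for watching log files (press Ctrl-C to stop):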

  $ tail -f ~/www/logs/access
  $ tail -f ~/www/logs/error

Symbolic links are used to create short cuts, or multiple names for a single file or directory (possibly across directories). They are created with the ln command. The example below will create a new name for MyFile:

  $ ln -s MyFile LinkToMyFile
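
A short sketch of how a symbolic link behaves, run in a temporary directory: reading the link reads the target, and removing the link leaves the target untouched. The variable names are hypothetical.

```shell
scratch=$(mktemp -d)
cd "$scratch"

echo 'hello' > MyFile
ln -s MyFile LinkToMyFile       # LinkToMyFile now points at MyFile

ls -l LinkToMyFile              # the listing shows "LinkToMyFile -> MyFile"
via_link=$(cat LinkToMyFile)    # reading the link reads the target file

rm LinkToMyFile                 # removes only the link, not MyFile
if [ -f MyFile ]; then target_kept=yes; else target_kept=no; fi
echo "read via link: $via_link, target kept: $target_kept"

cd - >/dev/null
rm -r "$scratch"
```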

Image manipulation

The convert utility (part of the ImageMagick distribution) is a convenient tool for image manipulation from the command line.

To generate a set of thumbnails for all the JPEG files in the current directory, you would use a command like:

  $ convert '*.jpg' -resize 120x120 thumbnail%03d.png

Converting a series of PNG files to JPEG format is as simple as:

  $ convert *.png images.jpg

Here is a simple Perl script that creates thumbnails for all *.jpg files in the current directory as necessary, and outputs an HTML code fragment for a table: GenThumbnails.pl.

For more information, see ImageMagick: Command-line Processing.

Sifting through text with grep

grep is useful when working with any type of text. Given text input, grep will output only those lines matching a given pattern. Patterns are described using regular expressions (or "regexps"), a powerful pattern language recognized by many text editors, programming languages, and of course grep.

You don't need to learn regular expressions to use grep. Typically, grep is used to search a text file for lines containing a substring:

  $ grep 'John Smith' ~/Customers.txt

With the -r option, grep will scan all files in the given directory:

  $ grep -r 'John Smith' ~/Customers/

-r is not always a good option when you want to search through specific files, which is why grep is often combined with find. Let's say you want to search for "Joe Smith" in all of your *.html and *.txt files in ~/www/. You could use:

  $ find ~/www/ \( -name \*.html -or -name \*.txt \) -exec grep -H 'Joe Smith' {} \;

The -H flag tells grep to display the filename along with every output line. The escaped parentheses group the two -name tests so that -exec applies to both of them. Everything in between -exec and \; will be executed as a command on every *.html and *.txt file located by find. In the command, the {} will be replaced by the file name.

You can also perform more advanced searches using regular expressions with egrep:

  $ egrep '(John|Joe) Smith' ~/Customers.txt
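
The searches described in this section can be tried on sample data. The sketch below builds a small hypothetical tree (a.txt, b.html, c.log) in a temporary directory. It uses grep -E, the standard spelling of egrep, and -o, the portable spelling of find's -or.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/www"
printf 'John Smith\nJane Doe\nJoe Smith\n' > "$scratch/www/a.txt"
printf '<p>Joe Smith</p>\n'                > "$scratch/www/b.html"
printf 'Joe Smith\n'                       > "$scratch/www/c.log"

# Count the lines matching either first name (-c prints a count, not the lines).
count=$(grep -E -c '(John|Joe) Smith' "$scratch/www/a.txt")
echo "matching lines in a.txt: $count"

# Search only *.html and *.txt files. The escaped parentheses group the two
# -name tests so that -exec applies to both of them, not just the last one.
hits=$(find "$scratch/www" \( -name '*.html' -o -name '*.txt' \) \
         -exec grep -H 'Joe Smith' {} \; | wc -l | tr -d ' ')
echo "lines found by find + grep: $hits"

rm -r "$scratch"
```

c.log also contains the name but is skipped, because it matches neither -name test.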

Process management

In the same way that your workstation allows you to have multiple programs, such as web browsers and word processors running on your desktop at the same time, the server enables you to execute multiple programs concurrently. Instances of an executing program are called processes.

Even if you have never logged into your shell, you possibly have a number of processes running right now. Whenever someone visits a PHP or CGI script on your website, a process is created to serve the request, and terminated when the data has been transferred. If you are using FastCGI, as we recommend, you may have a few processes running persistently. If you are logged into our Control Panel, you have at least one process running.

To see the list of processes executing under your account, use the ps command:

  $ ps -x

If you omit the -x flag, ps will only return the processes that have been started from your current shell session.

More information about processes, such as the amount of CPU and memory resources used, is shown with the -u flag:

  $ ps -ux

The top utility will display your processes and periodically update the information. It is a common practice for developers to have a dedicated shell session on their desktop for purposes of running top.

  $ top -U $USER

Typically, processes are expected to perform some work and then terminate, sleep or wait on some resource (the "STATE" column in top shows this information). Sometimes, programming errors (e.g., infinite loops) can cause processes to spin out of control and keep executing code forever. If such a thing happens, the process can be terminated manually with the kill or skill command:

  $ kill 1234
  $ kill -KILL 1234
  $ skill php
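
To practice terminating a process without touching anything important, a harmless stand-in can be used. In the sketch below, sleep 300 plays the role of the runaway program, and pid, before, after and status are illustrative variable names. kill -0 is a standard way to test whether a process exists without sending it a signal.

```shell
# Start a long-running process in the background; "sleep 300" stands in for
# a runaway program. $! holds the process ID of the most recent background job.
sleep 300 &
pid=$!

# kill -0 sends no signal; it only reports whether the process exists.
if kill -0 "$pid" 2>/dev/null; then before=running; else before=gone; fi

kill "$pid"                        # polite request to terminate (SIGTERM)
status=0
wait "$pid" || status=$?           # collect the (non-zero) exit status

if kill -0 "$pid" 2>/dev/null; then after=running; else after=gone; fi
echo "before: $before, after: $after, wait status: $status"
```

If a process ignores the plain kill, kill -KILL cannot be ignored, but it gives the program no chance to clean up, so it is best kept as a second step.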

As mentioned previously, processes can be associated with your current ssh session. If you are using, for instance, a text editor, you can usually hit Ctrl-Z to suspend the editor and return to the shell. When you want to bring it back up, issue the fg command. The jobs command will display the list of suspended and background processes.

  $ jobs

As another example, if you are executing an expensive image processing command like convert or uncompressing a large archive, you can place the command in the background and do something else. Hit Ctrl-Z (this will stop the process), and use the bg command to let it continue running in the background. jobs will show the command along with a number identifying the task. To bring task number 1 back into the foreground, you would use:

  $ fg 1

File permissions

Unix allows you to manage permissions on individual files and directories using a simple but effective system. The chmod command is employed to set read, write, and execute permissions on files and directories. To see the permissions currently associated with a file or directory, use ls -l:

  $ ls -l *.txt
  -rw-------  1 myself mygroup  0 Mar 28 12:45 PrivateFile.txt
  -rw-rw-r--  1 myself mygroup  0 Mar 28 12:47 PublicFile.txt

Every file is associated with an owner (here myself), a group (here mygroup) and a set of permissions. The first field in the listing specifies, after an initial file-type character, the privileges of the owner, of the group members, and of everyone else on the file, using three characters for each.

The available privileges are r (reading), w (writing) and x (executing if file is an executable, or in the case of a directory, x permits navigation under the directory).

In this example, PrivateFile.txt is readable and writable ("rw-") by the owner (myself), and inaccessible to anyone else. PublicFile.txt can be read and written by the owner and the members of mygroup ("rw-"), and is readable by everyone else ("r--").

If you wanted to make PrivateFile.txt readable by users in the mygroup group, you would use chmod:

  $ chmod g+r PrivateFile.txt

To change the group associated with a file, use the chgrp command. You can always create new groups and assign users to those groups using csoftadm (or use "Unix groups" in the Control Panel).

  $ chgrp mygroup PrivateFile.txt
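
The effect of chmod can be observed by comparing the first field of ls -l before and after the change. The sketch below works on a scratch copy of PrivateFile.txt. The octal mode 600 corresponds to "rw-------", and the variable names are made up for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"

touch PrivateFile.txt
chmod 600 PrivateFile.txt                          # octal form: rw for the owner only
mode_before=$(ls -l PrivateFile.txt | cut -c1-10)  # keep only the permission field

chmod g+r PrivateFile.txt                          # symbolic form: add group read
mode_after=$(ls -l PrivateFile.txt | cut -c1-10)

echo "before: $mode_before"
echo "after:  $mode_after"

cd - >/dev/null
rm -r "$scratch"
```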

File permissions of PHP/CGI scripts

On our servers, web applications (CGI, PHP or otherwise) execute under your own Unix user privileges, and may securely access your private (e.g., mode 0600) files.

Many web hosts lack this level of security, and will execute web applications under a shared UID, requiring that you make your web-accessible data readable or writeable by anyone on their servers. Many popular web applications (e.g., Gallery, as of this writing) will even assume by default that you are using such an insecure setup.

On our servers, there is never any need to make a file or directory world-writeable in order for your applications to access it. If your application's instructions mention use of permission modes such as 0666 on files and 0777 on directories, you can ignore them and instead use secure modes such as 0600 and 0700.

Archives

Server-side compression (or decompression) of files is a common task. There are utilities available to create and extract archives in various formats, such as: .tar, .gz, .bz2, .zip and .rar.

The tar command can be used to extract (-x option) archives in .tar format. It can also operate on .tar.gz (or .tgz) files when given the -z flag, and on .tar.bz2 files when given the -j flag. The -f option names the archive file, and the optional -v (verbose) flag causes progress to be displayed.

  $ tar -xvf my.tar                  # Extract a .tar
  $ tar -xzvf his.tar.gz             # Extract a .tar.gz
  $ tar -xjvf her.tar.bz2            # Extract a .tar.bz2

It is possible to extract only specified files from an archive:

  $ tar -xvf MyImages.tar vacation1.jpeg

The -t option is useful for inspecting the contents of an archive without unpacking it:

  $ tar -tf my.tar
  $ tar -tzf his.tar.gz

The -c (create) option is used to create a new .tar archive. Usually, the archive is subsequently compressed with gzip or bzip2:

  $ tar -cvf www.tar -C ~ www    # Create www.tar from ~/www/ directory
  $ gzip www.tar                 # Compress www.tar into www.tar.gz
  $ bzip2 www.tar                # Or instead: compress www.tar into www.tar.bz2

The most useful option to gzip and bzip2 is the compression level: -1 (fastest) through -9 (best) are accepted.
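
Creating, inspecting and unpacking an archive can be rehearsed as a round trip in a temporary directory. This is a sketch under assumptions: the directory layout and variable names are hypothetical, and it relies on tar's -z flag to compress while creating rather than running gzip as a separate step.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/www/images" "$scratch/restore"
echo 'index' > "$scratch/www/index.html"
echo 'photo' > "$scratch/www/images/vacation1.jpeg"

# Create and compress in one step (-z = gzip). -C changes into $scratch first,
# so the archive stores the relative path www/... rather than an absolute one.
tar -czf "$scratch/www.tar.gz" -C "$scratch" www

# List the archive contents without unpacking.
listing=$(tar -tzf "$scratch/www.tar.gz")
echo "$listing"

# Extract a single file into a different directory.
tar -xzf "$scratch/www.tar.gz" -C "$scratch/restore" www/images/vacation1.jpeg
restored=$(cat "$scratch/restore/www/images/vacation1.jpeg")
echo "restored content: $restored"

rm -r "$scratch"
```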

To unpack a .zip archive into the current directory, use the unzip command:

  $ unzip her.zip

A .zip archive is created with the zip command. Additional utilities such as zipsplit are available as well.

  $ zip -r MyImages.zip ~/www/images/

.rar archives are unpacked with unrar:

  $ unrar x Archive.rar

Locating files with find

The find tool is used to search files, or perform some action on multiple files in a directory hierarchy.

To display all *.php files in the ~/www/ directory (and subdirectories), regardless of the case of the file name, you would use:

  $ find ~/www -iname '*.php'

It is important to place the *.php pattern in quotes (or use \*.php) in order to avoid shell expansion.

Combining find with the grep command is an efficient way to search through text files (-i = case insensitive search, -H = always print filenames).

  $ find ~/www -name .htaccess -exec grep -iH 'redirect' {} \;

find can also execute a command on every file (or directory) it locates. The following will find any directory under ~/www/ with an access mode of 0777 (a world-writable directory), and will change it to a sane 0751 mode:

  $ find ~/www -type d -perm 777 -exec chmod 751 {} \;

The following locates any directory owned by the group OldGroup and changes its group to NewGroup:

  $ find . -type d -group OldGroup -exec chgrp NewGroup {} \;

The following prints all files in your home that have been modified in the last 24 hours:

  $ find $HOME -mtime 0

The following uses ls to print a detailed listing of all files in your home that have been modified in the last 7 days (note the minus sign: -mtime 7 alone would match only files modified between 7 and 8 days ago):

  $ find $HOME -mtime -7 -exec ls -l {} \;
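
The sign in front of the -mtime argument changes its meaning, which is easy to verify on two test files. In the sketch below, recent.txt and old.txt are hypothetical names, and touch -t is used to give old.txt a modification time in the year 2000.

```shell
scratch=$(mktemp -d)
touch "$scratch/recent.txt"                 # modified just now
touch -t 200001010000 "$scratch/old.txt"    # modification time set to 1 Jan 2000

# -mtime -7 : modified less than 7 days ago
# -mtime +7 : modified more than 7 days ago
# -mtime 7  : modified 7 days ago (age rounded down to whole days)
recent=$(find "$scratch" -type f -mtime -7 -exec basename {} \;)
older=$(find "$scratch" -type f -mtime +7 -exec basename {} \;)
echo "recent: $recent"
echo "older:  $older"

rm -r "$scratch"
```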

Here is a nice tutorial on the find command.

Downloading files

The wget tool can be used to download files (or complete directory hierarchies with -r) from FTP and HTTP sites:

  $ wget http://somesite.ext/somefile
  $ wget -r http://somesite.ext/somedir/

Don't use the -r option unless you know what you are doing. It should be used with care as it follows all links recursively.

You can resume an aborted, partially completed download with the -c option:

  $ wget -c http://somesite.ext/somefile

There are many more options available; see the wget manual page.

Editing text

Text editing is by far the most common task that users perform via their shell, since it is convenient to edit files directly on the server.

Several editors are available. The easiest editor for beginners is probably nano (or pico).

The more advanced vim editor is also available. vim is an improved version of vi, packed with features such as syntax highlighting, multiple windows, scripting, indenting, Unicode support and more.

Another (somewhat ridiculously) feature-rich editor is emacs. There is also a smaller and faster emacs-compatible editor called mg, which is always available on our servers.

Browsing the web

Initially developed in 1992, lynx is a tried and true text-based web browser that is still commonly used. It is simple to use: links are navigated using the tab key (interestingly, Microsoft holds the software patent for having invented the technique 12 years later...). Another text-based browser is links, which adds support for tables and frames.

  $ lynx www.csoft.net/docs/course.html
  $ links www.csoft.net