The 5 Linux Packaging Types You Need to Know
Unix, Windows, and Linux administrators must be familiar with the packaging formats used with their operating systems, all of which are covered in our Linux course. Candidates for sysadmin certification are typically tested on the relevant formats in the certification exam.
For example, in the system operation and maintenance section of the CompTIA Linux+ certification exam, candidates are given a scenario and asked how they would conduct software installations, configurations, updates, and removals using various package types.
What is a Package Format?
A package format is used when multiple files, such as those required to install software, need to be distributed together. The package combines all the required files (pre-compiled binaries as well as source, text, or data files) into a single archive. Packages may also be created for backup storage and portability. A package also carries metadata such as the software's name, purpose, and version number, along with the operating system versions it supports and any other requirements for the target environment. This information, sometimes called a manifest of dependencies, lets developers of Linux software tell sysadmins what is needed for the software to run correctly on their version of Linux.
Packages are created to work with a particular package manager, the software that sysadmins use to unpack Linux software and prepare it for installation and operation. All package managers perform similar basic functions, but each has its own user interface and internal workings. Some Linux vendors and distributions, such as Red Hat and Debian, have created their own package managers and package formats.
However, this does not mean that a package manager or format is tied to a single distribution. In fact, the Alien command-line utility, available on Debian, allows sysadmins to convert between formats, and some package managers work with multiple packaging formats.
Packaging Formats Covered by CompTIA Linux+ Certification
While there are many different Linux packaging formats, there are five that are featured in the objectives for the CompTIA Linux+ certification exam. Let's look at those formats and see when they are used and why. The five packaging formats are:
RPM packages (.rpm)
Debian packages (.deb)
TAR archives (.tar)
TGZ archives (.tgz)
Gzip archives (.gz)
RPM Packages (.rpm)
The .rpm packaging format is one of the most common Linux packaging formats. It was originally designed and developed for the Red Hat Linux distribution and takes its name from the Red Hat Package Manager (RPM), now officially called the RPM Package Manager. The format is also used on other Linux distributions such as openSUSE and was selected as the packaging format for the Linux Standard Base.
RPM files are most often used to hold software binaries, but they may also be used for uncompiled software source files. RPM packages containing binaries have a .rpm extension, whereas .src.rpm is normally used for source packages. A tag in the package's header informs the package manager whether the package contains binary or source files.
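As a rough illustration of how a tool can tell the two kinds of package apart, the sketch below reads the fixed 96-byte "lead" at the start of an RPM file, which has historically carried a type field (0 for binary, 1 for source). This is only a sketch: current RPM tooling relies on tags in the header rather than the legacy lead, and the helper names rpm_package_type and make_fake_lead are invented for this example. The leads used here are synthetic bytes, not real packages.

```python
import struct

RPM_MAGIC = b"\xed\xab\xee\xdb"  # first four bytes of an RPM file
LEAD_SIZE = 96                   # the legacy lead is a fixed 96-byte structure


def rpm_package_type(lead):
    """Return 'binary' or 'source' from the type field of an RPM lead."""
    if len(lead) < LEAD_SIZE or lead[:4] != RPM_MAGIC:
        raise ValueError("not an RPM lead")
    # Big-endian layout after the magic: major, minor, then a 16-bit type.
    _major, _minor, pkg_type = struct.unpack(">BBh", lead[4:8])
    return {0: "binary", 1: "source"}.get(pkg_type, "unknown")


def make_fake_lead(pkg_type):
    """Build a synthetic lead for demonstration only (hypothetical helper)."""
    head = RPM_MAGIC + struct.pack(">BBh", 3, 0, pkg_type)
    return head.ljust(LEAD_SIZE, b"\0")


print(rpm_package_type(make_fake_lead(0)))  # lead marked as binary
print(rpm_package_type(make_fake_lead(1)))  # lead marked as source
```

To try this on a real package, you could read the first 96 bytes of the file in binary mode and pass them to rpm_package_type.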
Debian Packages (.deb)
This packaging format was developed for the Debian Linux distribution. It is the de facto standard for Debian and derivatives such as Ubuntu. A .deb file is itself an ar archive that holds a small format-version file and two further archives: one with control information, the other with the installable data. These two inner archives are in the .tar format (usually compressed), which we will discuss below.
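To make that layout concrete: a .deb file is an ar archive whose members are normally a debian-binary version file, a control tarball, and a data tarball. The sketch below walks the 60-byte member headers of an ar archive and lists each member's name and size. The function names list_ar_members and make_ar are made up for this example, and the archive it inspects is built in memory with placeholder payloads rather than real tarballs, so treat it as an illustration of the container layout, not a package parser.

```python
AR_MAGIC = b"!<arch>\n"  # global header of every ar archive
HEADER_SIZE = 60         # each member is preceded by a 60-byte ASCII header


def list_ar_members(data):
    """Return a list of (name, size) for each member of an ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    offset = len(AR_MAGIC)
    while offset + HEADER_SIZE <= len(data):
        header = data[offset:offset + HEADER_SIZE]
        # Fixed-width fields: name is bytes 0-15, size is bytes 48-57.
        name = header[0:16].decode("ascii").rstrip(" /")
        size = int(header[48:58].decode("ascii"))
        members.append((name, size))
        # Member data is padded with one byte when its size is odd.
        offset += HEADER_SIZE + size + (size % 2)
    return members


def make_ar(files):
    """Build a tiny ar archive in memory (demonstration helper)."""
    out = AR_MAGIC
    for name, body in files.items():
        header = (
            f"{name:<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{len(body):<10}`\n"
        ).encode("ascii")
        out += header + body + (b"\n" if len(body) % 2 else b"")
    return out


# Placeholder payloads; a real .deb would hold genuine tarballs here.
demo = make_ar({
    "debian-binary": b"2.0\n",
    "control.tar.gz": b"<control>",
    "data.tar.gz": b"<data>",
})
for name, size in list_ar_members(demo):
    print(name, size)
```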
The base building block for Debian package management is the dpkg utility. It handles the basics of packing, unpacking, installing, and removing software on Debian and its derivatives. Sysadmins typically do not use dpkg directly, preferring a more user-friendly interface that sits on top of it, such as the Advanced Package Tool (APT) or Aptitude, which itself builds on APT. For a graphical user interface, you could try the Synaptic front end to APT.
TAR Archives (.tar)
The TAR format originated at Bell Laboratories in 1979 for the UNIX operating system. In those dim and distant days, tape drives were used for archiving, for data transfer, and for backup and recovery. The TAR (Tape ARchive) command allowed administrators to copy groups of files and directories into a combined archive file. The archive files created by TAR are commonly called tarballs. Later, admins could unpack the files and their directory structure on the same or a different computer system.
Despite its origin as a tape archiver, TAR is still a commonly used archiving and packaging format in both the Linux and Windows worlds. Historically, the TAR command did not compress the files in a tarball. If you combined five 1-megabyte files, you would get a tarball of slightly more than 5 megabytes, because the format adds a small header for each file. Nowadays, implementations of the TAR command have options to compress files, but by convention, uncompressed tarballs have a .tar extension.
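The size arithmetic above can be checked with a short sketch. It uses Python's standard tarfile module to pack five 1-megabyte payloads into an uncompressed tarball held in memory and then compares the sizes. The helper name make_tarball and the file names are invented for the example, and the payloads are filler bytes.

```python
import io
import tarfile


def make_tarball(files, mode="w"):
    """Pack a {name: bytes} mapping into an in-memory tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode) as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


# Five filler files of 1,000,000 bytes each.
files = {f"file{i}.txt": b"A" * 1_000_000 for i in range(5)}

tarball = make_tarball(files)  # mode "w" means no compression
payload_total = sum(len(p) for p in files.values())

print("payload bytes:", payload_total)
print("tarball bytes:", len(tarball))
```

The tarball comes out a little larger than the combined payload because each member gets its own header and the data is padded to fixed-size blocks.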
In the final two packaging formats, we will look at options to create compressed archive files.
TGZ Archives (.tgz)
Normally, for ease of transfer and storage, you would want to compress the files in your tarball. For that reason, most Linux implementations of TAR have options that let you create a compressed tarball, sometimes with your choice of compression technique. TGZ archives are tarballs compressed with GNU Zip (Gzip). The amount of compression depends on the original file types, but the Gzip manual states that typical text, such as source code or English, is reduced by 60-70%.
TGZ archive files can be created directly by passing the Gzip option (z) on the TAR command line, for example tar czf archive.tgz followed by the files to pack, or by running Gzip on a standard TAR archive file. A tarball compressed directly is normally given a .tgz or .tar.gz name, while running Gzip on archive.tar produces archive.tar.gz by default.
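The two routes can be compared in a small sketch using Python's standard tarfile and gzip modules as stand-ins for the command-line tools. The helper names build_tar and read_back, and the sample file names and contents, are assumptions made for this example. Both routes should yield archives that unpack to the same files, even though the compressed bytes themselves may differ.

```python
import gzip
import io
import tarfile


def build_tar(files, mode):
    """Write a {name: bytes} mapping to an in-memory tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode) as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_back(tgz_bytes):
    """Decompress and extract a gzip-compressed tarball into a dict."""
    with tarfile.open(fileobj=io.BytesIO(tgz_bytes), mode="r:gz") as archive:
        return {m.name: archive.extractfile(m).read()
                for m in archive.getmembers()}


files = {
    "notes.txt": b"hello packaging\n" * 1000,
    "readme.txt": b"tar then gzip\n" * 1000,
}

# Route 1: let tar apply gzip while writing (comparable to the z option).
tgz_direct = build_tar(files, "w:gz")

# Route 2: write a plain tarball first, then gzip it as a separate step.
plain_tar = build_tar(files, "w")
tgz_two_step = gzip.compress(plain_tar)

print(read_back(tgz_direct) == read_back(tgz_two_step))
print(len(plain_tar), len(tgz_direct), len(tgz_two_step))
```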
Gzip Archives (.gz)
Finally, .gz archives are files compressed directly with the Gzip utility. Gzip compresses only single files and creates a separate .gz file for each one. So, if you wanted to create a compressed archive of multiple files and directories, you would first need to combine them into a single file. Didn't we just discuss tar?
Given the naming conventions related to .gz, .tgz, and .tar.gz extensions, how should you unpack these archives? On a plain .gz file, decompress with Gzip (gzip -d, or its companion gunzip) and you will get the original uncompressed file. On .tgz and .tar.gz archives, you could do the same and get an uncompressed TAR archive, which you must then unpack with TAR. Alternatively, you could skip that step by using the x and z options on the TAR command line (tar xzf file.tar.gz) to decompress and extract the files in one go.
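The three unpacking cases can be sketched with Python's standard gzip and tarfile modules standing in for the command-line tools. The sample data, the file name app.conf, and the helper extract_all are made up for the example. The one-step route uses mode "r:gz", which plays the same role as combining the x and z options.

```python
import gzip
import io
import tarfile


def extract_all(fileobj, mode):
    """Read every member of a tar archive into a {name: bytes} dict."""
    with tarfile.open(fileobj=fileobj, mode=mode) as archive:
        return {m.name: archive.extractfile(m).read()
                for m in archive.getmembers()}


# Sample inputs: one gzip-compressed file and one gzip-compressed tarball.
single_gz = gzip.compress(b"plain log line\n")

tar_buffer = io.BytesIO()
with tarfile.open(fileobj=tar_buffer, mode="w:gz") as archive:
    payload = b"key=value\n"
    info = tarfile.TarInfo(name="app.conf")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))
tgz = tar_buffer.getvalue()

# Case 1: a plain .gz decompresses straight back to the original file.
original = gzip.decompress(single_gz)

# Case 2: a .tgz in two steps: decompress first, then unpack the plain tar.
plain_tar = gzip.decompress(tgz)
two_step = extract_all(io.BytesIO(plain_tar), "r:")

# Case 3: a .tgz in one step: decompress and extract together.
one_step = extract_all(io.BytesIO(tgz), "r:gz")

print(original)
print(two_step == one_step)
```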
It is important for Linux sysadmins, as well as DevOps professionals, to be up to speed on commonly used packaging formats, which are an important part of managing and maintaining a productive Linux environment. If you are planning on going for the CompTIA Linux+ certification exam, then you'd better know your TARs, TGZs, and GZs.