Do You Speak Tar?
For a lot of people, the GNU
tar command’s options seem obscure and hard to
use. The most common ones are almost always written in their short form and always appear grouped
in the same order and often without a leading hyphen, e.g.
tar xzvf archive.tgz and not
tar -v -z -x -f archive.tgz. Additionally,
tar doesn’t work without any “option”.
These options, or rather these commands, can be seen as a (small) language that
you can learn to
speak, write or read. Each command has its own
meaning that sometimes depends on which other commands are used with it.
tar’s sentences start with a verb. There’s no subject, because you’re giving
an order to
tar. This verb is followed by zero or more modifiers that give
more context to the action. The last part is the object(s) on which the action
is made. Spaces are not needed between
tar’s words because they all consist
of one letter.
The two most common actions are “create” (c) and “extract” (x).
The first one is used to create an archive from some files, and the
second is used to extract that archive in order to get back those files.
tar implementations support one more action: “list” (t), to
list an archive’s content without extracting it. Some implementations support
two variants of “create” that are “append” (r) and “update” (u).
The former appends files to an existing archive; the latter adds only the
files that are more recent than their copy in the archive.
We now know all of tar’s actions, but unfortunately we can’t do much without knowing
how to apply them to an object. Let’s dive into objects first; we’ll see the
modifiers later.
tar has a very limited set of objects: archives. Each
tar command operates
on one archive, which is given by
f (for “file”) followed by its path.
Files added to or extracted from archives are simply given as extra arguments to these commands without needing any special word.
We’re now ready to write our first meaningful sentences.
“tar, please create an archive file foo.tar with files file1 and file2” is written as
tar cf foo.tar file1 file2.
“Extract archive file foo.tar” is written as
tar xf foo.tar.
“List archive file foo.tar” is written as
tar tf foo.tar. You get the idea.
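To try these sentences end to end, here is a small sketch. The scratch directory, the file contents and the extra file3 are made up for the example; only the tar invocations follow the text above. The append step assumes a tar that supports r, such as GNU tar or bsdtar.

```shell
set -eu
# Throwaway working directory (hypothetical setup, not part of tar itself).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "one" > file1
echo "two" > file2
echo "three" > file3

tar cf foo.tar file1 file2   # create: verb c, object "f foo.tar", then the files
tar rf foo.tar file3         # append: add file3 to the existing archive
listing=$(tar tf foo.tar)    # list: one member name per line
echo "$listing"

mkdir out
cd out
tar xf ../foo.tar            # extract into the current directory
cat file1 file2 file3
```

Reading each command aloud as “verb, object, files” is a quick way to check that you wrote what you meant.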
Note that actions like “extract” or “list” accept additional
arguments for the file patterns you want to extract/list. Say you have a big
archive from which you only want to extract one
important.txt file. Just give
this information to
tar and it’ll kindly extract it for you:
tar xf big-archive.tar important.txt
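If you want to see this in action, the sketch below builds a stand-in for the big archive and pulls a single file out of it. The other file names (other1.txt, other2.txt) and the contents are hypothetical.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Build a stand-in for the "big" archive (three tiny files here).
echo "keep me" > important.txt
echo "noise" > other1.txt
echo "noise" > other2.txt
tar cf big-archive.tar important.txt other1.txt other2.txt
rm important.txt other1.txt other2.txt

# Only the named member is extracted; the others stay in the archive.
tar xf big-archive.tar important.txt
ls
```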
You might wonder what this “file” word is for if we always need it.
Well, we can remove it. But if we do so, our
tar command doesn’t have any
object left, so it’ll look at something else:
Actions that read archives operate on
STDIN if you don’t give them a file.
You can also be explicit by giving - as the file name (f -).
The “create” action will output the archive on
STDOUT if you don’t
give it a name (or use
f -). You still need to give it the names of the files
to put in that archive.
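Here is a hedged sketch of both directions. It spells out f - every time, because what tar does when f is omitted depends on how it was built: GNU tar, for instance, falls back to the TAPE environment variable or a compiled-in default. The file names and the scratch directory are assumptions made for the example.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "one" > file1
echo "two" > file2

# create: "f -" sends the archive to STDOUT, so we redirect it to a file
tar cf - file1 file2 > foo.tar

# list: "f -" reads the archive from STDIN
listing=$(tar tf - < foo.tar)
echo "$listing"

# extract from STDIN, for example at the end of a pipeline
mkdir out
cat foo.tar | (cd out && tar xf -)
```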
Note that you can’t extract an archive to
STDOUT without a modifier.
tar operates on files, not on data streams. By default it doesn’t compress
its content, so creating a
tar archive for one file doesn’t make much sense.
Now that we know how to write basic sentences, let’s add some modifiers to them.
In my experience the most used modifiers are v and
z. The first one is the
“verbose” flag and makes
tar more chatty. When creating or
extracting an archive it’ll print each file’s name as it’s (un)archiving
it. When listing an archive it’ll print more info about each file.
Compare both outputs below:
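A quick way to produce both listings yourself is sketched here. The archive and its two files are hypothetical; the exact columns printed by the verbose listing vary between tar implementations, so no sample output is shown.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "one" > file1
echo "two" > file2
tar cf foo.tar file1 file2

plain=$(tar tf foo.tar)      # names only
verbose=$(tar tvf foo.tar)   # names plus details such as permissions and size

echo "$plain"
echo "$verbose"
```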
The v modifier can be combined with any other one mentioned below.
z will tell
tar to use Gzip to (de)compress the archive. Nowadays
tar-ing with no compression is rare, and Gzip is ubiquitous. Just add
z to your modifiers and
tar will create compressed archives and
extract them. The convention is to use the
.tgz (or .tar.gz) extension for such archives.
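As a minimal sketch, assuming gzip is installed and using made-up file names, the three actions with z look like this:

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "one" > file1
echo "two" > file2

tar czf foo.tgz file1 file2   # create a Gzip-compressed archive
listing=$(tar tzf foo.tgz)    # list it, decompressing on the fly
echo "$listing"

mkdir out
cd out
tar xzf ../foo.tgz            # extract it
```

The verb still comes first and f still comes last, right before the archive name; z simply sits between them.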
Other common modifiers include
j, which works exactly like
z but (de)compresses
using Bzip2 instead of Gzip. Such archives usually end with
.tbz2 (or .tar.bz2). It’s not named b or
B because those were already taken when
this modifier was introduced.
Instead of j one can use its capital friend,
J. This one compresses using
xz instead of Gzip or Bzip2. These archives use the extensions .txz or .tar.xz.
Note you can also (de)compress archives by yourself if you don’t remember these modifiers:
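For instance, with Gzip this could look like the sketch below. It assumes the gzip command is available and uses hypothetical file names; the same shape should work with bzip2 or xz if they are installed.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "one" > file1
echo "two" > file2

# compress by hand: tar writes the archive to STDOUT, gzip compresses the stream
tar cf - file1 file2 | gzip > foo.tar.gz

# decompress by hand: gzip -dc writes the plain tar stream to STDOUT
listing=$(gzip -dc foo.tar.gz | tar tf -)
echo "$listing"

mkdir out
gzip -dc foo.tar.gz | (cd out && tar xf -)
```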
There are a dozen other modifiers you can find in the manpage, but let’s
mention two more: O and
k. You may remember from the first section that I
wrote you can’t extract an archive to
STDOUT without a modifier. Well,
that modifier is called O. Using
O when extracting will print the content of the archive on
STDOUT. This is the same output as you would get by calling
cat on all the
files in it.
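A small sketch of O, with a hypothetical two-file archive: the files are deleted after archiving to show that nothing is written back to disk.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "one" > file1
echo "two" > file2
tar cf foo.tar file1 file2
rm file1 file2

# x + O: member contents go to STDOUT instead of being written to disk
contents=$(tar xOf foo.tar)
echo "$contents"
```

Note that f stays last in the bundle, since it is the word that takes the archive name.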
The last modifier I wanted to mention is
k, which tells
tar not to overwrite
existing files when extracting an archive. That is, if you already have
a file called
important.txt in your directory and you un-tar an archive with the
k modifier, you can be sure it won’t overwrite your existing file.
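Here is a hedged sketch of that scenario; the file contents are made up. Some implementations, GNU tar among them, also report the conflict as an error and exit with a non-zero status, which is why the extraction is followed by || true here.

```shell
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "archived version" > important.txt
tar cf foo.tar important.txt

echo "my local edits" > important.txt   # the file we want to protect

# k keeps the existing file; a complaint or non-zero exit is tolerated
# (assumption: your tar may or may not treat the conflict as an error)
tar xkf foo.tar 2>/dev/null || true

cat important.txt   # prints "my local edits"
```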
I hope this post helped you have a better understanding of
tar commands, and
how they’re not that complicated. I put a few (valid) commands below, just so
you can see if you understand what they’re doing: