Some further Bash tips (HPR Show 1903)

Dave Morriss


Expansion

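There are seven types of expansion applied to the command line. They are listed here in the order the Bash manual gives them, though tilde expansion, parameter and variable expansion, command substitution and arithmetic expansion are actually performed together, from left to right: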

  • Brace expansion (we looked at this subject in the last episode 1884)
  • Tilde expansion
  • Parameter and variable expansion (this was covered in episode 1648)
  • Command substitution
  • Arithmetic expansion
  • Word splitting
  • Pathname expansion

We will look at some more of these in this episode but since there is a lot to cover, we'll continue in a later episode.

Tilde expansion

This is a convenient way of referring to a home directory in a file path, though there are other less well-known uses which we will also examine.

Tilde on its own

Consider the following example. Imagine you are in the directory Documents and you want to look at your .bashrc file. Here are some ways of doing this:

cd Documents
less ../.bashrc
less $HOME/.bashrc
less ~/.bashrc

  • The first method uses .. to refer to the directory above the current one.
  • The second uses the variable HOME, which is usually created for you when you log in, and points to your home directory.
  • The third method uses a plain tilde (~) which means the home directory of the current user.

Actually the tilde in this example uses the contents of the HOME variable, just like the example above it. If you happened to change this variable for some reason the changed version would be used. If there is no HOME variable then the defined home directory of the current user will be looked up.

Note: The line beginning '->' is what will be generated by the following statements. I will be using this method of signifying output throughout these notes (unless it's confusing).

echo ~
-> /home/hprdemo

cd Documents
HOME=$PWD
echo ~
-> /home/hprdemo/Documents

Warning: changing HOME can lead to great confusion, so it's not recommended. For example, after such a change the cd command without any argument moves to wherever HOME points rather than to the expected home directory. This is a demonstration, not a recommendation!
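If you want to experiment with this safely, one option is to make the change inside a subshell. The following is only a sketch; the choice of /tmp is arbitrary and the variable names are made up for illustration. A command substitution runs in a subshell, so the altered HOME disappears as soon as it finishes:

```bash
# Sketch: confine a change to HOME to a subshell (the $( ... ) below).
before=~                        # tilde as seen by the parent shell
inside=$(HOME=/tmp; echo ~)     # HOME is changed only inside the subshell
after=~                         # parent shell again

echo "$inside"                  # -> /tmp
[ "$before" = "$after" ] && echo "HOME in the parent shell is untouched"

# A quoted tilde is not expanded at all.
echo "~"                        # -> ~
```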

Tilde and a login name

If the tilde is followed by a login name (username) then it refers to the home directory of that login name:

echo ~hprdemo
-> /home/hprdemo
echo ~postgres
-> /var/lib/postgresql

This is useful for example in multi-user environments where you want to copy files to or from someone else's directory - assuming the permissions have been set to allow this of course.
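Two details are worth trying out here. The manual says that if the login name is invalid the word is left unchanged, and because tilde expansion is done before variable expansion, a login name held in a variable is not looked up. The sketch below assumes that no_such_login_hpr (a made-up name) does not exist on your system:

```bash
# An unknown login name: the word is left exactly as typed.
# "no_such_login_hpr" is a hypothetical name assumed not to exist.
unknown=$(echo ~no_such_login_hpr/files)
echo "$unknown"                 # -> ~no_such_login_hpr/files

# The tilde is examined before $name is expanded, so Bash looks for a
# login literally called "$name", fails, and leaves the tilde alone.
name=root
from_var=$(echo ~$name)
echo "$from_var"                # -> ~root  (not root's home directory)
```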

By the way, if you have changed the HOME variable it can be reset either by logging out and back in again or with the ~login_name form as in the following:

HOME=~hprdemo
echo ~
-> /home/hprdemo

As with many things in Bash, the login name after the tilde can be completed by pressing the Tab key. If you happen to work in an environment with many login names, then take care when doing this since it might require a search of the entire name space. I used to work at a university with up to 50,000 login names, and pressing Tab inappropriately could result in a big search and a very long delay!

Tilde with a plus sign

There are other forms of tilde expansion. First, ~+ uses the value of the PWD variable. This variable is used by Bash to track the directory you are currently in.

cd Documents
echo ~+
-> /home/hprdemo/Documents

Tilde and a minus sign

There is another variable, OLDPWD, which is used to hold the previous contents of PWD. This can be accessed with ~-.

cd Documents
echo ~-
-> /home/hprdemo
echo ~+
-> /home/hprdemo/Documents
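The two forms can be compared directly with the variables they stand for. This is a small sketch that assumes the directories / and /tmp exist, as they do on most Unix-like systems; the variable names are only for illustration:

```bash
cd /tmp
cd /

now=~+                          # same as $PWD
prev=~-                         # same as $OLDPWD
echo "$now"                     # -> /
echo "$prev"                    # -> /tmp

# ~- can be used anywhere a path is wanted, for example to go back.
cd ~-
echo "$PWD"                     # -> /tmp
```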

Tilde and the directory stack

There is one more way in which tilde expansion can be used in Bash. This links to the directory stack that we looked at in show 1843. In that show we saw the pushd and popd commands for manipulating the stack by adding and removing directories. We also saw the dirs command for showing the contents of the stack.

Using ~ followed by a + or a - and a number references a directory on the stack. Using dirs -v we can see the stack with numbered entries (we're not using the -> here as it might be confusing):

dirs -v
0  ~/Documents
1  ~

In such a case the tilde sequence ~1 (or ~+1) references the stack element numbered 1 above:

echo ~1
-> /home/hprdemo

Note how the tilde stored in the stack representation is expanded in this example.

The directory returned is the same as that reported by dirs -l +1 where the -l option requests the full form be displayed.

As discussed in show 1843 we can also reference stack elements in reverse order. So the tilde expression ~-1 in the above scenario will return the second element counting from the bottom:

echo ~-1
-> /home/hprdemo/Documents

The directory returned is the same as that reported by dirs -l -1.
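To see the numbered forms side by side, here is a sketch that builds a three-element stack. It assumes /, /tmp and /usr exist, and it starts with dirs -c to clear any existing stack so the numbering is predictable. The variable names are made up for the example:

```bash
dirs -c                         # start with an empty stack
cd /
pushd /tmp > /dev/null          # stack is now: /tmp /
pushd /usr > /dev/null          # stack is now: /usr /tmp /
dirs -v                         # numbered listing, 0 at the top

top=~0                          # element 0, the current directory
second=~1                       # same as ~+1
bottom=~-0                      # counting from the other end
echo "$top"                     # -> /usr
echo "$second"                  # -> /tmp
echo "$bottom"                  # -> /
```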

Tilde expansion in variables

Normally the tilde forms we have looked at would be used in file system paths when referring to files or directories. It is also possible to assign their values to variables, such as:

docs=~/Documents
echo $docs
-> /home/hprdemo/Documents

Bash provides special variables (and other software might need its own such variables) which contain lists of paths separated by colons (:). For example, the PATH variable, which contains paths used when searching for commands:

echo $PATH
-> /usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games

Bash allows the tilde expansion formats we have seen to be included in such lists:

PATH+=:~/bin
echo $PATH
-> /usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/home/hprdemo/bin

Notice how the addition to the PATH variable was not enclosed in quotes. If it had been then the tilde expansion would not have taken place.
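The effect of quoting is easy to check. In this sketch the variable names are invented for the demonstration, and it does not touch PATH itself. If you do need quotes (because of spaces in a path, say), using $HOME inside double quotes is a common alternative, since variables are still expanded there:

```bash
home=~                          # the expanded home directory, for comparison

unquoted=~/bin                  # tilde expanded after the =
quoted="~/bin"                  # tilde kept literally
list=/usr/bin:~/bin             # tilde expanded after the colon as well
with_var="$HOME/bin"            # variable expansion works inside quotes

echo "$unquoted"                # the home directory followed by /bin
echo "$quoted"                  # -> ~/bin
echo "$list"
echo "$with_var"
```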

Command Substitution

Commands often write output. Unless told otherwise they write this output to a channel known as standard output (STDOUT). It is possible to capture this output channel and use it in many contexts.

Take for example the date command. This reports a date and, optionally, a time, in various formats. To get today's date in the ISO 8601 format (the only sane format, which everyone should adopt) the following command could be used:

date +%Y-%m-%d
-> 2015-11-04

This output could be captured in a variable using command substitution as follows:

today=$(date +%Y-%m-%d)
echo $today
-> 2015-11-04
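A command substitution does not have to stand alone on the right of an assignment; it can be embedded in a longer string. As a sketch, the following builds a dated file name. The name notes-....txt is purely hypothetical and no file is created:

```bash
# Build a file name containing today's date (nothing is written to disk).
backup="notes-$(date +%Y-%m-%d).txt"
echo "$backup"
```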

The format $(command) for command substitution is the recommended one to use. There is an older form which uses backquotes around the command. The above example could be rewritten as:

today=`date +%Y-%m-%d`
echo $today
-> 2015-11-04

We will discuss only the $(command) form in these notes. See the manual page extract for details of the other format.

The text returned by the command is processed to remove trailing newlines. Embedded newlines are kept, but they are lost to word splitting if the result is later used without quotes. The following example shows the date command being used to generate multi-line output (the sequence %n generates a newline in output from date, and we're not using the -> here to avoid confusion):

date +"Today's date is%n%Y-%m-%d"
Today's date is
2015-11-04

(Note that the argument to date is quoted because it contains spaces.)

Using this in command substitution we get a different result:

today=$(date +"Today's date is%n%Y-%m-%d%n")
echo $today
-> Today's date is 2015-11-04

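The trailing newline has been removed by the command substitution. The embedded newline was still present in the variable, but because $today is not quoted in the echo command, word splitting has replaced it with a space.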
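The difference between the two kinds of newline shows up when the variable is quoted. This sketch uses printf rather than date so that the text is fixed, and it assumes the default value of IFS:

```bash
# Two lines of text followed by several trailing newlines.
text=$(printf 'line one\nline two\n\n\n')

echo $text                      # -> line one line two
echo "$text"                    # prints two lines; the newline between them
                                # survives, the trailing ones are gone
```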

As a final example, consider the following. A file words exists with one word per line. We want to construct a Bash loop which processes this file. To keep things simple we'll just echo each word followed by its length:

for w in $(cat words)
do
    echo "$w (${#w})"
done

The for loop is simply given a list of words from the file by virtue of the command substitution $(cat words), which it then places one at a time into the variable w. We use the construct ${#w} to determine the length as discussed in show 1648.

Some typical output might be:

bulkier (7)
laxness (7)
house (5)
overshoe (8)

There is an alternative (and faster) way of doing this without using cat:

for w in $(< words)
do
    echo "$w (${#w})"
done
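Both of these loops rely on word splitting, so a line containing a space would be handed to the loop as two separate words. A commonly used alternative, shown here only as a sketch, reads the file a line at a time with read. The temporary file and its contents are invented for the example:

```bash
# Make a small test file; note the middle line contains a space.
words_file=$(mktemp)
printf '%s\n' bulkier 'ice cream' house > "$words_file"

# Read whole lines: IFS= keeps leading and trailing blanks, -r keeps backslashes.
result=$(while IFS= read -r w
do
    echo "$w (${#w})"
done < "$words_file")

echo "$result"
rm -f "$words_file"
```

With this input the output is bulkier (7), ice cream (9) and house (5), one per line.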

This is a real example; I always test the commands in my notes to check I have not made any glaring mistakes. You might be interested to know how I generated the file of words:

for i in {1..10}
do
    w=$(shuf -n1 /usr/share/dict/words)
    w=${w%[^a-zA-Z]*}
    echo $w
done > words

The loop uses brace expansion as discussed in show 1884; it iterates 10 times. The shuf command is used to extract one line (word) at random from the system dictionary. Because many of these words have possessive forms, I wanted to strip the apostrophe and anything beyond it and I did that with an instance of Remove matching suffix pattern as discussed in show 1648. It removes the shortest suffix that starts with a non-alphabetic character, together with everything after that character.

The resulting word is simply echoed.

The entire loop redirects its output (a list of 10 words) into the file words. We might be visiting the subject of redirection in a later show in this (sub-)series.
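If your system has no /usr/share/dict/words, the suffix removal can still be tried on a fixed list. The three words below are chosen just to illustrate the pattern and are not from the dictionary file:

```bash
# The same suffix removal applied to a fixed list of words.
stripped=$(for w in "house's" laxness "O'Brien's"
do
    echo "${w%[^a-zA-Z]*}"
done)

echo "$stripped"
# -> house
# -> laxness
# -> O'Brien
```

The last word shows that only the shortest matching suffix is removed: the final 's goes, but the earlier apostrophe stays.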

Since shuf can return multiple random words at a time, and since the removal of extraneous characters could have been done in the echo, this example could also have been written as:

for w in $(shuf -n10 /usr/share/dict/words)
do
    echo ${w%[^a-zA-Z]*}
done > words

I tend to write short Bash loops of this sort on one line:

for w in $(shuf -n10 /usr/share/dict/words); do echo ${w%[^a-zA-Z]*}; done > words

Manual Page Extracts

EXPANSION

Expansion is performed on the command line after it has been split into words. There are seven kinds of expansion performed: brace expansion, tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, word splitting, and pathname expansion.

The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and pathname expansion.

On systems that can support it, there is an additional expansion available: process substitution. This is performed at the same time as tilde, parameter, variable, and arithmetic expansion and command substitution.

Only brace expansion, word splitting, and pathname expansion can change the number of words of the expansion; other expansions expand a single word to a single word. The only exceptions to this are the expansions of "$@" and "${name[@]}" as explained above (see PARAMETERS).

Brace Expansion

See the notes for HPR show 1884.

Tilde Expansion

If a word begins with an unquoted tilde character (~), all of the characters preceding the first unquoted slash (or all characters, if there is no unquoted slash) are considered a tilde-prefix. If none of the characters in the tilde-prefix are quoted, the characters in the tilde-prefix following the tilde are treated as a possible login name. If this login name is the null string, the tilde is replaced with the value of the shell parameter HOME. If HOME is unset, the home directory of the user executing the shell is substituted instead. Otherwise, the tilde-prefix is replaced with the home directory associated with the specified login name.

If the tilde-prefix is a ~+, the value of the shell variable PWD replaces the tilde-prefix. If the tilde-prefix is a ~-, the value of the shell variable OLDPWD, if it is set, is substituted. If the characters following the tilde in the tilde-prefix consist of a number N, optionally prefixed by a + or a -, the tilde-prefix is replaced with the corresponding element from the directory stack, as it would be displayed by the dirs builtin invoked with the tilde-prefix as an argument. If the characters following the tilde in the tilde-prefix consist of a number without a leading + or -, + is assumed.

If the login name is invalid, or the tilde expansion fails, the word is unchanged.

Each variable assignment is checked for unquoted tilde-prefixes immediately following a : or the first =. In these cases, tilde expansion is also performed. Consequently, one may use filenames with tildes in assignments to PATH, MAILPATH, and CDPATH, and the shell assigns the expanded value.

Parameter Expansion

See the notes for HPR show 1648.

Command Substitution

Command substitution allows the output of a command to replace the command name. There are two forms:

    $(command)
or
    `command`

Bash performs the expansion by executing command and replacing the command substitution with the standard output of the command, with any trailing newlines deleted. Embedded newlines are not deleted, but they may be removed during word splitting. The command substitution $(cat file) can be replaced by the equivalent but faster $(< file).

When the old-style backquote form of substitution is used, backslash retains its literal meaning except when followed by $, `, or \. The first backquote not preceded by a backslash terminates the command substitution. When using the $(command) form, all characters between the parentheses make up the command; none are treated specially.

Command substitutions may be nested. To nest when using the backquoted form, escape the inner backquotes with backslashes.

If the substitution appears within double quotes, word splitting and pathname expansion are not performed on the results.
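The following sketch is not part of the manual text. It illustrates nesting in both forms, using basename and dirname, which only manipulate the path as a string, so the file named need not exist:

```bash
# Nested $(command) form: the inner substitution is quoted independently.
parent=$(basename "$(dirname /usr/share/dict/words)")
echo "$parent"                  # -> dict

# The same thing with backquotes: the inner pair must be escaped.
old_style=`basename \`dirname /usr/share/dict/words\``
echo "$old_style"               # -> dict
```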