Bash Tips - 21 (HPR Show 3013)

Environment variables

Dave Morriss


The Environment (More collateral Bash tips)

Overview

You will probably have seen references to The Environment in various contexts relating to shells, shell scripts, scripts in other languages and compiled programs.

In Unix and Unix-like operating systems an environment is maintained by the shell, and we will be looking at how Bash deals with this in this episode. When a script, program or subprocess is invoked it is given an array of strings called the environment. This is a list of name-value pairs, of the form name=value.

Using the environment

The environment is used to convey various pieces of information to the executing script or program. For example, two standard variables provided by the shell are 'HOME', which is set to the current user’s home directory, and 'PWD', which is set to the current working directory. The shell user can set, change, remove and view environment variables for their own purposes as we will see in this episode. The Bash shell itself creates and in some cases manages environment variables.

The environment contains global data which is passed down to subprocesses (child processes) by copying. However, it is not possible for a subprocess to pass information back to the superior (parent) process.
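
A minimal sketch of this one-way copying, assuming a made-up variable called 'colour' (it is not one that Bash itself uses). The child process changes its own copy, and the parent is unaffected:

```shell
#!/usr/bin/env bash
# 'colour' is a hypothetical variable used only for this demonstration
export colour=red

# The child process receives a copy of 'colour', changes it and reports it
child_value=$(bash -c 'colour=blue; echo "$colour"')

echo "Child changed its copy to: $child_value"
echo "Parent still has:          $colour"
```

The only thing that comes back from the child here is its standard output, captured by the command substitution; its environment is discarded when it exits.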

Viewing the environment

You can view the environment in a number of ways.

  • From the command line the command printenv can do this (this is usually but not always a stand-alone command: it’s /usr/bin/printenv on my Debian system). We will look at this command later.

  • The command env without any arguments does the same thing as printenv without arguments. This is actually a tool to run a program in a modified environment which we will look at later. The environment printing capability can be regarded as more of a bonus feature.

  • Scripting languages like awk (as well as Python and Perl, to name just a few) can view and manipulate the environment.

  • Compiled languages such as C can do this too of course.

  • There are other commands that will show the environment, and we will look at some of these briefly.

Changing variables in the environment

The variables in the environment are not significantly different from the shell parameters we have seen throughout this Bash Tips series. The only difference is that they are marked for export to commands and sub-shells. You will often see variables (or parameters) in the environment referred to as environment variables. The Bash manual makes a distinction between ordinary parameters (variables) and environment variables, but many other sources are less precise about this in my experience.

The standard variables in the environment have upper-case names (HOME, SHELL, PWD, etc), but there is no reason why a variable you create should not be in lower or mixed case. In fact, the Bash manual suggests that you should avoid using all upper-case names so as not to clash with Bash’s variables.

Variables can be created and changed a number of ways.

  • They can be set up at login time (globally or locally) through various standard configuration files. We plan to look at this subject in an upcoming episode, so we will leave the discussion until then.
  • By preceding the command or script invocation with name=value expressions which will temporarily place these variables into the environment for the command
  • Using the export command
  • Using the declare command with the -x option
  • The value of an environment variable (once established) can be changed at any time in the shell with a command like myvar=42, just as for a normal variable
  • The export command can also be used to turn off the export marker on a variable (with its -n option)
  • Deletion is performed with the unset command (as seen earlier in the series)

We will look at all of these features in more detail later in the episode.

A detailed look

Temporary addition to a command’s environment

As summarised above, a command can be preceded by name=value definitions, and these set environment variables while the command is running.

For example, if an awk script has been placed in /tmp like this:

$ cat > /tmp/awktest.awk
BEGIN { print "Hello World!" }
CTRL+D

(where CTRL+D means to press D while holding down the CTRL key).

It is now possible to invoke awk to execute this file by giving it the environment variable AWKPATH. This is a list of directories where awk looks to find script files.

$ AWKPATH=/tmp awk -f awktest
Hello World!

Note that:

  1. The file is found even though the '.awk' part has been omitted; this is something that GNU awk (gawk) does when searching AWKPATH
  2. The setting of AWKPATH is separated from the awk command by a space - not a semi-colon. (If a semi-colon had been used then there would have been two statements on the line, which would not have achieved what was wanted.)
  3. The variable AWKPATH is not changed in the parent process (the process that ran awk); the change is temporary, applies only to the child process, and lasts only as long as the command runs
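
The difference between the space and the semi-colon in point 2 can be sketched as follows. The variable 'greeting' is made up for the purpose, and a child Bash stands in for awk so that the example does not depend on any files:

```shell
#!/usr/bin/env bash
# 'greeting' is a hypothetical variable used only for this demonstration
unset greeting

# Space: the assignment applies only to the environment of the child
with_space=$(greeting=hello bash -c 'echo "${greeting:-unset}"')
after_space=${greeting:-unset}

# Semi-colon: an ordinary (unexported) shell variable is set first, then the
# child is run without that variable in its environment
greeting=hello; with_semicolon=$(bash -c 'echo "${greeting:-unset}"')

echo "With a space, child sees:      $with_space"
echo "Afterwards, parent sees:       $after_space"
echo "With a semi-colon, child sees: $with_semicolon"
```

In the semi-colon case the variable does exist in the parent shell afterwards, but because it was never exported the child knows nothing about it.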

Commands relating to the environment

The printenv command

The printenv command without arguments lists all the environment variables. It may be followed by a list of variable names, in which case the output is restricted to these variables.

When no arguments are given the output consists of name=value pairs, but if variable names are specified just the values are listed.
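
A short sketch of the two output styles, using a made-up variable 'bt21_demo'. The last part assumes GNU printenv, which returns a non-zero exit status when a requested name is not in the environment:

```shell
#!/usr/bin/env bash
# 'bt21_demo' and 'bt21_missing' are hypothetical names used only here
export bt21_demo=42

# No arguments: name=value pairs, filtered here with grep
pair=$(printenv | grep '^bt21_demo=')

# With a name: just the value
value=$(printenv bt21_demo)

# A name that is not in the environment gives no output and (with GNU
# printenv) a non-zero exit status
unset bt21_missing
if printenv bt21_missing; then
    missing=found
else
    missing=absent
fi

echo "$pair"
echo "$value"
echo "bt21_missing is $missing"
```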

This command might be built into the shell, but this is not the case with Bash.

The env command

This is a stand-alone command (part of GNU Coreutils, not a Bash builtin) which will print a list of environment variables or run another command in an altered environment.

env [OPTION]... [-] [NAME=VALUE]... [COMMAND [ARG]...]

Without the COMMAND part env is functionally equivalent to printenv.

Options are:

Option                         Meaning
-, -i, --ignore-environment    start with an empty environment
-0 (zero), --null              end each output line with a 0 byte rather than a newline
-u, --unset=NAME               remove variable NAME from the environment
-C, --chdir=DIR                change working directory to DIR
-S, --split-string=S           process and split S into separate arguments; used to pass multiple arguments on shebang lines
-v, --debug                    show verbose information for each processing step

The NAME=VALUE part is where environment variables are defined for the command being run.

The env command is often used in shell scripts to run the correct interpreter on the hash bang or shebang line (the first line of the file which begins with '#!') without needing to know its path. It is necessary to know the path of env but this is usually (almost invariably) /usr/bin/env.

For example, to run a python3 script you might begin with:

#!/usr/bin/env python3

The -S option is required if the interpreter needs options of its own. For example:

$ cat awktest1
#!/usr/bin/env awk -f
BEGIN{ print "Hello World" }
$ ./awktest1
/usr/bin/env: ‘awk -f’: No such file or directory
/usr/bin/env: use -[v]S to pass options in shebang lines

$ cat awktest2
#!/usr/bin/env -S awk -f
BEGIN{ print "Hello World" }
$ ./awktest2
Hello World

Script awktest1 fails because the whole of 'awk -f' reaches env as a single argument, which it takes to be the name of the command to run, whereas awktest2, which uses -S, works fine.

The env command can be run from the command line as a means of running a command with a special environment. For example:

$ env MSG="Hello" printenv MSG
Hello
$ printenv MSG

This defines environment variable MSG and runs 'printenv MSG' to show its value. The variable exists only in the environment of that command and is gone as soon as the command has finished, so the second printenv prints nothing.

This can also be demonstrated thus:

$ env -vi MSG="Hello" printenv
cleaning environ
setenv:   MSG=Hello
executing: printenv
   arg[0]= ‘printenv’
MSG=Hello

Here debug mode is on, and the '-i' option clears the environment of all but MSG. We don’t specify MSG as an argument this time; it’s unnecessary because that variable is all there is in the child environment.
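
The '-u' option from the table of options works the other way round, removing one variable and leaving the rest. A minimal sketch, assuming a made-up variable 'bt21_token' and a child Bash to report what it sees:

```shell
#!/usr/bin/env bash
# 'bt21_token' is a hypothetical variable used only for this demonstration
export bt21_token=secret

# Run a child with the variable removed from its environment
without=$(env -u bt21_token bash -c 'echo "${bt21_token:-unset}"')

# Run a child normally for comparison
with=$(bash -c 'echo "${bt21_token:-unset}"')

echo "Child run through 'env -u' sees: $without"
echo "Child run normally sees:         $with"
echo "Parent still has:                $bt21_token"
```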

Consult the GNU Coreutils Manual for more details of the env command. Note that the version of env being described here is 8.30.

The declare command

As we saw earlier in this series, declare can be used to create variables (and arrays). If the '-x' option is added to the command then the variables created are also marked for export to the environment used by subsequent commands. Note that arrays cannot be exported in any current versions of Bash. It was apparently planned to do this in the past, but it has not been implemented.

The option '+x' does the opposite: it removes the export attribute from a variable, which otherwise remains defined in the shell.

As expected declare -p can be used to show the declare command required to create a variable and the same applies to looking at environment variables using declare -p -x.
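
These three uses of declare can be sketched together. The variable 'bt21_level' is made up for the demonstration, and a child Bash is used to check what actually reaches the environment:

```shell
#!/usr/bin/env bash
# 'bt21_level' is a hypothetical variable used only for this demonstration

# Create the variable and mark it for export in one step
declare -x bt21_level=high
exported=$(declare -p bt21_level)
child_sees=$(bash -c 'echo "${bt21_level:-unset}"')

# Remove the export attribute; the variable itself remains in this shell
declare +x bt21_level
plain=$(declare -p bt21_level)
child_sees_now=$(bash -c 'echo "${bt21_level:-unset}"')

echo "$exported   (child sees: $child_sees)"
echo "$plain   (child sees: $child_sees_now)"
```

Note that this is written to run at the top level of a script. Inside a function declare creates local variables unless the '-g' option is used.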

The export command

This command marks variables to be passed to child processes in the environment.

export [-fn] [-p] [name[=value] ...]

By default the names refer to shell variables, which can be given a value if desired.

Options are:

Option    Meaning
-n        the name arguments are marked not to be exported
-f        the name arguments are the names of defined shell functions
-p        displays output in a form that may be reused as input (see declare)

Writing 'export -p' with no arguments causes the environment to be displayed in a similar way to 'declare -x -p'.
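
The '-n' and '-f' options are perhaps easier to follow with an example. This is only a sketch; the names 'bt21_flag' and 'bt21_hello' are made up, and a child Bash is used to show what has been passed on:

```shell
#!/usr/bin/env bash
# 'bt21_flag' and 'bt21_hello' are hypothetical names used only for this sketch

# Export a variable, then remove the export marker with -n
export bt21_flag=on
before=$(bash -c 'echo "${bt21_flag:-unset}"')
export -n bt21_flag
after=$(bash -c 'echo "${bt21_flag:-unset}"')

# Export a function with -f so that a child Bash can call it
bt21_hello() { echo "Hello from an exported function"; }
export -f bt21_hello
from_child=$(bash -c 'bt21_hello')

echo "Before 'export -n' the child sees: $before"
echo "After 'export -n' the child sees:  $after"
echo "In this shell it is still:         $bt21_flag"
echo "$from_child"
```

Exported functions are a Bash feature, so the child has to be Bash as well for the last part to work.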

Looking again at the awk example from earlier we could use export to set AWKPATH, but the setting will persist after the process running awk has finished:

$ export AWKPATH=/tmp
$ awk -f awktest
Hello World!
$ echo $AWKPATH
/tmp

You might see export being used in the following way in older scripts:

TZ='Europe/London'; export TZ

This is perfectly acceptable, but the single-statement form is most common in more recent scripts.

The set command

We have looked at some of the features that this command offers in other contexts but have not yet examined it in detail. This detailed analysis is overdue, but (for brevity) will be left until a later episode of Bash Tips.

For now we will look at set in the context of environment variables.

When set is used without any options or arguments then it performs the following function (quoted from the GNU Bash Manual):

set displays the names and values of all shell variables and functions, sorted according to the current locale, in a format that may be reused as input for setting or resetting the currently-set variables.

This produces a significant amount of output, and it includes functions and shell variables that have not been exported, so it is not a recommended way of examining the environment.
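
A small sketch of why set and env give different answers, assuming two made-up variables with the prefix 'bt21demo_', only one of which is exported:

```shell
#!/usr/bin/env bash
# The 'bt21demo_' names are hypothetical, used only for this demonstration
bt21demo_plain=1
export bt21demo_exported=2

# 'set' lists both variables; 'env' lists only the exported one
in_set=$(set | grep -c '^bt21demo_')
in_env=$(env | grep -c '^bt21demo_')

echo "Listed by set: $in_set"
echo "Listed by env: $in_env"
```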

The '-k' option performs the following function (again a quote):

All arguments in the form of assignment statements are placed in the environment for a command, not just those that precede the command name.

Example: using set -k

What this means is actually quite simple. The following example demonstrates what setting '-k' does. The script bash21_ex1.sh is fairly simple:

#!/usr/bin/env bash

#-------------------------------------------------------------------------------
# Example 1 for Bash Tips show 21: the environment
#-------------------------------------------------------------------------------

# Not expected to be in the environment
bt211C=somedata

echo "Args: $*"

printenv

exit

I’m calling the variables associated with this script 'bt211[ABC]' so they are easier to find in the environment listing. To prove that defining variables in the script does not affect the environment we define one (bt211C) for later examination. We then echo any arguments given to the script. Finally we use printenv to show the environment.

Running this script like this results in:

$ set -k
$ bt211A=42 ./bash21_ex1.sh arg1 arg2 bt211B=99 | grep -E '^(Args|bt211)'
Args: arg1 arg2
bt211B=99
bt211A=42

What is happening here:

  1. We set the option on with set -k
  2. We invoke the script bash21_ex1.sh preceded by bt211A=42 which will cause an environment variable of that name to be created in the process that is initiated
  3. We give the script two arguments arg1 and arg2 and we add in another variable assignment bt211B=99. This should be placed in the environment now that set -k is enabled; if it weren’t then this string would just be treated as an argument
  4. The output from the script (arguments and data from printenv) is piped through grep which selects anything that starts with 'Args' or 'bt211'
  5. We see the two arguments echoed by the script and the two environment variables - the other variable bt211C is not shown because it is not an environment variable.

As with all of the single-letter options to set this one can be turned off again with 'set +k' (a little counter-intuitive, but that’s how it works).

Using environment variables

You will probably have seen references to environment variables when reading man pages. We have already seen how awk (gawk) can be made to behave differently when given certain variables such as AWKPATH. The same applies to a number of commands, and there is often a section describing such variables in the man page for the command.

Many commands depend on configuration files rather than environment variables nowadays, though it is not uncommon to see environment variables being used as a way to indicate a non-standard location for the configuration files. For example, the GNU Privacy Guard gpg command uses GNUPGHOME to specify a directory to be used instead of the default '~/.gnupg'. Also with the command-line tool for the PostgreSQL database, psql, there are several environment variables that can be set to provide defaults if necessary, for example: PGDATABASE, PGHOST, PGPORT and PGUSER.
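
A script of your own can offer the same kind of override with a default-value parameter expansion. The following is only a sketch: 'MYTOOL_HOME' and '~/.mytool' are invented names that mirror the way GNUPGHOME overrides '~/.gnupg', and do not belong to any real tool.

```shell
#!/usr/bin/env bash
# 'MYTOOL_HOME' and '.mytool' are hypothetical names, not those of a real tool
config_dir() {
    # Use the environment variable if it is set and non-empty, otherwise
    # fall back to the default location in the home directory
    printf '%s\n' "${MYTOOL_HOME:-$HOME/.mytool}"
}

# Without the variable the default is used
unset MYTOOL_HOME
default_dir=$(config_dir)

# With a temporary setting the override is used
override_dir=$(MYTOOL_HOME=/tmp/mytool config_dir)

echo "Default:  $default_dir"
echo "Override: $override_dir"
```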

In general environment variables are used:

  1. To pass information about the login environment, such as the particular shell, the desktop and the user. For example, 'SHELL' contains the current shell (such as /bin/bash), 'DESKTOP_SESSION' defines the chosen desktop environment (such as xfce) and 'USER' defines the current username. These values are created during login and can be controlled where appropriate by the shell’s configuration files.

  2. To pass relevant information to scripts, commands and programs. These variables can be set in the shell’s configuration file(s) or on the command line, either temporarily or permanently. We have seen the ways we can set environment variables permanently and temporarily.

However, it can be argued that any complex software system is better controlled through configuration files than through environment variables. It is common to see the YAML or JSON formats being used to set up configuration files, as well as other file formats. This method allows many settings to be controlled in one place, whereas using environment variables would require many separate variable definitions. On the other hand, environment variables are simpler to manage than having to deal with YAML or JSON formats.

Examples

Various ways of displaying the environment

#!/usr/bin/env bash

#-------------------------------------------------------------------------------
# Example 2 for Bash Tips show 21: the environment
#-------------------------------------------------------------------------------

BTversion='21'
export BTversion

echo "** Using 'grep' with 'env'"
env | grep -E '(EDITOR|SHELL|BTversion)='
echo

echo "** Using 'printenv' with arguments"
printenv EDITOR SHELL BTversion
echo

echo "** Using 'grep' with 'export'"
export | grep -E '(EDITOR|SHELL|BTversion)='
echo

echo "** Using 'grep' with 'declare'"
declare -x | grep -E '(EDITOR|SHELL|BTversion)='

exit

Running the script (bash21_ex2.sh) generates the following output:

** Using 'grep' with 'env'
SHELL=/bin/bash
EDITOR=/usr/bin/vim
BTversion=21

** Using 'printenv' with arguments
/usr/bin/vim
/bin/bash
21

** Using 'grep' with 'export'
declare -x BTversion="21"
declare -x EDITOR="/usr/bin/vim"
declare -x SHELL="/bin/bash"

** Using 'grep' with 'declare'
declare -x BTversion="21"
declare -x EDITOR="/usr/bin/vim"
declare -x SHELL="/bin/bash"

This example shows how environment variable values can be examined with env, printenv, export and declare. I will leave you to investigate set if you wish, though it’s not the ideal way to find such information.

Accessing the environment in an awk script

#!/usr/bin/awk -f

#-------------------------------------------------------------------------------
# Example 3 for Bash Tips show 21: printing the environment in Awk
#-------------------------------------------------------------------------------

BEGIN{
    for (n in ENVIRON)
        printf "ENVIRON[%s]=%s\n",n,ENVIRON[n]
}

Running the script (bash21_ex3.awk) generates the following output in my particular case:

$ ./bash21_ex3.awk
ENVIRON[AWKPATH]=.:/usr/share/awk
ENVIRON[OLDPWD]=/home/hprdemo
ENVIRON[XDG_SESSION_CLASS]=user
ENVIRON[AWKLIBPATH]=/usr/lib/x86_64-linux-gnu/gawk
ENVIRON[LANG]=en_GB.UTF-8
ENVIRON[XDG_RUNTIME_DIR]=/run/user/1001
ENVIRON[USER]=hprdemo
ENVIRON[LANGUAGE]=en_GB:en
ENVIRON[_]=./bash21_ex3.awk
ENVIRON[SHELL]=/bin/bash
ENVIRON[XDG_SESSION_ID]=57
ENVIRON[SSH_CONNECTION]=::1 55674 ::1 22
ENVIRON[PATH]=/usr/local/bin:/usr/bin:/bin:/usr/games
ENVIRON[SSH_CLIENT]=::1 55674 22
ENVIRON[HOME]=/home/hprdemo
ENVIRON[PWD]=/home/hprdemo/BashTips
ENVIRON[SHLVL]=1
ENVIRON[XDG_SESSION_TYPE]=tty
ENVIRON[LOGNAME]=hprdemo

If you were to run the above script yourself you would see different values (!) and very likely a lot more of them.

Nerdy digression! Ignore if not interested! The way I demonstrate scripts for HPR shows is complicated since I usually run the scripts from the notes while they are being rendered to be sure that the output I show is really correct! This one was actually run over ssh under the local user hprdemo, which has been tailored for such demonstrations, so the environment is not typical.
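
If only one or two variables are of interest, the ENVIRON array can be indexed directly rather than looped over. A minimal sketch called from Bash, assuming made-up variables 'bt21_city' and 'bt21_nowhere':

```shell
#!/usr/bin/env bash
# 'bt21_city' and 'bt21_nowhere' are hypothetical names used only here

# Pass a temporary environment variable to awk and look it up by name
city=$(bt21_city=Edinburgh awk 'BEGIN { print ENVIRON["bt21_city"] }')

# Test whether a name exists in the environment at all
unset bt21_nowhere
present=$(awk 'BEGIN { print (("bt21_nowhere" in ENVIRON) ? "yes" : "no") }')

echo "awk saw bt21_city as: $city"
echo "bt21_nowhere present: $present"
```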

Passing temporary environment variables

#!/usr/bin/env bash

#-------------------------------------------------------------------------------
# Example 4 for Bash Tips show 21: a way of showing environment variables
#-------------------------------------------------------------------------------

#
# We expect one or more arguments
#
if [[ $# = 0 ]]; then
    echo "Usage: $0 variable_name"
    exit 1
fi

#
# Loop through the arguments reporting their attributes with 'declare'
#
for arg; do
    declare -p "$arg"
done

exit

This simple script (bash21_ex4.sh) allows the demonstration of the existence of selected variables in the environment. First we look at the SHELL variable, managed by Bash:

$ ./bash21_ex4.sh SHELL
declare -x SHELL="/bin/bash"

Now we generate our own temporary environment variables and report them:

$ compass=north weather=wet ./bash21_ex4.sh compass weather
declare -x compass="north"
declare -x weather="wet"

Using export in Bash configuration files

We will be looking at the configuration files that can be used to control your instance of Bash in a later show. The following example (bash21_ex5.sh) shows some of the environment variables I have defined in mine:

#-------------------------------------------------------------------------------
# Example 5 for Bash Tips show 21: a few things exported in my .bashrc
#-------------------------------------------------------------------------------
#
# The PATH variable gets my local ~/bin directory added to it
#
export PATH="${PATH}:$HOME/bin"
#
# The above is the older way of doing this. It is possible to write the
# following using '+=' to concatenate a string onto a variable:
# export PATH+=":$HOME/bin"

#
# Some tools need a default editor. The only one for me is Vim
#
export EDITOR=/usr/bin/vim
export VISUAL=/usr/bin/vim
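
The two ways of extending PATH mentioned in the comments can be compared without touching the real PATH. In this sketch 'demo_path' stands in for PATH and '/home/user/bin' is a made-up directory:

```shell
#!/usr/bin/env bash
# 'demo_path', 'older' and 'newer' are hypothetical names; the real PATH is
# deliberately left alone
demo_path='/usr/local/bin:/usr/bin'

# Older form: rebuild the whole string
export older="${demo_path}:/home/user/bin"

# Newer form: append to the existing value with '+='
newer=$demo_path
export newer+=':/home/user/bin'

echo "$older"
echo "$newer"
```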

A simple configuration file for a Bash script

Note: the following example is quite detailed and somewhat convoluted. You might prefer to skip it; you will probably not lose much by doing so!

In the script I use to manage episodes I submit to HPR I use a simple configuration file. The script was begun in 2013 and was intended to be entirely implemented in Bash. At that time I came up with the idea of creating a file of export commands to define a collection of variables. The following is an example called bash21_ex6_1.cfg which contains settings for this episode of Bash Tips (slightly edited):

export _CFG_PROJECT="Bash_Tips__21"
export _CFG_HOSTID=225
export _CFG_HOSTNAME="Dave Morriss"
export _CFG_SUMMARY="Environment variables"
export _CFG_TAGS="Bash,variable,environment"
export _CFG_EXPLICIT="Yes"
export _CFG_FILES=(hpr____.html hpr____.tbz)
export _CFG_STRUCTURE="Tree"
export _CFG_SERIES="Bash Scripting"
export _CFG_NOTETYPE="HTML"
export _CFG_SUMADDED="No"
export _CFG_INOUT="No"
export _CFG_EMAIL="blah@blah"
export _CFG_TITLE="Bash Tips - 21"
export _CFG_STATUS="Editing"
export _CFG_SLOT=""

This idea works in a limited way. Using the 'source' command on the file will cause all of the export statements to be obeyed and the variables will be placed in the environment. There is one exception though; the definition of '_CFG_FILES' does not result in an environment variable because it’s an array and Bash does not support arrays in the environment. However, an array is created as an ordinary variable.
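
The different treatment of arrays and simple variables can be seen in a short sketch. The names 'bt21_list' and 'bt21_text' are made up, and a child Bash reports what arrived in its environment:

```shell
#!/usr/bin/env bash
# 'bt21_list' and 'bt21_text' are hypothetical names used only for this sketch
export bt21_list=(one two)
export bt21_text=plain

# Ask a child Bash what it received in its environment
child_list=$(bash -c 'echo "${bt21_list:-unset}"')
child_text=$(bash -c 'echo "${bt21_text:-unset}"')

echo "Child sees the array as:  $child_list"
echo "Child sees the string as: $child_text"
echo "Parent array still holds: ${bt21_list[*]}"
```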

Originally I expected that I would need to access these environment variables in sub-processes or using Awk or Perl scripts. In the latter two cases the variables must be in the environment, but I found I didn’t need to do this in fact.

The demonstration script bash21_ex6.sh takes these variables and generates a new configuration file from them using declare statements:

#!/usr/bin/env bash

#-------------------------------------------------------------------------------
# Example 6 for Bash Tips show 21: a poor way to make a configuration file
#-------------------------------------------------------------------------------

#
# Example configuration file using 'export'
#
CFG1='bash21_ex6_1.cfg'
if [[ ! -e $CFG1 ]]; then
    echo "Unable to find $CFG1"
    exit 1
fi

#
# Alternative configuration file using 'declare', converted from the other one
#
CFG2='bash21_ex6_2.cfg'

#
# Strip out all of the 'export' commands with 'sed' in a process substitution,
# turning the lines into simple variable declarations. Use 'source' to obey
# all of the resulting commands
#
source <(sed 's/^export //' $CFG1)

#
# Scan the (simple) variables beginning with '_CFG_' and convert them into
# a portable form by saving the output of 'declare -p'
#
declare -p "${!_CFG_@}" > $CFG2

#
# Now next time we can 'source' this file instead when we want the variables
#
cat $CFG2

exit

The variable 'CFG1' contains the name of the file of export commands, bash21_ex6_1.cfg. Rather than placing all of these variables into the environment the script strips the 'export' string from each line, making them simple assignments. The result is processed using the source command, taking its input from a process substitution running 'sed', and then the second configuration file is created, bash21_ex6_2.cfg.

The declare command needs explanation:

declare -p "${!_CFG_@}" > $CFG2

It uses 'declare -p' to print out declare statements, redirecting them to the new configuration file whose name is in variable 'CFG2'. The expression "${!_CFG_@}", which finds all the variables, uses a Bash parameter expansion feature which we looked at back in HPR episode 1648. It returns the names of all variables whose names begin with '_CFG_'. The expression ends with '@' which (as when expanding arrays) causes a list of distinct arguments to be generated rather than one single long string.
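
The prefix-matching expansion can be tried on its own. This sketch assumes two made-up variables with the prefix 'bt21cfg_' and collects the matching names in an array:

```shell
#!/usr/bin/env bash
# The 'bt21cfg_' names are hypothetical, used only for this demonstration
bt21cfg_title='Bash Tips'
bt21cfg_number=21

# "${!prefix@}" in double quotes expands to one word per matching name
names=( "${!bt21cfg_@}" )

echo "Found ${#names[@]} variables:"
printf '  %s\n' "${names[@]}"
```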

The script lists the contents of the file bash21_ex6_2.cfg to demonstrate what has happened.

Running this script results in the following:

$ ./bash21_ex6.sh
declare -- _CFG_EMAIL="blah@blah"
declare -- _CFG_EXPLICIT="Yes"
declare -a _CFG_FILES=([0]="hpr____.html" [1]="hpr____.tbz")
declare -- _CFG_HOSTID="225"
declare -- _CFG_HOSTNAME="Dave Morriss"
declare -- _CFG_INOUT="No"
declare -- _CFG_NOTETYPE="HTML"
declare -- _CFG_PROJECT="Bash_Tips__21"
declare -- _CFG_SERIES="Bash Scripting"
declare -- _CFG_SLOT=""
declare -- _CFG_STATUS="Editing"
declare -- _CFG_STRUCTURE="Tree"
declare -- _CFG_SUMADDED="No"
declare -- _CFG_SUMMARY="Environment variables"
declare -- _CFG_TAGS="Bash,variable,environment"
declare -- _CFG_TITLE="Bash Tips - 21"

Note that nothing is marked with '-x' because none of the variables had been exported to the environment, least of all the array (which can’t be!). Note also that the array has been handled properly by 'declare -p', so the output file could be used to back up and restore it. This is a safer format than the original file of export commands.
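
The backup and restore idea can be sketched as a round trip through a temporary file. The variable names here are made up, and mktemp is assumed to be available to create the file:

```shell
#!/usr/bin/env bash
# 'bt21_files' and 'bt21_title' are hypothetical names used only for this
# sketch; the temporary file stands in for a configuration file
tmpfile=$(mktemp)

bt21_files=(notes.html "audio file.tbz")
bt21_title='Bash Tips - 21'

# Save the variables as 'declare' statements
declare -p bt21_files bt21_title > "$tmpfile"

# Remove them, then restore them by sourcing the saved file
unset bt21_files bt21_title
source "$tmpfile"
rm -f "$tmpfile"

echo "Title: $bt21_title"
echo "Files: ${#bt21_files[@]}"
printf '  %s\n' "${bt21_files[@]}"
```

The array element containing a space survives the round trip because 'declare -p' quotes each element. As before, this is meant to be run at the top level of a script; sourcing the file inside a function would create local variables.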