Bash Tips - 19 (HPR Show 2739)

Arrays in Bash (part 4)

Dave Morriss


Arrays in Bash

This is the fourth and last of a small group of shows on the subject of arrays in Bash. It is also the nineteenth show in the Bash Tips sub-series.

In the last show we continued with the subject of parameter expansion in the context of arrays. There are other aspects of this that could be looked at, but we’ll leave it for the moment and may revisit it in the future.

In this episode we will look in more depth at the declare (typeset) builtin command and at some related commands (readonly and local). We will also look at the commands that assist with loading data into arrays: mapfile (readarray) and read.

The declare (typeset) command in more detail

The 'declare' command is a Bash builtin used to declare variables and give them attributes. This includes arrays, as we have seen.

The command 'typeset' is a synonym for 'declare' supplied for compatibility with the Korn shell.

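We will look at some of the options to 'declare' but will restrict ourselves largely to those relevant to arrays. Most attributes are turned on with an option starting with '-' and turned off with the same option starting with '+' (slightly confusingly). The exceptions are '-r', which cannot be turned off once set, and '-a' (and '-A'), because '+a' cannot be used to destroy an array.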

The '-i' option

This makes the variable behave as an integer. Arithmetic evaluation is performed when the variable is assigned a value (as if the assignment is inside '(())').

I have included a downloadable script bash19_ex1.sh which demonstrates what an array declared in this way can do:

#!/bin/bash

#-------------------------------------------------------------------------------
# Example 1 for Bash Tips show 19: how an integer array works
#-------------------------------------------------------------------------------

#
# Declare an integer array and a normal one
#
declare -a -i ints
declare -a norm

#
# Load both with arithmetic expressions
#
ints=('38 % 7' '38 / 7')
norm=('38 % 7' '38 / 7')

#
# Try storing a string in each of the arrays
#
ints+=('jellyfish')
norm+=('jellyfish')

#
# Show the results
#
echo "ints: ${ints[*]}"
echo "norm: ${norm[*]}"


exit

Running the script generates the following output:

ints: 3 5 0
norm: 38 % 7 38 / 7 jellyfish

The integer array stored the results of the expressions, but treated the string as zero. The other array stored the same expressions as strings.
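The zero appears because, in an arithmetic context, a bare word such as 'jellyfish' is taken to be a variable name, and an unset variable evaluates to 0. The following is a minimal sketch of that behaviour (it is not one of the downloadable scripts, and the values are chosen only for illustration):

```shell
#!/bin/bash

# An integer array, as in the script above
declare -a -i ints

# 'jellyfish' is unset at this point, so it evaluates to 0
ints+=('jellyfish')

# Give 'jellyfish' a numeric value and try again
jellyfish=42
ints+=('jellyfish' 'jellyfish / 2')

echo "ints: ${ints[*]}"
```

This should print 'ints: 0 42 21': the same word gives a different result once a variable of that name holds a number.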

The '-l' and '-u' options

These options make the variables force a lower case ('-l') or upper case ('-u') conversion on whatever is assigned to them. Only one can be set at a time of course!
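As a short sketch of how these attributes behave with arrays (the array names and contents here are made up for the purpose), the conversion is applied to every element as it is stored, and turning the attribute off with '+' only affects later assignments:

```shell
#!/bin/bash

# One array that upper-cases its contents and one that lower-cases them
declare -a -u shout
declare -a -l whisper

shout=('Hacker' 'public' 'RADIO')
whisper=('Hacker' 'public' 'RADIO')

echo "shout:   ${shout[*]}"
echo "whisper: ${whisper[*]}"

# Turn the attribute off with '+u'. Existing elements are left alone but
# new ones are stored as given
declare +u shout
shout+=('Bash')
echo "shout:   ${shout[*]}"
```

The first two lines of output should be 'HACKER PUBLIC RADIO' and 'hacker public radio', and the last should end with 'Bash' in its original mixed case.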

The '-r' option

This option makes the names being declared readonly. This means that they should be given a value when they are created, because they cannot be assigned further values afterwards, and the attribute cannot be turned off once set. This is a way of creating constants in Bash:

$ declare -ir sixthprime=13
$ sixthprime=17
bash: sixthprime: readonly variable
$ declare +r sixthprime
bash: declare: sixthprime: readonly variable

The readonly command

This command is equivalent to 'declare -r' discussed above. It takes the options '-a', '-A', '-f' and '-p' only, and they have the same meaning as they do in 'declare' (we haven’t looked at '-f' yet though). The command comes from the Bourne shell.
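A small sketch of 'readonly' used with '-a' to make a constant array (the array name is invented for this example). It uses 'declare -p' to display the attributes, and tests the attempt to remove the attribute with 'declare +r' rather than assigning to the array, since 'declare' simply returns a failure status:

```shell
#!/bin/bash

# Create a read-only indexed array in one step
readonly -a primes=(2 3 5 7 11 13)

# 'declare -p' shows the attributes: 'a' (indexed array) and 'r' (readonly)
declare -p primes

# As with 'declare -r', the attribute cannot be removed. 'declare' reports
# an error (hidden here) and returns a non-zero status
if ! declare +r primes 2>/dev/null; then
    echo "primes is still readonly"
fi
```

The 'declare -p' output should begin with 'declare -ar primes=', confirming that both attributes are set.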

The local command

The 'local' command takes the same options and types of arguments as 'declare' but is only for use inside functions where it creates variables local to the function (invisible outside it). We will look at it in more detail in a later episode when we deal with functions in Bash.
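Until then, here is a brief sketch of a local array, assuming a made-up function called 'squares'. It combines '-a' and '-i' in the same way as 'declare' does:

```shell
#!/bin/bash

# Hypothetical function: store the squares of its arguments in a local array
squares () {
    local -a -i sq      # integer array, visible only inside the function
    local n
    for n in "$@"; do
        sq+=("n * n")   # evaluated arithmetically because of '-i'
    done
    echo "inside:  ${sq[*]}"
}

squares 2 3 4

# Outside the function the array does not exist
echo "outside: ${#sq[*]} elements"
```

This should report '4 9 16' inside the function and zero elements outside it.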

The mapfile (readarray) command

This command reads lines from standard input into an indexed array. It can also read from a file descriptor, a subject we have not looked at yet.

The 'readarray' command is a synonym for 'mapfile'.

The command syntax is (from the GNU Bash manual):

mapfile [-d delim] [-n count] [-O origin] [-s count] [-t] [-u fd]
    [-C callback] [-c quantum] [array]

The items in square brackets are optional.

Option        Explanation
-d delim      The first character of delim is used to terminate each input line, rather than newline.
-n count      Read a maximum of count lines. If count is zero, all available lines are copied.
-O origin     Begin writing lines to array at index number origin. The default value is zero.
-s count      Discard the first count lines before writing to array.
-t            Remove a trailing delim (default newline) from each line read.
-u fd         Read lines from file descriptor fd rather than standard input.
-C callback   Execute/evaluate a function/expression, callback, every time quantum lines are read. The -c option specifies quantum.
-c quantum    Specify the number of lines, quantum, after which function/expression callback should be executed/evaluated if specified with -C. Default is 5000.
array         The name of the array variable where lines should be written. If the array argument is omitted the data is loaded into an array called 'MAPFILE'.

When callback is evaluated, it is supplied the index of the next array element to be assigned and the line to be assigned to that element as additional arguments. The callback function or expression is evaluated after the line is read but before the array element is assigned.
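To make the two callback arguments concrete, here is a minimal sketch using a function as the callback. The function name 'report' and the input data are assumptions made for this illustration:

```shell
#!/bin/bash

# Hypothetical callback: $1 is the index about to be assigned, $2 is the line
report () {
    printf 'index %d will receive "%s"\n' "$1" "$2"
}

# Call 'report' after every second line read
mapfile -t -C report -c 2 letters < <(printf '%s\n' a b c d e)

echo "letters: ${letters[*]}"
```

With a quantum of 2 the callback should fire for indices 1 and 3 (lines 'b' and 'd'), after which the array holds all five letters.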

If not supplied with an explicit origin, 'mapfile' will clear array before assigning to it.
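The effect of '-s', '-n' and '-O' can be sketched as follows. The array name and the words being loaded are invented for the example:

```shell
#!/bin/bash

# Skip the first two lines, then keep at most three
mapfile -t -s 2 -n 3 nums < <(printf '%s\n' one two three four five six)
echo "nums: ${nums[*]}"

# Without '-O' the array is cleared first, so only the new lines remain
mapfile -t nums < <(printf '%s\n' seven eight)
echo "nums: ${nums[*]}"

# With '-O' the existing elements are kept. Starting at the current length
# appends the new lines to the end
mapfile -t -O "${#nums[*]}" nums < <(printf '%s\n' nine ten)
echo "nums: ${nums[*]}"
```

The three lines of output should be 'three four five', 'seven eight' and 'seven eight nine ten'.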

I have included a downloadable script bash19_ex2.sh which demonstrates some uses of 'mapfile':

#!/bin/bash

#-------------------------------------------------------------------------------
# Example 2 for Bash Tips show 19: the mapfile/readarray command
#-------------------------------------------------------------------------------

#
# Declare an indexed array
#
declare -a map

#
# Fill the array with a process substitution that generates 10 random numbers,
# each followed by a newline (the default delimiter). We remove the newline
# characters.
#
mapfile -t map < <(for i in {1..10}; do echo $RANDOM; done)

#
# Show the array as a list
#
echo "map: ${map[*]}"
echo

#
# Declare a new indexed array
#
declare -a daffs

#
# Define a string with spaces replaced by underscores
#
words="I_wandered_lonely_as_a_Cloud_That_floats_on_high_o'er_vales_and_Hills,_When_all_at_once_I_saw_a_crowd,_A_host,_of_golden_Daffodils"

#
# Fill the array with a process substitution that provides the string. The
# delimiter is '_' and we remove it as we load the array
#
mapfile -d _ -t daffs < <(echo -n "$words")

#
# Show the array as a list
#
echo "daffs: ${daffs[*]}"
echo

#
# Fill an array with 100 random dictionary words. Use 'printf' as the callback
# to report every 10th word using -C and -c
#
declare -a big
mapfile -t -C "printf '%02d %s\n' " -c 10 big < <(grep -E -v "'s$" /usr/share/dict/words | shuf -n 100)
echo

#
# Report every 10th element of the populated array in the same way
#
echo "big: ${#big[*]} elements"
for ((i = 9; i < ${#big[*]}; i=i+10)); do
    printf '%02d %s\n' "$i" "${big[$i]}"
done

exit

Running the script generates the following output:

map: 9405 5502 13323 16242 31013 5921 10529 28866 32759 24391

daffs: I wandered lonely as a Cloud That floats on high o'er vales and Hills, When all at once I saw a crowd, A host, of golden Daffodils

09 Malaysians
19 kissers
29 wastewater
39 diddles
49 Brahmagupta
59 dissociated
69 healthy
79 amortize
89 unsure
99 interbreeding

big: 100 elements
09 Malaysians
19 kissers
29 wastewater
39 diddles
49 Brahmagupta
59 dissociated
69 healthy
79 amortize
89 unsure
99 interbreeding

Using the read command to fill an array

This command, which we have seen on many occasions before (but haven’t yet examined in detail) takes an option '-a name' which loads an indexed array with words.

Because 'read' reads one line at a time the words to be placed in the array must all be on one line. The line is split into words using the word splitting process described in episode 2045.
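Both points can be seen in a short sketch (the data is made up). The first 'read' is given two lines but only uses the first. The second sets 'IFS' just for the 'read' command so that the line is split on colons rather than whitespace:

```shell
#!/bin/bash

# Only the first line is used; 'three four' is never read
read -r -a words < <(printf '%s\n' 'one two' 'three four')
echo "words: ${#words[*]} elements: ${words[*]}"

# Setting IFS for the duration of the 'read' splits on colons instead, so the
# space in 'gamma delta' no longer separates words
IFS=: read -r -a fields < <(echo 'alpha:beta:gamma delta:epsilon')
echo "fields: ${#fields[*]} elements"
echo "third field: ${fields[2]}"
```

The first array should contain two elements and the second four, with 'gamma delta' stored as a single element.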

I have included a downloadable script bash19_ex3.sh which demonstrates a use of 'read' with the option '-a name':

#!/bin/bash

#-------------------------------------------------------------------------------
# Example 3 for Bash Tips show 19: using 'read' to fill an array
#-------------------------------------------------------------------------------

#
# Create an indexed array
#
declare -a readtest

#
# Populate it with space separated words on one line using 'echo -n' to force
# that to happen
#
read -r -a readtest < <(for c in {A..J}{1..3}; do echo -n "$c "; done)

#
# The result
#
echo "readtest: ${#readtest[*]} elements"
echo "readtest: ${readtest[*]}"

exit

Running the script generates the following output:

readtest: 30 elements
readtest: A1 A2 A3 B1 B2 B3 C1 C2 C3 D1 D2 D3 E1 E2 E3 F1 F2 F3 G1 G2 G3 H1 H2 H3 I1 I2 I3 J1 J2 J3

Note: As I was recording the audio for this episode I suddenly realised that the way the data is being generated in this script is unnecessarily complex. I used:

read -r -a readtest < <(for c in {A..J}{1..3}; do echo -n "$c "; done)

but I could have written the much more efficient:

read -r -a readtest < <(echo {A..J}{1..3})

which would have achieved the same result.