Audio speedup script (HPR Show 2135)

Dave Morriss


Introduction

Back in 2015 Ken Fallon did a show (episode 1766) on how to use sox to truncate silence and speed up audio.

Inspired by this I wrote a Bash script to aid my use of the technique, which I thought I’d share with you.

Overview of the script

I called the script speedup although it performs the dual functions of speeding up and truncating silence.

The script is invoked thus:

$ speedup [options] filename

(If you didn’t place it somewhere in your PATH then you’d need to include the path to the script such as ./speedup if it’s in the current directory.)

The filename argument should be the full path of the audio file. The script renames the original file and creates the modified file with the same name as the original. When processing has finished, the unmodified original has the name ‘NAME_.EXT’, with an underscore added after the original name, unless the -c option (see below) was used to delete it.

The options are used to select the various features. They are:

-s
This option causes the audio to be sped up. It can be
repeated and the speed up is increased for every -s given.
-t
This option causes the audio to have silences truncated.
It can be repeated to increase the sensitivity of the
truncation.
-m
Mix-down multiple (stereo) tracks to mono.
-c
Delete the original file leaving the modified file behind with the
same name as the original.
-d
Engage dry-run mode where the planned actions are reported but nothing
is actually done.
-D
Run in DEBUG mode where more information is reported about what is
going on.
-h
Print the help information.

As mentioned above, the speedup and truncate functions can be “turned up” by repeating the options. The script counts the number of times a -s or -t option occurs and uses that number to index a list of speeds or truncation parameters. We will look at how this is done and what the possibilities are later.

The options conform to the usual Unix standard and can be concatenated, so the following invocations are the same and perform three levels of speeding up and one of truncation:

speedup -s -s -s -t ~/Bashpodder/Podcasts/2016-08-25/tllts_670-08-17-16.ogg
speedup -ssst ~/Bashpodder/Podcasts/2016-08-25/tllts_670-08-17-16.ogg

An analysis of the script

The script is available for download here. It is also available on GitHub in a repository I use for all of the scripts I talk about on HPR.

We will examine the script in chunks [1]:

#!/usr/bin/env bash
#===============================================================================
#
#         FILE: speedup
#
#        USAGE: ./speedup [-s ...] [-t ...] [-m] [-c] [-d] [-D] [-h] filename
#
#  DESCRIPTION: A script to perform a speedup and silence removal on a given
#               audio file
#
#      OPTIONS: ---
# REQUIREMENTS: ---
#         BUGS: ---
#        NOTES: ---
#       AUTHOR: Dave Morriss (djm), Dave.Morriss@gmail.com
#      VERSION: 0.0.4
#      CREATED: 2015-05-01 21:51:32
#     REVISION: 2016-04-22 11:35:08
#
#===============================================================================

set -o nounset                              # Treat unset variables as an error

SCRIPT=${0##*/}
VERSION="0.0.4"

#===  FUNCTION  ================================================================
#          NAME:  _usage
#   DESCRIPTION:  Report usage
#    PARAMETERS:  1 - the exit value (so it can be used to return an error
#                     value)
#       RETURNS:  Nothing
#===============================================================================
_usage () {
    local res="${1:-0}"

    cat <<-endusage

Usage: ${SCRIPT} [-s ...] [-t ...] [-m] [-c] [-d] [-D] [-h] filename

Speeds up and truncates silence in an audio file

Options:
  -s            This option if present causes the audio to be sped up.
                The option can be repeated and the speed up is increased for
                every -s given
  -t            This option if present causes the audio to have silences
                truncated. The option can be repeated to increase the
                sensitivity of the truncation
  -m            Mix-down multiple (stereo) tracks to mono
  -c            Delete the original file leaving the modified file behind with
                the same name as the original
  -d            Engage dry-run mode where the planned actions are reported
                but nothing is actually done
  -D            Run in DEBUG mode where more information is reported
                about what is going on
  -h            Print this help

Arguments:
  filename     The full path of the audio file containing the podcast episode.

Note:
  Unless deleted with the -c option the script will rename the original file
  and create the modified file with the same name as the original. The
  original file will have the name 'NAME_.EXT' with an underscore added after
  the original name.

Version: $VERSION
endusage
    exit "$res"
}

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The first part consists of a comment, the declaration of a SCRIPT variable (taken from the $0 argument), and the version number.

This is followed by the definition of the function _usage. It prints a “here document” (the text between cat <<-endusage and the closing endusage line) and then exits, using its argument as the exit value. Because the function is called to show how to use the script, it would not be appropriate to carry on running the script afterwards.
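As a minimal sketch of the same pattern, the following cut-down function prints a here document and exits with the value it is given, defaulting to 0. The names show_usage and demo are invented for the illustration and are not part of the script. It uses a plain << here document; the <<- form used in the script additionally strips leading tab characters from each line.

```bash
# Hypothetical miniature of a usage function; 'show_usage' and 'demo' are
# invented names, not part of the speedup script.
show_usage () {
    local res="${1:-0}"     # default exit value is 0 when no argument is given

    cat <<endusage
Usage: demo [-h] filename
endusage
    exit "$res"
}

# Call it in a subshell so that 'exit' ends the subshell, not this demonstration
status=0
( show_usage 3 ) > /dev/null || status=$?
echo "show_usage exited with status $status"
```

Running the function in a subshell is only needed for the demonstration; in the script itself the exit is meant to end the whole run.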


#
# Default settings
#
CLEANUP=0
DEBUG=0
DRYRUN=0
SPEEDUP=0
TRUNCATE=0
MIXDOWN=0

#
# Process options
#
while getopts :cDdhmst opt
do
    case "${opt}" in
        c) CLEANUP=1;;
        D) DEBUG=1;;
        d) DRYRUN=1;;
        s) ((SPEEDUP++));;
        t) ((TRUNCATE++));;
        m) MIXDOWN=1;;
        h) _usage 0;;
        *) _usage 1;;
    esac
done
shift $((OPTIND - 1))

In this section a collection of variables associated with the options is initialised. The while loop then processes the options. Note how the s and t options increment the variables SPEEDUP and TRUNCATE, so repeating them raises the count. Each of the c, D, d and m options simply sets its variable to 1. The h option, and any unrecognised option, calls _usage, which exits.

The shift statement at the end of this chunk is needed to remove all of the (now processed) options from the argument list, leaving the filename as argument 1 ($1).
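To see the counting and the shift in isolation, here is a small sketch. The function count_opts is hypothetical and handles only -s and -t; it shows that separate and concatenated options give the same counts, and that the filename ends up in $1 after the shift.

```bash
# Hypothetical demonstration function; 'count_opts' is not part of speedup.
count_opts () {
    local opt speedup=0 truncate=0
    local OPTIND=1          # start scanning afresh on each call

    while getopts :st opt; do
        case "$opt" in
            s) speedup=$((speedup + 1));;
            t) truncate=$((truncate + 1));;
            *) echo "unknown option"; return 1;;
        esac
    done
    shift $((OPTIND - 1))   # discard the options; "$1" is now the filename

    echo "speedup=$speedup truncate=$truncate file=$1"
}

count_opts -s -s -s -t episode.ogg   # prints: speedup=3 truncate=1 file=episode.ogg
count_opts -ssst episode.ogg         # prints: speedup=3 truncate=1 file=episode.ogg
```

The sketch increments with $((speedup + 1)) rather than ((speedup++)). The two behave the same here, but ((speedup++)) returns a non-zero status when the old value was 0, which would only matter in a script run with set -e (the speedup script does not use it).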


#
# Check there is one argument
#
if [[ $# -ne 1 ]]; then
    echo "Error: filename missing"
    _usage 1
fi

#
# Does the file given as an argument exist?
#
if [[ ! -e "$1" ]]; then
    echo "File not found: $1"
    exit 1
fi

Now the script checks for the filename argument, aborting via function _usage if not found. It then checks to see if the file actually exists, and aborts with an error message if it doesn’t.


if [[ $DRYRUN -eq 1 ]]; then
    echo "Dry run: no changes will be made"
fi

This chunk simply detects the use of the -d (dry run) option and reports that it is on.


#
# Work out the speed-up we want (if any) and generate the argument to sox
#
SPEEDS=( 1.05 1.1 1.2 1.3 1.4 1.5 1.6 1.7 )
if [[ $SPEEDUP -eq 0 ]]; then
    TEMPO=
else
    if [[ $SPEEDUP -gt ${#SPEEDS[@]} ]]; then
        SPEEDUP=${#SPEEDS[@]}
    fi
    ((SPEEDUP--))
    speed=${SPEEDS[$SPEEDUP]}
    TEMPO="tempo ${speed}"
fi

This chunk works out the speed-up level and stores the result in the TEMPO variable. If there was no -s option then the variable is empty. Otherwise the count of -s options is checked and, if it exceeds the number of speeds defined, is reduced to that number. The speeds are defined in the array SPEEDS; you can see the script caters for speeds of 1.05, 1.1, 1.2 and so forth up to 1.7. This list was created for my needs; you could redefine it to suit yours.

The speed count in variable SPEEDUP is decremented to index the array which starts at index zero, then the value is stored in variable speed and used to define variable TEMPO ready for use with sox.
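The mapping from a count of -s options to a speed can be tried out on its own. This is a sketch using the same SPEEDS array; the helper pick_speed is hypothetical and assumes it is only called with a count of 1 or more, as is the case in the script.

```bash
SPEEDS=( 1.05 1.1 1.2 1.3 1.4 1.5 1.6 1.7 )

# Hypothetical helper: map a count of -s options (1 or more) to a tempo factor
pick_speed () {
    local count=$1
    local max=${#SPEEDS[@]}

    if (( count > max )); then
        count=$max                      # clamp to the last entry
    fi
    echo "${SPEEDS[count - 1]}"         # arrays are indexed from zero
}

for n in 1 3 8 12; do
    echo "$n x -s gives tempo $(pick_speed "$n")"
done
```

With the list as defined, three -s options select 1.2, and anything from eight upwards selects 1.7.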


#
# Work out the silence truncation parameters (if any). The first set trims
# silence but ignores silences of 0.5 seconds in the middle (like pauses for
# breaths). The second set removes everything but can make a rather rushed
# result. See http://digitalcardboard.com/blog/2009/08/25/the-sox-of-silence/
# for some advice.
#
TRUNCS=( "1 0.1 1% -1 0.5 1%"  "1 0.1 1% -1 0.1 1%" )
if [[ $TRUNCATE -eq 0 ]]; then
    SILENCE=
else
    if [[ $TRUNCATE -gt ${#TRUNCS[@]} ]]; then
        TRUNCATE=${#TRUNCS[@]}
    fi
    ((TRUNCATE--))
    silence=${TRUNCS[$TRUNCATE]}
    SILENCE="silence ${silence}"
fi

This chunk does more or less the same as the preceding one, but for silence truncation. The main difference is that the array TRUNCS contains only two elements, each of which is a string of numbers. Setting the silence truncation parameters for sox is quite complicated; see the reference in the comments and show 1766 if you want to understand it. The end result is that the variable SILENCE contains the necessary parameters for sox.
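As a rough guide to the first of the two parameter strings, the sketch below rebuilds it from named parts. The variable names are illustrative only; they follow the field names in the synopsis of the silence effect in the sox manual, and the comments are a brief reading of that manual rather than a full explanation.

```bash
# Names below are illustrative only; they follow the field names used in the
# sox manual's synopsis of the 'silence' effect.
above_periods=1      # non-zero: trim silence from the start of the audio
above_duration=0.1   # seconds of sound needed before trimming stops
above_threshold=1%   # anything quieter than this counts as silence
below_periods=-1     # negative: also remove silences in the middle
below_duration=0.5   # seconds of silence required before it is removed
below_threshold=1%

SILENCE="silence $above_periods $above_duration $above_threshold"
SILENCE+=" $below_periods $below_duration $below_threshold"
echo "$SILENCE"      # prints: silence 1 0.1 1% -1 0.5 1%
```

The second string in TRUNCS differs only in the fifth value (0.1 instead of 0.5), so much shorter pauses are removed, which fits the "rather rushed result" mentioned in the script's comment.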


if [[ $MIXDOWN == 0 ]]; then
    REMIX=
else
    REMIX="remix -"
fi

#
# Report some internals in debug mode
#
if [[ $DEBUG -eq 1 ]]; then
    echo "SPEEDUP  = $SPEEDUP"
    echo "TRUNCATE = $TRUNCATE"
    echo "MIXDOWN  = $MIXDOWN"
    echo "speed    = ${speed:-0}"
    echo "silence  = ${silence:-}"
    echo "TEMPO    = $TEMPO"
    echo "SILENCE  = $SILENCE"
    echo "REMIX    = $REMIX"
fi

#
# Is there anything to do?
#
if [[ -z $TEMPO && -z $SILENCE ]]; then
    echo "Nothing to do; exiting"
    exit 1
fi

Next, the -m option is checked and the variable REMIX defined to contain the sox parameter which will result in the stereo audio being remixed to mono.

Then, if the -D (debug) option was provided the various settings are reported. This is mainly of use to someone debugging or developing this script.

Lastly in this chunk the script checks to see if there is any work to do. If neither TEMPO nor SILENCE is set to anything then there is no need to continue, and it exits.


#
# Divide up the path to the file
#
orig="$(realpath "$1")"
odir="${orig%/*}"
oname="${orig##*/}"
oext="${oname##*.}"

#
# The name of the original file will be changed to this
#
new="${odir}/${oname%.$oext}_.${oext}"

#
# Report the name of the input file
#
echo "Processing $orig"

#
# If the new name exists we already processed it
#
if [[ -e $new ]]; then
    echo "Oops! Looks like this file has already been sped up"
    exit 1
fi

#
# Rename the original file
#
if [[ $DRYRUN -eq 1 ]]; then
    printf "Dry run: rename %s to %s\n" "$orig" "$new"
else
    mv "$orig" "$new"
fi

This chunk works on the file, making a new name for it so the converted file can have the original name.

Firstly the script saves the full pathname into variable orig (using realpath to sort out any links or relative directories). Then it parses the filename into the path, the filename and the extension. It then reassembles it adding an underscore after the filename in the variable new.

The script checks that the new file doesn’t exist, because if it does there’s a good chance that this audio file has been processed already, so it gives up.

Finally in this chunk the script renames the original file to the new name (or reports what it would do if we are in “dry run” mode).
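The filename handling relies on Bash parameter expansion, which can be tried separately. In this sketch the helper renamed_path is hypothetical and the paths are made up; it takes a path that is already absolute, so realpath is left out.

```bash
# Hypothetical helper: given a full path, print the name the original file
# would be renamed to (NAME_.EXT in the same directory)
renamed_path () {
    local orig="$1"
    local odir="${orig%/*}"       # strip shortest match of '/*' from the end
    local oname="${orig##*/}"     # strip longest match of '*/' from the start
    local oext="${oname##*.}"     # keep whatever follows the last dot

    echo "${odir}/${oname%.$oext}_.${oext}"
}

renamed_path "/tmp/Podcasts/tllts_670-08-17-16.ogg"
# prints: /tmp/Podcasts/tllts_670-08-17-16_.ogg
renamed_path "/tmp/My Show.ep1.mp3"
# prints: /tmp/My Show.ep1_.mp3
```

One thing the sketch makes visible: a name with no dot in it has no extension to strip, so the whole name is taken as the extension and a file called README would come out as README_.README. For audio files with the usual extensions this does not arise.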


#
# Speed up and remove long silences as requested
# -S   requests a progress display
# -v2  adjusts the volume of the file that follows it on the command line by
#      a factor of 2
# -V9  requests a very high (debug) level of verbosity (default -V2)
# remix - mixes all stereo to mono
#
if [[ $DRYRUN -eq 1 ]]; then
    printf "Dry run: %s\n" \
        "sox -S -v2 \"${new}\" \"${orig}\" ${TEMPO} ${REMIX} ${SILENCE}"
else
    # [We want TEMPO, REMIX and SILENCE to be word-split etc]
    # shellcheck disable=SC2086
    sox -S -v2 "${new}" "${orig}" ${TEMPO} ${REMIX} ${SILENCE}
fi

This is the meat of the script. The sox program is given the various parameters which have been created. If “dry run” mode is on then the script just prints what it would do, but otherwise it processes the renamed file into the original filename with the chosen parameters.

As an aside, I use a Vim plugin called “Syntastic” which applies a syntax checker to various source files as they are saved during an edit, reporting any errors the checker finds. The checker for Bash is called “shellcheck” and some of its checks can be turned off with comments like:

# shellcheck disable=SC2086

This is necessary here because shellcheck objects to the fact that variables like “${TEMPO}” are not quoted. We do not want to quote them here: if we did, sox would receive tempo 1.5 as a single argument rather than as the two words it expects, and an empty variable would be passed as an empty argument. However, we do want to quote the filenames in case they contain spaces or other dangerous characters.
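An alternative to relying on word-splitting, which is not what the script does, is to collect the effects in a Bash array and expand it quoted. The following is only a sketch of that approach: the variables speed, silence_args and mixdown are stand-ins for values the script has already worked out, the filenames are invented, and the sox command is printed rather than run.

```bash
# Sketch only: 'speed', 'silence_args' and 'mixdown' stand in for values the
# script has already worked out
speed=1.2
silence_args="1 0.1 1% -1 0.5 1%"
mixdown=1

effects=()
if [[ -n $speed ]]; then
    effects+=( tempo "$speed" )
fi
if [[ $mixdown -eq 1 ]]; then
    effects+=( remix - )
fi
if [[ -n $silence_args ]]; then
    # Split the parameter string into words with read, not by unquoted expansion
    read -r -a words <<< "$silence_args"
    effects+=( silence "${words[@]}" )
fi

# Print the command that would be run, one shell-quoted word at a time
printf '%q ' sox -S -v2 "in file_.ogg" "in file.ogg" "${effects[@]}"
printf '\n'
```

Each array element becomes exactly one argument, so no shellcheck directive is needed. One caveat if you adapt this: in Bash versions before 4.4, expanding an empty array under set -o nounset is reported as an error, so the case where no effects were requested needs to be handled before the expansion.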


#
# Delete the original file if asked. Note that the script can't detect that
# the audio has been sped up if this file is missing.
#
if [[ $CLEANUP -eq 1 ]]; then
    if [[ $DRYRUN -eq 1 ]]; then
        printf "Dry run: delete %s\n" "$new"
    else
        rm -f "$new"
    fi
fi

exit

# vim: syntax=sh:ts=8:sw=4:ai:et:tw=78:fo=tcrqn21

Finally the script checks whether the -c option has requested the original (renamed) file be deleted. If so, the deletion request is reported in “dry run” mode or is actioned otherwise.
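The script repeats the same "report in dry-run mode, otherwise act" test for the rename, the sox call and the deletion. One possible way to factor that out, shown here only as a sketch and not as how the script is written, is a small wrapper function. The name run is hypothetical.

```bash
DRYRUN=1

# Hypothetical helper: report the command in dry-run mode, otherwise run it
run () {
    if [[ $DRYRUN -eq 1 ]]; then
        printf 'Dry run: %s\n' "$*"
    else
        "$@"
    fi
}

run rm -f "/tmp/example_.ogg"      # prints: Dry run: rm -f /tmp/example_.ogg
```

The report built from "$*" does not show quoting, so a filename containing spaces would look like several words in the message even though "$@" would pass it to the command as one argument.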

The comment on the last line is a so-called Vim “modeline” which contains settings for the Vim editor.

Conclusion

I use this as part of my podcast download workflow. In particular I process “The Linux Link Tech Show” thus:

./db_list_episodes -ab "Linux Link" | xargs -i ./speedup -ssst {}

Here db_list_episodes is a script which lists the paths to all of the episodes of a given podcast known to the database where I hold podcast data. The list is passed to the command xargs which runs speedup on each file as shown.
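To preview what xargs would run without touching any files, echo can be put in front of the command. In this sketch the paths are made up, printf stands in for db_list_episodes, and the function preview is hypothetical. It uses -I {} instead of -i because the GNU xargs documentation describes -i as deprecated in favour of -I.

```bash
# Hypothetical preview: echo stands in for actually running ./speedup
preview () {
    xargs -I {} echo ./speedup -ssst {}
}

# The paths are made up; printf stands in for db_list_episodes
printf '%s\n' "/pods/a.ogg" "/pods/b c.ogg" | preview
```

With -I each input line is treated as one item, so a path containing a space is still substituted as a whole.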

I have used this script regularly since I wrote it in May 2015. It does all that I want it to do at the moment, but in the next version I think I would change the logic which causes nothing to be done unless there are speed or silence truncation changes to be made. For example, since a number of podcasts I download from the BBC have surprisingly low volume compared to most others I’d quite like to amplify them.

I hope you find this script useful. Please contact me with any comments, corrections or improvements.


  1. Normally I number the lines of scripts such as this in the notes. When trying to do so this time the tool I use to generate HTML notes (Pandoc) did not seem to like the fact that I chopped the script into chunks and misbehaved. Since the script is quite long I didn’t want to leave my annotations to the end, so went with the un-numbered chunks you see here.