6. Shell basics¶

6.1. Job Control¶

Command	Description
`jobs`	show jobs (background tasks)
`C-z`	send signal SIGTSTP to the foreground job, which stops, is placed in the job queue and becomes the current job
`bg [job no.]`	resume the current job (or the given job) in the background
`fg [job no.]`	resume the current job (or the given job) in the foreground
`cmd &`	run cmd as background job
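
In a script, the cmd & form is typically paired with the wait builtin. A minimal sketch, assuming a hypothetical function bg_task that stands in for a long-running command:

```shell
# hypothetical stand-in for a long-running command
bg_task ()
{
    printf "%s\n" "background task finished"
}

# run as background job; ${!} holds the process ID of the latest background job
bg_task &
bg_pid="${!}"

# block until the job terminates and pick up its exit status
wait "${bg_pid}"
bg_status="${?}"
printf "exit status: %d\n" "${bg_status}"
```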

6.2. POSIX¶

Using POSIX compatible syntax allows shell scripts to run on other systems, where no bash(1) is available (NAS, various embedded systems, old systems with proprietary unixes).

So, use #!/bin/sh instead of #!/bin/bash.

Use The Open Group Base Specifications Issue 7, 2018 edition for general reference. See sh - shell, the standard command language interpreter and Shell Command Language for specific shell reference.

Attention

Be aware that there are very old shells out there that do not conform to POSIX, mainly because POSIX was not around when they were conceived. So not everything allowed by POSIX is necessarily failsafe for all shells.

6.2.1. Conventions for Syntax Descriptions¶

The conventions for syntax descriptions are covered in Chapter 12. Utility Conventions of The Open Group Base Specifications Issue 7, 2018 edition.

An informal description can also be found in man-pages(7).

The syntax for alternative arguments { arg1 | arg2 | arg3 } is not part of POSIX, but is mentioned in syntax - Is there a specification for a man page’s SYNOPSIS section? - Stack Overflow

6.2.1.1. Specific Conventions in Templated Shell Scripts¶

Single letter options preceded by a single dash - must not be grouped together. The additional effort is not worth the perceived advantage. Readability is much better when single letter options are kept separate.

In addition to POSIX chapter 12, a shortened ellipsis .. may be used in place of the full ellipsis (...).

Clarifying POSIX section 12.8, the vertical bar | is only used within braces {, } to indicate exclusive alternatives. Together with brackets [, ], this allows specifying optional syntax variants without an implied order. E.g.:

program [ { key=[value] | -key | [!]key } ..]

This notation emphasizes that each of these options overrides the effect of a previous occurrence. The following syntax description is equivalent, but the exclusive nature of the alternatives is not as clear:

program [key=[value]].. [-key].. [[!]key]..

This example can be further shortened if the option description mentions that the option can be specified multiple times:

program [key=[value]] [-key] [[!]key]

In addition to specifying multiple synopsis lines according to POSIX section 12.8, mutually exclusive options may be given summarily as [MODE OPTIONS], which are understood to override any previous mode options. E.g.:

  program [OPTIONS] [MODE OPTIONS]

MODE OPTIONS
  --one  perform action 1
  --two  perform action 2
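
The override rule for mode options can be sketched as a parsing loop in which the last mode option wins. The function name parse_mode and the variable mode are hypothetical, chosen only for illustration:

```shell
# print the mode selected by the last mode option on the command line
parse_mode ()
{
    mode=''
    for _opt in ${1+"${@}"}
    do
        case "${_opt}" in
        --one) mode='one';;   # overrides any previous mode option
        --two) mode='two';;   # overrides any previous mode option
        esac
    done
    printf "%s\n" "${mode}"
}

parse_mode --one --two
```

With both options given, the later --two overrides the earlier --one.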

6.3. Special Purpose Language vs. Generic Programming Language¶

The Bourne shell sh(1) is a special purpose language. It is not a generic programming language.

Attempts to make the shell more like C, e.g. with csh(1), are misguided experiments.

Making the execution of the test program look like a condition in a programming language is a particularly brain-dead example of syntactic obfuscation.

The standard syntax shows quite clearly what happens when the program test is executed:

if test arg1 arg2
then
    :
fi

The alternate program name [ requires an extra argument ] for closing the fake opening bracket, just so the command execution resembles a mathematical condition:

if [ arg1 arg2 ]
then
    :
fi

Warning

Using this abomination in a shell script results in immediate deletion.

Avoid arithmetic expansion $(( ... )), if expr(1) can do the job.
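
A minimal sketch of a counting loop that combines test(1) and expr(1) in the style recommended here. The function name count_up is an assumption made for this example:

```shell
# print the numbers from 1 to the given limit, one per line
count_up ()
{
    _limit="${1}"
    _indx=1
    while test "${_indx}" -le "${_limit}"
    do
        printf "%s\n" "${_indx}"
        # expr(1) does the arithmetic instead of $(( ... ))
        _indx="$( expr "${_indx}" + 1 )"
    done
}

count_up 3
```

Note that expr(1) exits with status 1 when the result is 0, which matters if the script runs under set -e.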

6.3.1. Variable expansion¶

To be safe and to make replacements simpler, always use curly braces for variable expansion:

printf "variable: %s, arg count: %d, args: %s\n" "${variable}" "${#}" "${*}"
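
The safety aspect shows when text follows the variable name directly. A small sketch, with variable names invented for the example:

```shell
prefix='backup'
prefix_dir='unrelated value'

# without braces, the longest valid name is used: the variable prefix_dir
unbraced="$prefix_dir"

# with braces, the variable prefix is expanded and _dir is appended literally
braced="${prefix}_dir"

printf "unbraced: [%s], braced: [%s]\n" "${unbraced}" "${braced}"
```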

Emacs support in Shell-script mode:

Shortcut	Expansion
`C-c v`	${}
`C-c q`	"${}"

6.3.2. echo (1) , printf(1)¶

Do not use echo(1), since it is not portable, use printf(1) instead.

Emacs support in Shell-script mode:

Shortcut	Expansion
`C-c p`	printf "%s\n"
`C-u C-c p`	printf >&2 "%s\n"

Especially dash(1) (the Ubuntu system shell) and bash(1) differ considerably in their echo behavior:

$ ls -l /bin/sh
lrwxrwxrwx 1 root root 4 Mai  8  2018 /bin/sh -> dash

$ /bin/dash -c 'echo "hello\nnext line"'
hello
next line

$ /bin/dash -c 'echo -e "hello\nnext line"'
-e hello
next line

$ /bin/bash -c 'echo "hello\nnext line"'
hello\nnext line

$ /bin/bash -c 'echo -e "hello\nnext line"'
hello
next line
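
With printf(1), the format specifier decides how backslash sequences in the argument are treated, independent of the shell. A minimal sketch; the function names are hypothetical:

```shell
# %s prints the argument verbatim, backslashes included
print_verbatim ()
{
    printf "%s\n" "${1}"
}

# %b interprets backslash escapes in the argument
print_escaped ()
{
    printf "%b\n" "${1}"
}

print_verbatim 'hello\nnext line'
print_escaped 'hello\nnext line'
```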

Emacs support in Shell-script mode for debug output of variables:

arg_count C-c d v v

expands to

printf >&2 "#   "":DBG:   %-${dbg_fwid-15}s: [%s]\n" "arg_count" "${arg_count}"

6.3.3. Avoid special bash syntax¶

Do not use:

function func_name
{
    :
}

but use POSIX compatible syntax instead:

func_name ()
{
    :
}

6.3.4. Do not use arrays¶

Shell arrays are not POSIX compatible! If you think you need to use arrays, you should probably not use the shell but a generic script programming language like awk(1), perl(1) or python(1).

See section 6.4, WRF loop - single line processing in shell for single line processing with splitting into fields.

See also GitHub - krebs/array: a POSIX-compliant implementation of arrays, for a POSIX compliant implementation of arrays (untested).
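
For simple cases, the positional parameters can serve as the one list that POSIX does provide. A sketch under that assumption, wrapped in a hypothetical function list_demo so that the parameters of the calling script are left alone:

```shell
list_demo ()
{
    # initialize the list
    set -- 'first item' 'second item'

    # append an element
    set -- ${1+"${@}"} 'third'

    printf "count: %d\n" "${#}"
    for _item in ${1+"${@}"}
    do
        printf "[%s]\n" "${_item}"
    done
}

list_demo
```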

6.4. WRF loop - single line processing in shell¶

A WRF loop emulates the single line processing of sed(1) and awk(1) with read in a while loop. Historically, WRF stands for while/read/file.

6.4.1. WRF loop¶

A file is parsed as single lines with the read command (see listing 6.1, line 14):

while read -r in_line

See figure 6.1 for activity diagram.

figure 6.1 WRF loop¶

listing 6.1 WRF loop¶

# setup some line records
in_records="
# ::fillme:: this comment is skipped, as are blank lines
some words on

a line
varying count of words
"

printf "%s\n" "${in_records}" \
| (
while read -r in_line
do
    # skip blank lines and comments
    case "${in_line}" in
    ''|"${comm-#}"*) continue;;
    esac

    # print in_line
    printf >&2 "#   "":DBG:   %-${dbg_fwid-15}s: [%s]\n" "in_line" "${in_line}"
done
)
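
The listing above feeds the loop through a pipe. A sketch of the variant the name alludes to, where the loop reads from a file through a redirection, might look as follows. The function name count_records and the temporary file name are assumptions for this example:

```shell
# count the records in a file, skipping blank lines and comments
count_records ()
{
    _count=0
    while read -r in_line
    do
        case "${in_line}" in
        ''|'#'*) continue;;
        esac
        _count="$( expr "${_count}" + 1 )"
    done <"${1}"
    printf "%s\n" "${_count}"
}

# create a small demo file, process it, clean up
demo_file="${TMPDIR-/tmp}/wrf_demo.${$}"
printf "%s\n" '# comment' 'some words on' '' 'a line' >"${demo_file}"
count_records "${demo_file}"
rm -f "${demo_file}"
```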

6.4.2. WRF loop with standard IFS split¶

Instead of reading an entire line, the read command parses the line into several variables (see listing 6.2, line 14):

while read -r in_word0 in_word1 rest

The standard IFS is used which splits the line on whitespace.

See figure 6.2 for activity diagram.

figure 6.2 WRF loop with standard IFS split¶

listing 6.2 WRF loop with standard IFS split¶

# setup some line records with
# standard IFS whitespace separator
in_records="
# ::fillme:: this comment is skipped, as are blank lines
some words on

a line
varying count of words
"

printf "%s\n" "${in_records}" \
| (
# use standard IFS to split line
while read -r in_word0 in_word1 rest
do
    # skip blank lines and comments
    case "${in_word0}" in
    ''|"${comm-#}"*) continue;;
    esac

    # print parts
    printf >&2 "# --------------------------------------------------\n"
    printf >&2 "#   "":DBG:   %-${dbg_fwid-15}s: [%s]\n" "in_word0" "${in_word0}"
    printf >&2 "#   "":DBG:   %-${dbg_fwid-15}s: [%s]\n" "in_word1" "${in_word1}"
    printf >&2 "#   "":DBG:   %-${dbg_fwid-15}s: [%s]\n" "rest" "${rest}"
done
)

6.4.3. WRF loop with special IFS split¶

Instead of reading an entire line, the read command parses the line into several variables (see listing 6.3, line 14):

while IFS=: read -r in_word0 in_word1 rest

IFS is set to : for the read command only, which splits the line on a : character.

See figure 6.3 for activity diagram.

figure 6.3 WRF loop with special IFS split¶

listing 6.3 WRF loop with special IFS split¶

# setup some line records with
# special IFS separator :
in_records="
# ::fillme:: this comment is skipped, as are blank lines
some words:on

a line
varying:count of:words:and:fields
"

printf "%s\n" "${in_records}" \
| (
# use IFS=: to split line
while IFS=: read -r in_word0 in_word1 rest
do
    # skip blank lines and comments
    case "${in_word0}" in
    ''|"${comm-#}"*) continue;;
    esac

    # print parts
    printf >&2 "# --------------------------------------------------\n"
    printf >&2 "#   "":DBG:   %-${dbg_fwid-15}s: [%s]\n" "in_word0" "${in_word0}"
    printf >&2 "#   "":DBG:   %-${dbg_fwid-15}s: [%s]\n" "in_word1" "${in_word1}"
    printf >&2 "#   "":DBG:   %-${dbg_fwid-15}s: [%s]\n" "rest" "${rest}"
done
)
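
A typical use of the special IFS split is a colon-separated record format such as passwd(5). A minimal sketch; the function name print_users and the sample records are made up for this example:

```shell
# print login name and user ID for each record read from standard input
print_users ()
{
    while IFS=: read -r pw_name pw_passwd pw_uid rest
    do
        # skip blank lines and comments
        case "${pw_name}" in
        ''|'#'*) continue;;
        esac
        printf "%s %s\n" "${pw_name}" "${pw_uid}"
    done
}

printf "%s\n" 'root:x:0:0:root:/root:/bin/sh' 'alice:x:1000:1000::/home/alice:/bin/sh' \
| print_users
```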

6.4.4. split and process lines with awk(1)¶

listing 6.4 AWK script to split and process lines via callback¶

BEGIN {
    if (!line) {
        line = "line";
    }
    if (!varbase) {
        varbase = "col";
    }
    if (!callback) {
        callback = "col_process";
    }
    if (!max_count) {
        max_count = 10;
    }
}
function single_quote_enclose (str) {
    gsub(/'/, "'\\''", str);
    return "'" str "'";
}
{
    printf("%s=%s;\n", line, single_quote_enclose($0));
    printf("%s_count=%d;\n", varbase, NF);
    for (_i=1;_i<=NF;++_i) {
        printf("%s%d=%s;\n", varbase, _i, single_quote_enclose($_i));
    }
    for (_i=NF+1;_i<=max_count;++_i) {
        printf("%s%d=;\n", varbase, _i);
    }
    if (callback) {
        printf("%s;\n\n", callback);
    }
}

listing 6.5 Function split and process lines via callback¶

#!   split_and_process_lines [-F " *: *"] [-v line=line] [-v varbase=col] [-v callback=col_process] [-v max_count=10]
split_and_process_lines ()                                                 # ||:fnc:||
{
    ${AWK__PROG-awk} ${1+"$@"} "${AWK_SCRIPT_SPLIT_AND_PROCESS_LINES}"
}

listing 6.6 Example for split and process lines via callback¶

in_records="
# comment
in the : city
and : over : the : mountains

som'e where : over ' the : rainbow
some wh''ere : over the : rainbow
"

col_process ()
{
case "${line}" in
'#'*|'')
    printf >&2 "#  |"":WRN:|  warning: comment or blank line [%s]\n" "${line}"
    return
    ;;
esac

printf "line: col1 [%s] col2 [%s] col3 [%s] col4 [%s] col5 [%s]\n" "${col1}" "${col2}" "${col3}" "${col4}" "${col5}"

_indx=1
while test "${_indx}" -le "${col_count}"
do
    _var="col${_indx}"
    eval _value=\"\${${_var}}\"

    printf "      %s=%s\n" "${_var}" "${_value}"

    _indx="$( expr "${_indx}" + 1 )"
done
}

_script="$(
  printf "%s\n" "${in_records}" \
  | split_and_process_lines -F ' *: *' -v varbase='col' -v callback='col_process'
  )"

printf >&2 "# --------------------------------------------------\n"
printf >&2 "#   "":DBG:   %-${dbg_fwid-15}s: [%s]\n" "_script" "${_script}"

printf >&2 "# --------------------------------------------------\n"
eval "${_script}"

listing 6.7 Script generated by example for split and process lines via callback¶

# --------------------------------------------------
#   :DBG:   _script        : [line='';
col_count=0;
col1=;
col2=;
col3=;
col4=;
col5=;
col6=;
col7=;
col8=;
col9=;
col10=;
col_process;

line='# comment';
col_count=1;
col1='# comment';
# ...
col_process;

line='in the : city';
col_count=2;
col1='in the';
col2='city';
# ...
col_process;

line='and : over : the : mountains';
col_count=4;
col1='and';
col2='over';
col3='the';
col4='mountains';
# ...
col_process;

line='';
col_count=0;
# ...
col_process;

line='som'\''e where : over '\'' the : rainbow';
col_count=3;
col1='som'\''e where';
col2='over '\'' the';
col3='rainbow';
# ...
col_process;

line='some wh'\'''\''ere : over the : rainbow';
col_count=3;
col1='some wh'\'''\''ere';
col2='over the';
col3='rainbow';
# ...
col_process;

line='';
col_count=0;
# ...
col_process;]

listing 6.8 Output from example for split and process lines via callback¶

# --------------------------------------------------
#  |:WRN:|  warning: comment or blank line []
#  |:WRN:|  warning: comment or blank line [# comment]
line: col1 [in the] col2 [city] col3 [] col4 [] col5 []
      col1=in the
      col2=city
line: col1 [and] col2 [over] col3 [the] col4 [mountains] col5 []
      col1=and
      col2=over
      col3=the
      col4=mountains
#  |:WRN:|  warning: comment or blank line []
line: col1 [som'e where] col2 [over ' the] col3 [rainbow] col4 [] col5 []
      col1=som'e where
      col2=over ' the
      col3=rainbow
line: col1 [some wh''ere] col2 [over the] col3 [rainbow] col4 [] col5 []
      col1=some wh''ere
      col2=over the
      col3=rainbow
#  |:WRN:|  warning: comment or blank line []

6.5. Single quoting¶

Activity diagram for algorithm:

6.6. Construct correctly quoted shell script¶

A shell script is assigned to the variable _script to be executed for different purposes, e.g.:

at a later time:
```
eval "${_script}"
```

in a different shell process, e.g.:

sh -c "${_script}"
printf "%s\n" "${_script}" | sh

as a different user:
```
sudo -u user sh -c "${_script}"
```

on a remote host:

ssh user@host "${_script}"
printf "%s\n" "${_script}" | ssh user@host

6.6.1. Preparations¶

Update snippets to latest version:

cd /srv/ftp/pub && ./sync.sh --restore && ./xx-sync-ftp-pub.sh -l 0

Create test shell script with template:
```
snn x_quoted_script.sh
```

Expand snippet (at end of line (C-e) enter key sequence C-x C-e):

## (progn (forward-line 1) (snip-insert "sh_f.single-quote" t t "sh" " --key single_quote_minimal") (insert "\n"))

Add example environment setup in body:

set -- arg1 arg2 'arg with spaces'

TEMP_DIR='/tmp/some-rndajom-stuff'

Add example command:

( cd "${TEMP_DIR}/" || exit 1; pwd )
for _arg in ${1+"${@}"}; do echo "${_arg}"; done
echo 'hello' | cat -

Execute and study output:

x_quoted_script.sh: 2: cd: cannot cd to /tmp/some-rndajom-stuff/
arg1
arg2
arg with spaces
hello

6.6.2. Single quoted string¶

Single quote entire command:

_script='
( cd "${TEMP_DIR}/" || exit 1; pwd )
for _arg in ${1+"${@}"}; do echo "${_arg}"; done
echo '\''hello'\'' | cat -
'

Add some execution tests:

printf "%s\n" "${_script}"

printf "%s\n" '--------------------------------------------------'
eval "${_script}"

printf "%s\n" '--------------------------------------------------'
sh -c "${_script}"

and observe output:

( cd "${TEMP_DIR}/" || exit 1; pwd )
for _arg in ${1+"${@}"}; do echo "${_arg}"; done
echo 'hello' | cat -
--------------------------------------------------
x_quoted_script.sh: 2: cd: cannot cd to /tmp/some-rndajom-stuff/
arg1
arg2
arg with spaces
hello
--------------------------------------------------
/
hello

Interrupt quoting to insert expanded variables.

Use single_quote_enclose() as necessary:

_script='
( cd '"$( single_quote_enclose "${TEMP_DIR}/" )"' || exit 1; pwd )
for _arg in ${1+"${@}"}; do echo "${_arg}"; done
echo '\''hello'\'' | cat
'

Use single_quote_args() as necessary:

_script='
( cd '"$( single_quote_enclose "${TEMP_DIR}/" )"' || exit 1; pwd )
for _arg in '"$( single_quote_args ${1+"${@}"} )"'; do echo "${_arg}"; done
echo '\''hello'\'' | cat -
'

and observe output:

( cd '/tmp/some-rndajom-stuff/' || exit 1; pwd )
for _arg in 'arg1' 'arg2' 'arg with spaces'; do echo "${_arg}"; done
echo 'hello' | cat -
--------------------------------------------------
x_quoted_script.sh: 2: cd: cannot cd to /tmp/some-rndajom-stuff/
arg1
arg2
arg with spaces
hello
--------------------------------------------------
sh: 2: cd: cannot cd to /tmp/some-rndajom-stuff/
arg1
arg2
arg with spaces
hello
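
The helper functions single_quote_enclose and single_quote_args come from the snippet expanded in the preparations and are not reproduced here. The following is only a sketch of how such helpers might be written, assuming the same quoting scheme as the AWK function in listing 6.4; it is not the snippet's actual implementation:

```shell
# enclose a string in single quotes, replacing each embedded ' by '\''
single_quote_enclose ()
{
    # sed script: escape quotes, open quote on first line, close on last line
    printf "%s\n" "${1}" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
}

# single quote each argument, separated by a single space
single_quote_args ()
{
    _sep=''
    for _arg in ${1+"${@}"}
    do
        printf "%s%s" "${_sep}" "$( single_quote_enclose "${_arg}" )"
        _sep=' '
    done
    printf "\n"
}

single_quote_enclose "it's"
single_quote_args arg1 arg2 'arg with spaces'
```

Because the result is captured with command substitution, trailing newlines of an argument are lost in this sketch.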

6.6.3. HERE document¶

Enclose entire command in cat <<EOF … EOF, escape as necessary:

cat <<EOF
( cd "${TEMP_DIR}/" || exit 1; pwd )
for _arg in ${1+"${@}"}; do echo "\${_arg}"; done
echo 'hello' | cat -
EOF

and observe output:

( cd "/tmp/some-rndajom-stuff/" || exit 1; pwd )
for _arg in arg1 arg2 arg with spaces; do echo "${_arg}"; done
echo 'hello' | cat -

Use single_quote_enclose() and single_quote_args() as necessary:

cat <<EOF
( cd $( single_quote_enclose "${TEMP_DIR}/" ) || exit 1; pwd )
for _arg in $( single_quote_args ${1+"${@}"} ); do echo "\${_arg}"; done
echo 'hello' | cat -
EOF

and observe output:

( cd '/tmp/some-rndajom-stuff/' || exit 1; pwd )
for _arg in 'arg1' 'arg2' 'arg with spaces'; do echo "${_arg}"; done
echo 'hello' | cat -

Enclose in command substitution "$( … )" for assignment to a variable:

_script="$(
cat <<EOF
( cd $( single_quote_enclose "${TEMP_DIR}/" ) || exit 1; pwd )
for _arg in $( single_quote_args ${1+"${@}"} ); do echo "\${_arg}"; done
echo 'hello' | cat -
EOF
)"

Add some execution tests:

printf "%s\n" "${_script}"

printf "%s\n" '--------------------------------------------------'
eval "${_script}"

printf "%s\n" '--------------------------------------------------'
sh -c "${_script}"

and observe output:

( cd '/tmp/some-rndajom-stuff/' || exit 1; pwd )
for _arg in 'arg1' 'arg2' 'arg with spaces'; do echo "${_arg}"; done
echo 'hello' | cat -
--------------------------------------------------
x_quoted_script.sh: 1: cd: cannot cd to /tmp/some-rndajom-stuff/
arg1
arg2
arg with spaces
hello
--------------------------------------------------
sh: 1: cd: cannot cd to /tmp/some-rndajom-stuff/
arg1
arg2
arg with spaces
hello

Enclose the entire command in a here document with a quoted end-of-file marker, cat <<'EOF' … EOF. In this case, no escaping is necessary:

cat <<'EOF'
( cd "${TEMP_DIR}/" || exit 1; pwd )
for _arg in ${1+"${@}"}; do echo "${_arg}"; done
echo 'hello' | cat -
EOF

and observe output:

( cd "${TEMP_DIR}/" || exit 1; pwd )
for _arg in ${1+"${@}"}; do echo "${_arg}"; done
echo 'hello' | cat -

The type of quotes around the end-of-file marker (single or double) does not matter.
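
The effect of quoting the end-of-file marker can be checked with a small sketch. The variable names are chosen for this example only:

```shell
name='world'

# unquoted marker: the body is subject to expansion
unquoted="$(
cat <<EOF
hello ${name}
EOF
)"

# single quoted marker: the body is taken literally
single_quoted="$(
cat <<'EOF'
hello ${name}
EOF
)"

# double quoted marker: the body is taken literally as well
double_quoted="$(
cat <<"EOF"
hello ${name}
EOF
)"

printf "%s\n" "${unquoted}" "${single_quoted}" "${double_quoted}"
```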

6.7. Command execution¶

For bash(1), four types of commands are defined:

aliases
shell functions
builtin commands
external programs

Aliases are not specific to bash(1): POSIX specifies them as well and dash(1) implements them. They should nevertheless be avoided in scripts.

From the man page of bash(1):

COMMAND EXECUTION

After a command has been split into words, if it results in a simple command and an optional list of arguments, the following actions are taken.

If the command name contains no slashes, the shell attempts to locate it. [If the shell is interactive or shell option expand_aliases is set and an alias by that name is found, it is expanded.] If there exists a shell function by that name, that function is invoked as described above in FUNCTIONS. If the name does not match a function, the shell searches for it in the list of shell builtins. If a match is found, that builtin is invoked.

If the name is neither a shell function nor a builtin, and contains no slashes, bash searches each element of the PATH for a directory containing an executable file by that name. Bash uses a hash table to remember the full pathnames of executable files (see hash under SHELL BUILTIN COMMANDS below). A full search of the directories in PATH is performed only if the command is not found in the hash table. If the search is unsuccessful, the shell searches for a defined shell function named command_not_found_handle. If that function exists, it is invoked with the original command and the original command’s arguments as its arguments, and the function’s exit status becomes the exit status of the shell. If that function is not defined, the shell prints an error message and returns an exit status of 127.

If the search is successful, or if the command name contains one or more slashes, the shell executes the named program in a separate execution environment. Argument 0 is set to the name given, and the remaining arguments to the command are set to the arguments given, if any.

If this execution fails because the file is not in executable format, and the file is not a directory, it is assumed to be a shell script, a file containing shell commands. A subshell is spawned to execute it. This subshell reinitializes itself, so that the effect is as if a new shell had been invoked to handle the script, with the exception that the locations of commands remembered by the parent (see hash below under SHELL BUILTIN COMMANDS) are retained by the child.

If the program is a file beginning with #!, the remainder of the first line specifies an interpreter for the program. The shell executes the specified interpreter on operating systems that do not handle this executable format themselves. The arguments to the interpreter consist of a single optional argument following the interpreter name on the first line of the program, followed by the name of the program, followed by the command arguments, if any.

Note

DO NOT set the shell option expand_aliases in scripts. Generally, DO NOT write bash(1) scripts. Stick to POSIX.

figure 6.4 shows an activity diagram for the command execution process.

figure 6.4 Shell command execution process¶
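
The lookup order can be observed with a function that carries the name of a regular builtin. This is a sketch for illustration only; shadowing a builtin like this is not a recommendation:

```shell
# a shell function named like a regular builtin takes precedence
pwd ()
{
    printf "%s\n" "function pwd called"
}

via_function="$( pwd )"

# command skips the function lookup and runs the builtin or external program
via_command="$( command pwd )"

# remove the function again
unset -f pwd

printf "%s\n" "${via_function}" "${via_command}"
```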

6.8. . command¶

The . command is an include mechanism for script files (much like the preprocessor command #include in C). Note that the standard definition of . takes only the file name and does not define any further arguments. To avoid inconsistent behavior across different shells, no additional arguments should be passed.

All variable assignments in the included file are incorporated into the shell environment.

From man page of bash(1):

. filename […] Read and execute commands from filename in the current shell environment and return the exit status of the last command executed from filename. If filename does not contain a slash, filenames in PATH are used to find the directory containing filename. The file searched for in PATH need not be executable. When bash is not in posix mode, the current directory is searched if no file is found in PATH. If the sourcepath option to the shopt builtin command is turned off, the PATH is not searched. […] The return status is the status of the last command exited within the script (0 if no commands are executed), and false if filename is not found or cannot be read.
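
A minimal sketch of the include mechanism, assuming a throwaway file in a temporary directory. The file name contains a slash, so PATH is not searched:

```shell
# write a small file with a variable assignment
conf_file="${TMPDIR-/tmp}/dot_demo.${$}"
printf "%s\n" "greeting='hello from include'" >"${conf_file}"

# include it: the assignment takes effect in the current shell
. "${conf_file}"
rm -f "${conf_file}"

printf "%s\n" "${greeting}"
```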

6.9. Subshell and compound commands¶

From man page of bash(1):

Compound Commands

A compound command is one of the following. In most cases a list in a command’s description may be separated from the rest of the command by one or more newlines, and may be followed by a newline in place of a semicolon.

(list)

list is executed in a subshell environment (see COMMAND EXECUTION ENVIRONMENT below). Variable assignments and builtin commands that affect the shell’s environment do not remain in effect after the command completes. The return status is the exit status of list.

{ list; }

list is simply executed in the current shell environment. list must be terminated with a newline or semicolon. This is known as a group command. The return status is the exit status of list. Note that unlike the metacharacters ( and ), { and } are reserved words and must occur where a reserved word is permitted to be recognized. Since they do not cause a word break, they must be separated from list by whitespace or another shell metacharacter.

Builtin commands in a subshell do not affect the shell environment in the parent shell, e.g.:

VAR='value'
( VAR='something'; echo "${VAR}"; )
echo "${VAR}";

results in output of:

something
value

Builtin commands in a group command do affect the shell environment outside the group, e.g.:

VAR='value'
{ VAR='something'; echo "${VAR}"; }
echo "${VAR}";

results in output of:

something
something

Note

A command in a pipeline is implicitly executed in a subshell.

I.e.:

var=outer
echo world | { var=inner; echo hello; cat -; }
echo "${var}"

is equivalent to:

var=outer
echo world | ( var=inner; echo hello; cat - )
echo "${var}"
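
Since assignments made inside a pipeline may be lost, a result computed there is best handed back through standard output. A sketch of that pattern, with a hypothetical function count_lines:

```shell
# count the lines on standard input and print the result
count_lines ()
{
    _count=0
    while read -r _line
    do
        _count="$( expr "${_count}" + 1 )"
    done
    printf "%s\n" "${_count}"
}

# capture the result with command substitution instead of relying on _count
total="$( printf "%s\n" 'a' 'b' 'c' | count_lines )"
printf "total: %s\n" "${total}"
```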

Note

Generally avoid { list; } grouping, especially for its side effect of manipulating the shell environment.

Note

A subshell is not equivalent to execution of an external shell script.

A subshell can access all variables of the parent shell environment, whether they are exported or not. E.g.:

unexported='internal value'
export exported='external value'
( echo "[${unexported}]"; echo "[${exported}]" )

results in output

[internal value]
[external value]

Whereas an external shell script can only access exported variables of the parent shell environment. E.g.:

unexported='internal value'
export exported='external value'
cat <<'EOF' | sh
echo "[${unexported}]"; echo "[${exported}]"
EOF

results in output

[]
[external value]
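
A variable can also be passed to a single external command without exporting it permanently, by prefixing the command with an assignment. A small sketch; the variable name demo_setting is made up and assumed not to be present in the environment:

```shell
demo_setting='internal value'

# not exported: the child shell sees an empty value
seen_plain="$( sh -c 'printf "%s" "${demo_setting}"' )"

# assignment prefix: placed in the environment of this one command only
seen_passed="$( demo_setting="${demo_setting}" sh -c 'printf "%s" "${demo_setting}"' )"

printf "plain: [%s], passed: [%s]\n" "${seen_plain}" "${seen_passed}"
```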