
quoter - quote arguments or standard input for usage in POSIX shell by eval

Author: Martin Väth <martin@mvath.de>

This project is under the MIT license.

This project serves two purposes:

1. Quote arguments for eval or remote usage in POSIX shells:
"quoter --" is similar to "printf '%q '" supported by some shells or
external printf implementations. Note, however, that %q is not POSIX,
and so one cannot rely on it working on all systems.

2. Process in a POSIX shell the standard input from tools which output
strings separated by \0 (like GNU find -print0):
"quoter -i" is somewhat similar to "xargs -0 printf '%q '" except that
neither the option -0 nor %q is POSIX, and so one cannot rely on this
working on all systems.

The above points are best explained by some POSIX shell script snippets:

1(i)
	su -c "$(quoter -c -- cat -- "$@")"

The effect of the above snippet is similar to

	su -c "cat -- $*"

except that the latter cannot be used if the arguments might contain
a special character. In fact, if $1 has e.g. the value "/dev/null; rm -rf *"
then the bad code "cat -- /dev/null; rm -rf *" would be executed, because
the whole argument is interpreted by the remote shell.
Note that the "--" after quoter in this example is mandatory to make sure
that quoter will not interpret any further options.
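
The danger described above can be reproduced without quoter and without su.
The following is only a minimal sketch: sketch_quote is a hypothetical helper
(an assumption for illustration; it is neither part of quoter nor necessarily
the way quoter quotes), and "sh -c" stands in for "su -c".

```shell
# sketch_quote is a hypothetical helper, NOT part of quoter:
# it wraps $1 in single quotes and rewrites each embedded ' as '\''
sketch_quote() {
	printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
}

# A harmless stand-in for the dangerous value "/dev/null; rm -rf *"
arg='/dev/null; echo INJECTED'

# Unquoted: the inner shell sees two commands and runs the second one.
unsafe=$(sh -c "cat -- $arg")

# Quoted: the inner shell sees a single (nonexistent) filename;
# cat fails, but nothing is injected.
safe=$(sh -c "cat -- $(sketch_quote "$arg")" 2>/dev/null) || :

printf 'unsafe: %s\nsafe: %s\n' "$unsafe" "$safe"
```

In real scripts, quoter should do the quoting, since it also takes care of
non-POSIX shell extensions which this sketch ignores.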

1(ii)
	var=$var${var:+\ }`quoter -c -- ...`

The above is the analogue of

	Push var ...

where Push is the function from https://github.com/vaeth/push/
Note that calling the external program quoter has quite some overhead, so
this might be slower than Push. On the other hand, it becomes faster
the longer and more complex the pushed data "..." is, because the compiled
code in "quoter" will usually do the actual quoting considerably faster.

The above example 1(ii) still suffers from a problem:
The data which can be passed to an external program like "quoter" is usually
limited by system restrictions. No such limitation exists for standard input
and output, that is, when "quoter -i" is used.
For this reason, it is usually advisable to prefer quoter -i over the "plain"
usage of "quoter". This is simple if "printf" is a shell builtin which thus
does not suffer from the mentioned system restrictions.
In this case, one can simply replace "quoter -- ..." by

	printf '%s\0' ... | quoter -i

Summarizing: A variant of the above example without the mentioned limitations
(if printf is a built-in) reads as follows:

	var=$var${var:+\ }`printf '%s\0' ... | quoter -ic`
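
Whether printf is a built-in can be checked in the script itself. The
following sketch assumes that "command -v" prints an absolute path for an
external program and only the bare name for a built-in; most shells behave
like this, but this is an assumption about your shell. The variable name
kind is chosen only for illustration.

```shell
# Classify printf by the output of "command -v": an absolute path indicates
# an external program, anything else is treated as a built-in here.
case $(command -v printf) in
/*)	kind=external;;
*)	kind=builtin;;
esac
printf 'printf is %s in this shell\n' "$kind"
```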


2.
	eval "set -- `find . -type f -print0 | quoter -i`"
	for file
	do ...
	done

This will (recursively) iterate through all regular files in the current
directory and its subdirectories. In contrast to the "naive" approach
	for file in `find . -type f`
	do ...
	done
the above has no problem with special characters (like spaces) in filenames.
Of course, find has to understand the non-POSIX option -print0 for this.
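
The difference between the two approaches can be seen in the following
self-contained sketch. It does not call quoter: the quoted list is written by
hand, in the way one would expect a correctly quoted list to look. It also
assumes that the (non-POSIX but widely available) mktemp -d exists.

```shell
# Create a scratch directory with a single file whose name contains a space.
dir=$(mktemp -d) || exit 1
: > "$dir/two words"

# Naive loop: the unquoted command substitution is split at the space.
naive=0
for file in $(cd "$dir" && find . -type f)
do	naive=$((naive + 1))
done

# A correctly quoted list yields one argument per file.
# The string is written by hand here; "quoter -i" would generate it.
eval "set -- './two words'"
quoted=$#

rm -rf "$dir"
printf 'naive: %s iterations, quoted: %s argument(s)\n' "$naive" "$quoted"
```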

The example 2. above still has two problems:

(a) If "find" returns nothing, then the above "eval" will expand to "set --";
some (buggy) shells will not remove all arguments by this command.
Workaround: Add an artificial argument and remove it again, that is, use
for instance the following instead of the above "eval":

	eval "set -- a `find . -type f -print0 | quoter -i`"
	shift
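
The effect of this workaround can be checked without find or quoter. In the
following sketch, the variable list (a name chosen only for illustration)
simulates the case that the pipe printed nothing:

```shell
# Simulate the case that find printed nothing, so the quoted list is empty.
list=''

# Pretend there are leftover positional parameters from earlier code.
set -- old1 old2

# The dummy argument "a" guarantees that set always receives an argument;
# shift then removes the dummy again.
eval "set -- a $list"
shift
count=$#
printf 'number of arguments: %s\n' "$count"
```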

(b) It is hard to check whether the call to "find" within the above "eval"
was successful, because a POSIX shell returns only the exit status of the last
command of a pipe, and neither -o pipefail nor the PIPESTATUS array are POSIX.
For this reason, this project also provides a script "quoter_pipe.sh".
When this script is sourced, it provides a shell function "quoter_pipe".
The call

	quoter_pipe [quoter_options] [--] command [args]

is then similar to

	quoter_pipe=`command [args] | quoter -ic [quoter_options]`

except that the exit status is 0 only if both commands of the pipe succeeded.
In addition, the variables quoter_pipestatus and quoter_pipestatus1 contain the
exit status of the first and second command of the above pipe, respectively.
(See the comments on top of the file quoter_pipe.sh for more details,
e.g. which quoter_options are admissible.)
To do its task, quoter_pipe uses implementation details of quoter (e.g. in
which way expressions actually are quoted). These implementation details might
change in future versions of quoter. Therefore, quoter_pipe will in general
only work with the version of quoter it is distributed with.
For this reason, you should make sure to have always matching versions of
quoter and quoter_pipe.sh installed. For the same reason, it is not
recommended to write further scripts which rely on such implementation details.
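
To illustrate the underlying problem, here is a sketch of one generic POSIX
technique to obtain the exit status of both sides of a pipe. This is not
necessarily how quoter_pipe.sh works; the names both_status, status_first,
status_second, fails and consume are hypothetical and serve only for
illustration.

```shell
# both_status FIRST SECOND: run "FIRST | SECOND" (two command names without
# arguments) and store their exit statuses in status_first and status_second.
# Hypothetical helper for illustration only, not taken from quoter_pipe.sh.
both_status() {
	exec 4>&1	# remember the real standard output as descriptor 4
	status_first=$( { { "$1"; echo "$?" >&3; } | "$2" >&4; } 3>&1 )
	status_second=$?	# status of the pipe, i.e. of its last command
	exec 4>&-
}

# Two stand-in commands: the first one fails, the second one succeeds.
fails() { echo data; return 3; }
consume() { cat >/dev/null; }

both_status fails consume
printf 'first: %s, second: %s\n' "$status_first" "$status_second"
```

For actual use with quoter, prefer quoter_pipe, which takes care of these
details.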

Summarizing, a variant of the above example which solves the problems (a),(b)
can look like this:

	# first source quoter_pipe.sh from $PATH if necessary:
	command -v quoter_pipe >/dev/null 2>&1 || . quoter_pipe.sh
	quoter_pipe find . -type f -print0 || {
		[ "$quoter_pipestatus" -eq 0 ] || echo "find failed" >&2
		[ "$quoter_pipestatus1" -eq 0 ] || echo "quoter failed" >&2
		exit 1
	}
	eval "set -- a $quoter_pipe"
	shift
	for file
	do ...
	done

It is _not_ possible to specify redirection or several commands (e.g.
separated by ; or && or something similar) in the argument of quoter_pipe.
If you need to use such a thing, define a function:

	my_pipe_task() {
		some_command && another_command >/dev/null 2>&1
	}
	quoter_pipe my_pipe_task


Type "quoter -h" to see further options of "quoter".
Some options need a more verbose explanation:

Roughly speaking, quoter is intended to be used such that
	eval "set -- `quoter ...`"
can always be safely executed, i.e. all possibly "disturbing" characters
are quoted or escaped. Certain non-POSIX extensions of some shells might
require quoting further characters.
For instance, {a,b} is by some shells interpreted as "a" "b", so -
although not necessary according to POSIX - the symbol { will need to be
escaped if you have such a shell.
Any quoting which the author knows to be required by the currently existing
popular shell versions (bash, dash, zsh, ksh, busybox, bosh, Bourne shell) is
taken care of, and if some quoting is missing, this will likely be fixed
in a future release of "quoter".
This assertion holds with and without the options -s and -l, but with
the option -s, it is attempted to reduce the quoting while with -l a lot
of (usually redundant) quoting is used.
This means that with the option -s, the generated output is usually shorter
(though usually not more readable by a human being) while with -l the
generated string is usually longer (and less readable) due to unnecessary
quoting.
The default -S (--unshort) is a reasonable compromise which is readable
and safe. Use -s only if you have a reason to require a short string, and
use -l only if you are extremely paranoid and want to be safe even against
any conceivable nonstandard shell extension which does not exist yet, e.g.
if in some future shell % should get a special meaning on the command line...

The meaning of the option --empty-last (-e) is perhaps not obvious.
If standard input looks like this

	string1\0string2\0string3

"quoter -i" will consider the three strings "string1" "string2" "string3".
However, what should happen if "string3" is empty in this case?
It is perhaps most logical that in this case "string3" is considered to be
an empty string. This was indeed the case for quoter v1.0 and quoter v1.1.
However, this turned out to be not very practical:
Most commands like "find ... -print0" or "printf '%s\0' ..." actually generate
a redundant "\0" symbol at the end, so that the above interpretation of the
standard input would append an unintended empty string as a last argument,
which is very disturbing.
For this reason, the above is no longer the default in quoter v2.0 or newer:
A trailing \0 is no longer interpreted as "an empty string follows".
The purpose of the new option --empty-last (-e) is to restore the previous
(more logical but less practical) behaviour when needed.
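
That such commands terminate each string by \0 (instead of merely separating
the strings) can be verified with standard tools. The following sketch counts
the \0 bytes which printf emits for three strings; the variable name nuls is
chosen only for illustration.

```shell
# Count the \0 bytes which printf emits for three strings.
nuls=$(printf '%s\0' one two three | tr -cd '\000' | wc -c)
nuls=$((nuls))	# strip the padding which some wc implementations print
printf 'strings: 3, NUL bytes: %s\n' "$nuls"
```

Three strings produce three \0 bytes, so the last \0 is a terminator and
not the announcement of a fourth (empty) string.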


Installation

Just compile the single file src/quoter.c (run "make" to compile it into
bin/quoter with default options) and copy the result into $PATH under the
name "quoter".
The code is C89 compatible with the exception of two compiler-specific
extensions which might help to optimize the code.
If compilation fails, try to use the compiler options
-DAVOID_BUILTIN_EXPECT and/or -DAVOID_ATTRIBUTE_NORETURN
to avoid these extensions.

Also copy bin/quoter_pipe.sh into your $PATH so that ". quoter_pipe.sh" can
be used for sourcing.
To obtain zsh completions also copy the content of zsh/ to zsh's $fpath
(perhaps /usr/share/zsh/site-functions/).
For the standard paths, "make install" performs all of these copies.

For Gentoo, there is an ebuild in the mv overlay (available via layman).
