4th January 2019 | 7 minutes
Today's post is going to be a bit more complex if you're new to shell scripting, but something I find quite beautiful is how you can parse command-line arguments and flags in a shell script. It works by combining a case statement with the shift builtin.
Let's take a look!
# arguments.sh

# Default values of arguments
SHOULD_INITIALIZE=0
CACHE_DIRECTORY="/etc/cache"
ROOT_DIRECTORY="/etc/projects"
OTHER_ARGUMENTS=()

# Loop through arguments and process them
for arg in "$@"
do
    case $arg in
        -i|--initialize)
            SHOULD_INITIALIZE=1
            shift # Remove --initialize from processing
            ;;
        -c=*|--cache=*)
            CACHE_DIRECTORY="${arg#*=}"
            shift # Remove --cache= from processing
            ;;
        -r|--root)
            ROOT_DIRECTORY="$2"
            shift # Remove argument name from processing
            shift # Remove argument value from processing
            ;;
        *)
            OTHER_ARGUMENTS+=("$1")
            shift # Remove generic argument from processing
            ;;
    esac
done

echo "# Should initialize: $SHOULD_INITIALIZE"
echo "# Cache directory: $CACHE_DIRECTORY"
echo "# Root directory: $ROOT_DIRECTORY"
echo "# Other arguments: ${OTHER_ARGUMENTS[*]}"
Code like this is why I'm in a love-hate relationship with my terminal
Phew. That looks like a whole bunch of code. It covers the whole process of parsing command-line arguments. But let's go through everything bit by bit. First, let's start with the default values.
# Default values of arguments
SHOULD_INITIALIZE=0
CACHE_DIRECTORY="/etc/cache"
ROOT_DIRECTORY="/etc/projects"
OTHER_ARGUMENTS=()
You can also make the default values empty strings! Just use what makes sense to you
This is simple enough. If the user doesn't pass in a certain argument, we fill it with some default value we're happy with. Alternatively you can make the strings empty and check if these empty values are still there. In this way you can easily verify that you have all necessary arguments passed in. How you go about that is an implementation detail of your script and thus left as an exercise for the reader. I recommend tldp.org for learning about operators.
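One way to do that check could look like the sketch below. The require_value helper is a hypothetical name of mine, not part of the script above, and it assumes that an empty string means "the user never passed this argument".

```shell
# Hypothetical helper: fail if a required value is still empty.
require_value() {
    # $1 = flag name used in the error message, $2 = current value
    if [ -z "$2" ]; then
        echo "Missing required argument: $1" >&2
        return 1
    fi
}

CACHE_DIRECTORY="/etc/cache"   # has a default, so it is always set
ROOT_DIRECTORY=""              # empty default marks it as required

# ... argument parsing would fill the variables here ...

require_value "--cache" "$CACHE_DIRECTORY" && echo "cache ok"
require_value "--root" "$ROOT_DIRECTORY" || echo "root missing"
```

In a real script you would probably exit with a non-zero status instead of just printing a message.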
Style-wise I like defining my arguments in all-caps snake_case, because I generally treat them as constants that I do not modify. You may disagree and you're welcome to call them however you like.
for arg in "$@"
do
.. SNIP ..
done
Funnily enough for loops end with done instead of rof. Consistency!
Looping through the arguments is equally simple. You loop over the special $@ parameter your shell provides to you. It expands to the arguments exactly as they were passed on the command line, starting after the file name. So if you call your script using ./arguments.sh -i --cache=/var/cache --root /var/www/html/public my-project, then the array will look a bit like so:
(
$0 = ./arguments.sh
$1 = -i
$2 = --cache=/var/cache
$3 = --root
$4 = /var/www/html/public
$5 = my-project
)
This is not the exact notation of arrays in shell, but this will be important in a second
Note that the $@ variable does not contain the value of $0. If you access $0 directly, however, it returns the file name you used to call the script.
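If you want to see this for yourself, a small sketch like the following prints the positional parameters one by one. The show_arguments function is a hypothetical helper; inside a function, $@ and $# refer to the function's own arguments, which makes it easy to experiment.

```shell
# Hypothetical helper that prints its own positional parameters.
show_arguments() {
    echo "count: $#"
    index=1
    # Quoting "$@" keeps arguments that contain spaces in one piece
    for arg in "$@"; do
        echo "\$$index = $arg"
        index=$((index + 1))
    done
}

show_arguments -i --cache=/var/cache "two words"
```

Note that the quoted "two words" shows up as a single entry, which is why the loop uses "$@" in double quotes.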
For our purposes we loop over each entry in the array and put it in a temporary $arg variable. Now we can process the arguments.
The arguments will be processed in a switch-case statement. As you may have noticed in the full code sample above, those come with their own delightful idiosyncrasies in syntax. Like a lot of other things in shell scripting, really. A case statement looks like this:
case $arg in
.. SNIP ..
esac
The $arg variable in this case is the one we declared in the for-loop above
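To get a feel for the syntax on its own, here is a small sketch with a few patterns. The classify_argument function and its messages are made up for illustration. Patterns are tried from top to bottom and the first match wins.

```shell
# Hypothetical helper: classify a single argument with case patterns.
classify_argument() {
    case "$1" in
        -h|--help) echo "help flag" ;;               # two alternatives, separated by |
        --*=*)     echo "long option with value" ;;  # * matches any run of characters
        -*)        echo "some other option" ;;
        *)         echo "plain argument" ;;          # catch-all
    esac
}

classify_argument --help
classify_argument --cache=/var/cache
classify_argument -x
classify_argument my-project
```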
Now let's look at the various ways to process arguments and how to write switch cases.
Boolean flags are those which may be there or not. A good example might be a --help flag. Parsing those looks like so:
-i|--initialize)
SHOULD_INITIALIZE=1
shift # Remove --initialize from processing
;;
Note the two semicolons. Yes, you need those. Both of those.
This case statement checks whether the current value of $arg is either -i or --initialize. In our example this is true, so we set the SHOULD_INITIALIZE variable to 1 to indicate that the flag is present. Afterwards we use shift to drop the first entry of $@, which is the flag we just handled. The array now looks like the following:
(
$0 = ./arguments.sh
$1 = --cache=/var/cache
$2 = --root
$3 = /var/www/html/public
$4 = my-project
)
Note that the value of $0 stayed the same while everything else shifted up by one.
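The effect of shift is easy to observe in isolation. The demo_shift function below is a hypothetical helper that prints $1 and the argument count before and after a single shift.

```shell
# Hypothetical demo function: shows what shift does to $1 and $#.
demo_shift() {
    echo "before: \$1=$1 count=$#"
    shift   # drop $1; every remaining parameter moves down by one
    echo "after: \$1=$1 count=$#"
}

demo_shift -i --cache=/var/cache --root
```

shift also accepts a count, so shift 2 drops two parameters at once.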
Our next case statement parses command-line flags of the form --arg=value, which is the style GNU tools use for their long options. You can often see this when using Unix tools such as ls --color=auto.
-c=*|--cache=*)
CACHE_DIRECTORY="${arg#*=}"
shift # Remove --cache= from processing
;;
This is where you realize that shell scripting has magical features
In this case we check if the current $arg matches either -c= or --cache= followed by any number of characters. If it does, we take the value of $arg and strip the part we don't need. The #*= part looks super confusing at first. What it does is remove every character from the beginning of $arg up to and including the first equals sign. This means that --cache=/var/cache becomes /var/cache. If you want to read up more on the topic of parameter substitution in shell scripts, I recommend this article from cyberciti.biz.
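Here is a short sketch of that expansion next to its counterpart for the other end of the string. The option_value and option_name helpers are hypothetical names chosen for this example.

```shell
# Hypothetical helpers built on parameter expansion.

# "#" removes the shortest match of the pattern *= from the front,
# which is everything up to and including the first "=".
option_value() { printf '%s\n' "${1#*=}"; }

# "%%" removes the longest match of the pattern =* from the end,
# which is everything from the first "=" onwards.
option_name() { printf '%s\n' "${1%%=*}"; }

option_value "--cache=/var/cache"
option_name "--cache=/var/cache"
option_value "--filter=a=b"
```

The last call shows why the shortest match matters: a value that itself contains an equals sign stays intact.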
After this our $@ array of arguments now looks as follows:
(
$0 = ./arguments.sh
$1 = --root
$2 = /var/www/html/public
$3 = my-project
)
Our third case statement handles command-line flags of the form --arg value, where the value is passed as a separate argument. You can usually see it with command-line tools written with Node.js or Python.
-r|--root)
ROOT_DIRECTORY="$2"
shift # Remove argument name from processing
shift # Remove argument value from processing
;;
At this point these are probably a breeze to go through
Compared to the previous handler, this one is again rather easy to understand. We check whether $arg is equal to -r or --root, then we take the value of $2 into our ROOT_DIRECTORY variable and shift twice.
Why do we take $2? Remember: We have shifted away all previous arguments passed to the script, so that now $1 is equal to the value of $arg and thus $2 contains the argument's value.
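Because the value lives in $2, it can be worth checking that a second argument actually exists before reading it. Otherwise a trailing --root with nothing after it silently sets an empty directory. A minimal sketch, using a hypothetical has_value helper that is not part of the script above:

```shell
# Hypothetical helper: succeed only if a non-empty value follows the flag.
has_value() {
    # $1 is the flag itself, $2 the candidate value
    [ "$#" -ge 2 ] && [ -n "$2" ]
}

if has_value --root /var/www/html/public; then
    echo "value present"
fi

if ! has_value --root; then
    echo "--root needs a value" >&2
fi
```

Inside the handler you could call it as has_value "$@" before assigning ROOT_DIRECTORY.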
After we shift the next two values off, we are left with this arguments array:
(
$0 = ./arguments.sh
$1 = my-project
)
Just one more step to go and we're done
As the last step we will handle all the other arguments passed in without a flag. Let's go!
Our final case matches any value that wasn't matched by our previous handlers. These can be arguments passed without any flag, like a project name, or something else entirely.
*)
OTHER_ARGUMENTS+=("$1")
shift # Remove generic argument from processing
;;
"Pop!" goes the weasel and adds the value to an array
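For this handler we simply take the value of $1 and add it to a miscellaneous array. After all the additional arguments have been added to the array, you can decide to do whatever you like. For example the first entry in the array could be a project name. Who knows!
One caveat: the for loop iterates over the argument list as it was when the loop started, while shift changes $1 underneath it. After the double shift for --root, the loop still visits /var/www/html/public as $arg, so $arg and $1 fall out of step, and once $@ is empty any remaining iterations append an empty string to the array. If that matters to you, loop with while [ $# -gt 0 ] and match on $1 instead of $arg.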
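The loop can also be driven by the number of remaining arguments, which keeps $1 and the matched value in step no matter how often a handler shifts. The following is a minimal sketch of that variant, not the script from this post: it assumes bash (for the array) and wraps everything in a hypothetical parse_arguments function so it is easy to try out.

```shell
#!/usr/bin/env bash
# Sketch: same handlers as above, but looping while arguments remain.
parse_arguments() {
    SHOULD_INITIALIZE=0
    CACHE_DIRECTORY="/etc/cache"
    ROOT_DIRECTORY="/etc/projects"
    OTHER_ARGUMENTS=()

    while [ "$#" -gt 0 ]; do
        case "$1" in
            -i|--initialize)
                SHOULD_INITIALIZE=1
                shift
                ;;
            -c=*|--cache=*)
                CACHE_DIRECTORY="${1#*=}"
                shift
                ;;
            -r|--root)
                ROOT_DIRECTORY="${2:-}"
                shift                               # drop the flag
                if [ "$#" -gt 0 ]; then shift; fi   # drop the value, if any
                ;;
            *)
                OTHER_ARGUMENTS+=("$1")
                shift
                ;;
        esac
    done
}

parse_arguments --root /var/www/html/public -i my-project
echo "init=$SHOULD_INITIALIZE root=$ROOT_DIRECTORY others=${#OTHER_ARGUMENTS[@]}"
```

Every branch consumes at least one argument, so the loop always terminates, and a flag that follows the --root value is still recognised as a flag.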
Now if you add some echo statements and run your script as stated above, you should see output like the following:
$ ./arguments.sh -i --cache=/var/cache --root /var/www/html/public my-project
# Should initialize: 1
# Cache directory: /var/cache
# Root directory: /var/www/html/public
# Other arguments: my-project
I think that the use of such a switch-case statement together with some more advanced features of shell scripting makes for a really nice and extendable way to add command-line arguments and flags to your scripts. It also allows for great flexibility, so if you don't like being stuck with one style you can easily use the other.
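As a small illustration of that flexibility, a single flag can accept both styles by giving it two patterns. This is only a sketch: the read_root function is a hypothetical helper that looks at nothing but the root flag and prints the value it found.

```shell
# Hypothetical helper: read --root in either style and print the value.
read_root() {
    root=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -r=*|--root=*)
                root="${1#*=}"                      # --root=value style
                shift
                ;;
            -r|--root)
                root="${2:-}"                       # --root value style
                shift
                if [ "$#" -gt 0 ]; then shift; fi
                ;;
            *)
                shift                               # ignore everything else
                ;;
        esac
    done
    printf '%s\n' "$root"
}

read_root --root=/var/www
read_root --root /var/www
```

Both calls print the same directory, so users of the script can pick whichever style they prefer.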
Enjoy!~