Julia Evans

Bash scripting quirks & safety tips

Yesterday I was talking to some friends about Bash and I realized that, even though I’ve been using Bash for more than 10 years now there are still a few basic quirks about it that are not totally obvious to me. So as usual I thought I’d write a blog post.

We’ll cover

  • some bash basics (“how do you write a for loop”)
  • quirky things (“always quote your bash variables”)
  • and bash scripting safety tips (“always use set -u”)

If you write shell scripts and you don’t read anything else in this post, you should know that there is a shell script linter called shellcheck. Use it to make your shell scripts better!

We ’re going to talk about bash like it’s a programming language, because, well, it is. The goal of this post really is not to go into details of bash programming. I do not do complicated programming in bash and do not really plan to learn how to. But after thinking about it a bit today, I think it’s useful to explicitly write down some of the basics of the bash programming language. And some things about the bash programming languages are quite different from other programming languages I use!

I really thought I knew this stuff already but I learned a couple things by writing this post so maybe you will too.

Variable assignment

In bash variable assignment looks like this:

VARIABLE=2

and you reference variables with $VARIABLE. It’s very important that you don’t put spaces around the = sign – VARIABLE= 2, VARIABLE = 2, and VARIABLE =2 are not syntax errors, but will all do different unwanted things (like try to run a program called 2 an environment variable VARIABLE set to the empty string).

Bash variables don’t need to be all-caps but they usually are.

Most bash variables you’ll use are strings. There are also some array variables in bash but I don’t really understand those.

Quoting your variables with ${}

Sometimes I have a variable containing a string like file.txt and I want to do like

mv $MYVAR $MYVAR__bak # wrong!

This code as is won’t work! It will instead look for the variable MYVAR__bak which is not a real variable.

To get around this, all you need to know is that ${MYVAR} does the same thing as $MYVAR. So we can run

mv $MYVAR ${MYVAR}__bak # right!

global, local & environment variables

Next, Bash has 3 kinds of variables. The kind I usually think of first (and probably use the most often) are environment variables.

Every process on Linux actually has environment variables (you can run env to see what variables are currently set), but in Bash they’re much more easily accessible. To see the environment variable called MYVAR you can run.

echo "$MYVAR"

To set an environment variable, you need to use the export keyword:

export MYVAR=2

When you set an environment variable, all child processes will see that environment variable. So if you run export MYVAR=2; python test.py, the python program will have MYVAR set to 2.

The next kind of variable is the global variable. You assign these just like we described up above.

MYVAR=2

They behave like global variables in any other programming language.

There are also local variables, which are scoped to only exist inside a bash function. I basically never use functions so (unlike in literally every other programming language I write in) I have never used local variables.

for loops

Here’s how I write for loops in bash. This loop prints the numbers from 1 to 10.

for i in `seq 1 10` # you can use {1..10} instead of `seq 1 10` 
do     
 echo "$i"
done

If you want to write this loop on one line it looks like this:

for i in `seq 1 10`; do echo $i; done

I find this impossible to remember (how are you supposed to remember that there’s a semicolon after seq 1 10 but none after do?) so I don’t try to remember that.

You can also write while loops but I never do that.

The cool thing about this is that you can iterate over the output of another command. seq 1 10 prints the numbers from 1 to 10 (one per line), and this for loop is just taking that output and iterating over it. I use this a fair amount.

You can interpolate command output with either backticks or $().

OUTPUT=`command`
# or 
OUTPUT=$(command)

if statements

If statements in bash are pretty annoying to remember how to do. You have to put in these square brackets, and there have to be spaces around the square brackets otherwise it doesn’t work. [[ and [ square brackets (double/single) both work. Here we get truly into bash quirk territory: [ is a program (/usr/bin/[) but [[ is bash syntax. [[ is better.

if [[ "vulture" = "panda" ]]; then
 echo expression evaluated as true
else
 echo expression evaluated as false
fi

Also, you can check for things like “this file exists”, “this directory exists”, etc. For example you can check whether the file /tmp/awesome.txt exists like this:

If [[ -e /tmp/awesome.txt ]]; then
  echo "awesome"
fi

This is sometimes useful but I have to look up the syntax every single time.

If you want to try out conditions from the command line you can use the test command, like test -e /tmp/awesome.txt. It’ll return 0 for success, and an error return code otherwise.

One last thing about why [[ is better than [: if you use [[, then you can use < to do comparisons and it won’t turn into a file redirection.

$ [ 3 < 4 ] && echo "true"
bash: 4: No such file or directory
$ [[ 3 < 4 ]] && echo "true"
true

Also one extra last thing about if: I learned today that you don’t need [[ or [ to do if statements: any valid command will work. So you can do

if grep peanuts food-list.txt
then
echo "allergy allert!"
fi

functions aren’t that hard

Defining and calling functions in bash (especially if they have no parameters) is surprisingly easy.

my_function () {
 echo "This is a function"; 
}
my_function # calls the function

always quote your variables

Another bash trick: never use a variable without quoting it. Here, look at this totally reasonable-looking shell script:

X="i am awesome"
Y="i are awesome"
if [ $X = $Y ]; then
 echo awesome
fi

If you try to run this script, you will get the incomprehensible error message script.sh: line 3: [: too many arguments. What?

Bash interprets this if statement as if [ i am awesome == i are awesome], which doesn’t really make sense because there are 6 strings (i, am, awesome, i, are, awesome). The correct way to write this is

X="i am awesome"
Y="i are awesome"
if [ "$X" = "$Y" ]; then # i put quotes because i know bash will betray me otherwise
 echo awesome
fi

There are cases where it’s okay to just use $X instead of “$X”, but can you really keep track of when it’s okay and when it’s not okay? I sure can’t. Always quote your bash variables and you’ll be happier.

return codes, &&, and `||

Every Unix program has a “return code” which is an integer from 0 to 127. 0 means success, everything else means failure. This is relevant in bash because: sometimes I run a program from the command line and want to only run a second program if the first one succeeded.

You can do that with &&!

For example: create_user && make_home_directory. This will run create_user, check the return code, and then run make_home_directory only if the return code is 0.

This is different from create_user; make_home_directory which will run make_home_directory no matter what the return code of create_user is

You can also do create_user || make_home_directory which will run make_home_directory only create_user fails. That’s definitely in the realm of clever tricks.

background processes

I’m not going to say much about job control here but: in bash you can start a background process like this

long_running_command &

If you later have regrets about backgrounding the process and want to bring it back to the foreground, you can do that with fg. If there’s more than one of those processes, you can see them all with jobs. For some reason fg takes a “job ID” (which is what jobs prints) instead of a PID. Who knows. Bash.

Also, if you background LOTS of processes, the wait builtin will wait until they all return.

Speaking of having regrets – if you accidentally start a process in the wrong terminal, Nelson Elhage has a cool project called reptyr that can save your process and move it into a screen session or something.

Be safe: set -e

When I write programs, usually I make mistakes. In most programming languages I use, when something goes horribly wrong the program will exit and tell me what went wrong.

In a bash script, you are usually running a lot of programs. Sometimes those programs will exit with a failure return code. By default, when a program fails, Bash will just keep running.

For example, in this script Python will fail (because the file I’m trying run doesn’t exist) and then it’ll happily continue and print “done”.

python non_existant_file.py
echo "done"

This is almost never what I want – if one of the programs in my script fails, I do not want it to just merrily keep going and do possibly undefined / questionable things! That is terrifying. set -e will make the script stop and hopefully prevent any damage. Here’s the safer version of that

set -e # put this at the beginning of your file

python non_existant_file.py
echo "done"

be safer: use set -u

In most programming languages I use, I get an error if I try to use an unset variable. Not in Bash! By default, unset variables are just evaluated as if they were the empty string. The result of this is

rm -rf "$DIRECTORY/*" 

will actually run rm -rf /* if $DIRECTORY is unset. If you use set -u bash will stop and fail if you try to use an unset variable.

debug: use set -x

set -x will make bash print out every command it runs before running it. This is really useful for debugging.

You can see all the other bash options you can set with set -o.

A lot of shell scripts I see people using in practice start with set -eu or set -eux. Safety!

You can also set -o pipefail to exit if one part of a pipe fails.

lint your bash scripts with shellcheck!

Very recently I learned that there is a bash linter to help detect all these weird quirks and more!! It ‘ s called shellcheck and you can install it with apt-get install shellcheck.

Also it has a website! https://www.shellcheck.net/. There is an example so can see the kind of errors it tells you about. It’s pretty awesome and I’m excited about trying it out.

Shellcheck knows about way way more bash scripting best practices than I do :). When looking at the examples I was like “wow, that makes sense but I would have never thought of that”.

There’s also a shell formatter called shfmt which seems useful: https://github.com/mvdan/sh.

bash is weird but it’s possible to remember some of the quirks

I think those are all the basic bash quirks I know! I suppose it is possible to rant about how bash is a Terrible Programming Language that you shouldn’t be using and why don’t we have a shell programming language that is less confusing, but it doesn’t bother me too much.

I just try to not write very complicated bash scripts, stick to some of the best practices here, and don’t worry too much about it. And it’s kind of interesting to learn about the weird quirks, anyway!

If you liked this, people linked me to a bunch of other bash resources which I will share with you now

(if we’re talking about alternative shells, though – the shell I actually use day to day is fish. Fish is wonderful and I love it because of its amazing autocomplete. But I still generally write shell scripts in bash.)

thanks to Mat, Marina, Kamal, Geoffrey, Panashe, @gnomon, and Iain for talking about Bash with me!

Blogging principles I use Slow down your internet with tc