Global Variables Are Evil sample chapter

My publisher has authorized me to release a free sample chapter from my book.  They let me pick one, and I decided to go with the one on global variables. If this is a success, a couple more chapters may be released one way or another, so I'd welcome input on which the other best topics would be from the table of contents.


Entire chapter in Adobe Acrobat (166 KB):
http://koopman.us/bess/chap19_globals.pdf



Chapter 19
Global Variables Are Evil

• Global variables are memory locations that are directly visible to an entire
software system.
• The problem with using globals is that different parts of the software are
coupled in ways that increase complexity and can lead to subtle bugs.
• Avoid globals whenever possible, and at most use only a handful of globals
in any system.

Contents:

19.1 Chapter overview

19.2 Why are global variables evil?
Globals have system-wide visibility, and aren’t quite the same as static variables.
Globals increase coupling between modules, increase risk of bugs due to
interference between modules, and increase risk of data concurrency problems.
Pointers to globals are even more evil.

19.3 Ways to avoid or reduce the risk of globals
If optimization forces you into using globals, you’re better off getting a faster
processor. Use the heap or static locals instead of globals if possible. Hide
 globals inside objects if possible, and document any globals that can’t be
eliminated or hidden.

19.4 Pitfalls

19.5 For more information



19.1. Overview

“Global Variables Are Evil” is a pretty strong statement! Especially for some-
thing that is so prevalent in older embedded system software. But, if you can
build your system without using global variables, you’ll be a lot better off.

Global variables (called globals for short) can be accessed from any part of a
software system, and have a globally visible scope. In its plainest form, a global
variable is accessed directly by name rather than being passed as a parameter. By
way of contrast, non-global variables can only be seen from a particular module
or set of related methods. We’ll show how to recognize global variables in detail
in the following sections.

19.1.1. Importance of avoiding globals
The typical motivation for using global variables is efficiency. But sometimes
globals are used simply because a programmer learned his trade with a
non-scoped language (for example, classical BASIC doesn’t support variable
scoping – all variables are globals). Programmers moving to scoped languages
such as C need to learn new approaches to take full advantage of the scoping and
parameter passing mechanisms available.

The main problem with using global variables is that they create implicit cou-
plings among various pieces of the program (various routines might set or mod-
ify a variable, while several more routines might read it). Those couplings are not
well represented in the software design, and are not explicitly represented in the
implementation language. This type of opaque data coupling among modules re-
sults in difficult to find and hard to understand bugs.

19.1.2. Possible symptoms
You may have problems with global variable use if you observe any of the fol-
lowing when reviewing your implementation:

- More than a handful of variables are defined as globally visible (ideally, there
are none). In C or C++, they are defined outside the scope of any procedure,
so they are visible to all procedures. An indicator is the use of the keyword
extern to access global variables defined outside the scope of your compiled
module.

- Variables are used in a routine that are neither defined locally, nor passed as
parameters. (This is just another way of saying they are global.)

- In assembly language, variables are accessed by label from subroutines or
modules rather than via being passed on the stack as a parameter value. Any
access to a labeled memory location (other than in a single module that has ex-
clusive access to that location) is using a global.


The use of global variables is sometimes defensible. But their usage should be
for a very few, special values, and not a matter of routine. Consider their occa-
sional use a necessary evil.

19.1.3. Risks of using globals
The problem with global variables is that they make programs unnecessarily
complex. That can lead to:

- Bugs caused by hidden coupling between modules. For example, misbehav-
ing code in one place breaks things in another place.

- Bugs created in one module due to a change in a seemingly unrelated second
module. For example, a change to a program that writes a global variable
might break the behavior of code that reads that variable.

It may be necessary to have variables that are global, or at least serve a similar
purpose as global variables. The risk comes from using global variables too
freely, and not using mitigation strategies that are available.

19.2. Why are global variables evil?

Global variables should be avoided to the maximum extent possible. At a high
level, you can think of global variables as analogous to GOTO statements in pro-
gramming languages (this idea dates back at least as far as Wulf (1973)). Most of
us reflexively avoid GOTO statements. It’s odd that we still think globals are
OK.

First, let’s discuss what globals are, then talk about why they should be
avoided.

... read more ...


(c) Copyright 2010 Philip Koopman

Software Timing Loops

Once in a while I see someone using code that uses a timing loop to wait for some time to go by. Code might look like this:

// You should *NOT* use this code as-is!
#define SCALE   15501    // System-dependent time scaling factor

void WaitMS(unsigned long ms)
{ unsigned long counter = ms * SCALE; // Compute how long to wait  
  while(counter > 0) { counter--;}    // Waste time



The idea is that the designer tweaks "SCALE" so that WaitMS takes exactly one msec for each integer value of the input ms.  In other words WaitMS(5) waits for 5 msec. But, there are some significant problems with this code.
  • Different compiler versions and different optimization levels could dramatically change the timing of this loop depending upon the code that is generated. If you aren't careful, your system could stop working for this reason when you do a recompile, without you even knowing the timing has changed or why that has happened.
  •  Changes to hardware can change the timing even if the code is recompiled. Changing the system clock speed is a fairly obvious problem. But other more subtle problems include clock throttling on high-end processors due to thermal management, putting the code in memory with a wait states for access, moving to a processor with instruction cache, or using a different CPU variant that has different instruction timings.
  • The timing will change based on interrupt service routines delaying execution of the while loop. You are unlikely to see the worst case timing disruption in testing unless you have a very deterministic system and/or really great testing.
  • This approach ties up the CPU, using power and preventing other tasks from running.
What I recommend is that you don't use software-based timing loops!  Instead you should change WaitMS() to look at a hardware timer, possibly going to sleep (or yielding to other tasks) until the amount of time desired has passed. In this approach, the inner loop checks a hardware timer or a time of day value set by a timer interrupt service routine.


Sometimes there isn't a hardware timer or there is some other compelling reason to use a software loop. If that's the case, the following advice might prove helpful.
  • Make sure you put the software timing loop in one place like this instead of having lots of small in-line timing loops all over the place. It is hard enough to get one timing loop right!
  • The variable "counter" should be defined as "volatile" to make sure it is actually decremented each time through the loop. Otherwise the optimizer might decide to just eliminate the entire while loop.
  • You should calibrate the "SCALE" value somehow to make sure it is accurate. In some systems it makes sense to calibrate a variable during system startup, or do an internal sanity check during outgoing system test to make sure that it isn't too far from the right value. This is tricky to do well, but if you skip it you have no safety net in case the timing does change.

(There are no doubt minor problems that readers will point out as well depending upon style preferences.   As with all my code examples, I'm presenting a simple "how I usually see it" example to explain the concept.  If you can find style problems then probably you already know the point I'm making. Some examples:  WaitMS might check for overflow, uint32_t is better than "unsigned long," "const unsigned long SCALE = 15501L" is better if your compiler supports it, and there may be mixed-length math issues with the multiplication of ms * SCALE if they aren't both 32-bit unsigned integers.  But the above code is what I usually see in code reviews, and these style details tend to distract from the main point of the article.)