Thinking about scope


14 Jan 2015

Scope is one of those things that is very easy to lose track of. A natural response for a lot of developers is to scope everything as globally as possible, just because "you never know where you'll need access to something".

Take this small C++ class as an example:

// ExampleClass.h
#pragma once

#include <QVector>
// Object is assumed to be declared in a project header that is not shown here.

namespace MyApp  
{
  /// Example Class
  class ExampleClass
  {
  public:
    ExampleClass();
    ~ExampleClass();

  private:
    ExampleClass(const ExampleClass& that);
    ExampleClass& operator=(const ExampleClass& that);

  public:
    bool initialize();
    void deinitialize();

  private:
    QVector<Object> mObjects;
    float mSpacing;
  };
}

// ExampleClass.cpp
#include "ExampleClass.h"

#include <QVector2D>

namespace MyApp  
{
  /**
  * Default constructor
  */
  ExampleClass::ExampleClass() :
    mSpacing(0.5f)
  {
  }

  /**
  * Destructor
  */
  ExampleClass::~ExampleClass()
  {
  }

  bool ExampleClass::initialize()
  {
    QVector2D pos(0,0);
    for(int i=0; i<10; ++i)
    {
      Object obj;
      obj.setPosition(pos);
      pos.setX(pos.x()+mSpacing);
      mObjects << obj;
    }
    return !mObjects.isEmpty();
  }

  void ExampleClass::deinitialize()
  {
    mObjects.clear();
  }
}

In this class mSpacing is only ever used in the initialize function, but it is still set up as a member variable. This is a clear case where mSpacing could just be local to that function.

One could argue that keeping it a member of the class means it could easily be used in any other function added later, but that is a very weak argument. Adding a float spacing = 0.5f to the functions that need it carries far less cognitive overhead, and it avoids the quick mSpacing += 0.1f thrown into one function that later sends you into a multi-hour debugging session.

Keeping it as a member also adds more complexity than necessary, since it is one more piece of state you have to keep track of.

Reducing scope is one of the simplest ways to reduce the amount of "global state" that a function has to take into account.
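
As a minimal sketch of what that looks like, here is the first class with the spacing moved into initialize. To keep it self-contained it swaps the Qt types for std::vector and a made-up Object struct, and the objects() accessor and demoLocalSpacing function exist only for this example, so treat those names as assumptions rather than part of the original class.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the article's Object type (the original uses Qt types).
struct Object
{
  float x = 0.0f;
  float y = 0.0f;
};

class ExampleClass
{
public:
  bool initialize()
  {
    // spacing lives only inside initialize(), so no other member
    // function can read it or change it.
    const float spacing = 0.5f;

    float x = 0.0f;
    for (int i = 0; i < 10; ++i)
    {
      Object obj;
      obj.x = x;
      x += spacing;
      mObjects.push_back(obj);
    }
    return !mObjects.empty();
  }

  void deinitialize() { mObjects.clear(); }

  // Accessor added for the example only, so the result can be inspected.
  const std::vector<Object>& objects() const { return mObjects; }

private:
  std::vector<Object> mObjects; // the only state the class still keeps
};

// Small usage check; call it from main() or a test runner.
void demoLocalSpacing()
{
  ExampleClass example;
  assert(example.initialize());
  assert(example.objects().size() == 10u);
}
```

The class now has one member fewer, and making the local const means even code inside initialize cannot quietly nudge the value.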

You might not think it's too bad, but then consider an example like this:

// ExampleClass.h
#pragma once

namespace MyApp  
{
  /// Example Class
  class ExampleClass
  {
  public:
    ExampleClass();
    ~ExampleClass();

  private:
    ExampleClass(const ExampleClass& that);
    ExampleClass& operator=(const ExampleClass& that);

  public:
    void calculateStuff();
    void calculateMoreStuff();
    int finish();

  private:
    int mStuff;
  };
}

// ExampleClass.cpp
#include "ExampleClass.h"

namespace MyApp  
{
  /**
  * Default constructor
  */
  ExampleClass::ExampleClass() :
    mStuff(0)
  {
  }

  /**
  * Destructor
  */
  ExampleClass::~ExampleClass()
  {
  }

  void ExampleClass::calculateStuff()
  {
    mStuff = 3;
  }

  void ExampleClass::calculateMoreStuff()
  {
    mStuff *= 2;
  }

  int ExampleClass::finish()
  {
    mStuff -= 2;
    return mStuff;
  }
}

// Example test
ExampleClass example;  
example.calculateStuff();  
example.calculateMoreStuff();  
int result = example.finish(); // returns 4

In this example, if the functions ever end up being called out of order, so that calculateMoreStuff is called before calculateStuff, your result will be different. This is of course a very simplified example, because I wanted to keep the code as short as possible, so you will need to use your imagination a bit as to why you would do this.
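
To make the ordering problem concrete, here is a small sketch that condenses the stateful class above into a single block and runs it both ways. The runInOrder, runOutOfOrder, and demoOrdering helpers are names made up for this illustration.

```cpp
#include <cassert>

// The stateful class from above, condensed into one block.
class ExampleClass
{
public:
  void calculateStuff() { mStuff = 3; }
  void calculateMoreStuff() { mStuff *= 2; }
  int finish() { mStuff -= 2; return mStuff; }

private:
  int mStuff = 0;
};

// Intended order of calls.
int runInOrder()
{
  ExampleClass example;
  example.calculateStuff();      // mStuff = 3
  example.calculateMoreStuff();  // mStuff = 6
  return example.finish();       // returns 4
}

// Same calls, first two swapped. Nothing at the call site hints at a problem.
int runOutOfOrder()
{
  ExampleClass example;
  example.calculateMoreStuff();  // mStuff = 0 * 2 = 0
  example.calculateStuff();      // mStuff = 3, the doubling is silently lost
  return example.finish();       // returns 1
}

// Small usage check; call it from main() or a test runner.
void demoOrdering()
{
  assert(runInOrder() == 4);
  assert(runOutOfOrder() == 1);
}
```

Both versions compile and run without complaint, which is exactly the issue: the dependency between the calls exists only in the hidden member.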

This could easily be rewritten like this:

// ExampleClass.h
#pragma once

namespace MyApp  
{
  /// Example Class
  class ExampleClass
  {
  public:
    ExampleClass();
    ~ExampleClass();

  private:
    ExampleClass(const ExampleClass& that);
    ExampleClass& operator=(const ExampleClass& that);

  public:
    int calculateStuff() const;
    int calculateMoreStuff(int inStuff) const;
    int finish(int inStuff) const;

  private:
  };
}

// ExampleClass.cpp
#include "ExampleClass.h"

namespace MyApp  
{
  /**
  * Default constructor
  */
  ExampleClass::ExampleClass()
  {
  }

  /**
  * Destructor
  */
  ExampleClass::~ExampleClass()
  {
  }

  int ExampleClass::calculateStuff() const
  {
    return 3;
  }

  int ExampleClass::calculateMoreStuff(int inStuff) const
  {
    return inStuff * 2;
  }

  int ExampleClass::finish(int inStuff) const
  {
    return inStuff - 2;
  }
}

// Example test
ExampleClass example;  
int result = example.calculateStuff(); // result = 3  
result = example.calculateMoreStuff(result); // result = 6  
result = example.finish(result); // result = 4

If you call them out of order and pass the values along, it will obviously still calculate the wrong value, but this way it is always visible what is going on. When writing one of the functions you no longer have to consider that some other function may have modified your input, because the value belongs only to that function. This is one of the big benefits usually mentioned when talking about pure functional programming, and it is something that I definitely believe in. Minimising the side effects of most of your functions will make debugging a lot easier.
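
Since the rewritten class no longer holds any state, one possible further step (an assumption on top of the example above, not something the rewrite requires) is to drop the object entirely and use free functions. The sketch below does that; runPipeline, runSwapped, and demoPure are hypothetical helper names.

```cpp
#include <cassert>

namespace MyApp
{
  // No object, so there is no hidden state for these functions to share.
  constexpr int calculateStuff() { return 3; }
  constexpr int calculateMoreStuff(int inStuff) { return inStuff * 2; }
  constexpr int finish(int inStuff) { return inStuff - 2; }
}

// The data flow is spelled out at the call site.
int runPipeline()
{
  using namespace MyApp;
  const int stuff = calculateStuff();               // 3
  const int moreStuff = calculateMoreStuff(stuff);  // 6
  return finish(moreStuff);                         // 4
}

// Swapping two steps still gives a different answer, but the swap is
// visible in the code instead of being hidden in a member variable.
int runSwapped()
{
  using namespace MyApp;
  const int stuff = calculateStuff();         // 3
  const int finished = finish(stuff);         // 1
  return calculateMoreStuff(finished);        // 2
}

// Pure functions can even be checked at compile time.
static_assert(MyApp::finish(MyApp::calculateMoreStuff(MyApp::calculateStuff())) == 4,
              "pipeline should evaluate to 4");

// Small usage check; call it from main() or a test runner.
void demoPure()
{
  assert(runPipeline() == 4);
  assert(runSwapped() == 2);
}
```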

This obviously doesn't only apply to member variables in objects; it applies to any variable. Take this for example:

// objects is an array of 10 Objects
int offset = 4;  
for(int i=0; i<10; ++i)  
{
  objects[i].setNumber(offset);
}

offset = 2;  
for(int i=0; i<10; ++i)  
{
  objects[i].setOffset(offset);
}

offset = 3;  
for(int i=0; i<10; ++i)  
{
  objects[i].setLimit(offset);
}

This is a bit scary, as there is a big risk of accidentally leaking state by forgetting to reset the offset variable. If one of the loops were to modify it internally, you could get a bug whose values are very hard to follow. A much "safer" way of writing it would be like this:

// objects is an array of 10 Objects
for(int i=0; i<10; ++i)  
{
  int offset = 4;
  objects[i].setNumber(offset);
}

for(int i=0; i<10; ++i)  
{
  int offset = 2;
  objects[i].setOffset(offset);
}

for(int i=0; i<10; ++i)  
{
  int offset = 3;
  objects[i].setLimit(offset);
}

This means that each variable is always limited in scope and cannot be modified by code outside of where it is supposed to be used. Some old-timers might think that "this is way too expensive, you should re-use variables to not have to allocate memory", and they would basically be right if we were talking about compilers from the 80s. Modern compilers will optimize this to the point where it won't matter, and since the gains in clarity are potentially huge, it is clearly worth it.
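
Here is a rough sketch of the kind of leak the shared variable invites, next to the scoped version. The Object struct, the two configure functions, and demoOffsets are stand-ins invented for this example, and the stray ++offset is a made-up bug used purely to show the effect.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the article's Object type.
struct Object
{
  int number = 0;
  int limit = 0;
  void setNumber(int n) { number = n; }
  void setLimit(int l) { limit = l; }
};

// Shared variable: a tweak inside the first loop leaks into the second.
void configureShared(std::vector<Object>& objects)
{
  int offset = 4;
  for (Object& obj : objects)
  {
    obj.setNumber(offset);
    ++offset; // "temporary" tweak that nobody remembers later
  }

  // The reset to 3 was forgotten before reusing offset here.
  for (Object& obj : objects)
  {
    obj.setLimit(offset); // ends up as 4 + objects.size(), not the intended 3
  }
}

// Scoped variables: each loop owns its value, and const forbids tweaks.
void configureScoped(std::vector<Object>& objects)
{
  for (Object& obj : objects)
  {
    const int offset = 4;
    obj.setNumber(offset);
  }

  for (Object& obj : objects)
  {
    const int offset = 3;
    obj.setLimit(offset);
  }
}

// Small usage check; call it from main() or a test runner.
void demoOffsets()
{
  std::vector<Object> shared(10);
  configureShared(shared);
  assert(shared[0].limit == 14); // leaked state

  std::vector<Object> scoped(10);
  configureScoped(scoped);
  assert(scoped[0].limit == 3);  // intended value
}
```

Marking the scoped variables const is optional, but it turns an accidental modification into a compile error rather than a runtime surprise.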

All of these examples are C++, but this of course applies to any language. You would think some of it is Coding 101, but it is something we could all use a reminder about every now and then. If you can make something narrower in scope, you probably should.