derkarl.org

Improving Code Readability in C++

published 2014-02-13 02:31:11 UTC by charles

When you’re writing code, there’s a lot of subtle things you can do that improve code readability by providing cues; there’s also a few glaringly obvious things you can do. Some (like “const"ing your functions) won’t come as a surprise to you.

Using scope to show variable purpose

Unlike what is was required in C, in C++ you can put your variable declarations anywhere you want in a function. Traditionally, it’s considered also “good form” to put your declaration as close as possible to its use:

void function()
{
    std::string foo = something();
    put_it_away(foo);

    std::string bar;
    ...
}

In this example program, by the time we get to the declaration of “bar”, we’re done using “foo”, by just enclosing the “foo” lines in curly braces, I can make that clear to the code reader:

void function()
{
    { // and it provides a cute place for a short comment
        std::string foo = something();
        put_it_away(foo);
    }
    std::string bar;
    ...
}

This is great from a readability perspective because it tells you that “foo will definitely not be used again in this function.” It also causes foo’s destructor to be called a little bit sooner (for whatever negligible performance advantage that may or may not have).

Corollary: don’t reuse variables

If you’re doing something twice, you might be tempted to reuse the temporary variables:

int counter=0;
for (...)
    if (something) counter++;

counter=0;
while (...)
    if (something) counter++;

Once again, two scopes, and declare the variable twice:

{
    int counter=0;
    for (...)
        if (something) counter++;
}
{
    int counter=0;
    for (...)
        if (something) counter++;
}

Now, it’s really easy for the reader to determine that these two counters are totally independent, Sure, the second assignment counter=0 is a clue to that, but this one is much harder to miss. What’s more is that unless the object allocates memory (or does something significant in its constructor), this has no performance penalty (not that you should be optimizing prior to profiling). That’s right, instantiating pointers, integers, doubles and STL iterators takes exactly 0 CPU cycles.

Enthusiastic use of const

Frequently you see code that assigns something to a variable from a function, and then uses it repeatedly:

user current_user = session->current_user();
show_greeting(user_name);
show_name_rhymes(user_name);

By simply making user_name “const”, you make it really clear what this variable is for:

const user current_user = session->current_user();
show_greeting(current_user);
show_name_rhymes(current_user);

C++ also has a weird syntax for declaring pointers as const. In the following example, not only is the variable “current_user” not going to change, but what it points to won’t either. It’s twice const and you can declare it as such:

const user *const current_user = session->current_user();
show_greeting(current_user);
show_name_rhymes(current_user);

The “inside” const (on the right) means that the pointer will not be reassigned, and the “outside” const (left) means that what it points to won’t change. A way to remember this is that the closer your const is to the variable name is the closer it is to the variable itself, and the closer it is to the typename is the closer it is to what the pointer points to (which has that type):

int *x = ...; // you can change the pointer and what it points to
int *const x = ...; // you can change what it points to, but not where you're pointing
const int *x = ...; // You can change what you're pointing to, but not the value of the pointer
int const *x = ...; // (A synonym for the previous, now out of fashion)

const your functions

When you declare a function is const, you do two things: you tell the compiler to keep you from modifying its object, and you let the reader know that. When someone sees that the declaration of a function is “const” it tells the user a few useful and closely related things:

If any of these don’t hold, your function should not be const even if the compiler lets you declare it const, and if all of these are true, your function should be const even if the function actually does modify state somehow.

Both of those “even ifs” may seem weird to you. Why would I not declare a function const if I could? C++ gives you direct access to the operating system. A const function could easily open a socket or write to a file descriptor. These are things that modify the state of the program. This isn’t too surprising.

What’s really weird is declaring a function const even though it modifies the object. Why on earth would you possibly do that? If your class is internally implemented with Copy on Write (example: std::string in GNU’s stdc++ - although they’re not supposed to anymore), you might want a function that disconnects a string from its other references. Another reason is that if your class allocates memory automatically in advance, you might want a function called “compact” that de-allocates some of that memory, or the reverse - a function that tells your object to allocate a whole bunch of memory for later use, like std::string::reserve(). All of these examples change the state of your object in some way, without changing the substance.

By the way, the “mutable” keyword declares a variable as a member of a class that can be changed by const member functions, and that keyword in itself tells the reader of your code that there’s some of this funny business. Also, the wonderfully verbose const_cast has a similar effect.

Caveat: this gets pretty tricky when threads come into play.

class user
{
public:
    // returns a database that can be modified, but does not modify user
    database *database_connection() const;

    // returns a database that cannot be modified, and does not modify user
    const database* database_connection() const;

    // returns a database that cannot be modified
    const database* database_connection();

    // also returns a database that cannot be modified, same as previous
    const database *const database_connection();
};

The too-many-parameters threshold - or - C++ rewards wordiness

At some point, a function can accept too many parameters. I don’t know exactly what that number is (I’m guessing it’s 4 ± 1). I think a nice goal for a function is that when you see some code calling it, you know what each parameter means even if you don’t actually know what those are. For example, if you see some simple code like this:

user n = user_cache->load_user_info(q);

It’s pretty easy to guess that “q” here represents the username, even though “q” is a bad variable name. If the same function had a million parameters, you couldn’t do that. If it did have a million parameters, you could instead do this:

user_loader loader;
loader.set_username(q);
loader.set_datasource(ds);
loader.set_user_origin_timestamp(ts);

user n = loader.load();

Now this class is extra useful because you can reuse it, by calling it repeatedly on different usernames but without resetting the other variables.

My nemesis, boolean parameters, with default parameters

Now you see this:

u = load_user_info(q, false);

What does that false mean? Who the hell knows! If you really need this, consider an enum with flags:

u = load_user_info(q, user_cache::no_history);

Consider exceptions

Exceptions in C++ have historically been unpopular for some reasons. These reasons are not really relevant anymore, and they can do great things to the readability of your program.

Use return values

Which of these two functions can return a null pointer? This is not a trick question:

user* load_user_info(const std::string &username);
user load_user_info(const std::string &username);

Only the first can, because the second one does not return a pointer. Duh. The second one never returns in a failure state. Now you’ll say “but then I need to have a whole try-catch block and that is incredibly wordy!” And you know, you’re right - it kinda sucks, but error handling is in general the hardest part of programming. Anyway, even if you don’t bother with that try-catch block, you have still dramatically improved the stability and security of your program. Your global event loop could just catch all exceptions and log them, as opposed to crashing.

People who modify your code will be tempted to assume that load_user_info never returns a null pointer, don’t let them crash the program by doing that, by never returning a null pointer!

Another good approach is to just return a user object that is invalid, as opposed to throwing.

Don’t have functions that return their success as a boolean

What does the bool that this function returns mean?

bool add_user_to_mailing_list(const user &u);

I can think of a few candidates:

Instead of a boolean, try an enum flag that has each of these.

exceptions are better assertions

void save_user_profile(const user &u)
{
    if (!u.valid())
        throw std::runtime_error("invalid user passed to save_user_profile");
    if (!db->is_open())
        throw std::runtime_error("database connection not open");
    if (u.password.empty())
        throw std::runtime_error("user has no password");
    // rest of function
}

Not only does this improve your program reliability, but it informs the reader of the function’s preconditions. This is similar to assert, but it doesn’t kill your program and won’t go away in a debug build. However, potentially having a lot of checks like this at every function start might be bad for performance, but most of the time performance doesn’t matter.

You don’t need a singleton

This one is my personal peeve, so I’ll repeat it.

You don’t need a singleton

Most arguments in favor of singletons all boil down to “my program can only have one of these” and 99% of the time, that’s just not true, and for 99% of the remaining 1%, even if it were true, it doesn’t help readability. Let’s disregard the bad design choice and just go for my argument-by-readability in opposition to singletons. A singleton tells your reader “only one of these can exist.” If your program has a connection to a database, your reader will say “only one database connection can exist” which is simply not true - you can connect twice. Please note that this statement is independent of the “only one of these should exist.”

One reason that singletons hurt readability so much is that a class that makes use of your singleton does not advertise through its API that it makes use of your singleton class.

The only time I would consider a singleton truly warranted is when you have a resource outside the control of your program that is limited to one accessor. Most of these are operating system limitations or logical limitations:

Don’t have every single piece of implementation as class members

Lets say you have a class like this:

class user
{
    database *db;
    // ... etc
private:
    std::string encode_in_database_format(const std::string &s) const;
};

This encode_in_database_format function takes a string, and modifies it in a way suitable for storing in a database, when you see someone calling this function in user’s implementation, it’s hard to know that this function only reads one value from the db and uses that to choose its encoding, or whatever.

If you take that function out of your header file, and make it a static non-member function at the top of your source file like so:

static std::string encode_in_database_format(const database *db, const std::string &s)

Then when you see someone calling it like this:

encode_in_database_format(db, s)

The very fact that the “db” is being passed there is a hint that this object doesn’t actually touch the user itself. They don’t even need to ever see its declaration to get that information; just passing the variable in directly has that effect. I think this might be one of those, more than the others on this page, that won’t be very helpful if your code is already a cesspool.

Conclusion

You can use C++’s syntax directly to hint the readers of your code about side effects and behavior. You can also provide these hints by just arranging your code in certain ways. Readability is more important than performance, much more. Most of the time, something that might damage readability might improve performance a tiny barely measurable amount (like <1%).

leave comment

Comments

[-] elghinn 9 months ago

Quoting wikipedia (https://en.wikipedia.org/wiki/C99): “[Since C99] variable declaration is no longer restricted to file scope or the start of a compound statement (block), similar to C++”

permalink reply
[-] charles 9 months ago

Most discussion has taken place on reddit

permalink reply