auto abuse considered harmful

Posted by Edouard on 27 Apr 2020

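auto gained its current meaning, type deduction, in C++11 (the keyword itself is older). One of the main reasons was the need to store lambdas, whose types have no name you can write.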

Imagine writing this without auto:

auto l = [](int i) { return i * i;};

Now suddenly, that wisdom tooth you had extracted under a local anesthetic that decided to wear off in the middle of the procedure doesn't sound like the worst experience of your life anymore. Yes, even if your dentist has questionable taste in music.
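
For the record, here is roughly what the alternatives look like. This is only a sketch: the names square_ptr and square_fn are made up for the illustration, and neither option stores the lambda under its own type.

```cpp
#include <functional>

// A lambda's closure type has no name you can spell, so without auto
// you have to convert it to something else.

// Option 1: a plain function pointer. This only works because the
// lambda captures nothing.
int (*square_ptr)(int) = [](int i) { return i * i; };

// Option 2: std::function. This works with captures too, but it adds
// type erasure (and possibly an allocation and an indirect call).
std::function<int(int)> square_fn = [](int i) { return i * i; };
```

Both can be called like the original, for example square_ptr(3) or square_fn(4), but neither is as direct as auto.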

So auto is terrific. It removes many headaches and makes writing generic code much easier.

Why not use it everywhere? The code will be more generic! It's awesome! Right? Right?

A guideline for auto

The generally accepted guideline for auto is

If replacing your type with auto doesn't potentially break the algorithm, using auto is appropriate.

Formulated differently:

If auto creates degenerate cases, stick to explicit typing.

So what's a degenerate case? Read on.

auto messing with sign

auto l = one_function();
auto r = another_function();
if (l < r) { }

What wonderfully generic code, isn't it? All you need is a valid comparison operator between l and r, and you're done.

We'll put aside the problem of readability for now.

The real problem is that if l is an unsigned int and r is a signed int, r is converted to unsigned int before the comparison. A negative r then wraps around to a huge positive value, so l < r can be true even though r is negative.

While your compiler may issue you a warning, this is very hard to see in a code review.
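
To make the failure concrete, here is a minimal sketch. The bodies of one_function and another_function are assumptions chosen to trigger the bug; the article does not say what they return.

```cpp
// Hypothetical stand-ins for the two functions in the snippet above.
unsigned int one_function() { return 1u; }
int another_function() { return -1; }

bool auto_compare()
{
    auto l = one_function();     // deduced as unsigned int
    auto r = another_function(); // deduced as int
    // r is converted to unsigned int before the comparison, so -1
    // becomes UINT_MAX and the test is true even though 1 > -1.
    return l < r;
}
```

With these values auto_compare() returns true, which is almost certainly not what the author of the comparison had in mind.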

One way to avoid it is to avoid auto altogether and write the proper comparison:

unsigned int l = one_function();
int r = another_function();
// a negative r can never be greater than an unsigned l
if ((r >= 0) && (l < static_cast<unsigned int>(r)))
{
    // some stuff is happening here
}

You could argue that it's possible to write a signed/unsigned safe comparison function, but then we're back to square one: auto hid the fact that such a function was needed in the first place.
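
Such a helper could look like the sketch below. safe_less is a hypothetical name; C++20 ships the same idea as std::cmp_less in <utility>. Either way, you still have to know that the operands differ in signedness to reach for it.

```cpp
// Returns true when l is mathematically less than r, whatever the sign of r.
constexpr bool safe_less(unsigned int l, int r)
{
    // A negative r is smaller than any unsigned value.
    // Otherwise r fits in unsigned int and the cast is lossless.
    return r >= 0 && l < static_cast<unsigned int>(r);
}
```

Here safe_less(1u, -1) is false and safe_less(1u, 2) is true, matching what the math says.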

Protip: when integers enter the room, use explicit typing.

auto messing with serialization

Let's have a look at this code:

const auto r = data_producer();
const auto buf = serialize(r);
if (auto ec = send(s, buf)) { /* error handling */}

A mixed bag, isn't it? On the one hand, auto makes the code terser and more generic; on the other hand, it obfuscates the types. But again, we're not going to talk about readability (yet).

The problem is that this code is potentially not cross-platform and can create hard-to-track bugs.

If data_producer() returns a size_t, depending on the platform (or compiler), it may be a 32-bit integer or a 64-bit integer. Assuming the serialize function has an overload for both types, the code compiles without a single warning on both platforms, thanks to the magic of genericity.

If a program running on platform A sends a message over the wire to a program running on platform B, and the two disagree on the size of the integer, platform B will, at best, return an error.

In practice, the above code is generally hidden in subfunctions, as we serialize structures rather than integers, but the problem is identical.
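
Here is a minimal sketch of how the size leaks into the wire format. Both functions are assumptions: the article does not show serialize or data_producer, and this serialize is a template that copies raw bytes rather than a set of overloads.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical generic serializer: copies the raw bytes of any integer.
template <typename T>
std::vector<unsigned char> serialize(T value)
{
    std::vector<unsigned char> buf(sizeof(T));
    std::memcpy(buf.data(), &value, sizeof(T));
    return buf;
}

// Hypothetical producer whose return type varies across platforms.
std::size_t data_producer() { return 42; }

std::size_t wire_size()
{
    const auto r = data_producer(); // std::size_t: typically 4 or 8 bytes
    const auto buf = serialize(r);  // compiles silently either way
    return buf.size();
}
```

wire_size() returns sizeof(std::size_t), so the same source produces messages of different lengths depending on where it was compiled.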

The solution is to use integers of the exact size and avoid auto like the plague:

const std::uint64_t r = data_producer();
const auto buf = serialize(r);
if (auto ec = send(s, buf)) { /* error handling */ }

We need to "pin" the type to ensure the behavior of our code doesn't change.
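
A sketch of what pinning buys you, again with hypothetical definitions of serialize and data_producer. Fixing the byte order (little-endian here) is an extra assumption that goes beyond the snippet above, but it follows the same logic: nothing on the wire should depend on the platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical serializer for one exact type with a fixed byte order
// (little-endian), so every platform emits the same 8 bytes.
std::vector<unsigned char> serialize(std::uint64_t value)
{
    std::vector<unsigned char> buf(8);
    for (std::size_t i = 0; i < buf.size(); ++i)
        buf[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
    return buf;
}

// Hypothetical producer; std::size_t may be 32 or 64 bits wide.
std::size_t data_producer() { return 42; }

std::vector<unsigned char> make_message()
{
    const std::uint64_t r = data_producer(); // pinned: always 64 bits
    return serialize(r);
}
```

Whatever the width of std::size_t, make_message() yields 8 bytes with the value in the first one.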

Protip: if it goes through the wire, integers must be rated R (Because they have explicit content, thank you, I am here all night).

auto the cryptographically strong code cipher

Yes, auto makes your code more generic, and yes, auto saves you a lot of time, but look at this code:

auto v = f();
if (auto ec = check(v)) { /* error handling */ }
auto l = [](const auto & p) { return modify(p); };
auto s = modify(v, l);
// ad nauseam

You could get around the problem with more specific variable or function names.

That's a terrible idea. Would you trade compiler-enforced constraints (e.g., the code cannot compile if the variable is invalid) for coding conventions that cannot be 100% enforced?

With explicit typing of v and s, the code would be much easier to grok. The problem is that auto adds one level of indirection to understand what is happening, requiring your IDE to help you figure out what's wrong.
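
For comparison, here is the same flow with v and s typed explicitly. Everything in it is hypothetical: the article never says what f, check and modify are, so the definitions below exist only to make the sketch compile.

```cpp
#include <algorithm>
#include <system_error>
#include <vector>

// Hypothetical definitions, invented for this illustration.
std::vector<int> f() { return {1, 2, 3}; }

std::error_code check(const std::vector<int> & v)
{
    return v.empty() ? std::make_error_code(std::errc::invalid_argument)
                     : std::error_code{};
}

int modify(int p) { return p * 2; }

template <typename Fn>
std::vector<int> modify(std::vector<int> v, Fn fn)
{
    std::transform(v.begin(), v.end(), v.begin(), fn);
    return v;
}

std::vector<int> process()
{
    const std::vector<int> v = f();            // a list of ints, visible at a glance
    if (std::error_code ec = check(v)) { return {}; } // error handling
    auto l = [](int p) { return modify(p); };  // auto is still right for a lambda
    const std::vector<int> s = modify(v, l);   // same shape as the input
    return s;
}
```

The lambda keeps auto because its type cannot be written; everything else now tells the reader what it is.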

Granted, it is less generic. Is it a bad thing?

The answer is: it depends. And unfortunately, you can't solve software engineering problems with a book of rules.

One guideline you can use, on top of the one given at the beginning of this article, is: "don't use auto if the type isn't obvious, and the type contributes greatly to understanding the surrounding code."

auto as a return value

auto as a return type, for functions and methods alike, is a specific case of the above that deserves special attention.

When your function returns auto, you are requesting extra work from the reader to understand how to use your code.

Use auto as a return value only if the code needs to be generic or the type is too complex to write (and thus read).

In any other instance: do your homework and write the type.
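
A small sketch of the difference, assuming C++14 or later for return type deduction. All three functions are made up for the illustration.

```cpp
#include <map>
#include <string>

// Hard on the reader: the return type is only visible in the body.
auto make_index_opaque()
{
    std::map<std::string, int> index;
    index["auto"] = 1;
    return index;
}

// Easier: the signature says what you get.
std::map<std::string, int> make_index()
{
    std::map<std::string, int> index;
    index["auto"] = 1;
    return index;
}

// A case where auto earns its keep: the result type depends on both
// operand types and would be tedious to spell out.
template <typename A, typename B>
auto add(A a, B b)
{
    return a + b;
}
```

The first two functions behave identically; only the second one can be used without opening its body.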

Wrap up

As with every shiny new thing, it is tempting to believe that auto's powers are limitless and to use it as much as possible.

My hope is that this blog post helps you understand the limits of that approach.

  • Use auto when it doesn't create cases where the code breaks down
  • Use explicit typing with integers to avoid signed/unsigned comparison bugs
  • Use exact-sized integers for data structures to avoid cross-platform bugs
  • Don't abuse auto: it irritates the reader
    • Don't use auto for variables when the type conveys meaning
    • Use it for return types only if the code needs to be generic or the type is too complex to write

Special thanks to Joel Falcou for helpful comments while writing this article.

Topics: c++, auto