Skip to main content

C++ Systems · 45 min

Templates and Compile-Time Programming

C++ templates instantiate type-specific code at compile time. This page covers deduction, specialization, SFINAE, concepts, variadic packs, and constexpr evaluation.

Why This Matters

std::vector<int> and std::vector<double> are different instantiated types. A call to max<int>(3, 7) compiles to integer compare and move instructions; a call to max<double>(3.0, 7.0) compiles to floating-point compare and move instructions. No runtime type tag is needed for that dispatch.

This is why Eigen can turn a matrix expression into a fused loop, why std::ranges reports constraint failures near the user’s code, and why std::variant<int, float> stores a fixed finite set of alternatives without heap allocation for the alternatives themselves. The cost moves to compilation: more instantiations, larger debug symbols, and diagnostics that can span dozens of lines.

Core Definitions

Definition

Template

A C++ template is a parameterized declaration. Its parameters can be types, values, or other templates. The compiler forms concrete declarations by instantiating the template with arguments.

Definition

Template instantiation

Instantiation is the compile-time creation of a concrete function, class, variable, or alias from a template. max<int> and max<double> are separate function template specializations.

Definition

Substitution failure

Substitution failure happens when replacing template parameters with deduced or explicit arguments produces an invalid type or expression in an immediate context. Under SFINAE, that failed candidate is removed from overload resolution instead of ending compilation.

Definition

Constant evaluation

Constant evaluation is execution by the compiler for expressions that must, or can, be computed at compile time. constexpr marks functions and objects that can participate when their inputs are compile-time constants.

Function Templates and Type-Specific Code

A function template describes a family of functions. The usual example is small but exact.

template <typename T>
T max_value(T a, T b) {
    return b < a ? a : b;
}

int xi = max_value(3, 7);          // T deduced as int
double xd = max_value(3.0, 7.0);   // T deduced as double
auto xl = max_value<long>(3, 7);   // explicit T, both arguments converted to long

Template argument deduction uses parameter types and argument expressions. In max_value(3, 7), both arguments have type int, so T = int. In max_value(3.0, 7.0), T = double. The mixed call max_value(3, 7.0) fails because one argument wants T = int and the other wants T = double. C++ does not pick a common type for this single-parameter template.

You can write the common-type policy yourself.

#include <type_traits>

template <typename A, typename B>
std::common_type_t<A, B> max_common(A a, B b) {
    using R = std::common_type_t<A, B>;
    return static_cast<R>(b) < static_cast<R>(a) ? static_cast<R>(a) : static_cast<R>(b);
}

auto y = max_common(3, 7.0); // y has type double, value 7.0

On x86-64 with optimization, an instantiated integer maximum can compile to code shaped like this.

# int max_value<int>(int, int)
# System V ABI: a in edi, b in esi, return in eax
mov     eax, edi
cmp     esi, edi
cmovg   eax, esi
ret

A double version uses scalar floating-point registers.

# double max_value<double>(double, double)
# a in xmm0, b in xmm1, return in xmm0
maxsd   xmm0, xmm1
ret

The source template contains one body. The object file contains machine code for the instantiations that are odr-used in that translation unit, subject to inlining and link-time merging.

Class Templates, Layout, and Specialization

A class template parameterizes layout and member function types. The standard library declares std::vector as a class template, with extra allocator parameters omitted here.

template <typename T>
class vector {
    // real std::vector has allocator machinery and exact layout is not standardized
};

A small fixed-size class template shows the layout effect without relying on library internals.

#include <cstddef>

template <typename T, std::size_t N>
struct StaticVector {
    T data[N];
    std::size_t size;
};

static_assert(sizeof(StaticVector<int, 3>) == 24);
static_assert(sizeof(StaticVector<double, 3>) == 32);

On a common 64-bit ABI, StaticVector<int, 3> contains 12 bytes of int data, 4 bytes of padding to align std::size_t, then 8 bytes for size.

StaticVector<int, 3>, 24 bytes

offset  bytes  field
0       4      data[0]
4       4      data[1]
8       4      data[2]
12      4      padding
16      8      size

For StaticVector<double, 3>, the three doubles occupy 24 bytes and size occupies 8 bytes.

StaticVector<double, 3>, 32 bytes

offset  bytes  field
0       8      data[0]
8       8      data[1]
16      8      data[2]
24      8      size

A full specialization replaces the whole template for one exact argument list.

template <typename T>
struct TypeName {
    static constexpr const char* value = "other";
};

template <>
struct TypeName<int> {
    static constexpr const char* value = "int";
};

static_assert(TypeName<double>::value[0] == 'o');
static_assert(TypeName<int>::value[0] == 'i');

A partial specialization matches a subset of patterns. This is central to traits libraries.

template <typename T>
struct IsPointer {
    static constexpr bool value = false;
};

template <typename T>
struct IsPointer<T*> {
    static constexpr bool value = true;
};

static_assert(!IsPointer<int>::value);
static_assert(IsPointer<int*>::value);
static_assert(IsPointer<const double*>::value);

The primary template covers all T. The partial specialization covers all pointer types T*. The compiler selects the most specialized matching definition.

Variadic Templates and Parameter Packs

A variadic template accepts a pack of zero or more template arguments. The pack can be expanded in a type list, an expression list, or a fold expression.

template <typename... Ts>
struct TypeList {
    static constexpr unsigned count = sizeof...(Ts);
};

static_assert(TypeList<>::count == 0);
static_assert(TypeList<int, double, char>::count == 3);

Fold expressions give a compact way to reduce a pack.

template <typename... Ints>
constexpr auto sum(Ints... xs) {
    return (xs + ... + 0);
}

static_assert(sum() == 0);
static_assert(sum(10, 20, 7) == 37);

For sum(10, 20, 7), the expression behaves like (((10 + 20) + 7) + 0). The compiler knows the argument count and types at instantiation.

Parameter packs also drive generic storage. A simplified variant-like object for int and double needs storage large enough for the biggest alternative and alignment strict enough for both.

#include <cstddef>
#include <type_traits>

struct Two {
    unsigned char tag;
    alignas(double) unsigned char storage[8];
};

static_assert(sizeof(Two) == 16);

A typical byte layout is below. The tag needs 1 byte, then 7 bytes pad the storage to an 8-byte boundary, then 8 bytes store either an int payload with unused bytes or a double payload.

Two, 16 bytes

offset  bytes  meaning
0       1      tag, 0 for int and 1 for double
1       7      padding
8       8      payload storage

std::variant<int, double> has the same design pressure, but its exact layout is implementation-defined. The type list is known at compile time, so visitation can instantiate one branch per alternative.

SFINAE and Concepts

SFINAE is a filter on overload candidates. It is not a general exception mechanism. It applies during template argument substitution in immediate contexts.

#include <type_traits>

template <typename T>
auto size_of_range(const T& x) -> decltype(x.size()) {
    return x.size();
}

unsigned size_of_range(...) {
    return 0;
}

#include <vector>
static_assert(size_of_range(std::vector<int>{1, 2, 3}) == 3);
static_assert(size_of_range(42) == 0);

For std::vector<int>, decltype(x.size()) is valid and the function template remains a candidate. For int, x.size() is invalid during substitution, so the template is removed and the ellipsis overload is chosen.

C++20 concepts make the constraint explicit and make diagnostics shorter.

#include <concepts>

template <typename T>
concept HasSize = requires(const T& x) {
    { x.size() } -> std::convertible_to<unsigned long>;
};

template <HasSize T>
auto range_size(const T& x) {
    return x.size();
}

If range_size(42) is compiled, the diagnostic names the failed requirement x.size() rather than reporting a long chain of overload attempts. Concepts do not change the runtime model. They change which templates are viable and when constraint failure is reported.

A concept can also encode semantic expectations, though the compiler checks only the syntactic and type requirements written in the requires expression.

template <typename T>
concept Addable = requires(T a, T b) {
    { a + b } -> std::same_as<T>;
};

This checks that a + b exists and returns T. It does not prove associativity.

constexpr and Compile-Time Evaluation

A constexpr function is ordinary C++ with restrictions suitable for constant evaluation. If all inputs are compile-time constants and the context requires a constant expression, the compiler evaluates it.

constexpr unsigned factorial(unsigned n) {
    unsigned r = 1;
    for (unsigned i = 2; i <= n; ++i) {
        r *= i;
    }
    return r;
}

static_assert(factorial(5) == 120);

unsigned f(unsigned x) {
    return factorial(x); // runtime call or inline loop when x is not constant
}

The same function has two lives. factorial(5) inside static_assert must be evaluated by the compiler. factorial(x) in f uses runtime code because x is not known.

Compile-time computation can create lookup tables with no startup cost.

#include <array>

constexpr std::array<unsigned char, 16> make_nibble_popcount() {
    std::array<unsigned char, 16> a{};
    for (unsigned i = 0; i < 16; ++i) {
        unsigned v = i;
        unsigned c = 0;
        while (v != 0) {
            c += v & 1u;
            v >>= 1;
        }
        a[i] = static_cast<unsigned char>(c);
    }
    return a;
}

constexpr auto pc = make_nibble_popcount();
static_assert(pc[0] == 0);
static_assert(pc[7] == 3);
static_assert(pc[15] == 4);

The table bytes are the popcounts for values 0 through 15.

index  0 1 2 3 4 5 6 7 8 9 A B C D E F
byte   0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4

A compiler can place these 16 bytes in read-only data. If every use is constant-folded, it might not emit the table at all.

Expression Templates, Ranges, and Variant

Expression templates store an expression tree in types rather than computing each operator immediately. Eigen uses this pattern for dense linear algebra.

template <typename L, typename R>
struct AddExpr {
    const L& l;
    const R& r;

    double operator[](int i) const {
        return l[i] + r[i];
    }
};

template <typename L, typename R>
AddExpr<L, R> operator+(const L& l, const R& r) {
    return {l, r};
}

For a + b + c, the type is nested: AddExpr<AddExpr<A, B>, C>. A later assignment loop can compute each output element once.

for (int i = 0; i != n; ++i) {
    out[i] = expr[i]; // expands to a[i] + b[i] + c[i]
}

Without expression templates, a library might allocate a temporary for a + b, then add c. For n = 1,000,000 doubles, the temporary is 8,000,000 bytes. The expression-template version can remove that temporary when the access pattern is simple and aliases are handled.

std::ranges uses templates and concepts to express iterator and range requirements. std::variant uses a type parameter pack to enumerate alternatives at compile time. These facilities are library code, but they depend on the language rules covered above: instantiation, specialization, overload resolution, and constant evaluation.

Key Result

The main invariant is zero-overhead abstraction under successful optimization: a template abstraction does not require runtime type dispatch when its choices are compile-time parameters.

For a template with kk distinct argument lists used in a program, the compiler may instantiate up to kk concrete bodies. If one body has text size BiB_i, the total emitted text before linker merging is bounded by

i=1kBi\sum_{i=1}^{k} B_i

Inlining changes the accounting. The code may shrink because calls disappear and constants propagate, or grow because the function body is copied into callers.

A second invariant governs compile-time computation. If expression EE is required to be a constant expression and its evaluation satisfies the language rules, then its result is part of the compiled program. If the same expression depends on runtime input, the compiled program contains code to compute it.

constexpr int twice(int x) { return 2 * x; }

static_assert(twice(21) == 42); // compile-time result

int g(int x) {
    return twice(x);            // runtime arithmetic when x is runtime data
}

The abstraction boundary is source-level. The machine boundary is after instantiation, overload selection, optimization, and linking.

Common Confusions

Watch Out

Templates are not runtime generics

Java-style generic code often uses one runtime representation for many type arguments. C++ templates instantiate concrete code for concrete arguments. vector<int> and vector<double> are different types with different element sizes and different generated member functions.

Watch Out

SFINAE does not hide errors inside a function body

SFINAE applies to substitution in immediate contexts such as parameter types, return types, and requires clauses. If the template is selected and its body contains an invalid expression for that T, compilation fails.

Watch Out

constexpr does not force compile-time execution everywhere

constexpr permits constant evaluation. It forces it only in contexts such as static_assert, array bounds that require constants, non-type template arguments, and initialization of constexpr variables. A call with runtime input still runs at runtime.

Watch Out

Concepts are checked constraints, not proofs of algebraic laws

A concept can require a + b to compile and return a chosen type. It cannot prove that addition is associative, commutative, or numerically stable unless those facts are encoded by tests or external proof tools.

Exercises

ExerciseCore

Problem

For the template below, state which calls compile and give the deduced T when they do.

template <typename T>
T max_value(T a, T b) {
    return b < a ? a : b;
}

auto a = max_value(4, 9);
auto b = max_value(4.0, 9.0);
auto c = max_value(4, 9.0);
auto d = max_value<double>(4, 9.0);
ExerciseCore

Problem

Assume a 64-bit ABI where sizeof(int) = 4, alignof(int) = 4, sizeof(double) = 8, alignof(double) = 8, and sizeof(std::size_t) = 8. Compute the size and field offsets for:

template <typename T, std::size_t N>
struct StaticVector {
    T data[N];
    std::size_t size;
};

using A = StaticVector<int, 5>;
using B = StaticVector<double, 2>;
ExerciseAdvanced

Problem

Write a C++20 concept HasBracketZero that accepts a type T when x[0] is a valid expression for const T& x. Then write a constrained function first_or_zero that returns x[0] for accepted types.

References

Canonical:

  • Bjarne Stroustrup, A Tour of C++ (3rd ed., 2022), ch. 7-8 — templates, concepts, library generic programming
  • Bjarne Stroustrup, The C++ Programming Language (4th ed., 2013), ch. 23-29 — templates, generic programming, specialization, metaprogramming
  • David Vandevoorde, Nicolai M. Josuttis, and Douglas Gregor, C++ Templates: The Complete Guide (2nd ed., 2017), ch. 1-5, 8, 15-19 — deduction, specialization, SFINAE, variadic templates
  • Randal E. Bryant and David R. O'Hallaron, Computer Systems: A Programmer's Perspective (3rd ed., 2016), ch. 3 and §6.4, machine code, object files, memory layout
  • ISO/IEC, ISO/IEC 14882:2020 Programming Languages C++, §7.7, §9.2.9, §13 ; constant expressions, templates, constraints

Accessible:

  • cppreference.com, “Templates,” “Constraints and concepts,” and “constexpr specifier”
  • Andrew Koenig and Barbara E. Moo, Accelerated C++ (2000), ch. 8-11
  • Eigen documentation, “Lazy Evaluation and Aliasing,” expression-template behavior in matrix expressions

Next Topics

  • /computationpath/cpp-object-lifetime-and-raii
  • /computationpath/cpp-move-semantics-and-value-categories
  • /computationpath/cpp-memory-layout-and-abi
  • /computationpath/static-polymorphism-and-crtp