1. Introduction
is a widely-supported and very widely-used compiler extension for C and C++ which serves as an
alternative for traditional
-
-
include guards. This has previously been proposed for C and
C++, both in qualified and unqualified forms. This paper revisits the topic and presents a standard-defined unqualified
as well as a qualified version that is a shorthand for traditional include guard.
2. Motivation and Rationale for Standardization
is used in place of traditional guards because it’s less repetitive, it’s less of a distraction in code,
it’s not susceptible to qualifier collisions, and it’s less susceptible to mistakes such as typos in the qualifier,
forgetting to rename the qualifier during refactoring, or inadvertently not covering the entire header with the
. For these reasons
can be more robust than traditional
include guards, however, the
lack of standardization creates uncertainty about portability
Currently
is widely used as it is, in practice, a robust and portable facility. To many it has become a
de-facto language feature. However, conversely, many C and C++ programmers do not use
because it is non-standard, lacks clear semantics, and because of the looming question of subtle implementation
differences. This concern and uncertainty about such implementation differences is the origin for a lot of FUD
surrounding the facility.
From a user standpoint, it is unfortunate that an extension widespread enough to be a de-facto language feature is not standardized with clear behavior and portability guarantees.
2.1. Previous Proposals
Previous proposals include A Qualified Replacement for #pragma once [P0538] for C++ and
[N2896] for
C.
[P0538] proposes a
directive, avoiding the problem of determining file uniqueness by functioning as
a direct shorthand for the traditional three-directive
-based include guard. Factors that lead to this proposal
not being accepted included questions over standardizing
directly vs a similar lookalike and a hope that
modules would alleviate the need for such a preprocessor functionality.
[N2896] proposes both a
directive and a
directive, leaving the identification of file
uniqueness implementation-defined in the unqualified version.
2.2. Modules?
While modules will likely some day prevail over headers and the focus on modules likely played a role in why [P0538] was not accepted at the time, current practice indicates headers will continue to exist for the foreseeable future due to implementations requiring time to mature and new standards and practices taking time to adopt in C++ codebases.
2.3. This Proposal
This paper, in the interest of standardizing existing practices, proposes standardized semantics for an unqualified
. While current widespread use has shown that unqualified
is sufficient in the vast
majority of use cases, it is not an appropriate tool in all cases. Because of this, a qualified
shorthand for traditional
-based include guards is also proposed as a convenience.
3. Current Use
GitHub code search currently shows
appearing in 8.4 million C++ files hosted on GitHub, excluding
repository forks ([codesearch]), including in notable C++ codebases such as LLVM, nlohmann json, and Folly.
4. Compiler Support
has long been supported by all major implementations of C and C++:
Compiler | Support in Earliest CompilerVersion Available on Compiler Explorer |
---|---|
GCC | 3.4.6 ✔️ |
Clang | 3.0.0 ✔️ |
MSVC | 19.0 ✔️ |
ICC | 13.0.1 ✔️ |
ICX | 2021.1.2 ✔️ |
EDG | 6.5 ✔️ |
TCC | 0.9.27 ✔️ |
Movfuscator | Trunk ✔️ |
5. Determining File Uniqueness
The design of an unqualified
is complicated due to determining file uniqueness. Currently implementations
take varying approaches to this problem with varying pros and cons. Possible approaches include:
-
Canonical file paths (with symbolic links resolved etc.; MSVC currently appears to use this but without resolving links)
-
Filesystem unique id (inodes, etc., currently used by Clang)
-
File contents (currently used by GCC, in addition to a modification time check)
Each of these approaches can fail in unique manners, whether due to symbolic links, hard links, or copies of the same file in different locations, and each of the three most widely used C++ implementations have taken completely different approaches to determining file uniqueness.
GCC’s implementation with file content comparison isn’t optimal and can lead to unnecessary O(n^2) behavior https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58770. This isn’t by any means an optimal approach and a hash table could easily be used.
5.1. Shortcomings of Unqualified #pragma once
No matter how good the the approach to identifying file uniqueness there are some cases where
can lead
to multiple inclusion where a traditional include guard would not.
Consider a scenario such as:
usr / include / library / library . hpp vendored - dependency . hpp src / main . cpp vendored - dependency . hpp
With relevant files:
// main.cpp #include "vendored-dependency.hpp"#include <library/library.hpp>// library.hpp #include "vendored-dependency.hpp"
A traditional qualified include guard would prevent double inclusion here, however,
approaches based on
filesystem uniqueness or paths would not. GCC’s approach of using modification time and file contents would most likely
not either, even if the files are byte for byte equivalent generally you have to go out of your way to have the exact
same modification timestamp on two files.
Existing
implementations focus on unique entries on a filesystem (even GCC’s implementation due to the
modification timestamp check). Preventing multiple inclusion of headers, such as library headers, that may be used in
such a way that multiple identical or near-identical copies are included directly or transitively in a single
translation unit are outside the scope of an unqualified
. These should instead use qualified include
guards. Other than for this type of header file, unqualified
has proven to have few footguns.
In this example, considering GCC’s
semantics, if the timestamps are the same there still may be multiple
inclusions if the files don’t match exactly. One way this could happen is if the two vendored dependency headers are of
slightly different version. Version differences add another layer of complexity to the multiple inclusion problem. [P0538] proposed a qualified
directive that optionally includes a version. While it could be useful, this
paper does not propose such a utility in the interest of keeping the focus on standardizing existing practice.
Additionally, even in the case of traditional include guards libraries typically do not make any attempt to reconcile or
diagnose version discrepancies. This paper does not preclude adding such a facility in the future.
5.2. A Surprising #pragma once
Behavior
I have only once heard of
causing problems in practice. Consider a setup such as:
namespace v1 { #include "library_v1.hpp"} namespace v2 { #include "library_v2.hpp"}
Where both library headers included their own copy of a shared header. This may work under most circumstances with GCC due to its modification timestamp check, however, it can fail in scenarios where the files end up having identical modification times. This can happen in some automated testing setups, leading to a surprising discrepancy between local development and CI where there are errors in one namespace but not the other that are hard to reproduce locally.
It’s worth noting that while this behavior is surprising with
, this use of
clashes with the
concept of single-inclusion in general. Similar issues would emerge as a result of a qualified include guard as the
shared header used by both copies of the library would only be included in one namespace. Certainly, this isn’t an
idiomatic use of
.
Related gcc bugzilla entry: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52566.
6. Proposed Uniqueness System
We have learned from years of experience that in practice each of the approaches presented above, with the possible
exception of a file-path solution that doesn’t consider links, work well the vast majority of the time. Other than
headers which are more suited for qualified include guards (§ 5.1 Shortcomings of Unqualified #pragma once), surprising
behavior from unqualified
is exceptionally rare.
This paper proposes the GCC approach of file contents be used to determine file uniqueness for unqualified
because it is simple, portable, and robust in most cases. This avoids filesystem complexities by focusing
on the essence of header files, the contents within that you intend to include.
N.b.: This does not preclude implementation fast-path checks based on filesystem unique identifiers, canonical paths, or timestamps. In an optimized implementation a full comparison based on contents would only be needed in cases where multiple copies of the same header exist on disk and are included - everything else can be ruled out early.
7. Use Outside Headers
GCC and Clang (including LLVM-based compilers such as ICX) warn if
appears in source files. I don’t have
any strong opinion on this. This paper does not propose a mandate for such a warning in the interest of not breaking
existing code.
8. Directive Location
[N2896] proposes no preprocessing tokens shall appear before
. [P0538] suggests
apply to the rest of
the file from the point it is seen. The later doesn’t mention what to do in the following case:
#if foobar #once whatever #endif
Current practice is that if
appears anywhere in the file the whole file should receive the
treatment. In GCC, Clang, and MSVC,
in an
only applies if the condition is true.
// ---------------- // int x ; #pragma once // ok, applies to whole file // ---------------- // #if 1 #pragma once // ok, applies to whole file #endif // ---------------- // #if 0 #pragma once // doesn’t apply to whole file #endif
This paper does not propose a requirement that
should appear at the beginning of a header file.
9. Proposed Spelling
Existing practice is
and this paper proposes the same spelling in the interest of standardizing existing
practices. Other proposals have changed the spelling to
since
is by definition for
implementation-defined things. This paper proposes the spelling remain
, as it is simply pragmatic. This
allows existing code to benefit from new standardization without change and avoids any need to go out of the way to
support old standards, e.g. with:
#if defined(__cplusplus) && __cplusplus >= xxxxxx #once // new portable behavior #else #pragma once // old behavior that was good enough in practice #endif
10. Proposed Wording
Proposed wording relative to [N4950]:
Update [cpp.pragma/1]:
A preprocessing directive of the formwhere# pragma pp-tokens opt new-line does not begin with the identifier
pp - tokens causes the implementation to behave in an implementation-defined manner. The behavior may cause translation to fail or cause the translator or the resulting program to behave in a non-conforming manner. Any pragma that is not recognized by the implementation is ignored.
once
Add two paragraphs following [cpp.pragma/1]:
A preprocessing directive of the formshall cause no subsequent# pragma once new-line directives to perform replacement for a file with text contents identical to this file.
#include
A preprocessing directive of the formshall cause the implementation to behave as if the entire file’s contents are are wrapped inside an# pragma once identifier new-line which immediately defines
#ifdef identifier :
identifier #ifndef identifier #define identifier // file contents #endif
Add a note following the last paragraph, with updating numbering on subsequent notes:
[Note 1: If a preprocessing directive beginning withappears outside a header file the implementation may issue a warning. - end note]# pragma once