Difference between revisions of "C++ style guide"
Gongminmin (Talk | contribs) |
Functions defined in the same compilation unit as production classes may introduce unnecessary coupling and link-time dependencies when directly called from other compilation units; static member functions are particularly susceptible to this. Consider extracting a new class, or placing the functions in a namespace possibly in a separate library. | Functions defined in the same compilation unit as production classes may introduce unnecessary coupling and link-time dependencies when directly called from other compilation units; static member functions are particularly susceptible to this. Consider extracting a new class, or placing the functions in a namespace possibly in a separate library. | ||
− | If you must define a nonmember function and it is only needed in its . | + | If you must define a nonmember function and it is only needed in its .cpp file, use an unnamed namespace or static linkage (eg static int Foo() {...}) to limit its scope. |
=== Local Variables === | === Local Variables === | ||
Line 366: | Line 366: | ||
If you need a static or global variable of a class type, consider initializing a pointer (which will never be freed), from either your main() function or from pthread_once(). Note that this must be a raw pointer, not a "smart" pointer, since the smart pointer's destructor will have the order-of-destructor issue that we are trying to avoid. | If you need a static or global variable of a class type, consider initializing a pointer (which will never be freed), from either your main() function or from pthread_once(). Note that this must be a raw pointer, not a "smart" pointer, since the smart pointer's destructor will have the order-of-destructor issue that we are trying to avoid. | ||
+ | |||
+ | == Classes == | ||
+ | |||
+ | Classes are the fundamental unit of code in C++. Naturally, we use them extensively. This section lists the main dos and don'ts you should follow when writing a class. | ||
+ | |||
+ | === Doing Work in Constructors === | ||
+ | |||
+ | In general, constructors should merely set member variables to their initial values. Any complex initialization should go in an explicit Init() method. | ||
+ | |||
+ | ==== Definition: ==== | ||
+ | It is possible to perform initialization in the body of the constructor. | ||
+ | |||
+ | ==== Pros: ==== | ||
+ | Convenience in typing. No need to worry about whether the class has been initialized or not. | ||
+ | |||
+ | ==== Cons: ==== | ||
+ | The problems with doing work in constructors are: | ||
+ | |||
+ | *There is no easy way for constructors to signal errors, short of using exceptions (which are forbidden). | ||
+ | *If the work fails, we now have an object whose initialization code failed, so it may be in an indeterminate state. | |
+ | *If the work calls virtual functions, these calls will not get dispatched to the subclass implementations. Future modification to your class can quietly introduce this problem even if your class is not currently subclassed, causing much confusion. | ||
+ | *If someone creates a global variable of this type (which is against the rules, but still), the constructor code will be called before main(), possibly breaking some implicit assumptions in the constructor code. | ||
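The virtual-function pitfall in the list above can be seen in a small sketch. The classes Base and Derived and the accessor name_at_construction() are hypothetical names chosen for illustration only.

```cpp
#include <string>

// Hypothetical classes, for illustration only.
class Base
{
public:
    Base()
    {
        // While Base's constructor runs, the object is still only a Base,
        // so this call resolves to Base::Name(), never to an override.
        name_at_construction_ = Name();
    }

    virtual ~Base()
    {
    }

    virtual std::string Name() const
    {
        return "Base";
    }

    std::string const & name_at_construction() const
    {
        return name_at_construction_;
    }

private:
    std::string name_at_construction_;
};

class Derived : public Base
{
public:
    virtual std::string Name() const
    {
        return "Derived";
    }
};
```

A Derived object therefore records "Base" during construction, even though calling Name() on the finished object yields "Derived".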
+ | |||
+ | ==== Decision: ==== | ||
+ | If your object requires non-trivial initialization, consider having an explicit Init() method. In particular, constructors should not call virtual functions, attempt to raise errors, access potentially uninitialized global variables, etc. | ||
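One possible shape of the Init() approach is sketched below. ConfigFile, its members, and the rule that an empty path counts as a failure are assumptions made up for this example, not part of KlayGE.

```cpp
#include <string>

// Hypothetical class: the constructor only stores values, while Init()
// performs the work that can fail and reports failure by return value.
class ConfigFile
{
public:
    explicit ConfigFile(std::string const & path)
        : path_(path), loaded_(false)
    {
    }

    // Returns false on failure instead of throwing an exception.
    // For this sketch, an empty path is the only failure case.
    bool Init()
    {
        if (path_.empty())
        {
            return false;
        }
        loaded_ = true;
        return true;
    }

    bool loaded() const
    {
        return loaded_;
    }

private:
    std::string path_;
    bool loaded_;

    // Copying is disabled: declared but never defined.
    ConfigFile(ConfigFile const &);
    void operator=(ConfigFile const &);
};
```

Callers construct the object first and then check the result of Init(), so a failed initialization is visible at the call site.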
+ | |||
+ | === Default Constructors === | ||
+ | |||
+ | You must define a default constructor if your class defines member variables and has no other constructors. Otherwise the compiler will do it for you, badly. | ||
+ | |||
+ | ==== Definition: ==== | ||
+ | The default constructor is called when we new a class object with no arguments. It is always called when calling new[] (for arrays). | ||
+ | |||
+ | ==== Pros: ==== | ||
+ | Initializing structures by default, to hold "impossible" values, makes debugging much easier. | ||
+ | |||
+ | ==== Cons: ==== | ||
+ | Extra work for you, the code writer. | ||
+ | |||
+ | ==== Decision: ==== | ||
+ | If your class defines member variables and has no other constructors you must define a default constructor (one that takes no arguments). It should preferably initialize the object in such a way that its internal state is consistent and valid. | ||
+ | |||
+ | The reason for this is that if you have no other constructors and do not define a default constructor, the compiler will generate one for you. This compiler generated constructor may not initialize your object sensibly. | ||
+ | |||
+ | If your class inherits from an existing class but you add no new member variables, you are not required to have a default constructor. | ||
+ | |||
+ | === Explicit Constructors === | ||
+ | |||
+ | Use the C++ keyword explicit for constructors with one argument. | ||
+ | |||
+ | ==== Definition: ==== | ||
+ | Normally, if a constructor takes one argument, it can be used as a conversion. For instance, if you define Foo::Foo(string name) and then pass a string to a function that expects a Foo, the constructor will be called to convert the string into a Foo and will pass the Foo to your function for you. This can be convenient but is also a source of trouble when things get converted and new objects created without you meaning them to. Declaring a constructor explicit prevents it from being invoked implicitly as a conversion. | ||
+ | |||
+ | ==== Pros: ==== | ||
+ | Avoids undesirable conversions. | ||
+ | |||
+ | ==== Cons: ==== | ||
+ | None. | ||
+ | |||
+ | ==== Decision: ==== | ||
+ | We require all single argument constructors to be explicit. Always put explicit in front of one-argument constructors in the class definition: explicit Foo(string name); | ||
+ | |||
+ | The exception is copy constructors, which, in the rare cases when we allow them, should probably not be explicit. Classes that are intended to be transparent wrappers around other classes are also exceptions. Such exceptions should be clearly marked with comments. | ||
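A minimal sketch of the effect of explicit follows. Label and NameLength() are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class with an explicit one-argument constructor.
class Label
{
public:
    explicit Label(std::string const & name)
        : name_(name)
    {
    }

    std::string const & name() const
    {
        return name_;
    }

private:
    std::string name_;
};

// Takes a Label; a std::string is not silently converted into one.
std::size_t NameLength(Label const & label)
{
    return label.name().size();
}
```

With this declaration, NameLength(Label("hud")) compiles, while NameLength(std::string("hud")) is rejected by the compiler because the conversion would have to be implicit.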
+ | |||
+ | === Copy Constructors === | ||
+ | |||
+ | Provide a copy constructor and assignment operator only when necessary. Otherwise, disable them with private copy constructor and assignment operator. | ||
+ | |||
+ | ==== Definition: ==== | ||
+ | The copy constructor and assignment operator are used to create copies of objects. The copy constructor is implicitly invoked by the compiler in some situations, e.g. passing objects by value. | ||
+ | |||
+ | ==== Pros: ==== | ||
+ | Copy constructors make it easy to copy objects. STL containers require that all contents be copyable and assignable. Copy constructors can be more efficient than CopyFrom()-style workarounds because they combine construction with copying, the compiler can elide them in some contexts, and they make it easier to avoid heap allocation. | ||
+ | |||
+ | ==== Cons: ==== | ||
+ | Implicit copying of objects in C++ is a rich source of bugs and of performance problems. It also reduces readability, as it becomes hard to track which objects are being passed around by value as opposed to by reference, and therefore where changes to an object are reflected. | ||
+ | |||
+ | ==== Decision: ==== | ||
+ | Few classes need to be copyable. Most should have neither a copy constructor nor an assignment operator. In many situations, a pointer or reference will work just as well as a copied value, with better performance. For example, you can pass function parameters by reference or pointer instead of by value, and you can store pointers rather than objects in an STL container. | ||
+ | |||
+ | If your class needs to be copyable, prefer providing a copy method, such as CopyFrom() or Clone(), rather than a copy constructor, because such methods cannot be invoked implicitly. If a copy method is insufficient in your situation (e.g. for performance reasons, or because your class needs to be stored by value in an STL container), provide both a copy constructor and assignment operator. | ||
+ | |||
+ | If your class does not need a copy constructor or assignment operator, you must explicitly disable them. To do so, add dummy declarations for the copy constructor and assignment operator in the private: section of your class, but do not provide any corresponding definition (so that any attempt to use them results in a link error). For example, in class Foo: | ||
+ | |||
+ | <pre> | ||
+ | class Foo | ||
+ | { | ||
+ | public: | ||
+ | Foo(int f0, int f1); | ||
+ | ~Foo(); | ||
+ | |||
+ | private: | ||
+ | Foo(Foo const &); | ||
+ | void operator=(Foo const &); | ||
+ | }; | ||
+ | </pre> | ||
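The two recommendations can be combined: disable implicit copying and offer an explicit copy method. The sketch below assumes a hypothetical Counter class; the names are illustrative only.

```cpp
// Hypothetical class that can only be copied through CopyFrom().
class Counter
{
public:
    explicit Counter(int value)
        : value_(value)
    {
    }

    // Explicit copy method: a copy never happens implicitly.
    void CopyFrom(Counter const & rhs)
    {
        value_ = rhs.value_;
    }

    int value() const
    {
        return value_;
    }

    void set_value(int value)
    {
        value_ = value;
    }

private:
    int value_;

    // Declared but not defined, so accidental copies fail to build.
    Counter(Counter const &);
    void operator=(Counter const &);
};
```

Every copy is now spelled out as b.CopyFrom(a), which makes it easy to find in a code search.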
+ | |||
+ | === Structs vs. Classes === | ||
+ | |||
+ | Use a struct only for passive objects that carry data; everything else is a class. | ||
+ | |||
+ | The struct and class keywords behave almost identically in C++. We add our own semantic meanings to each keyword, so you should use the appropriate keyword for the data-type you're defining. | ||
+ | |||
+ | structs should be used for passive objects that carry data, and may have associated constants, but lack any functionality other than accessing/setting the data members. The accessing/setting of fields is done by directly accessing the fields rather than through method invocations. Methods should not provide behavior but should only be used to set up the data members, e.g., constructor, destructor, Initialize(), Reset(), Validate(). | |
+ | |||
+ | If more functionality is required, a class is more appropriate. If in doubt, make it a class. | ||
+ | |||
+ | For consistency with STL, you can use struct instead of class for functors and traits. | ||
+ | |||
+ | === Inheritance === | ||
+ | |||
+ | Composition is often more appropriate than inheritance. When using inheritance, make it public. | ||
+ | |||
+ | ==== Definition: ==== | ||
+ | When a sub-class inherits from a base class, it includes the definitions of all the data and operations that the parent base class defines. In practice, inheritance is used in two major ways in C++: implementation inheritance, in which actual code is inherited by the child, and interface inheritance, in which only method names are inherited. | ||
+ | |||
+ | ==== Pros: ==== | ||
+ | Implementation inheritance reduces code size by re-using the base class code as it specializes an existing type. Because inheritance is a compile-time declaration, you and the compiler can understand the operation and detect errors. Interface inheritance can be used to programmatically enforce that a class expose a particular API. Again, the compiler can detect errors, in this case, when a class does not define a necessary method of the API. | ||
+ | |||
+ | ==== Cons: ==== | ||
+ | For implementation inheritance, because the code implementing a sub-class is spread between the base and the sub-class, it can be more difficult to understand an implementation. The sub-class cannot override functions that are not virtual, so the sub-class cannot change implementation. The base class may also define some data members, which fixes the physical layout of the base class. | |
+ | |||
+ | ==== Decision: ==== | ||
+ | All inheritance should be public. If you want to do private inheritance, you should be including an instance of the base class as a member instead. | ||
+ | |||
+ | Do not overuse implementation inheritance. Composition is often more appropriate. Try to restrict use of inheritance to the "is-a" case: Bar subclasses Foo if it can reasonably be said that Bar "is a kind of" Foo. | ||
+ | |||
+ | Make your destructor virtual if necessary. If your class has virtual methods, its destructor should be virtual. | ||
+ | |||
+ | Limit the use of protected to those member functions that might need to be accessed from subclasses. Note that data members should be private. | ||
+ | |||
+ | When redefining an inherited virtual function, explicitly declare it virtual in the declaration of the derived class. Rationale: If virtual is omitted, the reader has to check all ancestors of the class in question to determine if the function is virtual or not. | ||
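The rules on public inheritance, virtual destructors, and repeating virtual in the derived class might look as follows in practice. Resource, Buffer, and the destroyed-flag mechanism are hypothetical and exist only to make the destructor call observable.

```cpp
// Hypothetical base class with a virtual method and a virtual destructor.
class Resource
{
public:
    virtual ~Resource()
    {
    }

    virtual int SizeInBytes() const = 0;
};

// Public inheritance; virtual is repeated on every overriding function.
class Buffer : public Resource
{
public:
    explicit Buffer(int * destroyed_flag)
        : destroyed_flag_(destroyed_flag)
    {
    }

    virtual ~Buffer()
    {
        // Record that the derived destructor really ran.
        *destroyed_flag_ = 1;
    }

    virtual int SizeInBytes() const
    {
        return 64;
    }

private:
    int * destroyed_flag_;
};
```

Because ~Resource() is virtual, deleting a Buffer through a Resource pointer runs ~Buffer() first; without the virtual destructor that deletion would be undefined behavior.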
+ | |||
+ | === Multiple Inheritance === | ||
+ | |||
+ | Only very rarely is multiple implementation inheritance actually useful. We allow multiple inheritance only when at most one of the base classes has an implementation; all other base classes must be pure interface classes tagged with the Interface suffix. | ||
+ | |||
+ | ==== Definition: ==== | ||
+ | Multiple inheritance allows a sub-class to have more than one base class. We distinguish between base classes that are pure interfaces and those that have an implementation. | ||
+ | |||
+ | ==== Pros: ==== | ||
+ | Multiple implementation inheritance may let you re-use even more code than single inheritance (see [[C++ style guide#Inheritance|Inheritance]]). | ||
+ | |||
+ | ==== Cons: ==== | ||
+ | Only very rarely is multiple implementation inheritance actually useful. When multiple implementation inheritance seems like the solution, you can usually find a different, more explicit, and cleaner solution. | ||
+ | |||
+ | ==== Decision: ==== | ||
+ | Multiple inheritance is allowed only when all superclasses, with the possible exception of the first one, are pure interfaces. In order to ensure that they remain pure interfaces, they must end with the Interface suffix. | ||
+ | |||
+ | === Interfaces === | ||
+ | |||
+ | Classes that satisfy certain conditions are allowed, but not required, to end with an Interface suffix. | ||
+ | |||
+ | ==== Definition: ==== | ||
+ | A class is a pure interface if it meets the following requirements: | ||
+ | |||
+ | *It has only public pure virtual ("= 0") methods and static methods (but see below for destructor). | ||
+ | *It may not have non-static data members. | ||
+ | *It need not have any constructors defined. If a constructor is provided, it must take no arguments and it must be protected. | ||
+ | *If it is a subclass, it may only be derived from classes that satisfy these conditions and are tagged with the Interface suffix. | ||
+ | |||
+ | An interface class can never be directly instantiated because of the pure virtual method(s) it declares. To make sure all implementations of the interface can be destroyed correctly, the interface must also declare a virtual destructor (in an exception to the first rule, this should not be pure). See Stroustrup, The C++ Programming Language, 3rd edition, section 12.4 for details. | ||
+ | |||
+ | ==== Pros: ==== | ||
+ | Tagging a class with the Interface suffix lets others know that they must not add implemented methods or non-static data members. This is particularly important in the case of multiple inheritance. Additionally, the interface concept is already well-understood by Java programmers. | |
+ | |||
+ | ==== Cons: ==== | ||
+ | The Interface suffix lengthens the class name, which can make it harder to read and understand. Also, the interface property may be considered an implementation detail that shouldn't be exposed to clients. | ||
+ | |||
+ | ==== Decision: ==== | ||
+ | A class may end with Interface only if it meets the above requirements. We do not require the converse, however: classes that meet the above requirements are not required to end with Interface. | ||
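As an illustration of these requirements together with the multiple inheritance rule, here is a sketch with one implementation base class and one pure interface. SerializableInterface, Entity, and Player are hypothetical names.

```cpp
#include <sstream>
#include <string>

// Pure interface: only public pure virtual methods, no data members,
// a protected no-argument constructor, and a non-pure virtual destructor.
class SerializableInterface
{
public:
    virtual ~SerializableInterface()
    {
    }

    virtual std::string Serialize() const = 0;

protected:
    SerializableInterface()
    {
    }
};

// The single base class that carries an implementation.
class Entity
{
public:
    explicit Entity(int id)
        : id_(id)
    {
    }

    virtual ~Entity()
    {
    }

    int id() const
    {
        return id_;
    }

private:
    int id_;
};

// Allowed: the first base has an implementation, the second is a pure interface.
class Player : public Entity, public SerializableInterface
{
public:
    explicit Player(int id)
        : Entity(id)
    {
    }

    virtual std::string Serialize() const
    {
        std::ostringstream out;
        out << "player:" << id();
        return out.str();
    }
};
```

Code that only needs serialization can accept a SerializableInterface reference and stay independent of Entity.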
+ | |||
+ | === Operator Overloading === | ||
+ | |||
+ | Do not overload operators except in rare, special circumstances. | ||
+ | |||
+ | ==== Definition: ==== | ||
+ | A class can define that operators such as + and / operate on the class as if it were a built-in type. | ||
+ | |||
+ | ==== Pros: ==== | ||
+ | Can make code appear more intuitive because a class will behave in the same way as built-in types (such as int). Overloaded operators are more playful names for functions that are less-colorfully named, such as Equals() or Add(). For some template functions to work correctly, you may need to define operators. | ||
+ | |||
+ | ==== Cons: ==== | ||
+ | While operator overloading can make code more intuitive, it has several drawbacks: | ||
+ | |||
+ | *It can fool our intuition into thinking that expensive operations are cheap, built-in operations. | ||
+ | *It is much harder to find the call sites for overloaded operators. Searching for Equals() is much easier than searching for relevant invocations of ==. | ||
+ | *Some operators work on pointers too, making it easy to introduce bugs. Foo + 4 may do one thing, while &Foo + 4 does something totally different. The compiler does not complain for either of these, making this very hard to debug. | ||
+ | *Overloading also has surprising ramifications. For instance, if a class overloads unary operator&, it cannot safely be forward-declared. | ||
+ | |||
+ | ==== Decision: ==== | ||
+ | In general, do not overload operators. The assignment operator (operator=), in particular, is insidious and should be avoided. You can define functions like Equals() and CopyFrom() if you need them. Likewise, avoid the dangerous unary operator& at all costs, if there's any possibility the class might be forward-declared. | ||
+ | |||
+ | However, there may be rare cases where you need to overload an operator to interoperate with templates or "standard" C++ classes (such as operator<<(ostream&, const T&) for logging). These are acceptable if fully justified, but you should try to avoid these whenever possible. In particular, do not overload operator== or operator< just so that your class can be used as a key in an STL container; instead, you should create equality and comparison functor types when declaring the container. | ||
+ | |||
+ | Some of the STL algorithms do require you to overload operator==, and you may do so in these cases, provided you document why. | ||
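A comparison functor passed to the container can replace an overloaded operator<. The sketch below assumes a hypothetical Point struct and PointLess functor; only std::set is a real library name.

```cpp
#include <set>

// Passive data carrier, so a struct is appropriate.
struct Point
{
    int x;
    int y;
};

// Comparison functor used instead of overloading operator< on Point.
// Orders by x first, then by y.
struct PointLess
{
    bool operator()(Point const & lhs, Point const & rhs) const
    {
        if (lhs.x != rhs.x)
        {
            return lhs.x < rhs.x;
        }
        return lhs.y < rhs.y;
    }
};

// The ordering is named where the container type is declared.
typedef std::set<Point, PointLess> PointSet;
```

The ordering is visible at the declaration of PointSet and does not leak into unrelated code that happens to use Point.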
+ | |||
+ | See also [[C++ style guide#Copy Constructors|Copy Constructors]] and [[C++ style guide#Function Overloading|Function Overloading]]. | ||
+ | |||
+ | === Access Control === | ||
+ | |||
+ | Make data members private, and provide access to them through accessor functions as needed. Typically a variable would be called foo_ and the accessor function foo(). You may also want a mutator function set_foo(). Exception: static const data members (typically called Foo) need not be private. | ||
+ | |||
+ | The definitions of accessors are usually inlined in the header file. | ||
+ | |||
+ | See also [[C++ style guide#Inheritance|Inheritance]] and [[C++ style guide#Function Names|Function Names]]. | ||
+ | |||
+ | === Declaration Order === | ||
+ | |||
+ | Use the specified order of declarations within a class: public: before private:, methods before data members (variables), etc. | ||
+ | |||
+ | Your class definition should start with its public: section, followed by its protected: section and then its private: section. If any of these sections are empty, omit them. | ||
+ | |||
+ | Within each section, the declarations generally should be in the following order: | ||
+ | |||
+ | *Typedefs and Enums | ||
+ | *Constants (static const data members) | ||
+ | *Constructors | ||
+ | *Destructor | ||
+ | *Methods, including static methods | ||
+ | *Data Members (except static const data members) | ||
+ | |||
+ | Friend declarations should always be in the private section, and the copy constructor and assignment operator declarations that disallow copying should be at the end of the private: section. They should be the last thing in the class. See [[C++ style guide#Copy Constructors|Copy Constructors]]. | |
+ | |||
+ | Method definitions in the corresponding .cpp file should follow the declaration order as much as possible. | |
+ | |||
+ | Do not put large method definitions inline in the class definition. Usually, only trivial or performance-critical, and very short, methods may be defined inline. See [[C++ style guide#Inline Functions|Inline Functions]] for more details. | |
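A class laid out in the declaration order described in this section could look like the sketch below. SampleBuffer and all of its members are hypothetical; the numbered comments refer to the order listed above.

```cpp
#include <vector>

// Hypothetical class arranged in the recommended declaration order.
class SampleBuffer
{
public:
    // 1. Typedefs and enums
    typedef std::vector<float> Samples;

    // 2. Constants
    static int const MaxSamples = 4;

    // 3. Constructors
    SampleBuffer()
    {
    }

    // 4. Destructor
    ~SampleBuffer()
    {
    }

    // 5. Methods
    // Appends a sample; returns false once MaxSamples is reached.
    bool Add(float sample)
    {
        if (samples_.size() >= static_cast<Samples::size_type>(MaxSamples))
        {
            return false;
        }
        samples_.push_back(sample);
        return true;
    }

    Samples::size_type size() const
    {
        return samples_.size();
    }

private:
    // 6. Data members
    Samples samples_;

    // Last in the class: declarations that disallow copying.
    SampleBuffer(SampleBuffer const &);
    void operator=(SampleBuffer const &);
};
```

The methods are kept inline here only because they are trivial and the sketch needs to be self-contained.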
+ | |||
+ | == Write Short Functions == | ||
+ | |||
+ | Prefer small and focused functions. | ||
+ | |||
+ | We recognize that long functions are sometimes appropriate, so no hard limit is placed on function length. If a function exceeds about 40 lines, think about whether it can be broken up without harming the structure of the program. | |
+ | |||
+ | Even if your long function works perfectly now, someone modifying it in a few months may add new behavior. This could result in bugs that are hard to find. Keeping your functions short and simple makes it easier for other people to read and modify your code. | ||
+ | |||
+ | You may come across long and complicated functions when working with existing code. Do not be intimidated by modifying existing code: if working with such a function proves to be difficult, if you find that errors are hard to debug, or if you want to use a piece of it in several different contexts, consider breaking up the function into smaller and more manageable pieces. | |
+ | |||
+ | == Other C++ Features == | ||
+ | |||
+ | == Naming == | ||
+ | |||
+ | == Comments == | ||
+ | |||
+ | == Formatting == | ||
+ | |||
+ | == Exceptions to the Rules == | ||
+ | |||
+ | == Parting Words == |
Revision as of 06:11, 30 April 2012
Contents
- 1 Background
- 2 Header Files
- 3 Scoping
- 4 Classes
- 5 Write Short Functions
- 6 Other C++ Features
- 7 Naming
- 8 Comments
- 9 Formatting
- 10 Exceptions to the Rules
- 11 Parting Words
Background
C++ is the main development language used by KlayGE. As every C++ programmer knows, the language has many powerful features, but this power brings with it complexity, which in turn can make code more bug-prone and harder to read and maintain.
The goal of this guide is to manage this complexity by describing in detail the dos and don'ts of writing C++ code. These rules exist to keep the code base manageable while still allowing coders to use C++ language features productively.
Style, also known as readability, is what we call the conventions that govern our C++ code. The term Style is a bit of a misnomer, since these conventions cover far more than just source file formatting.
One way in which we keep the code base manageable is by enforcing consistency. It is very important that any programmer be able to look at another's code and quickly understand it. Maintaining a uniform style and following conventions means that we can more easily use "pattern-matching" to infer what various symbols are and what invariants are true about them. Creating common, required idioms and patterns makes code much easier to understand. In some cases there might be good arguments for changing certain style rules, but we nonetheless keep things as they are in order to preserve consistency.
Another issue this guide addresses is that of C++ feature bloat. C++ is a huge language with many advanced features. In some cases we constrain, or even ban, use of certain features. We do this to keep code simple and to avoid the various common errors and problems that these features can cause. This guide lists these features and explains why their use is restricted.
Note that this guide is not a C++ tutorial: we assume that the reader is familiar with the language.
Header Files
In general, every .cpp file should have an associated .hpp file. There are some common exceptions, such as unittests and small .cpp files containing just a main() function.
Correct use of header files can make a huge difference to the readability, size and performance of your code.
The following rules will guide you through the various pitfalls of using header files.
The #define Guard
All header files should have both "#define guards" and "#pragma once" to prevent multiple inclusion and speed up compiling. The format of the symbol name should be _<FILE>_HPP. For example, the file foo.hpp should have the following guard:
#ifndef _FOO_HPP
#define _FOO_HPP
#pragma once
...
#endif // _FOO_HPP
Header File Dependencies
Don't use an #include when a forward declaration would suffice.
When you include a header file you introduce a dependency that will cause your code to be recompiled whenever the header file changes. If your header file includes other header files, any change to those files will cause any code that includes your header to be recompiled. Therefore, we prefer to minimize includes, particularly includes of header files in other header files.
You can significantly reduce the number of header files you need to include in your own header files by using forward declarations. For example, if your header file uses the File class in ways that do not require access to the declaration of the File class, your header file can just forward declare class File; instead of having to #include "base/file.hpp". In KlayGE, all classes and structs have forward declarations in "KlayGE/PreDeclare.hpp". You can #include it at the top of your .hpp.
How can we use a class Foo in a header file without access to its definition?
- We can declare data members of type Foo* or Foo&.
- We can declare (but not define) functions with arguments, and/or return values, of type Foo. (One exception is if an argument Foo or const Foo& has a non-explicit, one-argument constructor, in which case we need the full definition to support automatic type conversion.)
- We can declare static data members of type Foo. This is because static data members are defined outside the class definition.
On the other hand, you must include the header file for Foo if your class subclasses Foo or has a data member of type Foo.
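The cases above can be sketched in a single listing that stands in for two files. Widget, MakeDefaultFoo(), and the small Foo class are hypothetical; in real code the second half would live in a .cpp file that includes Foo's header.

```cpp
// --- What a hypothetical widget.hpp would contain ---
class Foo; // Forward declaration; Foo's header is not included.

class Widget
{
public:
    explicit Widget(Foo * foo)
        : foo_(foo)
    {
    }

    Foo * foo() const
    {
        return foo_;
    }

    // Declaring a function that returns Foo by value is fine here;
    // only its definition needs the complete type.
    static Foo MakeDefaultFoo();

private:
    Foo * foo_; // A pointer member does not need Foo's definition.
};

// --- What a hypothetical widget.cpp would contain ---
// In real code this definition would come from including Foo's header.
class Foo
{
public:
    explicit Foo(int value)
        : value_(value)
    {
    }

    int value() const
    {
        return value_;
    }

private:
    int value_;
};

Foo Widget::MakeDefaultFoo()
{
    return Foo(42);
}
```

If Widget instead held a Foo member by value, or derived from Foo, the header half would no longer compile with only the forward declaration.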
Sometimes it makes sense to have pointer (or better, scoped_ptr) members instead of object members. However, this complicates code readability and imposes a performance penalty, so avoid doing this transformation if the only purpose is to minimize includes in header files.
Of course, .cpp files typically do require the definitions of the classes they use, and usually have to include several header files.
Note: If you use a symbol Foo in your source file, you should bring in a definition for Foo yourself, either via an #include or via a forward declaration. Do not depend on the symbol being brought in transitively via headers not directly included. One exception is if Foo is used in myfile.cpp, it's ok to #include (or forward-declare) Foo in myfile.hpp, instead of myfile.cpp.
Inline Functions
Define functions inline only when they are small, say, 10 lines or less.
Definition:
You can declare functions in a way that allows the compiler to expand them inline rather than calling them through the usual function call mechanism.
Pros:
Inlining a function can generate more efficient object code, as long as the inlined function is small. Feel free to inline accessors and mutators, and other short, performance-critical functions.
Cons:
Overuse of inlining can actually make programs slower. Depending on a function's size, inlining it can cause the code size to increase or decrease. Inlining a very small accessor function will usually decrease code size while inlining a very large function can dramatically increase code size. On modern processors smaller code usually runs faster due to better use of the instruction cache.
Decision:
A decent rule of thumb is to not inline a function if it is more than 10 lines long. Beware of destructors, which are often longer than they appear because of implicit member- and base-destructor calls!
Another useful rule of thumb: it's typically not cost effective to inline functions with loops or switch statements (unless, in the common case, the loop or switch statement is never executed).
It is important to know that functions are not always inlined even if they are declared as such; for example, virtual and recursive functions are not normally inlined. Usually recursive functions should not be inline. The main reason for making a virtual function inline is to place its definition in the class, either for convenience or to document its behavior, e.g., for accessors and mutators.
Function Parameter Ordering
When defining a function, parameter order is: inputs, then outputs.
Parameters to C/C++ functions are either input to the function, output from the function, or both. Input parameters are usually values or const references, while output and input/output parameters will be non-const pointers. When ordering function parameters, put all input-only parameters before any output parameters. In particular, do not add new parameters to the end of the function just because they are new; place new input-only parameters before the output parameters.
This is not a hard-and-fast rule. Parameters that are both input and output (often classes/structs) muddy the waters, and, as always, consistency with related functions may require you to bend the rule.
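A small sketch of the inputs-then-outputs ordering: the input is a const reference and the output is a non-const pointer placed last. ParseNonNegativeInt() is a hypothetical helper written for this example, and it deliberately ignores integer overflow.

```cpp
#include <string>

// Input first (const reference), output last (non-const pointer).
// Returns true and writes *value only if text consists solely of digits.
// Overflow is not handled in this sketch.
bool ParseNonNegativeInt(std::string const & text, int * value)
{
    if (text.empty())
    {
        return false;
    }

    int result = 0;
    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        char const ch = text[i];
        if (ch < '0' || ch > '9')
        {
            return false;
        }
        result = result * 10 + (ch - '0');
    }

    *value = result;
    return true;
}
```

At the call site, ParseNonNegativeInt(text, &count) makes the output visible through the address-of operator, and the output is left untouched when parsing fails.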
Names and Order of Includes
Use standard order for readability and to avoid hidden dependencies: KlayGE's .hpp, C library, C++ library, other libraries' .hpp, your project's .hpp.
All of KlayGE's public header files are in "Include/KlayGE" directory, which is in the include path. Use it without UNIX directory shortcuts . (the current directory) or .. (the parent directory). For example, KlayGE/foo.hpp should be included as
#include <KlayGE/foo.hpp>
In dir/foo.cpp or dir/foo_test.cpp, whose main purpose is to implement or test the stuff in dir2/foo2.hpp, order your includes as follows:
- KlayGE/KlayGE.hpp
- dir2/foo2.hpp (preferred location; see details below).
- C system files.
- C++ system files.
- Other libraries' .hpp files.
- Your project's .hpp files.
With the preferred ordering, if dir2/foo2.hpp omits any necessary includes, the build of dir/foo.cpp or dir/foo_test.cpp will break. Thus, this rule ensures that build breaks show up first for the people working on these files, not for innocent people in other packages.
dir/foo.cpp and dir2/foo2.hpp are often in the same directory (e.g. base/basictypes_test.cpp and base/basictypes.hpp), but can be in different directories too.
For example, the includes in KlayGE/src/foo.cpp might look like this:
#include <KlayGE/KlayGE.hpp>
#include <KlayGE/Math.hpp>
#include <KlayGE/ResLoader.hpp>
#include <vector>
#include <boost/shared_ptr.hpp>
#include <KlayGE/foo.hpp>
Scoping
Namespaces
Unnamed namespaces in .cpp files are encouraged. With named namespaces, choose the name based on the project, and possibly its path. Do not use a using-directive.
Definition:
Namespaces subdivide the global scope into distinct, named scopes, and so are useful for preventing name collisions in the global scope.
Pros:
Namespaces provide a (hierarchical) axis of naming, in addition to the (also hierarchical) name axis provided by classes.
For example, if two different projects have a class Foo in the global scope, these symbols may collide at compile time or at runtime. If each project places their code in a namespace, project1::Foo and project2::Foo are now distinct symbols that do not collide.
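The project1/project2 situation can be written out directly. The id() methods are hypothetical additions that make the two classes distinguishable.

```cpp
// Two unrelated classes named Foo coexist because each lives in its
// own namespace.
namespace project1
{
class Foo
{
public:
    int id() const
    {
        return 1;
    }
};
}

namespace project2
{
class Foo
{
public:
    int id() const
    {
        return 2;
    }
};
}
```

Code that needs both simply qualifies the names as project1::Foo and project2::Foo.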
Cons:
Namespaces can be confusing, because they provide an additional (hierarchical) axis of naming, in addition to the (also hierarchical) name axis provided by classes.
Use of unnamed namespaces in header files can easily cause violations of the C++ One Definition Rule (ODR).
Decision:
Use namespaces according to the policy described below. Terminate namespaces with comments as shown in the given examples.
Unnamed Namespaces
- Unnamed namespaces are allowed and even encouraged in .cpp files, to avoid runtime naming conflicts:
namespace
{

// This is in a .cpp file.

// The content of a namespace is not indented
enum { Unused, EOF, Error };         // Commonly used tokens.
bool AtEof() { return EOF == pos_; } // Uses our namespace's EOF.

} // namespace
However, file-scope declarations that are associated with a particular class may be declared in that class as types, static data members or static member functions rather than as members of an unnamed namespace.
- Do not use unnamed namespaces in .hpp files.
Named Namespaces
Named namespaces should be used as follows:
- Namespaces wrap the entire source file after includes, definitions/declarations, and forward declarations of classes from other namespaces:
// In the .hpp file
namespace mynamespace
{

// All declarations are within the namespace scope.
// Notice the lack of indentation.
class MyClass
{
public:
    ...
    void Foo();
};

} // namespace mynamespace

// In the .cpp file
namespace mynamespace
{

// Definition of functions is within scope of the namespace.
void MyClass::Foo()
{
    ...
}

} // namespace mynamespace
The typical .cpp file might have more complex detail, including the need to reference classes in other namespaces.
#include "a.hpp"

#define someflag "dummy flag"

class C; // Forward declaration of class C in the global namespace.

namespace a
{
class A; // Forward declaration of a::A.
} // namespace a

namespace b
{

...code for b... // Code goes against the left margin.

} // namespace b
- Do not declare anything in namespace std, not even forward declarations of standard library classes. Declaring entities in namespace std is undefined behavior, i.e., not portable. To declare entities from the standard library, include the appropriate header file.
- You may not use a using-directive to make all names from a namespace available.
// Forbidden -- This pollutes the namespace.
using namespace foo;
- You may use a using-declaration anywhere in a .cpp file, and in functions, methods or classes in .hpp files.
// OK in .cpp files.
// Must be in a function, method or class in .hpp files.
using ::foo::bar;
- Namespace aliases are allowed anywhere in a .cpp file, anywhere inside the named namespace that wraps an entire .hpp file, and in functions and methods.
// Shorten access to some commonly used names in .cpp files.
namespace fbz = ::foo::bar::baz;

// Shorten access to some commonly used names (in a .hpp file).
namespace librarian
{

// The following alias is available to all files including
// this header (in namespace librarian):
// alias names should therefore be chosen consistently
// within a project.
namespace pd_s = ::pipeline_diagnostics::sidetable;

inline void my_inline_function()
{
    // namespace alias local to a function (or method).
    namespace fbz = ::foo::bar::baz;
    ...
}

} // namespace librarian
Note that an alias in a .hpp file is visible to everyone #including that file, so public headers (those available outside a project) and headers transitively #included by them, should avoid defining aliases, as part of the general goal of keeping public APIs as small as possible.
Nested Classes
Although you may use public nested classes when they are part of an interface, consider a namespace to keep declarations out of the global scope.
Definition:
A class can define another class within it; this is also called a member class.
class Foo
{
private:
    // Bar is a member class, nested within Foo.
    class Bar
    {
        ...
    };
};
Pros:
This is useful when the nested (or member) class is only used by the enclosing class; making it a member puts it in the enclosing class scope rather than polluting the outer scope with the class name. Nested classes can be forward declared within the enclosing class and then defined in the .cpp file to avoid including the nested class definition in the enclosing class declaration, since the nested class definition is usually only relevant to the implementation.
Cons:
Nested classes can be forward-declared only within the definition of the enclosing class. Thus, any header file manipulating a Foo::Bar* pointer will have to include the full class declaration for Foo.
Decision:
Do not make nested classes public unless they are actually part of the interface, e.g., a class that holds a set of options for some method.
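As a rough sketch of the exception described above, a nested class that only carries options for a method can reasonably be public, while a helper used purely by the implementation stays private. TextureLoader, Options, Describe and Cache are hypothetical names chosen for illustration; they are not part of any existing library.

```cpp
#include <string>

// Hypothetical class, used only to illustrate a public nested options class.
class TextureLoader
{
public:
    // Part of the interface: callers fill this in and pass it to Describe().
    struct Options
    {
        Options()
            : generate_mipmaps(true), max_size(1024)
        {
        }

        bool generate_mipmaps;
        int max_size;
    };

    std::string Describe(Options const & options) const
    {
        return options.generate_mipmaps ? "mipmapped" : "plain";
    }

private:
    // Implementation detail: forward declared here and defined in the
    // .cpp file, so users of TextureLoader never see its definition.
    class Cache;
};
```

Callers write TextureLoader::Options, so the name Options does not leak into the enclosing scope.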
Nonmember, Static Member, and Global Functions
Prefer nonmember functions within a namespace or static member functions to global functions; use completely global functions rarely.
Pros:
Nonmember and static member functions can be useful in some situations. Putting nonmember functions in a namespace avoids polluting the global namespace.
Cons:
Nonmember and static member functions may make more sense as members of a new class, especially if they access external resources or have significant dependencies.
Decision:
Sometimes it is useful, or even necessary, to define a function not bound to a class instance. Such a function can be either a static member or a nonmember function. Nonmember functions should not depend on external variables, and should nearly always exist in a namespace. Rather than creating classes only to group static member functions which do not share static data, use namespaces instead.
Functions defined in the same compilation unit as production classes may introduce unnecessary coupling and link-time dependencies when directly called from other compilation units; static member functions are particularly susceptible to this. Consider extracting a new class, or placing the functions in a namespace possibly in a separate library.
If you must define a nonmember function and it is only needed in its .cpp file, use an unnamed namespace or static linkage (e.g. static int Foo() {...}) to limit its scope.
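The following sketch shows one way to apply this decision: a helper that is only needed in the .cpp file lives in an unnamed namespace, and the function meant for other files is a nonmember function in a named namespace rather than a static member of a grouping class. The names mathutil, Saturate and Lerp are assumptions made for this example.

```cpp
namespace
{

// File-local helper: not visible to other compilation units.
float Saturate(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

} // namespace

namespace mathutil
{

// Nonmember function in a namespace, instead of a static member of a
// class that exists only to group functions.
float Lerp(float a, float b, float t)
{
    return a + (b - a) * Saturate(t);
}

} // namespace mathutil
```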
Local Variables
Place a function's variables in the narrowest scope possible, and initialize variables in the declaration.
C++ allows you to declare variables anywhere in a function. We encourage you to declare them in as local a scope as possible, and as close to the first use as possible. This makes it easier for the reader to find the declaration and see what type the variable is and what it was initialized to. In particular, initialization should be used instead of declaration and assignment, e.g.
int i;
i = f(); // Bad -- initialization separate from declaration.

int j = g(); // Good -- declaration has initialization.
Note that VC and GCC implement for (int i = 0; i < 10; ++ i) correctly (the scope of i is only the scope of the for loop), so you can reuse i in another for loop in the same scope. They also correctly scope declarations in if and while statements, e.g.
while (const char* p = strchr(str, '/'))
{
    str = p + 1;
}
There is one caveat: if the variable is an object, its constructor is invoked every time it enters scope and is created, and its destructor is invoked every time it goes out of scope.
// Inefficient implementation:
for (int i = 0; i < 1000000; ++ i)
{
    Foo f; // My ctor and dtor get called 1000000 times each.
    f.DoSomething(i);
}
It may be more efficient to declare such a variable used in a loop outside that loop:
Foo f; // My ctor and dtor get called once each.
for (int i = 0; i < 1000000; ++ i)
{
    f.DoSomething(i);
}
Static and Global Variables
Static or global variables of class type are forbidden: they cause hard-to-find bugs due to indeterminate order of construction and destruction. On some platforms (e.g. Android), a global variable with a constructor crashes all the time.
Objects with static storage duration, including global variables, static variables, static class member variables, and function static variables, must be Plain Old Data (POD): only ints, chars, floats, or pointers, or arrays/structs of POD.
The order in which class constructors and initializers for static variables are called is only partially specified in C++ and can even change from build to build, which can cause bugs that are difficult to find. Therefore in addition to banning globals of class type, we do not allow static POD variables to be initialized with the result of a function, unless that function (such as getenv(), or getpid()) does not itself depend on any other globals.
Likewise, the order in which destructors are called is defined to be the reverse of the order in which the constructors were called. Since constructor order is indeterminate, so is destructor order. For example, at program-end time a static variable might have been destroyed, but code still running -- perhaps in another thread -- tries to access it and fails. Or the destructor for a static 'string' variable might be run prior to the destructor for another variable that contains a reference to that string.
As a result we only allow static variables to contain POD data. This rule completely disallows vector (use C arrays instead), or string (use const char []).
If you need a static or global variable of a class type, consider initializing a pointer (which will never be freed), from either your main() function or from pthread_once(). Note that this must be a raw pointer, not a "smart" pointer, since the smart pointer's destructor will have the order-of-destructor issue that we are trying to avoid.
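A minimal sketch of the raw-pointer approach, assuming initialization happens through an explicit function called early in main(), before any other thread is started. The names max_search_paths, default_path, g_search_paths, InitSearchPaths, AddSearchPath and NumSearchPaths are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Allowed: POD objects with static storage duration.
static int const max_search_paths = 8;
static char const default_path[] = "media";

// Forbidden by this guide: a global of class type.
// static std::vector<std::string> g_search_paths;

// Allowed alternative: a raw pointer that is intentionally never freed.
static std::vector<std::string>* g_search_paths = 0;

// Hypothetical initialization function. Call it once, early in main(),
// before any other thread can touch g_search_paths.
void InitSearchPaths()
{
    if (0 == g_search_paths)
    {
        g_search_paths = new std::vector<std::string>();
        g_search_paths->push_back(default_path);
    }
}

// Requires InitSearchPaths() to have been called.
bool AddSearchPath(std::string const & path)
{
    if (g_search_paths->size() >= static_cast<std::size_t>(max_search_paths))
    {
        return false;
    }
    g_search_paths->push_back(path);
    return true;
}

// Requires InitSearchPaths() to have been called.
std::size_t NumSearchPaths()
{
    return g_search_paths->size();
}
```

Because the vector is never deleted, no destructor runs at program exit, so there is no destruction-order problem to worry about.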
Classes
Classes are the fundamental unit of code in C++. Naturally, we use them extensively. This section lists the main dos and don'ts you should follow when writing a class.
Doing Work in Constructors
In general, constructors should merely set member variables to their initial values. Any complex initialization should go in an explicit Init() method.
Definition:
It is possible to perform initialization in the body of the constructor.
Pros:
Convenience in typing. No need to worry about whether the class has been initialized or not.
Cons:
The problems with doing work in constructors are:
- There is no easy way for constructors to signal errors, short of using exceptions (which are forbidden).
- If the work fails, we now have an object whose initialization code failed, so it may be in an indeterminate state.
- If the work calls virtual functions, these calls will not get dispatched to the subclass implementations. Future modification to your class can quietly introduce this problem even if your class is not currently subclassed, causing much confusion.
- If someone creates a global variable of this type (which is against the rules, but still), the constructor code will be called before main(), possibly breaking some implicit assumptions in the constructor code.
Decision:
If your object requires non-trivial initialization, consider having an explicit Init() method. In particular, constructors should not call virtual functions, attempt to raise errors, access potentially uninitialized global variables, etc.
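One possible shape for this two-phase pattern is sketched below: the constructor only sets members to their initial values, and the work that can fail is done in Init(), which reports failure through its return value. ConfigFile and its members are hypothetical, and the "work" is reduced to a trivial check so the example stays self-contained.

```cpp
#include <string>

// Hypothetical class illustrating an explicit Init() method.
class ConfigFile
{
public:
    // The constructor merely sets member variables to initial values.
    ConfigFile()
        : loaded_(false)
    {
    }

    // Anything that can fail goes here, so the caller can check the result.
    bool Init(std::string const & path)
    {
        if (path.empty())
        {
            return false;
        }
        path_ = path;
        loaded_ = true;
        return true;
    }

    bool loaded() const
    {
        return loaded_;
    }

private:
    std::string path_;
    bool loaded_;
};
```

A caller would write ConfigFile cfg; followed by if (!cfg.Init(path)) { ... } and handle the failure at that point.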
Default Constructors
You must define a default constructor if your class defines member variables and has no other constructors. Otherwise the compiler will do it for you, badly.
Definition:
The default constructor is called when we new a class object with no arguments. It is always called when calling new[] (for arrays).
Pros:
Initializing structures by default, to hold "impossible" values, makes debugging much easier.
Cons:
Extra work for you, the code writer.
Decision:
If your class defines member variables and has no other constructors you must define a default constructor (one that takes no arguments). It should preferably initialize the object in such a way that its internal state is consistent and valid.
The reason for this is that if you have no other constructors and do not define a default constructor, the compiler will generate one for you. This compiler generated constructor may not initialize your object sensibly.
If your class inherits from an existing class but you add no new member variables, you are not required to have a default constructor.
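To illustrate, here is a small sketch of a class whose default constructor puts every member into a known state, using an "impossible" id as suggested above. Particle and its members are hypothetical names.

```cpp
// Hypothetical class. Without the default constructor below,
// "Particle p;" would leave x_, y_ and id_ uninitialized.
class Particle
{
public:
    Particle()
        : x_(0.0f), y_(0.0f), id_(-1) // -1 marks "no id assigned yet".
    {
    }

    float x() const
    {
        return x_;
    }
    float y() const
    {
        return y_;
    }
    int id() const
    {
        return id_;
    }

private:
    float x_;
    float y_;
    int id_;
};
```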
Explicit Constructors
Use the C++ keyword explicit for constructors with one argument.
Definition:
Normally, if a constructor takes one argument, it can be used as a conversion. For instance, if you define Foo::Foo(string name) and then pass a string to a function that expects a Foo, the constructor will be called to convert the string into a Foo and will pass the Foo to your function for you. This can be convenient but is also a source of trouble when things get converted and new objects created without you meaning them to. Declaring a constructor explicit prevents it from being invoked implicitly as a conversion.
Pros:
Avoids undesirable conversions.
Cons:
None.
Decision:
We require all single argument constructors to be explicit. Always put explicit in front of one-argument constructors in the class definition: explicit Foo(string name);
The exception is copy constructors, which, in the rare cases when we allow them, should probably not be explicit. Classes that are intended to be transparent wrappers around other classes are also exceptions. Such exceptions should be clearly marked with comments.
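The effect of explicit can be seen in a short sketch. Name and Length are hypothetical; the commented-out call shows the implicit conversion that explicit rejects at compile time.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class with an explicit one-argument constructor.
class Name
{
public:
    explicit Name(std::string const & text)
        : text_(text)
    {
    }

    std::string const & text() const
    {
        return text_;
    }

private:
    std::string text_;
};

std::size_t Length(Name const & name)
{
    return name.text().size();
}

// Length(std::string("abc"));       // Does not compile: no implicit conversion.
// Length(Name(std::string("abc"))); // OK: the conversion is spelled out.
```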
Copy Constructors
Provide a copy constructor and assignment operator only when necessary. Otherwise, disable them with private copy constructor and assignment operator.
Definition:
The copy constructor and assignment operator are used to create copies of objects. The copy constructor is implicitly invoked by the compiler in some situations, e.g. passing objects by value.
Pros:
Copy constructors make it easy to copy objects. STL containers require that all contents be copyable and assignable. Copy constructors can be more efficient than CopyFrom()-style workarounds because they combine construction with copying, the compiler can elide them in some contexts, and they make it easier to avoid heap allocation.
Cons:
Implicit copying of objects in C++ is a rich source of bugs and of performance problems. It also reduces readability, as it becomes hard to track which objects are being passed around by value as opposed to by reference, and therefore where changes to an object are reflected.
Decision:
Few classes need to be copyable. Most should have neither a copy constructor nor an assignment operator. In many situations, a pointer or reference will work just as well as a copied value, with better performance. For example, you can pass function parameters by reference or pointer instead of by value, and you can store pointers rather than objects in an STL container.
If your class needs to be copyable, prefer providing a copy method, such as CopyFrom() or Clone(), rather than a copy constructor, because such methods cannot be invoked implicitly. If a copy method is insufficient in your situation (e.g. for performance reasons, or because your class needs to be stored by value in an STL container), provide both a copy constructor and assignment operator.
If your class does not need a copy constructor or assignment operator, you must explicitly disable them. To do so, add dummy declarations for the copy constructor and assignment operator in the private: section of your class, but do not provide any corresponding definition (so that any attempt to use them results in a link error). For example, in class Foo:
class Foo
{
public:
    Foo(int f0, int f1);
    ~Foo();

private:
    Foo(Foo const &);
    void operator=(Foo const &);
};
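When a class does need to be copied, the preferred copy method might look like the following sketch, which combines an explicit CopyFrom() with the disabled copy constructor and assignment operator. Mesh and its members are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class: copyable only through an explicit method.
class Mesh
{
public:
    Mesh()
    {
    }

    void AddIndex(int index)
    {
        indices_.push_back(index);
    }

    std::size_t NumIndices() const
    {
        return indices_.size();
    }

    // Explicit copy method: it can never be invoked implicitly.
    void CopyFrom(Mesh const & rhs)
    {
        indices_ = rhs.indices_;
    }

private:
    std::vector<int> indices_;

    // Declared but not defined: implicit copying is disabled.
    Mesh(Mesh const &);
    void operator=(Mesh const &);
};
```

With this in place, passing a Mesh by value fails to compile outside the class, while dst.CopyFrom(src) makes the copy visible at the call site.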
Structs vs. Classes
Use a struct only for passive objects that carry data; everything else is a class.
The struct and class keywords behave almost identically in C++. We add our own semantic meanings to each keyword, so you should use the appropriate keyword for the data-type you're defining.
structs should be used for passive objects that carry data, and may have associated constants, but lack any functionality other than access/setting the data members. The accessing/setting of fields is done by directly accessing the fields rather than through method invocations. Methods should not provide behavior but should only be used to set up the data members, e.g., constructor, destructor, Initialize(), Reset(), Validate().
If more functionality is required, a class is more appropriate. If in doubt, make it a class.
For consistency with STL, you can use struct instead of class for functors and traits.
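As an illustration of the distinction, the sketch below uses a struct for passive data and a class for a type that maintains an invariant. Color and Light are hypothetical names.

```cpp
// Passive data: fields are read and written directly, so a struct fits.
struct Color
{
    Color()
        : r(0), g(0), b(0)
    {
    }

    unsigned char r;
    unsigned char g;
    unsigned char b;
};

// Behavior plus an invariant (brightness stays in [0, 255]): a class.
class Light
{
public:
    Light()
        : brightness_(0)
    {
    }

    void Brighten(int amount)
    {
        brightness_ += amount;
        if (brightness_ > 255)
        {
            brightness_ = 255;
        }
        if (brightness_ < 0)
        {
            brightness_ = 0;
        }
    }

    int brightness() const
    {
        return brightness_;
    }

private:
    int brightness_;
};
```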
Inheritance
Composition is often more appropriate than inheritance. When using inheritance, make it public.
Definition:
When a sub-class inherits from a base class, it includes the definitions of all the data and operations that the parent base class defines. In practice, inheritance is used in two major ways in C++: implementation inheritance, in which actual code is inherited by the child, and interface inheritance, in which only method names are inherited.
Pros:
Implementation inheritance reduces code size by re-using the base class code as it specializes an existing type. Because inheritance is a compile-time declaration, you and the compiler can understand the operation and detect errors. Interface inheritance can be used to programmatically enforce that a class expose a particular API. Again, the compiler can detect errors, in this case, when a class does not define a necessary method of the API.
Cons:
For implementation inheritance, because the code implementing a sub-class is spread between the base and the sub-class, it can be more difficult to understand an implementation. The sub-class cannot override functions that are not virtual, so the sub-class cannot change implementation. The base class may also define some data members, so that specifies physical layout of the base class.
Decision:
All inheritance should be public. If you want to do private inheritance, you should be including an instance of the base class as a member instead.
Do not overuse implementation inheritance. Composition is often more appropriate. Try to restrict use of inheritance to the "is-a" case: Bar subclasses Foo if it can reasonably be said that Bar "is a kind of" Foo.
Make your destructor virtual if necessary. If your class has virtual methods, its destructor should be virtual.
Limit the use of protected to those member functions that might need to be accessed from subclasses. Note that data members should be private.
When redefining an inherited virtual function, explicitly declare it virtual in the declaration of the derived class. Rationale: If virtual is omitted, the reader has to check all ancestors of the class in question to determine if the function is virtual or not.
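A short sketch of these rules, assuming two hypothetical classes Shape and Circle: inheritance is public, the base class has a virtual destructor because it has virtual methods, and the override repeats the virtual keyword.

```cpp
#include <string>

// Hypothetical base class with a virtual method, hence a virtual destructor.
class Shape
{
public:
    virtual ~Shape()
    {
    }

    virtual std::string Name() const
    {
        return "shape";
    }
};

// Public inheritance: a Circle "is a kind of" Shape.
class Circle : public Shape
{
public:
    // virtual is repeated so the reader need not look at Shape.
    virtual std::string Name() const
    {
        return "circle";
    }
};

std::string Describe(Shape const & shape)
{
    return shape.Name();
}
```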
Multiple Inheritance
Only very rarely is multiple implementation inheritance actually useful. We allow multiple inheritance only when at most one of the base classes has an implementation; all other base classes must be pure interface classes tagged with the Interface suffix.
Definition:
Multiple inheritance allows a sub-class to have more than one base class. We distinguish between base classes that are pure interfaces and those that have an implementation.
Pros:
Multiple implementation inheritance may let you re-use even more code than single inheritance (see Inheritance).
Cons:
Only very rarely is multiple implementation inheritance actually useful. When multiple implementation inheritance seems like the solution, you can usually find a different, more explicit, and cleaner solution.
Decision:
Multiple inheritance is allowed only when all superclasses, with the possible exception of the first one, are pure interfaces. In order to ensure that they remain pure interfaces, they must end with the Interface suffix.
Interfaces
Classes that satisfy certain conditions are allowed, but not required, to end with an Interface suffix.
Definition:
A class is a pure interface if it meets the following requirements:
- It has only public pure virtual ("= 0") methods and static methods (but see below for destructor).
- It may not have non-static data members.
- It need not have any constructors defined. If a constructor is provided, it must take no arguments and it must be protected.
- If it is a subclass, it may only be derived from classes that satisfy these conditions and are tagged with the Interface suffix.
An interface class can never be directly instantiated because of the pure virtual method(s) it declares. To make sure all implementations of the interface can be destroyed correctly, the interface must also declare a virtual destructor (in an exception to the first rule, this should not be pure). See Stroustrup, The C++ Programming Language, 3rd edition, section 12.4 for details.
Pros:
Tagging a class with the Interface suffix lets others know that they must not add implemented methods or non-static data members. This is particularly important in the case of multiple inheritance. Additionally, the interface concept is already well-understood by Java programmers.
Cons:
The Interface suffix lengthens the class name, which can make it harder to read and understand. Also, the interface property may be considered an implementation detail that shouldn't be exposed to clients.
Decision:
A class may end with Interface only if it meets the above requirements. We do not require the converse, however: classes that meet the above requirements are not required to end with Interface.
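The requirements above, together with the multiple inheritance rule, can be sketched as follows. SerializableInterface, Node and SceneNode are hypothetical names; SceneNode has one base class with an implementation, listed first, and one pure interface.

```cpp
#include <string>

// Pure interface: only public pure virtual methods, no data members,
// a protected no-argument constructor and a non-pure virtual destructor.
class SerializableInterface
{
public:
    virtual ~SerializableInterface()
    {
    }

    virtual std::string Serialize() const = 0;

protected:
    SerializableInterface()
    {
    }
};

// Ordinary base class with an implementation.
class Node
{
public:
    explicit Node(int id)
        : id_(id)
    {
    }

    virtual ~Node()
    {
    }

    int id() const
    {
        return id_;
    }

private:
    int id_;
};

// The implementation base comes first, followed by pure interfaces only.
class SceneNode : public Node, public SerializableInterface
{
public:
    explicit SceneNode(int id)
        : Node(id)
    {
    }

    virtual std::string Serialize() const
    {
        return 0 == this->id() ? "root" : "child";
    }
};
```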
Operator Overloading
Do not overload operators except in rare, special circumstances.
Definition:
A class can define that operators such as + and / operate on the class as if it were a built-in type.
Pros:
Can make code appear more intuitive because a class will behave in the same way as built-in types (such as int). Overloaded operators are more playful names for functions that are less-colorfully named, such as Equals() or Add(). For some template functions to work correctly, you may need to define operators.
Cons:
While operator overloading can make code more intuitive, it has several drawbacks:
- It can fool our intuition into thinking that expensive operations are cheap, built-in operations.
- It is much harder to find the call sites for overloaded operators. Searching for Equals() is much easier than searching for relevant invocations of ==.
- Some operators work on pointers too, making it easy to introduce bugs. Foo + 4 may do one thing, while &Foo + 4 does something totally different. The compiler does not complain for either of these, making this very hard to debug.
- Overloading also has surprising ramifications. For instance, if a class overloads unary operator&, it cannot safely be forward-declared.
Decision:
In general, do not overload operators. The assignment operator (operator=), in particular, is insidious and should be avoided. You can define functions like Equals() and CopyFrom() if you need them. Likewise, avoid the dangerous unary operator& at all costs, if there's any possibility the class might be forward-declared.
However, there may be rare cases where you need to overload an operator to interoperate with templates or "standard" C++ classes (such as operator<<(ostream&, const T&) for logging). These are acceptable if fully justified, but you should try to avoid these whenever possible. In particular, do not overload operator== or operator< just so that your class can be used as a key in an STL container; instead, you should create equality and comparison functor types when declaring the container.
Some of the STL algorithms do require you to overload operator==, and you may do so in these cases, provided you document why.
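For instance, a comparison functor passed to the container might look like the sketch below, which avoids overloading operator< on the key type. Version, VersionLess and VersionNames are hypothetical names.

```cpp
#include <map>
#include <string>

// Hypothetical key type: passive data, so a struct.
struct Version
{
    Version(int major_num_in, int minor_num_in)
        : major_num(major_num_in), minor_num(minor_num_in)
    {
    }

    int major_num;
    int minor_num;
};

// Comparison functor used instead of overloading operator< on Version.
struct VersionLess
{
    bool operator()(Version const & lhs, Version const & rhs) const
    {
        if (lhs.major_num != rhs.major_num)
        {
            return lhs.major_num < rhs.major_num;
        }
        return lhs.minor_num < rhs.minor_num;
    }
};

// The functor type is named when declaring the container.
typedef std::map<Version, std::string, VersionLess> VersionNames;
```

The ordering is visible where the container is declared, and a different container can use a different ordering for the same key type.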
See also Copy Constructors and Function Overloading.
Access Control
Make data members private, and provide access to them through accessor functions as needed. Typically a variable would be called foo_ and the accessor function foo(). You may also want a mutator function set_foo(). Exception: static const data members (typically called Foo) need not be private.
The definitions of accessors are usually inlined in the header file.
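The naming pattern can be sketched as follows, assuming a hypothetical Camera class with a private member fov_, an accessor fov(), a mutator set_fov() and a public static const member MaxFov. The clamping in the mutator is an addition for illustration only.

```cpp
// Hypothetical class showing the foo_ / foo() / set_foo() pattern.
class Camera
{
public:
    // Static const data members need not be private.
    static int const MaxFov = 179;

    Camera()
        : fov_(60)
    {
    }

    // Accessor, defined inline in the header.
    int fov() const
    {
        return fov_;
    }

    // Mutator. Clamping to MaxFov is an illustrative extra.
    void set_fov(int fov)
    {
        if (fov > MaxFov)
        {
            fov = MaxFov;
        }
        fov_ = fov;
    }

private:
    int fov_;
};
```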
See also Inheritance and Function Names.
Declaration Order
Use the specified order of declarations within a class: public: before private:, methods before data members (variables), etc.
Your class definition should start with its public: section, followed by its protected: section and then its private: section. If any of these sections are empty, omit them.
Within each section, the declarations generally should be in the following order:
- Typedefs and Enums
- Constants (static const data members)
- Constructors
- Destructor
- Methods, including static methods
- Data Members (except static const data members)
Friend declarations should always be in the private section, and the copy constructor and assignment operator declarations that disallow copying should be at the end of the private: section, as the last thing in the class. See Copy Constructors.
Method definitions in the corresponding .cpp file should follow the declaration order as much as possible.
Do not put large method definitions inline in the class definition. Usually, only trivial or performance-critical, and very short, methods may be defined inline. See Inline Functions for more details.
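Put together, a class laid out in this order might look like the sketch below. Material and all of its members are hypothetical names used only to show where each kind of declaration goes.

```cpp
#include <string>

// Hypothetical class laid out in the declaration order described above.
class Material
{
public:
    // Typedefs and enums.
    enum BlendMode
    {
        BM_Opaque,
        BM_Transparent
    };

    // Constants.
    static int const MaxTextures = 8;

    // Constructors.
    explicit Material(std::string const & name)
        : name_(name), blend_mode_(BM_Opaque)
    {
    }

    // Destructor.
    ~Material()
    {
    }

    // Methods.
    std::string const & name() const
    {
        return name_;
    }
    BlendMode blend_mode() const
    {
        return blend_mode_;
    }
    void set_blend_mode(BlendMode mode)
    {
        blend_mode_ = mode;
    }

private:
    // Data members.
    std::string name_;
    BlendMode blend_mode_;

    // Disallow copying: the last thing in the class.
    Material(Material const &);
    void operator=(Material const &);
};
```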
Write Short Functions
Prefer small and focused functions.
We recognize that long functions are sometimes appropriate, so no hard limit is placed on function length. If a function exceeds about 40 lines, think about whether it can be broken up without harming the structure of the program.
Even if your long function works perfectly now, someone modifying it in a few months may add new behavior. This could result in bugs that are hard to find. Keeping your functions short and simple makes it easier for other people to read and modify your code.
You may come across long and complicated functions when working with existing code. Do not be intimidated by modifying existing code: if working with such a function proves to be difficult, if you find that errors are hard to debug, or if you want to use a piece of it in several different contexts, consider breaking up the function into smaller and more manageable pieces.