Clang provides a user-friendly framework for writing basic static-analysis checks. In this post we will see how analysis on the clang abstract syntax tree (AST) can be used to implement powerful checks for single translation units pretty easily.
- The problem
- Setting up needed tools
- Adding scaffolding
- Anatomy of a checker
- A matcher for virtual methods
- Testing the check
- Processing of base classes
- Conclusion
The problem
As an example we will write a static analysis check which can catch the following problematic code:
// problematic.cpp
struct A {
void f() {}
};
struct B : public A {
virtual void f() {} // problematic
};
Here B
defines a virtual method B::f
with a name identical to a
member function in its base class A
which was never intended to be customized
(i.e. it was not declared as virtual
there).
This will create the confusing situation that depending on the type the member function was called through different functions will be called, e.g.
// create two pointers to a B, but through either A or B
std::unique_ptr<A> a(new B());
std::unique_ptr<B> b(new B());
a->f(); // calls A::f
b->f(); // calls B::f
We will write a check that catches the problematic method declaration in
class B
.
Setting up needed tools
We will write our check as an extension of the clang-tidy
tool which is part of clang.
If we have a compilation
database for our
source we can perform a check with clang-tidy
by running in the build
directory (assuming we have a compilation database)
% clang-tidy some/file.cpp
or on any source file, even without compilation databases, but with no automatic support for custom build flags
% clang-tidy some/file.cpp --
Since we will be working directly inside the clang tool sources we need to check out and build the upstream sources.
# download the sources
% git clone http://llvm.org/git/llvm.git
% cd llvm/tools/
% git clone http://llvm.org/git/clang.git
% cd clang/tools/
% git clone http://llvm.org/git/clang-tools-extra.git extra
# build everything
% cd ../../../
% mkdir build && cd build/
# see http://llvm.org/docs/CMake.html#options-and-variables
# for details on the cmake build process of LLVM
% cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
% make check-clang-tools
Make sure you add the bin/
directory of that build to your path, e.g.
% export PATH=$PWD/bin:$PATH
With that we are ready to add our custom check.
The clang-tidy
sources are in tools/clang/tools/extra/clang-tidy
inside the
LLVM source tree.
Adding scaffolding
Let’s change to the clang-tidy
source directory and add our check (which we
will aptly call VirtualShadowingCheck
).
% cd ../tools/clang/tools/extra/clang-tidy
We will add our tool to the [misc]
category of clang-tidy
checks and before
anything else will need to create the needed files and integrate them into the
build. Thankfully there is a tool doing all of that for us:
% ./add_new_check.py misc virtual-shadowing
This will create misc/VirtualShadowingCheck.h
and
misc/VirtualShadowingCheck.cpp
, and additionally include it in
misc/MiscTidyModule.cpp
so it can be run as a normal part of clang-tidy
.
As of clang-tools-extra
svn revision 236309
(git commit 6a5bbb2
) we still
need to modify misc/CMakeLists.txt
so that our newly added dependency
VirtualShadowingCheck.cpp
comes before the LINK_LIBS
line.
We can now create a version of clang-tidy
including our checker by rebuilding
llvm and tools. To run it on some code we would run
% clang-tidy -checks='-*,misc-virtual-shadowing' some/file.cpp
Here we have first disabled all default-enabled checks with -*
and then
exclusively enabled our check. Right now running this does not output too much
interesting information.
Anatomy of a checker
Looking at misc/VirtualShadowingCheck.h
we find
namespace clang {
namespace tidy {
class VirtualShadowingCheck : public ClangTidyCheck {
public:
VirtualShadowingCheck(StringRef Name, ClangTidyContext *Context)
: ClangTidyCheck(Name, Context) {}
void registerMatchers(ast_matchers::MatchFinder *Finder) override;
void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
};
} // namespace tidy
} // namespace clang
Here VirtualShadowingCheck
is our custom check defined inside the clang::tidy
namespace. It derives from
ClangTidyCheck
. We will need to provide implementations for two functions,
registerMatchers
and check
(remove their dummy implementations for the time being):
- in
registerMatchers
we register clang AST matchers to filter out intesting source locations, and - with
check
we provide a function which is called by the clang machinery whenever a match was found; we can perform further actions here (e.g. emit a warning).
In our case we want to check for any virtual
method of some class whether any
of the class’ bases defined a method with the same name, and our implementation
strategy will be
- filter out declarations of any virtual method with a matcher registered in
registerMatchers
, and - walk the bases of the matched class in
check
and compare to the matchedvirtual
method.
A matcher for virtual methods
Clang comes with a large set of basic matchers for many use cases. Chaining them allows creating powerful matchers.
To get a feeling for the kind of AST node representing B::f
looking at the
clang AST is helpful,
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
% clang-check -ast-dump problematic.cpp --
TranslationUnitDecl 0x2b3cd20 <<invalid sloc>> <invalid sloc>
|-TypedefDecl 0x2b3d258 <<invalid sloc>> <invalid sloc> implicit __int128_t '__int128'
|-TypedefDecl 0x2b3d2b8 <<invalid sloc>> <invalid sloc> implicit __uint128_t 'unsigned __int128'
|-TypedefDecl 0x2b3d698 <<invalid sloc>> <invalid sloc> implicit __builtin_va_list '__va_list_tag [1]'
|-CXXRecordDecl 0x2b3d6e8 </media/sf_home_host/test.cpp:1:1, line:3:1> line:1:8 referenced struct A definition
| |-CXXRecordDecl 0x2b3d800 <col:1, col:8> col:8 implicit struct A
| `-CXXMethodDecl 0x2b3d8e0 <line:2:9, col:19> col:14 f 'void (void)'
| `-CompoundStmt 0x2b3d9b8 <col:18, col:19>
`-CXXRecordDecl 0x2b3d9d0 <line:5:1, line:7:1> line:5:8 struct B definition
|-public 'struct A'
|-CXXRecordDecl 0x2b85050 <col:1, col:8> col:8 implicit struct B
|-CXXMethodDecl 0x2b85100 <line:6:3, col:21> col:16 f 'void (void)' virtual
| `-CompoundStmt 0x2b854f8 <col:20, col:21>
|-CXXMethodDecl 0x2b85208 <line:5:8, <invalid sloc>> col:8 implicit operator= 'struct B &(const struct B &)' inline noexcept-unevaluated 0x2b85208
| `-ParmVarDecl 0x2b85330 <col:8> col:8 'const struct B &'
`-CXXDestructorDecl 0x2b853b8 <col:8> col:8 implicit ~B 'void (void)' inline noexcept-unevaluated 0x2b853b8
We see the two classes defined in lines 6 and 10 as CXXRecordDecl
which
encode both classes and structs; the two definitions of f
in lines 8 and 13
encoded as CXXMethodDecl
s which encapsulate (not much suprisingly)
declarations of methods in C++.
The filter we need would first need to catch method declarations and then then
refine that to only methods declared virtual
. As first filter we use the
cxxMethodDecl
matcher,
% clang-query problematic.cpp --
clang-query> match cxxMethodDecl()
/tmp/problematic.cpp:8:3: note: "root" binds here
virtual void f() {} // problematic
^~~~~~~~~~~~~~~~~~~
1 match.
clang-query> match cxxMethodDecl()
Match #1:
/tmp/problematic.cpp:4:3: note: "root" binds here
void f() {}
^~~~~~~~~~~
Match #2:
/tmp/problematic.cpp:8:3: note: "root" binds here
virtual void f() {} // problematic
^~~~~~~~~~~~~~~~~~~
Match #3:
Match #4:
/tmp/problematic.cpp:7:8: note: "root" binds here
struct B : public A {
^
4 matches.
which finds 4 CXXMethodDecl
s (two for our explicitly declared methods
f
and two for implicitly declared methods).
We chain that with the isVirtual
matcher to only match virtual methods,
% clang-query problematic.cpp --
clang-query> match cxxMethodDecl(isVirtual())
/tmp/problematic.cpp:8:3: note: "root" binds here
virtual void f() {} // problematic
^~~~~~~~~~~~~~~~~~~
1 match.
which finds just the method we are interested in.
All left for us to do is to update the definition of
VirtualShadowingCheck::registerMatchers
in misc/VirtualShadowingCheck.cpp
so it matches virtual
method declarations,
void VirtualShadowingCheck::registerMatchers(MatchFinder *Finder) {
Finder->addMatcher(cxxMethodDecl(isVirtual()).bind("method"), this);
}
This binds the identifier method
to the found method (note its placement with respect to the parentheses).
Testing the check
If you have built clang-tidy
with
% make check-clang-tools
you will have seen that the dummy test case added by add_new_check.py
now
fails so we should update it to at least correctly reflect our intended use
case.
// RUN: $(dirname %s)/check_clang_tidy.sh %s misc-virtual-shadowing %t
// REQUIRES: shell
struct A {
void f() {}
};
struct B : public A {
// CHECK-MESSAGES: :[[@LINE+1]]:3: warning: method hides non-virtual method from a base class [misc-virtual-shadowing]
virtual void f() {} // problematic
};
struct C {
virtual void f() {} // OK(1)
};
struct D : public C {
virtual void f() {} // OK(2)
};
Here we have added three test cases:
- the line marked problematic is our initial problem and should trigger a
warning. We have specified the location of the expected warning with the
CHECK-MESSAGES
macro: the warning should be on the next line (+1
) on column 3. We specified the full expected warning text. - the lines marked OK(1) and OK(2) should not trigger the warning since
they represent valid use cases; consequentlially we added no
CHECK_MESSAGES
markup.
We can already add the diagnostic message to VirtualShadowingCheck::check
,
// see next section
const auto method = Result.Nodes.getNodeAs<CXXMethodDecl>("method");
diag(method->getLocStart(),
"method hides non-virtual method from a base class");
If we rerun the test suite we will see that there is still some work left,
FAIL: Clang Tools :: clang-tidy/misc-virtual-shadowing.cpp (90 of 211)
******************** TEST 'Clang Tools :: clang-tidy/misc-virtual-shadowing.cpp' FAILED ********************
Script:
--
$(dirname /private/tmp/t/llvm/tools/clang/tools/extra/test/clang-tidy/misc-virtual-shadowing.cpp)/check_clang_tidy.sh /private/tmp/t/llvm/tools/clang/tools/extra/test/clang-tidy/misc-virtual-shadowing.cpp misc-virtual-shadowing /private/tmp/t/llvm/build/tools/clang/tools/extra/test/clang-tidy/Output/misc-virtual-shadowing.cpp.tmp
--
Exit Code: 1
Command Output (stdout):
--
------------------------ clang-tidy output ------------------------
3 warnings generated.
/private/tmp/t/llvm/build/tools/clang/tools/extra/test/clang-tidy/Output/misc-virtual-shadowing.cpp.tmp.cpp:10:3: warning: method hides non-virtual method from a base class [misc-virtual-shadowing]
virtual void f() {} // problematic
^
/private/tmp/t/llvm/build/tools/clang/tools/extra/test/clang-tidy/Output/misc-virtual-shadowing.cpp.tmp.cpp:14:3: warning: method hides non-virtual method from a base class [misc-virtual-shadowing]
virtual void f() {} // OK(1)
^
/private/tmp/t/llvm/build/tools/clang/tools/extra/test/clang-tidy/Output/misc-virtual-shadowing.cpp.tmp.cpp:18:3: warning: method hides non-virtual method from a base class [misc-virtual-shadowing]
virtual void f() {} // OK(2)
^
-------------------------------------------------------------------
------------------------------ Fixes ------------------------------
-------------------------------------------------------------------
--
Command Output (stderr):
--
/private/tmp/t/llvm/build/tools/clang/tools/extra/test/clang-tidy/Output/misc-virtual-shadowing.cpp.tmp.cpp.msg:5:115: error: CHECK-MESSAGES-NOT: string occurred!
/private/tmp/t/llvm/build/tools/clang/tools/extra/test/clang-tidy/Output/misc-virtual-shadowing.cpp.tmp.cpp:14:3: warning: method hides non-virtual method from a base class [misc-virtual-shadowing]
^
command line:1:22: note: CHECK-MESSAGES-NOT: pattern specified here
-implicit-check-not=':'
^
--
********************
Testing Time: 0.92s
********************
Failing Tests (1):
Clang Tools :: clang-tidy/misc-virtual-shadowing.cpp
Expected Passes : 204
Expected Failures : 6
Unexpected Failures: 1
FAILED: cd /tmp/t/llvm/build/tools/clang/tools/extra/test && /usr/local/bin/python2.7 /tmp/t/llvm/utils/lit/lit.py -sv /tmp/t/llvm/build/tools/clang/tools/extra/test
Processing of base classes
The matcher we registered will call VirtualShadowingCheck::check
whenever a
matching AST node was found. There we can retrieve the matches by name with
const auto method = Result.Nodes.getNodeAs<CXXMethodDecl>("method");
Here Result.Nodes.getNodeAs<CXXMethodDecl>
returns a CXXMethodDecl*
which
is valid as long as the translation unit is loaded (i.e. much longer than
check
is running).
We can already stop processing if the class containing method
has no bases,
const auto method = Result.Nodes.getNodeAs<CXXMethodDecl>("method");
const auto cl_decl = method->getParent();
if (cl_decl->getNumBases() == 0)
return;
diag(method->getLocStart(),
"method hides non-virtual method from a base class");
Rerunning the test suite shows that now case OK(1) is removed, but we still match OK(2).
To check the base classes for non-virtual methods with identical names we need
to walk the tree of bases; clang provides infrastructure to perform that walk
with
CXXRecordDecl::forallBases
,
bool CXXRecordDecl::forallBases(ForallBasesCallback *BaseMatches,
void *UserData,
bool AllowShortCircuit = true
) const
BaseMatches
is a callback with the signature
typedef bool ForallBasesCallback(const CXXRecordDecl *BaseDefinition,
void *UserData)
and BaseDefinition
a CXXRecordDecl
pointing to the declaration of a base
class. UserData
can point to something we could use to pass additional
information along1. With AllowShortCircuit = true
the callback will be called
for all bases as long the callback returns true
; for AllowShortCircuit =
false
all bases would be walked. If the class has no bases
CXXRecordDecl::forallBases
returns false
.
We can use this to recursively walk the tree of bases. We will pass a pointer
to method
as Userdata
to perform checks on the name. Inside
VirtualShadowingCheck::check
we would call
if (not cl_decl->forallBases(CandidatePred,
const_cast<CXXMethodDecl *>(method)))
return;
i.e. call some predicate CandidatePred
for all bases of the class containing
method
and give up if none of them returns true.
If this would not exit check
we emit a warning for method
since it collides
with some base’s name. With this VirtualShadowingCheck::check
’s declaration would look like
void VirtualShadowingCheck::check(
const ast_matchers::MatchFinder::MatchResult &Result) {
const auto method = Result.Nodes.getNodeAs<CXXMethodDecl>("method");
const auto cl_decl = method->getParent();
if (cl_decl->getNumBases() == 0)
return;
if (not cl_decl->forallBases(CandidatePred,
const_cast<CXXMethodDecl *>(method)))
return;
diag(method->getLocStart(),
"method hides non-virtual method from a base class");
}
Now all the work left is to define the predicate.
namespace
{
bool CandidatePred(const CXXRecordDecl *BaseDefinition, void *UserData) {
// get the original method from UserData
const auto method = reinterpret_cast<const CXXMethodDecl *>(UserData);
// get a string ref to the original method's name
const auto name = method->getName();
// check all the methods in the base class
for (const auto &base_method : BaseDefinition->methods()) {
if (name == base_method->getName()) {
// if we found a method with the same name check if it's virtual:
// * if it is not virtual the original method was problematic
// * if the base's method was virtual the original method was OK
// (but the base's method could be problematic and would be checked
// in a different analyzer run)
return not base_method->isVirtual();
}
}
// we found no collisions with this base's methods, now check the base's
// bases for collisions with the original method
return BaseDefinition->forallBases(CandidatePred, UserData);
}
} // namespace
Rerunning the test suite show that we have covered all our test cases,
Testing Time: 0.92s
Expected Passes : 205
Expected Failures : 6
Conclusion
While we have just scratched the surface of what is possible, it has never been easier to write custom static analysis checks of C++ code. Clang provides both infrastructure and tools which allow the user to focus on the real problem – formulating the problem.
-
In the development version clang-3.8 the signature changed and
UserData
got dropped,bool CXXRecordDecl::forallBases(ForallBasesCallback *BaseMatches, bool AllowShortCircuit = true)
Now
ForallBasesCallback
is a callable taking aconst CXXRecordDecl*
as only argument. To pass additional data along one can explicitly (and in a type-safe manner) capture data with a lambda closure. ↩