This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.
Differential Revision: https://reviews.llvm.org/D77112
Summary:
This check was causing a crash in a test case where the 0th argument was
uninitialized ('Assertion `T::isKind(*this)' at line SVals.h:104). This
was happening since the argument was actually undefined, but the castAs
assumes the value is DefinedOrUnknownSVal.
The fix appears to be simply to check for an undefined value and skip
the check allowing the uninitalized value checker to detect the error.
I included a test case that I verified to produce the negative case
prior to the fix, and passes with the fix.
Reviewers: martong, NoQ
Subscribers: xazax.hun, szepet, rnkovacs, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, Charusso, ASDenysPetrov, baloghadamsoftware, dkrupp, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77012
Since its important to know whether a function frees memory (even if its a
reallocating function!), I used two CallDescriptionMaps to merge all
CallDescriptions into it. MemFunctionInfoTy no longer makes sense, it may never
have, but for now, it would be more of a distraction then anything else.
Differential Revision: https://reviews.llvm.org/D68165
Summary:
Added basic representation and parsing/sema handling of array-shaping
operations. Array shaping expression is an expression of form ([s0]..[sn])base,
where s0, ..., sn must be a positive integer, base - a pointer. This
expression is a kind of cast operation that converts pointer expression
into an array-like kind of expression.
Reviewers: rjmccall, rsmith, jdoerfert
Subscribers: guansong, arphaman, cfe-commits, caomhin, kkwli0
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74144
Summary:
The kernel kmalloc function may return a constant value ZERO_SIZE_PTR
if a zero-sized block is allocated. This special value is allowed to
be passed to kfree and should produce no warning.
This is a simple version but should be no problem. The macro is always
detected independent of if this is a kernel source code or any other
code.
Reviewers: Szelethus, martong
Reviewed By: Szelethus, martong
Subscribers: rnkovacs, xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, gamesh411, Charusso, martong, ASDenysPetrov, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76830
Iterator checkers (and planned container checkers) need the option
aggressive-binary-operation-simplification to be enabled. Without this
option they may cause assertions. To prevent such misuse, this patch adds
a preventive check which issues a warning and denies the registartion of
the checker if this option is disabled.
Differential Revision: https://reviews.llvm.org/D75171
Some checkers may not only depend on language options but also analyzer options.
To make this possible this patch changes the parameter of the shouldRegister*
function to CheckerManager to be able to query the analyzer options when
deciding whether the checker should be registered.
Differential Revision: https://reviews.llvm.org/D75271
Originally commited in rG57b8a407493c34c3680e7e1e4cb82e097f43744a, but
it broke the modules bot. This is solved by putting the contructors of
the CheckerManager class to the Frontend library.
Differential Revision: https://reviews.llvm.org/D75360
If an error happens which is related to a container the Container
Modeling checker adds note tags to all the container operations along
the bug path. This may be disturbing if there are other containers
beside the one which is affected by the bug. This patch restricts the
note tags to only the affected container and adjust the debug checkers
to be able to test this change.
Differential Revision: https://reviews.llvm.org/D75514
Container operations such as `push_back()`, `pop_front()`
etc. increment and decrement the abstract begin and end
symbols of containers. This patch introduces note tags
to `ContainerModeling` to track these changes. This helps
the user to better identify the source of errors related
to containers and iterators.
Differential Revision: https://reviews.llvm.org/D73720
Normally clang avoids creating expressions when it encounters semantic
errors, even if the parser knows which expression to produce.
This works well for the compiler. However, this is not ideal for
source-level tools that have to deal with broken code, e.g. clangd is
not able to provide navigation features even for names that compiler
knows how to resolve.
The new RecoveryExpr aims to capture the minimal set of information
useful for the tools that need to deal with incorrect code:
source range of the expression being dropped,
subexpressions of the expression.
We aim to make constructing RecoveryExprs as simple as possible to
ensure writing code to avoid dropping expressions is easy.
Producing RecoveryExprs can result in new code paths being taken in the
frontend. In particular, clang can produce some new diagnostics now and
we aim to suppress bogus ones based on Expr::containsErrors.
We deliberately produce RecoveryExprs only in the parser for now to
minimize the code affected by this patch. Producing RecoveryExprs in
Sema potentially allows to preserve more information (e.g. type of an
expression), but also results in more code being affected. E.g.
SFINAE checks will have to take presence of RecoveryExprs into account.
Initial implementation only works in C++ mode, as it relies on compiler
postponing diagnostics on dependent expressions. C and ObjC often do not
do this, so they require more work to make sure we do not produce too
many bogus diagnostics on the new expressions.
See documentation of RecoveryExpr for more details.
original patch from Ilya
This change is based on https://reviews.llvm.org/D61722
Reviewers: sammccall, rsmith
Reviewed By: sammccall, rsmith
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69330
TableGen and .def files (which are meant to be used with the preprocessor) come
with obvious downsides. One of those issues is that generated switch-case
branches have to be identical. This pushes corner cases either to an outer code
block, or into the generated code.
Inspect the removed code in AnalysisConsumer::DigestAnalyzerOptions. You can see
how corner cases like a not existing output file, the analysis output type being
set to PD_NONE, or whether to complement the output with additional diagnostics
on stderr lay around the preprocessor generated code. This is a bit problematic,
as to how to deal with such errors is not in the hands of the users of this
interface (those implementing output types, like PlistDiagnostics etc).
This patch changes this by moving these corner cases into the generated code,
more specifically, into the called functions. In addition, I introduced a new
output type for convenience purposes, PD_TEXT_MINIMAL, which always existed
conceptually, but never in the actual Analyses.def file. This refactoring
allowed me to move TextDiagnostics (renamed from ClangDiagPathDiagConsumer) to
its own file, which it really deserved.
Also, those that had the misfortune to gaze upon Analyses.def will probably
enjoy the sight that a clang-format did on it.
Differential Revision: https://reviews.llvm.org/D76509
Upon calling one of the functions `std::advance()`, `std::prev()` and
`std::next()` iterators could get out of their valid range which leads
to undefined behavior. If all these funcions are inlined together with
the functions they call internally (e.g. `__advance()` called by
`std::advance()` in some implementations) the error is detected by
`IteratorRangeChecker` but the bug location is inside the STL
implementation. Even worse, if the budget runs out and one of the calls
is not inlined the bug remains undetected. This patch fixes this
behavior: all the bugs are detected at the point of the STL function
invocation.
Differential Revision: https://reviews.llvm.org/D76379
Its been a while since my CheckerRegistry related patches landed, allow me to
refresh your memory:
During compilation, TblGen turns
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td into
(build directory)/tools/clang/include/clang/StaticAnalyzer/Checkers/Checkers.inc.
This is a file that contains the full name of the checkers, their options, etc.
The class that is responsible for parsing this file is CheckerRegistry. The job
of this class is to establish what checkers are available for the analyzer (even
from plugins and statically linked but non-tblgen generated files!), and
calculate which ones should be turned on according to the analyzer's invocation.
CheckerManager is the class that is responsible for the construction and storage
of checkers. This process works by first creating a CheckerRegistry object, and
passing itself to CheckerRegistry::initializeManager(CheckerManager&), which
will call the checker registry functions (for example registerMallocChecker) on
it.
The big problem here is that these two classes lie in two different libraries,
so their interaction is pretty awkward. This used to be far worse, but I
refactored much of it, which made things better but nowhere near perfect.
---
This patch changes how the above mentioned two classes interact. CheckerRegistry
is mainly used by CheckerManager, and they are so intertwined, it makes a lot of
sense to turn in into a field, instead of a one-time local variable. This has
additional benefits: much of the information that CheckerRegistry conveniently
holds is no longer thrown away right after the analyzer's initialization, and
opens the possibility to pass CheckerManager in the shouldRegister* function
rather then LangOptions (D75271).
There are a few problems with this. CheckerManager isn't the only user, when we
honor help flags like -analyzer-checker-help, we only have access to a
CompilerInstance class, that is before the point of parsing the AST.
CheckerManager makes little sense without ASTContext, so I made some changes and
added new constructors to make it constructible for the use of help flags.
Differential Revision: https://reviews.llvm.org/D75360
Whenever the analyzer budget runs out just at the point where
`std::advance()`, `std::prev()` or `std::next()` is invoked the function
are not inlined. This results in strange behavior such as
`std::prev(v.end())` equals `v.end()`. To prevent this model these
functions if they were not inlined. It may also happend that although
`std::advance()` is inlined but a function it calls inside (e.g.
`__advance()` in some implementations) is not. This case is also handled
in this patch.
Differential Revision: https://reviews.llvm.org/D76361
Summary:
Currently, ValueRange is very hard to extend with new kind of constraints.
For instance, it forcibly encapsulates relations between arguments and the
return value (ComparesToArgument) besides handling the regular value
ranges (OutOfRange, WithinRange).
ValueRange in this form is not suitable to add new constraints on
arguments like "not-null".
This refactor introduces a new base class ValueConstraint with an
abstract apply function. Descendants must override this. There are 2
descendants: RangeConstraint and ComparisonConstraint. In the following
patches I am planning to add the NotNullConstraint, and additional
virtual functions like `negate()` and `warning()`.
Reviewers: NoQ, Szelethus, balazske, gamesh411, baloghadamsoftware, steakhal
Subscribers: whisperity, xazax.hun, szepet, rnkovacs, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, Charusso, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74973
This makes life easier for downstream users who maintain exotic
target platforms.
Patch by Vince Bridgers!
Differential Revision: https://reviews.llvm.org/D75529
Most clients of SourceManager.h need to do things like turning source
locations into file & line number pairs, but this doesn't require
bringing in FileManager.h and LLVM's FS headers.
The main code change here is to sink SM::createFileID into the cpp file.
I reason that this is not performance critical because it doesn't happen
on the diagnostic path, it happens along the paths of macro expansion
(could be hot) and new includes (less hot).
Saves some includes:
309 - /usr/local/google/home/rnk/llvm-project/clang/include/clang/Basic/FileManager.h
272 - /usr/local/google/home/rnk/llvm-project/clang/include/clang/Basic/FileSystemOptions.h
271 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/VirtualFileSystem.h
267 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/FileSystem.h
266 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/Chrono.h
Differential Revision: https://reviews.llvm.org/D75406
error: default initialization of an object of const type
'const clang::QualType' without a user-provided
default constructor
Irrelevant; // A placeholder, whenever we do not care about the type.
^
{}
Lambdas creating path notes using NoteTags still take BugReport as their
parameter. Since path notes obviously only appear in PathSensitiveBugReports
it is straightforward that lambdas of NoteTags take PathSensitiveBugReport
as their parameter.
Differential Revision: https://reviews.llvm.org/D75898
Most of the getter functions (and a reporter function) in
`CheckerManager` are constant but not marked as `const`. This prevents
functions having only a constant reference to `CheckerManager` using
these member functions. This patch fixes this issue.
Differential Revision: https://reviews.llvm.org/D75839
Summary:
According to documentations, after an `fclose` call any other stream
operations cause undefined behaviour, regardless if the close failed
or not.
This change adds the check for the opened state before all other
(applicable) operations.
Reviewers: Szelethus
Reviewed By: Szelethus
Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, gamesh411, Charusso, martong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75614
Summary:
Adding PreCall callback.
Argument validity checks are moved into the PreCall callback.
Code is restructured, functions renamed.
There are "pre" and "eval" functions for the file operations.
And additional state check (validate) functions.
Reviewers: Szelethus
Reviewed By: Szelethus
Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, gamesh411, Charusso, martong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75612
Summary:
Intended to be a non-functional change but it turned out CallEvent handles
constructor calls unlike CallExpr which doesn't triggered for constructors.
All in all, this change shouldn't be observable since constructors are not
yet propagating taintness like functions.
In the future constructors should propagate taintness as well.
This change includes:
- NFCi change all uses of the CallExpr to CallEvent
- NFC rename some functions, mark static them etc.
- NFC omit explicit TaintPropagationRule type in switches
- NFC apply some clang-tidy fixits
Reviewers: NoQ, Szelethus, boga95
Reviewed By: Szelethus
Subscribers: martong, whisperity, xazax.hun, baloghadamsoftware, szepet,
a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, Charusso, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72035
Summary:
`ScopeContext` wanted to be a thing, but sadly it is dead code.
If you wish to continue the work in D19979, here was a tiny code which
could be reused, but that tiny and that dead, I felt that it is unneded.
Note: Other changes are truly uninteresting.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D73519
Summary: The new way of checking fix-its is `%check_analyzer_fixit`.
Reviewed By: NoQ, Szelethus, xazax.hun
Differential Revision: https://reviews.llvm.org/D73729
Summary:
This patch introduces a way to apply the fix-its by the Analyzer:
`-analyzer-config apply-fixits=true`.
The fix-its should be testable, therefore I have copied the well-tested
`check_clang_tidy.py` script. The idea is that the Analyzer's workflow
is different so it would be very difficult to use only one script for
both Tidy and the Analyzer, the script would diverge a lot.
Example test: `// RUN: %check-analyzer-fixit %s %t -analyzer-checker=core`
When the copy-paste happened the original authors were:
@alexfh, @zinovy.nis, @JonasToth, @hokein, @gribozavr, @lebedev.ri
Reviewed By: NoQ, alexfh, zinovy.nis
Differential Revision: https://reviews.llvm.org/D69746
Summary:
This patch introduces the `clang_analyzer_isTainted` expression inspection
check for checking taint.
Using this we could query the analyzer whether the expression used as the
argument is tainted or not. This would be useful in tests, where we don't want
to issue warning for all tainted expressions in a given file
(like the `debug.TaintTest` would do) but only for certain expressions.
Example usage:
```lang=c++
int read_integer() {
int n;
clang_analyzer_isTainted(n); // expected-warning{{NO}}
scanf("%d", &n);
clang_analyzer_isTainted(n); // expected-warning{{YES}}
clang_analyzer_isTainted(n + 2); // expected-warning{{YES}}
clang_analyzer_isTainted(n > 0); // expected-warning{{YES}}
int next_tainted_value = n; // no-warning
return n;
}
```
Reviewers: NoQ, Szelethus, baloghadamsoftware, xazax.hun, boga95
Reviewed By: Szelethus
Subscribers: martong, rnkovacs, whisperity, xazax.hun,
baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, donat.nagy,
Charusso, cfe-commits, boga95, dkrupp, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74131
Summary:
Have a description object for the stream functions
that can store different aspects of a single stream operation.
I plan to extend the structure with other members,
for example pre-callback and index of the stream argument.
Reviewers: Szelethus, baloghadamsoftware, NoQ, martong, Charusso, xazax.hun
Reviewed By: Szelethus
Subscribers: rnkovacs, xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, gamesh411, Charusso, martong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75158
So far we've been dropping coverage every time we've encountered
a CXXInheritedCtorInitExpr. This patch attempts to add some
initial support for it.
Constructors for arguments of a CXXInheritedCtorInitExpr are still
not fully supported.
Differential Revision: https://reviews.llvm.org/D74735
Exactly what it says on the tin! I decided not to merge this with the patch that
changes all these to a CallDescriptionMap object, so the patch is that much more
trivial.
Differential Revision: https://reviews.llvm.org/D68163
Currently, using negative numbers in iterator operations (additions and
subractions) results in advancements with huge positive numbers due to
an error. This patch fixes it.
Differential Revision: https://reviews.llvm.org/D74760
The following series of refactoring patches aim to fix the horrible mess that MallocChecker.cpp is.
I genuinely hate this file. It goes completely against how most of the checkers
are implemented, its by far the biggest headache regarding checker dependencies,
checker options, or anything you can imagine. On top of all that, its just bad
code. Its seriously everything that you shouldn't do in C++, or any other
language really. Bad variable/class names, in/out parameters... Apologies, rant
over.
So: there are a variety of memory manipulating function this checker models. One
aspect of these functions is their AllocationFamily, which we use to distinguish
between allocation kinds, like using free() on an object allocated by operator
new. However, since we always know which function we're actually modeling, in
fact we know it compile time, there is no need to use tricks to retrieve this
information out of thin air n+1 function calls down the line. This patch changes
many methods of MallocChecker to take a non-optional AllocationFamily template
parameter (which also makes stack dumps a bit nicer!), and removes some no
longer needed auxiliary functions.
Differential Revision: https://reviews.llvm.org/D68162
Summary:
PutenvWithAutoChecker.cpp used to include "AllocationState.h" that is present in project root.
This makes build systems like blaze unhappy. Made it include the header relative to source file.
Reviewers: kadircet
Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, Charusso, martong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74906
Summary:
This patch introduces a new checker:
`alpha.security.cert.pos.34c`
This checker is implemented based on the following rule:
https://wiki.sei.cmu.edu/confluence/x/6NYxBQ
The check warns if `putenv` function is
called with automatic storage variable as an argument.
Differential Revision: https://reviews.llvm.org/D71433
In the path-sensitive vfork() checker that keeps a list of operations
allowed after a successful vfork(), unforget to include execve() in the list.
Patch by Jan Včelák!
Differential Revision: https://reviews.llvm.org/D73629
Summary:
Both EOF and the max value of unsigned char is platform dependent. In this
patch we try our best to deduce the value of EOF from the Preprocessor,
if we can't we fall back to -1.
Reviewers: Szelethus, NoQ
Subscribers: whisperity, xazax.hun, kristof.beyls, baloghadamsoftware, szepet, rnkovacs, a.sidorin, mikhail.ramalh
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74473
Summary:
Simplifies the C++11-style "-> decltype(...)" return-type deduction.
Note that you have to be careful about whether the function return type
is `auto` or `decltype(auto)`. The difference is that bare `auto`
strips const and reference, just like lambda return type deduction. In
some cases that's what we want (or more likely, we know that the return
type is a value type), but whenever we're wrapping a templated function
which might return a reference, we need to be sure that the return type
is decltype(auto).
No functional change.
Reviewers: bkramer, MaskRay, martong, shafik
Subscribers: martong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74423
STL Algorithms are usually implemented in a tricky for performance
reasons which is too complicated for the analyzer. Furthermore inlining
them is costly. Instead of inlining we should model their behavior
according to the specifications.
This patch is the first step towards STL Algorithm modeling. It models
all the `find()`-like functions in a simple way: the result is either
found or not. In the future it can be extended to only return success if
container modeling is also extended in a way the it keeps track of
trivial insertions and deletions.
Differential Revision: https://reviews.llvm.org/D70818
Summary:
This patch hooks the `Preprocessor` trough `BugReporter` to the
`CheckerContext` so the checkers could look for macro definitions.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D69731
Summary:
This patch uses the new `DynamicSize.cpp` to serve dynamic information.
Previously it was static and probably imprecise data.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D69599
Summary:
This patch introduces a placeholder for representing the dynamic size of
regions. It also moves the `getExtent()` method of `SubRegions` to the
`MemRegionManager` as `getStaticSize()`.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D69540
Iterator modeling depends on container modeling,
but not vice versa. This enables the possibility
to arrange these two modeling checkers into
separate layers.
There are several advantages for doing this: the
first one is that this way we can keep the
respective modeling checkers moderately simple
and small. Furthermore, this enables creation of
checkers on container operations which only
depend on the container modeling. Thus iterator
modeling can be disabled together with the
iterator checkers if they are not needed.
Since many container operations also affect
iterators, container modeling also uses the
iterator library: it creates iterator positions
upon calling the `begin()` or `end()` method of
a containter (but propagation of the abstract
position is left to the iterator modeling),
shifts or invalidates iterators according to the
rules upon calling a container modifier and
rebinds the iterator to a new container upon
`std::move()`.
Iterator modeling propagates the abstract
iterator position, handles the relations between
iterator positions and models iterator
operations such as increments and decrements.
Differential Revision: https://reviews.llvm.org/D73547
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
These are mostly trivial additions as both of them are reusing existing
PThreadLockChecker logic. I only needed to add the list of functions to
check and do some plumbing to make sure that we display the right
checker name in the diagnostic.
Differential Revision: https://reviews.llvm.org/D73376
Summary:
Instead of checking the range manually, changed the checker to use assumeInclusiveRangeDual instead.
This patch was part of D28955.
Reviewers: NoQ
Reviewed By: NoQ
Subscribers: ddcc, xazax.hun, baloghadamsoftware, szepet, a.sidorin, Szelethus, donat.nagy, dkrupp, Charusso, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73062
Implement support for C++2a requires-expressions.
Re-commit after compilation failure on some platforms due to alignment issues with PointerIntPair.
Differential Revision: https://reviews.llvm.org/D50360
Summary:
This checker verifies if default placement new is provided with pointers
to sufficient storage capacity.
Noncompliant Code Example:
#include <new>
void f() {
short s;
long *lp = ::new (&s) long;
}
Based on SEI CERT rule MEM54-CPP
https://wiki.sei.cmu.edu/confluence/display/cplusplus/MEM54-CPP.+Provide+placement+new+with+properly+aligned+pointe
This patch does not implement checking of the alignment.
Reviewers: NoQ, xazax.hun
Subscribers: mgorny, whisperity, xazax.hun, baloghadamsoftware, szepet,
rnkovacs, a.sidorin, mikhail.ramalho, donat
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71612
This avoids unneeded copies when using a range-based for loops.
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.
Differential Revision: https://reviews.llvm.org/D70869
Method '-[NSCoder decodeValueOfObjCType:at:]' is not only deprecated
but also a security hazard, hence a loud check.
Differential Revision: https://reviews.llvm.org/D71728
MallocChecker warns when memory is passed into -[NSData initWithBytesNoCopy]
but isn't allocated by malloc(), because it will be deallocated by free().
However, initWithBytesNoCopy has an overload that takes an arbitrary block
for deallocating the object. If such overload is used, it is no longer
necessary to make sure that the memory is allocated by malloc().
This is useful for clients that are relying on linearized CFGs for evaluating
subexpressions and want the default initializer to be evaluated properly.
The upcoming lifetime analysis is using this but it might also be useful
for the static analyzer at some point.
Differential Revision: https://reviews.llvm.org/D71642
This canonicalizes the representation of unknown pointer symbols,
which reduces the overall confusion in pointer cast representation.
Patch by Vince Bridgers!
Differential Revision: https://reviews.llvm.org/D70836
This patch introduces the namespaces for the configured functions and
also enables the use of the member functions.
I added an optional Scope field for every configured function. Functions
without Scope match for every function regardless of the namespace.
Functions with Scope will match if the full name of the function starts
with the Scope.
Multiple functions can exist with the same name.
Differential Revision: https://reviews.llvm.org/D70878
AbstractBasicReader.h has quite a few dependencies already,
and that's only likely to increase. Meanwhile, ASTRecordReader
is really an implementation detail of the ASTReader that is only
used in a small number of places.
I've kept it in a public header for the use of projects like Swift
that might want to plug in to Clang's serialization framework.
I've also moved OMPClauseReader into an implementation file,
although it can't be made private because of friendship.
Some AST nodes which stands for implicit initialization is shared. The analyzer
will do the same evaluation on the same nodes resulting in the same state. The
analyzer will "cache out", i.e. it thinks that it visited an already existing
node in the exploded graph. This is not true in this case and we lose coverage.
Since these nodes do not really require any processing from the analyzer
we just omit them from the CFG.
Differential Revision: https://reviews.llvm.org/D71371
This patch introduced additional PointerEscape callbacks after conservative
calls for output parameters. This should not really affect the current
checkers but the upcoming FuchsiaHandleChecker relies on this heavily.
Differential Revision: https://reviews.llvm.org/D71224
The checker was trying to analyze the body of every method in Objective-C
@implementation clause but the sythesized accessor stubs that were introduced
into it by 2073dd2d have no bodies.
While analyzing code `memcmp(a, NULL, n);', where `a' has an unconstrained
symbolic value, the analyzer was emitting a warning about the *first* argument
being a null pointer, even though we'd rather have it warn about the *second*
argument.
This happens because CStringChecker first checks whether the two argument
buffers are in fact the same buffer, in order to take the fast path.
This boils down to assuming `a == NULL' to true. Then the subsequent check
for null pointer argument "discovers" that `a' is null.
Don't take the fast path unless we are *sure* that the buffers are the same.
Otherwise proceed as normal.
Differential Revision: https://reviews.llvm.org/D71322
Sometimes the return value of a comparison operator call is
`UnkownVal`. Since no assumptions can be made on `UnknownVal`,
this leeds to keeping impossible execution paths in the
exploded graph resulting in poor performance and false
positives. To overcome this we replace unknown results of
iterator comparisons by conjured symbols.
Differential Revision: https://reviews.llvm.org/D70244
Debugging the Iterator Modeling checker or any of the iterator checkers
is difficult without being able to see the relations between the
iterator variables and their abstract positions, as well as the abstract
symbols denoting the begin and the end of the container.
This patch adds the checker-specific part of the Program State printing
to the Iterator Modeling checker.
A monolithic checker class is hard to maintain. This patch splits it up
into a modeling part, the three checkers and a debug checker. The common
functions are moved into a library.
Differential Revision: https://reviews.llvm.org/D70320
It was a step in the right direction but it is not clear how can this
fit into the checker API at this point. The pre-escape happens in the
analyzer core and the checker has no control over it. If the checker
is not interestd in a pre-escape it would need to do additional work
on each escape to check if the escaped symbol is originated from an
"uninteresting" pre-escaped memory region. In order to keep the
checker API simple we abandoned this solution for now.
We will reland this once we have a better answer for what to do on the
checker side.
This reverts commit f3a28202ef.
We want to escape all symbols that are stored into escaped regions.
The problem is, we did not know which local regions were escaped. Until now.
This should fix some false positives like the one in the tests.
Differential Revision: https://reviews.llvm.org/D71152
This commit sets the Self and Imp declarations for ObjC method declarations,
in addition to the definitions. It also fixes
a bunch of code in clang that had wrong assumptions about when getSelfDecl() would be set:
- CGDebugInfo::getObjCMethodName and AnalysisConsumer::getFunctionName would assume that it was
set for method declarations part of a protocol, which they never were,
and that self would be a Class type, which it isn't as it is id for a protocol.
Also use the Canonical Decl to index the set of Direct methods so that
when calls and implementations interleave, the same llvm::Function is
used and the same symbol name emitted.
Radar-Id: rdar://problem/57661767
Patch by: Pierre Habouzit
Differential Revision: https://reviews.llvm.org/D71091
When implementation of the block runtime is available, we should not
warn that block layout fields are uninitialized simply because they're
on the stack.
This patch is the last of the series of patches which allow the user to
annotate their functions with taint propagation rules.
I implemented the use of the configured filtering functions. These
functions can remove taintedness from the symbols which are passed at
the specified arguments to the filters.
Differential Revision: https://reviews.llvm.org/D59516
Fix a canonicalization problem for the newly added property accessor stubs that
was causing a wrong decl to be used for 'self' in the accessor's body farm.
Fix a crash when constructing a body farm for accessors of a property
that is declared and @synthesize'd in different (but related) interfaces.
Differential Revision: https://reviews.llvm.org/D70158
Let the checkers use a reference instead of a copy in a range-based
for loop.
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.
Differential Revision: https://reviews.llvm.org/D70047
When bugreporter::trackExpressionValue() is invoked on a DeclRefExpr,
it tries to do most of its computations over the node in which
this DeclRefExpr is computed, rather than on the error node (or whatever node
is stuffed into it). One reason why we can't simply use the error node is
that the binding to that variable might have already disappeared from the state
by the time the bug is found.
In case of the inlined defensive checks visitor, the DeclRefExpr node
is in fact sometimes too *early*: the call in which the inlined defensive check
has happened might have not been entered yet.
Change the visitor to be fine with tracking dead symbols (which it is totally
capable of - the collapse point for the symbol is still well-defined), and fire
it up directly on the error node. Keep using "LVState" to find out which value
should we be tracking, so that there weren't any problems with accidentally
loading an ill-formed value from a dead variable.
Differential Revision: https://reviews.llvm.org/D67932
This patch is motivated by (and factored out from)
https://reviews.llvm.org/D66121 which is a debug info bugfix. Starting
with DWARF 5 all Objective-C methods are nested inside their
containing type, and that patch implements this for synthesized
Objective-C properties.
1. SemaObjCProperty populates a list of synthesized accessors that may
need to inserted into an ObjCImplDecl.
2. SemaDeclObjC::ActOnEnd inserts forward-declarations for all
accessors for which no override was provided into their
ObjCImplDecl. This patch does *not* synthesize AST function
*bodies*. Moving that code from the static analyzer into Sema may
be a good idea though.
3. Places that expect all methods to have bodies have been updated.
I did not update the static analyzer's inliner for synthesized
properties to point back to the property declaration (see
test/Analysis/Inputs/expected-plists/nullability-notes.m.plist), which
I believed to be more bug than a feature.
Differential Revision: https://reviews.llvm.org/D68108
rdar://problem/53782400
For white-box testing correct container and iterator modelling it is essential
to access the internal data structures stored for container and iterators. This
patch introduces a simple debug checkers called debug.IteratorDebugging to
achieve this.
Differential Revision: https://reviews.llvm.org/D67156
- Fix false positive reports of strlcat.
- The return value of strlcat and strlcpy is now correctly calculated.
- The resulting string length of strlcat and strlcpy is now correctly
calculated.
Patch by Daniel Krupp!
Differential Revision: https://reviews.llvm.org/D66049
Summary:
Recognization of function names is done now with the CallDescription
class instead of using IdentifierInfo. This means function name and
argument count is compared too.
A new check for filtering not global-C-functions was added.
Test was updated.
Reviewers: Szelethus, NoQ, baloghadamsoftware, Charusso
Reviewed By: Szelethus, NoQ, Charusso
Subscribers: rnkovacs, xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, donat.nagy, Charusso, dkrupp, Szelethus, gamesh411, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67706
Member operator declarations and member operator expressions
have different numbering of parameters and arguments respectively:
one of them includes "this", the other does not.
Account for this inconsistency when figuring out whether
the parameter needs to be manually rebound from the Environment
to the Store when entering a stack frame of an operator call,
as opposed to being constructed with a constructor and as such
already having the necessary Store bindings.
Differential Revision: https://reviews.llvm.org/D69155
The '->' thing has always been confusing; the actual operation '->'
translates to a pointer dereference together with adding a FieldRegion,
but FieldRegion on its own doesn't imply an additional pointer
dereference.
llvm-svn: 375281
One of the first attempts to reduce the size of the exploded graph dumps
was to skip the state dump as long as the state is the same as in all of
the predecessor nodes. With all the new facilities in place (node joining,
diff dumps), this feature doesn't do much, and when it does,
it's more harmful than useful. Let's remove it.
llvm-svn: 375280
The joined nodes now actually have the same state. That was intended
from the start but the original implementation turned out to be buggy.
Differential Revision: https://reviews.llvm.org/D69150
llvm-svn: 375278
ExplodedGraph nodes will now have a numeric identifier stored in them
which will keep track of the order in which the nodes were created
and it will be fully deterministic both accross runs and across machines.
This is extremely useful for debugging as it allows reliably setting
conditional breakpoints by node IDs.
llvm-svn: 375186
Part of C++20 Concepts implementation effort. Added Concept Specialization Expressions that are created when a concept is refe$
D41217 on Phabricator.
(recommit after fixing failing Parser test on windows)
llvm-svn: 374903
Part of C++20 Concepts implementation effort. Added Concept Specialization Expressions that are created when a concept is referenced with arguments, and tests thereof.
llvm-svn: 374882
Added parsing/sema/codegen support for 'parallel master taskloop'
constructs. Some of the clauses, like 'grainsize', 'num_tasks', 'final'
and 'priority' are not supported in full, only constant expressions can
be used currently in these clauses.
llvm-svn: 374791
The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us.
llvm-svn: 374717
Some compilers have trouble converting unique_ptr<PathSensitiveBugReport> to
unique_ptr<BugReport> causing some functions to fail to compile.
Changing the return type of the functions that fail to compile does not
appear to have any issues.
I ran into this issue building with clang 3.8 on Ubuntu 16.04.
llvm-svn: 372668
Summary:
https://bugs.llvm.org/show_bug.cgi?id=43102
In today's edition of "Is this any better now that it isn't crashing?", I'd like to show you a very interesting test case with loop widening.
Looking at the included test case, it's immediately obvious that this is not only a false positive, but also a very bad bug report in general. We can see how the analyzer mistakenly invalidated `b`, instead of its pointee, resulting in it reporting a null pointer dereference error. Not only that, the point at which this change of value is noted at is at the loop, rather then at the method call.
It turns out that `FindLastStoreVisitor` works correctly, rather the supplied explodedgraph is faulty, because `BlockEdge` really is the `ProgramPoint` where this happens.
{F9855739}
So it's fair to say that this needs improving on multiple fronts. In any case, at least the crash is gone.
Full ExplodedGraph: {F9855743}
Reviewers: NoQ, xazax.hun, baloghadamsoftware, Charusso, dcoughlin, rnkovacs, TWeaver
Subscribers: JesperAntonsson, uabelho, Ka-Ka, bjope, whisperity, szepet, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, gamesh411, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66716
llvm-svn: 372269
Traditionally, clang-tidy uses the term check, and the analyzer uses checker,
but in the very early years, this wasn't the case, and code originating from the
early 2010's still incorrectly refer to checkers as checks.
This patch attempts to hunt down most of these, aiming to refer to checkers as
checkers, but preserve references to callback functions (like checkPreCall) as
checks.
Differential Revision: https://reviews.llvm.org/D67140
llvm-svn: 371760
At this point the PathDiagnostic, PathDiagnosticLocation, PathDiagnosticPiece
structures no longer rely on anything specific to Static Analyzer, so we can
move them out of it for everybody to use.
PathDiagnosticConsumers are still to be handed off.
Differential Revision: https://reviews.llvm.org/D67419
llvm-svn: 371661
This method of PathDiagnostic is a part of Static Analyzer's particular
path diagnostic construction scheme. As such, it doesn't belong to
the PathDiagnostic class, but to the Analyzer.
Differential Revision: https://reviews.llvm.org/D67418
llvm-svn: 371660
These static functions deal with ExplodedNodes which is something we don't want
the PathDiagnostic interface to know anything about, as it's planned to be
moved out of libStaticAnalyzerCore.
Differential Revision: https://reviews.llvm.org/D67382
llvm-svn: 371659
That's one of the few random entities in the PathDiagnostic interface that
are specific to the Static Analyzer. By moving them out we could let
everybody use path diagnostics without linking against Static Analyzer.
Differential Revision: https://reviews.llvm.org/D67381
llvm-svn: 371658
Checkers are now required to specify whether they're creating a
path-sensitive report or a path-insensitive report by constructing an
object of the respective type.
This makes BugReporter more independent from the rest of the Static Analyzer
because all Analyzer-specific code is now in sub-classes.
Differential Revision: https://reviews.llvm.org/D66572
llvm-svn: 371450
Allow attaching fixit hints to Static Analyzer BugReports.
Fixits are attached either to the bug report itself or to its notes
(path-sensitive event notes or path-insensitive extra notes).
Add support for fixits in text output (including the default text output that
goes without notes, as long as the fixit "belongs" to the warning).
Add support for fixits in the plist output mode.
Implement a fixit for the path-insensitive DeadStores checker. Only dead
initialization warning is currently covered.
Implement a fixit for the path-sensitive VirtualCall checker when the virtual
method is not pure virtual (in this case the "fix" is to suppress the warning
by qualifying the call).
Both fixits are under an off-by-default flag for now, because they
require more careful testing.
Differential Revision: https://reviews.llvm.org/D65182
llvm-svn: 371257
Most functions that our checkers react upon are not C-style variadic functions,
and therefore they have as many actual arguments as they have formal parameters.
However, it's not impossible to define a variadic function with the same name.
This will crash any checker that relies on CallDescription to check the number
of arguments but silently assumes that the number of parameters is the same.
Change CallDescription to check both the number of arguments and the number of
parameters by default.
If we're intentionally trying to match variadic functions, allow specifying
arguments and parameters separately (possibly omitting any of them).
For now we only have one CallDescription which would make use of those,
namely __builtin_va_start itself.
Differential Revision: https://reviews.llvm.org/D67019
llvm-svn: 371256
There are some functions which can't be given a null pointer as parameter either
because it has a nonnull attribute or it is declared to have undefined behavior
(e.g. strcmp()). Sometimes it is hard to determine from the checker message
which parameter is null at the invocation, so now this information is included
in the message.
This commit fixes https://bugs.llvm.org/show_bug.cgi?id=39358
Reviewed By: NoQ, Szelethus, whisperity
Patch by Tibor Brunner!
Differential Revision: https://reviews.llvm.org/D66333
llvm-svn: 370798
Enables the users to specify an optional flag which would warn for more dead
stores.
Previously it ignored if the dead store happened e.g. in an if condition.
if ((X = generate())) { // dead store to X
}
This patch introduces the `WarnForDeadNestedAssignments` option to the checker,
which is `false` by default - so this change would not affect any previous
users.
I have updated the code, tests and the docs as well. If I missed something, tell
me.
I also ran the analysis on Clang which generated 14 more reports compared to the
unmodified version. All of them seemed reasonable for me.
Related previous patches:
rGf224820b45c6847b91071da8d7ade59f373b96f3
Reviewers: NoQ, krememek, Szelethus, baloghadamsoftware
Reviewed By: Szelethus
Patch by Balázs Benics!
Differential Revision: https://reviews.llvm.org/D66733
llvm-svn: 370767
Range errors (dereferencing or incrementing the past-the-end iterator or
decrementing the iterator of the first element of the range) and access of
invalidated iterators lead to undefined behavior. There is no point to
continue the analysis after such an error on the same execution path, but
terminate it by a sink node (fatal error). This also improves the
performance and helps avoiding double reports (e.g. in case of nested
iterators).
Differential Revision: https://reviews.llvm.org/D62893
llvm-svn: 370314
Write tests for the actual crash that was found. Write comments and refactor
code around 17 style bugs and suppress 3 false positives.
Differential Revision: https://reviews.llvm.org/D66847
llvm-svn: 370246
It was known to be a compile-time constant so it wasn't evaluated during
symbolic execution, but it wasn't evaluated as a compile-time constant either.
Differential Revision: https://reviews.llvm.org/D66565
llvm-svn: 370245
If the global variable has an initializer, we'll ignore it because we're usually
not analyzing the program from the beginning, which means that the global
variable may have changed before we start our analysis.
However when we're analyzing main() as the top-level function, we can rely
on global initializers to still be valid. At least in C; in C++ we have global
constructors that can still break this logic.
This patch allows the Static Analyzer to load constant initializers from
global variables if the top-level function of the current analysis is main().
Differential Revision: https://reviews.llvm.org/D65361
llvm-svn: 370244
According to the SARIF specification, "a text region does not include the character specified by endColumn".
Differential Revision: https://reviews.llvm.org/D65206
llvm-svn: 370060
Summary: EnumCastOutOfRangeChecker should not perform enum range checks on LValueToRValue casts, since this type of cast does not actually change the underlying type. Performing the unnecessary check actually triggered an assertion failure deeper in EnumCastOutOfRange for certain input (which is captured in the accompanying test code).
Reviewers: #clang, Szelethus, gamesh411, NoQ
Reviewed By: Szelethus, gamesh411, NoQ
Subscribers: NoQ, gamesh411, xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, Charusso, bjope, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66014
llvm-svn: 369760
Our SVal hierarchy doesn't allow modeling pointer casts as no-op. The
pointer type is instead encoded into the pointer object. Defer to our
usual pointer casting facility, SValBuilder::evalBinOp().
Fixes a crash.
llvm-svn: 369729
As discussed on the mailing list, notes originating from the tracking of foreach
loop conditions are always meaningless.
Differential Revision: https://reviews.llvm.org/D66131
llvm-svn: 369613
Summary:
This patch introduces `DynamicCastInfo` similar to `DynamicTypeInfo` which
is stored in `CastSets` which are storing the dynamic cast informations of
objects based on memory regions. It could be used to store and check the
casts and prevent infeasible paths.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D66325
llvm-svn: 369605
In D65724, I do a pretty thorough explanation about how I'm solving this
problem, I think that summary nails whats happening here ;)
Differential Revision: https://reviews.llvm.org/D65725
llvm-svn: 369596
Exactly what it says on the tin! Note that we're talking about interestingness
in general, hence this isn't a control-dependency-tracking specific patch.
Differential Revision: https://reviews.llvm.org/D65724
llvm-svn: 369589
We defined (on the mailing list and here on phabricator) 2 different cases where
retrieving information about a control dependency condition is very important:
* When the condition's last write happened in a different stack frame
* When the collapse point of the condition (when we can constrain it to be
true/false) didn't happen in the actual condition.
It seems like we solved this problem with the help of expression value tracking,
and have started working on better diagnostics notes about this process.
Expression value tracking is nothing more than registering a variety of visitors
to construct reports about it. Each of the registered visitors (ReturnVisitor,
FindLastStoreVisitor, NoStoreFuncVisitor, etc) have something to go by: a
MemRegion, an SVal, an ExplodedNode, etc. For this reason, better explaining a
last write is super simple, we can always just pass on some more information to
the visitor in question (as seen in D65575).
ConditionBRVisitor is a different beast, as it was built for a different
purpose. It is responsible for constructing events at, well, conditions, and is
registered only once, and isn't a part of the "expression value tracking
family". Unfortunately, it is also the visitor to tinker with for constructing
better diagnostics about the collapse point problem.
This creates a need for alternative way to communicate with ConditionBRVisitor
that a specific condition is being tracked for for the reason of being a control
dependency. Since at almost all PathDiagnosticEventPiece construction the
visitor checks interestingness, it makes sense to pair interestingness with a
reason as to why we marked an entity as such.
Differential Revision: https://reviews.llvm.org/D65723
llvm-svn: 369583
Can't add much more to the title! This is part 1, the case where the collapse
point isn't in the condition point is the responsibility of ConditionBRVisitor,
which I'm addressing in part 2.
Differential Revision: https://reviews.llvm.org/D65575
llvm-svn: 369574
Add defensive check that prevents a crash when we try to evaluate a destructor
whose this-value is a concrete integer that isn't a null.
Differential Revision: https://reviews.llvm.org/D65349
llvm-svn: 369450
Calling a pure virtual method during construction or destruction
is undefined behavior. It's worth it to warn about it by default.
That part is now known as the cplusplus.PureVirtualCall checker.
Calling a normal virtual method during construction or destruction
may be fine, but does behave unexpectedly, as it skips virtual dispatch.
Do not warn about this by default, but let projects opt in into it
by enabling the optin.cplusplus.VirtualCall checker manually.
Give the two parts differentiated warning text:
Before:
Call to virtual function during construction or destruction:
Call to pure virtual function during construction
Call to virtual function during construction or destruction:
Call to virtual function during destruction
After:
Pure virtual method call:
Call to pure virtual method 'X::foo' during construction
has undefined behavior
Unexpected loss of virtual dispatch:
Call to virtual method 'Y::bar' during construction
bypasses virtual dispatch
Also fix checker names in consumers that support them (eg., clang-tidy)
because we now have different checker names for pure virtual calls and
regular virtual calls.
Also fix capitalization in the bug category.
Differential Revision: https://reviews.llvm.org/D64274
llvm-svn: 369449
Summary:
This patch introduces a new `analyzer-config` configuration:
`-analyzer-config silence-checkers`
which could be used to silence the given checkers.
It accepts a semicolon separated list, packed into quotation marks, e.g:
`-analyzer-config silence-checkers="core.DivideZero;core.NullDereference"`
It could be used to "disable" core checkers, so they model the analysis as
before, just if some of them are too noisy it prevents to emit reports.
This patch also adds support for that new option to the scan-build.
Passing the option `-disable-checker core.DivideZero` to the scan-build
will be transferred to `-analyzer-config silence-checkers=core.DivideZero`.
Reviewed By: NoQ, Szelethus
Differential Revision: https://reviews.llvm.org/D66042
llvm-svn: 369078
This is more of a temporary fix, long term, we should convert AnalyzerOptions.def
into the universally beloved (*coughs*) TableGen format, where they can more
easily be separated into developer-only, alpha, and user-facing configs.
Differential Revision: https://reviews.llvm.org/D66261
llvm-svn: 368980
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
Differential revision: https://reviews.llvm.org/D66259
llvm-svn: 368942
Well, what is says on the tin I guess!
Some more changes:
* Move isInevitablySinking() from BugReporter.cpp to CFGBlock's interface
* Rename and move findBlockForNode() from BugReporter.cpp to
ExplodedNode::getCFGBlock()
Differential Revision: https://reviews.llvm.org/D65287
llvm-svn: 368836
Exactly what it says on the tin! The comments in the code detail this a
little more too.
Differential Revision: https://reviews.llvm.org/D64272
llvm-svn: 368817
Summary:
Explicitly deleting the copy constructor makes compiling the function
`ento::registerGenericTaintChecker` difficult with some compilers. When we
construct an `llvm::Optional<TaintConfig>`, the optional is constructed with a
const TaintConfig reference which it then uses to invoke the deleted TaintConfig
copy constructor.
I've observered this failing with clang 3.8 on Ubuntu 16.04.
Reviewers: compnerd, Szelethus, boga95, NoQ, alexshap
Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, Charusso, llvm-commits, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66192
llvm-svn: 368779
When we're tracking a variable that is responsible for a null pointer
dereference or some other sinister programming error, we of course would like to
gather as much information why we think that the variable has that specific
value as possible. However, the newly introduced condition tracking shows that
tracking all values this thoroughly could easily cause an intolerable growth in
the bug report's length.
There are a variety of heuristics we discussed on the mailing list[1] to combat
this, all of them requiring to differentiate in between tracking a "regular
value" and a "condition".
This patch introduces the new `bugreporter::TrackingKind` enum, adds it to
several visitors as a non-optional argument, and moves some functions around to
make the code a little more coherent.
[1] http://lists.llvm.org/pipermail/cfe-dev/2019-June/062613.html
Differential Revision: https://reviews.llvm.org/D64270
llvm-svn: 368777
Summary:
The following code snippet taken from D64271#1572188 has an issue: namely,
because `flag`'s value isn't undef or a concrete int, it isn't being tracked.
int flag;
bool coin();
void foo() {
flag = coin();
}
void test() {
int *x = 0;
int local_flag;
flag = 1;
foo();
local_flag = flag;
if (local_flag)
x = new int;
foo();
local_flag = flag;
if (local_flag)
*x = 5;
}
This, in my opinion, makes no sense, other values may be interesting too.
Originally added by rC185608.
Differential Revision: https://reviews.llvm.org/D64287
llvm-svn: 368773
During the evaluation of D62883, I noticed a bunch of totally
meaningless notes with the pattern of "Calling 'A'" -> "Returning value"
-> "Returning from 'A'", which added no value to the report at all.
This patch (not only affecting tracked conditions mind you) prunes
diagnostic messages to functions that return a value not constrained to
be 0, and are also linear.
Differential Revision: https://reviews.llvm.org/D64232
llvm-svn: 368771
I feel this is kinda important, because in a followup patch I'm adding different
kinds of interestingness, and propagating the correct kind in BugReporter.cpp is
just one less thing to worry about.
Differential Revision: https://reviews.llvm.org/D65578
llvm-svn: 368755
Apparently this does literally nothing.
When you think about this, it makes sense. If something is really important,
we're tracking it anyways, and that system is sophisticated enough to mark
actually interesting statements as such. I wouldn't say that it's even likely
that subexpressions are also interesting (array[10 - x + x]), so I guess even
if this produced any effects, its probably undesirable.
Differential Revision: https://reviews.llvm.org/D65487
llvm-svn: 368752
In D65379, I briefly described the construction of bug paths from an
ExplodedGraph. This patch is about refactoring the code processing the bug path
into a bug report.
A part of finding a valid bug report was running all visitors on the bug path,
so we already have a (possibly empty) set of diagnostics for each ExplodedNode
in it.
Then, for each diagnostic consumer, we construct non-visitor diagnostic pieces.
* We first construct the final diagnostic piece (the warning), then
* We start ascending the bug path from the error node's predecessor (since the
error node itself was used to construct the warning event). For each node
* We check the location (whether its a CallEnter, CallExit) etc. We simultaneously
keep track of where we are with the execution by pushing CallStack when we see a
CallExit (keep in mind that everything is happening in reverse!), popping it
when we find a CallEnter, compacting them into a single PathDiagnosticCallEvent.
void f() {
bar();
}
void g() {
f();
error(); // warning
}
=== The bug path ===
(root) -> f's CallEnter -> bar() -> f's CallExit -> (error node)
=== Constructed report ===
f's CallEnter -> bar() -> f's CallExit
^ /
\ V
(root) ---> f's CallEvent --> (error node)
* We also keep track of different PathPieces different location contexts
* (CallEvent::path in the above example has f's LocationContext, while the
CallEvent itself is in g's context) in a LocationContextMap object. Construct
whatever piece, if any, is needed for the note.
* If we need to generate edges (or arrows) do so. Make sure to also connect
these pieces with the ones that visitors emitted.
* Clean up the constructed PathDiagnostic by making arrows nicer, pruning
function calls, etc.
So I complained about mile long function invocations with seemingly the same
parameters being passed around. This problem, as I see it, a natural candidate
for creating classes and tying them all together.
I tried very hard to make the implementation feel natural, like, rolling off the
tongue. I introduced 2 new classes: PathDiagnosticBuilder (I mean, I kept the
name but changed almost everything in it) contains every contextual information
(owns the bug path, the diagnostics constructed but the visitors, the BugReport
itself, etc) needed for constructing a PathDiagnostic object, and is pretty much
completely immutable. BugReportContruct is the object containing every
non-contextual information (the PathDiagnostic object we're constructing, the
current location in the bug path, the location context map and the call stack I
meantioned earlier), and is passed around all over the place as a single entity
instead of who knows how many parameters.
I tried to used constness, asserts, limiting visibility of fields to my
advantage to clean up the code big time and dramatically improve safety. Also,
whenever I found the code difficult to understand, I added comments and/or
examples.
Here's a complete list of changes and my design philosophy behind it:
* Instead of construcing a ReportInfo object (added by D65379) after finding a
valid bug report, simply return an optional PathDiagnosticBuilder object straight
away. Move findValidReport into the class as a static method. I find
GRBugReporter::generatePathDiagnostics a joy to look at now.
* Rename generatePathDiagnosticForConsumer to generate (maybe not needed, but
felt that way in the moment) and moved it to PathDiagnosticBuilder. If we don't
need to generate diagnostics, bail out straight away, like we always should have.
After that, construct a BugReportConstruct object, leaving the rest of the logic
untouched.
* Move all static methods that would use contextual information into
PathDiagnosticBuilder, reduce their parameter count drastically by simply
passing around a BugReportConstruct object.
* Glance at the code I removed: Could you tell what the original
PathDiagnosticBuilder::LC object was for? It took a gooood long while for me to
realize that nothing really. It is always equal with the LocationContext
associated with our current position in the bug path. Remove it completely.
* The original code contains the following expression quite a bit:
LCM[&PD.getActivePath()], so what does it mean? I said that we collect the
contexts associated with different PathPieces, but why would we ever modify that,
shouldn't it be set? Well, theoretically yes, but in the implementation, the
address of PathDiagnostic::getActivePath doesn't change if we move to an outer,
previously unexplored function. Add both descriptive method names and
explanations to BugReportConstruct to help on this.
* Add plenty of asserts, both for safety and as a poor man's documentation.
Differential Revision: https://reviews.llvm.org/D65484
llvm-svn: 368737
When I'm new to a file/codebase, I personally find C++'s strong static type
system to be a great aid. BugReporter.cpp is still painful to read however:
function calls are made with mile long parameter lists, seemingly all of them
taken with a non-const reference/pointer. This patch fixes nothing but this:
make a few things const, and hammer it until it compiles.
Differential Revision: https://reviews.llvm.org/D65382
llvm-svn: 368735
find clang/ -type f -exec sed -i 's/std::shared_ptr<PathDiagnosticPiece>/PathDiagnosticPieceRef/g' {} \;
git diff -U3 --no-color HEAD^ | clang-format-diff-6.0 -p1 -i
Just as C++ is meant to be refactored, right?
Differential Revision: https://reviews.llvm.org/D65381
llvm-svn: 368717
This patch refactors the utility functions and classes around the construction
of a bug path.
At a very high level, this consists of 3 steps:
* For all BugReports in the same BugReportEquivClass, collect all their error
nodes in a set. With that set, create a new, trimmed ExplodedGraph whose leafs
are all error nodes.
* Until a valid report is found, construct a bug path, which is yet another
ExplodedGraph, that is linear from a given error node to the root of the graph.
* Run all visitors on the constructed bug path. If in this process the report
got invalidated, start over from step 2.
Now, to the changes within this patch:
* Do not allow the invalidation of BugReports up to the point where the trimmed
graph is constructed. Checkers shouldn't add bug reports that are known to be
invalid, and should use visitors and argue about the entirety of the bug path if
needed.
* Do not calculate indices. I may be biased, but I personally find code like
this horrible. I'd like to point you to one of the comments in the original code:
SmallVector<const ExplodedNode *, 32> errorNodes;
for (const auto I : bugReports) {
if (I->isValid()) {
HasValid = true;
errorNodes.push_back(I->getErrorNode());
} else {
// Keep the errorNodes list in sync with the bugReports list.
errorNodes.push_back(nullptr);
}
}
Not on my watch. Instead, use a far easier to follow trick: store a pointer to
the BugReport in question, not an index to it.
* Add range iterators to ExplodedGraph's successors and predecessors, and a
visitor range to BugReporter.
* Rename TrimmedGraph to BugPathGetter. Because that is what it has always been:
no sane graph type should store an iterator-like state, or have an interface not
exposing a single graph-like functionalities.
* Rename ReportGraph to BugPathInfo, because it is only a linear path with some
other context.
* Instead of having both and out and in parameter (which I think isn't ever
excusable unless we use the out-param for caching), return a record object with
descriptive getter methods.
* Where descriptive names weren't sufficient, compliment the code with comments.
Differential Revision: https://reviews.llvm.org/D65379
llvm-svn: 368694
The goal of this refactoring effort was to better understand how interestingness
was propagated in BugReporter.cpp, which eventually turned out to be a dead end,
but with such a twist, I wouldn't even want to spoil it ahead of time. However,
I did get to learn a lot about how things are working in there.
In these series of patches, as well as cleaning up the code big time, I invite
you to study how BugReporter.cpp operates, and discuss how we could design this
file to reduce the horrible mess that it is.
This patch reverts a great part of rC162028, which holds the title "Allow
multiple PathDiagnosticConsumers to be used with a BugReporter at the same
time.". This, however doesn't imply that there's any need for multiple "layers"
or stacks of interesting symbols and regions, quite the contrary, I would argue
that we would like to generate the same amount of information for all output
types, and only process them differently.
Differential Revision: https://reviews.llvm.org/D65378
llvm-svn: 368689
Summary:
A condition could be a multi-line expression where we create the highlight
in separated chunks. PathDiagnosticPopUpPiece is not made for that purpose,
it cannot be added to multiple lines because we have only one ending part
which contains all the notes. So that it cannot have multiple endings and
therefore this patch narrows down the ranges of the highlight to the given
interesting variable of the condition. It prevents HTML-breaking injections.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D65663
llvm-svn: 368382
This patch is a prerequisite for using LangStandard from Driver in
https://reviews.llvm.org/D64793.
It moves LangStandard* and InputKind::Language to Basic. It is mostly
mechanical, with only a few changes of note:
- enum Language has been changed into enum class Language : uint8_t to
avoid a clash between OpenCL in enum Language and OpenCL in enum
LangFeatures and not to increase the size of class InputKind.
- Now that getLangStandardForName, which is currently unused, also checks
both canonical and alias names, I've introduced a helper getLangKind
which factors out a code pattern already used 3 times.
The patch has been tested on x86_64-pc-solaris2.11, sparcv9-sun-solaris2.11,
and x86_64-pc-linux-gnu.
There's a companion patch for lldb which uses LangStandard.h
(https://reviews.llvm.org/D65717).
While polly includes isl which in turn uses InputKind::C, that part of the
code isn't even built inside the llvm tree. I've posted a patch to allow
for both InputKind::C and Language::C upstream
(https://groups.google.com/forum/#!topic/isl-development/6oEvNWOSQFE).
Differential Revision: https://reviews.llvm.org/D65562
llvm-svn: 367864
Summary:
It allows discriminating between stack frames of the same call that is
called multiple times in a loop.
Thanks to Artem Dergachev for the great idea!
Reviewed By: NoQ
Tags: #clang
Differential Revision: https://reviews.llvm.org/D65587
llvm-svn: 367608
While we implemented taint propagation rules for several
builtin/standard functions, there's a natural desire for users to add
such rules to custom functions.
A series of patches will implement an option that allows users to
annotate their functions with taint propagation rules through a YAML
file. This one adds parsing of the configuration file, which may be
specified in the commands line with the analyzer config:
alpha.security.taint.TaintPropagation:Config. The configuration may
contain propagation rules, filter functions (remove taint) and sink
functions (give a warning if it gets a tainted value).
I also added a new header for future checkers to conveniently read YAML
files as checker options.
Differential Revision: https://reviews.llvm.org/D59555
llvm-svn: 367190
Summary:
When cross TU analysis is used it is possible that a macro expansion
is generated for a macro that is defined (and used) in other than
the main translation unit. To get the expansion for it the source
location in the original source file and original preprocessor
is needed.
Reviewers: martong, xazax.hun, Szelethus, ilya-biryukov
Reviewed By: Szelethus
Subscribers: mgorny, NoQ, ilya-biryukov, rnkovacs, dkrupp, Szelethus, gamesh411, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64638
llvm-svn: 367006
Summary:
Integer Set Library using retain-count based allocation which is not
modeled in MallocChecker.
Reviewed By: NoQ
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64680
llvm-svn: 366391
Summary:
It models the LLVM casts:
- `cast<>`
- `dyn_cast<>`
- `cast_or_null<>`
- `dyn_cast_or_null<>`
It has a very basic support without checking the `classof()` function.
(It reapplies the reverted 'llvm-svn: 365582' patch with proper test file.)
Reviewed By: NoQ
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64374
llvm-svn: 365585
Summary:
It models the LLVM casts:
- `cast<>`
- `dyn_cast<>`
- `cast_or_null<>`
- `dyn_cast_or_null<>`
It has a very basic support without checking the `classof()` function.
Reviewed By: NoQ
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64374
llvm-svn: 365582
This patch is a major part of my GSoC project, aimed to improve the bug
reports of the analyzer.
TL;DR: Help the analyzer understand that some conditions are important,
and should be explained better. If an CFGBlock is a control dependency
of a block where an expression value is tracked, explain the condition
expression better by tracking it.
if (A) // let's explain why we believe A to be true
10 / x; // division by zero
This is an experimental feature, and can be enabled by the
off-by-default analyzer configuration "track-conditions".
In detail:
This idea was inspired by the program slicing algorithm. Essentially,
two things are used to produce a program slice (a subset of the program
relevant to a (statement, variable) pair): data and control
dependencies. The bug path (the linear path in the ExplodedGraph that leads
from the beginning of the analysis to the error node) enables to
analyzer to argue about data dependencies with relative ease.
Control dependencies are a different slice of the cake entirely.
Just because we reached a branch during symbolic execution, it
doesn't mean that that particular branch has any effect on whether the
bug would've occured. This means that we can't simply rely on the bug
path to gather control dependencies.
In previous patches, LLVM's IDFCalculator, which works on a control flow
graph rather than the ExplodedGraph was generalized to solve this issue.
We use this information to heuristically guess that the value of a tracked
expression depends greatly on it's control dependencies, and start
tracking them as well.
After plenty of evaluations this was seen as great idea, but still
lacking refinements (we should have different descriptions about a
conditions value), hence it's off-by-default.
Differential Revision: https://reviews.llvm.org/D62883
llvm-svn: 365207
I intend to improve the analyzer's bug reports by tracking condition
expressions.
01 bool b = messyComputation();
02 int i = 0;
03 if (b) // control dependency of the bug site, let's explain why we assume val
04 // to be true
05 10 / i; // warn: division by zero
I'll detail this heuristic in the followup patch, strictly related to this one
however:
* Create the new ControlDependencyCalculator class that uses llvm::IDFCalculator
to (lazily) calculate control dependencies for Clang's CFG.
* A new debug checker debug.DumpControlDependencies is added for lit tests
* Add unittests
Differential Revision: https://reviews.llvm.org/D62619
llvm-svn: 365197
Transform clang::DominatorTree to be able to also calculate post dominators.
* Tidy up the documentation
* Make it clang::DominatorTree template class (similarly to how
llvm::DominatorTreeBase works), rename it to clang::CFGDominatorTreeImpl
* Clang's dominator tree is now called clang::CFGDomTree
* Clang's brand new post dominator tree is called clang::CFGPostDomTree
* Add a lot of asserts to the dump() function
* Create a new checker to test the functionality
Differential Revision: https://reviews.llvm.org/D62551
llvm-svn: 365028
Add a label to nodes that have a bug report attached or on which
the analysis was generally interrupted.
Fix printing has_report and implement printing is_sink in the graph dumper.
Differential Revision: https://reviews.llvm.org/D64110
llvm-svn: 364992
This commit adds a new builtin, __builtin_bit_cast(T, v), which performs a
bit_cast from a value v to a type T. This expression can be evaluated at
compile time under specific circumstances.
The compile time evaluation currently doesn't support bit-fields, but I'm
planning on fixing this in a follow up (some of the logic for figuring this out
is in CodeGen). I'm also planning follow-ups for supporting some more esoteric
types that the constexpr evaluator supports, as well as extending
__builtin_memcpy constexpr evaluation to use the same infrastructure.
rdar://44987528
Differential revision: https://reviews.llvm.org/D62825
llvm-svn: 364954
Due to RVO the target region of a function that returns an object by
value isn't necessarily a temporary object region; it may be an
arbitrary memory region. In particular, it may be a field of a bigger
object.
Make sure we don't invalidate the bigger object when said function is
evaluated conservatively.
Differential Revision: https://reviews.llvm.org/D63968
llvm-svn: 364870
The NonnullGlobalConstants checker models the rule "it doesn't make sense
to make a constant global pointer and initialize it to null"; it makes sure
that whatever it's initialized with is known to be non-null.
Ironically, annotating the type of the pointer as _Nonnull breaks the checker.
Fix handling of the _Nonnull annotation so that it was instead one more reason
to believe that the value is non-null.
Differential Revision: https://reviews.llvm.org/D63956
llvm-svn: 364869
This patch uses the new CDF_MaybeBuiltin flag to handle C library functions.
It's mostly an NFC/refactoring pass, but it does fix a bug in handling memset()
when it expands to __builtin___memset_chk() because the latter has
one more argument and memset() handling code was trying to match
the exact number of arguments. Now the code is deduplicated and there's
less room for mistakes.
Differential Revision: https://reviews.llvm.org/D62557
llvm-svn: 364868
When matching C standard library functions in the checker, it's easy to forget
that they are often implemented as macros that are expanded to builtins.
Such builtins would have a different name, so matching the callee identifier
would fail, or may sometimes have more arguments than expected, so matching
the exact number of arguments would fail, but this is fine as long as we have
all the arguments that we need in their respective places.
This patch adds a set of flags to the CallDescription class so that to handle
various special matching rules, and adds the first flag into this set,
which enables a more fuzzy matching for functions that
may be implemented as compiler builtins.
Differential Revision: https://reviews.llvm.org/D62556
llvm-svn: 364867
It encapsulates the procedure of figuring out whether a call event
corresponds to a function that's modeled by a checker.
Checker developers no longer need to worry about performance of
lookups into their own custom maps.
Add unittests - which finally test CallDescription itself as well.
Differential Revision: https://reviews.llvm.org/D62441
llvm-svn: 364866
The -analyzer-stats flag now allows you to find out how much time was spent
on AST-based analysis and on path-sensitive analysis and, separately,
on bug visitors, as they're occasionally a performance problem on their own.
The total timer wasn't useful because there's anyway a total time printed out.
Remove it.
Differential Revision: https://reviews.llvm.org/D63227
llvm-svn: 364266
Summary:
After evaluation it would be an Unknown value and tracking would be lost.
Reviewers: NoQ, xazax.hun, ravikandhadai, baloghadamsoftware, Szelethus
Reviewed By: NoQ
Subscribers: szepet, rnkovacs, a.sidorin, mikhail.ramalho, donat.nagy,
dkrupp, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63720
llvm-svn: 364259
Summary:
- Now we could see the `has_report` property in `trim-egraph` mode.
- This patch also removes the trailing comma after each node.
Reviewers: NoQ
Reviewed By: NoQ
Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin,
mikhail.ramalho, Szelethus, donat.nagy, dkrupp, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63436
llvm-svn: 364193
Quotes around StringRegions are now escaped and unescaped correctly,
producing valid JSON.
Additionally, add a forgotten escape for Store values.
Differential Revision: https://reviews.llvm.org/D63519
llvm-svn: 363897
Include a unique pointer so that it was possible to figure out if it's
the same cluster in different program states. This allows comparing
dumps of different states against each other.
Differential Revision: https://reviews.llvm.org/D63362
llvm-svn: 363896
Location context ID is a property of the location context, not of an item
within it. It's useful to know the id even when there are no items
in the context, eg. for the purposes of figuring out how did contents
of the Environment for the same location context changed across states.
Differential Revision: https://reviews.llvm.org/D62754
llvm-svn: 363895
This changes the checker callback signature to use the modern, easy to
use interface. Additionally, this unblocks future work on allowing
checkers to implement evalCall() for calls that don't correspond to any
call-expression or require additional information that's only available
as part of the CallEvent, such as C++ constructors and destructors.
Differential Revision: https://reviews.llvm.org/D62440
llvm-svn: 363893
IIG is a replacement for MIG in DriverKit: IIG is autogenerating C++ code.
Suppress dead store warnings on such code, as the tool seems to be producing
them regularly, and the users of IIG are not in position to address these
warnings, as they don't control the autogenerated code. IIG-generated code
is identified by looking at the comments at the top of the file.
Differential Revision: https://reviews.llvm.org/D63118
llvm-svn: 363892
Summary:
This patch applies a change similar to rC363069, but for SARIF files.
The `%diff_sarif` lit substitution invokes `diff` with a non-portable
`-I` option. The intended effect can be achieved by normalizing the
inputs to `diff` beforehand. Such normalization can be done with
`grep -Ev`, which is also used by other tests.
Additionally, this patch updates the SARIF output to have a newline at
the end of the file. This makes it so that the SARIF file qualifies as a
POSIX text file, which increases the consumability of the generated file
in relation to various tools.
Reviewers: NoQ, sfertile, xingxue, jasonliu, daltenty, aaron.ballman
Reviewed By: aaron.ballman
Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, Charusso, jsji, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62952
llvm-svn: 363822
Often times, when an ArraySubscriptExpr was reported as null or
undefined, the bug report was difficult to understand, because the
analyzer explained why arr[i] has that value, but didn't realize that in
fact i's value is very important as well. This patch fixes this by
tracking the indices of arrays.
Differential Revision: https://reviews.llvm.org/D63080
llvm-svn: 363510
Summary:
When we traversed backwards on ExplodedNodes to see where processed the
given statement we `break` too early. With the current approach we do not
miss the CallExitEnd ProgramPoint which stands for an inlined call.
Reviewers: NoQ, xazax.hun, ravikandhadai, baloghadamsoftware, Szelethus
Reviewed By: NoQ
Subscribers: szepet, rnkovacs, a.sidorin, mikhail.ramalho, donat.nagy,
dkrupp, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62926
llvm-svn: 363491
nullptr_t does not access memory.
We now reuse CK_NullToPointer to represent a conversion from a glvalue
of type nullptr_t to a prvalue of nullptr_t where necessary.
This reinstates r363337, reverted in r363352.
llvm-svn: 363429
Revert 363340 "Remove unused SK_LValueToRValue initialization step."
Revert 363337 "PR23833, DR2140: an lvalue-to-rvalue conversion on a glvalue of type"
Revert 363295 "C++ DR712 and others: handle non-odr-use resulting from an lvalue-to-rvalue conversion applied to a member access or similar not-quite-trivial lvalue expression."
llvm-svn: 363352
nullptr_t does not access memory.
We now reuse CK_NullToPointer to represent a conversion from a glvalue
of type nullptr_t to a prvalue of nullptr_t where necessary.
This reinstates r345562, reverted in r346065, now that CodeGen's
handling of non-odr-used variables has been fixed.
llvm-svn: 363337
Summary:
As suggested in the review of D62949, this patch updates the plist
output to have a newline at the end of the file. This makes it so that
the plist output file qualifies as a POSIX text file, which increases
the consumability of the generated plist file in relation to various
tools.
Reviewers: NoQ, sfertile, xingxue, jasonliu, daltenty
Reviewed By: NoQ, xingxue
Subscribers: jsji, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63041
llvm-svn: 362992
Summary:
We're using the clang static analyzer together with a number of
custom analyses in our CI system to ensure that certain invariants
are statiesfied for by the code every commit. Unfortunately, there
currently doesn't seem to be a good way to determine whether any
analyzer warnings were emitted, other than parsing clang's output
(or using scan-build, which then in turn parses clang's output).
As a simpler mechanism, simply add a `-analyzer-werror` flag to CC1
that causes the analyzer to emit its warnings as errors instead.
I briefly tried to have this be `Werror=analyzer` and make it go
through that machinery instead, but that seemed more trouble than
it was worth in terms of conflicting with options to the actual build
and special cases that would be required to circumvent the analyzers
usual attempts to quiet non-analyzer warnings. This is simple and it
works well.
Reviewed-By: NoQ, Szelethusw
Differential Revision: https://reviews.llvm.org/D62885
llvm-svn: 362855
Summary:
This new piece is similar to our macro expansion printing in HTML reports:
On mouse-hover event it pops up on variables. Similar to note pieces it
supports `plist` diagnostics as well.
It is optional, on by default: `add-pop-up-notes=true`.
Extra: In HTML reports `background-color: LemonChiffon` was too light,
changed to `PaleGoldenRod`.
Reviewers: NoQ, alexfh
Reviewed By: NoQ
Subscribers: cfe-commits, gerazo, gsd, george.karpenkov, alexfh, xazax.hun,
baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho,
Szelethus, donat.nagy, dkrupp
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60670
llvm-svn: 362014
The `cplusplus.SelfAssignment` checker has a visitor that is added
to every `BugReport` to mark the to branch of the self assignment
operator with e.g. `rhs == *this` and `rhs != *this`. With the new
`NoteTag` feature this visitor is not needed anymore. Instead the
checker itself marks the two branches using the `NoteTag`s.
Differential Revision: https://reviews.llvm.org/D62479
llvm-svn: 361818
When initialization of virtual base classes is skipped, we now tell the user
about it, because this aspect of C++ isn't very well-known.
The implementation is based on the new "note tags" feature (r358781).
In order to make use of it, allow note tags to produce prunable notes,
and move the note tag factory to CoreEngine.
Differential Revision: https://reviews.llvm.org/D61817
llvm-svn: 361682
This patch adds the run-time CFG branch that would skip initialization of
virtual base classes depending on whether the constructor is called from a
superclass constructor or not. Previously the Static Analyzer was already
skipping virtual base-class initializers in such constructors, but it wasn't
skipping their arguments and their potential side effects, which was causing
pr41300 (and was generally incorrect). The previous skipping behavior is
now replaced with a hard assertion that we're not even getting there due
to how our CFG works.
The new CFG element is under a CFG build option so that not to break other
consumers of the CFG by this change. Static Analyzer support for this change
is implemented.
Differential Revision: https://reviews.llvm.org/D61816
llvm-svn: 361681
Turn it into a variant class instead. This conversion does indeed save some code
but there's a plan to add support for more kinds of terminators that aren't
necessarily based on statements, and with those in mind it becomes more and more
confusing to have CFGTerminators implicitly convertible to a Stmt *.
Differential Revision: https://reviews.llvm.org/D61814
llvm-svn: 361586
Same patch as D62093, but for checker/plugin options, the only
difference being that options for alpha checkers are implicitly marked
as alpha.
Differential Revision: https://reviews.llvm.org/D62093
llvm-svn: 361566
These options are now only visible under
-analyzer-checker-option-help-developer.
Differential Revision: https://reviews.llvm.org/D61839
llvm-svn: 361561
Previously, the only way to display the list of available checkers was
to invoke the analyzer with -analyzer-checker-help frontend flag. This
however wasn't really great from a maintainer standpoint: users came
across checkers meant strictly for development purposes that weren't to
be tinkered with, or those that were still in development. This patch
creates a clearer division in between these categories.
From now on, we'll have 3 flags to display the list checkers. These
lists are mutually exclusive and can be used in any combination (for
example to display both stable and alpha checkers).
-analyzer-checker-help: Displays the list for stable, production ready
checkers.
-analyzer-checker-help-alpha: Displays the list for in development
checkers. Enabling is discouraged
for non-development purposes.
-analyzer-checker-help-developer: Modeling and debug checkers. Modeling
checkers shouldn't be enabled/disabled
by hand, and debug checkers shouldn't
be touched by users.
Differential Revision: https://reviews.llvm.org/D62093
llvm-svn: 361558
Add the new frontend flag -analyzer-checker-option-help to display all
checker/package options.
Differential Revision: https://reviews.llvm.org/D57858
llvm-svn: 361552
This patch refactors begin and end symbol creation by moving symbol
conjuration into the `create...` functions. This way the functions'
responsibilities are clearer and makes possible to add more functions
handling these symbols (e.g. functions for handling the container's
size) without code multiplication.
Differential Revision: https://reviews.llvm.org/D61136
llvm-svn: 361141
Since D57922, the config table contains every checker option, and it's default
value, so having it as an argument for getChecker*Option is redundant.
By the time any of the getChecker*Option function is called, we verified the
value in CheckerRegistry (after D57860), so we can confidently assert here, as
any irregularities detected at this point must be a programmer error. However,
in compatibility mode, verification won't happen, so the default value must be
restored.
This implies something else, other than adding removing one more potential point
of failure -- debug.ConfigDumper will always contain valid values for
checker/package options!
Differential Revision: https://reviews.llvm.org/D59195
llvm-svn: 361042
Validate whether the option exists, and also whether the supplied value is of
the correct type. With this patch, invoking the analyzer should be, at least
in the frontend mode, a lot safer.
Differential Revision: https://reviews.llvm.org/D57860
llvm-svn: 361011
The more entries we have in AnalyzerOptions::ConfigTable, the more helpful
debug.ConfigDumper is. With this patch, I'm pretty confident that it'll now emit
the entire state of the analyzer, minus the frontend flags.
It would be nice to reserve the config table specifically to checker options
only, as storing the regular analyzer configs is kinda redundant.
Differential Revision: https://reviews.llvm.org/D57922
llvm-svn: 361006
Summary:
This patch implements the source location builtins `__builtin_LINE(), `__builtin_FUNCTION()`, `__builtin_FILE()` and `__builtin_COLUMN()`. These builtins are needed to implement [`std::experimental::source_location`](https://rawgit.com/cplusplus/fundamentals-ts/v2/main.html#reflection.src_loc.creation).
With the exception of `__builtin_COLUMN`, GCC also implements these builtins, and Clangs behavior is intended to match as closely as possible.
Reviewers: rsmith, joerg, aaron.ballman, bogner, majnemer, shafik, martong
Reviewed By: rsmith
Subscribers: rnkovacs, loskutov, riccibruno, mgorny, kunitoki, alexr, majnemer, hfinkel, cfe-commits
Differential Revision: https://reviews.llvm.org/D37035
llvm-svn: 360937
The checker was crashing when it was trying to assume a structure
to be null or non-null so that to evaluate the effect of the annotation.
Differential Revision: https://reviews.llvm.org/D61958
llvm-svn: 360790
Suppress MIG checker false positives that occur when the programmer increments
the reference count before calling a MIG destructor, and the MIG destructor
literally boils down to decrementing the reference count.
Differential Revision: https://reviews.llvm.org/D61925
llvm-svn: 360737
When looking for the location context of the call site, unwrap block invocation
contexts because they are attached to the current AnalysisDeclContext
while what we need is the previous AnalysisDeclContext.
Differential Revision: https://reviews.llvm.org/D61545
llvm-svn: 360202
new expression.
This was voted into C++20 as a defect report resolution, so we
retroactively apply it to all prior language modes (though it can never
actually be used before C++11 mode).
llvm-svn: 360006
https://bugs.llvm.org/show_bug.cgi?id=41741
Pretty much the same as D61246 and D61106, this time for __complex__ types. Upon
further investigation, I realized that we should regard all types
Type::isScalarType returns true for as primitive, so I merged
isMemberPointerType(), isBlockPointerType() and isAnyComplexType()` into that
instead.
I also stumbled across yet another bug,
https://bugs.llvm.org/show_bug.cgi?id=41753, but it seems to be unrelated to
this checker.
Differential Revision: https://reviews.llvm.org/D61569
llvm-svn: 359998
During my work on analyzer dependencies, I created a great amount of new
checkers that emitted no diagnostics at all, and were purely modeling some
function or another.
However, the user shouldn't really disable/enable these by hand, hence this
patch, which hides these by default. I intentionally chose not to hide alpha
checkers, because they have a scary enough name, in my opinion, to cause no
surprise when they emit false positives or cause crashes.
The patch introduces the Hidden bit into the TableGen files (you may remember
it before I removed it in D53995), and checkers that are either marked as
hidden, or are in a package that is marked hidden won't be displayed under
-analyzer-checker-help. -analyzer-checker-help-hidden, a new flag meant for
developers only, displays the full list.
Differential Revision: https://reviews.llvm.org/D60925
llvm-svn: 359720
https://bugs.llvm.org/show_bug.cgi?id=41611
Similarly to D61106, the checker ran over an llvm_unreachable for vector types:
struct VectorSizeLong {
VectorSizeLong() {}
__attribute__((__vector_size__(16))) long x;
};
void __vector_size__LongTest() {
VectorSizeLong v;
}
Since, according to my short research,
"The vector_size attribute is only applicable to integral and float scalars,
although arrays, pointers, and function return values are allowed in conjunction
with this construct."
[src: https://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Vector-Extensions.html#Vector-Extensions]
vector types are safe to regard as primitive.
Differential Revision: https://reviews.llvm.org/D61246
llvm-svn: 359539
Currently we always inline functions that have no branches, i.e. have exactly
three CFG blocks: ENTRY, some code, EXIT. This makes sense because when there
are no branches, it means that there's no exponential complexity introduced
by inlining such function. Such functions also don't trigger various fundamental
problems with our inlining mechanism, such as the problem of inlined
defensive checks.
Sometimes the CFG may contain more blocks, but in practice it still has
linear structure because all directions (except, at most, one) of all branches
turned out to be unreachable. When this happens, still treat the function
as "small". This is useful, in particular, for dealing with C++17 if constexpr.
Differential Revision: https://reviews.llvm.org/D61051
llvm-svn: 359531
Don't crash when trying to model a call in which the callee is unknown
in compile time, eg. a pointer-to-member call.
Differential Revision: https://reviews.llvm.org/D61285
llvm-svn: 359530
This patch is more of a fix than a real improvement: in checkPostCall()
we should return immediately after finding the right call and handling
it. This both saves unnecessary processing and double-handling calls by
mistake.
Differential Revision: https://reviews.llvm.org/D61134
llvm-svn: 359283
Because RetainCountChecker has custom "local" reasoning about escapes,
it has a separate facility to deal with tracked symbols at end of analysis
and check them for leaks regardless of whether they're dead or not.
This facility iterates over the list of tracked symbols and reports
them as leaks, but it needs to treat the return value specially.
Some custom allocators tend to return the value with an offset, storing
extra metadata at the beginning of the buffer. In this case the return value
would be a non-base region. In order to avoid false positives, we still need to
find the original symbol within the return value, otherwise it'll be unable
to match it to the item in the list of tracked symbols.
Differential Revision: https://reviews.llvm.org/D60991
llvm-svn: 359263
the assertion is in fact incorrect: there is a cornercase in Objective-C++
in which a C++ object is not constructed with a constructor, but merely
zero-initialized. Namely, this happens when an Objective-C message is sent
to a nil and it is supposed to return a C++ object.
Differential Revision: https://reviews.llvm.org/D60988
llvm-svn: 359262
https://bugs.llvm.org/show_bug.cgi?id=41590
For the following code snippet, UninitializedObjectChecker crashed:
struct MyAtomicInt {
_Atomic(int) x;
MyAtomicInt() {}
};
void entry() {
MyAtomicInt b;
}
The problem was that _Atomic types were not regular records, unions,
dereferencable or primitive, making the checker hit the llvm_unreachable at
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp:347.
The solution is to regard these types as primitive as well. The test case shows
that with this addition, not only are we able to get rid of the crash, but we
can identify x as uninitialized.
Differential Revision: https://reviews.llvm.org/D61106
llvm-svn: 359230
If macro "CHECK_X(x)" expands to something like "if (x != NULL) ...",
the "Assuming..." note no longer says "Assuming 'x' is equal to CHECK_X".
Differential Revision: https://reviews.llvm.org/D59121
llvm-svn: 359037
Summary:
The existing CTU mechanism imports `FunctionDecl`s where the definition is available in another TU. This patch extends that to VarDecls, to bind more constants.
- Add VarDecl importing functionality to CrossTranslationUnitContext
- Import Decls while traversing them in AnalysisConsumer
- Add VarDecls to CTU external mappings generator
- Name changes from "external function map" to "external definition map"
Reviewers: NoQ, dcoughlin, xazax.hun, george.karpenkov, martong
Reviewed By: xazax.hun
Subscribers: Charusso, baloghadamsoftware, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, george.karpenkov, mgorny, whisperity, szepet, rnkovacs, a.sidorin, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D46421
llvm-svn: 358968
A compilation warning was in my previous commit which broke the buildbot
because it is using `-Werror` for compilation. This patch fixes this
issue.
llvm-svn: 358955
Currently iterator checkers record comparison of iterator positions
and process them for keeping track the distance between them (e.g.
whether a position is the same as the end position). However this
makes some processing unnecessarily complex and it is not needed at
all: we only need to keep track between the abstract symbols stored
in these iterator positions. This patch changes this and opens the
path to comparisons to the begin() and end() symbols between the
container (e.g. size, emptiness) which are stored as symbols, not
iterator positions. The functionality of the checker is unchanged.
Differential Revision: https://reviews.llvm.org/D53701
llvm-svn: 358951
When growing a body on a body farm, it's essential to use the same redeclaration
of the function that's going to be used during analysis. Otherwise our
ParmVarDecls won't match the ones that are used to identify argument regions.
This boils down to trusting the reasoning in AnalysisDeclContext. We shouldn't
canonicalize the declaration before farming the body because it makes us not
obey the sophisticated decision-making process of AnalysisDeclContext.
Differential Revision: https://reviews.llvm.org/D60899
llvm-svn: 358946
Stuffing invalid source locations (such as those in functions produced by
body farms) into path diagnostics causes crashes.
Fix a typo in a nearby function name.
Differential Revision: https://reviews.llvm.org/D60808
llvm-svn: 358945
Implement cplusplus.SmartPtrModeling, a new checker that doesn't
emit any warnings but models methods of smart pointers more precisely.
For now the only thing it does is make `(bool) P` return false when `P`
is a freshly moved pointer. This addresses a false positive in the
use-after-move-checker.
Differential Revision: https://reviews.llvm.org/D60796
llvm-svn: 358944
Moved UninitializedObjectChecker from the 'alpha.cplusplus' to the
'optin.cplusplus' package.
Differential Revision: https://reviews.llvm.org/D58573
llvm-svn: 358797
TL;DR:
* Add checker and package options to the TableGen files
* Added a new class called CmdLineOption, and both Package and Checker recieved
a list<CmdLineOption> field.
* Added every existing checker and package option to Checkers.td.
* The CheckerRegistry class
* Received some comments to most of it's inline classes
* Received the CmdLineOption and PackageInfo inline classes, a list of
CmdLineOption was added to CheckerInfo and PackageInfo
* Added addCheckerOption and addPackageOption
* Added a new field called Packages, used in addPackageOptions, filled up in
addPackage
Detailed description:
In the last couple months, a lot of effort was put into tightening the
analyzer's command line interface. The main issue is that it's spectacularly
easy to mess up a lenghty enough invocation of the analyzer, and the user was
given no warnings or errors at all in that case.
We can divide the effort of resolving this into several chapters:
* Non-checker analyzer configurations:
Gather every analyzer configuration into a dedicated file. Emit errors for
non-existent configurations or incorrect values. Be able to list these
configurations. Tighten AnalyzerOptions interface to disallow making such
a mistake in the future.
* Fix the "Checker Naming Bug" by reimplementing checker dependencies:
When cplusplus.InnerPointer was enabled, it implicitly registered
unix.Malloc, which implicitly registered some sort of a modeling checker
from the CStringChecker family. This resulted in all of these checker
objects recieving the name "cplusplus.InnerPointer", making AnalyzerOptions
asking for the wrong checker options from the command line:
cplusplus.InnerPointer:Optimisic
istead of
unix.Malloc:Optimistic.
This was resolved by making CheckerRegistry responsible for checker
dependency handling, instead of checkers themselves.
* Checker options: (this patch included!)
Same as the first item, but for checkers.
(+ minor fixes here and there, and everything else that is yet to come)
There were several issues regarding checker options, that non-checker
configurations didn't suffer from: checker plugins are loaded runtime, and they
could add new checkers and new options, meaning that unlike for non-checker
configurations, we can't collect every checker option purely by generating code.
Also, as seen from the "Checker Naming Bug" issue raised above, they are very
rarely used in practice, and all sorts of skeletons fell out of the closet while
working on this project.
They were extremely problematic for users as well, purely because of how long
they were. Consider the following monster of a checker option:
alpha.cplusplus.UninitializedObject:CheckPointeeInitialization=false
While we were able to verify whether the checker itself (the part before the
colon) existed, any errors past that point were unreported, easily resulting
in 7+ hours of analyses going to waste.
This patch, similarly to how dependencies were reimplemented, uses TableGen to
register checker options into Checkers.td, so that Checkers.inc now contains
entries for both checker and package options. Using the preprocessor,
Checkers.inc is converted into code in CheckerRegistry, adding every builtin
(checkers and packages that have an entry in the Checkers.td file) checker and
package option to the registry. The new addPackageOption and addCheckerOption
functions expose the same functionality to statically-linked non-builtin and
plugin checkers and packages as well.
Emitting errors for incorrect user input, being able to list these options, and
some other functionalies will land in later patches.
Differential Revision: https://reviews.llvm.org/D57855
llvm-svn: 358752
Ideally, there is no reason behind not being able to depend on checkers that
come from a different plugin (or on builtin checkers) -- however, this is only
possible if all checkers are added to the registry before resolving checker
dependencies. Since I used a binary search in my addDependency method, this also
resulted in an assertion failure (due to CheckerRegistry::Checkers not being
sorted), since the function used by plugins to register their checkers
(clang_registerCheckers) calls addDependency.
This patch resolves this issue by only noting which dependencies have to
established when addDependency is called, and resolves them at a later stage
when no more checkers are added to the registry, by which point
CheckerRegistry::Checkers is already sorted.
Differential Revision: https://reviews.llvm.org/D59461
llvm-svn: 358750
Default RegionStore bindings represent values that can be obtained by loading
from anywhere within the region, not just the specific offset within the region
that they are said to be bound to. For example, default-binding a character \0
to an int (eg., via memset()) means that the whole int is 0, not just
that its lower byte is 0.
Even though memset and bzero were modeled this way, it didn't work correctly
when applied to simple variables. Eg., in
int x;
memset(x, 0, sizeof(x));
we did produce a default binding, but were unable to read it later, and 'x'
was perceived as an uninitialized variable even after memset.
At the same time, if we replace 'x' with a variable of a structure or array
type, accessing fields or elements of such variable was working correctly,
which was enough for most cases. So this was only a problem for variables of
simple integer/enumeration/floating-point/pointer types.
Fix loading default bindings from RegionStore for regions of simple variables.
Add a unit test to document the API contract as well.
Differential Revision: https://reviews.llvm.org/D60742
llvm-svn: 358722
There are barely any lines I haven't changed in these files, so I think I could
might as well leave it in an LLVM coding style conforming state. I also renamed
2 functions and moved addDependency out of line to ease on followup patches.
Differential Revision: https://reviews.llvm.org/D59457
llvm-svn: 358676
For the following code snippet:
void builtin_function_call_crash_fixes(char *c) {
__builtin_strncpy(c, "", 6);
__builtin_memset(c, '\0', (0));
__builtin_memcpy(c, c, 0);
}
security.insecureAPI.DeprecatedOrUnsafeBufferHandling caused a regression, as it
didn't recognize functions starting with __builtin_. Fixed exactly that.
I wanted to modify an existing test file, but the two I found didn't seem like
perfect candidates. While I was there, I prettified their RUN: lines.
Differential Revision: https://reviews.llvm.org/D59812
llvm-svn: 358609
Writing stuff into an argument variable is usually equivalent to writing stuff
to a local variable: it will have no effect outside of the function.
There's an important exception from this rule: if the argument variable has
a non-trivial destructor, the destructor would be invoked on
the parent stack frame, exposing contents of the otherwise dead
argument variable to the caller.
If such argument is the last place where a pointer is stored before the function
exits and the function is the one we've started our analysis from (i.e., we have
no caller context for it), we currently diagnose a leak. This is incorrect
because the destructor of the argument still has access to the pointer.
The destructor may deallocate the pointer or even pass it further.
Treat writes into such argument regions as "escapes" instead, suppressing
spurious memory leak reports but not messing with dead symbol removal.
Differential Revision: https://reviews.llvm.org/D60112
llvm-svn: 358321
The idea behind this heuristic is that normally the visitor is there to
inform the user that a certain function may fail to initialize a certain
out-parameter. For system header functions this is usually dictated by the
contract, and it's unlikely that the header function has accidentally
forgot to put the value into the out-parameter; it's more likely
that the user has intentionally skipped the error check.
Warnings on skipped error checks are more like security warnings;
they aren't necessarily useful for all users, and they should instead
be introduced on a per-API basis.
Differential Revision: https://reviews.llvm.org/D60107
llvm-svn: 357810
Requires making the llvm::MemoryBuffer* stored by SourceManager const,
which in turn requires making the accessors for that return const
llvm::MemoryBuffer*s and updating all call sites.
The original motivation for this was to use it and fix the TODO in
CodeGenAction.cpp's ConvertBackendLocation() by using the UnownedTag
version of createFileID, and since llvm::SourceMgr* hands out a const
llvm::MemoryBuffer* this is required. I'm not sure if fixing the TODO
this way actually works, but this seems like a good change on its own
anyways.
No intended behavior change.
Differential Revision: https://reviews.llvm.org/D60247
llvm-svn: 357724
__builtin_constant_p(x) is a compiler builtin that evaluates to 1 when
its argument x is a compile-time constant and to 0 otherwise. In CodeGen
it is simply lowered to the respective LLVM intrinsic. In the Analyzer
we've been trying to delegate modeling to Expr::EvaluateAsInt, which is
allowed to sometimes fail for no apparent reason.
When it fails, let's conservatively return false. Modeling it as false
is pretty much never wrong, and it is only required to return true
on a best-effort basis, which every user should expect.
Fixes VLAChecker false positives on code that tries to emulate
static asserts in C by constructing a VLA of dynamic size -1 under the
assumption that this dynamic size is actually a constant
in the sense of __builtin_constant_p.
Differential Revision: https://reviews.llvm.org/D60110
llvm-svn: 357557
At least gcc 7.4 complained with
../tools/clang/lib/StaticAnalyzer/Checkers/Taint.cpp:26:53: warning: extra ';' [-Wpedantic]
TaintTagType);
^
llvm-svn: 357461
It turns out that SourceManager::isInSystemHeader() crashes when an invalid
source location is passed into it. Invalid source locations are relatively
common: not only they come from body farms, but also, say, any function in C
that didn't come with a forward declaration would have an implicit
forward declaration with invalid source locations.
There's a more comfy API for us to use in the Static Analyzer:
CallEvent::isInSystemHeader(), so just use that.
Differential Revision: https://reviews.llvm.org/D59901
llvm-svn: 357329
It is now an inter-checker communication API, similar to the one that
connects MallocChecker/CStringChecker/InnerPointerChecker: simply a set of
setters and getters for a state trait.
Differential Revision: https://reviews.llvm.org/D59861
llvm-svn: 357326
The transfer function for the CFG element that represents a logical operation
computes the value of the operation and does nothing else. The element
appears after all the short circuit decisions were made, so they don't need
to be made again at this point.
Because our expression evaluation is imprecise, it is often hard to
discriminate between:
(1) we don't know the value of the RHS because we failed to evaluate it
and
(2) we don't know the value of the RHS because it didn't need to be evaluated.
This is hard because it depends on our knowledge about the value of the LHS
(eg., if LHS is true, then RHS in (LHS || RHS) doesn't need to be computed)
but LHS itself may have been evaluated imprecisely and we don't know whether
it is true or not. Additionally, the Analyzer wouldn't necessarily even remember
what the value of the LHS was because theoretically it's not really necessary
to know it for any future evaluations.
In order to work around these issues, the transfer function for logical
operations consists in looking at the ExplodedGraph we've constructed so far
in order to figure out from which CFG direction did we arrive here.
Such post-factum backtracking that doesn't involve looking up LHS and RHS values
is usually possible. However sometimes it fails because when we deduplicate
exploded nodes with the same program point and the same program state we may end
up in a situation when we reached the same program point from two or more
different directions.
By removing the assertion, we admit that the procedure indeed sometimes fails to
work. When it fails, we also admit that we don't know the value of the logical
operator.
Differential Revision: https://reviews.llvm.org/D59857
llvm-svn: 357325
Almost all path-sensitive checkers need to tell the user when something specific
to that checker happens along the execution path but does not constitute a bug
on its own. For instance, a call to operator delete in C++ has consequences
that are specific to a use-after-free bug. Deleting an object is not a bug
on its own, but when the Analyzer finds an execution path on which a deleted
object is used, it'll have to explain to the user when exactly during that path
did the deallocation take place.
Historically such custom notes were added by implementing "bug report visitors".
These visitors were post-processing bug reports by visiting every ExplodedNode
along the path and emitting path notes whenever they noticed that a change that
is relevant to a bug report occurs within the program state. For example,
it emits a "memory is deallocated" note when it notices that a pointer changes
its state from "allocated" to "deleted".
The "visitor" approach is powerful and efficient but hard to use because
such preprocessing implies that the developer first models the effects
of the event (say, changes the pointer's state from "allocated" to "deleted"
as part of operator delete()'s transfer function) and then forgets what happened
and later tries to reverse-engineer itself and figure out what did it do
by looking at the report.
The proposed approach tries to avoid discarding the information that was
available when the transfer function was evaluated. Instead, it allows the
developer to capture all the necessary information into a closure that
will be automatically invoked later in order to produce the actual note.
This should reduce boilerplate and avoid very painful logic duplication.
On the technical side, the closure is a lambda that's put into a special kind of
a program point tag, and a special bug report visitor visits all nodes in the
report and invokes all note-producing closures it finds along the path.
For now it is up to the lambda to make sure that the note is actually relevant
to the report. For instance, a memory deallocation note would be irrelevant when
we're reporting a division by zero bug or if we're reporting a use-after-free
of a different, unrelated chunk of memory. The lambda can figure these thing out
by looking at the bug report object that's passed into it.
A single checker is refactored to make use of the new functionality: MIGChecker.
Its program state is trivial, making it an easy testing ground for the first
version of the API.
Differential Revision: https://reviews.llvm.org/D58367
llvm-svn: 357323
Since rL335814, if the constraint manager cannot find a range set for `A - B`
(where `A` and `B` are symbols) it looks for a range for `B - A` and returns
it negated if it exists. However, if a range set for both `A - B` and `B - A`
is stored then it only returns the first one. If we both use `A - B` and
`B - A`, these expressions behave as two totally unrelated symbols. This way
we miss some useful deductions which may lead to false negatives or false
positives.
This tiny patch changes this behavior: if the symbolic expression the
constraint manager is looking for is a difference `A - B`, it tries to
retrieve the range for both `A - B` and `B - A` and if both exists it returns
the intersection of range `A - B` and the negated range of `B - A`. This way
every time a checker applies new constraints to the symbolic difference or to
its negated it always affects both the original difference and its negated.
Differential Revision: https://reviews.llvm.org/D55007
llvm-svn: 357167
Remove CompilerInstance::VirtualFileSystem and
CompilerInstance::setVirtualFileSystem, instead relying on the VFS in
the FileManager. CompilerInstance and its clients already went to some
trouble to make these match. Now they are guaranteed to match.
As part of this, I added a VFS parameter (defaults to nullptr) to
CompilerInstance::createFileManager, to avoid repeating construction
logic in clients that just wanted to customize the VFS.
https://reviews.llvm.org/D59377
llvm-svn: 357037
r356634 didn't fix all the problems caused by r356222 - even though simple
constructors involving transparent init-list expressions are now evaluated
precisely, many more complicated constructors aren't, for other reasons.
The attached test case is an example of a constructor that will never be
evaluated precisely - simply because there isn't a constructor there (instead,
the program invokes run-time undefined behavior by returning without a return
statement that should have constructed the return value).
Fix another part of the problem for such situations: evaluate transparent
init-list expressions transparently, so that to avoid creating ill-formed
"transparent" nonloc::CompoundVals.
Differential Revision: https://reviews.llvm.org/D59622
llvm-svn: 356969
Summary:
If the constraint information is not changed between two program states the
analyzer has not learnt new information and made no report. But it is
possible to happen because we have no information at all. The new approach
evaluates the condition to determine if that is the case and let the user
know we just `Assuming...` some value.
Reviewers: NoQ, george.karpenkov
Reviewed By: NoQ
Subscribers: llvm-commits, xazax.hun, baloghadamsoftware, szepet, a.sidorin,
mikhail.ramalho, Szelethus, donat.nagy, dkrupp, gsd, gerazo
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D57410
llvm-svn: 356323
Summary:
Removed the `GDM` checking what could prevent reports made by this visitor.
Now we rely on constraint changes instead.
(It reapplies 356318 with a feature from 356319 because build-bot failure.)
Reviewers: NoQ, george.karpenkov
Reviewed By: NoQ
Subscribers: cfe-commits, jdoerfert, gerazo, xazax.hun, baloghadamsoftware,
szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp
Tags: #clang
Differential Revision: https://reviews.llvm.org/D54811
llvm-svn: 356322
Summary: If the constraint information is not changed between two program states the analyzer has not learnt new information and made no report. But it is possible to happen because we have no information at all. The new approach evaluates the condition to determine if that is the case and let the user know we just 'Assuming...' some value.
Reviewers: NoQ, george.karpenkov
Reviewed By: NoQ
Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, gsd, gerazo
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57410
llvm-svn: 356319
Summary: Removed the `GDM` checking what could prevent reports made by this visitor. Now we rely on constraint changes instead.
Reviewers: NoQ, george.karpenkov
Reviewed By: NoQ
Subscribers: jdoerfert, gerazo, xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp
Tags: #clang
Differential Revision: https://reviews.llvm.org/D54811
llvm-svn: 356318
RegionStore now knows how to bind a nonloc::CompoundVal that represents the
value of an aggregate initializer when it has its initial segment of sub-values
correspond to base classes.
Additionally, fixes the crash from pr40022.
Differential Revision: https://reviews.llvm.org/D59054
llvm-svn: 356222
For a rather short code snippet, if debug.ReportStmts (added in this patch) was
enabled, a bug reporter visitor crashed:
struct h {
operator int();
};
int k() {
return h();
}
Ultimately, this originated from PathDiagnosticLocation::createMemberLoc, as it
didn't handle the case where it's MemberExpr typed parameter returned and
invalid SourceLocation for MemberExpr::getMemberLoc. The solution was to find
any related valid SourceLocaion, and Stmt::getBeginLoc happens to be just that.
Differential Revision: https://reviews.llvm.org/D58777
llvm-svn: 356161
Checking whether two regions are the same is a partially decidable problem:
either we know for sure that they are the same or we cannot decide. A typical
case for this are the symbolic regions based on conjured symbols. Two
different conjured symbols are either the same or they are different. Since
we cannot decide this and want to reduce false positives as much as possible
we exclude these regions whenever checking whether two containers are the
same at iterator mismatch check.
Differential Revision: https://reviews.llvm.org/D53754
llvm-svn: 356049
Buildbot breaks when LLVm is compiled with memory sanitizer.
WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0xa3d16d8 in getMacroNameAndPrintExpansion(blahblah)
lib/StaticAnalyzer/Core/PlistDiagnostics.cpp:903:11
llvm-svn: 355911
When there is a functor-like macro which is passed as parameter to another
"function" macro then its parameters are not listed at the place of expansion:
#define foo(x) int bar() { return x; }
#define hello(fvar) fvar(0)
hello(foo)
int main() { 1 / bar(); }
Expansion of hello(foo) asserted Clang, because it expected an l_paren token in
the 3rd line after "foo", since it is a function-like token.
Patch by Tibor Brunner!
Differential Revision: https://reviews.llvm.org/D57893
llvm-svn: 355903
In the commited testfile, macro expansion (the one implemented for the plist
output) runs into an infinite recursion. The issue originates from the algorithm
being faulty, as in
#define value REC_MACRO_FUNC(value)
the "value" is being (or at least attempted) expanded from the same macro.
The solved this issue by gathering already visited macros in a set, which does
resolve the crash, but will result in an incorrect macro expansion, that would
preferably be fixed down the line.
Patch by Tibor Brunner!
Differential Revision: https://reviews.llvm.org/D57891
llvm-svn: 355705
Asserting on invalid input isn't very nice, hence the patch to emit an error
instead.
This is the first of many patches to overhaul the way we handle checker options.
Differential Revision: https://reviews.llvm.org/D57850
llvm-svn: 355704
In D55734, we implemented a far more general way of describing taint propagation
rules for functions, like being able to specify an unlimited amount of
source and destination parameters. Previously, we didn't have a particularly
elegant way of expressing the propagation rules for functions that always return
(either through an out-param or return value) a tainted value. In this patch,
we model these functions similarly to other ones, by assigning them a
TaintPropagationRule that describes that they "create a tainted value out of
nothing".
The socket C function is somewhat special, because for certain parameters (for
example, if we supply localhost as parameter), none of the out-params should
be tainted. For this, we added a general solution of being able to specify
custom taint propagation rules through function pointers.
Patch by Gábor Borsik!
Differential Revision: https://reviews.llvm.org/D59055
llvm-svn: 355703
Summary:
When comparing a symbolic region and a constant, the constant would be
widened or truncated to the width of a void pointer, meaning that the
constant could be incorrectly truncated when handling symbols for
non-default address spaces. In the attached test case this resulted in a
false positive since the constant was truncated to zero. To fix this,
widen/truncate the constant to the width of the symbol expression's
type.
This commit does not consider non-symbolic regions as I'm not sure how
to generalize getting the type there.
This fixes PR40814.
Reviewers: NoQ, zaks.anna, george.karpenkov
Reviewed By: NoQ
Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, Szelethus, donat.nagy, dkrupp, jdoerfert, Charusso, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58665
llvm-svn: 355592
This patch includes the necessary code for converting between a fixed point type and integer.
This also includes constant expression evaluation for conversions with these types.
Differential Revision: https://reviews.llvm.org/D56900
llvm-svn: 355462
The gets function has no SrcArgs. Because the default value for isTainted was
false, it didn't mark its DstArgs as tainted.
Patch by Gábor Borsik!
Differential Revision: https://reviews.llvm.org/D58828
llvm-svn: 355396
Under the term "subchecker", I mean checkers that do not have a checker class on
their own, like unix.MallocChecker to unix.DynamicMemoryModeling.
Since a checker object was required in order to retrieve checker options,
subcheckers couldn't possess options on their own.
This patch is also an excuse to change the argument order of getChecker*Option,
it always bothered me, now it resembles the actual command line argument
(checkername:option=value).
Differential Revision: https://reviews.llvm.org/D57579
llvm-svn: 355297
#define f(y) x
#define x f(x)
int main() { x; }
This example results a compilation error since "x" in the first line was not
defined earlier. However, the macro expression printer goes to an infinite
recursion on this example.
Patch by Tibor Brunner!
Differential Revision: https://reviews.llvm.org/D57892
llvm-svn: 354806
Add more "consuming" functions. For now only vm_deallocate() was supported.
Add a non-zero value that isn't an error; this value is -305 ("MIG_NO_REPLY")
and it's fine to deallocate data when you are returning this error.
Make sure that the mig_server_routine annotation is inherited.
rdar://problem/35380337
Differential Revision: https://reviews.llvm.org/D58397
llvm-svn: 354643
When a MIG server routine argument is released in an automatic destructor,
the Static Analyzer thinks that this happens after the return statement, and so
the violation of the MIG convention doesn't happen.
Of course, it doesn't quite work that way, so this is a false negative.
Add a hack that makes the checker double-check at the end of function
that no argument was released when the routine fails with an error.
rdar://problem/35380337
Differential Revision: https://reviews.llvm.org/D58392
llvm-svn: 354642
Add a BugReporterVisitor for highlighting the events of deallocating a
parameter. All such events are relevant to the emitted report (as long as the
report is indeed emitted), so all of them will get highlighted.
Add a trackExpressionValue visitor for highlighting where does the error return
code come from.
Do not add a trackExpressionValue visitor for highlighting how the deallocated
argument(s) was(were) copied around. This still remains to be implemented.
rdar://problem/35380337
Differential Revision: https://reviews.llvm.org/D58368
llvm-svn: 354641
r354530 has added a new function/block/message attribute "mig_server_routine"
that attracts compiler's attention to functions that need to follow the MIG
server routine convention with respect to deallocating out-of-line data that
was passed to them as an argument.
Teach the checker to identify MIG routines by looking at this attribute,
rather than by making heuristic-based guesses.
rdar://problem/35380337
Differential Revision: https://reviews.llvm.org/58366
llvm-svn: 354638
This checker detects use-after-free bugs in (various forks of) the Mach kernel
that are caused by errors in MIG server routines - functions called remotely by
MIG clients. The MIG convention forces the server to only deallocate objects
it receives from the client when the routine is executed successfully.
Otherwise, if the server routine exits with an error, the client assumes that
it needs to deallocate the out-of-line data it passed to the server manually.
This means that deallocating such data within the MIG routine and then returning
a non-zero error code is always a dangerous use-after-free bug.
rdar://problem/35380337
Differential Revision: https://reviews.llvm.org/D57558
llvm-svn: 354635
FindLastStoreBRVisitor tries to find the first node in the exploded graph where
the current value was assigned to a region. This node is called the "store
site". It is identified by a pair of Pred and Succ nodes where Succ already has
the binding for the value while Pred does not have it. However the visitor
mistakenly identifies a node pair as the store site where the value is a
`LazyCompoundVal` and `Pred` does not have a store yet but `Succ` has it. In
this case the `LazyCompoundVal` is different in the `Pred` node because it also
contains the store which is different in the two nodes. This error may lead to
crashes (a declaration is cast to a parameter declaration without check) or
misleading bug path notes.
In this patch we fix this problem by checking for unequal `LazyCompoundVals`: if
their region is equal, and their store is the same as the store of their nodes
we consider them as equal when looking for the "store site". This is an
approximation because we do not check for differences of the subvalues
(structure members or array elements) in the stores.
Differential Revision: https://reviews.llvm.org/D58067
llvm-svn: 353943
There are certain unsafe or deprecated (since C11) buffer handling
functions which should be avoided in safety critical code. They
could cause buffer overflows. A new checker,
'security.insecureAPI.DeprecatedOrUnsafeBufferHandling' warns for
every occurrence of such functions (unsafe or deprecated printf,
scanf family, and other buffer handling functions, which now have
a secure variant).
Patch by Dániel Kolozsvári!
Differential Revision: https://reviews.llvm.org/D35068
llvm-svn: 353698