Summary:
This checker verifies if default placement new is provided with pointers
to sufficient storage capacity.
Noncompliant Code Example:
#include <new>
void f() {
short s;
long *lp = ::new (&s) long;
}
Based on SEI CERT rule MEM54-CPP
https://wiki.sei.cmu.edu/confluence/display/cplusplus/MEM54-CPP.+Provide+placement+new+with+properly+aligned+pointe
This patch does not implement checking of the alignment.
Reviewers: NoQ, xazax.hun
Subscribers: mgorny, whisperity, xazax.hun, baloghadamsoftware, szepet,
rnkovacs, a.sidorin, mikhail.ramalho, donat
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71612
This avoids unneeded copies when using a range-based for loops.
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.
Differential Revision: https://reviews.llvm.org/D70869
Method '-[NSCoder decodeValueOfObjCType:at:]' is not only deprecated
but also a security hazard, hence a loud check.
Differential Revision: https://reviews.llvm.org/D71728
MallocChecker warns when memory is passed into -[NSData initWithBytesNoCopy]
but isn't allocated by malloc(), because it will be deallocated by free().
However, initWithBytesNoCopy has an overload that takes an arbitrary block
for deallocating the object. If such overload is used, it is no longer
necessary to make sure that the memory is allocated by malloc().
This is useful for clients that are relying on linearized CFGs for evaluating
subexpressions and want the default initializer to be evaluated properly.
The upcoming lifetime analysis is using this but it might also be useful
for the static analyzer at some point.
Differential Revision: https://reviews.llvm.org/D71642
This canonicalizes the representation of unknown pointer symbols,
which reduces the overall confusion in pointer cast representation.
Patch by Vince Bridgers!
Differential Revision: https://reviews.llvm.org/D70836
This patch introduces the namespaces for the configured functions and
also enables the use of the member functions.
I added an optional Scope field for every configured function. Functions
without Scope match for every function regardless of the namespace.
Functions with Scope will match if the full name of the function starts
with the Scope.
Multiple functions can exist with the same name.
Differential Revision: https://reviews.llvm.org/D70878
AbstractBasicReader.h has quite a few dependencies already,
and that's only likely to increase. Meanwhile, ASTRecordReader
is really an implementation detail of the ASTReader that is only
used in a small number of places.
I've kept it in a public header for the use of projects like Swift
that might want to plug in to Clang's serialization framework.
I've also moved OMPClauseReader into an implementation file,
although it can't be made private because of friendship.
Some AST nodes which stands for implicit initialization is shared. The analyzer
will do the same evaluation on the same nodes resulting in the same state. The
analyzer will "cache out", i.e. it thinks that it visited an already existing
node in the exploded graph. This is not true in this case and we lose coverage.
Since these nodes do not really require any processing from the analyzer
we just omit them from the CFG.
Differential Revision: https://reviews.llvm.org/D71371
This patch introduced additional PointerEscape callbacks after conservative
calls for output parameters. This should not really affect the current
checkers but the upcoming FuchsiaHandleChecker relies on this heavily.
Differential Revision: https://reviews.llvm.org/D71224
The checker was trying to analyze the body of every method in Objective-C
@implementation clause but the sythesized accessor stubs that were introduced
into it by 2073dd2d have no bodies.
While analyzing code `memcmp(a, NULL, n);', where `a' has an unconstrained
symbolic value, the analyzer was emitting a warning about the *first* argument
being a null pointer, even though we'd rather have it warn about the *second*
argument.
This happens because CStringChecker first checks whether the two argument
buffers are in fact the same buffer, in order to take the fast path.
This boils down to assuming `a == NULL' to true. Then the subsequent check
for null pointer argument "discovers" that `a' is null.
Don't take the fast path unless we are *sure* that the buffers are the same.
Otherwise proceed as normal.
Differential Revision: https://reviews.llvm.org/D71322
Sometimes the return value of a comparison operator call is
`UnkownVal`. Since no assumptions can be made on `UnknownVal`,
this leeds to keeping impossible execution paths in the
exploded graph resulting in poor performance and false
positives. To overcome this we replace unknown results of
iterator comparisons by conjured symbols.
Differential Revision: https://reviews.llvm.org/D70244
Debugging the Iterator Modeling checker or any of the iterator checkers
is difficult without being able to see the relations between the
iterator variables and their abstract positions, as well as the abstract
symbols denoting the begin and the end of the container.
This patch adds the checker-specific part of the Program State printing
to the Iterator Modeling checker.
A monolithic checker class is hard to maintain. This patch splits it up
into a modeling part, the three checkers and a debug checker. The common
functions are moved into a library.
Differential Revision: https://reviews.llvm.org/D70320
It was a step in the right direction but it is not clear how can this
fit into the checker API at this point. The pre-escape happens in the
analyzer core and the checker has no control over it. If the checker
is not interestd in a pre-escape it would need to do additional work
on each escape to check if the escaped symbol is originated from an
"uninteresting" pre-escaped memory region. In order to keep the
checker API simple we abandoned this solution for now.
We will reland this once we have a better answer for what to do on the
checker side.
This reverts commit f3a28202ef.
We want to escape all symbols that are stored into escaped regions.
The problem is, we did not know which local regions were escaped. Until now.
This should fix some false positives like the one in the tests.
Differential Revision: https://reviews.llvm.org/D71152
This commit sets the Self and Imp declarations for ObjC method declarations,
in addition to the definitions. It also fixes
a bunch of code in clang that had wrong assumptions about when getSelfDecl() would be set:
- CGDebugInfo::getObjCMethodName and AnalysisConsumer::getFunctionName would assume that it was
set for method declarations part of a protocol, which they never were,
and that self would be a Class type, which it isn't as it is id for a protocol.
Also use the Canonical Decl to index the set of Direct methods so that
when calls and implementations interleave, the same llvm::Function is
used and the same symbol name emitted.
Radar-Id: rdar://problem/57661767
Patch by: Pierre Habouzit
Differential Revision: https://reviews.llvm.org/D71091
When implementation of the block runtime is available, we should not
warn that block layout fields are uninitialized simply because they're
on the stack.
This patch is the last of the series of patches which allow the user to
annotate their functions with taint propagation rules.
I implemented the use of the configured filtering functions. These
functions can remove taintedness from the symbols which are passed at
the specified arguments to the filters.
Differential Revision: https://reviews.llvm.org/D59516
Fix a canonicalization problem for the newly added property accessor stubs that
was causing a wrong decl to be used for 'self' in the accessor's body farm.
Fix a crash when constructing a body farm for accessors of a property
that is declared and @synthesize'd in different (but related) interfaces.
Differential Revision: https://reviews.llvm.org/D70158
Let the checkers use a reference instead of a copy in a range-based
for loop.
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.
Differential Revision: https://reviews.llvm.org/D70047
When bugreporter::trackExpressionValue() is invoked on a DeclRefExpr,
it tries to do most of its computations over the node in which
this DeclRefExpr is computed, rather than on the error node (or whatever node
is stuffed into it). One reason why we can't simply use the error node is
that the binding to that variable might have already disappeared from the state
by the time the bug is found.
In case of the inlined defensive checks visitor, the DeclRefExpr node
is in fact sometimes too *early*: the call in which the inlined defensive check
has happened might have not been entered yet.
Change the visitor to be fine with tracking dead symbols (which it is totally
capable of - the collapse point for the symbol is still well-defined), and fire
it up directly on the error node. Keep using "LVState" to find out which value
should we be tracking, so that there weren't any problems with accidentally
loading an ill-formed value from a dead variable.
Differential Revision: https://reviews.llvm.org/D67932
This patch is motivated by (and factored out from)
https://reviews.llvm.org/D66121 which is a debug info bugfix. Starting
with DWARF 5 all Objective-C methods are nested inside their
containing type, and that patch implements this for synthesized
Objective-C properties.
1. SemaObjCProperty populates a list of synthesized accessors that may
need to inserted into an ObjCImplDecl.
2. SemaDeclObjC::ActOnEnd inserts forward-declarations for all
accessors for which no override was provided into their
ObjCImplDecl. This patch does *not* synthesize AST function
*bodies*. Moving that code from the static analyzer into Sema may
be a good idea though.
3. Places that expect all methods to have bodies have been updated.
I did not update the static analyzer's inliner for synthesized
properties to point back to the property declaration (see
test/Analysis/Inputs/expected-plists/nullability-notes.m.plist), which
I believed to be more bug than a feature.
Differential Revision: https://reviews.llvm.org/D68108
rdar://problem/53782400
For white-box testing correct container and iterator modelling it is essential
to access the internal data structures stored for container and iterators. This
patch introduces a simple debug checkers called debug.IteratorDebugging to
achieve this.
Differential Revision: https://reviews.llvm.org/D67156
- Fix false positive reports of strlcat.
- The return value of strlcat and strlcpy is now correctly calculated.
- The resulting string length of strlcat and strlcpy is now correctly
calculated.
Patch by Daniel Krupp!
Differential Revision: https://reviews.llvm.org/D66049
Summary:
Recognization of function names is done now with the CallDescription
class instead of using IdentifierInfo. This means function name and
argument count is compared too.
A new check for filtering not global-C-functions was added.
Test was updated.
Reviewers: Szelethus, NoQ, baloghadamsoftware, Charusso
Reviewed By: Szelethus, NoQ, Charusso
Subscribers: rnkovacs, xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, donat.nagy, Charusso, dkrupp, Szelethus, gamesh411, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67706
Member operator declarations and member operator expressions
have different numbering of parameters and arguments respectively:
one of them includes "this", the other does not.
Account for this inconsistency when figuring out whether
the parameter needs to be manually rebound from the Environment
to the Store when entering a stack frame of an operator call,
as opposed to being constructed with a constructor and as such
already having the necessary Store bindings.
Differential Revision: https://reviews.llvm.org/D69155
The '->' thing has always been confusing; the actual operation '->'
translates to a pointer dereference together with adding a FieldRegion,
but FieldRegion on its own doesn't imply an additional pointer
dereference.
llvm-svn: 375281
One of the first attempts to reduce the size of the exploded graph dumps
was to skip the state dump as long as the state is the same as in all of
the predecessor nodes. With all the new facilities in place (node joining,
diff dumps), this feature doesn't do much, and when it does,
it's more harmful than useful. Let's remove it.
llvm-svn: 375280
The joined nodes now actually have the same state. That was intended
from the start but the original implementation turned out to be buggy.
Differential Revision: https://reviews.llvm.org/D69150
llvm-svn: 375278
ExplodedGraph nodes will now have a numeric identifier stored in them
which will keep track of the order in which the nodes were created
and it will be fully deterministic both accross runs and across machines.
This is extremely useful for debugging as it allows reliably setting
conditional breakpoints by node IDs.
llvm-svn: 375186
Part of C++20 Concepts implementation effort. Added Concept Specialization Expressions that are created when a concept is refe$
D41217 on Phabricator.
(recommit after fixing failing Parser test on windows)
llvm-svn: 374903
Part of C++20 Concepts implementation effort. Added Concept Specialization Expressions that are created when a concept is referenced with arguments, and tests thereof.
llvm-svn: 374882
Added parsing/sema/codegen support for 'parallel master taskloop'
constructs. Some of the clauses, like 'grainsize', 'num_tasks', 'final'
and 'priority' are not supported in full, only constant expressions can
be used currently in these clauses.
llvm-svn: 374791
The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us.
llvm-svn: 374717
Some compilers have trouble converting unique_ptr<PathSensitiveBugReport> to
unique_ptr<BugReport> causing some functions to fail to compile.
Changing the return type of the functions that fail to compile does not
appear to have any issues.
I ran into this issue building with clang 3.8 on Ubuntu 16.04.
llvm-svn: 372668
Summary:
https://bugs.llvm.org/show_bug.cgi?id=43102
In today's edition of "Is this any better now that it isn't crashing?", I'd like to show you a very interesting test case with loop widening.
Looking at the included test case, it's immediately obvious that this is not only a false positive, but also a very bad bug report in general. We can see how the analyzer mistakenly invalidated `b`, instead of its pointee, resulting in it reporting a null pointer dereference error. Not only that, the point at which this change of value is noted at is at the loop, rather then at the method call.
It turns out that `FindLastStoreVisitor` works correctly, rather the supplied explodedgraph is faulty, because `BlockEdge` really is the `ProgramPoint` where this happens.
{F9855739}
So it's fair to say that this needs improving on multiple fronts. In any case, at least the crash is gone.
Full ExplodedGraph: {F9855743}
Reviewers: NoQ, xazax.hun, baloghadamsoftware, Charusso, dcoughlin, rnkovacs, TWeaver
Subscribers: JesperAntonsson, uabelho, Ka-Ka, bjope, whisperity, szepet, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, gamesh411, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66716
llvm-svn: 372269
Traditionally, clang-tidy uses the term check, and the analyzer uses checker,
but in the very early years, this wasn't the case, and code originating from the
early 2010's still incorrectly refer to checkers as checks.
This patch attempts to hunt down most of these, aiming to refer to checkers as
checkers, but preserve references to callback functions (like checkPreCall) as
checks.
Differential Revision: https://reviews.llvm.org/D67140
llvm-svn: 371760
At this point the PathDiagnostic, PathDiagnosticLocation, PathDiagnosticPiece
structures no longer rely on anything specific to Static Analyzer, so we can
move them out of it for everybody to use.
PathDiagnosticConsumers are still to be handed off.
Differential Revision: https://reviews.llvm.org/D67419
llvm-svn: 371661
This method of PathDiagnostic is a part of Static Analyzer's particular
path diagnostic construction scheme. As such, it doesn't belong to
the PathDiagnostic class, but to the Analyzer.
Differential Revision: https://reviews.llvm.org/D67418
llvm-svn: 371660
These static functions deal with ExplodedNodes which is something we don't want
the PathDiagnostic interface to know anything about, as it's planned to be
moved out of libStaticAnalyzerCore.
Differential Revision: https://reviews.llvm.org/D67382
llvm-svn: 371659
That's one of the few random entities in the PathDiagnostic interface that
are specific to the Static Analyzer. By moving them out we could let
everybody use path diagnostics without linking against Static Analyzer.
Differential Revision: https://reviews.llvm.org/D67381
llvm-svn: 371658
Checkers are now required to specify whether they're creating a
path-sensitive report or a path-insensitive report by constructing an
object of the respective type.
This makes BugReporter more independent from the rest of the Static Analyzer
because all Analyzer-specific code is now in sub-classes.
Differential Revision: https://reviews.llvm.org/D66572
llvm-svn: 371450
Allow attaching fixit hints to Static Analyzer BugReports.
Fixits are attached either to the bug report itself or to its notes
(path-sensitive event notes or path-insensitive extra notes).
Add support for fixits in text output (including the default text output that
goes without notes, as long as the fixit "belongs" to the warning).
Add support for fixits in the plist output mode.
Implement a fixit for the path-insensitive DeadStores checker. Only dead
initialization warning is currently covered.
Implement a fixit for the path-sensitive VirtualCall checker when the virtual
method is not pure virtual (in this case the "fix" is to suppress the warning
by qualifying the call).
Both fixits are under an off-by-default flag for now, because they
require more careful testing.
Differential Revision: https://reviews.llvm.org/D65182
llvm-svn: 371257
Most functions that our checkers react upon are not C-style variadic functions,
and therefore they have as many actual arguments as they have formal parameters.
However, it's not impossible to define a variadic function with the same name.
This will crash any checker that relies on CallDescription to check the number
of arguments but silently assumes that the number of parameters is the same.
Change CallDescription to check both the number of arguments and the number of
parameters by default.
If we're intentionally trying to match variadic functions, allow specifying
arguments and parameters separately (possibly omitting any of them).
For now we only have one CallDescription which would make use of those,
namely __builtin_va_start itself.
Differential Revision: https://reviews.llvm.org/D67019
llvm-svn: 371256
There are some functions which can't be given a null pointer as parameter either
because it has a nonnull attribute or it is declared to have undefined behavior
(e.g. strcmp()). Sometimes it is hard to determine from the checker message
which parameter is null at the invocation, so now this information is included
in the message.
This commit fixes https://bugs.llvm.org/show_bug.cgi?id=39358
Reviewed By: NoQ, Szelethus, whisperity
Patch by Tibor Brunner!
Differential Revision: https://reviews.llvm.org/D66333
llvm-svn: 370798
Enables the users to specify an optional flag which would warn for more dead
stores.
Previously it ignored if the dead store happened e.g. in an if condition.
if ((X = generate())) { // dead store to X
}
This patch introduces the `WarnForDeadNestedAssignments` option to the checker,
which is `false` by default - so this change would not affect any previous
users.
I have updated the code, tests and the docs as well. If I missed something, tell
me.
I also ran the analysis on Clang which generated 14 more reports compared to the
unmodified version. All of them seemed reasonable for me.
Related previous patches:
rGf224820b45c6847b91071da8d7ade59f373b96f3
Reviewers: NoQ, krememek, Szelethus, baloghadamsoftware
Reviewed By: Szelethus
Patch by Balázs Benics!
Differential Revision: https://reviews.llvm.org/D66733
llvm-svn: 370767
Range errors (dereferencing or incrementing the past-the-end iterator or
decrementing the iterator of the first element of the range) and access of
invalidated iterators lead to undefined behavior. There is no point to
continue the analysis after such an error on the same execution path, but
terminate it by a sink node (fatal error). This also improves the
performance and helps avoiding double reports (e.g. in case of nested
iterators).
Differential Revision: https://reviews.llvm.org/D62893
llvm-svn: 370314
Write tests for the actual crash that was found. Write comments and refactor
code around 17 style bugs and suppress 3 false positives.
Differential Revision: https://reviews.llvm.org/D66847
llvm-svn: 370246
It was known to be a compile-time constant so it wasn't evaluated during
symbolic execution, but it wasn't evaluated as a compile-time constant either.
Differential Revision: https://reviews.llvm.org/D66565
llvm-svn: 370245
If the global variable has an initializer, we'll ignore it because we're usually
not analyzing the program from the beginning, which means that the global
variable may have changed before we start our analysis.
However when we're analyzing main() as the top-level function, we can rely
on global initializers to still be valid. At least in C; in C++ we have global
constructors that can still break this logic.
This patch allows the Static Analyzer to load constant initializers from
global variables if the top-level function of the current analysis is main().
Differential Revision: https://reviews.llvm.org/D65361
llvm-svn: 370244
According to the SARIF specification, "a text region does not include the character specified by endColumn".
Differential Revision: https://reviews.llvm.org/D65206
llvm-svn: 370060
Summary: EnumCastOutOfRangeChecker should not perform enum range checks on LValueToRValue casts, since this type of cast does not actually change the underlying type. Performing the unnecessary check actually triggered an assertion failure deeper in EnumCastOutOfRange for certain input (which is captured in the accompanying test code).
Reviewers: #clang, Szelethus, gamesh411, NoQ
Reviewed By: Szelethus, gamesh411, NoQ
Subscribers: NoQ, gamesh411, xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, Charusso, bjope, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66014
llvm-svn: 369760
Our SVal hierarchy doesn't allow modeling pointer casts as no-op. The
pointer type is instead encoded into the pointer object. Defer to our
usual pointer casting facility, SValBuilder::evalBinOp().
Fixes a crash.
llvm-svn: 369729
As discussed on the mailing list, notes originating from the tracking of foreach
loop conditions are always meaningless.
Differential Revision: https://reviews.llvm.org/D66131
llvm-svn: 369613
Summary:
This patch introduces `DynamicCastInfo` similar to `DynamicTypeInfo` which
is stored in `CastSets` which are storing the dynamic cast informations of
objects based on memory regions. It could be used to store and check the
casts and prevent infeasible paths.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D66325
llvm-svn: 369605
In D65724, I do a pretty thorough explanation about how I'm solving this
problem, I think that summary nails whats happening here ;)
Differential Revision: https://reviews.llvm.org/D65725
llvm-svn: 369596
Exactly what it says on the tin! Note that we're talking about interestingness
in general, hence this isn't a control-dependency-tracking specific patch.
Differential Revision: https://reviews.llvm.org/D65724
llvm-svn: 369589
We defined (on the mailing list and here on phabricator) 2 different cases where
retrieving information about a control dependency condition is very important:
* When the condition's last write happened in a different stack frame
* When the collapse point of the condition (when we can constrain it to be
true/false) didn't happen in the actual condition.
It seems like we solved this problem with the help of expression value tracking,
and have started working on better diagnostics notes about this process.
Expression value tracking is nothing more than registering a variety of visitors
to construct reports about it. Each of the registered visitors (ReturnVisitor,
FindLastStoreVisitor, NoStoreFuncVisitor, etc) have something to go by: a
MemRegion, an SVal, an ExplodedNode, etc. For this reason, better explaining a
last write is super simple, we can always just pass on some more information to
the visitor in question (as seen in D65575).
ConditionBRVisitor is a different beast, as it was built for a different
purpose. It is responsible for constructing events at, well, conditions, and is
registered only once, and isn't a part of the "expression value tracking
family". Unfortunately, it is also the visitor to tinker with for constructing
better diagnostics about the collapse point problem.
This creates a need for alternative way to communicate with ConditionBRVisitor
that a specific condition is being tracked for for the reason of being a control
dependency. Since at almost all PathDiagnosticEventPiece construction the
visitor checks interestingness, it makes sense to pair interestingness with a
reason as to why we marked an entity as such.
Differential Revision: https://reviews.llvm.org/D65723
llvm-svn: 369583
Can't add much more to the title! This is part 1, the case where the collapse
point isn't in the condition point is the responsibility of ConditionBRVisitor,
which I'm addressing in part 2.
Differential Revision: https://reviews.llvm.org/D65575
llvm-svn: 369574
Add defensive check that prevents a crash when we try to evaluate a destructor
whose this-value is a concrete integer that isn't a null.
Differential Revision: https://reviews.llvm.org/D65349
llvm-svn: 369450
Calling a pure virtual method during construction or destruction
is undefined behavior. It's worth it to warn about it by default.
That part is now known as the cplusplus.PureVirtualCall checker.
Calling a normal virtual method during construction or destruction
may be fine, but does behave unexpectedly, as it skips virtual dispatch.
Do not warn about this by default, but let projects opt in into it
by enabling the optin.cplusplus.VirtualCall checker manually.
Give the two parts differentiated warning text:
Before:
Call to virtual function during construction or destruction:
Call to pure virtual function during construction
Call to virtual function during construction or destruction:
Call to virtual function during destruction
After:
Pure virtual method call:
Call to pure virtual method 'X::foo' during construction
has undefined behavior
Unexpected loss of virtual dispatch:
Call to virtual method 'Y::bar' during construction
bypasses virtual dispatch
Also fix checker names in consumers that support them (eg., clang-tidy)
because we now have different checker names for pure virtual calls and
regular virtual calls.
Also fix capitalization in the bug category.
Differential Revision: https://reviews.llvm.org/D64274
llvm-svn: 369449
Summary:
This patch introduces a new `analyzer-config` configuration:
`-analyzer-config silence-checkers`
which could be used to silence the given checkers.
It accepts a semicolon separated list, packed into quotation marks, e.g:
`-analyzer-config silence-checkers="core.DivideZero;core.NullDereference"`
It could be used to "disable" core checkers, so they model the analysis as
before, just if some of them are too noisy it prevents to emit reports.
This patch also adds support for that new option to the scan-build.
Passing the option `-disable-checker core.DivideZero` to the scan-build
will be transferred to `-analyzer-config silence-checkers=core.DivideZero`.
Reviewed By: NoQ, Szelethus
Differential Revision: https://reviews.llvm.org/D66042
llvm-svn: 369078
This is more of a temporary fix, long term, we should convert AnalyzerOptions.def
into the universally beloved (*coughs*) TableGen format, where they can more
easily be separated into developer-only, alpha, and user-facing configs.
Differential Revision: https://reviews.llvm.org/D66261
llvm-svn: 368980
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
Differential revision: https://reviews.llvm.org/D66259
llvm-svn: 368942
Well, what is says on the tin I guess!
Some more changes:
* Move isInevitablySinking() from BugReporter.cpp to CFGBlock's interface
* Rename and move findBlockForNode() from BugReporter.cpp to
ExplodedNode::getCFGBlock()
Differential Revision: https://reviews.llvm.org/D65287
llvm-svn: 368836
Exactly what it says on the tin! The comments in the code detail this a
little more too.
Differential Revision: https://reviews.llvm.org/D64272
llvm-svn: 368817
Summary:
Explicitly deleting the copy constructor makes compiling the function
`ento::registerGenericTaintChecker` difficult with some compilers. When we
construct an `llvm::Optional<TaintConfig>`, the optional is constructed with a
const TaintConfig reference which it then uses to invoke the deleted TaintConfig
copy constructor.
I've observered this failing with clang 3.8 on Ubuntu 16.04.
Reviewers: compnerd, Szelethus, boga95, NoQ, alexshap
Subscribers: xazax.hun, baloghadamsoftware, szepet, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, Charusso, llvm-commits, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D66192
llvm-svn: 368779
When we're tracking a variable that is responsible for a null pointer
dereference or some other sinister programming error, we of course would like to
gather as much information why we think that the variable has that specific
value as possible. However, the newly introduced condition tracking shows that
tracking all values this thoroughly could easily cause an intolerable growth in
the bug report's length.
There are a variety of heuristics we discussed on the mailing list[1] to combat
this, all of them requiring to differentiate in between tracking a "regular
value" and a "condition".
This patch introduces the new `bugreporter::TrackingKind` enum, adds it to
several visitors as a non-optional argument, and moves some functions around to
make the code a little more coherent.
[1] http://lists.llvm.org/pipermail/cfe-dev/2019-June/062613.html
Differential Revision: https://reviews.llvm.org/D64270
llvm-svn: 368777
Summary:
The following code snippet taken from D64271#1572188 has an issue: namely,
because `flag`'s value isn't undef or a concrete int, it isn't being tracked.
int flag;
bool coin();
void foo() {
flag = coin();
}
void test() {
int *x = 0;
int local_flag;
flag = 1;
foo();
local_flag = flag;
if (local_flag)
x = new int;
foo();
local_flag = flag;
if (local_flag)
*x = 5;
}
This, in my opinion, makes no sense, other values may be interesting too.
Originally added by rC185608.
Differential Revision: https://reviews.llvm.org/D64287
llvm-svn: 368773
During the evaluation of D62883, I noticed a bunch of totally
meaningless notes with the pattern of "Calling 'A'" -> "Returning value"
-> "Returning from 'A'", which added no value to the report at all.
This patch (not only affecting tracked conditions mind you) prunes
diagnostic messages to functions that return a value not constrained to
be 0, and are also linear.
Differential Revision: https://reviews.llvm.org/D64232
llvm-svn: 368771
I feel this is kinda important, because in a followup patch I'm adding different
kinds of interestingness, and propagating the correct kind in BugReporter.cpp is
just one less thing to worry about.
Differential Revision: https://reviews.llvm.org/D65578
llvm-svn: 368755
Apparently this does literally nothing.
When you think about this, it makes sense. If something is really important,
we're tracking it anyways, and that system is sophisticated enough to mark
actually interesting statements as such. I wouldn't say that it's even likely
that subexpressions are also interesting (array[10 - x + x]), so I guess even
if this produced any effects, its probably undesirable.
Differential Revision: https://reviews.llvm.org/D65487
llvm-svn: 368752
In D65379, I briefly described the construction of bug paths from an
ExplodedGraph. This patch is about refactoring the code processing the bug path
into a bug report.
A part of finding a valid bug report was running all visitors on the bug path,
so we already have a (possibly empty) set of diagnostics for each ExplodedNode
in it.
Then, for each diagnostic consumer, we construct non-visitor diagnostic pieces.
* We first construct the final diagnostic piece (the warning), then
* We start ascending the bug path from the error node's predecessor (since the
error node itself was used to construct the warning event). For each node
* We check the location (whether its a CallEnter, CallExit) etc. We simultaneously
keep track of where we are with the execution by pushing CallStack when we see a
CallExit (keep in mind that everything is happening in reverse!), popping it
when we find a CallEnter, compacting them into a single PathDiagnosticCallEvent.
void f() {
bar();
}
void g() {
f();
error(); // warning
}
=== The bug path ===
(root) -> f's CallEnter -> bar() -> f's CallExit -> (error node)
=== Constructed report ===
f's CallEnter -> bar() -> f's CallExit
^ /
\ V
(root) ---> f's CallEvent --> (error node)
* We also keep track of different PathPieces different location contexts
* (CallEvent::path in the above example has f's LocationContext, while the
CallEvent itself is in g's context) in a LocationContextMap object. Construct
whatever piece, if any, is needed for the note.
* If we need to generate edges (or arrows) do so. Make sure to also connect
these pieces with the ones that visitors emitted.
* Clean up the constructed PathDiagnostic by making arrows nicer, pruning
function calls, etc.
So I complained about mile long function invocations with seemingly the same
parameters being passed around. This problem, as I see it, a natural candidate
for creating classes and tying them all together.
I tried very hard to make the implementation feel natural, like, rolling off the
tongue. I introduced 2 new classes: PathDiagnosticBuilder (I mean, I kept the
name but changed almost everything in it) contains every contextual information
(owns the bug path, the diagnostics constructed but the visitors, the BugReport
itself, etc) needed for constructing a PathDiagnostic object, and is pretty much
completely immutable. BugReportContruct is the object containing every
non-contextual information (the PathDiagnostic object we're constructing, the
current location in the bug path, the location context map and the call stack I
meantioned earlier), and is passed around all over the place as a single entity
instead of who knows how many parameters.
I tried to used constness, asserts, limiting visibility of fields to my
advantage to clean up the code big time and dramatically improve safety. Also,
whenever I found the code difficult to understand, I added comments and/or
examples.
Here's a complete list of changes and my design philosophy behind it:
* Instead of construcing a ReportInfo object (added by D65379) after finding a
valid bug report, simply return an optional PathDiagnosticBuilder object straight
away. Move findValidReport into the class as a static method. I find
GRBugReporter::generatePathDiagnostics a joy to look at now.
* Rename generatePathDiagnosticForConsumer to generate (maybe not needed, but
felt that way in the moment) and moved it to PathDiagnosticBuilder. If we don't
need to generate diagnostics, bail out straight away, like we always should have.
After that, construct a BugReportConstruct object, leaving the rest of the logic
untouched.
* Move all static methods that would use contextual information into
PathDiagnosticBuilder, reduce their parameter count drastically by simply
passing around a BugReportConstruct object.
* Glance at the code I removed: Could you tell what the original
PathDiagnosticBuilder::LC object was for? It took a gooood long while for me to
realize that nothing really. It is always equal with the LocationContext
associated with our current position in the bug path. Remove it completely.
* The original code contains the following expression quite a bit:
LCM[&PD.getActivePath()], so what does it mean? I said that we collect the
contexts associated with different PathPieces, but why would we ever modify that,
shouldn't it be set? Well, theoretically yes, but in the implementation, the
address of PathDiagnostic::getActivePath doesn't change if we move to an outer,
previously unexplored function. Add both descriptive method names and
explanations to BugReportConstruct to help on this.
* Add plenty of asserts, both for safety and as a poor man's documentation.
Differential Revision: https://reviews.llvm.org/D65484
llvm-svn: 368737
When I'm new to a file/codebase, I personally find C++'s strong static type
system to be a great aid. BugReporter.cpp is still painful to read however:
function calls are made with mile long parameter lists, seemingly all of them
taken with a non-const reference/pointer. This patch fixes nothing but this:
make a few things const, and hammer it until it compiles.
Differential Revision: https://reviews.llvm.org/D65382
llvm-svn: 368735
find clang/ -type f -exec sed -i 's/std::shared_ptr<PathDiagnosticPiece>/PathDiagnosticPieceRef/g' {} \;
git diff -U3 --no-color HEAD^ | clang-format-diff-6.0 -p1 -i
Just as C++ is meant to be refactored, right?
Differential Revision: https://reviews.llvm.org/D65381
llvm-svn: 368717
This patch refactors the utility functions and classes around the construction
of a bug path.
At a very high level, this consists of 3 steps:
* For all BugReports in the same BugReportEquivClass, collect all their error
nodes in a set. With that set, create a new, trimmed ExplodedGraph whose leafs
are all error nodes.
* Until a valid report is found, construct a bug path, which is yet another
ExplodedGraph, that is linear from a given error node to the root of the graph.
* Run all visitors on the constructed bug path. If in this process the report
got invalidated, start over from step 2.
Now, to the changes within this patch:
* Do not allow the invalidation of BugReports up to the point where the trimmed
graph is constructed. Checkers shouldn't add bug reports that are known to be
invalid, and should use visitors and argue about the entirety of the bug path if
needed.
* Do not calculate indices. I may be biased, but I personally find code like
this horrible. I'd like to point you to one of the comments in the original code:
SmallVector<const ExplodedNode *, 32> errorNodes;
for (const auto I : bugReports) {
if (I->isValid()) {
HasValid = true;
errorNodes.push_back(I->getErrorNode());
} else {
// Keep the errorNodes list in sync with the bugReports list.
errorNodes.push_back(nullptr);
}
}
Not on my watch. Instead, use a far easier to follow trick: store a pointer to
the BugReport in question, not an index to it.
* Add range iterators to ExplodedGraph's successors and predecessors, and a
visitor range to BugReporter.
* Rename TrimmedGraph to BugPathGetter. Because that is what it has always been:
no sane graph type should store an iterator-like state, or have an interface not
exposing a single graph-like functionalities.
* Rename ReportGraph to BugPathInfo, because it is only a linear path with some
other context.
* Instead of having both and out and in parameter (which I think isn't ever
excusable unless we use the out-param for caching), return a record object with
descriptive getter methods.
* Where descriptive names weren't sufficient, compliment the code with comments.
Differential Revision: https://reviews.llvm.org/D65379
llvm-svn: 368694
The goal of this refactoring effort was to better understand how interestingness
was propagated in BugReporter.cpp, which eventually turned out to be a dead end,
but with such a twist, I wouldn't even want to spoil it ahead of time. However,
I did get to learn a lot about how things are working in there.
In these series of patches, as well as cleaning up the code big time, I invite
you to study how BugReporter.cpp operates, and discuss how we could design this
file to reduce the horrible mess that it is.
This patch reverts a great part of rC162028, which holds the title "Allow
multiple PathDiagnosticConsumers to be used with a BugReporter at the same
time.". This, however doesn't imply that there's any need for multiple "layers"
or stacks of interesting symbols and regions, quite the contrary, I would argue
that we would like to generate the same amount of information for all output
types, and only process them differently.
Differential Revision: https://reviews.llvm.org/D65378
llvm-svn: 368689
Summary:
A condition could be a multi-line expression where we create the highlight
in separated chunks. PathDiagnosticPopUpPiece is not made for that purpose,
it cannot be added to multiple lines because we have only one ending part
which contains all the notes. So that it cannot have multiple endings and
therefore this patch narrows down the ranges of the highlight to the given
interesting variable of the condition. It prevents HTML-breaking injections.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D65663
llvm-svn: 368382
This patch is a prerequisite for using LangStandard from Driver in
https://reviews.llvm.org/D64793.
It moves LangStandard* and InputKind::Language to Basic. It is mostly
mechanical, with only a few changes of note:
- enum Language has been changed into enum class Language : uint8_t to
avoid a clash between OpenCL in enum Language and OpenCL in enum
LangFeatures and not to increase the size of class InputKind.
- Now that getLangStandardForName, which is currently unused, also checks
both canonical and alias names, I've introduced a helper getLangKind
which factors out a code pattern already used 3 times.
The patch has been tested on x86_64-pc-solaris2.11, sparcv9-sun-solaris2.11,
and x86_64-pc-linux-gnu.
There's a companion patch for lldb which uses LangStandard.h
(https://reviews.llvm.org/D65717).
While polly includes isl which in turn uses InputKind::C, that part of the
code isn't even built inside the llvm tree. I've posted a patch to allow
for both InputKind::C and Language::C upstream
(https://groups.google.com/forum/#!topic/isl-development/6oEvNWOSQFE).
Differential Revision: https://reviews.llvm.org/D65562
llvm-svn: 367864
Summary:
It allows discriminating between stack frames of the same call that is
called multiple times in a loop.
Thanks to Artem Dergachev for the great idea!
Reviewed By: NoQ
Tags: #clang
Differential Revision: https://reviews.llvm.org/D65587
llvm-svn: 367608
While we implemented taint propagation rules for several
builtin/standard functions, there's a natural desire for users to add
such rules to custom functions.
A series of patches will implement an option that allows users to
annotate their functions with taint propagation rules through a YAML
file. This one adds parsing of the configuration file, which may be
specified in the commands line with the analyzer config:
alpha.security.taint.TaintPropagation:Config. The configuration may
contain propagation rules, filter functions (remove taint) and sink
functions (give a warning if it gets a tainted value).
I also added a new header for future checkers to conveniently read YAML
files as checker options.
Differential Revision: https://reviews.llvm.org/D59555
llvm-svn: 367190
Summary:
When cross TU analysis is used it is possible that a macro expansion
is generated for a macro that is defined (and used) in other than
the main translation unit. To get the expansion for it the source
location in the original source file and original preprocessor
is needed.
Reviewers: martong, xazax.hun, Szelethus, ilya-biryukov
Reviewed By: Szelethus
Subscribers: mgorny, NoQ, ilya-biryukov, rnkovacs, dkrupp, Szelethus, gamesh411, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D64638
llvm-svn: 367006