instead push the terminator for the branch down into the basic blocks of the subexpressions of '&&' and '||'
respectively. This eliminates some artifical control-flow from the CFG and results in a more
compact CFG.
Note that this patch only alters the branches 'while', 'if' and 'for'. This was complex enough for
one patch. The remaining branches (e.g., do...while) can be handled in a separate patch, but they
weren't immediately tackled because they were less important.
It is possible that this patch introduces some subtle bugs, particularly w.r.t. to destructor placement.
I've tried to audit these changes, but it is also known that the destructor logic needs some refinement
in the area of '||' and '&&' regardless (i.e., their are known bugs).
llvm-svn: 160218
Previously we were using the static type of the base object to inline
methods, whether virtual or non-virtual. Now, we try to see if the base
object has a known type, and if so ask for its implementation of the method.
llvm-svn: 160094
This is probably not so useful yet because it is not path-sensitive, though
it does try to show inlining with indentation.
This also adds a dump() method to CallEvent, which should be useful for
debugging.
llvm-svn: 160030
C++ method calls and C function calls both appear as CallExprs in the AST.
This was causing crashes for an object that had a 'free' method.
<rdar://problem/11822244>
llvm-svn: 160029
Also contains a number of tweaks to inlining that are necessary
for constructors and destructors. (I have this enabled on a private
branch, but it is very much unstable.)
llvm-svn: 160023
In order to accomplish this, we now build the callee's stack frame
as part of the CallEnter node, rather than the subsequent BlockEdge node.
This should not have any effect on perceived behavior or diagnostics.
This makes it safe to re-enable inlining of member overloaded operators.
llvm-svn: 160022
While this work is still fairly tentative (destructors are still left out of
the CFG by default), we now handle destructors in the same way as any other
calls, instead of just automatically trying to inline them.
llvm-svn: 160020
These are currently unused, but are intended to be used in lieu of PreStmt
and PostStmt when the call is implicit (e.g. an automatic object destructor).
This also modifies the Data1 field of ProgramPoints to allow storing any
pointer-sized value, as opposed to only aligned pointers. This is necessary
to store SourceLocations.
There is currently no BugReporter support for these; they should be skipped
over in any diagnostic output.
This commit also tags checkers that currently rely on function calls only
occurring at StmtPoints.
llvm-svn: 160019
This was a regression introduced during the CallEvent changes; a call to
FunctionDecl::hasBody was also being used to replace the decl found by
lookup with the actual definition. To keep from making this mistake again
(particularly if/when we start inlining Objective-C methods), this commit
adds a "getDefinition()" method to CallEvent, which should do the right
thing under any circumstances.
llvm-svn: 159940
We use LazyCompoundVals to avoid copying the contents of structs and arrays
around in the store, and when we need to pass a struct around that already
has a LazyCompoundVal we just use the original one. However, it's possible
that the first field of a struct may have a LazyCompoundVal of its own, and
we currently can't distinguish a LazyCompoundVal for the first element of a
struct from a LazyCompoundVal for the entire struct. In this case we should
just drop the optimization and make a new LazyCompoundVal that encompasses
the old one.
PR13264 / <rdar://problem/11802440>
llvm-svn: 159866
very simple semantic analysis that just builds the AST; minor changes for lexer
to pick up source locations I didn't think about before.
Comments AST is modelled along the ideas of HTML AST: block and inline content.
* Block content is a paragraph or a command that has a paragraph as an argument
or verbatim command.
* Inline content is placed within some block. Inline content includes plain
text, inline commands and HTML as tag soup.
llvm-svn: 159790
This required moving the ctors for IntegerLiteral and FloatingLiteral out of
line which shouldn't change anything as they are usually called through Create
methods that are already out of line.
ASTContext::Deallocate has been a nop for a long time, drop it from ASTVector
and make it independent from ASTContext.h
Pass the StorageAllocator directly to AccessedEntity so it doesn't need to
have a definition of ASTContext around.
llvm-svn: 159718
Our current inlining support (specifically RegionStore::enterStackFrame)
doesn't know that calls to overloaded operators may be calls to non-static
member functions, and that in these cases the first argument should be
treated as 'this'. This caused incorrect results and sometimes crashes.
The long-term fix will be to rewrite RegionStore::enterStackFrame to use
CallEvent and its subclasses, but for now we can just disable these
problematic calls by classifying them under a new CallEvent,
CXXMemberOperatorCall.
llvm-svn: 159692
...and instead add an accessor. We're not using this today, but it's something
that should probably stay in the source for potential clients, and it doesn't
cost a lot. (ObjCPropertyAccess is only created on the stack, and right now
there's only ever one alive at a time.)
This reverts r159581 / commit 8e674e1da34a131faa7d43dc3fcbd6e49120edbe.
llvm-svn: 159595
we are encountering some scalability issues with memory usage. The
appropriate long term fix is to make the analysis more scalable, but
this will at least prevent the analyzer swapping when
analyzing very large functions.
llvm-svn: 159578
in the call graph had been inlined but for whatever reason we did not inline some
of its callees.
Also, fix a related traversal bug where we meant to do a BFS of the callgraph but
instead were doing a DFS.
llvm-svn: 159577
The preObjCMessage and postObjCMessage callbacks now take an ObjCMethodCall
argument, which can represent an explicit message send (ObjCMessageSend) or an
implicit message generated by a property access (ObjCPropertyAccess).
llvm-svn: 159559
Previously, the CallEvent subclass ObjCMessageInvocation was just a wrapper
around the existing ObjCMessage abstraction (over message sends and property
accesses). Now, we have abstract CallEvent ObjCMethodCall with subclasses
ObjCMessageSend and ObjCPropertyAccess.
In addition to removing yet another wrapper object, this should make it easy
to add a ObjCSubscriptAccess call event soon.
llvm-svn: 159558
This involved refactoring some common pointer-escapes code onto CallEvent,
then having MallocChecker use those callbacks for whether or not to consider
a pointer's /ownership/ as escaping. This still needs to be pinned down, and
probably we want to make the new argumentsMayEscape() function a little more
discerning (content invalidation vs. ownership/metadata invalidation), but
this is a good improvement.
As a bonus, also remove CallOrObjCMessage from the source completely.
llvm-svn: 159557
Both of these got uglier rather than cleaner because we don't have preCall and
postCall yet; properly wrapping a CallExpr in a CallEvent requires doing a bit
of deconstruction on the callee. Even when we have preCall and postCall we may
want to expose the current CallEvent to pre/postStmt<CallExpr>.
llvm-svn: 159556
This ended allowing quite a bit of cleanup, and some minor changes.
- CallEvent makes it easy to use hasNonZeroCallbackArg more aggressively, which
we check in order to avoid false positives with callbacks that might release
the object.
- In order to support this for functions which consume their arguments, there
are two new ArgEffects: DecRefAndStopTracking and DecRefMsgAndStopTracking.
These act just like StopTracking, except that if the object only had a
return count of +1 it's now considered released instead (so we still get
use-after-free messages).
- On the plus side, we no longer have to special-case
+[NSObject performSelector:withObject:afterDelay:] and friends.
- The use of IdentifierInfos in the method summary cache is now hidden; only
the ObjCInterfaceDecl gets passed around most of the time.
- Since we cache all "simple" summaries and check every function call, there is
no real benefit to having NULL stand in for default summaries anymore.
- Whitespace, unused methods, etc.
Even more simplification to come when we get check::postCall and can unify all
these other post* checks.
llvm-svn: 159555
This is intended to replace CallOrObjCMessage, and is eventually intended to be
used for anything that cares more about /what/ is being called than /how/ it's
being called. For example, inlining destructors should be the same as inlining
blocks, and checking __attribute__((nonnull)) should apply to the allocator
calls generated by operator new.
llvm-svn: 159554
The solution is a bit inefficient: it creates N checkers, one for each check, and
each check does a dispatch on the function name. This is redundant, but we can fix
this once we have the proper ability to enable/disable subchecks.
Fixes <rdar://problem/11780180>.
llvm-svn: 159459
Previously:
...the comment said DFS...
...the WorkList being instantiated said BFS...
...and the implementation was actually DFS...
...due to an unintentional change in 2010...
...and everything kept working anyway.
This fixes our std::deque implementation of BFS, but switches back to a
SmallVector-based implementation of DFS.
We should probably still investigate the ramifications of DFS vs. BFS,
especially for large functions (and especially when we hit our block path
limit), since this might completely change our memory use. It can also mask
some bugs and reveal others depending on when we halt analysis. But at least
we will not have this kind of little mistake creep in again.
llvm-svn: 159397
The implicit global allocation functions do not have valid source locations,
but we still want to treat them as being "system header" functions for the
purposes of how they affect program state.
llvm-svn: 159160
We don't handle exceptions yet, so we treat them as sinks. ExprEngine
hardcodes messages that are known to raise Objective-C exceptions like -raise,
but it was only checking for +raise:format: and +raise:format:arguments: on
NSException itself, not subclasses.
<rdar://problem/11724201>
llvm-svn: 159010
express library-level dependencies within Clang.
This is no more verbose really, and plays nicer with the rest of the
CMake facilities. It should also have no change in functionality.
llvm-svn: 158888
This commits sets the grounds for more aggressive use after free
checking. We will use the Relinquished sate to denote that someone
else is now responsible for releasing the memory.
llvm-svn: 158850
The default global placement new just returns the pointer it is given.
Note that other custom 'new' implementations with placement args are not
guaranteed to do this.
In addition, we need to invalidate placement args, since they may be updated by
the allocator function. (Also, right now we don't properly handle the
constructor inside a CXXNewExpr, so we need to invalidate the placement args
just so that callers know something changed!)
This invalidation is not perfect because CallOrObjCMessage doesn't support
CXXNewExpr, and all of our invalidation callbacks expect that if there's no
CallOrObjCMessage, the invalidation is happening manually (e.g. by a direct
assignment) and shouldn't affect checker-specific metadata (like malloc state);
hence the malloc test case in new-fail.cpp. But region values are now
properly invalidated, at least.
The long-term solution to this problem is to rework CallOrObjCMessage into
something more general, rather than the morass of branches it is today.
<rdar://problem/11679031>
llvm-svn: 158784
This happens in C++ mode right at the declaration of a struct VLA;
MallocChecker sees a bind and tries to get see if it's an escaping bind.
It's likely that our handling of this is still incomplete, but it fixes a
crash on valid without disturbing anything else for now.
llvm-svn: 158587
Specifically, although the bitmap context does not take ownership of the
buffer (unlike CGBitmapContextCreateWithData), the data buffer can be extracted
out of the created CGContextRef. Thus the buffer is not leaked even if its
original pointer goes out of scope, as long as
- the context escapes, or
- it is retrieved via CGBitmapContextGetData and freed.
Actually implementing that logic is beyond the current scope of MallocChecker,
so for now CGBitmapContextCreate goes on our system function exception list.
llvm-svn: 158579
We already didn't track objects that have delegates or callbacks or
objects that are passed through void * "context pointers". It's a
not-uncommon pattern to release the object in its callback, and so
the leak message we give is not very helpful.
llvm-svn: 158532
* Add \brief to produce a summary in the Doxygen output;
* Add missing parameter names to \param commands;
* Fix mismatched parameter names for \param commands;
* Add a parameter name so that the \param has a target.
llvm-svn: 158503
This does not actually give us the right behavior for reinterpret_cast
of references. Reverting so I can think about it some more.
This reverts commit 50a75a6e26a49011150067adac556ef978639fe6.
llvm-svn: 158341
These casts only appear in very well-defined circumstances, in which the
target of a reinterpret_cast or a function formal parameter is an lvalue
reference. According to the C++ standard, the following are equivalent:
reinterpret_cast<T&>( x)
*reinterpret_cast<T*>(&x)
[expr.reinterpret.cast]p11
llvm-svn: 158338
While collections containing nil elements can still be iterated over in an
Objective-C for-in loop, the most common Cocoa collections -- NSArray,
NSDictionary, and NSSet -- cannot contain nil elements. This checker adds
that assumption to the analyzer state.
This was the cause of some minor false positives concerning CFRelease calls
on objects in an NSArray.
llvm-svn: 158319
This has a small hit in the case where only one class is interesting
(NilArgChecker) but is a big improvement when looking for one of several
interesting classes (VariadicMethodTypeChecker), in which the most common
case is that there is no match.
llvm-svn: 158318
to addition.
We should not to warn in case the malloc size argument is an
addition containing 'sizeof' operator - it is common to use the pattern
to pack values of different sizes into a buffer.
Ex:
uint8_t *buffer = (uint8_t*)malloc(dataSize + sizeof(length));
llvm-svn: 158219
CmpRuns.py can be used to compare issues from different analyzer runs.
Since it uses the issue line number to unique 2 issues, adding a new
line to the beginning of a file makes all issues in the file reported as
new.
The hash will be an opaque value which could be used (along with the
function name) by CmpRuns to identify the same issues. This way, we only
fail to identify the same issue from two runs if the function it appears
in changes (not perfect, but much better than nothing).
llvm-svn: 158180
I falsely assumed that the memory spaces are equal when we reach this
point, they might not be when memory space of one or more is stack or
Unknown. We don't want a region from Heap space alias something with
another memory space.
llvm-svn: 158165
Add a concept of symbolic memory region belonging to heap memory space.
When comparing symbolic regions allocated on the heap, assume that they
do not alias.
Use symbolic heap region to suppress a common false positive pattern in
the malloc checker, in code that relies on malloc not returning the
memory aliased to other malloc allocations, stack.
llvm-svn: 158136
In addition, I've made the pointer and reference typedef 'void' rather than T*
just so they can't get misused. I would've omitted them entirely but
std::distance likes them to be there even if it doesn't use them.
This rolls back r155808 and r155869.
Review by Doug Gregor incorporating feedback from Chandler Carruth.
llvm-svn: 158104
When we timeout or exceed a max number of blocks within an inlined
function, we retry with no inlining starting from a node right before
the CallEnter node. We assume the state of that node is the state of the
program before we start evaluating the call. However, the node pruning
removes this node as unimportant.
Teach the node pruning to keep the predecessors of the call enter nodes.
llvm-svn: 157860
We should lock the number of elements after the initial parsing is
complete. Recursive AST visitors in AnalyzesConsumer and CallGarph can
trigger lazy pch deserialization resulting in more calls to
HandleTopLevelDecl and appending to the LocalTUDecls list. We should
ignore those.
llvm-svn: 157762
improved the pruning heuristics. The current heuristics are pretty good, but they make diagnostics
for uninitialized variables warnings particularly useless in some cases.
llvm-svn: 157734
The new debug.ExprInspection checker looks for calls to clang_analyzer_eval,
and emits a warning of TRUE, FALSE, or UNKNOWN (or UNDEFINED) based on the
constrained value of its (boolean) argument. It does not modify the analysis
state though the conditions tested can result in branches (e.g. through the
use of short-circuit operators).
llvm-svn: 156919
We check the address of the last element accessed, but with 0 calculating that
address results in element -1. This patch bails out early (and avoids a bunch
of other work at that).
Fixes PR12807.
llvm-svn: 156769
a horrible bug in GetLazyBindings where we falsely appended a field suffix when traversing 3 or more
layers of lazy bindings. I don't have a reduced test case yet; but I have added the original source
to an internal regression test suite. I'll see about coming up with a reduced test case.
Fixes <rdar://problem/11405978> (for real).
llvm-svn: 156580
to reason about.
As part of taint propagation, we now allow creation of non-integer
symbolic expressions like a cast from int to float.
Addresses PR12511 (radar://11215362).
llvm-svn: 156578
We report a leak at a point a leaked variable is no longer accessible.
The statement that happens to be at that point is not relevant to the
leak diagnostic and, thus, should not be highlighted.
radar://11178519
llvm-svn: 156530
RegionStore, so be explicit about it and generate UnknownVal().
This is a hack to ensure we never produce undefined values for a value
coming from a compound value. (The undefined values can lead to
false positives.)
radar://10127782
llvm-svn: 156446
disruptive, but it allows RegionStore to better "see" through casts that reinterpret arrays of values
as structs. Fixes <rdar://problem/11405978>.
llvm-svn: 156428
don't reason about.
Self is just like a local variable in init methods, so it can be
assigned anything like result of static functions, other methods ... So
to suppress false positives that result in such cases, stop tracking the
checker-specific state after self is being assigned to (unless the
value is't being assigned to is either self or conforms to our rules).
This change does not invalidate any existing regression tests.
llvm-svn: 156420
This could conceivably cut down on state proliferation, although we don't
use BasicConstraintManager by default anymore. No functionality change.
llvm-svn: 156362
This involves keeping track of three separate types: the symbol type, the
adjustment type, and the comparison type. For example, in "$x + 5 > 0ULL",
if the type of $x is 'signed char', the adjustment type is 'int' and the
comparison type is 'unsigned long long'. Most of the time these three types
will be the same, but we should still do the right thing when the
comparison value is out of range, and wraparound should be calculated in
the adjustment type.
This also re-disables an out-of-bounds test; we were extracting the symbol
from non-additive SymIntExprs, but then throwing away the integer.
Sorry for the large patch; both the basic and range constraint managers needed
to be updated together, since they share code in SimpleConstraintManager.
llvm-svn: 156361
There are more parts of the analyzer that could use the convenience of APSIntType, particularly the constraint engine, but that needs a fair amount of rewriting to handle mixed-type constraints anyway.
llvm-svn: 156360
SValBuilder should return an UnknownVal() when comparison of int and ptr
fails. Previous to this commit, it went on assuming that we are dealing
with pointer arithmetic.
PR12509, radar://11390991
llvm-svn: 156320
The logical change is that the integers in SymIntExprs may not have the same type as the symbols they are paired with. This was already the case with taint-propagation expressions created by SValBuilder::makeSymExprValNN, but I think those integers may never have been used. SimpleSValBuilder should be able to handle mixed-integer-type SymIntExprs fine now, though, and the constraint managers were already being defensive (though not entirely correct). All existing tests pass.
The logic in evalBinOpNN has been simplified so that conversion is done as late as possible. As a result, most of the switch cases have been reduced to do the minimal amount of work, delegating to another case when they can by substituting ConcreteInts and (as before) reversing the left and right arguments when useful.
Comparisons require special handling in two places (building SymIntExprs and evaluating constant-constant operations) because we don't /know/ the best type for comparing the two values. I've approximated the rules in Sema [C99 6.3.1.8] but it'd be nice to refactor Sema's actual algorithm into ASTContext.
This is also groundwork for handling mixed-type constraints better than we do now.
llvm-svn: 156270
specifically checks for equality to null.
Enforcing this general practice, which keeps the analyzer less
noisy, in the CString Checker. This change suppresses "Assigned value is
garbage or undefined" warning in the added test case.
llvm-svn: 156085
We need to identify the value of ptr as
ElementRegion (result of pointer arithmetic) in the following code.
However, before this commit '(2-x)' evaluated to Unknown value, and as
the result, 'p + (2-x)' evaluated to Unknown value as well.
int *p = malloc(sizeof(int));
ptr = p + (2-x);
llvm-svn: 156052
The resulting type info is stored in the SymSymExpr, so no reason not to
support construction of expression with different subexpression types.
llvm-svn: 156051
The change resulted in multiple issues on the buildbot, so it's not
ready for prime time. Only enable history tracking for tainted
data(which is experimental) for now.
llvm-svn: 156049
values through interesting expressions. This allows us to map from interesting values in a caller
to interesting values in a caller, thus recovering some precision in diagnostics lost from IPA.
Fixes <rdar://problem/11327497>
llvm-svn: 155971
reason about the expression.
This essentially keeps more history about how symbolic values were
constructed. As an optimization, previous to this commit, we only kept
the history if one of the symbols was tainted, but it's valuable keep
the history around for other purposes as well: it allows us to avoid
constructing conjured symbols.
Specifically, we need to identify the value of ptr as
ElementRegion (result of pointer arithmetic) in the following code.
However, before this commit '(2-x)' evaluated to Unknown value, and as
the result, 'p + (2-x)' evaluated to Unknown value as well.
int *p = malloc(sizeof(int));
ptr = p + (2-x);
This change brings 2% slowdown on sqlite. Fixes radar://11329382.
llvm-svn: 155944
filter_decl_iterator had a weird mismatch where both op* and op-> returned T*
making it difficult to generalize this filtering behavior into a reusable
library of any kind.
This change errs on the side of value, making op-> return T* and op* return
T&.
(reviewed by Richard Smith)
llvm-svn: 155808
of a mutable SmallPtrSet. While iterating over LocalTUDecls, there were cases
where we could modify LocalTUDecls, which could result in invalidating an iterator
and an analyzer crash. Along the way, switch some uses of std::queue to std::dequeue,
which should be slightly more efficient.
Unfortunately, this is a difficult case to create a test case for.
llvm-svn: 155680
This is needed to ensure that we always report issues in the correct
function. For example, leaks are identified when we call remove dead
bindings. In order to make sure we report a callee's leak in the callee,
we have to run the operation in the callee's context.
This change required quite a bit of infrastructure work since:
- We used to only run remove dead bindings before a given statement;
here we need to run it after the last statement in the function. For
this, we added additional Program Point and special mode in the
SymbolReaper to remove all symbols in context lower than the current
one.
- The call exit operation turned into a sequence of nodes, which are
now guarded by CallExitBegin and CallExitEnd nodes for clarity and
convenience.
(Sorry for the long diff.)
llvm-svn: 155244
attached. Since we do not support any attributes which appertain to a statement
(yet), testing of this is necessarily quite minimal.
Patch by Alexander Kornienko!
llvm-svn: 154723
We should not deserialize unused declarations from the PCH file. Achieve
this by storing the top level declarations during parsing
(HandleTopLevelDecl ASTConsumer callback) and analyzing/building a call
graph only for those.
Tested the patch on a sample ObjC file that uses PCH. With the patch,
the analyzes is 17.5% faster and clang consumes 40% less memory.
Got about 10% overall build/analyzes time decrease on a large Objective
C project.
A bit of CallGraph refactoring/cleanup as well..
llvm-svn: 154625
As per Jordy's review. Creating a symbol here is more flexible; however
I could not come up with an example where it was needed. (What
constrains can be added on of the symbol constrained to 0?)
llvm-svn: 154542
(Applied changes to CStringAPI, Malloc, and Taint.)
This might almost never happen, but we should not crash even if it does.
This fixes a crash on the internal analyzer buildbot, where postgresql's
configure was redefining memmove (radar://11219852).
llvm-svn: 154451
we use the same Expr* as the one being currently visited. This is preparation for transitioning to having
ProgramPoints refer to CFGStmts.
This required a bit of trickery. We wish to keep the old Expr* bindings in the Environment intact,
as plenty of logic relies on it and there is no reason to change it, but we sometimes want the Stmt* for
the ProgramPoint to be different than the Expr* being used for bindings. This requires adding an extra
argument for some functions (e.g., evalLocation). This looks a bit strange for some clients, but
it will look a lot cleaner when were start using CFGStmt* in the appropriate places.
As some fallout, the diagnostics arrows are a bit difference, since some of the node locations have changed.
I have audited these, and they look reasonable.
llvm-svn: 154214
consolidate some commonly used category strings into global references (more of this can be done, I just did a few).
Fixes <rdar://problem/11191537>.
llvm-svn: 154121
Store this info inside the function summary generated for all analyzed
functions. This is useful for coverage stats and can be helpful for
analyzer state space search strategies.
llvm-svn: 153923
properly reason about such accesses, but we shouldn't emit bogus "uninitialized value" warnings
either. Fixes <rdar://problem/11127008>.
llvm-svn: 153913
Fixes a false positive (radar://11152419). The current solution of
adding the info into 3 places is quite ugly. Pending a generic pointer
escapes callback.
llvm-svn: 153731
count.
This is an optimization for "retry without inlining" option. Here, if we
failed to inline a function due to reaching the basic block max count,
we are going to store this information and not try to inline it
again in the translation unit. This can be viewed as a function summary.
On sqlite, with this optimization, we are 30% faster then before and
cover 10% more basic blocks (partially because the number of times we
reach timeout is decreased by 20%).
llvm-svn: 153730
The analyzer gives up path exploration under certain conditions. For
example, when the same basic block has been visited more than 4 times.
With inlining turned on, this could lead to decrease in code coverage.
Specifically, if we give up inside the inlined function, the rest of
parent's basic blocks will not get analyzed.
This commit introduces an option to enable re-run along the failed path,
in which we do not inline the last inlined call site. This is done by
enqueueing the node before the processing of the inlined call site
with a special policy encoded in the state. The policy tells us not to
inline the call site along the path.
This lead to ~10% increase in the number of paths analyzed. Even though
we expected a much greater coverage improvement.
The option is turned off by default for now.
llvm-svn: 153534
Report root function name with exhausted block diagnostic.
Also, use stack frames, not just any location context when checking if
the basic block is in the same context.
llvm-svn: 153532
assigned to a struct. This is fallout from inlining results, which expose
far more patterns where people stuff CF objects into structs and pass them
around (and we can reason about it). The problem is that we don't have
a general way to detect when values have escaped, so as an intermediate step
we need to eagerly prune out such tracking.
Fixes <rdar://problem/11104566>.
llvm-svn: 153489
This required adding a change count token to BugReport, but also allowed us to ditch ImmutableList as the BugReporterVisitor data type.
Also, remove the hack from MallocChecker, now that visitors appear in the opposite order. This is not exactly a fix, but the common case -- custom diagnostics after generic ones -- is now the default behavior.
llvm-svn: 153369
Specifically, we use the last store of the leaked symbol in the leak diagnostic.
(No support for struct fields since the malloc checker doesn't track those
yet.)
+ Infrastructure to track the regions used in store evaluations.
This approach is more precise than iterating the store to
obtain the region bound to the symbol, which is used in RetainCount
checker. The region corresponds to what is uttered in the code in the
last store and we do not rely on the store implementation to support
this functionality.
llvm-svn: 153212
This is accomplished by calling markInteresting /during/ path diagnostic generation, and as such relies on deterministic ordering of BugReporterVisitors -- namely, that BugReporterVisitors are run in /reverse/ order from how they are added. (Right now that's a consequence of storing visitors in an ImmutableList, where new items are added to the front.) It's a little hacky, but it works for now.
I think this is the best we can do without storing the relation between the old and new symbols, and that would be a hit whether or not there ends up being an error.
llvm-svn: 153010
I tried to test the effects of this change on memory usage and run time, but what I saw on retain-release.m was indistinguishable from noise (debug and release builds). Even so, some caveman profiling showed 101 cache hits that we would have generated new summaries for before (i.e. not default or stop summaries), and the more code we analyze, the more memory we should save.
Maybe we should have a standard project for benchmarking the retain count checker's memory and time?
llvm-svn: 153007
The cocoa::deriveNamingConventions helper is just using method families anyway now, and the way RetainSummaryTemplate works means we're allocating an extra summary for every method with a relevant family.
Also, fix RetainSummaryTemplate to do the right thing w/r/t annotating an /existing/ summary. This was probably the real cause of <rdar://problem/10824732> and the fix in r152448.
llvm-svn: 152998
The symbol-aware stack hint combines the checker-provided message
with the information about how the symbol was passed to the callee: as
a parameter or a return value.
For malloc, the generated messages look like this :
"Returning from 'foo'; released memory via 1st parameter"
"Returning from 'foo'; allocated memory via 1st parameter"
"Returning from 'foo'; allocated memory returned"
"Returning from 'foo'; reallocation of 1st parameter failed"
(We are yet to handle cases when the symbol is a field in a struct or
an array element.)
llvm-svn: 152962
The only use of AggExprVisitor was in #if 0 code (the analyzer's incomplete C++ support), so there is no actual behavioral change anyway.
llvm-svn: 152856
BugVisitor DiagnosticPieces.
When checkers create a DiagnosticPieceEvent, they can supply an extra
string, which will be concatenated with the call exit message for every
call on the stack between the diagnostic event and the final bug report.
(This is a simple version, which could be/will be further enhanced.)
For example, this is used in Malloc checker to produce the ",
which allocated memory" in the following example:
static char *malloc_wrapper() { // 2. Entered call from 'use'
return malloc(12); // 3. Memory is allocated
}
void use() {
char *v;
v = malloc_wrapper(); // 1. Calling 'malloc_wrappers'
// 4. Returning from 'malloc_wrapper', which allocated memory
} // 5. Memory is never released; potential
memory leak
llvm-svn: 152837
inlining to be the reverse of their declaration.
This optimizes running time under inlining up to 20% since we do not
re-analyze the utility functions which are usually defined first in the
translation unit if they have already been analyzed while inlined into
the root functions.
llvm-svn: 152653
BFS should give slightly better performance. Ex: Suppose, we have two
roots R1 and R2. A callee function C is reachable through both. However,
C is not inlined when analyzing R1 due to inline stack depth limit. With
DFS, C will be analyzed as top level even though it would be analyzed as
inlined through R2. On the other hand, BFS could avoid analyzing C as
top level.
llvm-svn: 152652
AnalysisConsumer.
As a result:
- We now analyze the C++ methods which are defined within the
class body. These were completely skipped before.
- Ensure that AST checkers are called on functions in the
order they are defined in the Translation unit.
llvm-svn: 152650
track whether the referenced declaration comes from an enclosing
local context. I'm amenable to suggestions about the exact meaning
of this bit.
llvm-svn: 152491
as aborted, but didn't treat such cases as sinks in the ExplodedGraph.
Along the way, add basic support for CXXCatchStmt, expanding the set of code we actually analyze (hopefully correctly).
Fixes: <rdar://problem/10892489>
llvm-svn: 152468
We do not reanalyze a function, which has already been analyzed as an
inlined callee. As per PRELIMINARY testing, this gives over
50% run time reduction on some benchmarks without decreasing of the
number of bugs found.
Turning the mode on by default.
llvm-svn: 152440
Essentially, a bug centers around a story for various symbols and regions. We should only include
the path diagnostic events that relate to those symbols and regions.
The pruning is done by associating a set of interesting symbols and regions with a BugReporter, which
can be modified at BugReport creation or by BugReporterVisitors.
This patch reduces the diagnostics emitted in several of our test cases. I've vetted these as
having desired behavior. The only regression is a missing null check diagnostic for the return
value of realloc() in test/Analysis/malloc-plist.c. This will require some investigation to fix,
and I have added a FIXME to the test case.
llvm-svn: 152361
analyzed.
The CallGraph is used when inlining is on, which is the current default.
This alone does not bring any performance improvement. It's a
stepping stone for the upcoming optimization in which we do not
re-analyze a function that has already been analyzed while inlined in
other functions. Using the call graph makes it easier to play with
the order of functions to minimize redundant analyzes.
llvm-svn: 152352
The final graph contains a single root node, which is a parent of all externally available functions(and 'main'). As well as a list of Parentless/Unreachable functions, which are either truly unreachable or are unreachable due to our analyses imprecision.
The analyzer checkers debug.DumpCallGraph or debug.ViewGraph can be used to look at the produced graph.
Currently, the graph is not very precise, for example, it entirely skips edges resulted from ObjC method calls.
llvm-svn: 152272
analysis to make the AST representation testable. They are represented by a
new UserDefinedLiteral AST node, which is a sugared CallExpr. All semantic
properties, including full CodeGen support, are achieved for free by this
representation.
UserDefinedLiterals can never be dependent, so no custom instantiation
behavior is required. They are mangled as if they were direct calls to the
underlying literal operator. This matches g++'s apparent behavior (but not its
actual mangling, which is broken for literal-operator-ids).
User-defined *string* literals are now fully-operational, but the semantic
analysis is quite hacky and needs more work. No other forms of user-defined
literal are created yet, but the AST support for them is present.
This patch committed after midnight because we had already hit the quota for
new kinds of literal yesterday.
llvm-svn: 152211
command line options for inlining tuning.
This adds the option for stack depth bound as well as function size
bound.
+ minor doxygenification
llvm-svn: 151930
funopen, setvbuf.
Teach the checker and the engine about these APIs to resolve malloc
false positives. As I am adding more of these APIs, it is clear that all
this should be factored out into a separate callback (for example,
region escapes). Malloc, KeyChainAPI and RetainRelease checkers could
all use it.
llvm-svn: 151737
This introduces a concept of a "prunable" PathDiagnosticEvent. Currently this is a flag, but
we may evolve the concept to make this more dynamically inferred.
llvm-svn: 151663
When allocated buffer is passed to CF/NS..NoCopy functions, the
ownership is transfered unless the deallocator argument is set to
'kCFAllocatorNull'.
llvm-svn: 151608
Assume none of the ObjC messages defined in system headers free memory,
except for the ones containing 'freeWhenDone' selector. Currently, just
assume that the region escapes to the messages with 'freeWhenDone'
(ideally, we want to treat it as 'free()').
For now, always assume that regions escape when passed to C++ methods.
llvm-svn: 151410
that provides the behavior of the C++11 library trait
std::is_trivially_constructible<T, Args...>, which can't be
implemented purely as a library.
Since __is_trivially_constructible can have zero or more arguments, I
needed to add Yet Another Type Trait Expression Class, this one
handling arbitrary arguments. The next step will be to migrate
UnaryTypeTrait and BinaryTypeTrait over to this new, more general
TypeTrait class.
Fixes the Clang side of <rdar://problem/10895483> / PR12038.
llvm-svn: 151352
When we find two leak reports with the same allocation site, report only
one of them.
Provide a helper method to BugReporter to facilitate this.
llvm-svn: 151287
Make this call an exception in ExprEngine::invalidateArguments:
'int pthread_setspecific(ptheread_key k, const void *)' stores
a value into thread local storage. The value can later be retrieved
with 'void *ptheread_getspecific(pthread_key)'. So even thought the
parameter is 'const void *', the region escapes through the
call.
(Here we just blacklist the call in the ExprEngine's default
logic. Another option would be to add a checker which evaluates
the call and triggers the call to invalidate regions.)
Teach the Malloc Checker, which treats all system calls as safe about
the API.
llvm-svn: 151220
- We should not evaluate strdup in the Malloc Checker, it's the job of
CString checker, so just update the RefState to reflect allocated
memory.
- Refactor to reduce LOC: remove some wrapper auxiliary functions, make
all functions return the state and add the transition in one place
(instead of in each auxiliary function).
llvm-svn: 151188
block pointer that returns a block literal which captures (by copy)
the lambda closure itself. Some aspects of the block literal are left
unspecified, namely the capture variable (which doesn't actually
exist) and the body (which will be filled in by IRgen because it can't
be written as an AST).
Because we're switching to this model, this patch also eliminates
tracking the copy-initialization expression for the block capture of
the conversion function, since that information is now embedded in the
synthesized block literal. -1 side tables FTW.
llvm-svn: 151131
it aware of CString APIs that return the input parameter.
Malloc Checker needs to know how the 'strcpy' function is
evaluated. Introduce the dependency on CStringChecker for that.
CStringChecker knows all about these APIs.
Addresses radar://10864450
llvm-svn: 150846