While there is now some duplication between SimpleCall and the CXXInstanceCall
sub-hierarchy, this is much better than copy-and-pasting the devirtualization
logic shared by both instance methods and destructors.
An unfortunate side effect is that there is no longer a single CallEvent type
that corresponds to "calls written as CallExprs". For the most part this is a
good thing, but the checker callback eval::Call still takes a CallExpr rather
than a CallEvent (since we're not sure if we want to allow checkers to
evaluate other kinds of calls). A mistake here will be caught by a cast<> in
CheckerManager::runCheckersForEvalCall.
No functionality change.
llvm-svn: 161809
Virtual base regions are never layered, so simply stripping them off won't
necessarily get you to the correct casted class. Instead, what we want is
the same logic for evaluating dynamic_cast: strip off base regions if possible,
but add new base regions if necessary.
llvm-svn: 161808
This can occur with multiple inheritance, which jumps from one parent to
the other, and with virtual inheritance, since virtual base regions always
wrap the actual object and can't be nested within other base regions.
This also exposed some incorrect logic for multiple inheritance: even if B
is known not to derive from C, D might still derive from both of them.
llvm-svn: 161798
...and /do/ strip CXXBaseObjectRegions when casting to a virtual base class.
This allows us to enforce the invariant that a CXXBaseObjectRegion can always
provide an offset for its base region if its base region has a known class
type, by only allowing virtual bases and direct non-virtual bases to form
CXXBaseObjectRegions.
This does mean some slight problems for our modeling of dynamic_cast, which
needs to be resolved by finding a path from the current region to the class
we're trying to cast to.
llvm-svn: 161797
This was causing a crash when we tried to re-apply a base object region to
itself. It probably also caused incorrect offset calculations in RegionStore.
PR13569 / <rdar://problem/12076683>
llvm-svn: 161710
This mostly affects pure virtual methods, but would also affect parent
methods defined inline in the header when analyzing the child's source file.
llvm-svn: 161709
when we don't need to split.
In some cases we know that a method cannot have a different
implementation in a subclass:
- the class is declared in the main file (private)
- all the method declarations (including the ones coming from super
classes) are in the main file.
This can be improved further, but might be enough for the heuristic.
(When we are too aggressive splitting the state, efficiency suffers.
When we fail to split the state coverage might suffer.)
llvm-svn: 161681
Both methods need to clear out existing bindings and provide a new default
binding. Originally KillStruct always provided UnknownVal as the default,
but it's allowed symbolic values for quite some time (for handling returned
structs in C).
No functionality change.
llvm-svn: 161637
This should speed up activities that need to access bindings by cluster,
such as invalidation and dead-bindings cleaning. In some cases all we save
is the cost of building the region cluster map, but other times we can
actually avoid traversing the rest of the store.
In casual testing, this produced a speedup of nearly 10% analyzing SQLite,
with /less/ memory used.
llvm-svn: 161636
This makes it faster to access and invalidate bindings with symbolic offsets
by only computing this information once.
No intended functionality change.
llvm-svn: 161635
An ASTContext's RecordLayoutInfo can only be used to look up offsets of
direct base classes, and we need the offset to make non-symbolic bindings
in RegionStore. This change makes sure that we have one layer of
CXXBaseObjectRegion for each base we are casting through.
This was causing crashes on an internal buildbot.
llvm-svn: 161621
This is an initial (unoptimized) version. We split the path when
inlining ObjC instance methods. On one branch we always assume that the
type information for the given memory region is precise. On the other we
assume that we don't have the exact type info. It is important to check
since the class could be subclassed and the method can be overridden. If
we always inline we can loose coverage.
Had to refactor some of the call eval functions.
llvm-svn: 161552
Unfortunately, generalized region printing is very difficult:
- ElementRegions are used both for casting and as actual elements.
- Accessing values through a pointer means going through an intermediate
SymbolRegionValue; symbolic regions are untyped.
- Referring to implicitly-defined variables like 'this' and 'self' could be
very confusing if they come from another stack frame.
We fall back to simply not printing the region name if we can't be sure it
will print well. This will allow us to improve in the future.
llvm-svn: 161512
The main blocker on this (besides the previous commit) was that
ScanReachableSymbols was not looking through LazyCompoundVals.
Once that was fixed, it's easy enough to clear out malloc data on return,
just like we do when we bind to a global region.
<rdar://problem/10872635>
llvm-svn: 161511
RegionStore currently uses a (Region, Offset) pair to describe the locations
of memory bindings. However, this representation breaks down when we have
regions like 'array[index]', where 'index' is unknown. We used to store this
as (SubRegion, 0); now we mark them specially as (SubRegion, SYMBOLIC).
Furthermore, ProgramState::scanReachableSymbols depended on the existence of
a sub-region map, but RegionStore's implementation doesn't provide for such
a thing. Moving the store-traversing logic of scanReachableSymbols into the
StoreManager allows us to eliminate the notion of SubRegionMap altogether.
This fixes some particularly awkward broken test cases, now in
array-struct-region.c.
llvm-svn: 161510
Instead of sprinkling dynamic type info propagation throughout
ExprEngine, the added checker would add the more precise type
information on known APIs (Ex: ObjC alloc, new) and propagate
the type info in other cases (ex: ObjC init method, casts (the second is
not implemented yet)).
Add handling of ObjC alloc, new and init to the checker.
llvm-svn: 161357
Like base constructors, delegating constructors require no further
processing in the CFGInitializer node.
Also, add PrettyStackTraceLoc to the initializer and destructor logic
so we can get better stack traces in the future.
llvm-svn: 161283
Because of this, we would previously emit NO path notes when a parameter
is constrained to null (because there are no stores). Now we show where we
made the assumption, which is much more useful.
llvm-svn: 161280
The visitor walks back through the ExplodedGraph as expected, but
it wasn't actually keeping track of when a value was assigned. This
meant that it only worked when the value was assigned when the variable
was defined.
Tests in the next commit (dependent on another change).
llvm-svn: 161276
In the following code, find the type of the symbolic receiver by
following it and updating the dynamic type info in the state when we
cast the symbol from id to MyClass *.
MyClass *a = [[self alloc] init];
return 5/[a testSelf];
llvm-svn: 161264
engine.
The code that was supposed to split the tie in a deterministic way is
not deterministic. Most likely one of the profile methods uses a
pointer. After this change we do finally get the consistent diagnostic
output. Testing this requires running the analyzer on large code bases
and diffing the results.
llvm-svn: 161224
This makes the diagnostic output order deterministic.
1) This makes order of text diagnostics consistent from run to run.
2) Also resulted in different bugs being reported (from one run to
another) with plist-html output.
llvm-svn: 161151
While usually we'd use a symbolic region rather than a straight-up Unknown,
we can still generate unknowns via array subscripts with symbolic indexes.
(And if this ever changes in the future, we still shouldn't crash.)
llvm-svn: 161059
This was causing a crash in our array-to-pointer logic, since the region
was clearly not an array.
PR13440 / <rdar://problem/11977113>
llvm-svn: 161051
This removes explicit checks for 'this' and 'self' from
Store::enterStackFrame. It also removes getCXXThisRegion() as a virtual
method on all CallEvents; it's now only implemented in the parts of the
hierarchy where it is relevant. Finally, it removes the option to ask
for the ParmVarDecls attached to the definition of an inlined function,
saving a recomputation of the result of getRuntimeDefinition().
No visible functionality change!
llvm-svn: 161017
Previously, we were only checking the origin expressions of inlined calls.
Checkers using the generic postCall and older postObjCMessage callbacks were
ignored. Now that we have CallEventManager, it is much easier to create
a CallEvent generically when exiting an inlined function, which we can then
use for post-call checks.
No test case because we don't (yet) have any checkers that depend on this
behavior (which is why it hadn't been fixed before now).
llvm-svn: 161005
- Retrieves the type of the object/receiver from the state.
- Binds self during stack setup.
- Only explores the path on which the method is inlined (no
bifurcation to explore the path on which the method is not inlined).
llvm-svn: 160991
This ensures that it is valid to reference-count any CallEvents, and we
won't accidentally try to reclaim a CallEvent that lives on the stack.
It also hides an ugly switch statement for handling CallExprs!
There should be no functionality change here.
llvm-svn: 160986
This allows us to get around the C++ "virtual constructor" problem
when we'd like to create a CallEvent from an ExplodedNode, an inlined
StackFrameContext, or another CallEvent. The solution has three parts:
- CallEventManager uses a BumpPtrAllocator to allocate CallEvent-sized
memory blocks. It also keeps a cache of freed CallEvents for reuse.
- CallEvents all have protected copy constructors, along with cloneTo()
methods that use placement new to copy into CallEventManager-managed
memory, vtables intact.
- CallEvents owned by CallEventManager are now wrapped in an
IntrusiveRefCntPtr. Going forwards, it's probably a good idea to create
ALL CallEvents through the CallEventManager, so that we don't accidentally
try to reclaim a stack-allocated CallEvent.
All of this machinery is currently unused but will be put into use shortly.
llvm-svn: 160983
We were treating this like a CXXDefaultArgExpr, but
SubstNonTypeTemplateParmExpr actually appears when a template is
instantiated, i.e. we have all the information necessary to evaluate it.
This allows us to inline functions like llvm::array_lengthof.
<rdar://problem/11949235>
llvm-svn: 160846
It's a good thing CallEvents aren't created all over the place yet.
I checked all the uses this time and the private copy constructor
/really/ shouldn't cause any more problems.
llvm-svn: 160845
instead of walking to the preceding PostStmt node. There are cases where the last evaluated
expression does not appear in the ExplodedGraph.
Fixes PR 13466.
llvm-svn: 160819
After discussion, the type-based dispatch was decided to be bad for
maintenance and made it very easy for subtle bugs to creep in. Instead,
we'll just be very careful when we do have to allocate these on the heap.
llvm-svn: 160817
Our BugReporter knows how to deal with implicit statements: it looks in
the ParentMap until it finds a parent with a valid location. However, since
initializers are not in the body of a constructor, their sub-expressions are
not in the ParentMap. That was easy enough to fix in AnalysisDeclContext.
...and then even once THAT was fixed, there's still an extra funny case
of Objective-C object pointer fields under ARC, which are initialized with
a top-level ImplicitValueInitExpr. To catch these cases,
PathDiagnosticLocation will now fall back to the start of the current
function if it can't find any other valid SourceLocations. This isn't great,
but it's miles better than a crash.
(All of this is only relevant when constructors and destructors are being
inlined, i.e. under -cfg-add-initializers and -cfg-add-implicit-dtors.)
llvm-svn: 160810
This workaround is fairly lame: we simulate the first element's constructor
and destructor and rely on the region invalidation to "initialize" the rest
of the elements.
llvm-svn: 160809
Previously we were using ParentMap and crawling through the parent DeclStmt.
This should be at least slightly cheaper (and is also more flexible).
No (intended) functionality change.
llvm-svn: 160807
Most of the logic here is fairly simple; the interesting thing is that
we now distinguish complete constructors from base or delegate constructors.
We also make sure to cast to the base class before evaluating a constructor
or destructor, since non-virtual base classes may behave differently.
This includes some refactoring of VisitCXXConstructExpr and VisitCXXDestructor
in order to keep ExprEngine.cpp as clean as possible (leaving the details for
ExprEngineCXX.cpp).
llvm-svn: 160806
This modifies BugReporter and friends to handle CallEnter and CallExitEnd
program points that came from implicit call CFG nodes (read: destructors).
This required some extra handling for nested implicit calls. For example,
the added multiple-inheritance test case has a call graph that looks like this:
testMultipleInheritance3
~MultipleInheritance
~SmartPointer
~Subclass
~SmartPointer
***bug here***
In this case we correctly notice that we started in an inlined function
when we reach the CallEnter program point for the second ~SmartPointer.
However, when we reach the next CallEnter (for ~Subclass), we were
accidentally re-using the inner ~SmartPointer call in the diagnostics.
Rather than guess if we saw the corresponding CallExitEnd based on the
contents of the active path, we now just ask the PathDiagnostic if there's
any known stack before popping off the top path.
(A similar issue could have occured without multiple inheritance, but there
wasn't a test case for it.)
llvm-svn: 160804
- Some cleanup(the TODOs) will be done after ObjC method inlining is
complete.
- Simplified CallEvent::getDefinition not to require ISDynamicDispatch
parameter.
- Also addressed Jordan's comments from r160530.
llvm-svn: 160768
value by scanning the path, rather than assuming we have visited the '?:' operator
as a terminator (which sets a value indicating which expression to grab the
final ternary expression value from).
llvm-svn: 160760
As pointed out by Anna, we only differentiate between explicit message sends
This also adds support for ObjCSubscriptExprs, which are basically the same
as properties in many ways. We were already checking these, but not emitting
nice messages for them.
This depends on the llvm::PointerIntPair change in r160456.
llvm-svn: 160461
We will need to be able to easily reconstruct a CallEvent from an ExplodedNode
for diagnostic purposes, and that's exactly what factory functions are for.
CallEvent objects are small enough (four pointers and a SourceLocation) that
returning them through the stack is fairly cheap. Clients who just need to use
existing CallEvents can continue to do so using const references.
This uses the same sort of "kind-field-dispatch" as SVal, though most of the
nastiness is contained in the DISPATCH and DISPATCH_ARG macros at the end of
the file. (We can't use a template for this because member-pointers to base
class methods don't call derived-class methods even when casting to the
derived class. We can't use variadic macros because they're a C99 feature.)
llvm-svn: 160459
ObjC properties are handled through their semantic form of ObjCMessageExprs
and their wrapper PseudoObjectExprs, and have been for quite a while. The
syntactic ObjCPropertyRefExprs do not appear in the CFG and are not visited
by ExprEngine.
No functionality change.
llvm-svn: 160458
This code has been moved around multiple times, but seems to have been
obsolete ever since we started handled references like pointers.
llvm-svn: 160375
instead push the terminator for the branch down into the basic blocks of the subexpressions of '&&' and '||'
respectively. This eliminates some artifical control-flow from the CFG and results in a more
compact CFG.
Note that this patch only alters the branches 'while', 'if' and 'for'. This was complex enough for
one patch. The remaining branches (e.g., do...while) can be handled in a separate patch, but they
weren't immediately tackled because they were less important.
It is possible that this patch introduces some subtle bugs, particularly w.r.t. to destructor placement.
I've tried to audit these changes, but it is also known that the destructor logic needs some refinement
in the area of '||' and '&&' regardless (i.e., their are known bugs).
llvm-svn: 160218
Previously we were using the static type of the base object to inline
methods, whether virtual or non-virtual. Now, we try to see if the base
object has a known type, and if so ask for its implementation of the method.
llvm-svn: 160094
This is probably not so useful yet because it is not path-sensitive, though
it does try to show inlining with indentation.
This also adds a dump() method to CallEvent, which should be useful for
debugging.
llvm-svn: 160030
Also contains a number of tweaks to inlining that are necessary
for constructors and destructors. (I have this enabled on a private
branch, but it is very much unstable.)
llvm-svn: 160023
In order to accomplish this, we now build the callee's stack frame
as part of the CallEnter node, rather than the subsequent BlockEdge node.
This should not have any effect on perceived behavior or diagnostics.
This makes it safe to re-enable inlining of member overloaded operators.
llvm-svn: 160022
While this work is still fairly tentative (destructors are still left out of
the CFG by default), we now handle destructors in the same way as any other
calls, instead of just automatically trying to inline them.
llvm-svn: 160020
These are currently unused, but are intended to be used in lieu of PreStmt
and PostStmt when the call is implicit (e.g. an automatic object destructor).
This also modifies the Data1 field of ProgramPoints to allow storing any
pointer-sized value, as opposed to only aligned pointers. This is necessary
to store SourceLocations.
There is currently no BugReporter support for these; they should be skipped
over in any diagnostic output.
This commit also tags checkers that currently rely on function calls only
occurring at StmtPoints.
llvm-svn: 160019
This was a regression introduced during the CallEvent changes; a call to
FunctionDecl::hasBody was also being used to replace the decl found by
lookup with the actual definition. To keep from making this mistake again
(particularly if/when we start inlining Objective-C methods), this commit
adds a "getDefinition()" method to CallEvent, which should do the right
thing under any circumstances.
llvm-svn: 159940
We use LazyCompoundVals to avoid copying the contents of structs and arrays
around in the store, and when we need to pass a struct around that already
has a LazyCompoundVal we just use the original one. However, it's possible
that the first field of a struct may have a LazyCompoundVal of its own, and
we currently can't distinguish a LazyCompoundVal for the first element of a
struct from a LazyCompoundVal for the entire struct. In this case we should
just drop the optimization and make a new LazyCompoundVal that encompasses
the old one.
PR13264 / <rdar://problem/11802440>
llvm-svn: 159866
very simple semantic analysis that just builds the AST; minor changes for lexer
to pick up source locations I didn't think about before.
Comments AST is modelled along the ideas of HTML AST: block and inline content.
* Block content is a paragraph or a command that has a paragraph as an argument
or verbatim command.
* Inline content is placed within some block. Inline content includes plain
text, inline commands and HTML as tag soup.
llvm-svn: 159790
This required moving the ctors for IntegerLiteral and FloatingLiteral out of
line which shouldn't change anything as they are usually called through Create
methods that are already out of line.
ASTContext::Deallocate has been a nop for a long time, drop it from ASTVector
and make it independent from ASTContext.h
Pass the StorageAllocator directly to AccessedEntity so it doesn't need to
have a definition of ASTContext around.
llvm-svn: 159718
Our current inlining support (specifically RegionStore::enterStackFrame)
doesn't know that calls to overloaded operators may be calls to non-static
member functions, and that in these cases the first argument should be
treated as 'this'. This caused incorrect results and sometimes crashes.
The long-term fix will be to rewrite RegionStore::enterStackFrame to use
CallEvent and its subclasses, but for now we can just disable these
problematic calls by classifying them under a new CallEvent,
CXXMemberOperatorCall.
llvm-svn: 159692
...and instead add an accessor. We're not using this today, but it's something
that should probably stay in the source for potential clients, and it doesn't
cost a lot. (ObjCPropertyAccess is only created on the stack, and right now
there's only ever one alive at a time.)
This reverts r159581 / commit 8e674e1da34a131faa7d43dc3fcbd6e49120edbe.
llvm-svn: 159595
we are encountering some scalability issues with memory usage. The
appropriate long term fix is to make the analysis more scalable, but
this will at least prevent the analyzer swapping when
analyzing very large functions.
llvm-svn: 159578
The preObjCMessage and postObjCMessage callbacks now take an ObjCMethodCall
argument, which can represent an explicit message send (ObjCMessageSend) or an
implicit message generated by a property access (ObjCPropertyAccess).
llvm-svn: 159559
Previously, the CallEvent subclass ObjCMessageInvocation was just a wrapper
around the existing ObjCMessage abstraction (over message sends and property
accesses). Now, we have abstract CallEvent ObjCMethodCall with subclasses
ObjCMessageSend and ObjCPropertyAccess.
In addition to removing yet another wrapper object, this should make it easy
to add a ObjCSubscriptAccess call event soon.
llvm-svn: 159558
This involved refactoring some common pointer-escapes code onto CallEvent,
then having MallocChecker use those callbacks for whether or not to consider
a pointer's /ownership/ as escaping. This still needs to be pinned down, and
probably we want to make the new argumentsMayEscape() function a little more
discerning (content invalidation vs. ownership/metadata invalidation), but
this is a good improvement.
As a bonus, also remove CallOrObjCMessage from the source completely.
llvm-svn: 159557
This is intended to replace CallOrObjCMessage, and is eventually intended to be
used for anything that cares more about /what/ is being called than /how/ it's
being called. For example, inlining destructors should be the same as inlining
blocks, and checking __attribute__((nonnull)) should apply to the allocator
calls generated by operator new.
llvm-svn: 159554
Previously:
...the comment said DFS...
...the WorkList being instantiated said BFS...
...and the implementation was actually DFS...
...due to an unintentional change in 2010...
...and everything kept working anyway.
This fixes our std::deque implementation of BFS, but switches back to a
SmallVector-based implementation of DFS.
We should probably still investigate the ramifications of DFS vs. BFS,
especially for large functions (and especially when we hit our block path
limit), since this might completely change our memory use. It can also mask
some bugs and reveal others depending on when we halt analysis. But at least
we will not have this kind of little mistake creep in again.
llvm-svn: 159397
We don't handle exceptions yet, so we treat them as sinks. ExprEngine
hardcodes messages that are known to raise Objective-C exceptions like -raise,
but it was only checking for +raise:format: and +raise:format:arguments: on
NSException itself, not subclasses.
<rdar://problem/11724201>
llvm-svn: 159010
express library-level dependencies within Clang.
This is no more verbose really, and plays nicer with the rest of the
CMake facilities. It should also have no change in functionality.
llvm-svn: 158888
The default global placement new just returns the pointer it is given.
Note that other custom 'new' implementations with placement args are not
guaranteed to do this.
In addition, we need to invalidate placement args, since they may be updated by
the allocator function. (Also, right now we don't properly handle the
constructor inside a CXXNewExpr, so we need to invalidate the placement args
just so that callers know something changed!)
This invalidation is not perfect because CallOrObjCMessage doesn't support
CXXNewExpr, and all of our invalidation callbacks expect that if there's no
CallOrObjCMessage, the invalidation is happening manually (e.g. by a direct
assignment) and shouldn't affect checker-specific metadata (like malloc state);
hence the malloc test case in new-fail.cpp. But region values are now
properly invalidated, at least.
The long-term solution to this problem is to rework CallOrObjCMessage into
something more general, rather than the morass of branches it is today.
<rdar://problem/11679031>
llvm-svn: 158784
This happens in C++ mode right at the declaration of a struct VLA;
MallocChecker sees a bind and tries to get see if it's an escaping bind.
It's likely that our handling of this is still incomplete, but it fixes a
crash on valid without disturbing anything else for now.
llvm-svn: 158587
This does not actually give us the right behavior for reinterpret_cast
of references. Reverting so I can think about it some more.
This reverts commit 50a75a6e26a49011150067adac556ef978639fe6.
llvm-svn: 158341
These casts only appear in very well-defined circumstances, in which the
target of a reinterpret_cast or a function formal parameter is an lvalue
reference. According to the C++ standard, the following are equivalent:
reinterpret_cast<T&>( x)
*reinterpret_cast<T*>(&x)
[expr.reinterpret.cast]p11
llvm-svn: 158338
While collections containing nil elements can still be iterated over in an
Objective-C for-in loop, the most common Cocoa collections -- NSArray,
NSDictionary, and NSSet -- cannot contain nil elements. This checker adds
that assumption to the analyzer state.
This was the cause of some minor false positives concerning CFRelease calls
on objects in an NSArray.
llvm-svn: 158319
CmpRuns.py can be used to compare issues from different analyzer runs.
Since it uses the issue line number to unique 2 issues, adding a new
line to the beginning of a file makes all issues in the file reported as
new.
The hash will be an opaque value which could be used (along with the
function name) by CmpRuns to identify the same issues. This way, we only
fail to identify the same issue from two runs if the function it appears
in changes (not perfect, but much better than nothing).
llvm-svn: 158180
I falsely assumed that the memory spaces are equal when we reach this
point, they might not be when memory space of one or more is stack or
Unknown. We don't want a region from Heap space alias something with
another memory space.
llvm-svn: 158165
Add a concept of symbolic memory region belonging to heap memory space.
When comparing symbolic regions allocated on the heap, assume that they
do not alias.
Use symbolic heap region to suppress a common false positive pattern in
the malloc checker, in code that relies on malloc not returning the
memory aliased to other malloc allocations, stack.
llvm-svn: 158136
In addition, I've made the pointer and reference typedef 'void' rather than T*
just so they can't get misused. I would've omitted them entirely but
std::distance likes them to be there even if it doesn't use them.
This rolls back r155808 and r155869.
Review by Doug Gregor incorporating feedback from Chandler Carruth.
llvm-svn: 158104
When we timeout or exceed a max number of blocks within an inlined
function, we retry with no inlining starting from a node right before
the CallEnter node. We assume the state of that node is the state of the
program before we start evaluating the call. However, the node pruning
removes this node as unimportant.
Teach the node pruning to keep the predecessors of the call enter nodes.
llvm-svn: 157860
improved the pruning heuristics. The current heuristics are pretty good, but they make diagnostics
for uninitialized variables warnings particularly useless in some cases.
llvm-svn: 157734
a horrible bug in GetLazyBindings where we falsely appended a field suffix when traversing 3 or more
layers of lazy bindings. I don't have a reduced test case yet; but I have added the original source
to an internal regression test suite. I'll see about coming up with a reduced test case.
Fixes <rdar://problem/11405978> (for real).
llvm-svn: 156580
to reason about.
As part of taint propagation, we now allow creation of non-integer
symbolic expressions like a cast from int to float.
Addresses PR12511 (radar://11215362).
llvm-svn: 156578
RegionStore, so be explicit about it and generate UnknownVal().
This is a hack to ensure we never produce undefined values for a value
coming from a compound value. (The undefined values can lead to
false positives.)
radar://10127782
llvm-svn: 156446
disruptive, but it allows RegionStore to better "see" through casts that reinterpret arrays of values
as structs. Fixes <rdar://problem/11405978>.
llvm-svn: 156428
This could conceivably cut down on state proliferation, although we don't
use BasicConstraintManager by default anymore. No functionality change.
llvm-svn: 156362
This involves keeping track of three separate types: the symbol type, the
adjustment type, and the comparison type. For example, in "$x + 5 > 0ULL",
if the type of $x is 'signed char', the adjustment type is 'int' and the
comparison type is 'unsigned long long'. Most of the time these three types
will be the same, but we should still do the right thing when the
comparison value is out of range, and wraparound should be calculated in
the adjustment type.
This also re-disables an out-of-bounds test; we were extracting the symbol
from non-additive SymIntExprs, but then throwing away the integer.
Sorry for the large patch; both the basic and range constraint managers needed
to be updated together, since they share code in SimpleConstraintManager.
llvm-svn: 156361
There are more parts of the analyzer that could use the convenience of APSIntType, particularly the constraint engine, but that needs a fair amount of rewriting to handle mixed-type constraints anyway.
llvm-svn: 156360
SValBuilder should return an UnknownVal() when comparison of int and ptr
fails. Previous to this commit, it went on assuming that we are dealing
with pointer arithmetic.
PR12509, radar://11390991
llvm-svn: 156320
The logical change is that the integers in SymIntExprs may not have the same type as the symbols they are paired with. This was already the case with taint-propagation expressions created by SValBuilder::makeSymExprValNN, but I think those integers may never have been used. SimpleSValBuilder should be able to handle mixed-integer-type SymIntExprs fine now, though, and the constraint managers were already being defensive (though not entirely correct). All existing tests pass.
The logic in evalBinOpNN has been simplified so that conversion is done as late as possible. As a result, most of the switch cases have been reduced to do the minimal amount of work, delegating to another case when they can by substituting ConcreteInts and (as before) reversing the left and right arguments when useful.
Comparisons require special handling in two places (building SymIntExprs and evaluating constant-constant operations) because we don't /know/ the best type for comparing the two values. I've approximated the rules in Sema [C99 6.3.1.8] but it'd be nice to refactor Sema's actual algorithm into ASTContext.
This is also groundwork for handling mixed-type constraints better than we do now.
llvm-svn: 156270
We need to identify the value of ptr as
ElementRegion (result of pointer arithmetic) in the following code.
However, before this commit '(2-x)' evaluated to Unknown value, and as
the result, 'p + (2-x)' evaluated to Unknown value as well.
int *p = malloc(sizeof(int));
ptr = p + (2-x);
llvm-svn: 156052
The resulting type info is stored in the SymSymExpr, so no reason not to
support construction of expression with different subexpression types.
llvm-svn: 156051
The change resulted in multiple issues on the buildbot, so it's not
ready for prime time. Only enable history tracking for tainted
data(which is experimental) for now.
llvm-svn: 156049
values through interesting expressions. This allows us to map from interesting values in a caller
to interesting values in a caller, thus recovering some precision in diagnostics lost from IPA.
Fixes <rdar://problem/11327497>
llvm-svn: 155971
reason about the expression.
This essentially keeps more history about how symbolic values were
constructed. As an optimization, previous to this commit, we only kept
the history if one of the symbols was tainted, but it's valuable keep
the history around for other purposes as well: it allows us to avoid
constructing conjured symbols.
Specifically, we need to identify the value of ptr as
ElementRegion (result of pointer arithmetic) in the following code.
However, before this commit '(2-x)' evaluated to Unknown value, and as
the result, 'p + (2-x)' evaluated to Unknown value as well.
int *p = malloc(sizeof(int));
ptr = p + (2-x);
This change brings 2% slowdown on sqlite. Fixes radar://11329382.
llvm-svn: 155944
filter_decl_iterator had a weird mismatch where both op* and op-> returned T*
making it difficult to generalize this filtering behavior into a reusable
library of any kind.
This change errs on the side of value, making op-> return T* and op* return
T&.
(reviewed by Richard Smith)
llvm-svn: 155808
This is needed to ensure that we always report issues in the correct
function. For example, leaks are identified when we call remove dead
bindings. In order to make sure we report a callee's leak in the callee,
we have to run the operation in the callee's context.
This change required quite a bit of infrastructure work since:
- We used to only run remove dead bindings before a given statement;
here we need to run it after the last statement in the function. For
this, we added additional Program Point and special mode in the
SymbolReaper to remove all symbols in context lower than the current
one.
- The call exit operation turned into a sequence of nodes, which are
now guarded by CallExitBegin and CallExitEnd nodes for clarity and
convenience.
(Sorry for the long diff.)
llvm-svn: 155244
attached. Since we do not support any attributes which appertain to a statement
(yet), testing of this is necessarily quite minimal.
Patch by Alexander Kornienko!
llvm-svn: 154723
We should not deserialize unused declarations from the PCH file. Achieve
this by storing the top level declarations during parsing
(HandleTopLevelDecl ASTConsumer callback) and analyzing/building a call
graph only for those.
Tested the patch on a sample ObjC file that uses PCH. With the patch,
the analyzes is 17.5% faster and clang consumes 40% less memory.
Got about 10% overall build/analyzes time decrease on a large Objective
C project.
A bit of CallGraph refactoring/cleanup as well..
llvm-svn: 154625
As per Jordy's review. Creating a symbol here is more flexible; however
I could not come up with an example where it was needed. (What
constrains can be added on of the symbol constrained to 0?)
llvm-svn: 154542
we use the same Expr* as the one being currently visited. This is preparation for transitioning to having
ProgramPoints refer to CFGStmts.
This required a bit of trickery. We wish to keep the old Expr* bindings in the Environment intact,
as plenty of logic relies on it and there is no reason to change it, but we sometimes want the Stmt* for
the ProgramPoint to be different than the Expr* being used for bindings. This requires adding an extra
argument for some functions (e.g., evalLocation). This looks a bit strange for some clients, but
it will look a lot cleaner when were start using CFGStmt* in the appropriate places.
As some fallout, the diagnostics arrows are a bit difference, since some of the node locations have changed.
I have audited these, and they look reasonable.
llvm-svn: 154214
consolidate some commonly used category strings into global references (more of this can be done, I just did a few).
Fixes <rdar://problem/11191537>.
llvm-svn: 154121
Store this info inside the function summary generated for all analyzed
functions. This is useful for coverage stats and can be helpful for
analyzer state space search strategies.
llvm-svn: 153923
properly reason about such accesses, but we shouldn't emit bogus "uninitialized value" warnings
either. Fixes <rdar://problem/11127008>.
llvm-svn: 153913
Fixes a false positive (radar://11152419). The current solution of
adding the info into 3 places is quite ugly. Pending a generic pointer
escapes callback.
llvm-svn: 153731
count.
This is an optimization for "retry without inlining" option. Here, if we
failed to inline a function due to reaching the basic block max count,
we are going to store this information and not try to inline it
again in the translation unit. This can be viewed as a function summary.
On sqlite, with this optimization, we are 30% faster then before and
cover 10% more basic blocks (partially because the number of times we
reach timeout is decreased by 20%).
llvm-svn: 153730
The analyzer gives up path exploration under certain conditions. For
example, when the same basic block has been visited more than 4 times.
With inlining turned on, this could lead to decrease in code coverage.
Specifically, if we give up inside the inlined function, the rest of
parent's basic blocks will not get analyzed.
This commit introduces an option to enable re-run along the failed path,
in which we do not inline the last inlined call site. This is done by
enqueueing the node before the processing of the inlined call site
with a special policy encoded in the state. The policy tells us not to
inline the call site along the path.
This lead to ~10% increase in the number of paths analyzed. Even though
we expected a much greater coverage improvement.
The option is turned off by default for now.
llvm-svn: 153534
This required adding a change count token to BugReport, but also allowed us to ditch ImmutableList as the BugReporterVisitor data type.
Also, remove the hack from MallocChecker, now that visitors appear in the opposite order. This is not exactly a fix, but the common case -- custom diagnostics after generic ones -- is now the default behavior.
llvm-svn: 153369
Specifically, we use the last store of the leaked symbol in the leak diagnostic.
(No support for struct fields since the malloc checker doesn't track those
yet.)
+ Infrastructure to track the regions used in store evaluations.
This approach is more precise than iterating the store to
obtain the region bound to the symbol, which is used in RetainCount
checker. The region corresponds to what is uttered in the code in the
last store and we do not rely on the store implementation to support
this functionality.
llvm-svn: 153212
The symbol-aware stack hint combines the checker-provided message
with the information about how the symbol was passed to the callee: as
a parameter or a return value.
For malloc, the generated messages look like this :
"Returning from 'foo'; released memory via 1st parameter"
"Returning from 'foo'; allocated memory via 1st parameter"
"Returning from 'foo'; allocated memory returned"
"Returning from 'foo'; reallocation of 1st parameter failed"
(We are yet to handle cases when the symbol is a field in a struct or
an array element.)
llvm-svn: 152962
The only use of AggExprVisitor was in #if 0 code (the analyzer's incomplete C++ support), so there is no actual behavioral change anyway.
llvm-svn: 152856
BugVisitor DiagnosticPieces.
When checkers create a DiagnosticPieceEvent, they can supply an extra
string, which will be concatenated with the call exit message for every
call on the stack between the diagnostic event and the final bug report.
(This is a simple version, which could be/will be further enhanced.)
For example, this is used in Malloc checker to produce the ",
which allocated memory" in the following example:
static char *malloc_wrapper() { // 2. Entered call from 'use'
return malloc(12); // 3. Memory is allocated
}
void use() {
char *v;
v = malloc_wrapper(); // 1. Calling 'malloc_wrappers'
// 4. Returning from 'malloc_wrapper', which allocated memory
} // 5. Memory is never released; potential
memory leak
llvm-svn: 152837
track whether the referenced declaration comes from an enclosing
local context. I'm amenable to suggestions about the exact meaning
of this bit.
llvm-svn: 152491
as aborted, but didn't treat such cases as sinks in the ExplodedGraph.
Along the way, add basic support for CXXCatchStmt, expanding the set of code we actually analyze (hopefully correctly).
Fixes: <rdar://problem/10892489>
llvm-svn: 152468
We do not reanalyze a function, which has already been analyzed as an
inlined callee. As per PRELIMINARY testing, this gives over
50% run time reduction on some benchmarks without decreasing of the
number of bugs found.
Turning the mode on by default.
llvm-svn: 152440
Essentially, a bug centers around a story for various symbols and regions. We should only include
the path diagnostic events that relate to those symbols and regions.
The pruning is done by associating a set of interesting symbols and regions with a BugReporter, which
can be modified at BugReport creation or by BugReporterVisitors.
This patch reduces the diagnostics emitted in several of our test cases. I've vetted these as
having desired behavior. The only regression is a missing null check diagnostic for the return
value of realloc() in test/Analysis/malloc-plist.c. This will require some investigation to fix,
and I have added a FIXME to the test case.
llvm-svn: 152361
analysis to make the AST representation testable. They are represented by a
new UserDefinedLiteral AST node, which is a sugared CallExpr. All semantic
properties, including full CodeGen support, are achieved for free by this
representation.
UserDefinedLiterals can never be dependent, so no custom instantiation
behavior is required. They are mangled as if they were direct calls to the
underlying literal operator. This matches g++'s apparent behavior (but not its
actual mangling, which is broken for literal-operator-ids).
User-defined *string* literals are now fully-operational, but the semantic
analysis is quite hacky and needs more work. No other forms of user-defined
literal are created yet, but the AST support for them is present.
This patch committed after midnight because we had already hit the quota for
new kinds of literal yesterday.
llvm-svn: 152211
command line options for inlining tuning.
This adds the option for stack depth bound as well as function size
bound.
+ minor doxygenification
llvm-svn: 151930
funopen, setvbuf.
Teach the checker and the engine about these APIs to resolve malloc
false positives. As I am adding more of these APIs, it is clear that all
this should be factored out into a separate callback (for example,
region escapes). Malloc, KeyChainAPI and RetainRelease checkers could
all use it.
llvm-svn: 151737
This introduces a concept of a "prunable" PathDiagnosticEvent. Currently this is a flag, but
we may evolve the concept to make this more dynamically inferred.
llvm-svn: 151663
When allocated buffer is passed to CF/NS..NoCopy functions, the
ownership is transfered unless the deallocator argument is set to
'kCFAllocatorNull'.
llvm-svn: 151608
that provides the behavior of the C++11 library trait
std::is_trivially_constructible<T, Args...>, which can't be
implemented purely as a library.
Since __is_trivially_constructible can have zero or more arguments, I
needed to add Yet Another Type Trait Expression Class, this one
handling arbitrary arguments. The next step will be to migrate
UnaryTypeTrait and BinaryTypeTrait over to this new, more general
TypeTrait class.
Fixes the Clang side of <rdar://problem/10895483> / PR12038.
llvm-svn: 151352
When we find two leak reports with the same allocation site, report only
one of them.
Provide a helper method to BugReporter to facilitate this.
llvm-svn: 151287
Make this call an exception in ExprEngine::invalidateArguments:
'int pthread_setspecific(ptheread_key k, const void *)' stores
a value into thread local storage. The value can later be retrieved
with 'void *ptheread_getspecific(pthread_key)'. So even thought the
parameter is 'const void *', the region escapes through the
call.
(Here we just blacklist the call in the ExprEngine's default
logic. Another option would be to add a checker which evaluates
the call and triggers the call to invalidate regions.)
Teach the Malloc Checker, which treats all system calls as safe about
the API.
llvm-svn: 151220
block pointer that returns a block literal which captures (by copy)
the lambda closure itself. Some aspects of the block literal are left
unspecified, namely the capture variable (which doesn't actually
exist) and the body (which will be filled in by IRgen because it can't
be written as an AST).
Because we're switching to this model, this patch also eliminates
tracking the copy-initialization expression for the block capture of
the conversion function, since that information is now embedded in the
synthesized block literal. -1 side tables FTW.
llvm-svn: 151131
Holding the constructor directly makes no sense when list-initialized arrays come into play. The constructor is now held in a CXXConstructExpr, if construction is what is done. The new design can also distinguish properly between list-initialization and direct-initialization, as well as implicit default-initialization constructors and explicit value-initialization constructors. Finally, doing it this way removes redundance from the AST because CXXNewExpr doesn't try to handle both the allocation and the initialization responsibilities.
This breaks the static analysis of new expressions. I've filed PR12014 to track this.
llvm-svn: 150682
piece can always be generated.
The default end of diagnostic path piece was failing to generate on a
BlockEdge that was outgoing from a basic block without a terminator,
resulting in a very simple diagnostic being rendered (ex: no path
highlighting or custom visitors). Reuse another function, which is
essentially doing the same thing and correct it not to fail when a block
has no terminator.
llvm-svn: 150659
is general goodness because representations of member pointers are
not always equivalent across member pointer types on all ABIs
(even though this isn't really standard-endorsed).
Take advantage of the new information to teach IR-generation how
to do these reinterprets in constant initializers. Make sure this
works when intermingled with hierarchy conversions (although
this is not part of our motivating use case). Doing this in the
constant-evaluator would probably have been better, but that would
require a *lot* of extra structure in the representation of
constant member pointers: you'd really have to track an arbitrary
chain of hierarchy conversions and reinterpretations in order to
get this right. Ultimately, this seems less complex. I also
wasn't quite sure how to extend the constant evaluator to handle
foldings that we don't actually want to treat as extended
constant expressions.
llvm-svn: 150551
(In response of Ted's review of r150112.)
This moves the logic which checked if a symbol escapes through a
parameter to invalidateRegionCallback (instead of post CallExpr visit.)
To accommodate the change, added a CallOrObjCMessage parameter to
checkRegionChanges callback.
llvm-svn: 150513
This seems to negatively affect compile time onsome ObjC tests
(which use a lot of partial diagnostics I assume). I have to come
up with a way to keep them inline without including Diagnostic.h
everywhere. Now adding a new diagnostic requires a full rebuild
of e.g. the static analyzer which doesn't even use those diagnostics.
This reverts commit 6496bd10dc3a6d5e3266348f08b6e35f8184bc99.
This reverts commit 7af19b817ba964ac560b50c1ed6183235f699789.
This reverts commit fdd15602a42bbe26185978ef1e17019f6d969aa7.
This reverts commit 00bd44d5677783527d7517c1ffe45e4d75a0f56f.
This reverts commit ef9b60ffed980864a8db26ad30344be429e58ff5.
llvm-svn: 150006
- Capturing variables by-reference and by-copy within a lambda
- The representation of lambda captures
- The creation of the non-static data members in the lambda class
that store the captured variables
- The initialization of the non-static data members from the
captured variables
- Pretty-printing lambda expressions
There are a number of FIXMEs, both explicit and implied, including:
- Creating a field for a capture of 'this'
- Improved diagnostics for initialization failures when capturing
variables by copy
- Dealing with temporaries created during said initialization
- Template instantiation
- AST (de-)serialization
- Binding and returning the lambda expression; turning it into a
proper temporary
- Lots and lots of semantic constraints
- Parameter pack captures
llvm-svn: 149977
- Move the offending methods out of line and fix transitive includers.
- This required changing an enum in the PPCallback API into an unsigned.
llvm-svn: 149782
Fix all the files that depended on transitive includes of Diagnostic.h.
With this patch in place changing a diagnostic no longer requires a full rebuild of the StaticAnalyzer.
llvm-svn: 149781
Original log:
Convert ProgramStateRef to a smart pointer for managing the reference counts of ProgramStates. This leads to a slight memory
improvement, and a simplification of the logic for managing ProgramState objects.
# Please enter the commit message for your changes. Lines starting
llvm-svn: 149339
Original log:
Convert ProgramStateRef to a smart pointer for managing the reference counts of ProgramStates. This leads to a slight memory
improvement, and a simplification of the logic for managing ProgramState objects.
llvm-svn: 149336
At this point this is largely cosmetic, but it opens the door to replace
ProgramStateRef with a smart pointer that more eagerly acts in the role
of reclaiming unused ProgramState objects.
llvm-svn: 149081
to the underlying consumer implementation. This allows us to unique reports across analyses to multiple functions (which
shows up with inlining).
llvm-svn: 148997
This is accomplished by periodically reclaiming nodes in the graph. This was an optimization
done before the CFG was linearized, but the CFG linearization destroyed that optimization since each
freshly created node couldn't be reclaimed and we only looked at a window of nodes created between
each ProcessStmt. This optimization can be reclaimed my merely expanding the window to N number of nodes.
llvm-svn: 148888
- Add atomic-to/from-nonatomic cast types
- Emit atomic operations for arithmetic on atomic types
- Emit non-atomic stores for initialisation of atomic types, but atomic stores and loads for every other store / load
- Add a __atomic_init() intrinsic which does a non-atomic store to an _Atomic() type. This is needed for the corresponding C11 stdatomic.h function.
- Enables the relevant __has_feature() checks. The feature isn't 100% complete yet, but it's done enough that we want people testing it.
Still to do:
- Make the arithmetic operations on atomic types (e.g. Atomic(int) foo = 1; foo++;) use the correct LLVM intrinsic if one exists, not a loop with a cmpxchg.
- Add a signal fence builtin
- Properly set the fenv state in atomic operations on floating point values
- Correctly handle things like _Atomic(_Complex double) which are too large for an atomic cmpxchg on some platforms (this requires working out what 'correctly' means in this context)
- Fix the many remaining corner cases
llvm-svn: 148242
My hope is to reimplement this from first principles based on the simplifications of removing unneeded node builders
and re-evaluating how C++ calls are handled in the CFG. The hope is to turn inlining "on-by-default" as soon as possible
with a core set of things working well, and then expand over time.
llvm-svn: 147904
We already have a more conservative check in the compiler (if the
format string is not a literal, we warn). Still adding it here for
completeness and since this check is stronger - only triggered if the
format string is tainted.
llvm-svn: 147714
(Stmt*,LocationContext*) pairs to SVals instead of Stmt* to SVals.
This is needed to support basic IPA via inlining. Without this, we cannot tell
if a Stmt* binding is part of the current analysis scope (StackFrameContext) or
part of a parent context.
This change introduces an uglification of the use of getSVal(), and thus takes
two steps forward and one step back. There are also potential performance implications
of enlarging the Environment. Both can be addressed going forward by refactoring the
APIs and optimizing the internal representation of Environment. This patch
mainly introduces the functionality upon when we want to build upon (and clean up).
llvm-svn: 147688
as a result of a call.
Problem:
Global variables, which come in from system libraries should not be
invalidated by all calls. Also, non-system globals should not be
invalidated by system calls.
Solution:
The following solution to invalidation of globals seems flexible enough
for taint (does not invalidate stdin) and should not lead to too
many false positives. We split globals into 3 classes:
* immutable - values are preserved by calls (unless the specific
global is passed in as a parameter):
A : Most system globals and const scalars
* invalidated by functions defined in system headers:
B: errno
* invalidated by all other functions (note, these functions may in
turn contain system calls):
B: errno
C: all other globals (which are not in A nor B)
llvm-svn: 147569
type is a pointer to const. (radar://10595327)
The regions corresponding to the pointer and reference arguments to
a function get invalidated by the calls since a function call can
possibly modify the pointed to data. With this change, we are not going
to invalidate the data if the argument is a pointer to const. This
change makes the analyzer more optimistic in reporting errors.
(Support for C, C++ and Obj C)
llvm-svn: 147002
We are now often generating expressions even if the solver is not known to be able to simplify it. This is another cleanup of the existing code, where the rest of the analyzer and checkers should not base their logic on knowing ahead of the time what the solver can reason about.
In this case, CStringChecker is performing a check for overflow of 'left+right' operation. The overflow can be checked with either 'maxVal-left' or 'maxVal-right'. Previously, the decision was based on whether the expresion evaluated to undef or not. With this patch, we check if one of the arguments is a constant, in which case we know that 'maxVal-const' is easily simplified. (Another option is to use canReasonAbout() method of the solver here, however, it's currently is protected.)
This patch also contains 2 small bug fixes:
- swap the order of operators inside SValBuilder::makeGenericVal.
- handle a case when AddeVal is unknown in GenericTaintChecker::getPointedToSymbol.
llvm-svn: 146343
Fix a bug in SimpleSValBuilder, where we should swap lhs and rhs when calling generateUnknownVal(), - the function which creates symbolic expressions when data is tainted. The issue is not visible when we only create the expressions for taint since all expressions are commutative from taint perspective.
Refactor SymExpr::symbol_iterator::expand() to use a switch instead of a chain of ifs.
llvm-svn: 146336
types are equivalent.
+ A taint test which tests bitwise operations and which was
triggering an assertion due to presence of the integer to integer cast.
llvm-svn: 146240
- Created a new SymExpr type - SymbolCast.
- SymbolCast is created when we don't know how to simplify a NonLoc to
NonLoc casts.
- A bit of code refactoring: introduced dispatchCast to have better
code reuse, remove a goto.
- Updated the test case to showcase the new taint flow.
llvm-svn: 145985
class.
We are going into the direction of handling SymbolData and other SymExpr
uniformly, so it makes less sense to keep two different SVal classes.
For example, the checkers would have to take an extra step to reason
about each type separately.
The classes have the same members, we were just using the SVal kind
field for easy differentiation in 3 switch statements. The switch
statements look more ugly now, but we can make the code more readable in
other ways, for example, moving some code into separate functions.
llvm-svn: 145833
ExprEngine.
Teach SimpleConstraintManager::assumeSymRel() to propagate constraints
to symbolic expressions.
+ One extra warning (real bug) is now generated due to enhanced
assumeSymRel().
llvm-svn: 145832
ConstraintManager::canReasonAbout() from the ExprEngine.
ExprEngine should not care if the constraint solver can reason about
something or not. The solver should be able to handle all the SymExprs.
To do this, the solver should be able to keep track of not only the
SymbolData but of all SymExprs. This is why we change SymbolRef to be an
alias of SymExpr*. When encountering an expression it cannot simplify,
the solver should just add the constraints to it.
llvm-svn: 145831
We are getting name of the called function or it's declaration in a few checkers. Refactor them to use the helper function in the CheckerContext.
llvm-svn: 145576
When the solver and SValBuilder cannot reason about symbolic expressions (ex: (x+1)*y ), the analyzer conjures a new symbol with no ties to the past. This helps it to recover some path-sensitivity. However, this breaks the taint propagation.
With this commit, we are going to construct the expression even if we cannot reason about it later on if an operand is tainted.
Also added some comments and asserts.
llvm-svn: 144932
TaintTag.h will contain definitions of different taint kinds and their properties.
TaintManager will be responsible for implementing taint specific operations, storing taint.
ProgramState will provide API to add/remove taint.
llvm-svn: 144824
property references to use a new PseudoObjectExpr
expression which pairs a syntactic form of the expression
with a set of semantic expressions implementing it.
This should significantly reduce the complexity required
elsewhere in the compiler to deal with these kinds of
expressions (e.g. IR generation's special l-value kind,
the static analyzer's Message abstraction), at the lower
cost of specifically dealing with the odd AST structure
of these expressions. It should also greatly simplify
efforts to implement similar language features in the
future, most notably Managed C++'s properties and indexed
properties.
Most of the effort here is in dealing with the various
clients of the AST. I've gone ahead and simplified the
ObjC rewriter's use of properties; other clients, like
IR-gen and the static analyzer, have all the old
complexity *and* all the new complexity, at least
temporarily. Many thanks to Ted for writing and advising
on the necessary changes to the static analyzer.
I've xfailed a small diagnostics regression in the static
analyzer at Ted's request.
llvm-svn: 143867
implicitly perform an lvalue-to-rvalue conversion if used on an lvalue
expression. Also improve the documentation of Expr::Evaluate* to indicate which
of them will accept expressions with side-effects.
llvm-svn: 143263
Enqueue the nodes generated as the result of processing a statement
inside the Core Engine. This makes sure ExpEngine does not access
CoreEngine's private members and is more concise.
llvm-svn: 143089
Remove dead members/parameters: ProgramState, respondsToCallback, autoTransition.
Remove addTransition method since it's the same as generateNode. Maybe we should
rename generateNode to genTransition (since a transition is always automatically
generated)?
llvm-svn: 142946
Get rid of the EndOfPathBuilder completely.
Use the generic NodeBuilder to generate nodes.
Enqueue the end of path frontier explicitly.
llvm-svn: 142943
statements. As noted in the documentation for the AST node, the
semantics of __if_exists/__if_not_exists are somewhat different from
the way Visual C++ implements them, because our parsed-template
representation can't accommodate VC++ semantics without serious
contortions. Hopefully this implementation is "good enough".
llvm-svn: 142901
This commit removes the major functional dependency on the ExprEngine::Builder
member variable.
In some cases the code became more verbose. Particularly, we call takeNodes()
and addNodes() to move responsibility for the nodes from one builder to another.
This will get simplified later on.
llvm-svn: 142831
To convert iteratively, we take the nodes the local builder will
process from the from the global builder and add the generated nodes
after the short lived builder is done. PureStmtNodeBuilder is the
one we should eventually use everywhere. Added Stmt index and Builder
context as ExprEngine globals. To avoid passing them around.
llvm-svn: 142828
First step toward removing the global Stmt builder. Added several transitional methods (like takeNodes/addNodes).
+ Stop early if the set of exploded nodes for the next iteration is empty.
llvm-svn: 142827
This moves the responsibility for storing the output node set from the
builder to the clients. The builder is just responsible for transforming
an input set into the output set: {SrcSet/SrcNode} -> {Frontier}.
llvm-svn: 142826
NodeBuilder should not assume it's dealing with a single predecessor. Remove predecessor getters. Modify the BranchNodeBuilder to not be responsible for doing auto-transitions (which depend on a predecessor).
llvm-svn: 142453
It now only depends on a generic NodeBuilder instead. As part of this change, make the generic node builder results finalized by default.
llvm-svn: 142452
Take advantage of the new builders for branch processing. As part of this change pass generic NodeBuilder (instead of BranchNodeBuilder) to the BranchCondition callback and remove the unused methods form BranchBuilder.
llvm-svn: 142448
Currently we have a bunch of different node builders which provide some common
functionality but are difficult to refactor. Each builder generates nodes of
different kinds and calculates the frontier nodes, which should be propagated
to the next step (after the builder dies).
Introduce a new NodeBuilder which provides very basic node generation facilities
but takes care of the second problem. The idea is that all the other builders
will eventually use it. Use this builder in CheckerContext instead of
StmtNodeBuilder (the way the frontier is propagated to the StmtBuilder
is a hack and will be removed later on).
llvm-svn: 142443
- Remodel Expr::EvaluateAsInt to behave like the other EvaluateAs* functions,
and add Expr::EvaluateKnownConstInt to capture the current fold-or-assert
behaviour.
- Factor out evaluation of bitfield bit widths.
- Fix a few places which would evaluate an expression twice: once to determine
whether it is a constant expression, then again to get the value.
llvm-svn: 141561
Pull out the declaration of the ScanReachableSymbols so that it can be used directly. Document ProgramState::scanReachableSymbols() methods.
llvm-svn: 140323
- Get rid of PathDiagnosticLocation(SourceRange r,..) constructor by providing a bunch of create methods.
- The PathDiagnosticLocation(SourceLocation L,..), which is used by crate methods, will eventually become private.
- Test difference is in the case when the report starts at the beginning of the function. We used to represent that point as a range of the very first token in the first statement. Now, it's just a single location representing the first character of the first statement.
llvm-svn: 139932
- The closing brace is always a single location, not a range.
- The test case previously had a location key 57:1 followed by a range [57:1 - 57:1].
llvm-svn: 139832
- Fix a fixme and move the logic of creating a PathDiagnosticLocation corresponding to a ProgramPoint into a PathDiagnosticLocation constructor.
- Rename PathDiagnosticLocation::create to differentiate from the added constructor.
llvm-svn: 139825
- Modify all PathDiagnosticLocation constructors that take Stmt to also requre LocationContext.
- Add a constructor which should be used in case there is no valid statement/location (it will grab the location of the enclosing function).
llvm-svn: 139763
- It adds LocationContext to the PathDiagnosticLocation object and uses it to lookup the enclosing statement with a valid location.
- So far, the LocationContext is only available when the object is constructed from the ExplodedNode.
- Already found some subtle bugs(in plist-output-alternate.m) where the intermediate diagnostic steps were not previously shown.
llvm-svn: 139703
the lifetime of the block by copying it to the heap, or else we'll get
a dangling reference because the code working with the non-block-typed
object will not know it needs to copy.
There is some danger here, e.g. with assigning a block literal to an
unsafe variable, but, well, it's an unsafe variable.
llvm-svn: 139451
than conversions of C pointers to ObjC pointers. In order to ensure that
we've caught every case, add asserts to CastExpr that strictly determine
which cast kind is used for which kind of bit cast.
llvm-svn: 139352
Remove TransferFuncs from ExprEngine and AnalysisConsumer.
Demote RetainReleaseChecker to a regular checker, and give it the name osx.cocoa.RetainCount (class name change coming shortly). Update tests accordingly.
llvm-svn: 138998
Unlike the other callbacks, this one is a simple virtual method, since it is only to be used for debugging.
This new callback replaces the old ProgramState::Printer interface, and allows us to move the printing of refcount bindings from CFRefCount to RetainReleaseChecker.
llvm-svn: 138728
This is a common path for function and C++ method calls, Objective-C messages and property accesses, and C++ construct-exprs.
As support, add message receiver accessors to ObjCMessage and CallOrObjCMessage.
llvm-svn: 138718
Also, allow CallOrObjCMessage to wrap a CXXConstructExpr as well.
Finally, this allows us to remove the clunky whitelisting system from CFRefCount/RetainReleaseChecker. Slight regression due to CXXNewExprs not yet being handled in post-statement callbacks (PR forthcoming).
llvm-svn: 138716
Also convert stack-addr-ps.cpp to use the analyzer instead of just Sema, now
that it doesn't crash, and extract the stack-block test into another file since
it errors, and that prevents the analyzer from running.
llvm-svn: 138613
Because Checkers live for an entire translation unit, this persists summary caches across multiple code bodies and avoids repeated initialization (but probably at the cost of memory). This removes the last references from RetainReleaseChecker to CFRefCount.
llvm-svn: 138529
This is a very small regression (actually introduced in r138309) because it won't catch leaks of objects passed by reference to CFDictionaryCreate (they're considered to have escaped and are ignored). If this is important we can put in a specific eval::Call to restore the functionality.
llvm-svn: 138464
incorrectly in the CFG, and also the static analyzer. This patch regresses the analyzer a bit, but
that needs to be followed up with a better solution.
Fixes <rdar://problem/10008112>.
llvm-svn: 138372
1) Create a header file to expose the predefined visitors. And move the parent(BugReporterVisitor) there as well.
2) Remove the registerXXXVisitor functions - the Visitor constructors/getters can be used now to create the object. One exception is registerVarDeclsLastStore(), which registers more then one visitor, so make it static member of FindLastStoreBRVisitor.
3) Modify all the checkers to use the new API.
llvm-svn: 138126
One API change: I added BugReporter as an additional parameter to the BugReporterVisitor::VisitNode() method to allow visitors register other visitors with the report on the fly (while processing a node). This functionality is used by NilReceiverVisitor, which registers TrackNullOrUndefValue when the receiver is null.
llvm-svn: 138001
Having a notion of an actual ProgramPointTag will aid in introspection of the analyzer's behavior.
For example, the GraphViz output of the analyzer will pretty-print the tags in a useful manner.
llvm-svn: 137529
1) Change SymbolDependTy map to keep pointers as data. And other small tweaks like making the DenseMap smaller 64->16 elements; remove removeSymbolDependencies() as it will probably not be used.
2) Do not mark dependents live more then once.
llvm-svn: 137401
The motivation of this large change is to drastically simplify the logic in ExprEngine going forward.
Some fallout is that the output of some BugReporterVisitors is not as accurate as before; those will
need to be fixed over time. There is also some possible performance regression as RemoveDeadBindings
will be called frequently; this can also be improved over time.
llvm-svn: 136419
FullSourceLoc::getInstantiationLoc to ...::getExpansionLoc. This is part
of the API and documentation update from 'instantiation' as the term for
macros to 'expansion'.
llvm-svn: 135914
methods, including indirectly overridden methods like those
declared in protocols and categories. There are mismatches
that we would like to diagnose but aren't yet, but this
is fine for now.
I looked at approaches that avoided doing this lookup
unless we needed it, but the infer-related-result-type
checks were doing it anyway, so I left it with the same
fast-path check for no previous declartions of that
selector.
llvm-svn: 135743
to represent a fully-substituted non-type template parameter.
This should improve source fidelity, as well as being generically
useful for diagnostics and such.
llvm-svn: 135243
where we have an immediate need of a retained value.
As an exception, don't do this when the call is made as the immediate
operand of a __bridge retain. This is more in the way of a workaround
than an actual guarantee, so it's acceptable to be brittle here.
rdar://problem/9504800
llvm-svn: 134605
MaterializeTemporaryExpr captures a reference binding to a temporary
value, making explicit that the temporary value (a prvalue) needs to
be materialized into memory so that its address can be used. The
intended AST invariant here is that a reference will always bind to a
glvalue, and MaterializeTemporaryExpr will be used to convert prvalues
into glvalues for that binding to happen. For example, given
const int& r = 1.0;
The initializer of "r" will be a MaterializeTemporaryExpr whose
subexpression is an implicit conversion from the double literal "1.0"
to an integer value.
IR generation benefits most from this new node, since it was
previously guessing (badly) when to materialize temporaries for the
purposes of reference binding. There are likely more refactoring and
cleanups we could perform there, but the introduction of
MaterializeTemporaryExpr fixes PR9565, a case where IR generation
would effectively bind a const reference directly to a bitfield in a
struct. Addresses <rdar://problem/9552231>.
llvm-svn: 133521
Language-design credit goes to a lot of people, but I particularly want
to single out Blaine Garst and Patrick Beard for their contributions.
Compiler implementation credit goes to Argyrios, Doug, Fariborz, and myself,
in no particular order.
llvm-svn: 133103
There's no associated test for this because fully-constrained symbolic values are evaluated ahead of time in normal expressions. This can only come up in checker-constructed expressions (like the ones in an upcoming patch to CStringChecker).
llvm-svn: 133041
Also, have Environment stop looking through NoOp casts; it didn't match the behavior of LiveVariables. And once that's gone, the whole cast block of that switch is unnecessary.
llvm-svn: 132840
__builtin_astype(): Used to reinterpreted as another data type of the same size using for both scalar and vector data types.
Added test case.
llvm-svn: 132612
Type::isUnsignedIntegerOrEnumerationType(), which are like
Type::isSignedIntegerType() and Type::isUnsignedIntegerType() but also
consider the underlying type of a C++0x scoped enumeration type.
Audited all callers to the existing functions, switching those that
need to also handle scoped enumeration types (e.g., those that deal
with constant values) over to the new functions. Fixes PR9923 /
<rdar://problem/9447851>.
llvm-svn: 131735
- New isDefined() function checks for deletedness
- isThisDeclarationADefinition checks for deletedness
- New doesThisDeclarationHaveABody() does what
isThisDeclarationADefinition() used to do
- The IsDeleted bit is not propagated across redeclarations
- isDeleted() now checks the canoncial declaration
- New isDeletedAsWritten() does what it says on the tin.
- isUserProvided() now correct (thanks Richard!)
This fixes the bug that we weren't catching
void foo() = delete;
void foo() {}
as being a redefinition.
llvm-svn: 131013
Patch authored by John Wiegley.
These are array type traits used for parsing code that employs certain
features of the Embarcadero C++ compiler: __array_rank(T) and
__array_extent(T, Dim).
llvm-svn: 130351
Patch authored by David Abrahams.
These two expression traits (__is_lvalue_expr, __is_rvalue_expr) are used for
parsing code that employs certain features of the Embarcadero C++ compiler.
llvm-svn: 130122
for __unknown_anytype resolution to destructively modify the AST. So that's
what it does now, which significantly simplifies some of the implementation.
Normal member calls work pretty cleanly now, and I added support for
propagating unknown-ness through &.
llvm-svn: 129331
represents a dynamic cast where we know that the result is always null.
For example:
struct A {
virtual ~A();
};
struct B final : A { };
struct C { };
bool f(B* b) {
return dynamic_cast<C*>(b);
}
llvm-svn: 129256
to be reworked to model CallEnter/CallExit (just like all other calls). For now, treat constructors mostly
like other function calls, making the analysis of C++ code just a little more useful.
llvm-svn: 129166
The idea is that you can create a VarDecl with an unknown type, or a
FunctionDecl with an unknown return type, and it will still be valid to
access that object as long as you explicitly cast it at every use. I'm
still going back and forth about how I want to test this effectively, but
I wanted to go ahead and provide a skeletal implementation for the LLDB
folks' benefit and because it also improves some diagnostic goodness for
placeholder expressions.
llvm-svn: 129065
1) Change the CFG to include the DeclStmt for conditional variables, instead of using the condition itself as a faux DeclStmt.
2) Update ExprEngine (the static analyzer) to understand (1), so not to regress.
3) Update UninitializedValues.cpp to initialize all tracked variables to Uninitialized at the start of the function/method.
4) Only use the SelfReferenceChecker (SemaDecl.cpp) on global variables, leaving the dataflow analysis to handle other cases.
The combination of (1) and (3) allows the dataflow-based -Wuninitialized to find self-init problems when the initializer
contained control-flow.
llvm-svn: 128858
from how we process ordinary function calls, had a tremendous about of redundancy, and relied
strictly on inlining behavior (which was incomplete) to provide semantics instead of falling
back to the conservative analysis we use for C functions. This is a significant step into
making C++ analyzer support more useful.
llvm-svn: 128557
conventional categories into Basic and AST. Update the self-init checker
to use this logic; CFRefCountChecker is complicated enough that I didn't
want to touch it.
llvm-svn: 126817
- renames evalCastNL and evalCastL to evalCastFromNonLoc and
evalCastFromLoc (avoid abbreviations that aren't well known).
- makes all function parameter names start with a lower case letter
for consistency and distinction from member variables.
- avoids abbreviations in function parameter names.
Reviewed by kremenek@apple.com.
llvm-svn: 126722
A checker can register as receiver/listener of "events" (basically it registers a callback
with a function getting called with an argument of the event type) and other checkers can
register as "dispatchers" and can pass an event object to all the listeners.
This allows cooperation amongst checkers but with very loose coupling.
llvm-svn: 126658
This fixes a crash reported in PR9287, and also fixes a false positive involving the value of such ternary
expressions not properly getting propagated.
llvm-svn: 126362
-Introduce EndOfFunctionNodeBuilder::withCheckerTag to allow it be "specialized" with a
checker tag and not require the checkers to pass a tag.
-For EndOfFunctionNodeBuilder::generateNode, reverse the order of tag/P parameters since
there are actual calls that assume the second parameter is ExplodedNode.
llvm-svn: 126332
-In general, don't have the BugReporter deleting BugTypes, BugTypes will eventually become owned by checkers
and outlive the BugReporter. In the meantime, there will be some leaks since some checkers assume that
the BugTypes they create will be destroyed by the BugReporter.
-Have BugReporter::EmitBasicReport create BugTypes that are reused if the same name & category strings
are passed to EmitBasicReport. These BugTypes are owned and destroyed by the BugReporter.
This allows bugs reported through EmitBasicReport to be coalesced.
-Remove the llvm::FoldingSet<BugReportEquivClass> from BugType and move it into the BugReporter.
For uniquing BugReportEquivClass also use the BugType* so that we can iterate over all of them using only one set.
llvm-svn: 126272
-Migrate ObjCSelfInitChecker to CheckerV2. In the process remove the 'preCallSelfFlags' field
from the checker class and use GRState for storing that info.
-Get ExprEngine to start delegating checker running to CheckerManager.
llvm-svn: 126229
This yields a minor memory reduction (for larger functions) on Sqlite at the cost of slightly
higher memory usage on some functions because of the increased size of GRState (which can be optimized).
I expect the real memory savings from this enhancement will come when we aggressively
canabilize more of the ExplodedGraph.
llvm-svn: 126012
-Introduce CheckerV2, a set of templates for convenient declaration & registration of checkers.
Currently useful just for checkers working on the AST not the path-sensitive ones.
-Enhance CheckerManager to actually collect the checkers and turn it into the entry point for
running the checkers.
-Use the new mechanism for the LLVMConventionsChecker.
llvm-svn: 125778
class and to bind the shared value using OpaqueValueExpr. This fixes an
unnoticed problem with deserialization of these expressions where the
deserialized form would lose the vital pointer-equality trait; or rather,
it fixes it because this patch also does the right thing for deserializing
OVEs.
Change OVEs to not be a "temporary object" in the sense that copy elision is
permitted.
This new representation is not totally unawkward to work with, but I think
that's really part and parcel with the semantics we're modelling here. In
particular, it's much easier to fix things like the copy elision bug and to
make the CFG look right.
I've tried to update the analyzer to deal with this in at least some
obvious cases, and I think we get a much better CFG out, but the printing
of OpaqueValueExprs probably needs some work.
llvm-svn: 125744
LabelDecl and LabelStmt. There is a 1-1 correspondence between the
two, but this simplifies a bunch of code by itself. This is because
labels are the only place where we previously had references to random
other statements, causing grief for AST serialization and other stuff.
This does cause one regression (attr(unused) doesn't silence unused
label warnings) which I'll address next.
This does fix some minor bugs:
1. "The only valid attribute " diagnostic was capitalized.
2. Various diagnostics printed as ''labelname'' instead of 'labelname'
3. This reduces duplication of label checking between functions and blocks.
Review appreciated, particularly for the cindex and template bits.
llvm-svn: 125733
-Checkers will be defined in the tablegen file 'Checkers.td'.
-Apart from checkers, we can define checker "packages" that will contain a collection of checkers.
-Checkers can be enabled with -analyzer-checker=<name> and disabled with -analyzer-disable-checker=<name> e.g:
Enable checkers from 'cocoa' and 'corefoundation' packages except the self-initialization checker:
-analyzer-checker=cocoa -analyzer-checker=corefoundation -analyzer-disable-checker=cocoa.SelfInit
-Introduces CheckerManager and CheckerProvider. CheckerProviders get the set of checker names to enable/disable and
register them with the CheckerManager which will be the entry point for all checker-related functionality.
Currently only the self-initialization checker takes advantage of the new mechanism.
llvm-svn: 125503
The optimization involves eagerly pruning ExplodedNodes from the ExplodedGraph that contain
practically no difference between the predecessor and successor nodes. For example, if
the state is different between a predecessor and a node, the node is left in. Only for
the 'environment' component of the state do we not care if the ExplodedNodes are different.
This paves the way for future optimizations where we can reclaim the environment objects.
llvm-svn: 125154
Eventually there will also be a lib/StaticAnalyzer/Frontend that will handle initialization and checker registration.
Yet another library to avoid cyclic dependencies between Core and Checkers.
llvm-svn: 125124