Previously, we preferred to get a result type by looking at the callee's
declared result type. This allowed us to handlereferences, which are
represented in the AST as lvalues of their pointee type. (That is, a call
to a function returning 'int &' has type 'int' and value kind 'lvalue'.)
However, this results in us preferring the original type of a function
over a casted type. This is a problem when a function pointer is casted
to another type, because the conjured result value will have the wrong
type. AdjustedReturnValueChecker is supposed to handle this, but still
doesn't handle the case where there is no "original function" at all,
i.e. where the callee is unknown.
Now, we instead look at the call expression's value kind (lvalue, xvalue,
or prvalue), and adjust the expr's type accordingly. This will have no
effect when the function is inlined, and will conjure the value that will
actually be used when it is not.
This makes AdjustedReturnValueChecker /nearly/ unnecessary; unfortunately,
the cases where it would still be useful are where we need to cast the
result of an inlined function or a checker-evaluated function, and in these
cases we don't know what we're casting /from/ by the time we can do post-
call checks. In light of that, remove AdjustedReturnValueChecker, which
was already not checking quite a few calls.
llvm-svn: 163065
This is similar to how we divide up the StaticAnalyzer libraries to separate
core functionality to what is clearly associated with Frontend actions.
llvm-svn: 163050
More generally, this adds a new configuration option 'c++-inlining', which
controls which C++ member functions can be considered for inlining. This
uses the new -analyzer-config table, so the cc1 arguments will look like this:
... -analyzer-config c++-inlining=[none|methods|constructors|destructors]
Note that each mode implies that all the previous member function kinds
will be inlined as well; it doesn't make sense to inline destructors
without inlining constructors, for example.
The default mode is 'methods'.
llvm-svn: 163004
PathDiagnostics are actually profiled and uniqued independently of the
path on which the bug occurred. This is used to merge diagnostics that
refer to the same issue along different paths, as well as by the plist
diagnostics to reference files created by the HTML diagnostics.
However, there are two problems with the current implementation:
1) The bug description is included in the profile, but some
PathDiagnosticConsumers prefer abbreviated descriptions and some
prefer verbose descriptions. Fixed by including both descriptions in
the PathDiagnostic objects and always using the verbose one in the profile.
2) The "minimal" path generation scheme provides extra information about
which events came from macros that the "extensive" scheme does not.
This resulted not only in different locations for the plist and HTML
diagnostics, but also in diagnostics being uniqued in the plist output
but not in the HTML output. Fixed by storing the "end path" location
explicitly in the PathDiagnostic object, rather than trying to find the
last piece of the path when the diagnostic is requested.
This should hopefully finish unsticking our internal buildbot.
llvm-svn: 162965
Basically, do the correct thing to fix the XML generation error, rather
than making it even worse by unilaterally dereferencing a null pointer.
llvm-svn: 162964
(__builtin_* etc.) so that it isn't possible to take their address.
Specifically, introduce a new type to represent a reference to a builtin
function, and a new cast kind to convert it to a function pointer in the
operand of a call. Fixes PR13195.
llvm-svn: 162962
reanalyzed.
The policy on what to reanalyze should be in AnalysisConsumer with the
rest of visitation order logic.
There is no reason why ExprEngine needs to pass the Visited set to
CoreEngine, it can populate it itself.
llvm-svn: 162957
If the current path diagnostic does /not/ have files associated with it, we
were simply skipping on to the next diagnostic with 'continue'. But that
also skipped the close tag for the diagnostic's <dict> node.
Part of fixing our internal analyzer buildbot.
llvm-svn: 162939
This heuristic addresses the case when a pointer (or ref) is passed
to a function, which initializes the variable (or sets it to something
other than '0'). On the branch where the inlined function does not
set the value, we report use of undefined value (or NULL pointer
dereference). The access happens in the caller and the path
through the callee would get pruned away with regular path pruning. To
solve this issue, we previously disabled diagnostic pruning completely
on undefined and null pointer dereference checks, which entailed very
verbose diagnostics in most cases. Furthermore, not all of the
undef value checks had the diagnostic pruning disabled.
This patch implements the following heuristic: if we pass a pointer (or
ref) to the region (on which the error is reported) into a function and
it's value is either undef or 'NULL' (and is a pointer), do not prune
the function.
llvm-svn: 162863
a comma separated collection of key:value pairs (which are strings). This
allows a general way to provide analyzer configuration data from the command line.
No clients yet.
llvm-svn: 162827
Specifically, CallEventManager::getCaller was looking at the call site for
an inlined call and trying to see what kind of call it was, but it only
checked for CXXConstructExprClass. (It's not using an isa<> here to avoid
doing three more checks on the the statement class.)
This caused an unreachable when we actually did inline the constructor of a
temporary object.
PR13717
llvm-svn: 162792
When exiting a function, the analyzer looks for the last statement in the
function to see if it's a return statement (and thus bind the return value).
However, the search for "the last statement" was accepting statements that
were in implicitly-generated inlined functions (i.e. destructors). So we'd
go and get the statement from the destructor, and then say "oh look, this
function had no explicit return...guess there's no return value". And /that/
led to the value being returned being declared dead, and all our leak
checkers complaining.
llvm-svn: 162791
No test case since this is a debug option that we will never turn on by
default since it makes the leak checkers much less useful. (We'll only report
leaks at the end of analysis if -analyzer-purge=none.)
llvm-svn: 162772
This helper function (in the clang::ento::bugreporter namespace) may add more
than one visitor, but conceptually it's tracking a single use of a null or
undefined value and should do so as best it can.
Also, the BugReport parameter has been made a reference to underscore that
it is non-optional.
llvm-svn: 162720
As Anna pointed out to me offline, it's a little silly to walk backwards through
the graph to find the store site when BugReporter will do the exact same walk
as part of path diagnostic generation.
llvm-svn: 162719
Previously, if we were tracking stores to a variable 'x', and came across this:
x = foo();
...we would simply emit a note here and stop. Now, we'll step into 'foo' and
continue tracking the returned value from there.
<rdar://problem/12114689>
llvm-svn: 162718
The two callers are using this in order to be conservative, so let's just
clarify the information that's actually being provided here. This is not
related to inlining decisions in any way.
No functionality change.
llvm-svn: 162717
Because the CXXNewExpr appears after the CXXConstructExpr in the CFG, we don't
actually have the correct region to construct into at the time we decide
whether or not to inline. The long-term fix (discussed in PR12014) might be to
introduce a new CFG node (CFGAllocator) that appears before the constructor.
Tracking the short-term fix in <rdar://problem/12180598>.
llvm-svn: 162689
This allows us to better reason about status objects, like Clang's own
llvm::Optional (when its contents are trivially destructible), which are
often intended to be passed around by value.
We still don't inline constructors for temporaries in the general case.
<rdar://problem/11986434>
llvm-svn: 162681
This allows checkers (like the MallocChecker) to process the effects of the
bind. Previously, using a memory-allocating function (like strdup()) in an
initializer would result in a leak warning.
This does bend the expectations of checkBind a bit; since there is no
assignment expression, the statement being used is the initializer value.
In most cases this shouldn't matter because we'll use a PostInitializer
program point (rather than PostStmt) for any checker-generated nodes, though
we /will/ generate a PostStore node referencing the internal statement.
(In theory this could have funny effects if someone actually does an
assignment within an initializer; in practice, that seems like it would be
very rare.)
<rdar://problem/12171711>
llvm-svn: 162637
generated for a given diagnostic to another. Because PathDiagnostics
are specific to a give PathDiagnosticConsumer, store in
a FoldingSet a unique hash for a PathDiagnostic (that will be the same
for the same bug for different PathDiagnosticConsumers) that
stores a list of files generated. This can then be read by the
other PathDiagnosticConsumers.
This fixes breakage in the PLIST-HTML output.
llvm-svn: 162580
More generally, any time we try to track where a null value came from, we
should show if it came from a function. This usually isn't necessary if
the value is symbolic, but if the value is just a constant we previously
just ignored its origin entirely. Now, we'll step into the function and
recursively add a visitor to the returned expression.
<rdar://problem/12114609>
llvm-svn: 162563
With inlining, retain count checker starts tracking 'self' through the
init methods. The analyser results were too noisy if the developer
did not follow 'self = [super init]' pattern (which is common
especially in older code bases) - we reported self init anti-pattern AND
possible use-after-free. This patch teaches the retain count
checker to assume that [super init] does not fail when it's not consumed
by another expression. This silences the retain count warning that warns
about possibility of use-after-free when init fails, while preserving
all the other checking on 'self'.
llvm-svn: 162508
Until we have full support for pointers-to-members, we can at least
approximate some of their use by tracking null and non-null values.
We thus treat &A::m_ptr as a non-null void * symbol, and MemberPointer(0)
as a pointer-sized null constant.
This enables support for what is sometimes called the "safe bool" idiom,
demonstrated in the test case.
llvm-svn: 162495
This is trivial; the UserDefinedConversion always wraps a CXXMemberCallExpr
for the appropriate conversion function, so it's just a matter of
propagating that value to the CastExpr itself.
llvm-svn: 162494
A CXXDefaultArgExpr wraps an Expr owned by a ParmVarDecl belonging to the
called function. In general, ExprEngine and Environment ought to treat this
like a ParenExpr or other transparent wrapper expression, with the inside
expression evaluated first.
However, if we call the same function twice, we'd produce a CFG that contains
the same wrapped expression twice, and we're not set up to handle that. I've
added a FIXME to the CFG builder to come back to that, but meanwhile we can
at least handle expressions that don't need to be explicitly evaluated:
literals. This probably handles many common uses of default parameters:
true/false, null, etc.
Part of PR13385 / <rdar://problem/12156507>
llvm-svn: 162453
As part of this change, I discovered that a few of our tests were not testing
the RangeConstraintManager. Luckily all of those passed when I moved them
over to use that constraint manager.
llvm-svn: 162384
Also rename 'getCurrentBlockCounter()' to 'blockCount()'.
This ripples a bunch of code simplifications; mostly aesthetic,
but makes the code a bit tighter.
llvm-svn: 162349
No need to have the "get", the word "conjure" is a verb too!
Getting a conjured symbol is the same as conjuring one up.
This shortening is largely cosmetic, but just this simple changed
cleaned up a handful of lines, making them less verbose.
llvm-svn: 162348
Under -analyzer-ipa=basic-inlining, only C functions, blocks, and C++ static
member functions are inlined -- essentially, the calls that behave like simple
C function calls. This is essentially the behavior in Xcode 4.4.
C++ support still has some rough edges, and we don't want users to be worried
about them if they download and run their own checker. (In particular, the
massive number of false positives for analyzing LLVM comes from inlining
defensively-written code in contexts where more aggressive assumptions are
implicitly made. This problem is not unique to C++, but it is exacerbated by
the higher proportion of code that lives in header files in C++.)
The eventual goal is to be comfortable enough with C++ support (and simple
Objective-C support) to advance to -analyzer-ipa=inlining as the default
behavior. See the IPA design notes for more details.
llvm-svn: 162318
This reduces duplication across the Basic and Range constraint managers, and
keeps their internals free of dealing with the semantics of C++. It's still
a little unfortunate that the constraint manager is dealing with this at all,
but this is pretty much the only place to put it so that it will apply to all
symbolic values, even when embedded in larger expressions.
llvm-svn: 162313
By doing this in the constraint managers, we can ensure that ANY reference
whose value we don't know gets the effect, even if it's not a top-level
parameter.
llvm-svn: 162246
Generating a sink is significantly different behavior from generating a
normal node, and a simple boolean parameter can be rather opaque. Per
offline discussion with Anna, adding new generation methods is the
clearest way to communicate intent.
No functionality change.
llvm-svn: 162215
Forgetting to at least cast the result was giving us Loc/NonLoc problems
in SValBuilder (hitting an assertion). But the standard (both C and C++)
does actually guarantee that && and || will result in the actual values
1 and 0, typed as 'int' in C and 'bool' in C++, and we can easily model that.
PR13461
llvm-svn: 162209
Our current handling of 'throw' is all CFG-based: it jumps to a 'catch' block
if there is one and the function exit block if not. But this doesn't really
get the right behavior when a function is inlined: execution will continue on
the caller's side, which is always the wrong thing to do.
Even within a single function, 'throw' completely skips any destructors that
are to be run. This is essentially the same problem as @finally -- a CFGBlock
that can have multiple entry points, whose exit points depend on whether it
was entered normally or exceptionally.
Representing 'throw' as a sink matches our current (non-)handling of @throw.
It's not a perfect solution, but it's better than continuing analysis in an
inconsistent or even impossible state.
<rdar://problem/12113713>
llvm-svn: 162157
The CFG approximates @throw as a return statement, but that's not good
enough in inlined functions. Moreover, since Objective-C exceptions are
usually considered fatal, we should be suppressing leak warnings like we
do for calls to noreturn functions (like abort()).
The comments indicate that we were probably intending to do this all along;
it may have been inadvertantly changed during a refactor at one point.
llvm-svn: 162156
This fixes several issues:
- removes egregious hack where PlistDiagnosticConsumer would forward to HTMLDiagnosticConsumer,
but diagnostics wouldn't be generated consistently in the same way if PlistDiagnosticConsumer
was used by itself.
- emitting diagnostics to the terminal (using clang's diagnostic machinery) is no longer a special
case, just another PathDiagnosticConsumer. This also magically resolved some duplicate warnings,
as we now use PathDiagnosticConsumer's diagnostic pruning, which has scope for the entire translation
unit, not just the scope of a BugReporter (which is limited to a particular ExprEngine).
As an interesting side-effect, diagnostics emitted to the terminal also have their trailing "." stripped,
just like with diagnostics emitted to plists and HTML. This required some tests to be updated, but now
the tests have higher fidelity with what users will see.
There are some inefficiencies in this patch. We currently generate the report graph (from the ExplodedGraph)
once per PathDiagnosticConsumer, which is a bit wasteful, but that could be pulled up higher in the
logic stack. There is some intended duplication, however, as we now generate different PathDiagnostics (for the same issue)
for different PathDiagnosticConsumers. This is necessary to produce the diagnostics that a particular
consumer expects.
llvm-svn: 162028
and remove ASTContext reference (which was frequently bound to a dereferenced
null pointer) from the recursive lump of printPretty functions. In so doing,
fix (at least) one case where we intended to use the 'dump' mode, but that
failed because a null ASTContext reference had been passed in.
llvm-svn: 162011
This is the other half of C++11 [class.cdtor]p4 (the destructor side
was added in r161915). This also fixes an issue with post-call checks
where the 'this' value was already being cleaned out of the state, thus
being omitted from a reconstructed CXXConstructorCall.
llvm-svn: 161981
With reinterpret_cast, we can get completely unrelated types in a region
hierarchy together; this was resulting in CXXBaseObjectRegions being layered
directly on an (untyped) SymbolicRegion, whose symbol was from a completely
different type hierarchy. This was what was causing the internal buildbot to
fail.
Reverts r161911, which merely masked the problem.
llvm-svn: 161960
Previously we were checking -analyzer-ipa=dynamic-bifurcate only, and
unconditionally inlining everything else that had an available definition,
even under -analyzer-ipa=inlining (but not under -analyzer-ipa=none).
llvm-svn: 161916
C++11 [class.cdtor]p4: When a virtual function is called directly or
indirectly from a constructor or from a destructor, including during
the construction or destruction of the class’s non-static data members,
and the object to which the call applies is the object under
construction or destruction, the function called is the final overrider
in the constructor's or destructor's class and not one overriding it in
a more-derived class.
llvm-svn: 161915
While there is now some duplication between SimpleCall and the CXXInstanceCall
sub-hierarchy, this is much better than copy-and-pasting the devirtualization
logic shared by both instance methods and destructors.
An unfortunate side effect is that there is no longer a single CallEvent type
that corresponds to "calls written as CallExprs". For the most part this is a
good thing, but the checker callback eval::Call still takes a CallExpr rather
than a CallEvent (since we're not sure if we want to allow checkers to
evaluate other kinds of calls). A mistake here will be caught by a cast<> in
CheckerManager::runCheckersForEvalCall.
No functionality change.
llvm-svn: 161809
Virtual base regions are never layered, so simply stripping them off won't
necessarily get you to the correct casted class. Instead, what we want is
the same logic for evaluating dynamic_cast: strip off base regions if possible,
but add new base regions if necessary.
llvm-svn: 161808
This can occur with multiple inheritance, which jumps from one parent to
the other, and with virtual inheritance, since virtual base regions always
wrap the actual object and can't be nested within other base regions.
This also exposed some incorrect logic for multiple inheritance: even if B
is known not to derive from C, D might still derive from both of them.
llvm-svn: 161798
...and /do/ strip CXXBaseObjectRegions when casting to a virtual base class.
This allows us to enforce the invariant that a CXXBaseObjectRegion can always
provide an offset for its base region if its base region has a known class
type, by only allowing virtual bases and direct non-virtual bases to form
CXXBaseObjectRegions.
This does mean some slight problems for our modeling of dynamic_cast, which
needs to be resolved by finding a path from the current region to the class
we're trying to cast to.
llvm-svn: 161797
This was causing a crash when we tried to re-apply a base object region to
itself. It probably also caused incorrect offset calculations in RegionStore.
PR13569 / <rdar://problem/12076683>
llvm-svn: 161710
This mostly affects pure virtual methods, but would also affect parent
methods defined inline in the header when analyzing the child's source file.
llvm-svn: 161709
when we don't need to split.
In some cases we know that a method cannot have a different
implementation in a subclass:
- the class is declared in the main file (private)
- all the method declarations (including the ones coming from super
classes) are in the main file.
This can be improved further, but might be enough for the heuristic.
(When we are too aggressive splitting the state, efficiency suffers.
When we fail to split the state coverage might suffer.)
llvm-svn: 161681
Both methods need to clear out existing bindings and provide a new default
binding. Originally KillStruct always provided UnknownVal as the default,
but it's allowed symbolic values for quite some time (for handling returned
structs in C).
No functionality change.
llvm-svn: 161637
This should speed up activities that need to access bindings by cluster,
such as invalidation and dead-bindings cleaning. In some cases all we save
is the cost of building the region cluster map, but other times we can
actually avoid traversing the rest of the store.
In casual testing, this produced a speedup of nearly 10% analyzing SQLite,
with /less/ memory used.
llvm-svn: 161636
This makes it faster to access and invalidate bindings with symbolic offsets
by only computing this information once.
No intended functionality change.
llvm-svn: 161635
An ASTContext's RecordLayoutInfo can only be used to look up offsets of
direct base classes, and we need the offset to make non-symbolic bindings
in RegionStore. This change makes sure that we have one layer of
CXXBaseObjectRegion for each base we are casting through.
This was causing crashes on an internal buildbot.
llvm-svn: 161621
This is an initial (unoptimized) version. We split the path when
inlining ObjC instance methods. On one branch we always assume that the
type information for the given memory region is precise. On the other we
assume that we don't have the exact type info. It is important to check
since the class could be subclassed and the method can be overridden. If
we always inline we can loose coverage.
Had to refactor some of the call eval functions.
llvm-svn: 161552
Unfortunately, generalized region printing is very difficult:
- ElementRegions are used both for casting and as actual elements.
- Accessing values through a pointer means going through an intermediate
SymbolRegionValue; symbolic regions are untyped.
- Referring to implicitly-defined variables like 'this' and 'self' could be
very confusing if they come from another stack frame.
We fall back to simply not printing the region name if we can't be sure it
will print well. This will allow us to improve in the future.
llvm-svn: 161512
The main blocker on this (besides the previous commit) was that
ScanReachableSymbols was not looking through LazyCompoundVals.
Once that was fixed, it's easy enough to clear out malloc data on return,
just like we do when we bind to a global region.
<rdar://problem/10872635>
llvm-svn: 161511
RegionStore currently uses a (Region, Offset) pair to describe the locations
of memory bindings. However, this representation breaks down when we have
regions like 'array[index]', where 'index' is unknown. We used to store this
as (SubRegion, 0); now we mark them specially as (SubRegion, SYMBOLIC).
Furthermore, ProgramState::scanReachableSymbols depended on the existence of
a sub-region map, but RegionStore's implementation doesn't provide for such
a thing. Moving the store-traversing logic of scanReachableSymbols into the
StoreManager allows us to eliminate the notion of SubRegionMap altogether.
This fixes some particularly awkward broken test cases, now in
array-struct-region.c.
llvm-svn: 161510
Instead of sprinkling dynamic type info propagation throughout
ExprEngine, the added checker would add the more precise type
information on known APIs (Ex: ObjC alloc, new) and propagate
the type info in other cases (ex: ObjC init method, casts (the second is
not implemented yet)).
Add handling of ObjC alloc, new and init to the checker.
llvm-svn: 161357
Like base constructors, delegating constructors require no further
processing in the CFGInitializer node.
Also, add PrettyStackTraceLoc to the initializer and destructor logic
so we can get better stack traces in the future.
llvm-svn: 161283
Because of this, we would previously emit NO path notes when a parameter
is constrained to null (because there are no stores). Now we show where we
made the assumption, which is much more useful.
llvm-svn: 161280
The visitor walks back through the ExplodedGraph as expected, but
it wasn't actually keeping track of when a value was assigned. This
meant that it only worked when the value was assigned when the variable
was defined.
Tests in the next commit (dependent on another change).
llvm-svn: 161276
In the following code, find the type of the symbolic receiver by
following it and updating the dynamic type info in the state when we
cast the symbol from id to MyClass *.
MyClass *a = [[self alloc] init];
return 5/[a testSelf];
llvm-svn: 161264