Replace variable and functions names, as well as comments that contain whitelist with
more inclusive terms.
Reviewed By: aaron.ballman, martong
Differential Revision: https://reviews.llvm.org/D112642
Due to a typo, `sprintf()` was recognized as a taint source instead of a
taint propagator. It was because an empty taint source list - which is
the first parameter of the `TaintPropagationRule` - encoded the
unconditional taint sources.
This typo effectively turned the `sprintf()` into an unconditional taint
source.
This patch fixes that typo and demonstrated the correct behavior with
tests.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D112558
It seems like protobuf crashed the `std::string` checker.
Somehow it acquired `UnknownVal` as the sole `std::string` constructor
parameter, causing a crash in the `castAs<Loc>()`.
This patch addresses this.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D112551
This patch adds a checker checking `std::string` operations.
At first, it only checks the `std::string` single `const char *`
constructor for nullness.
If It might be `null`, it will constrain it to non-null and place a note
tag there.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D111247
It turns out llvm::isa<> is variadic, and we could have used this at a
lot of places.
The following patterns:
x && isa<T1>(x) || isa<T2>(x) ...
Will be replaced by:
isa_and_non_null<T1, T2, ...>(x)
Sometimes it caused further simplifications, when it would cause even
more code smell.
Aside from this, keep in mind that within `assert()` or any macro
functions, we need to wrap the isa<> expression within a parenthesis,
due to the parsing of the comma.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D111982
If the `assume-controlled-environment` is `true`, we should expect `getenv()`
to succeed, and the result should not be considered tainted.
By default, the option will be `false`.
Reviewed By: NoQ, martong
Differential Revision: https://reviews.llvm.org/D111296
The `getenv()` function might return `NULL` just like any other function.
However, in case of `getenv()` a state-split seems justified since the
programmer should expect the failure of this function.
`secure_getenv(const char *name)` behaves the same way but is not handled
right now.
Note that `std::getenv()` is also not handled.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D111245
Clarify the message provided when the analyzer catches the use of memory
that is allocated with size zero.
Differential Revision: https://reviews.llvm.org/D111655
Modify the IfStmt node to suppoort constant evaluated expressions.
Add a new ExpressionEvaluationContext::ImmediateFunctionContext to
keep track of immediate function contexts.
This proved easier/better/probably more efficient than walking the AST
backward as it allows diagnosing nested if consteval statements.
Adding trackExpressionValue to the checker so it tracks the value of the
implicit cast's DeclRefExpression up to initialization/assignment. This
way the report becomes cleaner.
Differential Revision: https://reviews.llvm.org/D109836
D105819 Added NoOwnershipChangeVisitor, but it is only registered when an
off-by-default, hidden checker option was enabled. The reason behind this was
that it grossly overestimated the set of functions that really needed a note:
std::string getTrainName(const Train *T) {
return T->name;
} // note: Retuning without changing the ownership of or deallocating memory
// Umm... I mean duh? Nor would I expect this function to do anything like that...
void foo() {
Train *T = new Train("Land Plane");
print(getTrainName(T)); // note: calling getTrainName / returning from getTrainName
} // warn: Memory leak
This patch adds a heuristic that guesses that any function that has an explicit
operator delete call could have be responsible for deallocating the memory that
ended up leaking. This is waaaay too conservative (see the TODOs in the new
function), but it safer to err on the side of too little than too much, and
would allow us to enable the option by default *now*, and add refinements
one-by-one.
Differential Revision: https://reviews.llvm.org/D108753
D105553 added NoStateChangeFuncVisitor, an abstract class to aid in creating
notes such as "Returning without writing to 'x'", or "Returning without changing
the ownership status of allocated memory". Its clients need to define, among
other things, what a change of state is.
For code like this:
f() {
g();
}
foo() {
f();
h();
}
We'd have a path in the ExplodedGraph that looks like this:
-- <g> -->
/ \
--- <f> --------> --- <h> --->
/ \ / \
-------- <foo> ------ <foo> -->
When we're interested in whether f neglected to change some property,
NoStateChangeFuncVisitor asks these questions:
÷×~
-- <g> -->
ß / \$ @&#*
--- <f> --------> --- <h> --->
/ \ / \
-------- <foo> ------ <foo> -->
Has anything changed in between # and *?
Has anything changed in between & and *?
Has anything changed in between @ and *?
...
Has anything changed in between $ and *?
Has anything changed in between × and ~?
Has anything changed in between ÷ and ~?
...
Has anything changed in between ß and *?
...
This is a rather thorough line of questioning, which is why in D105819, I was
only interested in whether state *right before* and *right after* a function
call changed, and early returned to the CallEnter location:
if (!CurrN->getLocationAs<CallEnter>())
return;
Except that I made a typo, and forgot to negate the condition. So, in this
patch, I'm fixing that, and under the same hood allow all clients to decide to
do this whole-function check instead of the thorough one.
Differential Revision: https://reviews.llvm.org/D108695
D105553 added NoStateChangeFuncVisitor, an abstract class to aid in creating
notes such as "Returning without writing to 'x'", or "Returning without changing
the ownership status of allocated memory". Its clients need to define, among
other things, what a change of state is.
For code like this:
f() {
g();
}
foo() {
f();
h();
}
We'd have a path in the ExplodedGraph that looks like this:
-- <g> -->
/ \
--- <f> --------> --- <h> --->
/ \ / \
-------- <foo> ------ <foo> -->
When we're interested in whether f neglected to change some property,
NoStateChangeFuncVisitor asks these questions:
÷×~
-- <g> -->
ß / \$ @&#*
--- <f> --------> --- <h> --->
/ \ / \
-------- <foo> ------ <foo> -->
Has anything changed in between # and *?
Has anything changed in between & and *?
Has anything changed in between @ and *?
...
Has anything changed in between $ and *?
Has anything changed in between × and ~?
Has anything changed in between ÷ and ~?
...
Has anything changed in between ß and *?
...
This is a rather thorough line of questioning, which is why in D105819, I was
only interested in whether state *right before* and *right after* a function
call changed, and early returned to the CallEnter location:
if (!CurrN->getLocationAs<CallEnter>())
return;
Except that I made a typo, and forgot to negate the condition. So, in this
patch, I'm fixing that, and under the same hood allow all clients to decide to
do this whole-function check instead of the thorough one.
Differential Revision: https://reviews.llvm.org/D108695
MallocOverflow works in two phases:
1) Collects suspicious malloc calls, whose argument is a multiplication
2) Filters the aggregated list of suspicious malloc calls by iterating
over the BasicBlocks of the CFG looking for comparison binary
operators over the variable constituting in any suspicious malloc.
Consequently, it suppressed true-positive cases when the comparison
check was after the malloc call.
In this patch the checker will consider the relative position of the
relation check to the malloc call.
E.g.:
```lang=C++
void *check_after_malloc(int n, int x) {
int *p = NULL;
if (x == 42)
p = malloc(n * sizeof(int)); // Previously **no** warning, now it
// warns about this.
// The check is after the allocation!
if (n > 10) {
// Do something conditionally.
}
return p;
}
```
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D107804
Not only global variables can hold references to dead stack variables.
Consider this example:
void write_stack_address_to(char **q) {
char local;
*q = &local;
}
void test_stack() {
char *p;
write_stack_address_to(&p);
}
The address of 'local' is assigned to 'p', which becomes a dangling
pointer after 'write_stack_address_to()' returns.
The StackAddrEscapeChecker was looking for bindings in the store which
referred to variables of the popped stack frame, but it only considered
global variables in this regard. This patch relaxes this, catching
stack variable bindings as well.
---
This patch also works for temporary objects like:
struct Bar {
const int &ref;
explicit Bar(int y) : ref(y) {
// Okay.
} // End of the constructor call, `ref` is dangling now. Warning!
};
void test() {
Bar{33}; // Temporary object, so the corresponding memregion is
// *not* a VarRegion.
}
---
The return value optimization aka. copy-elision might kick in but that
is modeled by passing an imaginary CXXThisRegion which refers to the
parent stack frame which is supposed to be the 'return slot'.
Objects residing in the 'return slot' outlive the scope of the inner
call, thus we should expect no warning about them - except if we
explicitly disable copy-elision.
Reviewed By: NoQ, martong
Differential Revision: https://reviews.llvm.org/D107078
This is a rather common feedback we get from out leak checkers: bug reports are
really short, and are contain barely any usable information on what the analyzer
did to conclude that a leak actually happened.
This happens because of our bug report minimizing effort. We construct bug
reports by inspecting the ExplodedNodes that lead to the error from the bottom
up (from the error node all the way to the root of the exploded graph), and mark
entities that were the cause of a bug, or have interacted with it as
interesting. In order to make the bug report a bit less verbose, whenever we
find an entire function call (from CallEnter to CallExitEnd) that didn't talk
about any interesting entity, we prune it (click here for more info on bug
report generation). Even if the event to highlight is exactly this lack of
interaction with interesting entities.
D105553 generalized the visitor that creates notes for these cases. This patch
adds a new kind of NoStateChangeVisitor that leaves notes in functions that
took a piece of dynamically allocated memory that later leaked as parameter,
and didn't change its ownership status.
Differential Revision: https://reviews.llvm.org/D105553
This patch:
- Fixes how the std-namespace test is written in SmartPtrModelling
(now accounts for functions with no Decl available)
- Adds the smart pointer checker flag check where it was missing
Differential Revision: https://reviews.llvm.org/D106296
The checker warns if a stream is read that is already in end-of-file
(EOF) state.
The commit adds indication of the last location where the EOF flag is set
on the stream.
Reviewed By: Szelethus
Differential Revision: https://reviews.llvm.org/D104925
This patch handles the `std::swap` function specialization
for `std::unique_ptr`. Implemented to be very similar to
how `swap` method is handled
Differential Revision: https://reviews.llvm.org/D104300
This patch handles the `<<` operator defined for `std::unique_ptr` in
the std namespace (ignores custom overloads of the operator).
Differential Revision: https://reviews.llvm.org/D105421
This patch handles all the comparision methods (defined via overloaded
operators) on std::unique_ptr. These operators compare the underlying
pointers, which is modelled by comparing the corresponding inner-pointer
SVal. There is also a special case for comparing the same pointer.
Differential Revision: https://reviews.llvm.org/D104616
C++23 will make these conversions ambiguous - so fix them to make the
codebase forward-compatible with C++23 (& a follow-up change I've made
will make this ambiguous/invalid even in <C++23 so we don't regress
this & it generally improves the code anyway)
Fix offset calculation routines in padding checker to avoid assertion
errors described in bugzilla issue 50426. The fields that are subojbects
of zero size, marked with [[no_unique_address]] or empty bitfields will
be excluded from padding calculation routines.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D104097
This is mostly a mechanical change, but a testcase that contains
parts of the StringRef class (clang/test/Analysis/llvm-conventions.cpp)
isn't touched.
The checker contains check for passing a NULL stream argument.
This change should make more easy to identify where the passed pointer
becomes NULL.
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D104640
Additionally, this commit completely removes any uses of
FindLastStoreBRVisitor from the analyzer except for the
one in Tracker.
The next step is actually removing this class altogether
from the header file.
Differential Revision: https://reviews.llvm.org/D103618
This renames the expression value categories from rvalue to prvalue,
keeping nomenclature consistent with C++11 onwards.
C++ has the most complicated taxonomy here, and every other language
only uses a subset of it, so it's less confusing to use the C++ names
consistently, and mentally remap to the C names when working on that
context (prvalue -> rvalue, no xvalues, etc).
Renames:
* VK_RValue -> VK_PRValue
* Expr::isRValue -> Expr::isPRValue
* SK_QualificationConversionRValue -> SK_QualificationConversionPRValue
* JSON AST Dumper Expression nodes value category: "rvalue" -> "prvalue"
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D103720
The majority of all `addVisitor` callers follow the same pattern:
addVisitor(std::make_unique<SomeVisitor>(arg1, arg2, ...));
This patches introduces additional overload for `addVisitor` to simplify
that pattern:
addVisitor<SomeVisitor>(arg1, arg2, ...);
Differential Revision: https://reviews.llvm.org/D103457
Since we can report memory leaks on one variable, while the originally
allocated object was stored into another one, we should explain
how did it get there.
rdar://76645710
Differential Revision: https://reviews.llvm.org/D100852
When reporting leaks, we try to attach the leaking object to some
variable, so it's easier to understand. Before the patch, we always
tried to use the first variable that stored the object in question.
This can get very confusing for the user, if that variable doesn't
contain that object at the moment of the actual leak. In many cases,
the warning is dismissed as false positive and it is effectively a
false positive when we fail to properly explain the warning to the
user.
This patch addresses the bigest issue in cases like this. Now we
check if the variable still contains the leaking symbolic object.
If not, we look for the last variable to actually hold it and use
that variable instead.
rdar://76645710
Differential Revision: https://reviews.llvm.org/D100839
Allocation site is the key location for the leak checker. It is a
uniqueing location for the report and a source of information for
the warning's message.
Before this patch, we calculated and used it twice in bug report and
in bug report visitor. Such duplication is not only harmful
performance-wise (not much, but still), but also design-wise. Because
changing something about the end piece of the report should've been
repeated for description as well.
Differential Revision: https://reviews.llvm.org/D100626
When we report an argument constraint violation, we should track those
other arguments that participate in the evaluation of the violation. By
default, we depend only on the argument that is constrained, however,
there are some special cases like the buffer size constraint that might
be encoded in another argument(s).
Differential Revision: https://reviews.llvm.org/D101358