Commit Graph

4704 Commits

Author SHA1 Message Date
Vince Bridgers 985888411d [analyzer] Refactor makeNull to makeNullWithWidth (NFC)
Usages of makeNull need to be deprecated in favor of makeNullWithWidth
for architectures where the pointer size should not be assumed. This can
occur when pointer sizes can be of different sizes, depending on address
space for example. See https://reviews.llvm.org/D118050 as an example.

This was uncovered initially in a downstream compiler project, and
tested through those systems tests.

steakhal performed systems testing across a large set of open source
projects.

Co-authored-by: steakhal
Resolves: https://github.com/llvm/llvm-project/issues/53664

Reviewed By: NoQ, steakhal

Differential Revision: https://reviews.llvm.org/D119601
2022-03-22 07:35:13 -05:00
Mike Rice 6bd8dc91b8 [OpenMP] Initial parsing/sema for the 'omp target teams loop' construct
Adds basic parsing/sema/serialization support for the
 #pragma omp target teams loop directive.

Differential Revision: https://reviews.llvm.org/D122028
2022-03-18 13:48:32 -07:00
Mike Rice 79f661edc1 [OpenMP] Initial parsing/sema for the 'omp teams loop' construct
Adds basic parsing/sema/serialization support for the #pragma omp teams loop
directive.

Differential Revision: https://reviews.llvm.org/D121713
2022-03-16 14:39:18 -07:00
phyBrackets 90a6e35478 [analyzer][NFC] Merge similar conditional paths
Reviewed By: aaron.ballman, steakhal

Differential Revision: https://reviews.llvm.org/D121045
2022-03-07 22:05:27 +05:30
Endre Fülöp 4fd6c6e65a [analyzer] Add more propagations to Taint analysis
Add more functions as taint propators to GenericTaintChecker.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D120369
2022-03-07 13:18:54 +01:00
Shivam 56eaf869be [analyzer] Done some changes to detect Uninitialized read by the char array manipulation functions
Few weeks back I was experimenting with reading the uninitialized values from src , which is actually a bug but the CSA seems to give up at that point . I was curious about that and I pinged @steakhal on the discord and according to him this seems to be a genuine issue and needs to be fix. So I goes with fixing this bug and thanks to @steakhal who help me creating this patch. This feature seems to break some tests but this was the genuine problem and the broken tests also needs to fix in certain manner. I add a test but yeah we need more tests,I'll try to add more tests.Thanks

Reviewed By: steakhal, NoQ

Differential Revision: https://reviews.llvm.org/D120489
2022-03-04 00:21:06 +05:30
Shivam bd1917c88a [analyzer] Done some changes to detect Uninitialized read by the char array manipulation functions
Few weeks back I was experimenting with reading the uninitialized values from src , which is actually a bug but the CSA seems to give up at that point . I was curious about that and I pinged @steakhal on the discord and according to him this seems to be a genuine issue and needs to be fix. So I goes with fixing this bug and thanks to @steakhal who help me creating this patch. This feature seems to break some tests but this was the genuine problem and the broken tests also needs to fix in certain manner. I add a test but yeah we need more tests,I'll try to add more tests.Thanks

Reviewed By: steakhal, NoQ

Differential Revision: https://reviews.llvm.org/D120489
2022-03-03 23:21:26 +05:30
Kristóf Umann d832078904 [analyzer] Improve NoOwnershipChangeVisitor's understanding of deallocators
The problem with leak bug reports is that the most interesting event in the code
is likely the one that did not happen -- lack of ownership change and lack of
deallocation, which is often present within the same function that the analyzer
inlined anyway, but not on the path of execution on which the bug occured. We
struggle to understand that a function was responsible for freeing the memory,
but failed.

D105819 added a new visitor to improve memory leak bug reports. In addition to
inspecting the ExplodedNodes of the bug pat, the visitor tries to guess whether
the function was supposed to free memory, but failed to. Initially (in D108753),
this was done by checking whether a CXXDeleteExpr is present in the function. If
so, we assume that the function was at least party responsible, and prevent the
analyzer from pruning bug report notes in it. This patch improves this heuristic
by recognizing all deallocator functions that MallocChecker itself recognizes,
by reusing MallocChecker::isFreeingCall.

Differential Revision: https://reviews.llvm.org/D118880
2022-03-03 11:27:56 +01:00
Simon Pilgrim ca94f28d15 [clang] ExprEngine::VisitCXXNewExpr - remove superfluous nullptr tests
FD has already been dereferenced
2022-03-02 15:59:10 +00:00
Kristóf Umann 32ac21d049 [NFC][analyzer] Allow CallDescriptions to be matched with CallExprs
Since CallDescriptions can only be matched against CallEvents that are created
during symbolic execution, it was not possible to use it in syntactic-only
contexts. For example, even though InnerPointerChecker can check with its set of
CallDescriptions whether a function call is interested during analysis, its
unable to check without hassle whether a non-analyzer piece of code also calls
such a function.

The patch adds the ability to use CallDescriptions in syntactic contexts as
well. While we already have that in Signature, we still want to leverage the
ability to use dynamic information when we have it (function pointers, for
example). This could be done with Signature as well (StdLibraryFunctionsChecker
does it), but it makes it even less of a drop-in replacement.

Differential Revision: https://reviews.llvm.org/D119004
2022-03-01 17:13:04 +01:00
Balázs Kéri d8a2afb244 [clang][analyzer] Add modeling of 'errno'.
Add a checker to maintain the system-defined value 'errno'.
The value is supposed to be set in the future by existing or
new checkers that evaluate errno-modifying function calls.

Reviewed By: NoQ, steakhal

Differential Revision: https://reviews.llvm.org/D120310
2022-03-01 08:20:33 +01:00
Dawid Jurczak b3e2dac27c [NFC] Don't pass temporary LangOptions to Lexer
Since https://reviews.llvm.org/D120334 we shouldn't pass temporary LangOptions to Lexer.
This change fixes stack-use-after-scope UB in LocalizationChecker found by sanitizer-x86_64-linux-fast buildbot
and resolve similar issue in HeaderIncludes.
2022-02-28 20:43:28 +01:00
Endre Fülöp 34a7387986 [analyzer] Add more sources to Taint analysis
Add more functions as taint sources to GenericTaintChecker.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D120236
2022-02-28 11:33:02 +01:00
Aaron Ballman f9e8e92cf5 Revert "[clang][analyzer] Add modeling of 'errno'."
This reverts commit 29b512ba32.

This broke several build bots:

https://lab.llvm.org/buildbot/#/builders/86/builds/30183
https://lab.llvm.org/buildbot/#/builders/216/builds/488
2022-02-25 07:21:01 -05:00
Balázs Kéri 29b512ba32 [clang][analyzer] Add modeling of 'errno'.
Add a checker to maintain the system-defined value 'errno'.
The value is supposed to be set in the future by existing or
new checkers that evaluate errno-modifying function calls.

Reviewed By: NoQ, steakhal

Differential Revision: https://reviews.llvm.org/D120310
2022-02-25 12:42:55 +01:00
Fangrui Song ecff9b65b5 [analyzer] Just use default capture after 7fd60ee6e0 2022-02-24 10:06:11 -08:00
Fangrui Song 7fd60ee6e0 [analyzer] Fix -Wunused-lambda-capture in -DLLVM_ENABLE_ASSERTIONS=off builds 2022-02-24 00:13:13 -08:00
Balazs Benics 7036413dc2 Revert "Revert "[analyzer] Fix taint rule of fgets and setproctitle_init""
This reverts commit 2acead35c1.

Let's try `REQUIRES: asserts`.
2022-02-23 12:55:31 +01:00
Balazs Benics a848a5cf2f Revert "Revert "[analyzer] Fix taint propagation by remembering to the location context""
This reverts commit d16c5f4192.

Let's try `REQUIRES: asserts`.
2022-02-23 12:53:07 +01:00
Balazs Benics fa0a80e017 Revert "Revert "[analyzer] Add failing test case demonstrating buggy taint propagation""
This reverts commit b8ae323cca.

Let's try `REQUIRES: asserts`.
2022-02-23 10:48:06 +01:00
Artem Dergachev e0e174845b [analyzer] Fix a crash in NoStateChangeVisitor with body-farmed stack frames.
LocationContext::getDecl() isn't useful for obtaining the "farmed" body because
the (synthetic) body statement isn't actually attached to the (natural-grown)
declaration in the AST.

Differential Revision: https://reviews.llvm.org/D119509
2022-02-17 10:13:34 -08:00
Balazs Benics b3c0014e5a Revert "Revert "[analyzer] Prevent misuses of -analyze-function""
This reverts commit 620d99b7ed.

Let's see if removing the two offending RUN lines makes this patch pass.
Not ideal to drop tests but, it's just a debugging feature, probably not
that important.
2022-02-16 10:33:21 +01:00
Balazs Benics b8ae323cca Revert "[analyzer] Add failing test case demonstrating buggy taint propagation"
This reverts commit 744745ae19.

I'm reverting this since this patch caused a build breakage.

https://lab.llvm.org/buildbot/#/builders/91/builds/3818
2022-02-14 18:45:46 +01:00
Balazs Benics d16c5f4192 Revert "[analyzer] Fix taint propagation by remembering to the location context"
This reverts commit b099e1e562.

I'm reverting this since the head of the patch stack caused a build
breakage.

https://lab.llvm.org/buildbot/#/builders/91/builds/3818
2022-02-14 18:45:46 +01:00
Balazs Benics 2acead35c1 Revert "[analyzer] Fix taint rule of fgets and setproctitle_init"
This reverts commit bf5963bf19.

I'm reverting this since the head of the patch stack caused a build
breakage.

https://lab.llvm.org/buildbot/#/builders/91/builds/3818
2022-02-14 18:45:46 +01:00
Balazs Benics bf5963bf19 [analyzer] Fix taint rule of fgets and setproctitle_init
There was a typo in the rule.
`{{0}, ReturnValueIndex}` meant that the discrete index is `0` and the
variadic index is `-1`.
What we wanted instead is that both `0` and `-1` are in the discrete index
list.

Instead of this, we wanted to express that both `0` and the
`ReturnValueIndex` is in the discrete arg list.

The manual inspection revealed that `setproctitle_init` also suffered a
probably incomplete propagation rule.

Reviewed By: Szelethus, gamesh411

Differential Revision: https://reviews.llvm.org/D119129
2022-02-14 16:55:55 +01:00
Balazs Benics b099e1e562 [analyzer] Fix taint propagation by remembering to the location context
Fixes the issue D118987 by mapping the propagation to the callsite's
LocationContext.
This way we can keep track of the in-flight propagations.

Note that empty propagation sets won't be inserted.

Reviewed By: NoQ, Szelethus

Differential Revision: https://reviews.llvm.org/D119128
2022-02-14 16:55:55 +01:00
Balazs Benics 744745ae19 [analyzer] Add failing test case demonstrating buggy taint propagation
Recently we uncovered a serious bug in the `GenericTaintChecker`.
It was already flawed before D116025, but that was the patch that turned
this silent bug into a crash.

It happens if the `GenericTaintChecker` has a rule for a function, which
also has a definition.

  char *fgets(char *s, int n, FILE *fp) {
    nested_call();   // no parameters!
    return (char *)0;
  }

  // Within some function:
  fgets(..., tainted_fd);

When the engine inlines the definition and finds a function call within
that, the `PostCall` event for the call will get triggered sooner than the
`PostCall` for the original function.
This mismatch violates the assumption of the `GenericTaintChecker` which
wants to propagate taint information from the `PreCall` event to the
`PostCall` event, where it can actually bind taint to the return value
**of the same call**.

Let's get back to the example and go through step-by-step.
The `GenericTaintChecker` will see the `PreCall<fgets(..., tainted_fd)>`
event, so it would 'remember' that it needs to taint the return value
and the buffer, from the `PostCall` handler, where it has access to the
return value symbol.
However, the engine will inline fgets and the `nested_call()` gets
evaluated subsequently, which produces an unimportant
`PreCall<nested_call()>`, then a `PostCall<nested_call()>` event, which is
observed by the `GenericTaintChecker`, which will unconditionally mark
tainted the 'remembered' arg indexes, trying to access a non-existing
argument, resulting in a crash.
If it doesn't crash, it will behave completely unintuitively, by marking
completely unrelated memory regions tainted, which is even worse.

The resulting assertion is something like this:
  Expr.h: const Expr *CallExpr::getArg(unsigned int) const: Assertion
          `Arg < getNumArgs() && "Arg access out of range!"' failed.

The gist of the backtrace:
  CallExpr::getArg(unsigned int) const
  SimpleFunctionCall::getArgExpr(unsigned int)
  CallEvent::getArgSVal(unsigned int) const
  GenericTaintChecker::checkPostCall(const CallEvent &, CheckerContext&) const

Prior to D116025, there was a check for the argument count before it
applied taint, however, it still suffered from the same underlying
issue/bug regarding propagation.

This path does not intend to fix the bug, rather start a discussion on
how to fix this.

---

Let me elaborate on how I see this problem.

This pre-call, post-call juggling is just a workaround.
The engine should by itself propagate taint where necessary right where
it invalidates regions.
For the tracked values, which potentially escape, we need to erase the
information we know about them; and this is exactly what is done by
invalidation.
However, in the case of taint, we basically want to approximate from the
opposite side of the spectrum.
We want to preserve taint in most cases, rather than cleansing them.

Now, we basically sanitize all escaping tainted regions implicitly,
since invalidation binds a fresh conjured symbol for the given region,
and that has not been associated with taint.

IMO this is a bad default behavior, we should be more aggressive about
preserving taint if not further spreading taint to the reachable
regions.

We have a couple of options for dealing with it (let's call it //tainting
policy//):
  1) Taint only the parameters which were tainted prior to the call.
  2) Taint the return value of the call, since it likely depends on the
     tainted input - if any arguments were tainted.
  3) Taint all escaped regions - (maybe transitively using the cluster
     algorithm) - if any arguments were tainted.
  4) Not taint anything - this is what we do right now :D

The `ExprEngine` should not deal with taint on its own. It should be done
by a checker, such as the `GenericTaintChecker`.
However, the `Pre`-`PostCall` checker callbacks are not designed for this.
`RegionChanges` would be a much better fit for modeling taint propagation.
What we would need in the `RegionChanges` callback is the `State` prior
invalidation, the `State` after the invalidation, and a `CheckerContext` in
which the checker can create transitions, where it would place `NoteTags`
for the modeled taint propagations and report errors if a taint sink
rule gets violated.
In this callback, we could query from the prior State, if the given
value was tainted; then act and taint if necessary according to the
checker's tainting policy.

By using RegionChanges for this, we would 'fix' the mentioned
propagation bug 'by-design'.

Reviewed By: Szelethus

Differential Revision: https://reviews.llvm.org/D118987
2022-02-14 16:55:55 +01:00
phyBrackets 6745b6a0f1 [analyzer][NFCi] Use the correct BugType in CStringChecker.
There is different bug types for different types of bugs  but the **emitAdditionOverflowbug** seems to use bugtype **BT_NotCSting** but actually it have to use **BT_AdditionOverflow** .

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D119462
2022-02-14 20:54:59 +05:30
Balazs Benics abc873694f [analyzer] Restrict CallDescription fuzzy builtin matching
`CallDescriptions` for builtin functions relaxes the match rules
somewhat, so that the `CallDescription` will match for calls that have
some prefix or suffix. This was achieved by doing a `StringRef::contains()`.
However, this is somewhat problematic for builtins that are substrings
of each other.

Consider the following:

`CallDescription{ builtin, "memcpy"}` will match for
`__builtin_wmemcpy()` calls, which is unfortunate.

This patch addresses/works around the issue by checking if the
characters around the function's name are not part of the 'name'
semantically. In other words, to accept a match for `"memcpy"` the call
should not have alphanumeric (`[a-zA-Z]`) characters around the 'match'.

So, `CallDescription{ builtin, "memcpy"}` will not match on:

 - `__builtin_wmemcpy: there is a `w` alphanumeric character before the match.
 - `__builtin_memcpyFOoBar_inline`: there is a `F` character after the match.
 - `__builtin_memcpyX_inline`: there is an `X` character after the match.

But it will still match for:
 - `memcpy`: exact match
 - `__builtin_memcpy`: there is an _ before the match
 - `__builtin_memcpy_inline`: there is an _ after the match
 - `memcpy_inline_builtinFooBar`: there is an _ after the match

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D118388
2022-02-11 10:45:18 +01:00
Sylvestre Ledru f2c2e924e7 Fix a typo (occured => occurred)
Reported:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1005195
2022-02-08 21:35:26 +01:00
Balazs Benics 620d99b7ed Revert "[analyzer] Prevent misuses of -analyze-function"
This reverts commit 841817b1ed.

Ah, it still fails on build bots for some reason.
Pinning the target triple was not enough.
2022-02-08 17:42:46 +01:00
Balazs Benics 841817b1ed [analyzer] Prevent misuses of -analyze-function
Sometimes when I pass the mentioned option I forget about passing the
parameter list for c++ sources.
It would be also useful newcomers to learn about this.

This patch introduces some logic checking common misuses involving
`-analyze-function`.

Reviewed-By: martong

Differential Revision: https://reviews.llvm.org/D118690
2022-02-08 17:27:57 +01:00
Jun Zhang 65adf7c211
[NFC][Analyzer] Use range based for loop.
Use range base loop loop to improve code readability.

Differential Revision: https://reviews.llvm.org/D119103
2022-02-07 15:45:58 +08:00
Rashmi Mudduluru faabdfcf7f [analyzer] Add support for __attribute__((returns_nonnull)).
Differential Revision: https://reviews.llvm.org/D118657
2022-02-02 11:46:52 -08:00
Balazs Benics e99abc5d8a Revert "[analyzer] Prevent misuses of -analyze-function"
This reverts commit 9d6a615973.

Exit Code: 1

Command Output (stderr):
--
/scratch/buildbot/bothome/clang-ve-ninja/llvm-project/clang/test/Analysis/analyze-function-guide.cpp:53:21: error: CHECK-EMPTY-NOT: excluded string found in input // CHECK-EMPTY-NOT: Every top-level function was skipped.
                    ^
<stdin>:1:1: note: found here
Every top-level function was skipped.
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Input file: <stdin>
Check file: /scratch/buildbot/bothome/clang-ve-ninja/llvm-project/clang/test/Analysis/analyze-function-guide.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
        1: Every top-level function was skipped.
not:53     !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  error: no match expected
        2: Pass the -analyzer-display-progress for tracking which functions are analyzed.
>>>>>>
2022-02-02 11:44:27 +01:00
Balazs Benics 9d6a615973 [analyzer] Prevent misuses of -analyze-function
Sometimes when I pass the mentioned option I forget about passing the
parameter list for c++ sources.
It would be also useful newcomers to learn about this.

This patch introduces some logic checking common misuses involving
`-analyze-function`.

Reviewed-By: martong

Differential Revision: https://reviews.llvm.org/D118690
2022-02-02 11:31:22 +01:00
Tres Popp 262cc74e0b Fix pair construction with an implicit constructor inside. 2022-01-18 18:01:52 +01:00
Endre Fülöp 17f74240e6 [analyzer][NFC] Refactor GenericTaintChecker to use CallDescriptionMap
GenericTaintChecker now uses CallDescriptionMap to describe the possible
operation in code which trigger the introduction (sources), the removal
(filters), the passing along (propagations) and detection (sinks) of
tainted values.

Reviewed By: steakhal, NoQ

Differential Revision: https://reviews.llvm.org/D116025
2022-01-18 16:04:04 +01:00
Denys Petrov d835dd4cf5 [analyzer] Produce SymbolCast symbols for integral types in SValBuilder::evalCast
Summary: Produce SymbolCast for integral types in `evalCast` function. Apply several simplification techniques while producing the symbols. Added a boolean option `handle-integral-cast-for-ranges` under `-analyzer-config` flag. Disabled the feature by default.

Differential Revision: https://reviews.llvm.org/D105340
2022-01-18 16:08:04 +02:00
Kazu Hirata 17d4bd3d78 [clang] Fix bugprone argument comments (NFC)
Identified with bugprone-argument-comment.
2022-01-09 00:19:49 -08:00
Kazu Hirata 40446663c7 [clang] Use true/false instead of 1/0 (NFC)
Identified with modernize-use-bool-literals.
2022-01-09 00:19:47 -08:00
Kazu Hirata d1b127b5b7 [clang] Remove unused forward declarations (NFC) 2022-01-08 11:56:40 -08:00
Qiu Chaofan c2cc70e4f5 [NFC] Fix endif comments to match with include guard 2022-01-07 15:52:59 +08:00
Kazu Hirata d677a7cb05 [clang] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
2022-01-02 10:20:23 -08:00
Kazu Hirata 298367ee6e [clang] Use nullptr instead of 0 or NULL (NFC)
Identified with modernize-use-nullptr.
2021-12-29 08:34:20 -08:00
Kazu Hirata 6c335b1a45 [clang] Remove unused "using" (NFC)
Identified by misc-unused-using-decls.
2021-12-27 20:48:21 -08:00
Kazu Hirata 0542d15211 Remove redundant string initialization (NFC)
Identified with readability-redundant-string-init.
2021-12-26 09:39:26 -08:00
Kazu Hirata 34558b039b [StaticAnalyzer] Remove redundant declaration isStdSmartPtr (NFC)
An identical declaration is present just a couple of lines above the
line being removed in this patch.

Identified with readability-redundant-declaration.
2021-12-25 00:35:41 -08:00
Sami Tolvanen ec2e26eaf6 [Clang] Add __builtin_function_start
Control-Flow Integrity (CFI) replaces references to address-taken
functions with pointers to the CFI jump table. This is a problem
for low-level code, such as operating system kernels, which may
need the address of an actual function body without the jump table
indirection.

This change adds the __builtin_function_start() builtin, which
accepts an argument that can be constant-evaluated to a function,
and returns the address of the function body.

Link: https://github.com/ClangBuiltLinux/linux/issues/1353

Depends on D108478

Reviewed By: pcc, rjmccall

Differential Revision: https://reviews.llvm.org/D108479
2021-12-20 12:55:33 -08:00
Kazu Hirata 713ee230f8 [clang] Use llvm::reverse (NFC) 2021-12-17 16:51:42 -08:00
Denys Petrov da8bd972a3 [analyzer][NFC] Change return value of StoreManager::attemptDownCast function from SVal to Optional<SVal>
Summary: Refactor return value of `StoreManager::attemptDownCast` function by removing the last parameter `bool &Failed` and replace the return value `SVal` with `Optional<SVal>`.  Make the function consistent with the family of `evalDerivedToBase` by renaming it to `evalBaseToDerived`. Aligned the code on the call side with these changes.

Differential Revision: https://reviews.llvm.org/
2021-12-17 13:03:47 +02:00
Gabor Marton bd9e23943a [analyzer] Expand conversion check to check more expressions for overflow and underflow
This expands checking for more expressions. This will check underflow
and loss of precision when using call expressions like:

  void foo(unsigned);
  int i = -1;
  foo(i);

This also includes other expressions as well, so it can catch negative
indices to std::vector since it uses unsigned integers for [] and .at()
function.

Patch by: @pfultz2

Differential Revision: https://reviews.llvm.org/D46081
2021-12-15 11:41:34 +01:00
Denys Petrov 6a399bf4b3 [analyzer] Implemented RangeSet::Factory::unite function to handle intersections and adjacency
Summary: Handle intersected and adjacent ranges uniting them into a single one.
Example:
intersection [0, 10] U [5, 20] = [0, 20]
adjacency [0, 10] U [11, 20] = [0, 20]

Differential Revision: https://reviews.llvm.org/D99797
2021-12-10 18:48:02 +02:00
Logan Smith 715c72b4fb [NFC][analyzer] Return underlying strings directly instead of OS.str()
This avoids an unnecessary copy required by 'return OS.str()', allowing
instead for NRVO or implicit move. The .str() call (which flushes the
stream) is no longer required since 65b13610a5,
which made raw_string_ostream unbuffered by default.

Differential Revision: https://reviews.llvm.org/D115374
2021-12-09 16:05:46 -08:00
Gabor Marton 978431e80b [Analyzer] SValBuilder: Simlify a SymExpr to the absolute simplest form
Move the SymExpr simplification fixpoint logic into SValBuilder.

Differential Revision: https://reviews.llvm.org/D114938
2021-12-07 10:02:32 +01:00
Balazs Benics a6816b957d [analyzer][solver] Fix assertion on (NonLoc, Op, Loc) expressions
Previously, the `SValBuilder` could not encounter expressions of the
following kind:

  NonLoc OP Loc
  Loc OP NonLoc

Where the `Op` is other than `BO_Add`.

As of now, due to the smarter simplification and the fixedpoint
iteration, it turns out we can.
It can happen if the `Loc` was perfectly constrained to a concrete
value (`nonloc::ConcreteInt`), thus the simplifier can do
constant-folding in these cases as well.

Unfortunately, this could cause assertion failures, since we assumed
that the operator must be `BO_Add`, causing a crash.

---

In the patch, I decided to preserve the original behavior (aka. swap the
operands (if the operator is commutative), but if the `RHS` was a
`loc::ConcreteInt` call `evalBinOpNN()`.

I think this interpretation of the arithmetic expression is closer to
reality.

I also tried naively introducing a separate handler for
`loc::ConcreteInt` RHS, before doing handling the more generic `Loc` RHS
case. However, it broke the `zoo1backwards()` test in the `nullptr.cpp`
file. This highlighted for me the importance to preserve the original
behavior for the `BO_Add` at least.

PS: Sorry for introducing yet another branch into this `evalBinOpXX`
madness. I've got a couple of ideas about refactoring these.
We'll see if I can get to it.

The test file demonstrates the issue and makes sure nothing similar
happens. The `no-crash` annotated lines show, where we crashed before
applying this patch.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D115149
2021-12-06 18:38:58 +01:00
Balazs Benics 9873ef409c [analyzer] Ignore flex generated files
Some projects [1,2,3] have flex-generated files besides bison-generated
ones.
Unfortunately, the comment `"/* A lexical scanner generated by flex */"`
generated by the tools is not necessarily at the beginning of the file,
thus we need to quickly skim through the file for this needle string.

Luckily, StringRef can do this operation in an efficient way.

That being said, now the bison comment is not required to be at the very
beginning of the file. This allows us to detect a couple more cases
[4,5,6].

Alternatively, we could say that we only allow whitespace characters
before matching the bison/flex header comment. That would prevent the
(probably) unnecessary string search in the buffer. However, I could not
verify that these tools would actually respect this assumption.

Additionally to this, e.g. the Twin project [1] has other non-whitespace
characters (some preprocessor directives) before the flex-generated
header comment. So the heuristic in the previous paragraph won't work
with that.
Thus, I would advocate the current implementation.

According to my measurement, this patch won't introduce measurable
performance degradation, even though we will do 2 linear scans.

I introduce the ignore-bison-generated-files and
ignore-flex-generated-files to disable skipping these files.
Both of these options are true by default.

[1]: https://github.com/cosmos72/twin/blob/master/server/rcparse_lex.cpp#L7
[2]: 22362cdcf9/sandbox/count-words/lexer.c (L6)
[3]: 11abdf6462/lab1/lex.yy.c (L6)

[4]: 47f5b2cfe2/B_yacc/1/y1.tab.h (L2)
[5]: 71d1bf9b1e/src/VBox/Additions/x11/x11include/xorg-server-1.8.0/parser.h (L2)
[6]: 3f773ceb13/Framework/OpenEars.framework/Versions/A/Headers/jsgf_parser.h (L2)

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D114510
2021-12-06 10:20:17 +01:00
Gabor Marton 20f8733d4b [Analyzer][solver] Simplification: Do a fixpoint iteration before the eq class merge
This reverts commit f02c5f3478 and
addresses the issue mentioned in D114619 differently.

Repeating the issue here:
Currently, during symbol simplification we remove the original member
symbol from the equivalence class (`ClassMembers` trait). However, we
keep the reverse link (`ClassMap` trait), in order to be able the query
the related constraints even for the old member. This asymmetry can lead
to a problem when we merge equivalence classes:
```
ClassA: [a, b]   // ClassMembers trait,
a->a, b->a       // ClassMap trait, a is the representative symbol
```
Now let,s delete `a`:
```
ClassA: [b]
a->a, b->a
```
Let's merge ClassA into the trivial class `c`:
```
ClassA: [c, b]
c->c, b->c, a->a
```
Now, after the merge operation, `c` and `a` are actually in different
equivalence classes, which is inconsistent.

This issue manifests in a test case (added in D103317):
```
void recurring_symbol(int b) {
  if (b * b != b)
    if ((b * b) * b * b != (b * b) * b)
      if (b * b == 1)
}
```
Before the simplification we have these equivalence classes:
```
trivial EQ1: [b * b != b]
trivial EQ2: [(b * b) * b * b != (b * b) * b]
```

During the simplification with `b * b == 1`, EQ1 is merged with `1 != b`
`EQ1: [b * b != b, 1 != b]` and we remove the complex symbol, so
`EQ1: [1 != b]`
Then we start to simplify the only symbol in EQ2:
`(b * b) * b * b != (b * b) * b --> 1 * b * b != 1 * b --> b * b != b`
But `b * b != b` is such a symbol that had been removed previously from
EQ1, thus we reach the above mentioned inconsistency.

This patch addresses the issue by making it impossible to synthesise a
symbol that had been simplified before. We achieve this by simplifying
the given symbol to the absolute simplest form.

Differential Revision: https://reviews.llvm.org/D114887
2021-12-01 22:23:41 +01:00
Gabor Marton 0a17896fe6 [Analyzer][Core] Make SValBuilder to better simplify svals with 3 symbols in the tree
Add the capability to simplify more complex constraints where there are 3
symbols in the tree. In this change I extend simplifySVal to query constraints
of children sub-symbols in a symbol tree. (The constraint for the parent is
asked in getKnownValue.)

Differential Revision: https://reviews.llvm.org/D103317
2021-11-30 11:24:59 +01:00
Gabor Marton f02c5f3478 [Analyzer][solver] Do not remove the simplified symbol from the eq class
Currently, during symbol simplification we remove the original member symbol
from the equivalence class (`ClassMembers` trait). However, we keep the
reverse link (`ClassMap` trait), in order to be able the query the
related constraints even for the old member. This asymmetry can lead to
a problem when we merge equivalence classes:
```
ClassA: [a, b]   // ClassMembers trait,
a->a, b->a       // ClassMap trait, a is the representative symbol
```
Now lets delete `a`:
```
ClassA: [b]
a->a, b->a
```
Let's merge the trivial class `c` into ClassA:
```
ClassA: [c, b]
c->c, b->c, a->a
```
Now after the merge operation, `c` and `a` are actually in different
equivalence classes, which is inconsistent.

One solution to this problem is to simply avoid removing the original
member and this is what this patch does.

Other options I have considered:
1) Always merge the trivial class into the non-trivial class. This might
   work most of the time, however, will fail if we have to merge two
   non-trivial classes (in that case we no longer can track equivalences
   precisely).
2) In `removeMember`, update the reverse link as well. This would cease
   the inconsistency, but we'd loose precision since we could not query
   the constraints for the removed member.

Differential Revision: https://reviews.llvm.org/D114619
2021-11-30 11:13:13 +01:00
Balazs Benics af37d4b6fe [analyzer][NFC] Refactor AnalysisConsumer::getModeForDecl()
I just read this part of the code, and I found the nested ifs less
readable.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D114441
2021-11-29 10:39:36 +01:00
Zarko Todorovski d42a6432aa [NFC][clang]Inclusive language: remove remaining uses of sanity
Missed some uses of sanity check in previous commits.
2021-11-24 14:20:13 -05:00
Gabor Marton 12887a2024 [Analyzer][Core] Better simplification in SimpleSValBuilder::evalBinOpNN
Make the SValBuilder capable to simplify existing
SVals based on a newly added constraints when evaluating a BinOp.

Before this patch, we called `simplify` only in some edge cases.
However, we can and should investigate the constraints in all cases.

Differential Revision: https://reviews.llvm.org/D113753
2021-11-23 16:38:01 +01:00
Gabor Marton ffc32efd1c [Analyzer][Core] Simplify IntSym in SValBuilder
Make the SimpleSValBuilder capable to simplify existing IntSym
expressions based on a newly added constraint on the sub-expression.

Differential Revision: https://reviews.llvm.org/D113754
2021-11-22 17:33:43 +01:00
Zarko Todorovski d8e5a0c42b [clang][NFC] Inclusive terms: replace some uses of sanity in clang
Rewording of comments to avoid using `sanity test, sanity check`.

Reviewed By: aaron.ballman, Quuxplusone

Differential Revision: https://reviews.llvm.org/D114025
2021-11-19 14:58:35 -05:00
Balazs Benics d5de568cc7 [analyzer][NFC] MaybeUInt -> MaybeCount
I forgot to include this in D113594

Differential Revision: https://reviews.llvm.org/D113594
2021-11-19 18:36:55 +01:00
Balazs Benics e6ef134f3c [analyzer][NFC] Use enum for CallDescription flags
Yeah, let's prefer a slightly stronger type representing this.

Reviewed By: martong, xazax.hun

Differential Revision: https://reviews.llvm.org/D113595
2021-11-19 18:32:13 +01:00
Balazs Benics 97f1bf15b1 [analyzer][NFC] Consolidate the inner representation of CallDescriptions
`CallDescriptions` have a `RequiredArgs` and `RequiredParams` members,
but they are of different types, `unsigned` and `size_t` respectively.
In the patch I use only `unsigned` for both, that should be large enough
anyway.
I also introduce the `MaybeUInt` type alias for `Optional<unsigned>`.

Additionally, I also avoid the use of the //smart// less-than operator.

  template <typename T>
  constexpr bool operator<=(const Optional<T> &X, const T &Y);

Which would check if the optional **has** a value and compare the data
only after. I found it surprising, thus I think we are better off
without it.

Reviewed By: martong, xazax.hun

Differential Revision: https://reviews.llvm.org/D113594
2021-11-19 18:32:13 +01:00
Balazs Benics de9d7e42ac [analyzer][NFC] CallDescription should own the qualified name parts
Previously, CallDescription simply referred to the qualified name parts
by `const char*` pointers.
In the future we might want to dynamically load and populate
`CallDescriptionMaps`, hence we will need the `CallDescriptions` to
actually **own** their qualified name parts.

Reviewed By: martong, xazax.hun

Differential Revision: https://reviews.llvm.org/D113593
2021-11-19 18:32:13 +01:00
Balazs Benics 9ad0a90baa [analyzer][NFC] Demonstrate the use of CallDescriptionSet
Reviewed By: martong, xazax.hun

Differential Revision: https://reviews.llvm.org/D113592
2021-11-19 18:32:13 +01:00
Balazs Benics f18da190b0 [analyzer][NFC] Switch to using CallDescription::matches() instead of isCalled()
This patch replaces each use of the previous API with the new one.
In variadic cases, it will use the ADL `matchesAny(Call, CDs...)`
variadic function.
Also simplifies some code involving such operations.

Reviewed By: martong, xazax.hun

Differential Revision: https://reviews.llvm.org/D113591
2021-11-19 18:32:13 +01:00
Balazs Benics 6c512703a9 [analyzer][NFC] Introduce CallDescription::matches() in addition to isCalled()
This patch introduces `CallDescription::matches()` member function,
accepting a `CallEvent`.
Semantically, `Call.isCalled(CD)` is the same as `CD.matches(Call)`.

The patch also introduces the `matchesAny()` variadic free function template.
It accepts a `CallEvent` and at least one `CallDescription` to match
against.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D113590
2021-11-19 18:32:13 +01:00
Balazs Benics d448fcd9b2 [analyzer][NFC] Introduce CallDescriptionSets
Sometimes we only want to decide if some function is called, and we
don't care which of the set.
This `CallDescriptionSet` will have the same behavior, except
instead of `lookup()` returning a pointer to the mapped value,
the `contains()` returns `bool`.
Internally, it uses the `CallDescriptionMap<bool>` for implementing the
behavior. It is preferred, to reuse the generic
`CallDescriptionMap::lookup()` logic, instead of duplicating it.
The generic version might be improved by implementing a hash lookup or
something along those lines.

Reviewed By: martong, Szelethus

Differential Revision: https://reviews.llvm.org/D113589
2021-11-19 18:32:13 +01:00
Kazu Hirata 74115602e8 [clang] Use range-based for loops with llvm::reverse (NFC) 2021-11-17 19:40:48 -08:00
Balazs Benics 0b9d3a6e53 [analyzer][NFC] Separate CallDescription from CallEvent
`CallDescriptions` deserve its own translation unit.
This patch simply moves the corresponding parts.
Also includes the `CallDescription.h` where it's necessary.

Reviewed By: martong, xazax.hun, Szelethus

Differential Revision: https://reviews.llvm.org/D113587
2021-11-15 19:10:46 +01:00
Denys Petrov f0bc7d2488 [analyzer] Fix region cast between the same types with different qualifiers.
Summary: Specifically, this fixes the case when we get an access to array element through the pointer to element. This covers several FIXME's. in https://reviews.llvm.org/D111654.
Example:
  const int arr[4][2];
  const int *ptr = arr[1]; // Fixes this.
The issue is that `arr[1]` is `int*` (&Element{Element{glob_arr5,1 S64b,int[2]},0 S64b,int}), and `ptr` is `const int*`. We don't take qualifiers into account. Consequently, we doesn't match the types as the same ones.

Differential Revision: https://reviews.llvm.org/D113480
2021-11-15 19:23:00 +02:00
Kazu Hirata d0ac215dd5 [clang] Use isa instead of dyn_cast (NFC) 2021-11-14 09:32:40 -08:00
Gabor Marton 01c9700aaa [analyzer][solver] Remove reference to RangedConstraintManager
We no longer need a reference to RangedConstraintManager, we call top
level `State->assume` functions.

Differential Revision: https://reviews.llvm.org/D113261
2021-11-12 11:44:49 +01:00
Gabor Marton 806329da07 [analyzer][solver] Iterate to a fixpoint during symbol simplification with constants
D103314 introduced symbol simplification when a new constant constraint is
added. Currently, we simplify existing equivalence classes by iterating over
all existing members of them and trying to simplify each member symbol with
simplifySVal.

At the end of such a simplification round we may end up introducing a
new constant constraint. Example:
```
  if (a + b + c != d)
    return;
  if (c + b != 0)
    return;
  // Simplification starts here.
  if (b != 0)
    return;
```
The `c == 0` constraint is the result of the first simplification iteration.
However, we could do another round of simplification to reach the conclusion
that `a == d`. Generally, we could do as many new iterations until we reach a
fixpoint.

We can reach to a fixpoint by recursively calling `State->assume` on the
newly simplified symbol. By calling `State->assume` we re-ignite the
whole assume machinery (along e.g with adjustment handling).

Why should we do this? By reaching a fixpoint in simplification we are capable
of discovering infeasible states at the moment of the introduction of the
**first** constant constraint.
Let's modify the previous example just a bit, and consider what happens without
the fixpoint iteration.
```
  if (a + b + c != d)
    return;
  if (c + b != 0)
    return;
  // Adding a new constraint.
  if (a == d)
    return;
  // This brings in a contradiction.
  if (b != 0)
    return;
  clang_analyzer_warnIfReached(); // This produces a warning.
              // The path is already infeasible...
  if (c == 0) // ...but we realize that only when we evaluate `c == 0`.
    return;
```
What happens currently, without the fixpoint iteration? As the inline comments
suggest, without the fixpoint iteration we are doomed to realize that we are on
an infeasible path only after we are already walking on that. With fixpoint
iteration we can detect that before stepping on that. With fixpoint iteration,
the `clang_analyzer_warnIfReached` does not warn in the above example b/c
during the evaluation of `b == 0` we realize the contradiction. The engine and
the checkers do rely on that either `assume(Cond)` or `assume(!Cond)` should be
feasible. This is in fact assured by the so called expensive checks
(LLVM_ENABLE_EXPENSIVE_CHECKS). The StdLibraryFuncionsChecker is notably one of
the checkers that has a very similar assertion.

Before this patch, we simply added the simplified symbol to the equivalence
class. In this patch, after we have added the simplified symbol, we remove the
old (more complex) symbol from the members of the equivalence class
(`ClassMembers`). Removing the old symbol is beneficial because during the next
iteration of the simplification we don't have to consider again the old symbol.

Contrary to how we handle `ClassMembers`, we don't remove the old Sym->Class
relation from the `ClassMap`. This is important for two reasons: The
constraints of the old symbol can still be found via it's equivalence class
that it used to be the member of (1). We can spare one removal and thus one
additional tree in the forest of `ClassMap` (2).

Performance and complexity: Let us assume that in a State we have N non-trivial
equivalence classes and that all constraints and disequality info is related to
non-trivial classes. In the worst case, we can simplify only one symbol of one
class in each iteration. The number of symbols in one class cannot grow b/c we
replace the old symbol with the simplified one. Also, the number of the
equivalence classes can decrease only, b/c the algorithm does a merge operation
optionally. We need N iterations in this case to reach the fixpoint. Thus, the
steps needed to be done in the worst case is proportional to `N*N`. Empirical
results (attached) show that there is some hardly noticeable run-time and peak
memory discrepancy compared to the baseline. In my opinion, these differences
could be the result of measurement error.
This worst case scenario can be extended to that cases when we have trivial
classes in the constraints and in the disequality map are transforming to such
a State where there are only non-trivial classes, b/c the algorithm does merge
operations. A merge operation on two trivial classes results in one non-trivial
class.

Differential Revision: https://reviews.llvm.org/D106823
2021-11-12 11:44:49 +01:00
Denys Petrov a12bfac292 [analyzer] Retrieve a value from list initialization of multi-dimensional array declaration.
Summary: Add support of multi-dimensional arrays in `RegionStoreManager::getBindingForElement`. Handle nested ElementRegion's getting offsets and checking for being in bounds. Get values from the nested initialization lists using obtained offsets.

Differential Revision: https://reviews.llvm.org/D111654
2021-11-08 16:17:55 +02:00
Balazs Benics 9b5c9c469d [analyzer] Dump checker name if multiple checkers evaluate the same call
Previously, if accidentally multiple checkers `eval::Call`-ed the same
`CallEvent`, in debug builds the analyzer detected this and crashed
with the message stating this. Unfortunately, the message did not state
the offending checkers violating this invariant.
This revision addresses this by printing a more descriptive message
before aborting.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D112889
2021-11-02 14:42:14 +01:00
Kazu Hirata 4db2e4cebe Use {DenseSet,SetVector,SmallPtrSet}::contains (NFC) 2021-10-30 19:00:19 -07:00
Zarko Todorovski 8659b241ae [clang][NFC] Inclusive terms: Replace uses of whitelist in clang/lib/StaticAnalyzer
Replace variable and functions names, as well as comments that contain whitelist with
more inclusive terms.

Reviewed By: aaron.ballman, martong

Differential Revision: https://reviews.llvm.org/D112642
2021-10-29 16:51:36 -04:00
Denys Petrov 1deccd05ba [analyzer] Retrieve a character from StringLiteral as an initializer for constant arrays.
Summary: Assuming that values of constant arrays never change, we can retrieve values for specific position(index) right from the initializer, if presented. Retrieve a character code by index from StringLiteral which is an initializer of constant arrays in global scope.

This patch has a known issue of getting access to characters past the end of the literal. The declaration, in which the literal is used, is an implicit cast of kind `array-to-pointer`. The offset should be in literal length's bounds. This should be distinguished from the states in the Standard C++20 [dcl.init.string] 9.4.2.3. Example:
  const char arr[42] = "123";
  char c = arr[41]; // OK
  const char * const str = "123";
  char c = str[41]; // NOK

Differential Revision: https://reviews.llvm.org/D107339
2021-10-29 19:44:37 +03:00
Mike Rice 6f9c25167d [OpenMP] Initial parsing/sema for the 'omp loop' construct
Adds basic parsing/sema/serialization support for the #pragma omp loop
directive.

Differential Revision: https://reviews.llvm.org/D112499
2021-10-28 08:26:43 -07:00
Balazs Benics 49285f43e5 [analyzer] sprintf is a taint propagator not a source
Due to a typo, `sprintf()` was recognized as a taint source instead of a
taint propagator. It was because an empty taint source list - which is
the first parameter of the `TaintPropagationRule` - encoded the
unconditional taint sources.
This typo effectively turned the `sprintf()` into an unconditional taint
source.

This patch fixes that typo and demonstrated the correct behavior with
tests.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D112558
2021-10-28 11:03:02 +02:00
Gabor Marton a8297ed994 [Analyzer][solver] Handle adjustments in constraint assignor remainder
We can reuse the "adjustment" handling logic in the higher level
of the solver by calling `State->assume`.

Differential Revision: https://reviews.llvm.org/D112296
2021-10-27 17:14:34 +02:00
Gabor Marton 888af47095 [Analyzer][solver] Simplification: reorganize equalities with adjustment
Initiate the reorganization of the equality information during symbol
simplification. E.g., if we bump into `c + 1 == 0` during simplification
then we'd like to express that `c == -1`. It makes sense to do this only
with `SymIntExpr`s.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D111642
2021-10-27 16:48:55 +02:00
Balazs Benics c18407217e [analyzer] Fix StringChecker for Unknown params
It seems like protobuf crashed the `std::string` checker.
Somehow it acquired `UnknownVal` as the sole `std::string` constructor
parameter, causing a crash in the `castAs<Loc>()`.

This patch addresses this.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D112551
2021-10-26 18:15:00 +02:00
Denys Petrov 3b1165ba3d [analyzer] Retrieve incomplete array extent from its redeclaration.
Summary: Fix a case when the extent can not be retrieved correctly from incomplete array declaration. Use redeclaration to get the array extent.

Differential Revision: https://reviews.llvm.org/D111542
2021-10-25 15:14:10 +03:00
Denys Petrov 44e803ef6d [analyzer][NFCI] Move a block from `getBindingForElement` to separate functions
Summary:
1. Improve readability by moving deeply nested block of code from RegionStoreManager::getBindingForElement to new separate functions:
- getConstantValFromConstArrayInitializer;
- getSValFromInitListExpr.
2. Handle the case when index is a symbolic value. Write specific test cases.
3. Add test cases when there is no initialization expression presented.
This patch implies to make next patches clearer and easier for review process.

Differential Revision: https://reviews.llvm.org/D106681
2021-10-25 15:14:10 +03:00
Balazs Benics e1fdec875f [analyzer] Add std::string checker
This patch adds a checker checking `std::string` operations.
At first, it only checks the `std::string` single `const char *`
constructor for nullness.
If It might be `null`, it will constrain it to non-null and place a note
tag there.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D111247
2021-10-25 11:15:40 +02:00
Balazs Benics f9db6a44eb Revert "[analyzer][solver] Introduce reasoning for not equal to operator"
This reverts commit cac8808f15.

 #5 0x00007f28ec629859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x25859)
 #6 0x00007f28ec629729 (/lib/x86_64-linux-gnu/libc.so.6+0x25729)
 #7 0x00007f28ec63af36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
 #8 0x00007f28ecc2cc46 llvm::APInt::compareSigned(llvm::APInt const&) const (libLLVMSupport.so.14git+0xeac46)
 #9 0x00007f28e7bbf957 (anonymous namespace)::SymbolicRangeInferrer::VisitBinaryOperator(clang::ento::RangeSet, clang::BinaryOperatorKind, clang::ento::RangeSet, clang::QualType) (libclangStaticAnalyzerCore.so.14git+0x1df957)
 #10 0x00007f28e7bbf2db (anonymous namespace)::SymbolicRangeInferrer::infer(clang::ento::SymExpr const*) (libclangStaticAnalyzerCore.so.14git+0x1df2db)
 #11 0x00007f28e7bb2b5e (anonymous namespace)::RangeConstraintManager::assumeSymNE(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const*, llvm::APSInt const&, llvm::APSInt const&) (libclangStaticAnalyzerCore.so.14git+0x1d2b5e)
 #12 0x00007f28e7bc67af clang::ento::RangedConstraintManager::assumeSymUnsupported(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const*, bool) (libclangStaticAnalyzerCore.so.14git+0x1e67af)
 #13 0x00007f28e7be3578 clang::ento::SimpleConstraintManager::assumeAux(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (libclangStaticAnalyzerCore.so.14git+0x203578)
 #14 0x00007f28e7be33d8 clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (libclangStaticAnalyzerCore.so.14git+0x2033d8)
 #15 0x00007f28e7be32fb clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal, bool) (libclangStaticAnalyzerCore.so.14git+0x2032fb)
 #16 0x00007f28e7b15dbc clang::ento::ConstraintManager::assumeDual(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal) (libclangStaticAnalyzerCore.so.14git+0x135dbc)
 #17 0x00007f28e7b4780f clang::ento::ExprEngine::evalEagerlyAssumeBinOpBifurcation(clang::ento::ExplodedNodeSet&, clang::ento::ExplodedNodeSet&, clang::Expr const*) (libclangStaticAnalyzerCore.so.14git+0x16780f)

This is known to be triggered on curl, tinyxml2, tmux, twin and on xerces.
But @bjope also reported similar crashes.
So, I'm reverting it to make our internal bots happy again.

Differential Revision: https://reviews.llvm.org/D106102
2021-10-23 21:01:59 +02:00
Kazu Hirata d8e4170b0a Ensure newlines at the end of files (NFC) 2021-10-23 08:45:29 -07:00
Manas cac8808f15 [analyzer][solver] Introduce reasoning for not equal to operator
Prior to this, the solver was only able to verify whether two symbols
are equal/unequal, only when constants were involved. This patch allows
the solver to work over ranges as well.

Reviewed By: steakhal, martong

Differential Revision: https://reviews.llvm.org/D106102

Patch by: @manas (Manas Gupta)
2021-10-22 12:00:08 +02:00
Gabor Marton 5f8dca0235 [Analyzer] Extend ConstraintAssignor to handle remainder op
Summary:
`a % b != 0` implies that `a != 0` for any `a` and `b`. This patch
extends the ConstraintAssignor to do just that. In fact, we could do
something similar with division and in case of multiplications we could
have some other inferences, but I'd like to keep these for future
patches.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51940

Reviewers: noq, vsavchenko, steakhal, szelethus, asdenyspetrov

Subscribers:

Differential Revision: https://reviews.llvm.org/D110357
2021-10-22 10:47:25 +02:00
Gabor Marton e2a2c8328f [Analyzer][NFC] Add RangedConstraintManager to ConstraintAssignor
In this patch we store a reference to `RangedConstraintManager` in the
`ConstraintAssignor`. This way it is possible to call back and reuse some
functions of it. This patch is exclusively needed for its child patches,
it is not intended to be a standalone patch.

Differential Revision: https://reviews.llvm.org/D111640
2021-10-22 10:46:28 +02:00
Gabor Marton 01b4ddbfbb [Analyzer][NFC] Move RangeConstraintManager's def before ConstraintAssignor's def
In this patch we simply move the definition of RangeConstraintManager before
the definition of ConstraintAssignor. This patch is exclusively needed for it's
child patch, so in the child the diff would be clean and the review would be
easier.

Differential Revision: https://reviews.llvm.org/D110387
2021-10-22 10:46:28 +02:00
Simon Pilgrim 7562f3df89 InvalidPtrChecker - don't dereference a dyn_cast<> - use cast<> instead.
Avoid dereferencing a nullptr returned by dyn_cast<>, by using cast<> instead which asserts that the cast is valid.
2021-10-20 18:06:00 +01:00
Balazs Benics 16be17ad4b [analyzer][NFC] Refactor llvm::isa<> usages in the StaticAnalyzer
It turns out llvm::isa<> is variadic, and we could have used this at a
lot of places.

The following patterns:
  x && isa<T1>(x) || isa<T2>(x) ...
Will be replaced by:
  isa_and_non_null<T1, T2, ...>(x)

Sometimes it caused further simplifications, when it would cause even
more code smell.

Aside from this, keep in mind that within `assert()` or any macro
functions, we need to wrap the isa<> expression within a parenthesis,
due to the parsing of the comma.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D111982
2021-10-20 17:43:31 +02:00
Kazu Hirata 0abb5d293c [Sema, StaticAnalyzer] Use StringRef::contains (NFC) 2021-10-20 08:02:36 -07:00
Balazs Benics 72d04d7b2b [analyzer] Allow matching non-CallExprs using CallDescriptions
Fallback to stringification and string comparison if we cannot compare
the `IdentifierInfo`s, which is the case for C++ overloaded operators,
constructors, destructors, etc.

Examples:
  { "std", "basic_string", "basic_string", 2} // match the 2 param std::string constructor
  { "std", "basic_string", "~basic_string" }  // match the std::string destructor
  { "aaa", "bbb", "operator int" } // matches the struct bbb conversion operator to int

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D111535
2021-10-18 14:57:24 +02:00
Balazs Benics 3ec7b91141 [analyzer][NFC] Refactor CallEvent::isCalled()
Refactor the code to make it more readable.

It will set up further changes, and improvements to this code in
subsequent patches.
This is a non-functional change.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D111534
2021-10-18 14:57:24 +02:00
Kazu Hirata d245f2e859 [clang] Use llvm::erase_if (NFC) 2021-10-17 13:50:29 -07:00
Kazu Hirata 6a154e606e [clang] Use llvm::is_contained (NFC) 2021-10-15 10:07:08 -07:00
Artem Dergachev 12cbc8cbf0 [analyzer] Fix property access kind detection inside parentheses.
'(self.prop)' produces a surprising AST where ParenExpr
resides inside `PseudoObjectExpr.

This breaks ObjCMethodCall::getMessageKind() which in turn causes us
to perform unnecessary dynamic dispatch bifurcation when evaluating
body-farmed property accessors, which in turn causes us
to explore infeasible paths.
2021-10-14 21:07:19 -07:00
Gabor Marton ac3edc5af0 [analyzer][solver] Handle simplification to ConcreteInt
The solver's symbol simplification mechanism was not able to handle cases
when a symbol is simplified to a concrete integer. This patch adds the
capability.

E.g., in the attached lit test case, the original symbol is `c + 1` and
it has a `[0, 0]` range associated with it. Then, a new condition `c == 0`
is assumed, so a new range constraint `[0, 0]` comes in for `c` and
simplification kicks in. `c + 1` becomes `0 + 1`, but the associated
range is `[0, 0]`, so now we are able to realize the contradiction.

Differential Revision: https://reviews.llvm.org/D110913
2021-10-14 17:53:29 +02:00
Kazu Hirata e567f37dab [clang] Use llvm::is_contained (NFC) 2021-10-13 20:41:55 -07:00
Balazs Benics edde4efc66 [analyzer] Introduce the assume-controlled-environment config option
If the `assume-controlled-environment` is `true`, we should expect `getenv()`
to succeed, and the result should not be considered tainted.
By default, the option will be `false`.

Reviewed By: NoQ, martong

Differential Revision: https://reviews.llvm.org/D111296
2021-10-13 10:50:26 +02:00
Balazs Benics 7fc150309d [analyzer] Bifurcate on getenv() calls
The `getenv()` function might return `NULL` just like any other function.
However, in case of `getenv()` a state-split seems justified since the
programmer should expect the failure of this function.

`secure_getenv(const char *name)` behaves the same way but is not handled
right now.
Note that `std::getenv()` is also not handled.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D111245
2021-10-13 10:50:26 +02:00
Artem Dergachev f3ec9d8501 [analyzer] Fix non-obvious analyzer warning: Use of zero-allocated memory.
Clarify the message provided when the analyzer catches the use of memory
that is allocated with size zero.

Differential Revision: https://reviews.llvm.org/D111655
2021-10-12 10:41:00 -07:00
Gabor Marton b8f6c85a83 [analyzer][NFC] Add RangeSet::dump
This tiny change improves the debugging experience of the solver a lot!

Differential Revision: https://reviews.llvm.org/D110911
2021-10-06 18:45:07 +02:00
Gabor Marton 792be5df92 [analyzer][solver] Fix CmpOpTable handling bug
There is an error in the implementation of the logic of reaching the `Unknonw` tristate in CmpOpTable.

```
void cmp_op_table_unknownX2(int x, int y, int z) {
  if (x >= y) {
                    // x >= y    [1, 1]
    if (x + z < y)
      return;
                    // x + z < y [0, 0]
    if (z != 0)
      return;
                    // x < y     [0, 0]
    clang_analyzer_eval(x > y);  // expected-warning{{TRUE}} expected-warning{{FALSE}}
  }
}
```
We miss the `FALSE` warning because the false branch is infeasible.

We have to exploit simplification to discover the bug. If we had `x < y`
as the second condition then the analyzer would return the parent state
on the false path and the new constraint would not be part of the State.
But adding `z` to the condition makes both paths feasible.

The root cause of the bug is that we reach the `Unknown` tristate
twice, but in both occasions we reach the same `Op` that is `>=` in the
test case. So, we reached `>=` twice, but we never reached `!=`, thus
querying the `Unknonw2x` column with `getCmpOpStateForUnknownX2` is
wrong.

The solution is to ensure that we reached both **different** `Op`s once.

Differential Revision: https://reviews.llvm.org/D110910
2021-10-06 18:28:03 +02:00
Vince Bridgers b29186c08a [analyzer] canonicalize special case of structure/pointer deref
This simple change addresses a special case of structure/pointer
aliasing that produced different symbolvals, leading to false positives
during analysis.

The reproducer is as simple as this.

```lang=C++
struct s {
  int v;
};

void foo(struct s *ps) {
  struct s ss = *ps;
  clang_analyzer_dump(ss.v); // reg_$1<int Element{SymRegion{reg_$0<struct s *ps>},0 S64b,struct s}.v>
  clang_analyzer_dump(ps->v); //reg_$3<int SymRegion{reg_$0<struct s *ps>}.v>
  clang_analyzer_eval(ss.v == ps->v); // UNKNOWN
}
```

Acks: Many thanks to @steakhal and @martong for the group debug session.

Reviewed By: steakhal, martong

Differential Revision: https://reviews.llvm.org/D110625
2021-10-06 05:18:27 -05:00
Corentin Jabot 424733c12a Implement if consteval (P1938)
Modify the IfStmt node to suppoort constant evaluated expressions.

Add a new ExpressionEvaluationContext::ImmediateFunctionContext to
keep track of immediate function contexts.

This proved easier/better/probably more efficient than walking the AST
backward as it allows diagnosing nested if consteval statements.
2021-10-05 08:04:14 -04:00
Zurab Tsinadze 811b1736d9 [analyzer] Add InvalidPtrChecker
This patch introduces a new checker: `alpha.security.cert.env.InvalidPtr`

Checker finds usage of invalidated pointers related to environment.

Based on the following SEI CERT Rules:
ENV34-C: https://wiki.sei.cmu.edu/confluence/x/8tYxBQ
ENV31-C: https://wiki.sei.cmu.edu/confluence/x/5NUxBQ

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D97699
2021-10-04 17:08:34 +02:00
Jay Foad d933adeaca [APInt] Stop using soft-deprecated constructors and methods in clang. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in clang.

Differential Revision: https://reviews.llvm.org/D110808
2021-10-04 09:38:11 +01:00
Denys Petrov 98a95d4844 [analyzer] Retrieve a value from list initialization of constant array declaration in a global scope.
Summary: Fix the point that we didn't take into account array's dimension. Retrieve a value of global constant array by iterating through its initializer list.

Differential Revision: https://reviews.llvm.org/D104285

Fixes: https://bugs.llvm.org/show_bug.cgi?id=50604
2021-09-24 12:37:58 +03:00
Nico Weber 9197834535 Revert "Fix CLANG_ENABLE_STATIC_ANALYZER=OFF building all analyzer source"
This reverts commit 6d7b3d6b3a.
Breaks running cmake with `-DCLANG_ENABLE_STATIC_ANALYZER=OFF`
without turning off CLANG_TIDY_ENABLE_STATIC_ANALYZER.
See comments on https://reviews.llvm.org/D109611 for details.
2021-09-20 16:18:03 -04:00
Alex Richardson 6d7b3d6b3a Fix CLANG_ENABLE_STATIC_ANALYZER=OFF building all analyzer source
Since https://reviews.llvm.org/D87118, the StaticAnalyzer directory is
added unconditionally. In theory this should not cause the static analyzer
sources to be built unless they are referenced by another target. However,
the clang-cpp target (defined in clang/tools/clang-shlib) uses the
CLANG_STATIC_LIBS global property to determine which libraries need to
be included. To solve this issue, this patch avoids adding libraries to
that property if EXCLUDE_FROM_ALL is set.

In case something like this comes up again: `cmake --graphviz=targets.dot`
is quite useful to see why a target is included as part of `ninja all`.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D109611
2021-09-20 12:55:56 +01:00
alokmishra.besu 000875c127 OpenMP 5.0 metadirective
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.

A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h

Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91944
2021-09-18 13:40:44 -05:00
Nico Weber 31cca21565 Revert "OpenMP 5.0 metadirective"
This reverts commit c7d7b98e52.
Breaks tests on macOS, see comment on https://reviews.llvm.org/D91944
2021-09-18 09:10:37 -04:00
alokmishra.besu 347f3c186d OpenMP 5.0 metadirective
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.

A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h

Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91944
2021-09-17 16:30:06 -05:00
cchen 7efb825382 Revert "OpenMP 5.0 metadirective"
This reverts commit c7d7b98e52.
2021-09-17 16:14:16 -05:00
cchen c7d7b98e52 OpenMP 5.0 metadirective
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.

A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h

Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91944
2021-09-17 16:03:13 -05:00
Gabor Marton 96ec9b6ff2 [Analyzer] ConversionChecker: track back the cast expression
Adding trackExpressionValue to the checker so it tracks the value of the
implicit cast's DeclRefExpression up to initialization/assignment. This
way the report becomes cleaner.

Differential Revision: https://reviews.llvm.org/D109836
2021-09-16 11:42:54 +02:00
Kristóf Umann 9d359f6c73 [analyzer] MallocChecker: Add notes from NoOwnershipChangeVisitor only when a function "intents", but doesn't change ownership, enable by default
D105819 Added NoOwnershipChangeVisitor, but it is only registered when an
off-by-default, hidden checker option was enabled. The reason behind this was
that it grossly overestimated the set of functions that really needed a note:

std::string getTrainName(const Train *T) {
  return T->name;
} // note: Retuning without changing the ownership of or deallocating memory
// Umm... I mean duh? Nor would I expect this function to do anything like that...

void foo() {
  Train *T = new Train("Land Plane");
  print(getTrainName(T)); // note: calling getTrainName / returning from getTrainName
} // warn: Memory leak

This patch adds a heuristic that guesses that any function that has an explicit
operator delete call could have be responsible for deallocating the memory that
ended up leaking. This is waaaay too conservative (see the TODOs in the new
function), but it safer to err on the side of too little than too much, and
would allow us to enable the option by default *now*, and add refinements
one-by-one.

Differential Revision: https://reviews.llvm.org/D108753
2021-09-13 15:01:20 +02:00
Kristóf Umann 0213d7ec0c [analyzer][NFCI] Allow clients of NoStateChangeFuncVisitor to check entire function calls, rather than each ExplodedNode in it
Fix a compilation error due to a missing 'template' keyword.

Differential Revision: https://reviews.llvm.org/D108695
2021-09-13 13:50:01 +02:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Balazs Benics b97a96400a [analyzer] SValBuilder should have an easy access to AnalyzerOptions
`SVB.getStateManager().getOwningEngine().getAnalysisManager().getAnalyzerOptions()`
is quite a mouthful and might involve a few pointer indirections to get
such a simple thing like an analyzer option.

This patch introduces an `AnalyzerOptions` reference to the `SValBuilder`
abstract class, while refactors a few cases to use this /simpler/ accessor.

Reviewed By: martong, Szelethus

Differential Revision: https://reviews.llvm.org/D108824
2021-09-04 10:19:57 +02:00
Balazs Benics 91c07eb8ee [analyzer] Ignore single element arrays in getStaticSize() conditionally
Quoting https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html:
> In the absence of the zero-length array extension, in ISO C90 the contents
> array in the example above would typically be declared to have a single
> element.

We should not assume that the size of the //flexible array member// field has
a single element, because in some cases they use it as a fallback for not
having the //zero-length array// language extension.
In this case, the analyzer should return `Unknown` as the extent of the field
instead.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D108230
2021-09-04 10:19:57 +02:00
Jessica Paquette b9e57e0305 Revert "[analyzer][NFCI] Allow clients of NoStateChangeFuncVisitor to check entire function calls, rather than each ExplodedNode in it"
This reverts commit a375bfb5b7.

This was causing a bot to crash:

https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/23380/
2021-09-03 10:28:07 -07:00
Kristóf Umann a375bfb5b7 [analyzer][NFCI] Allow clients of NoStateChangeFuncVisitor to check entire function calls, rather than each ExplodedNode in it
D105553 added NoStateChangeFuncVisitor, an abstract class to aid in creating
notes such as "Returning without writing to 'x'", or "Returning without changing
the ownership status of allocated memory". Its clients need to define, among
other things, what a change of state is.

For code like this:

f() {
  g();
}

foo() {
  f();
  h();
}

We'd have a path in the ExplodedGraph that looks like this:

             -- <g> -->
            /          \
         ---     <f>    -------->        --- <h> --->
        /                        \      /            \
--------        <foo>             ------    <foo>     -->

When we're interested in whether f neglected to change some property,
NoStateChangeFuncVisitor asks these questions:

                       ÷×~
                -- <g> -->
           ß   /          \$    @&#*
            ---     <f>    -------->        --- <h> --->
           /                        \      /            \
   --------        <foo>             ------    <foo>     -->

Has anything changed in between # and *?
Has anything changed in between & and *?
Has anything changed in between @ and *?
...
Has anything changed in between $ and *?
Has anything changed in between × and ~?
Has anything changed in between ÷ and ~?
...
Has anything changed in between ß and *?
...
This is a rather thorough line of questioning, which is why in D105819, I was
only interested in whether state *right before* and *right after* a function
call changed, and early returned to the CallEnter location:

if (!CurrN->getLocationAs<CallEnter>())
  return;
Except that I made a typo, and forgot to negate the condition. So, in this
patch, I'm fixing that, and under the same hood allow all clients to decide to
do this whole-function check instead of the thorough one.

Differential Revision: https://reviews.llvm.org/D108695
2021-09-03 13:50:18 +02:00
Kristóf Umann 3891b45a06 Revert "[analyzer][NFCI] Allow clients of NoStateChangeFuncVisitor to check entire function calls, rather than each ExplodedNode in it"
This reverts commit 7d0e62bfb7.
2021-09-02 17:19:49 +02:00
Kristóf Umann 7d0e62bfb7 [analyzer][NFCI] Allow clients of NoStateChangeFuncVisitor to check entire function calls, rather than each ExplodedNode in it
D105553 added NoStateChangeFuncVisitor, an abstract class to aid in creating
notes such as "Returning without writing to 'x'", or "Returning without changing
the ownership status of allocated memory". Its clients need to define, among
other things, what a change of state is.

For code like this:

f() {
  g();
}

foo() {
  f();
  h();
}

We'd have a path in the ExplodedGraph that looks like this:

             -- <g> -->
            /          \
         ---     <f>    -------->        --- <h> --->
        /                        \      /            \
--------        <foo>             ------    <foo>     -->

When we're interested in whether f neglected to change some property,
NoStateChangeFuncVisitor asks these questions:

                       ÷×~
                -- <g> -->
           ß   /          \$    @&#*
            ---     <f>    -------->        --- <h> --->
           /                        \      /            \
   --------        <foo>             ------    <foo>     -->

Has anything changed in between # and *?
Has anything changed in between & and *?
Has anything changed in between @ and *?
...
Has anything changed in between $ and *?
Has anything changed in between × and ~?
Has anything changed in between ÷ and ~?
...
Has anything changed in between ß and *?
...
This is a rather thorough line of questioning, which is why in D105819, I was
only interested in whether state *right before* and *right after* a function
call changed, and early returned to the CallEnter location:

if (!CurrN->getLocationAs<CallEnter>())
  return;
Except that I made a typo, and forgot to negate the condition. So, in this
patch, I'm fixing that, and under the same hood allow all clients to decide to
do this whole-function check instead of the thorough one.

Differential Revision: https://reviews.llvm.org/D108695
2021-09-02 16:56:32 +02:00
Fanbo Meng ae206db2d6 [SystemZ][z/OS] Create html report file with text flag
Change OF_None to OF_Text flag in file creation, same reasoning as https://reviews.llvm.org/D97785

Reviewed By: abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D108998
2021-08-31 11:52:04 -04:00
Balazs Benics 68088563fb [analyzer] MallocOverflow should consider comparisons only preceding malloc
MallocOverflow works in two phases:

1) Collects suspicious malloc calls, whose argument is a multiplication
2) Filters the aggregated list of suspicious malloc calls by iterating
   over the BasicBlocks of the CFG looking for comparison binary
   operators over the variable constituting in any suspicious malloc.

Consequently, it suppressed true-positive cases when the comparison
check was after the malloc call.
In this patch the checker will consider the relative position of the
relation check to the malloc call.

E.g.:

```lang=C++
void *check_after_malloc(int n, int x) {
  int *p = NULL;
  if (x == 42)
    p = malloc(n * sizeof(int)); // Previously **no** warning, now it
                                 // warns about this.

  // The check is after the allocation!
  if (n > 10) {
    // Do something conditionally.
  }
  return p;
}
```

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D107804
2021-08-27 14:41:26 +02:00
Balazs Benics 6ad47e1c4f [analyzer] Catch leaking stack addresses via stack variables
Not only global variables can hold references to dead stack variables.
Consider this example:

  void write_stack_address_to(char **q) {
    char local;
    *q = &local;
  }

  void test_stack() {
    char *p;
    write_stack_address_to(&p);
  }

The address of 'local' is assigned to 'p', which becomes a dangling
pointer after 'write_stack_address_to()' returns.

The StackAddrEscapeChecker was looking for bindings in the store which
referred to variables of the popped stack frame, but it only considered
global variables in this regard. This patch relaxes this, catching
stack variable bindings as well.

---

This patch also works for temporary objects like:

  struct Bar {
    const int &ref;
    explicit Bar(int y) : ref(y) {
      // Okay.
    } // End of the constructor call, `ref` is dangling now. Warning!
  };

  void test() {
    Bar{33}; // Temporary object, so the corresponding memregion is
             // *not* a VarRegion.
  }

---

The return value optimization aka. copy-elision might kick in but that
is modeled by passing an imaginary CXXThisRegion which refers to the
parent stack frame which is supposed to be the 'return slot'.
Objects residing in the 'return slot' outlive the scope of the inner
call, thus we should expect no warning about them - except if we
explicitly disable copy-elision.

Reviewed By: NoQ, martong

Differential Revision: https://reviews.llvm.org/D107078
2021-08-27 11:31:16 +02:00
Artem Dergachev 7309359928 [analyzer] Fix scan-build report deduplication.
The previous behavior was to deduplicate reports based on md5 of the
html file. This algorithm might have worked originally but right now
HTML reports contain information rich enough to make them virtually
always distinct which breaks deduplication entirely.

The new strategy is to (finally) take advantage of IssueHash - the
stable report identifier provided by clang that is the same if and only if
the reports are duplicates of each other.

Additionally, scan-build no longer performs deduplication on its own.
Instead, the report file name is now based on the issue hash,
and clang instances will silently refuse to produce a new html file
when a duplicate already exists. This eliminates the problem entirely.

The '-analyzer-config stable-report-filename' option is deprecated
because report filenames are no longer unstable. A new option is
introduced, '-analyzer-config verbose-report-filename', to produce
verbose file names that look similar to the old "stable" file names.
The old option acts as an alias to the new option.

Differential Revision: https://reviews.llvm.org/D105167
2021-08-26 13:34:29 -07:00
Balazs Benics e5646b9254 Revert "Revert "[analyzer] Ignore IncompleteArrayTypes in getStaticSize() for FAMs""
This reverts commit df1f4e0cc6.

Now the test case explicitly specifies the target triple.
I decided to use x86_64 for that matter, to have a fixed
bitwidth for `size_t`.

Aside from that, relanding the original changes of:
https://reviews.llvm.org/D105184
2021-08-25 17:19:06 +02:00
Balazs Benics df1f4e0cc6 Revert "[analyzer] Ignore IncompleteArrayTypes in getStaticSize() for FAMs"
This reverts commit 360ced3b8f.
2021-08-25 16:43:25 +02:00
Balazs Benics 360ced3b8f [analyzer] Ignore IncompleteArrayTypes in getStaticSize() for FAMs
Currently only `ConstantArrayType` is considered for flexible array
members (FAMs) in `getStaticSize()`.
However, `IncompleteArrayType` also shows up in practice as FAMs.

This patch will ignore the `IncompleteArrayType` and return Unknown
for that case as well. This way it will be at least consistent with
the current behavior until we start modeling them accurately.

I'm expecting that this will resolve a bunch of false-positives
internally, caused by the `ArrayBoundV2`.

Reviewed By: ASDenysPetrov

Differential Revision: https://reviews.llvm.org/D105184
2021-08-25 16:12:17 +02:00
Simon Pilgrim ae691648b4 Fix unknown parameter Wdocumentation warning. NFC. 2021-08-19 15:40:10 +01:00
Denys Petrov 9dabacd09f [analyzer] Adjust JS code of analyzer's HTML report for IE support.
Summary: Change and replace some functions which IE does not support. This patch is made as a continuation of D92928 revision. Also improve hot keys behavior.

Differential Revision: https://reviews.llvm.org/D107366
2021-08-17 19:32:34 +03:00
Rong Xu 9b8425e42c Reapply commit b7425e956
The commit b7425e956: [NFC] fix typos
is harmless but was reverted by accident. Reapply.
2021-08-16 12:18:40 -07:00
Kostya Kortchinsky 80ed75e7fb Revert "[NFC] Fix typos"
This reverts commit b7425e956b.
2021-08-16 11:13:05 -07:00
Rong Xu b7425e956b [NFC] Fix typos
s/senstive/senstive/g
2021-08-16 10:15:30 -07:00
Kristóf Umann 2d3668c997 [analyzer] MallocChecker: Add a visitor to leave a note on functions that could have, but did not change ownership on leaked memory
This is a rather common feedback we get from out leak checkers: bug reports are
really short, and are contain barely any usable information on what the analyzer
did to conclude that a leak actually happened.

This happens because of our bug report minimizing effort. We construct bug
reports by inspecting the ExplodedNodes that lead to the error from the bottom
up (from the error node all the way to the root of the exploded graph), and mark
entities that were the cause of a bug, or have interacted with it as
interesting. In order to make the bug report a bit less verbose, whenever we
find an entire function call (from CallEnter to CallExitEnd) that didn't talk
about any interesting entity, we prune it (click here for more info on bug
report generation). Even if the event to highlight is exactly this lack of
interaction with interesting entities.

D105553 generalized the visitor that creates notes for these cases. This patch
adds a new kind of NoStateChangeVisitor that leaves notes in functions that
took a piece of dynamically allocated memory that later leaked as parameter,
and didn't change its ownership status.

Differential Revision: https://reviews.llvm.org/D105553
2021-08-16 16:19:00 +02:00
Kristóf Umann c019142a89 [analyzer][NFC] Split the main logic of NoStoreFuncVisitor to an abstract NoStateChangeVisitor class
Preceding discussion on cfe-dev: https://lists.llvm.org/pipermail/cfe-dev/2021-June/068450.html

NoStoreFuncVisitor is a rather unique visitor. As VisitNode is invoked on most
other visitors, they are looking for the point where something changed -- change
on a value, some checker-specific GDM trait, a new constraint.
NoStoreFuncVisitor, however, looks specifically for functions that *didn't*
write to a MemRegion of interesting. Quoting from its comments:

/// Put a diagnostic on return statement of all inlined functions
/// for which  the region of interest \p RegionOfInterest was passed into,
/// but not written inside, and it has caused an undefined read or a null
/// pointer dereference outside.

It so happens that there are a number of other similar properties that are
worth checking. For instance, if some memory leaks, it might be interesting why
a function didn't take ownership of said memory:

void sink(int *P) {} // no notes

void f() {
  sink(new int(5)); // note: Memory is allocated
                    // Well hold on, sink() was supposed to deal with
                    // that, this must be a false positive...
} // warning: Potential memory leak [cplusplus.NewDeleteLeaks]

In here, the entity of interest isn't a MemRegion, but a symbol. The property
that changed here isn't a change of value, but rather liveness and GDM traits
managed by MalloChecker.

This patch moves some of the logic of NoStoreFuncVisitor to a new abstract
class, NoStateChangeFuncVisitor. This is mostly calculating and caching the
stack frames in which the entity of interest wasn't changed.

Descendants of this interface have to define 3 things:

* What constitutes as a change to an entity (this is done by overriding
wasModifiedBeforeCallExit)
* What the diagnostic message should be (this is done by overriding
maybeEmitNoteFor.*)
* What constitutes as the entity of interest being passed into the function (this
is also done by overriding maybeEmitNoteFor.*)

Differential Revision: https://reviews.llvm.org/D105553
2021-08-16 15:03:22 +02:00
Balázs Kéri 9f517fd11e [clang][analyzer] Improve bug report in alpha.security.ReturnPtrRange
Add some notes and track of bad return value.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D107051
2021-08-11 13:04:55 +02:00
Christopher Di Bella c874dd5362 [llvm][clang][NFC] updates inline licence info
Some files still contained the old University of Illinois Open Source
Licence header. This patch replaces that with the Apache 2 with LLVM
Exception licence.

Differential Revision: https://reviews.llvm.org/D107528
2021-08-11 02:48:53 +00:00
Vince Bridgers d39ebdae67 [analyzer] Cleanup a FIXME in SValBuilder.cpp
This change follows up on a FIXME submitted with D105974. This change simply let's the reference case fall through to return a concrete 'true'
instead of a nonloc pointer of appropriate length set to NULL.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D107720
2021-08-10 16:12:52 -05:00
Valeriy Savchenko 9e02f58780 [analyzer] Highlight arrows for currently selected event
In some cases, when the execution path of the diagnostic
goes back and forth, arrows can overlap and create a mess.
Dimming arrows that are not relevant at the moment, solves this issue.
They are still visible, but don't draw too much attention.

Differential Revision: https://reviews.llvm.org/D92928
2021-08-02 19:15:01 +03:00
Valeriy Savchenko 97bcafa28d [analyzer] Add control flow arrows to the analyzer's HTML reports
This commit adds a very first version of this feature.
It is off by default and has to be turned on by checking the
corresponding box.  For this reason, HTML reports still keep
control notes (aka grey bubbles).

Further on, we plan on attaching arrows to events and having all arrows
not related to a currently selected event barely visible.  This will
help with reports where control flow goes back and forth (eg in loops).
Right now, it can get pretty crammed with all the arrows.

Differential Revision: https://reviews.llvm.org/D92639
2021-08-02 19:15:00 +03:00
Matheus Izvekov 0c7cd4a873 [clang] NFC: refactor multiple implementations of getDecltypeForParenthesizedExpr
This cleanup patch refactors a bunch of functional duplicates of
getDecltypeForParenthesizedExpr into a common implementation.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Reviewed By: aaronpuchert

Differential Revision: https://reviews.llvm.org/D100713
2021-07-28 23:27:43 +02:00
Gabor Marton 4761321d49 [Analyzer][solver][NFC] print constraints deterministically (ordered by their string representation)
This change is an extension to D103967 where I added dump methods for
(dis)equality classes of the State. There, the (dis)equality classes and their
contents are dumped in an ordered fashion, they are ordered based on their
string representation. This is very useful once we start to use FileCheck to
test the State dump in certain tests.

Differential Revision: https://reviews.llvm.org/D106642
2021-07-26 16:27:23 +02:00
Gabor Marton 44fa31fa6d [Analyzer][solver] Fix inconsistent equivalence class data
https://bugs.llvm.org/show_bug.cgi?id=51109

When we merged two classes, `*this` became an obsolete representation of
the new `State`. This is b/c the member relations had changed during the
previous merge of another member of the same class in a way that `*this`
had no longer any members. (`mergeImpl` might keep the member relations
to `Other` and could dissolve `*this`.)

Differential Revision: https://reviews.llvm.org/D106285
2021-07-23 14:25:32 +02:00
Deep Majumder 80068ca623 [analyzer] Fix for faulty namespace test in SmartPtrModelling
This patch:
- Fixes how the std-namespace test is written in SmartPtrModelling
(now accounts for functions with no Decl available)
- Adds the smart pointer checker flag check where it was missing

Differential Revision: https://reviews.llvm.org/D106296
2021-07-21 18:23:35 +05:30
Gabor Marton 732a8a9dfb [Analyzer][solver][NFC] Add explanatory comments to trivial eq classes
Differential Revision: https://reviews.llvm.org/D106370
2021-07-21 11:59:56 +02:00
Balázs Kéri 90cb5297ad [clang][analyzer] Improve report of file read at EOF condition (alpha.unix.Stream checker).
The checker warns if a stream is read that is already in end-of-file
(EOF) state.
The commit adds indication of the last location where the EOF flag is set
on the stream.

Reviewed By: Szelethus

Differential Revision: https://reviews.llvm.org/D104925
2021-07-21 08:54:11 +02:00
Deep Majumder d825309352 [analyzer] Handle std::make_unique
Differential Revision: https://reviews.llvm.org/D103750
2021-07-18 19:54:28 +05:30
Deep Majumder 0cd98bef1b [analyzer] Handle std::swap for std::unique_ptr
This patch handles the `std::swap` function specialization
for `std::unique_ptr`. Implemented to be very similar to
how `swap` method is handled

Differential Revision: https://reviews.llvm.org/D104300
2021-07-18 14:38:55 +05:30
Vince Bridgers 918bda1241 [analyzer] Do not assume that all pointers have the same bitwidth as void*
This change addresses this assertion that occurs in a downstream
compiler with a custom target.

```APInt.h:1151: bool llvm::APInt::operator==(const llvm::APInt &) const: Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"'```

No covering test case is susbmitted with this change since this crash
cannot be reproduced using any upstream supported target. The test case
that exposes this issue is as simple as:

```lang=c++
  void test(int * p) {
    int * q = p-1;
    if (q) {}
    if (q) {} // crash
    (void)q;
  }
```

The custom target that exposes this problem supports two address spaces,
16-bit `char`s, and a `_Bool` type that maps to 16-bits. There are no upstream
supported targets with similar attributes.

The assertion appears to be happening as a result of evaluating the
`SymIntExpr` `(reg_$0<int * p>) != 0U` in `VisitSymIntExpr` located in
`SimpleSValBuilder.cpp`. The `LHS` is evaluated to `32b` and the `RHS` is
evaluated to `16b`. This eventually leads to the assertion in `APInt.h`.

While this change addresses the crash and passes LITs, two follow-ups
are required:
  1) The remainder of `getZeroWithPtrWidth()` and `getIntWithPtrWidth()`
     should be cleaned up following this model to prevent future
     confusion.
  2) We're not sure why references are found along with the modified
     code path, that should not be the case. A more principled
     fix may be found after some further comprehension of why this
     is the case.

Acks: Thanks to @steakhal and @martong for the discussions leading to this
fix.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D105974
2021-07-16 03:22:57 -05:00
Deep Majumder 13fe78212f [analyzer] Handle << operator for std::unique_ptr
This patch handles the `<<` operator defined for `std::unique_ptr` in
    the std namespace (ignores custom overloads of the operator).

    Differential Revision: https://reviews.llvm.org/D105421
2021-07-16 12:34:30 +05:30
Deep Majumder 48688257c5 [analyzer] Model comparision methods of std::unique_ptr
This patch handles all the comparision methods (defined via overloaded
operators) on std::unique_ptr. These operators compare the underlying
pointers, which is modelled by comparing the corresponding inner-pointer
SVal. There is also a special case for comparing the same pointer.

Differential Revision: https://reviews.llvm.org/D104616
2021-07-16 09:54:05 +05:30
Gabor Marton d0d37fcc4e [Analyzer][solver] Remove unused functions
../../git/llvm-project/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:2395:17: warning: 'clang::ento::ProgramStateRef {anonymous}::RangeConstraintManager::setRange(clang::ento::ProgramStateRef, {anonymous}::EquivalenceClass, clang::ento::RangeSet)' defined but not used [-Wunused-function]
../../git/llvm-project/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:2384:10: warning: 'clang::ento::RangeSet {anonymous}::RangeConstraintManager::getRange(clang::ento::ProgramStateRef, {anonymous}::EquivalenceClass)' defined but not used [-Wunused-function]

Differential Revision: https://reviews.llvm.org/D106063
2021-07-15 16:36:01 +02:00
Balázs Kéri b0d38ad0bc [clang][Analyzer] Add symbol uninterestingness to bug report.
`PathSensitiveBughReport` has a function to mark a symbol as interesting but
it was not possible to clear this flag. This can be useful in some cases,
so the functionality is added.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D105637
2021-07-15 10:02:18 +02:00
Gabor Marton bdf31471c7 [Analyzer][solver] Add dump methods for (dis)equality classes.
This proved to be very useful during debugging.

Differential Revision: https://reviews.llvm.org/D103967
2021-07-14 13:45:02 +02:00
Valeriy Savchenko 60bd8cbc0c [analyzer][solver][NFC] Refactor how we detect (dis)equalities
This patch simplifies the way we deal with (dis)equalities.
Due to the symmetry between constraint handler and range inferrer,
we can have very similar implementations of logic handling
questions about (dis)equality and assumptions involving (dis)equality.

It also helps us to remove one more visitor, and removes uncertainty
that we got all the right places to put `trackNE` and `trackEQ`.

Differential Revision: https://reviews.llvm.org/D105693
2021-07-13 21:00:30 +03:00
Valeriy Savchenko f26deb4e6b [analyzer][solver][NFC] Introduce ConstraintAssignor
The new component is a symmetric response to SymbolicRangeInferrer.
While the latter is the unified component, which answers all the
questions what does the solver knows about a particular symbolic
expression, assignor associates new constraints (aka "assumes")
with symbolic expressions and can imply additional knowledge that
the solver can extract and use later on.

- Why do we need it and why is SymbolicRangeInferrer not enough?

As it is noted before, the inferrer only helps us to get the most
precise range information based on the existing knowledge and on the
mathematical foundations of different operations that symbolic
expressions actually represent.  It doesn't introduce new constraints.

The assignor, on the other hand, can impose constraints on other
symbols using the same domain knowledge.

- But for some expressions, SymbolicRangeInferrer looks into constraints
  for similar expressions, why can't we do that for all the cases?

That's correct!  But in order to do something like this, we should
have a finite number of possible "similar expressions".

Let's say we are asked about `$a - $b` and we know something about
`$b - $a`.  The inferrer can invert this expression and check
constraints for `$b - $a`.  This is simple!
But let's say we are asked about `$a` and we know that `$a * $b != 0`.
In this situation, we can imply that `$a != 0`, but the inferrer shouldn't
try every possible symbolic expression `X` to check if `$a * X` or
`X * $a` is constrained to non-zero.

With the assignor mechanism, we can catch this implication right at
the moment we associate `$a * $b` with non-zero range, and set similar
constraints for `$a` and `$b` as well.

Differential Revision: https://reviews.llvm.org/D105692
2021-07-13 21:00:30 +03:00
SharmaRithik cad9b7f708 [analyzer] Print time taken to analyze each function
Summary: This patch is a part of an attempt to obtain more
timer data from the analyzer. In this patch, we try to use
LLVM::TimeRecord to save time before starting the analysis
and to print the time that a specific function takes while
getting analyzed.

The timer data is printed along with the
-analyzer-display-progress outputs.

ANALYZE (Syntax): test.c functionName : 0.4 ms
ANALYZE (Path,  Inline_Regular): test.c functionName : 2.6 ms
Authored By: RithikSharma
Reviewer: NoQ, xazax.hun, teemperor, vsavchenko
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D105565
2021-07-13 04:52:47 +00:00
Abbas Sabra 1af97c9d0b [analyzer] LoopUnrolling: fix crash when a loop counter is captured in a lambda by reference
Reviewed By: vsavchenko

Differential Revision: https://reviews.llvm.org/D102273
2021-07-12 17:06:07 +03:00
Balazs Benics d3e14fafc6 [analyzer][NFC] Display the correct function name even in crash dumps
The `-analyzer-display-progress` displayed the function name of the
currently analyzed function. It differs in C and C++. In C++, it
prints the argument types as well in a comma-separated list.
While in C, only the function name is displayed, without the brackets.
E.g.:

  C++: foo(), foo(int, float)
  C:   foo

In crash traces, the analyzer dumps the location contexts, but the
string is not enough for `-analyze-function` in C++ mode.
This patch addresses the issue by dumping the proper function names
even in stack traces.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D105708
2021-07-12 09:06:46 +02:00
David Blaikie 1def2579e1 PR51018: Remove explicit conversions from SmallString to StringRef to future-proof against C++23
C++23 will make these conversions ambiguous - so fix them to make the
codebase forward-compatible with C++23 (& a follow-up change I've made
will make this ambiguous/invalid even in <C++23 so we don't regress
this & it generally improves the code anyway)
2021-07-08 13:37:57 -07:00
Valeriy Savchenko 6017cb31bb [analyzer][solver] Use all sources of constraints
Prior to this patch, we always gave priority to constraints that we
actually know about symbols in question.  However, these can get
outdated and we can get better results if we look at all possible
sources of knowledge, including sub-expressions.

Differential Revision: https://reviews.llvm.org/D105436
2021-07-06 11:09:08 +03:00
Georgy Komarov c558b1fca7
[analyzer] Fix calculating offset for fields with an empty type
Fix offset calculation routines in padding checker to avoid assertion
errors described in bugzilla issue 50426. The fields that are subojbects
of zero size, marked with [[no_unique_address]] or empty bitfields will
be excluded from padding calculation routines.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D104097
2021-07-04 06:57:11 +03:00
Balazs Benics 55662b24a4 [analyzer][NFC] Inline ExprEngine::handleLVectorSplat()
It seems like ExprEngine::handleLVectorSplat() was used at only 2
places. It might be better to directly inline them for readability.

It seems like these cases were not covered by tests according to my
coverage measurement, so I'm adding tests as well, demonstrating that no
behavior changed.
Besides that, I'm handling CK_MatrixCast similarly to how the rest of
the unhandled casts are evaluated.

Differential Revision: https://reviews.llvm.org/D105125

Reviewed by: NoQ
2021-07-01 10:54:28 +02:00
Balazs Benics aa454dda2e [analyzer] LValueToRValueBitCasts should evaluate to an r-value
Previously `LValueToRValueBitCast`s were modeled in the same way how
a regular `BitCast` was. However, this should not produce an l-value.
Modeling bitcasts accurately is tricky, so it's probably better to
model this expression by binding a fresh conjured value.

The following code should not result in a diagnostic:
```lang=C++
  __attribute__((always_inline))
  static inline constexpr unsigned int_castf32_u32(float __A) {
    return __builtin_bit_cast(unsigned int, __A); // no-warning
  }
```

Previously, it reported
`Address of stack memory associated with local variable '__A' returned
to caller [core.StackAddressEscape]`.

Differential Revision: https://reviews.llvm.org/D105017

Reviewed by: NoQ, vsavchenko
2021-07-01 10:54:22 +02:00
Balazs Benics 3dae01911b [analyzer] Make CheckerManager::hasPathSensitiveCheckers() complete again
It turns out that the CheckerManager::hasPathSensitiveCheckers() missed
checking for the BeginFunctionCheckers.
It seems like other callbacks are also missing:
 - ObjCMessageNilCheckers
 - BeginFunctionCheckers
 - NewAllocatorCheckers
 - PointerEscapeCheckers
 - EndOfTranslationUnitCheckers

In this patch, I wanted to use a fold-expression, but until C++17
arrives we are left with the old-school method.

When I tried to write a unittest I observed an interesting behavior. I
subscribed only to the BeginFunction event, it was not fired.
However, when I also defined the PreCall with an empty handler, suddenly
both fired.
I could add this test demonstrating the issue, but I don't think it
would serve much value in a long run. I don't expect regressions for
this.

However, I think it would be great to enforce the completeness of this
list in a runtime check.
I could not come up with a solution for this though.

PS: Thank you @Szelethus for helping me debugging this.

Differential Revision: https://reviews.llvm.org/D105101

Reviewed by: vsavchenko
2021-06-29 16:35:07 +02:00
Valeriy Savchenko 159024ce23 [analyzer] Implement getType for SVal
This commit adds a function to the top-class of SVal hierarchy to
provide type information about the value.  That can be extremely
useful when this is the only piece of information that the user is
actually caring about.

Additionally, this commit introduces a testing framework for writing
unit-tests for symbolic values.

Differential Revision: https://reviews.llvm.org/D104550
2021-06-29 12:11:19 +03:00
Nico Weber d5402a2fee Revert "[Analyzer][solver] Add dump methods for (dis)equality classes."
This reverts commit 6f3b775c3e.
Test fails flakily, see comments on https://reviews.llvm.org/D103967

Also revert follow-up "[Analyzer] Attempt to fix windows bots test
failure b/c of new-line"
This reverts commit fe0e861a4d.
2021-06-28 11:32:57 -04:00
Valeriy Savchenko 8474bb13c3 [analyzer][solver][NFC] Simplify function signatures
Since RangeSet::Factory actually contains BasicValueFactory, we can
remove value factory from many function signatures inside the solver.

Differential Revision: https://reviews.llvm.org/D105005
2021-06-28 14:20:06 +03:00
Gabor Marton 6f3b775c3e [Analyzer][solver] Add dump methods for (dis)equality classes.
This proved to be very useful during debugging.

Differential Revision: https://reviews.llvm.org/D103967
2021-06-28 12:57:14 +02:00
Valeriy Savchenko d646157146 [analyzer] Fix assertion failure on code with transparent unions
rdar://76948312

Differential Revision: https://reviews.llvm.org/D104716
2021-06-25 23:09:16 +03:00
Gabor Marton 0646e36254 [Analyzer][solver] Fix crashes during symbol simplification
Consider the code
```
  void f(int a0, int b0, int c)
  {
      int a1 = a0 - b0;
      int b1 = (unsigned)a1 + c;
      if (c == 0) {
          int d = 7L / b1;
      }
  }
```
At the point of divisiion by `b1` that is considered to be non-zero,
which results in a new constraint for `$a0 - $b0 + $c`. The type
of this sym is unsigned, however, the simplified sym is `$a0 -
$b0` and its type is signed. This is probably the result of the
inherent improper handling of casts. Anyway, Range assignment
for constraints use this type information. Therefore, we must
make sure that first we simplify the symbol and only then we
assign the range.

Differential Revision: https://reviews.llvm.org/D104844
2021-06-25 11:49:26 +02:00
Martin Storsjö e5c7c171e5 [clang] Rename StringRef _lower() method calls to _insensitive()
This is mostly a mechanical change, but a testcase that contains
parts of the StringRef class (clang/test/Analysis/llvm-conventions.cpp)
isn't touched.
2021-06-25 00:22:01 +03:00
Balázs Kéri d7227a5bc7 [clang][Analyzer] Track null stream argument in alpha.unix.Stream .
The checker contains check for passing a NULL stream argument.
This change should make more easy to identify where the passed pointer
becomes NULL.

Reviewed By: NoQ

Differential Revision: https://reviews.llvm.org/D104640
2021-06-22 11:16:56 +02:00
Tomasz Kamiński cc2ef19556 [analyzer] Handle NTTP invocation in CallContext.getCalleeDecl()
This fixes a crash in MallocChecker for the situation when operator new (delete) is invoked via NTTP  and makes the behavior of CallContext.getCalleeDecl(Expr) identical to CallEvent.getDecl().

Reviewed By: vsavchenko

Differential Revision: https://reviews.llvm.org/D103025
2021-06-18 16:32:19 +03:00
Kirstóf Umann 9cca5c1391 [analyzer] Make checker silencing work for non-pathsensitive bug reports
D66572 separated BugReport and BugReporter into basic and path sensitive
versions. As a result, checker silencing, which worked deep in the path
sensitive report generation facilities became specific to it. DeadStoresChecker,
for instance, despite being in the static analyzer, emits non-pathsensitive
reports, and was impossible to silence.

This patch moves the corresponding code before the call to the virtual function
generateDiagnosticForConsumerMap (which is overriden by the specific kinds of
bug reporters). Although we see bug reporting as relatively lightweight compared
to the analysis, this will get rid of several steps we used to throw away.

Quoting from D65379:

At a very high level, this consists of 3 steps:

For all BugReports in the same BugReportEquivClass, collect all their error
nodes in a set. With that set, create a new, trimmed ExplodedGraph whose leafs
are all error nodes.
Until a valid report is found, construct a bug path, which is yet another
ExplodedGraph, that is linear from a given error node to the root of the graph.
Run all visitors on the constructed bug path. If in this process the report got
invalidated, start over from step 2.
Checker silencing used to kick in after all of these. Now it does before any of
them :^)

Differential Revision: https://reviews.llvm.org/D102914

Change-Id: Ice42939304516f2bebd05a1ea19878b89c96a25d
2021-06-17 10:27:34 +02:00
Valeriy Savchenko eadd54f274 [analyzer] Decouple NoteTag from its Factory
This allows us to create other types of tags that carry useful
bits of information alongside.

Differential Revision: https://reviews.llvm.org/D104135
2021-06-15 11:58:13 +03:00
Valeriy Savchenko 16f7a952ec [analyzer] Simplify the process of producing notes for stores
Differential Revision: https://reviews.llvm.org/D104046
2021-06-15 11:37:36 +03:00
Valeriy Savchenko 6e6a26b8f0 [analyzer] Extract InlinedFunctionCallHandler
Differential Revision: https://reviews.llvm.org/D103961
2021-06-15 11:37:36 +03:00
Valeriy Savchenko 2e490676ea [analyzer] Extract InterestingLValueHandler
Differential Revision: https://reviews.llvm.org/D103917
2021-06-15 11:37:36 +03:00
Valeriy Savchenko 40cb73bd20 [analyzer] Extract ArrayIndexHandler
One interesting problem was discovered here.  When we do interrupt
Tracker's track flow, we want to interrupt only it and not all the
other flows recursively.

Differential Revision: https://reviews.llvm.org/D103914
2021-06-15 11:37:36 +03:00
Valeriy Savchenko 1639dcb279 [analyzer] Extract NilReceiverHandler
Differential Revision: https://reviews.llvm.org/D103902
2021-06-15 11:37:36 +03:00
Valeriy Savchenko 85f475c979 [analyzer] Extract ControlDependencyHandler
Differential Revision: https://reviews.llvm.org/D103677
2021-06-15 11:37:36 +03:00
Valeriy Savchenko bbebf38b73 [analyzer] Refactor StoreSiteFinder and extract DefaultStoreHandler
After this patch, custom StoreHandlers will also work as expected.

Differential Revision: https://reviews.llvm.org/D103644
2021-06-15 11:37:35 +03:00
Gabor Marton 8ddbb442b6 [Analyzer][solver] Simplify existing eq classes and constraints when a new constraint is added
Update `setConstraint` to simplify existing equivalence classes when a
new constraint is added. In this patch we iterate over all existing
equivalence classes and constraints and try to simplfy them with
simplifySVal. This solves problematic cases where we have two symbols in
the tree, e.g.:
```
int test_rhs_further_constrained(int x, int y) {
  if (x + y != 0)
    return 0;
  if (y != 0)
    return 0;
  clang_analyzer_eval(x + y == 0); // expected-warning{{TRUE}}
  clang_analyzer_eval(y == 0);     // expected-warning{{TRUE}}
  return 0;
}
```

Differential Revision: https://reviews.llvm.org/D103314
2021-06-14 12:19:09 +02:00
Simon Pilgrim 61cdaf66fe [ADT] Remove APInt/APSInt toString() std::string variants
<string> is currently the highest impact header in a clang+llvm build:

https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html

One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps.

This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code.

Differential Revision: https://reviews.llvm.org/D103888
2021-06-11 13:19:15 +01:00