Commit Graph

73 Commits

Author SHA1 Message Date
Kazu Hirata a41fbb1fc2 [clang/unittests] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03 12:14:19 -08:00
Yitzhak Mandelbaum 84dd12b290 [clang][dataflow] Add widening API and implement it for built-in boolean model.
* Adds API support for widening of lattice elements and environments,
* Updates the algorithm to apply widening where appropriate,
* Implements widening for boolean values. In the process, moves the unsoundness
  of comparison from the default implementation of
  `Environment::ValueModel::compare` to model-specific handling inside
  `DataflowEnvironment::equivalentTo`. This change is intended to clarify
  the source and location of unsoundess.

This patch is a replacement for, and was based substantially on, https://reviews.llvm.org/D131645.

Differential Revision: https://reviews.llvm.org/D137948
2022-11-22 16:09:28 +00:00
Yitzhak Mandelbaum 0b12efc7a4 [clang][dataflow] Add support for nested method calls.
Extend the context-sensitive analysis to handle a call to a method (of the same
class) from within a method. That, is a member-call expression through `this`.

Differential Revision: https://reviews.llvm.org/D134432
2022-09-22 19:16:31 +00:00
Wei Yi Tee 9cbdef6103 [clang][dataflow] Replace usage of the deprecated overload of `checkDataflow`.
Updated files:
- `ChromiumCheckModelTest.cpp`.
- `MatchSwitchTest.cpp`.
- `MultiVarConstantPropagationTest.cpp`.
- `SingleVarConstantPropagationTest.cpp`.
- `TestingSupportTest.cpp`.
- `TransferTest.cpp`.

Reviewed By: gribozavr2, sgatev

Differential Revision: https://reviews.llvm.org/D133865
2022-09-16 16:19:07 +00:00
Wei Yi Tee 8dd14c427a [clang][dataflow] Use `StringMap` for storing analysis states at annotated points instead of `vector<pair<string, StateT>>`.
Reviewed By: gribozavr2, sgatev, ymandel

Differential Revision: https://reviews.llvm.org/D132763
2022-09-01 14:09:43 +00:00
Sam Estep 2efc8f8d65 [clang][dataflow] Add an option for context-sensitive depth
This patch adds a `Depth` field (default value 2) to `ContextSensitiveOptions`, allowing context-sensitive analysis of functions that call other functions. This also requires replacing the `DeclCtx` field on `Environment` with a `CallString` field that contains a vector of decl contexts, to ensure that the analysis doesn't try to analyze recursive or mutually recursive calls (which would result in a crash, due to the way we handle `StorageLocation`s).

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D131809
2022-08-15 19:58:40 +00:00
Sam Estep b3f1a6bf10 [clang][dataflow] Encode options using llvm::Optional
This patch restructures `DataflowAnalysisOptions` and `TransferOptions` to use `llvm::Optional`, in preparation for adding more sub-options to the `ContextSensitiveOptions` struct introduced here.

Reviewed By: sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131779
2022-08-12 16:29:41 +00:00
Sam Estep d09d4bd66c [clang][dataflow] Don't crash when caller args are missing storage locations
This patch modifies `Environment`'s `pushCall` method to pass over arguments that are missing storage locations, instead of crashing.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D131600
2022-08-11 13:00:42 +00:00
Sam Estep eb91fd5cbc [clang][dataflow] Analyze constructor bodies
This patch adds the ability to context-sensitively analyze constructor bodies, by changing `pushCall` to allow both `CallExpr` and `CXXConstructExpr`, and extracting the main context-sensitive logic out of `VisitCallExpr` into a new `transferInlineCall` method which is now also called at the end of `VisitCXXConstructExpr`.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131438
2022-08-11 12:46:20 +00:00
Evgenii Stepanov 7587065043 Revert "[clang][dataflow] Analyze constructor bodies"
https://lab.llvm.org/buildbot/#/builders/74/builds/12713

This reverts commit 000c8fef86.
2022-08-10 14:21:56 -07:00
Evgenii Stepanov 26089d4da4 Revert "[clang][dataflow] Don't crash when caller args are missing storage locations"
https://lab.llvm.org/buildbot/#/builders/74/builds/12713

This reverts commit 43b298ea12.
2022-08-10 14:21:46 -07:00
Sam Estep 43b298ea12 [clang][dataflow] Don't crash when caller args are missing storage locations
This patch modifies `Environment`'s `pushCall` method to pass over arguments that are missing storage locations, instead of crashing.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D131600
2022-08-10 17:50:34 +00:00
Sam Estep 000c8fef86 [clang][dataflow] Analyze constructor bodies
This patch adds the ability to context-sensitively analyze constructor bodies, by changing `pushCall` to allow both `CallExpr` and `CXXConstructExpr`, and extracting the main context-sensitive logic out of `VisitCallExpr` into a new `transferInlineCall` method which is now also called at the end of `VisitCXXConstructExpr`.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131438
2022-08-10 14:01:45 +00:00
Sam Estep 8611a77ee7 [clang][dataflow] Analyze method bodies
This patch adds the ability to context-sensitively analyze method bodies, by moving `ThisPointeeLoc` from `DataflowAnalysisContext` to `Environment`, and adding code in `pushCall` to set it.

Reviewed By: ymandel, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D131170
2022-08-04 17:45:47 +00:00
Sam Estep 0eaecbbc23 [clang][dataflow] Handle return statements
This patch adds a `ReturnLoc` field to the `Environment`, serving a similar to the `ThisPointeeLoc` field in the `DataflowAnalysisContext`. It then uses that (along with a new `VisitReturnStmt` method in `TransferVisitor`) to handle non-`void`-returning functions in context-sensitive analysis.

Reviewed By: ymandel, sgatev

Differential Revision: https://reviews.llvm.org/D130600
2022-08-04 17:42:19 +00:00
Sam Estep a6ddc68487 [clang][dataflow] Handle multiple context-sensitive calls to the same function
This patch enables context-sensitive analysis of multiple different calls to the same function (see the `ContextSensitiveSetBothTrueAndFalse` example in the `TransferTest` suite) by replacing the `Environment` copy-assignment with a call to the new `popCall` method, which  `std::move`s some fields but specifically does not move `DeclToLoc` and `ExprToLoc` from the callee back to the caller.

To enable this, the `StorageLocation` for a given parameter needs to be stable across different calls to the same function, so this patch also improves the modeling of parameter initialization, using `ReferenceValue` when necessary (for arguments passed by reference).

This approach explicitly does not work for recursive calls, because we currently only plan to use this context-sensitive machinery to support specialized analysis models we write, not analysis of arbitrary callees.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130726
2022-07-29 19:40:19 +00:00
Sam Estep 300fbf56f8 [clang][dataflow] Analyze calls to in-TU functions
This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.

The `pushCall` method (which currently only supports non-method functions with some restrictions) maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.

This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `FIXME` comment in the `pushCall` implementation.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130306
2022-07-26 17:54:27 +00:00
Sam Estep cc9aa157a8 Revert "[clang][dataflow] Analyze calls to in-TU functions"
This reverts commit fa2b83d07e.
2022-07-26 17:30:09 +00:00
Sam Estep fa2b83d07e [clang][dataflow] Analyze calls to in-TU functions
Depends On D130305

This patch adds initial support for context-sensitive analysis of simple functions whose definition is available in the translation unit, guarded by the `ContextSensitive` flag in the new `TransferOptions` struct. When this option is true, the `VisitCallExpr` case in the builtin transfer function has a fallthrough case which checks for a direct callee with a body. In that case, it constructs a CFG from that callee body, uses the new `pushCall` method on the `Environment` to make an environment to analyze the callee, and then calls `runDataflowAnalysis` with a `NoopAnalysis` (disabling context-sensitive analysis on that sub-analysis, to avoid problems with recursion). After the sub-analysis completes, the `Environment` from its exit block is simply assigned back to the environment at the callsite.

The `pushCall` method (which currently only supports non-method functions with some restrictions) first calls `initGlobalVars`, then maps the `SourceLocation`s for all the parameters to the existing source locations for the corresponding arguments from the callsite.

This patch adds a few tests to check that this context-sensitive analysis works on simple functions. More sophisticated functionality will be added later; the most important next step is to explicitly model context in some fields of the `DataflowAnalysisContext` class, as mentioned in a `TODO` comment in the `pushCall` implementation.

Reviewed By: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D130306
2022-07-26 17:27:19 +00:00
Sam Estep 32dcb759c3 [clang][dataflow] Move NoopAnalysis from unittests to include
This patch moves `Analysis/FlowSensitive/NoopAnalysis.h` from `clang/unittests/` to `clang/include/clang/`, so that we can use it for doing context-sensitive analysis.

Reviewed By: ymandel, gribozavr2, sgatev

Differential Revision: https://reviews.llvm.org/D130304
2022-07-22 14:11:32 +00:00
Eric Li f10d271ae2 [clang][dataflow] Handle null pointers of type std::nullptr_t
Treat `std::nullptr_t` as a regular scalar type to avoid tripping
assertions when analyzing code that uses `std::nullptr_t`.

Differential Revision: https://reviews.llvm.org/D129097
2022-07-05 13:49:26 +00:00
Sam Estep 1d83a16bd3 [clang][dataflow] Replace TEST_F with TEST where possible
Many of our tests are currently written using `TEST_F` where the test fixture class doesn't have any `SetUp` or `TearDown` methods, and just one helper method. In those cases, this patch deletes the class and pulls its method out into a standalone function, using `TEST` instead of `TEST_F`.

There are still a few test files leftover in `clang/unittests/Analysis/FlowSensitive/` that use `TEST_F`:

- `DataflowAnalysisContextTest.cpp` because the class contains a `Context` field which is used
- `DataflowEnvironmentTest.cpp` because the class contains an `Environment` field which is used
- `SolverTest.cpp` because the class contains a `Vals` field which is used
- `TypeErasedDataflowAnalysisTest.cpp` because there are several different classes which all share the same method name

Reviewed By: ymandel, sgatev

Differential Revision: https://reviews.llvm.org/D128924
2022-06-30 16:03:33 +00:00
Stanislav Gatev 8207c2a660 [clang][dataflow] Handle `for` statements without conditions
Handle `for` statements without conditions.

Differential Revision: https://reviews.llvm.org/D128833

Reviewed-by: xazax.hun, gribozavr2, li.zhe.hua
2022-06-30 07:00:35 +00:00
Wei Yi Tee b611376e7e [clang][dataflow] Singleton pointer values for null pointers.
When a `nullptr` is assigned to a pointer variable, it is wrapped in a `ImplicitCastExpr` with cast kind `CK_NullTo(Member)Pointer`. This patch assigns singleton pointer values representing null to these expressions.

For each pointee type, a singleton null `PointerValue` is created and stored in the `NullPointerVals` map of the `DataflowAnalysisContext` class. The pointee type is retrieved from the implicit cast expression, and used to initialise the `PointeeLoc` field of the `PointerValue`. The `PointeeLoc` created is not mapped to any `Value`, reflecting the absence of value indicated by null pointers.

Reviewed By: gribozavr2, sgatev, xazax.hun

Differential Revision: https://reviews.llvm.org/D128056
2022-06-27 14:17:34 +02:00
Stanislav Gatev e363c5963d [clang][dataflow] Extend flow condition in the body of a do/while loop
Extend flow condition in the body of a do/while loop.

Differential Revision: https://reviews.llvm.org/D128183

Reviewed-by: gribozavr2, xazax.hun
2022-06-20 17:31:00 +00:00
Stanislav Gatev 83232099cb [clang][dataflow] Extend flow condition in the body of a for loop
Extend flow condition in the body of a for loop.

Differential Revision: https://reviews.llvm.org/D128060
2022-06-20 05:48:45 +00:00
Stanislav Gatev ba53906cef [clang][dataflow] Add support for comma binary operator
Add support for comma binary operator.

Differential Revision: https://reviews.llvm.org/D128013

Reviewed-by: ymandel, xazax.hun
2022-06-17 17:48:21 +00:00
Wei Yi Tee 97d69cdaf3 [clang][dataflow] Rename `getPointeeLoc` to `getReferentLoc` for ReferenceValue.
We distinguish between the referent location for `ReferenceValue` and pointee location for `PointerValue`. The former must be non-empty but the latter may be empty in the case of a `nullptr`

Reviewed By: gribozavr2, sgatev

Differential Revision: https://reviews.llvm.org/D127745
2022-06-15 00:53:30 +02:00
Wei Yi Tee a1b2b7d979 [clang][dataflow] Remove IndirectionValue class, moving PointeeLoc field into PointerValue and ReferenceValue
This patch precedes a future patch to make PointeeLoc for PointerValue possibly empty (for nullptr), by using a pointer instead of a reference type.
ReferenceValue should maintain a non-empty PointeeLoc reference.

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D127312
2022-06-09 01:29:16 +02:00
Stanislav Gatev 0e286b77cf [clang][dataflow] Add transfer functions for structured bindings
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Differential Revision: https://reviews.llvm.org/D120495

Reviewed-by: ymandel, xazax.hun
2022-06-02 08:02:26 +00:00
Yitzhak Mandelbaum 3682e22ef4 [clang][dataflow] Improve handling of constructor initializers.
Currently, we assert that `CXXCtorInitializer`s are field initializers. Replace
the assertion with an early return. Otherwise, we crash every time we process a
constructor with a non-field (e.g. base class) initializer.

Differential Revision: https://reviews.llvm.org/D126419
2022-05-27 13:54:11 +00:00
Yitzhak Mandelbaum 67136d0e8f [clang][dataflow] Remove private-field filtering from `StorageLocation` creation.
The API for `AggregateStorageLocation` does not allow for missing fields (it asserts). Therefore, it is incorrect to filter out any fields at location-creation time which may be accessed by the code. Currently, we limit filtering to private, base-calss fields on the assumption that those can never be accessed. However, `friend` declarations can invalidate that assumption, thereby breaking our invariants.

This patch removes said field filtering to avoid violating the invariant of "no missing fields" for `AggregateStorageLocation`.

Differential Revision: https://reviews.llvm.org/D126420
2022-05-27 13:39:53 +00:00
Eric Li 5520c58390 [clang][dataflow] Fix incorrect CXXThisExpr pointee for lambdas
When constructing the `Environment`, the `this` pointee is established
for a `CXXMethodDecl` by looking at its parent. However, inside of
lambdas, a `CXXThisExpr` refers to the captured `this` coming from the
enclosing member function.

When establishing the `this` pointee for a function, we check whether
the function is a lambda, and check for an enclosing member function
to establish the `this` pointee storage location.

Differential Revision: https://reviews.llvm.org/D126413
2022-05-25 20:58:02 +00:00
Eric Li 33b598a808 [clang][dataflow] Relax assert on existence of `this` pointee storage
Support for unions is incomplete (per 99f7d55e) and the `this` pointee
storage location is not set for unions. The assert in
`VisitCXXThisExpr` is then guaranteed to trigger when analyzing member
functions of a union.

This commit changes the assert to an early-return. Any expression may
be undefined, and so having a value for the `CXXThisExpr` is not a
postcondition of the transfer function.

Differential Revision: https://reviews.llvm.org/D126405
2022-05-25 20:58:02 +00:00
Yitzhak Mandelbaum 2f93bbb9cd [clang][dataflow] Relax `Environment` comparison operation.
Ignore `MemberLocToStruct` in environment comparison. As an ancillary data
structure, including it is redundant. We also can generate environments which
differ in their `MemberLocToStruct` but are otherwise equivalent.

Differential Revision: https://reviews.llvm.org/D126314
2022-05-24 20:58:18 +00:00
Eric Li 5bbef2e3ff [clang][dataflow] Fix double visitation of nested logical operators
Sub-expressions that are logical operators are not spelled out
separately in basic blocks, so we need to manually visit them when we
encounter them. We do this in both the `TerminatorVisitor`
(conditionally) and the `TransferVisitor` (unconditionally), which can
cause cause an expression to be visited twice when the binary
operators are nested 2+ times.

This changes the visit in `TransferVisitor` to check if it has been
evaluated before trying to visit the sub-expression.

Differential Revision: https://reviews.llvm.org/D125821
2022-05-17 20:28:48 +00:00
Yitzhak Mandelbaum eb2131bdba [clang][dataflow] Do not crash on missing `Value` for struct-typed variable init.
Remove constraint that an initializing expression of struct type must have an
associated `Value`. This invariant is not and will not be guaranteed by the
framework, because of potentially uninitialized fields.

Differential Revision: https://reviews.llvm.org/D123961
2022-04-19 20:52:29 +00:00
Yitzhak Mandelbaum bbcf11f5af [clang][dataflow] Weaken abstract comparison to enable loop termination.
Currently, when the framework is used with an analysis that does not override
`compareEquivalent`, it does not terminate for most loops. The root cause is the
interaction of (the default implementation of) environment comparison
(`compareEquivalent`) and the means by which locations and values are
allocated. Specifically, the creation of certain values (including: reference
and pointer values; merged values) results in allocations of fresh locations in
the environment. As a result, analysis of even trivial loop bodies produces
different (if isomorphic) environments, on identical inputs. At the same time,
the default analysis relies on strict equality (versus some relaxed notion of
equivalence). Together, when the analysis compares these isomorphic, yet
unequal, environments, to determine whether the successors of the given block
need to be (re)processed, the result is invariably "yes", thus preventing loop
analysis from reaching a fixed point.

There are many possible solutions to this problem, including equivalence that is
less than strict pointer equality (like structural equivalence) and/or the
introduction of an explicit widening operation. However, these solutions will
require care to be implemented correctly. While a high priority, it seems more
urgent that we fix the current default implentation to allow
termination. Therefore, this patch proposes, essentially, to change the default
comparison to trivally equate any two values. As a result, we can say precisely
that the analysis will process the loop exactly twice -- once to establish an
initial result state and the second to produce an updated result which will
(always) compare equal to the previous. While clearly unsound -- we are not
reaching a fix point of the transfer function, in practice, this level of
analysis will find many practical issues where a single iteration of the loop
impacts abstract program state.

Note, however, that the change to the default `merge` operation does not affect
soundness, because the framework already produces a fresh (sound) abstraction of
the value when the two values are distinct. The previous setting was overly
conservative.

Differential Revision: https://reviews.llvm.org/D123586
2022-04-13 19:49:50 +00:00
Yitzhak Mandelbaum d002495b94 [clang][dataflow] Support integral casts
Adds support for implicit casts `CK_IntegralCast` and `CK_IntegralToBoolean`.

Differential Revision: https://reviews.llvm.org/D123037
2022-04-05 13:55:32 +00:00
Yitzhak Mandelbaum 506ec85ba8 [clang][dataflow] Add support for clang's `__builtin_expect`.
This patch adds basic modeling of `__builtin_expect`, just to propagate the
(first) argument, making the call transparent.

Driveby: adds tests for proper handling of other builtins.

Differential Revision: https://reviews.llvm.org/D122908
2022-04-04 12:20:43 +00:00
Yitzhak Mandelbaum 01db10365e [clang][dataflow] Add support for correlation of boolean (tracked) values
This patch extends the join logic for environments to explicitly handle
boolean values. It creates the disjunction of both source values, guarded by the
respective flow conditions from each input environment. This change allows the
framework to reason about boolean correlations across multiple branches (and
subsequent joins).

Differential Revision: https://reviews.llvm.org/D122838
2022-04-01 17:25:49 +00:00
Yitzhak Mandelbaum ef1e1b3106 [clang][dataflow] Add support for (built-in) (in)equality operators
Adds logical interpretation of built-in equality operators, `==` and `!=`.s

Differential Revision: https://reviews.llvm.org/D122830
2022-04-01 17:13:21 +00:00
Yitzhak Mandelbaum 36d4e84427 [clang][dataflow] Fix handling of base-class fields.
Currently, the framework does not track derived class access to base
fields. This patch adds that support and a corresponding test.

Differential Revision: https://reviews.llvm.org/D122273
2022-04-01 15:01:32 +00:00
Stanislav Gatev cf63e9d4ca [clang][dataflow] Add support for nested composite bool expressions
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Differential Revision: https://reviews.llvm.org/D121455
2022-03-14 17:18:30 +00:00
Stanislav Gatev 3dd7877b27 Revert "[clang][dataflow] Move dataflow testing support out of unittests"
This reverts commit 26bbde2612.
2022-03-09 15:38:51 +00:00
Stanislav Gatev 26bbde2612 [clang][dataflow] Move dataflow testing support out of unittests
This enables tests out of clang/unittests/Analysis/FlowSensitive to
use the testing support utilities.

Reviewed-by: ymandel, gribozavr2

Differential Revision: https://reviews.llvm.org/D121285
2022-03-09 15:31:02 +00:00
Stanislav Gatev e0cc28dfdc Revert "[clang][dataflow] Add analysis that detects unsafe accesses to optionals"
This reverts commit ce205cffdf.
2022-03-09 09:51:03 +00:00
Stanislav Gatev ce205cffdf [clang][dataflow] Add analysis that detects unsafe accesses to optionals
Adds a dataflow analysis that detects unsafe accesses to values of type
`std::optional`, `absl::optional`, or `base::Optional`.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D121197
2022-03-09 09:42:51 +00:00
Yitzhak Mandelbaum 18c84e2d32 [clang][dataflow] Fix nullptr dereferencing error.
When pre-initializing fields in the environment, the code assumed that all
fields of a struct would be initialized. However, given limits on value
construction, that assumption is incorrect. This patch changes the code to drop
that assumption and thereby avoid dereferencing a nullptr.

Differential Revision: https://reviews.llvm.org/D121158
2022-03-08 03:01:31 +00:00
Stanislav Gatev 1e5715857a [clang][dataflow] Extend flow conditions from block terminators
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Reviewed-by: ymandel, xazax.hun

Differential Revision: https://reviews.llvm.org/D120984
2022-03-07 17:50:44 +00:00