Commit Graph

4687 Commits

Author SHA1 Message Date
isuckatcs 996b092c5e [analyzer] Lambda capture non-POD type array
This patch introduces a new `ConstructionContext` for
lambda capture. This `ConstructionContext` allows the
analyzer to construct the captured object directly into
it's final region, and makes it possible to capture
non-POD arrays.

Differential Revision: https://reviews.llvm.org/D129967
2022-07-26 09:40:25 +02:00
isuckatcs 8a13326d18 [analyzer] ArrayInitLoopExpr with array of non-POD type
This patch introduces the evaluation of ArrayInitLoopExpr
in case of structured bindings and implicit copy/move
constructor. The idea is to call the copy constructor for
every element in the array. The parameter of the copy
constructor is also manually selected, as it is not a part
of the CFG.

Differential Revision: https://reviews.llvm.org/D129496
2022-07-26 09:07:22 +02:00
Kazu Hirata 3f3930a451 Remove redundaunt virtual specifiers (NFC)
Identified with tidy-modernize-use-override.
2022-07-25 23:00:59 -07:00
Kazu Hirata ae002f8bca Use isa instead of dyn_cast (NFC) 2022-07-25 23:00:58 -07:00
Balázs Kéri 94ca2beccc [clang][analyzer] Added partial wide character support to CStringChecker
Support for functions wmemcpy, wcslen, wcsnlen is added to the checker.
Documentation and tests are updated and extended with the new functions.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D130091
2022-07-25 09:23:14 +02:00
Kazu Hirata 95a932fb15 Remove redundaunt override specifiers (NFC)
Identified with modernize-use-override.
2022-07-24 22:28:11 -07:00
Kazu Hirata a210f404da [clang] Remove redundant virtual specifies (NFC)
Identified with modernize-use-override.
2022-07-24 22:02:58 -07:00
Kazu Hirata 9e88cbcc40 Use any_of (NFC) 2022-07-24 14:48:11 -07:00
Denys Petrov a364987368 [analyzer][NFC] Use `SValVisitor` instead of explicit helper functions
Summary: Get rid of explicit function splitting in favor of specifically designed Visitor. Move logic from a family of `evalCastKind` and `evalCastSubKind` helper functions to `SValVisitor`.

Differential Revision: https://reviews.llvm.org/D130029
2022-07-19 23:10:00 +03:00
serge-sans-paille f764dc99b3 [clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays
Some code [0] consider that trailing arrays are flexible, whatever their size.
Support for these legacy code has been introduced in
f8f6324983 but it prevents evaluation of
__builtin_object_size and __builtin_dynamic_object_size in some legit cases.

Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is
desirable.

n = 0: current behavior, any trailing array member is a flexible array. The default.
n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member
n = 2: any trailing array member of undefined or 0 size is a flexible array member

This takes into account two specificities of clang: array bounds as macro id
disqualify FAM, as well as non standard layout.

Similar patch for gcc discuss here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

[0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions
2022-07-18 12:45:52 +02:00
Denys Petrov bc08c3cb7f [analyzer] Add new function `clang_analyzer_value` to ExprInspectionChecker
Summary: Introduce a new function 'clang_analyzer_value'. It emits a report that in turn prints a RangeSet or APSInt associated with SVal. If there is no associated value, prints "n/a".
2022-07-15 20:07:04 +03:00
Denys Petrov 82f76c0477 [analyzer][NFC] Tidy up handler-functions in SymbolicRangeInferrer
Summary: Sorted some handler-functions into more appropriate visitor functions of the SymbolicRangeInferrer.
- Spread `getRangeForNegatedSub` body over several visitor functions: `VisitSymExpr`, `VisitSymIntExpr`, `VisitSymSymExpr`.
- Moved `getRangeForComparisonSymbol` from `infer` to `VisitSymSymExpr`.

Differential Revision: https://reviews.llvm.org/D129678
2022-07-15 19:24:57 +03:00
Fangrui Song 3c849d0aef Modernize Optional::{getValueOr,hasValue} 2022-07-15 01:20:39 -07:00
Jonas Devlieghere 888673b6e3
Revert "[clang] Implement ElaboratedType sugaring for types written bare"
This reverts commit 7c51f02eff because it
stills breaks the LLDB tests. This was  re-landed without addressing the
issue or even agreement on how to address the issue. More details and
discussion in https://reviews.llvm.org/D112374.
2022-07-14 21:17:48 -07:00
Matheus Izvekov 7c51f02eff
[clang] Implement ElaboratedType sugaring for types written bare
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.

The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.

An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.

---

Troubleshooting list to deal with any breakage seen with this patch:

1) The most likely effect one would see by this patch is a change in how
   a type is printed. The type printer will, by design and default,
   print types as written. There are customization options there, but
   not that many, and they mainly apply to how to print a type that we
   somehow failed to track how it was written. This patch fixes a
   problem where we failed to distinguish between a type
   that was written without any elaborated-type qualifiers,
   such as a 'struct'/'class' tags and name spacifiers such as 'std::',
   and one that has been stripped of any 'metadata' that identifies such,
   the so called canonical types.
   Example:
   ```
   namespace foo {
     struct A {};
     A a;
   };
   ```
   If one were to print the type of `foo::a`, prior to this patch, this
   would result in `foo::A`. This is how the type printer would have,
   by default, printed the canonical type of A as well.
   As soon as you add any name qualifiers to A, the type printer would
   suddenly start accurately printing the type as written. This patch
   will make it print it accurately even when written without
   qualifiers, so we will just print `A` for the initial example, as
   the user did not really write that `foo::` namespace qualifier.

2) This patch could expose a bug in some AST matcher. Matching types
   is harder to get right when there is sugar involved. For example,
   if you want to match a type against being a pointer to some type A,
   then you have to account for getting a type that is sugar for a
   pointer to A, or being a pointer to sugar to A, or both! Usually
   you would get the second part wrong, and this would work for a
   very simple test where you don't use any name qualifiers, but
   you would discover is broken when you do. The usual fix is to
   either use the matcher which strips sugar, which is annoying
   to use as for example if you match an N level pointer, you have
   to put N+1 such matchers in there, beginning to end and between
   all those levels. But in a lot of cases, if the property you want
   to match is present in the canonical type, it's easier and faster
   to just match on that... This goes with what is said in 1), if
   you want to match against the name of a type, and you want
   the name string to be something stable, perhaps matching on
   the name of the canonical type is the better choice.

3) This patch could exposed a bug in how you get the source range of some
   TypeLoc. For some reason, a lot of code is using getLocalSourceRange(),
   which only looks at the given TypeLoc node. This patch introduces a new,
   and more common TypeLoc node which contains no source locations on itself.
   This is not an inovation here, and some other, more rare TypeLoc nodes could
   also have this property, but if you use getLocalSourceRange on them, it's not
   going to return any valid locations, because it doesn't have any. The right fix
   here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive
   into the inner TypeLoc to get the source range if it doesn't find it on the
   top level one. You can use getLocalSourceRange if you are really into
   micro-optimizations and you have some outside knowledge that the TypeLocs you are
   dealing with will always include some source location.

4) Exposed a bug somewhere in the use of the normal clang type class API, where you
   have some type, you want to see if that type is some particular kind, you try a
   `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an
   ElaboratedType which has a TypeDefType inside of it, which is what you wanted to match.
   Again, like 2), this would usually have been tested poorly with some simple tests with
   no qualifications, and would have been broken had there been any other kind of type sugar,
   be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType.
   The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper
   into the type. Or use `getAsAdjusted` when dealing with TypeLocs.
   For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast.

5) It could be a bug in this patch perhaps.

Let me know if you need any help!

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D112374
2022-07-15 04:16:55 +02:00
isuckatcs b032e3ff61 [analyzer] Evaluate construction of non-POD type arrays
Introducing the support for evaluating the constructor
of every element in an array. The idea is to record the
index of the current array member being constructed and
create a loop during the analysis. We looping over the
same CXXConstructExpr as many times as many elements
the array has.

Differential Revision: https://reviews.llvm.org/D127973
2022-07-14 23:30:21 +02:00
Ella Ma 32fe1a4be9 [analyzer] Fixing SVal::getType returns Null Type for NonLoc::ConcreteInt in boolean type
In method `TypeRetrievingVisitor::VisitConcreteInt`, `ASTContext::getIntTypeForBitwidth` is used to get the type for `ConcreteInt`s.
However, the getter in ASTContext cannot handle the boolean type with the bit width of 1, which will make method `SVal::getType` return a Null `Type`.
In this patch, a check for this case is added to fix this problem by returning the bool type directly when the bit width is 1.

Differential Revision: https://reviews.llvm.org/D129737
2022-07-14 22:00:38 +08:00
Kazu Hirata cb2c8f694d [clang] Use value instead of getValue (NFC) 2022-07-13 23:39:33 -07:00
einvbri 1d7e58cfad [analyzer] Fix use of length in CStringChecker
CStringChecker is using getByteLength to get the length of a string
literal. For targets where a "char" is 8-bits, getByteLength() and
getLength() will be equal for a C string, but for targets where a "char"
is 16-bits getByteLength() returns the size in octets.

This is verified in our downstream target, but we have no way to add a
test case for this case since there is no target supporting 16-bit
"char" upstream. Since this cannot have a test case, I'm asserted this
change is "correct by construction", and visually inspected to be
correct by way of the following example where this was found.

The case that shows this fails using a target with 16-bit chars is here.
getByteLength() for the string literal returns 4, which fails when
checked against "char x[4]". With the change, the string literal is
evaluated to a size of 2 which is a correct number of "char"'s for a
16-bit target.

```
void strcpy_no_overflow_2(char *y) {
  char x[4];
  strcpy(x, "12"); // with getByteLength(), returns 4 using 16-bit chars
}
```

This change exposed that embedded nulls within the string are not
handled. This is documented as a FIXME for a future fix.

```
    void strcpy_no_overflow_3(char *y) {
      char x[3];
      strcpy(x, "12\0");
    }

```

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D129269
2022-07-13 19:19:23 -05:00
Jonas Devlieghere 3968936b92
Revert "[clang] Implement ElaboratedType sugaring for types written bare"
This reverts commit bdc6974f92 because it
breaks all the LLDB tests that import the std module.

  import-std-module/array.TestArrayFromStdModule.py
  import-std-module/deque-basic.TestDequeFromStdModule.py
  import-std-module/deque-dbg-info-content.TestDbgInfoContentDequeFromStdModule.py
  import-std-module/forward_list.TestForwardListFromStdModule.py
  import-std-module/forward_list-dbg-info-content.TestDbgInfoContentForwardListFromStdModule.py
  import-std-module/list.TestListFromStdModule.py
  import-std-module/list-dbg-info-content.TestDbgInfoContentListFromStdModule.py
  import-std-module/queue.TestQueueFromStdModule.py
  import-std-module/stack.TestStackFromStdModule.py
  import-std-module/vector.TestVectorFromStdModule.py
  import-std-module/vector-bool.TestVectorBoolFromStdModule.py
  import-std-module/vector-dbg-info-content.TestDbgInfoContentVectorFromStdModule.py
  import-std-module/vector-of-vectors.TestVectorOfVectorsFromStdModule.py

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45301/
2022-07-13 09:20:30 -07:00
Kazu Hirata 53daa177f8 [clang, clang-tools-extra] Use has_value instead of hasValue (NFC) 2022-07-12 22:47:41 -07:00
Matheus Izvekov bdc6974f92
[clang] Implement ElaboratedType sugaring for types written bare
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.

The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.

An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D112374
2022-07-13 02:10:09 +02:00
Gabor Marton 2df120784a [analyzer] Fix assertion in simplifySymbolCast
Depends on D128068.
Added a new test code that fails an assertion in the baseline.
That is because `getAPSIntType` works only with integral types.

Differential Revision: https://reviews.llvm.org/D126779
2022-07-05 19:00:23 +02:00
Gabor Marton 5d7fa481cf [analyzer] Do not emit redundant SymbolCasts
In `RegionStore::getBinding` we call `evalCast` unconditionally to align
the stored value's type to the one that is being queried. However, the
stored type might be the same, so we may end up having redundant
`SymbolCasts` emitted.

The solution is to check whether the `to` and `from` type are the same
in `makeNonLoc`.

Note, we can't just do type equivalence check at the beginning of `evalCast`
because when `evalCast` is called from `getBinding` then the original type
(`OriginalTy`) is not set, so one operand is missing for the comparison. In
`evalCastSubKind(nonloc::SymbolVal)` when the original type is not set,
we get the `from` type via `SymbolVal::getType()`.

Differential Revision: https://reviews.llvm.org/D128068
2022-07-05 18:42:34 +02:00
Fazlay Rabbi 38bcd483dd [OpenMP] Initial parsing and semantic support for 'parallel masked taskloop simd' construct
This patch gives basic parsing and semantic support for
"parallel masked taskloop simd" construct introduced in
OpenMP 5.1 (section 2.16.10)

Differential Revision: https://reviews.llvm.org/D128946
2022-07-01 08:57:15 -07:00
Fazlay Rabbi d64ba896d3 [OpenMP] Initial parsing and sema support for 'parallel masked taskloop' construct
This patch gives basic parsing and semantic support for
"parallel masked taskloop" construct introduced in
OpenMP 5.1 (section 2.16.9)

Differential Revision: https://reviews.llvm.org/D128834
2022-06-30 11:44:17 -07:00
Corentin Jabot 64ab2b1dcc Improve handling of static assert messages.
Instead of dumping the string literal (which
quotes it and escape every non-ascii symbol),
we can use the content of the string when it is a
8 byte string.

Wide, UTF-8/UTF-16/32 strings are still completely
escaped, until we clarify how these entities should
behave (cf https://wg21.link/p2361).

`FormatDiagnostic` is modified to escape
non printable characters and invalid UTF-8.

This ensures that unicode characters, spaces and new
lines are properly rendered in static messages.
This make clang more consistent with other implementation
and fixes this tweet
https://twitter.com/jfbastien/status/1298307325443231744 :)

Of note, `PaddingChecker` did print out new lines that were
later removed by the diagnostic printing code.
To be consistent with its tests, the new lines are removed
from the diagnostic.

Unicode tables updated to both use the Unicode definitions
and the Unicode 14.0 data.

U+00AD SOFT HYPHEN is still considered a print character
to match existing practices in terminals, in addition of
being considered a formatting character as per Unicode.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D108469
2022-06-29 14:57:35 +02:00
isuckatcs 9d2e830737 [analyzer] Fix BindingDecl evaluation for reference types
The case when the bound variable is reference type in a
BindingDecl wasn't handled, which lead to false positives.

Differential Revision: https://reviews.llvm.org/D128716
2022-06-29 13:01:19 +02:00
Fazlay Rabbi 73e5d7bdff [OpenMP] Initial parsing and sema support for 'masked taskloop simd' construct
This patch gives basic parsing and semantic support for
"masked taskloop simd" construct introduced in OpenMP 5.1 (section 2.16.8)

Differential Revision: https://reviews.llvm.org/D128693
2022-06-28 15:27:49 -07:00
Corentin Jabot a774ba7f60 Revert "Improve handling of static assert messages."
This reverts commit 870b6d2183.

This seems to break some libc++ tests, reverting while investigating
2022-06-29 00:03:23 +02:00
Corentin Jabot 870b6d2183 Improve handling of static assert messages.
Instead of dumping the string literal (which
quotes it and escape every non-ascii symbol),
we can use the content of the string when it is a
8 byte string.

Wide, UTF-8/UTF-16/32 strings are still completely
escaped, until we clarify how these entities should
behave (cf https://wg21.link/p2361).

`FormatDiagnostic` is modified to escape
non printable characters and invalid UTF-8.

This ensures that unicode characters, spaces and new
lines are properly rendered in static messages.
This make clang more consistent with other implementation
and fixes this tweet
https://twitter.com/jfbastien/status/1298307325443231744 :)

Of note, `PaddingChecker` did print out new lines that were
later removed by the diagnostic printing code.
To be consistent with its tests, the new lines are removed
from the diagnostic.

Unicode tables updated to both use the Unicode definitions
and the Unicode 14.0 data.

U+00AD SOFT HYPHEN is still considered a print character
to match existing practices in terminals, in addition of
being considered a formatting character as per Unicode.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D108469
2022-06-28 22:26:00 +02:00
Vitaly Buka cdfa15da94 Revert "[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays"
This reverts D126864 and related fixes.

This reverts commit 572b08790a.
This reverts commit 886715af96.
2022-06-27 14:03:09 -07:00
Kazu Hirata 97afce08cb [clang] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 22:26:24 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Fazlay Rabbi 42bb88e2aa [OpenMP] Initial parsing and sema support for 'masked taskloop' construct
This patch gives basic parsing and semantic support for "masked taskloop"
construct introduced in OpenMP 5.1 (section 2.16.7)

Differential Revision: https://reviews.llvm.org/D128478
2022-06-24 10:00:08 -07:00
serge-sans-paille 886715af96 [clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays
Some code [0] consider that trailing arrays are flexible, whatever their size.
Support for these legacy code has been introduced in
f8f6324983 but it prevents evaluation of
__builtin_object_size and __builtin_dynamic_object_size in some legit cases.

Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is
desirable.

n = 0: current behavior, any trailing array member is a flexible array. The default.
n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member
n = 2: any trailing array member of undefined or 0 size is a flexible array member
n = 3: any trailing array member of undefined size is a flexible array member (strict c99 conformance)

Similar patch for gcc discuss here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

[0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions
2022-06-24 16:13:29 +02:00
isuckatcs 8ef628088b [analyzer] Structured binding to arrays
Introducing structured binding to data members and more.
To handle binding to arrays, ArrayInitLoopExpr is also
evaluated, which enables the analyzer to store information
in two more cases. These are:
  - when a lambda-expression captures an array by value
  - in the implicit copy/move constructor for a class
    with an array member

Differential Revision: https://reviews.llvm.org/D126613
2022-06-23 11:38:21 +02:00
Balázs Kéri 7dc81c6244 [clang][analyzer] Fix StdLibraryFunctionsChecker 'mkdir' return value.
The functions 'mkdir', 'mknod', 'mkdirat', 'mknodat' return 0 on success
and -1 on failure. The checker modeled these functions with a >= 0
return value on success which is changed to 0 only. This fix makes
ErrnoChecker work better for these functions.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D127277
2022-06-23 11:27:26 +02:00
Balázs Kéri 957014da2d [clang][Analyzer] Add errno state to standard functions modeling.
This updates StdLibraryFunctionsChecker to set the state of 'errno'
by using the new errno_modeling functionality.
The errno value is set in the PostCall callback. Setting it in call::Eval
did not work for some reason and then every function should be
EvalCallAsPure which may be bad to do. Now the errno value and state
is not allowed to be checked in any PostCall checker callback because
it is unspecified if the errno was set already or will be set later
by this checker.

Reviewed By: martong, steakhal

Differential Revision: https://reviews.llvm.org/D125400
2022-06-21 08:56:41 +02:00
Kazu Hirata ca4af13e48 [clang] Don't use Optional::getValue (NFC) 2022-06-20 22:59:26 -07:00
Kazu Hirata 0916d96d12 Don't use Optional::hasValue (NFC) 2022-06-20 20:17:57 -07:00
Kazu Hirata 064a08cd95 Don't use Optional::hasValue (NFC) 2022-06-20 20:05:16 -07:00
Kazu Hirata 5413bf1bac Don't use Optional::hasValue (NFC) 2022-06-20 11:33:56 -07:00
Kazu Hirata 452db157c9 [clang] Don't use Optional::hasValue (NFC) 2022-06-20 10:51:34 -07:00
Balázs Kéri 60f3b07118 [clang][analyzer] Add checker for bad use of 'errno'.
Extend checker 'ErrnoModeling' with a state of 'errno' to indicate
the importance of the 'errno' value and how it should be used.
Add a new checker 'ErrnoChecker' that observes use of 'errno' and
finds possible wrong uses, based on the "errno state".
The "errno state" should be set (together with value of 'errno')
by other checkers (that perform modeling of the given function)
in the future. Currently only a test function can set this value.
The new checker has no user-observable effect yet.

Reviewed By: martong, steakhal

Differential Revision: https://reviews.llvm.org/D122150
2022-06-20 10:07:31 +02:00
Kazu Hirata 06decd0b41 [clang] Use value_or instead of getValueOr (NFC) 2022-06-18 23:21:34 -07:00
isuckatcs e77ac66b8c [Static Analyzer] Structured binding to data members
Introducing structured binding to data members.

Differential Revision: https://reviews.llvm.org/D127643
2022-06-17 19:50:10 +02:00
isuckatcs 92bf652d40 [Static Analyzer] Small array binding policy
If a lazyCompoundVal to a struct is bound to the store, there is a policy which decides
whether a copy gets created instead.

This patch introduces a similar policy for arrays, which is required to model structured
binding to arrays without false negatives.

Differential Revision: https://reviews.llvm.org/D128064
2022-06-17 18:56:13 +02:00
Jennifer Yu bb83f8e70b [OpenMP] Initial parsing and sema for 'parallel masked' construct
Differential Revision: https://reviews.llvm.org/D127454
2022-06-16 18:01:15 -07:00