15669 Commits

Author SHA1 Message Date
Artem Belevich 9a01cca660 Add support for CUDA-11.8 and sm_{87,89,90} GPUs.
Differential Revision: https://reviews.llvm.org/D135306
2022-10-07 13:59:28 -07:00
Yaxun (Sam) Liu 107ee26130 [AMDGPU] Disable bool range metadata to workaround backend issue
Currently there is a middle-end or backend issue
https://github.com/llvm/llvm-project/issues/58176
which causes values loaded from a bool pointer to be incorrect when
bool range metadata is emitted. Temporarily
disable bool range metadata until the backend issue
is fixed.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D135269

Fixes: SWDEV-344137
2022-10-07 10:46:04 -04:00
Jan Sjodin 4627cef113 [OpenMP][OMPIRBuilder] Migrate emitOffloadingArraysArgument from clang
This patch moves the emitOffloadingArraysArgument function and
supporting data structures to OpenMPIRBuilder. This will later be used
in flang as well. The TargetDataInfo class was split up into generic
information and clang-specific data, which remains in clang. Further
migration will be done in the future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D134662
2022-10-07 07:03:03 -05:00
Manuel Brito 14e2592ff6 [clang][CodeGen] Use poison instead of undef as placeholder in ARM builtins [NFC]
Differential Revision: https://reviews.llvm.org/D135392
2022-10-07 12:50:59 +01:00
Joseph Huber 4aa87a131f [OpenMP][AMDGPU] Add 'uniform-work-group' attribute to OpenMP kernels
The `cl-uniform-work-group` attribute asserts that the global work size
is a multiple of the specified work-group size. This should
allow optimizations. It is already present by default in the AMD
compiler and for HIP kernels so it should be safe to allow this for
OpenMP kernels by default.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D135374
2022-10-06 18:22:09 -05:00
Xiang Li 2bdfececef [HLSL] Remove global ctor/dtor variable for non-lib profile.
Once the calls to the ctors/dtors have been generated for the entry, the global variables for ctors/dtors are useless.
Remove them for non-lib profiles.
Lib profiles still need them in case an exported function uses a global variable that requires a ctor/dtor.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D133993
2022-10-06 15:00:50 -07:00
Bill Wendling 7404b855e5 [clang][NFC] Use enum for -fstrict-flex-arrays
Use enums for the strict flex arrays flag so that it's more readable.

Differential Revision: https://reviews.llvm.org/D135107
2022-10-06 10:45:41 -07:00
Joseph Huber a8ec170e01 [OpenMP] Make the exec_mode global have protected visibility
We use protected visibility for almost everything with offloading. This
is because it provides us with the ability to read things from the host
without the expectation that it will be preempted by a shared library
load. Bugs related to this have happened when offloading to the host.
This patch just makes the `exec_mode` global generated for each plugin
have protected visibility.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D135285
2022-10-05 14:39:22 -05:00
David Blaikie 4769976c49 MSVC ABI: Looks like even non-aarch64 uses the MSVC/14 definition for pod/aggregate passing
Details posted here: https://reviews.llvm.org/D119051#3747201

3 cases that were inconsistent with the MSABI without this patch applied:
  https://godbolt.org/z/GY48qxh3G - field with protected member
  https://godbolt.org/z/Mb1PYhjrP - non-static data member initializer
  https://godbolt.org/z/sGvxcEPjo - defaulted copy constructor

I'm not sure what's suitable/sufficient testing for this - I did verify
the three cases above. Though if it helps to add them as explicit tests,
I can do that too.

Also, I was wondering if the other use of isTrivialForAArch64MSVC in
isPermittedToBeHomogenousAggregate could be another source of bugs - I
tried changing the function to unconditionally call
isTrivialFor(AArch64)MSVC without testing AArch64 first, but no tests
fail, so it looks like this is undertested in any case. But I had
trouble figuring out how to exercise this functionality properly to add
test coverage and then compare that to MSVC itself... - I got very
confused/turned around trying to test this, so I've given up enough to
send what I have out for review, but happy to look further into this
with help.

Differential Revision: https://reviews.llvm.org/D133817
2022-10-04 20:19:17 +00:00
David Blaikie 2e1c1d6d72 MSVC AArch64 ABI: Homogeneous aggregates
Fixes:
Protected members, HFA: https://godbolt.org/z/zqdK7vdKc
Private members, HFA: https://godbolt.org/z/zqdK7vdKc
Non-empty base, HFA: https://godbolt.org/z/PKTz59Wev
User-provided ctor, HFA: https://godbolt.org/z/sfrTddcW6

Existing correct cases:
Empty base class, NonHFA: https://godbolt.org/z/4veY9MWP3
 - correct by accident of not allowing bases at all (see non-empty base
   case/fix above for counterexample)
Polymorphic: NonHFA: https://godbolt.org/z/4veY9MWP3
Trivial copy assignment, HFA: https://godbolt.org/z/Tdecj836P
Non-trivial copy assignment, NonHFA: https://godbolt.org/z/7c4bE9Whq
Non-trivial default ctor, NonHFA: https://godbolt.org/z/Tsq1EE7b7
 - correct by accident of disallowing all user-provided ctors (see
   user-provided non-default ctor example above for counterexample)
Trivial dtor, HFA: https://godbolt.org/z/nae999aqz
Non-trivial dtor, NonHFA: https://godbolt.org/z/69oMcshb1
Empty field, NonHFA: https://godbolt.org/z/8PTxsKKMK
 - true due to checking for the absence of padding (see comment in code)

After a bunch of testing, this fixes a bunch of cases that were
incorrect. Some of the tests verify the nuances of the existing
behavior/code checks that were already present.

This was mostly motivated by cleanup from/in D133817 which itself was
motivated by D119051.

By removing the incorrect use of isTrivialForAArch64MSVC here & adding
more nuance to the homogeneous testing we can more safely/confidently
make changes to the isTrivialFor(AArch64)MSVC to more properly align
with its usage anyway.

Differential Revision: https://reviews.llvm.org/D134688
2022-10-04 20:17:29 +00:00
serge-sans-paille 3460a5d795 [clang] Unify Sema and CodeGen implementation of isFlexibleArrayMemberExpr
Turn it into a single Expr::isFlexibleArrayMemberLike method, as discussed in

        https://discourse.llvm.org/t/rfc-harmonize-flexible-array-members-handling

Keep different behavior with respect to macro / template substitution, and
harmonize sharp edges: ObjC interfaces now behave like C structs wrt. FAM and
-fstrict-flex-arrays.

This does not impact __builtin_object_size interactions with FAM.

Differential Revision: https://reviews.llvm.org/D134791
2022-10-04 20:42:36 +02:00
Alex Langford 266ec801fb [clang][DebugInfo] Respect fmodule-file-home-is-cwd in skeleton CUs for clang modules
When -fmodule-file-home-is-cwd is set and the path to the PCM is relative, we
shouldn't assume that the path to the PCM is relative to the modulemap
that produced it. To respect the option -fmodule-file-home-is-cwd, we
should assume the path is relative to the current working directory.

Reviewed By: rmaz

Differential Revision: https://reviews.llvm.org/D134911
2022-10-04 11:25:43 -07:00
Dominik Adamski 6842d35012 [OpenMP][OMPIRBuilder] Add support for order(concurrent) to OMPIRBuilder for SIMD directive
If 'order(concurrent)' clause is specified, then the iterations of SIMD loop
can be executed concurrently.

This patch adds support for LLVM IR codegen via OMPIRBuilder for SIMD loop
with 'order(concurrent)' clause. The functionality added to OMPIRBuilder is
similar to the functionality implemented in 'CodeGenFunction::EmitOMPSimdInit'.

Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D134046

Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
2022-10-04 08:30:00 -05:00
Shu-Chun Weng 3933c43d90 [clang] Add cc1 option -fctor-dtor-return-this
This option forces constructors and non-deleting destructors to return
`this` pointer in C++ ABI (except for Microsoft ABI, on which this flag
has no effect).

This is similar to ARM32, Apple ARM64, or Fuchsia C++ ABI, but can be
applied to any target triple.

Differential Revision: https://reviews.llvm.org/D119209
2022-10-03 14:28:06 -07:00
David Green 781b491bba [Clang][AArch64] Support AArch64 target(..) attribute formats.
This adds support under AArch64 for the target("..") attributes. The
current parsing is very X86-shaped; this patch attempts to bring it in line
with the GCC implementation from
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes.

The supported formats are:
- "arch=<arch>" strings, that specify the architecture features for a
  function as per the -march=arch+feature option.
- "cpu=<cpu>" strings, that specify the target-cpu and any implied
  attributes as per the -mcpu=cpu+feature option.
- "tune=<cpu>" strings, that specify the tune-cpu for a function as
  per -mtune.
- "+<feature>", "+no<feature>" enables/disables the specific feature, for
  compatibility with GCC target attributes.
- "<feature>", "no-<feature>" enables/disables the specific feature, for
  backward compatibility with previous releases.

To do this, the parsing of target attributes has been moved into
TargetInfo to give the target the opportunity to override the existing
parsing. The only non-aarch64 change should be a minor alteration to the
error message, specifying using "CPU" to describe the cpu, not
"architecture", and the DuplicateArch/Tune from ParsedTargetAttr have
been combined into a single option.

Differential Revision: https://reviews.llvm.org/D133848
2022-10-01 15:40:59 +01:00
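The attribute formats listed in 781b491bba above can be illustrated with a small classifier. This is only a sketch of the documented string shapes, not Clang's TargetInfo parsing: `AttrKind`, `ParsedPiece`, `startsWith`, and `classifyPiece` are hypothetical names, and the sketch assumes the attribute string has already been split into individual pieces.

```cpp
#include <cassert>
#include <string>

// Hypothetical categories for one piece of a target("..") attribute string.
enum class AttrKind { Arch, Cpu, Tune, EnableFeature, DisableFeature };

struct ParsedPiece {
  AttrKind kind;
  std::string value;  // text after the recognized prefix
};

// True if s begins with prefix.
static bool startsWith(const std::string &s, const std::string &prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Classify one piece according to the formats listed in the commit message.
// Assumption: "+no" is checked before "+", so a feature whose own name
// starts with "no" would be misread by this sketch.
ParsedPiece classifyPiece(const std::string &piece) {
  if (startsWith(piece, "arch="))
    return {AttrKind::Arch, piece.substr(5)};
  if (startsWith(piece, "cpu="))
    return {AttrKind::Cpu, piece.substr(4)};
  if (startsWith(piece, "tune="))
    return {AttrKind::Tune, piece.substr(5)};
  if (startsWith(piece, "+no"))
    return {AttrKind::DisableFeature, piece.substr(3)};  // GCC-compatible form
  if (startsWith(piece, "+"))
    return {AttrKind::EnableFeature, piece.substr(1)};   // GCC-compatible form
  if (startsWith(piece, "no-"))
    return {AttrKind::DisableFeature, piece.substr(3)};  // backward-compatible form
  return {AttrKind::EnableFeature, piece};               // backward-compatible form
}
```

For example, `classifyPiece("+nosve")` and `classifyPiece("no-sve")` both yield a `DisableFeature` piece with value `sve` in this sketch.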
Ben Dunbobbin 7eee2a2d44 [IR] Don't allow DLL storage-class and local linkage
Disallow this meaningless combination. Doing so simplifies analysis
of LLVM code w.r.t. DLL storage class, and prevents mistakes with
DLL storage class.

- Change the assembler to reject DLL storage class on symbols with
  local linkage.
- Change the bitcode reader to clear the DLL Storage class when the
  linkage is local for auto-upgrading
- Update LangRef.

There is an existing restriction on non-default visibility and local
linkage which this is modelled on.

Differential Review: https://reviews.llvm.org/D134784
2022-09-30 00:26:01 +01:00
Michael Platings dba8fced96 Fix frint ACLE intrinsic names
Although the instruction names begin "frint", the ACLE spec states that
the intrinsic names begin "__rint", without the "f".

Differential Revision: https://reviews.llvm.org/D134824
2022-09-29 09:13:07 +01:00
Arthur Eubanks 2f3d7c2cc7 [clang] Add debug info in MicrosoftCXXABI::EmitVirtualMemPtrThunk()
(Probably) fixes https://crbug.com/1355639

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D134825
2022-09-28 22:06:04 -07:00
Yonghong Song 75be0482a2 [clang][DebugInfo] Emit debuginfo for non-constant case value
Currently, clang does not emit debuginfo for the switch stmt
case value if it is an enum value. For example,
  $ cat test.c
  enum { AA = 1, BB = 2 };
  int func1(int a) {
    switch(a) {
    case AA: return 10;
    case BB: return 11;
    default: break;
    }
    return 0;
  }
  $ llvm-dwarfdump test.o | grep AA
  $
Note that gcc does emit debuginfo for the same test case.

This patch adds such support with an implementation similar
to CodeGenFunction::EmitDeclRefExprDbgValue(). With this patch,
  $ clang -g -c test.c
  $ llvm-dwarfdump test.o | grep AA
                  DW_AT_name    ("AA")
  $

Differential Revision: https://reviews.llvm.org/D134705
2022-09-28 12:10:48 -07:00
Aaron Ballman 60727d8569 [C2x] implement typeof and typeof_unqual
This implements WG14 N2927 and WG14 N2930, which together define the
feature for typeof and typeof_unqual, which get the type of their
argument as either fully qualified or fully unqualified. The argument
to either operator is either a type name or an expression. If given a
type name, the type information is pulled directly from the given name.
If given an expression, the type information is pulled from the
expression. Recursive use of these operators is allowed and has the
expected behavior (the innermost operator is resolved to a type, and
that's used to resolve the next layer of typeof specifier, until a
fully resolved type is determined).

Note, we already supported typeof in GNU mode as a non-conforming
extension and we are *not* exposing typeof_unqual as a non-conforming
extension in that mode, nor are we exposing typeof or typeof_unqual as
a nonconforming extension in other language modes. The GNU variant of
typeof supports a form where the parentheses are elided from the
operator when given an expression (e.g., typeof 0 i = 12;). When in C2x
mode, we do not support this extension.

Differential Revision: https://reviews.llvm.org/D134286
2022-09-28 13:27:52 -04:00
Jennifer Yu 30cc712eb6 [Clang][OpenMP] Fix run time crash when use_device_addr is used.
It is a data mapping ordering problem.

According to the OpenMP spec:
If one or more map clauses are present, the list item conversions that
are performed for any use_device_ptr or use_device_addr clause occur
after all variables are mapped on entry to the region according to those
map clauses.

The change is to put the mapping data for use_device_addr at the end of the
data mapping array.

Differential Revision: https://reviews.llvm.org/D134556
2022-09-27 11:53:57 -07:00
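The ordering fix described in 30cc712eb6 above (entries for use_device_addr go after all map-clause entries) can be sketched with a stable partition. This is an illustration only, not the Clang codegen: `MapEntry` and `orderMapEntries` are hypothetical names, and the flat vector is an assumed stand-in for the data mapping array.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical entry in a data mapping array.
struct MapEntry {
  std::string name;
  bool isUseDeviceAddr;  // true for entries coming from use_device_addr
};

// Move use_device_addr entries to the end while keeping the relative
// order of every other entry (and of the moved entries) unchanged.
void orderMapEntries(std::vector<MapEntry> &entries) {
  std::stable_partition(
      entries.begin(), entries.end(),
      [](const MapEntry &e) { return !e.isUseDeviceAddr; });
}
```

With this ordering, the list item conversions for use_device_addr are processed only after the ordinary map entries, which is the sequence the quoted spec text requires.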
Simon Pilgrim 75e90ea766 Fix MSVC "not all control paths return a value" warning. NFCI. 2022-09-26 10:27:38 +01:00
Nico Weber ea8371247f [clang-cl] Implement /ZH: flag
Based on a patch by Arlo Siemsen (D98438)!

Differential Revision: https://reviews.llvm.org/D134544
2022-09-25 14:43:14 -04:00
eopXD 10409bf86e [FPEnv] Remove inaccurate comments regarding signaling NaN for isless
According to the C23 draft (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf),
the description of the isless macro under 7.12.17.3 says:

The isless macro determines whether its first argument is less than its second
argument. The value of isless(x,y) is always equal to (x) < (y); however, unlike
(x) < (y), isless(x,y) does not raise the invalid floating-point exception when
x and y are unordered and neither is a signaling NaN.

isless should trap when encountering signaling NaN.

Reviewed By: jcranmer-intel, efriedma

Differential Revision: https://reviews.llvm.org/D134407
2022-09-22 18:13:16 -07:00
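The quiet-NaN half of the wording quoted in 10409bf86e above can be checked with a short sketch. It assumes the platform reports floating-point exceptions through `<cfenv>`; `islessIsQuiet` is a hypothetical helper name, and the signaling-NaN case (where the commit says isless should trap) is deliberately not exercised.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <limits>

// Compare x and y with std::isless, store the comparison in `result`,
// and return true if FE_INVALID was NOT raised by the comparison.
bool islessIsQuiet(double x, double y, bool &result) {
  // volatile copies discourage the compiler from folding the comparison.
  volatile double a = x;
  volatile double b = y;
  std::feclearexcept(FE_ALL_EXCEPT);
  result = std::isless(a, b);
  return std::fetestexcept(FE_INVALID) == 0;
}
```

Calling it with `1.0` and a quiet NaN should report an unordered comparison (`result` is false) without the invalid exception being set.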
Xiang Li bad2e6c830 [HLSL] clang codeGen for HLSLNumThreadsAttr
Translate HLSLNumThreadsAttr into function attribute with name "dx.numthreads" and value format as "x,y,z".

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D131799
2022-09-22 15:30:52 -07:00
Yaxun (Sam) Liu 5e25284dbc [AMDGPU] Emit module flag for all code object versions
Reviewed by: Changpeng Fang, Matt Arsenault, Brian Sumner

Differential Revision: https://reviews.llvm.org/D134355
2022-09-22 16:51:33 -04:00
Craig Topper 52708be182 [RISCV] Remove support for the unratified Zbe, Zbf, and Zbm extensions.
These extensions do not appear to be on their way to ratification.
2022-09-22 13:04:41 -07:00
Jonathan Camilleri 4cd7529e4c [clang][DebugInfo] Emit access specifiers for typedefs
The accessibility level of a typedef or using declaration in a
struct or class was being lost when producing debug information.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D134339
2022-09-22 17:08:41 +00:00
serge-sans-paille d442040292 [clang] Fix interaction between asm labels and inline builtins
One must pick the same name as the one referenced in CodeGenFunction when
generating .inline version of an inline builtin, otherwise they are not
correctly replaced.

Differential Revision: https://reviews.llvm.org/D134362
2022-09-22 09:24:47 +02:00
Craig Topper 182aa0cbe0 [RISCV] Remove support for the unratified Zbp extension.
This extension does not appear to be on its way to ratification.

Still need some follow up to simplify the RISCVISD nodes.
2022-09-21 21:22:42 -07:00
Chuanqi Xu 327141fb1d [C++] [Coroutines] Prefer aligned (de)allocation for coroutines -
implement the option2 of P2014R0

This implements the option2 of
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2014r0.pdf.

This also fixes https://github.com/llvm/llvm-project/issues/56671.

Although wg21 didn't get consensus for the direction of the problem,
we're happy to have some implementation and user experience first. And
from issue56671, the option2 should be the pursued one.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D133341
2022-09-22 11:28:29 +08:00
Michael Wyman aa4bcaab96 Remove the unused/undefined `_cmd` parameter in `objc_direct` methods.
When `objc_direct` methods were implemented, the implicit `_cmd` parameter was left as an argument to the method implementation function, but was unset by callers; if the method body referenced the `_cmd` variable, a selector load would be emitted inside the body. However, this leaves an unused argument in the ABI, and is unnecessary.

This change removes the empty/unset argument, and if `_cmd` is referenced inside an `objc_direct` method it will emit local storage for the implicit variable. From the ABI perspective, `objc_direct` methods will have the implicit `self` parameter, immediately followed by whatever explicit arguments are defined on the method, rather than having one unset/undefined register in the middle.

Differential Revision: https://reviews.llvm.org/D131424
2022-09-21 15:37:48 -07:00
Xiang Li a7e3de2450 [NFC] Fix build error ignored by MSVC. 2022-09-21 10:57:43 -07:00
Chris Bieneman bc97751a23 [NFC] Add GitHub issues to HLSL FIXME comments
In order to make this easier to track I've filed issues for each of the
HLSL FIXME comments that I can find. I may have missed some, but I want
this to be the new default mode.
2022-09-21 10:31:25 -05:00
Jennifer Yu 48ffd40ba2 [Clang][OpenMP] Codegen generation for has_device_addr claues.
This patch adds codegen support for the has_device_addr clause. It uses
the same logic as is_device_ptr, but passes &var instead of a pointer to var
to the kernel.

Differential Revision: https://reviews.llvm.org/D134268
2022-09-20 21:12:30 -07:00
Craig Topper 70a64fe7b1 [RISCV] Remove support for the unratified Zbt extension.
This extension does not appear to be on its way to ratification.

Out of the unratified bitmanip extensions, this one had the
largest impact on the compiler.

Posting this patch to start a discussion about whether we should
remove these extensions. We'll talk more at the RISC-V sync meeting this
Thursday.

Reviewed By: asb, reames

Differential Revision: https://reviews.llvm.org/D133834
2022-09-20 20:26:48 -07:00
Ron Lieberman d5b5289561 revert 684f76643 [Clang][OpenMP] Codegen generation for has_device_addr claues.
breaks amdgpu buildbot
2022-09-20 01:37:27 +00:00
Phoebe Wang 46bb4b99ae [X86][fastcall][vectorcall] Move capability check before free register update
When passing arguments with `__fastcall` or `__vectorcall` in 32-bit MSVC, subsequent arguments still have a chance to be passed in registers if the current one could not be. `__regcall` from ICC behaves the opposite way: https://godbolt.org/z/4MPbzhaMG
None of the three calling conventions is supported in GCC.

Fixes: #57737

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D133920
2022-09-20 09:18:23 +08:00
Jennifer Yu 684f766431 [Clang][OpenMP] Codegen generation for has_device_addr claues.
Summary: This patch adds codegen support for the has_device_addr clause.  It
uses the same logic as is_device_ptr.

Differential Revision: https://reviews.llvm.org/D134186
2022-09-19 16:14:57 -07:00
Weining Lu 7d88a05cc0 [Clang][LoongArch] Implement ABI lowering
Reuse most of RISCV's implementation with several exceptions:

1. Assign signext/zeroext attribute to args passed on the stack.
On RISCV, integer scalars passed in registers have signext/zeroext
when promoted, but are anyext if passed on the stack. This is defined
in the early RISCV ABI specification. But after this change [1], integers
should also be signext/zeroext if passed on the stack. So I think
RISCV's ABI lowering should be updated [2].

In the LoongArch ABI spec, by contrast, integer scalars narrower
than GRLEN bits are zero/sign-extended whether they are passed in registers
or on the stack.

2. Zero-width bit fields are ignored.
This matches GCC's behavior but it hasn't been documented in the ABI spec.
See https://gcc.gnu.org/r12-8294.

3. `char` is signed by default.
Another difference worth mentioning is that `char` is signed
by default on LoongArch while it is unsigned on RISCV.

This patch also adds `_BitInt` type support to LoongArch and handles it
in LoongArchABIInfo::classifyArgumentType.

[1] cec39a064e
[2] https://github.com/llvm/llvm-project/issues/57261

Differential Revision: https://reviews.llvm.org/D132285
2022-09-19 12:05:00 +08:00
Aiden Grossman c0bc461999 [Clang] Give error message for invalid profile path when compiling IR
Before this patch, when compiling an IR file (eg the .llvmbc section
from an object file compiled with -Xclang -fembed-bitcode=all) and
profile data was passed in using the -fprofile-instrument-use-path
flag, there would be no error printed (as the previous implementation
relied on the error getting caught again in the constructor of
CodeGenModule which isn't called when -x ir is set). This patch
moves the error checking directly to where the error is caught
originally rather than failing silently in setPGOUseInstrumentor and
waiting to catch it in CodeGenModule to print diagnostic information to
the user.

Regression test added.

Reviewed By: xur, mtrofin

Differential Revision: https://reviews.llvm.org/D132991
2022-09-16 19:45:57 +00:00
David Majnemer 8a868d8859 Revert "Revert "[clang, llvm] Add __declspec(safebuffers), support it in CodeView""
This reverts commit cd20a18286 and adds a
"let Heading" to NoStackProtectorDocs.
2022-09-16 19:39:48 +00:00
Matheus Izvekov f4ea3bd4b2 [clang] Fixes how we represent / emulate builtin templates
We change the template specialization of builtin templates to
behave like aliases.

Though unlike real alias templates, these might still produce a canonical
TemplateSpecializationType when some important argument is dependent.

For example, we can't do anything about make_integer_seq when the
count is dependent, or a type_pack_element when the index is dependent.

We change type deduction to not try to deduce canonical TSTs of
builtin templates.

We also change those builtin templates to produce substitution sugar,
just like a real instantiation would, making the resulting type correctly
represent the template arguments used to specialize the underlying template.

And make_integer_seq will now produce a TST for the specialization
of its first argument, which we use as the underlying type of
the builtin alias.

When performing member access on the resulting type, it's now
possible to map from a Subst* node to the template argument
as-written used in a regular fashion, without special casing.

And this fixes a bunch of bugs with relation to these builtin
templates factoring into deduction.

Fixes GH42102 and GH51928.

Depends on D133261

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D133262
2022-09-16 17:44:12 +02:00
Matheus Izvekov 67e2298311 [clang] use getCommonSugar in an assortment of places
For this patch, a simple search was performed for patterns where there are
two types (usually an LHS and an RHS) which are structurally the same, and there
is some result type which is resolved as either one of them (typically LHS for
consistency).

We change those cases to resolve as the common sugared type between those two,
utilizing the new infrastructure created for this purpose.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D111509
2022-09-16 16:36:00 +02:00
Stanislav Mekhanoshin e540965915 [AMDGPU] Added __builtin_amdgcn_ds_bvh_stack_rtn
Differential Revision: https://reviews.llvm.org/D133966
2022-09-16 02:42:09 -07:00
Navid Emamdoost 3e52c0926c Add -fsanitizer-coverage=control-flow
Reviewed By: kcc, vitalybuka, MaskRay

Differential Revision: https://reviews.llvm.org/D133157
2022-09-15 15:56:04 -07:00
Dhruva Chakrabarti 839ac62c50 Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 7539e9cf81.
2022-09-15 03:08:46 +00:00
Giorgis Georgakoudis 7539e9cf81 [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3)  forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert, jhuber6, ABataev

Differential Revision: https://reviews.llvm.org/D102107
2022-09-15 00:54:05 +00:00
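The idea in 7539e9cf81 above (pass one aggregate of captures instead of a variadic argument list) can be sketched in plain C++. Nothing here is the OpenMP runtime interface: `Captures`, `outlinedRegion`, `forkCall`, and `sumWithAggregate` are hypothetical stand-ins, and the "fork" simply calls the function once on the current thread.

```cpp
#include <cassert>

// Hypothetical aggregate holding everything the outlined region captures.
struct Captures {
  int *sum;         // captured by reference
  const int *data;  // captured pointer
  int n;            // captured by value
};

// Stand-in for an outlined parallel region: it receives a single pointer
// to the aggregate instead of one parameter per captured variable.
void outlinedRegion(void *raw) {
  auto *c = static_cast<Captures *>(raw);
  for (int i = 0; i < c->n; ++i)
    *c->sum += c->data[i];
}

// Stand-in for a non-variadic fork call: the argument count is fixed no
// matter how many variables the region captures.
void forkCall(void (*fn)(void *), void *captures) { fn(captures); }

int sumWithAggregate(const int *data, int n) {
  int sum = 0;
  Captures c{&sum, data, n};
  forkCall(outlinedRegion, &c);
  return sum;
}
```

Because the captures travel in one struct, the sketch has no per-argument wrapping and no fixed limit on the number of captured variables, and the fields keep their real types.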
Vitaly Buka c69b269111 [pipelines] Require GlobalsAA after sanitizers
Restore GlobalsAA if sanitizers are inserted at the early optimize callback.
The analysis can be useful for the following FunctionPassManager.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D133537
2022-09-14 13:33:53 -07:00
Vitaly Buka 270c843005 [NFC][CodeGen] Remove empty line 2022-09-14 13:29:15 -07:00
Haojian Wu f6e759bd26 Remove some unused static functions in CGOpenMPRuntimeGPU.cpp, NFC 2022-09-14 17:20:02 +02:00
Joseph Huber bae1a2cf3c [OpenMP] Remove unused function after removing simplified interface
Summary:
A previous patch removed the user of this function but did not remove
the function causing unused function warnings. Remove it.
2022-09-14 10:14:43 -05:00
Joseph Huber 2d26ecb1fb [OpenMP] Remove simplified device runtime handling
The old device runtime had a "simplified" version that prevented many of
the runtime features from being initialized. The old device runtime was
deleted in LLVM 14 and is no longer in use. Selectively deactivating
features is now done using specific flags rather than the old technique.
This patch simply removes the extra logic required for handling the old
simple runtime scheme.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D133802
2022-09-14 09:41:50 -05:00
Xiang Li f712c0131f [HLSL]Add -O and -Od option for dxc mode.
Two new options, -O and -Od, are added for dxc mode.
-O is just an alias of the existing cc1 -O option.
-Od will be lowered into -O0 and -dxc-opt-disable.

-dxc-opt-disable is a cc1 option added for building ShaderFlags.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D128845
2022-09-13 21:26:18 -07:00
Chris Bieneman a8a49923dd [HLSL] Call global destructors from entries
HLSL doesn't have a C++ runtime that supports `atexit` registration. To
enable global destructors we instead rely on the `llvm.global_dtor`
mechanism.

This change disables `atexit` generation for HLSL and updates the HLSL
code generation to call global destructors on the exit from entry
functions.

Depends on D132977.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D133518
2022-09-13 15:05:47 -05:00
Sylvestre Ledru cd20a18286 Revert "[clang, llvm] Add __declspec(safebuffers), support it in CodeView"
Causing:
https://github.com/llvm/llvm-project/issues/57709

This reverts commit ab56719acd.
2022-09-13 10:53:59 +02:00
Martin Storsjö fbfe1db4a9 [clang] Explicitly set the EmulatedTLS codegen option. NFC.
Set the EmulatedTLS option based on `Triple::hasDefaultEmulatedTLS()`
if the user didn't specify it; set `ExplicitEmulatedTLS` to true
in `llvm::TargetOptions` and set `EmulatedTLS` to Clang's
opinion of what the default or preference is.

This avoids any risk of deviance between the two.

This affects one check of `getCodeGenOpts().EmulatedTLS` in
`shouldAssumeDSOLocal` in CodeGenModule, but as that check only
is done for `TT.isWindowsGNUEnvironment()`, and
`hasDefaultEmulatedTLS()` returns false for such environments
it doesn't make any current testable difference - thus NFC.

Some mingw distributions carry a downstream patch, that enables
emulated TLS by default for mingw targets in `hasDefaultEmulatedTLS()`
- and for such cases, this patch does make a difference and fixes the
detection of emulated TLS, if it is implicitly enabled.

Differential Revision: https://reviews.llvm.org/D132916
2022-09-13 10:40:54 +03:00
Fangrui Song 6f9c4851ab [MinGW] Reject explicit hidden visibility applied to dllexport and hidden/protected applied to dllimport
Hidden visibility is incompatible with dllexport.
Hidden and protected visibilities are incompatible with dllimport.
(PlayStation uses dllexport protected.)

When an explicit visibility attribute applies on a dllexport/dllimport
declaration, report a Frontend error (Sema does not compute visibility).

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D133266
2022-09-12 15:56:36 -07:00
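The compatibility rules stated in 6f9c4851ab above fit in a small predicate. This is a sketch of the rules as worded in the commit message, not the Clang diagnostic code: the enums and `isCompatible` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical models of an explicit visibility attribute and a DLL
// storage class on a declaration.
enum class Visibility { Default, Hidden, Protected };
enum class DLLStorage { None, Import, Export };

// Returns true if the combination is accepted under the rules in the
// commit message: hidden is incompatible with dllexport, and both hidden
// and protected are incompatible with dllimport.
bool isCompatible(DLLStorage dll, Visibility vis) {
  switch (dll) {
  case DLLStorage::Export:
    return vis != Visibility::Hidden;  // dllexport protected stays allowed
  case DLLStorage::Import:
    return vis == Visibility::Default;
  case DLLStorage::None:
    return true;
  }
  return true;  // unreachable for valid enum values
}
```

The `Export` case keeps protected visibility legal, matching the note that PlayStation uses dllexport protected.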
David Majnemer ab56719acd [clang, llvm] Add __declspec(safebuffers), support it in CodeView
__declspec(safebuffers) is equivalent to
__attribute__((no_stack_protector)).  This information is recorded in
CodeView.

While we are here, add support for strict_gs_check.
2022-09-12 21:15:34 +00:00
Chris Bieneman d3c54a172d [HLSL] Call global constructors inside entry
HLSL doesn't have a runtime loader model that supports global
construction by a loader or runtime initializer. To allow us to leverage
global constructors with minimal code generation impact we put calls to
the global constructors inside the generated entry function.

Differential Revision: https://reviews.llvm.org/D132977
2022-09-09 09:01:28 -05:00
Vitaly Buka 7dc0734567 [msan] Insert simplification passes after instrumentation
This resolves TODO from D96406.
InstCombine issue is fixed with D133394.

Saves 4.5% of .text on CTMark.
2022-09-09 00:33:04 -07:00
Vitaly Buka e261b03396 [sanitizers] Add experimental flag to insert sanitizers earlier 2022-09-08 19:05:17 -07:00
Joe Loser 1b3a78d1d5 [clang] Use std::size instead of llvm::array_lengthof
LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.

Change call sites to use `std::size` instead. Leave the few call sites that
use a locally defined `array_lengthof` that are meant to test previous bugs
with NTTPs in clang analyzer and SemaTemplate.

Differential Revision: https://reviews.llvm.org/D133520
2022-09-08 17:20:25 -06:00
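The replacement described in 1b3a78d1d5 above looks roughly like the following at a call site. The sketch assumes C++17; `kNames`, `arrayLengthOf`, and `countNames` are hypothetical, and `arrayLengthOf` is a local stand-in written in the style of `llvm::array_lengthof`, not LLVM's definition.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>  // std::size (C++17)

// Hypothetical C-style array used only for illustration.
static const char *const kNames[] = {"alpha", "beta", "gamma"};

// Pre-C++17 style helper: deduces the array bound from the reference type.
template <typename T, std::size_t N>
constexpr std::size_t arrayLengthOf(T (&)[N]) {
  return N;
}

// C++17 style: std::size already accepts C-style arrays.
constexpr std::size_t countNames() { return std::size(kNames); }

// Both approaches agree at compile time.
static_assert(arrayLengthOf(kNames) == std::size(kNames),
              "helper and std::size must report the same length");
```

Since both forms are `constexpr`, switching a call site from the helper to `std::size` does not change where the result can be used.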
Thomas Lively ac3b8df8f2 [WebAssembly] Prototype `f32x4.relaxed_dot_bf16x8_add_f32`
As proposed in https://github.com/WebAssembly/relaxed-simd/issues/77. Only an
LLVM intrinsic and a clang builtin are implemented. Since there is no bfloat16
type, use u16 to represent the bfloats in the builtin function arguments.

Differential Revision: https://reviews.llvm.org/D133428
2022-09-08 08:07:49 -07:00
Fangrui Song bc502d9c24 Revert D133266 "[MinGW] Reject explicit non-default visibility applied to dllexport/dllimport declaration"
This reverts commit 91d8324366.

The combo dllexport protected makes sense and is used by PlayStation.
Will change the patch to allow dllexport protected.
2022-09-07 16:06:19 -07:00
Marco Elver c4842bb2e9 [Clang] Introduce -fexperimental-sanitize-metadata=
Introduces the frontend flag -fexperimental-sanitize-metadata=, which
enables SanitizerBinaryMetadata instrumentation.

The first intended user of the binary metadata emitted will be a variant
of GWP-TSan [1]. The plan is to open source a stable and production
quality version of GWP-TSan. The development of which, however, requires
upstream compiler support.

[1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf

Until the tool has been open sourced, we mark this kind of
instrumentation as "experimental", and reserve the option to change
binary format, remove features, and similar.

Reviewed By: vitalybuka, MaskRay

Differential Revision: https://reviews.llvm.org/D130888
2022-09-07 21:25:40 +02:00
yronglin 6ed21fc515 Avoid __builtin_assume_aligned crash when the 1st arg is array type
Avoid __builtin_assume_aligned crash when the 1st arg is array type (or
string literal).

Fixes Issue #57169

Differential Revision: https://reviews.llvm.org/D133202
2022-09-07 12:46:20 -04:00
Vitaly Buka 4c18670776 [NFC][sancov] Rename ModuleSanitizerCoveragePass 2022-09-06 20:55:39 -07:00
Vitaly Buka 5e38b2a456 [NFC][msan] Rename ModuleMemorySanitizerPass 2022-09-06 20:30:35 -07:00
Chuanqi Xu 5f571eeb3f [NFC] [Frontend] Correct the use of 'auto' in SemaCoroutine and CGCoroutine
We should only use 'auto' when the type is evident from the right-hand
side of the expression. We also need to keep the '*' if the type is
actually a pointer. A few uses of 'auto' in SemaCoroutine.cpp and
CGCoroutine.cpp violate these rules. This commit tries to fix them.
2022-09-07 10:45:01 +08:00
Vitaly Buka 93600eb50c [NFC][asan] Rename ModuleAddressSanitizerPass 2022-09-06 15:02:11 -07:00
Vitaly Buka e7bac3b9fa [msan] Convert Msan to ModulePass
The MemorySanitizerPass function pass violated requirement 4 of function
passes, which forbids inserting globals. Msan needs to insert globals for
origin tracking and parameter tracking.

https://llvm.org/docs/WritingAnLLVMPass.html#the-functionpass-class

Reviewed By: kstoimenov, fmayer

Differential Revision: https://reviews.llvm.org/D133336
2022-09-06 15:01:04 -07:00
Fangrui Song 91d8324366 [MinGW] Reject explicit non-default visibility applied to dllexport/dllimport declaration
dllimport/dllexport is incompatible with protected/hidden visibilities.
(Arguably dllexport semantics is compatible with protected but let's reject the
combo for simplicity.)

When an explicit visibility attribute applies on a dllexport/dllimport
declaration, report a Frontend error (Sema does not compute visibility).

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D133266
2022-09-05 10:17:19 -07:00
Kazu Hirata b7a7aeee90 [clang] Qualify auto in range-based for loops (NFC) 2022-09-03 23:27:27 -07:00
Vitaly Buka 9905dae5e1 Revert "[Clang][CodeGen] Avoid __builtin_assume_aligned crash when the 1st arg is array type"
Breaks the Windows bot.

This reverts commit 3ad2fe913a.
2022-09-03 13:12:49 -07:00
Kazu Hirata 89f1433225 Use llvm::lower_bound (NFC) 2022-09-03 11:17:37 -07:00
yronglin 3ad2fe913a [Clang][CodeGen] Avoid __builtin_assume_aligned crash when the 1st arg is array type
Avoid __builtin_assume_aligned crash when the 1st arg is array type(or string literal).

Open issue: https://github.com/llvm/llvm-project/issues/57169

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D133202
2022-09-03 23:26:01 +08:00
Fangrui Song 1a4d851d27 [MinGW] Ignore -fvisibility/-fvisibility-inlines-hidden for dllexport
Similar to 123ce97fac for dllimport: dllexport
expresses a non-hidden visibility intention. We can consider it explicit and
therefore it should override the global visibility setting (see AST/Decl.cpp
"NamedDecl Implementation").

Adding the special case to CodeGenModule::setGlobalVisibility is somewhat weird,
but allows us to add the code in one place instead of many in AST/Decl.cpp.

Differential Revision: https://reviews.llvm.org/D133180
2022-09-02 09:59:16 -07:00
serge-sans-paille e0746a8a8d [clang] cleanup -fstrict-flex-arrays implementation
This is a follow up to https://reviews.llvm.org/D126864, addressing some remaining
comments.

It also considers union with a single zero-length array field as FAM for each
value of -fstrict-flex-arrays.

Differential Revision: https://reviews.llvm.org/D132944
2022-09-01 15:06:21 +02:00
Chuanqi Xu 7e19d53da4 [NFC] Emit builtin coroutine calls uniformly
All the coroutine builtins were emitted in EmitCoroutineIntrinsic except
__builtin_coro_size. This patch tries to emit all the coroutine builtins
uniformly.
2022-09-01 16:31:51 +08:00
Vitaly Buka 960e7a5513 [msan] Use Debug Info to point to affected fields
Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D132909
2022-08-31 13:12:17 -07:00
Sanjay Patel cdf3de45d2 [CodeGen] fix misnamed "not" operation; NFC
Seeing the wrong instruction for this name in IR is confusing.
Most of the tests are not even checking a subsequent use of
the value, so I just deleted the over-specified CHECKs.
2022-08-31 15:11:48 -04:00
Vitaly Buka c059ede28e [msan] Add more specific messages for use-after-destroy
Reviewed By: kda, kstoimenov

Differential Revision: https://reviews.llvm.org/D132907
2022-08-30 19:52:32 -07:00
Luke Nihlen c9aba60074 [clang] Don't emit debug vtable information for consteval functions
Fixes https://github.com/llvm/llvm-project/issues/55065

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D132874
2022-08-30 19:10:15 +00:00
Rong Xu db18f26567 [llvm-profdata] Handle internal linkage functions in profile supplementation
This patch has the following changes:
(1) Handling of internal linkage functions (static functions)
Static functions in FDO have a prefix of source file name, while they do not
have one in SampleFDO. Current implementation does not handle this and we are
not updating the profile for static functions. This patch fixes this.

(2) Handling of -funique-internal-linkage-names
Again this is for the internal linkage functions. Option
-funique-internal-linkage-names can now be applied to both FDO and SampleFDO
compilation. When it is used, it mangles internal linkage function names and
adds a hash value as the postfix.

When both the SampleFDO and FDO profiles use this option, or neither
uses it, the changes in (1) should handle this.

Here we also handle the case where the SampleFDO profile uses this option
while the FDO profile does not, or vice versa.

There is one case where this patch won't work: if one of the profiles uses
mangled names and the other does not. For example, if the SampleFDO profile
is built with the clang C compiler without -funique-internal-linkage-names, while
the FDO profile uses -funique-internal-linkage-names, the SampleFDO profile
contains unmangled names while the FDO profile contains mangled names. If
both profiles use the C++ compiler, this won't happen. We think this use case
is rare and does not justify the effort to fix.

Differential Revision: https://reviews.llvm.org/D132600
2022-08-29 16:15:12 -07:00
Yuanfang Chen 70248bfdea [Clang] Implement function attribute nouwtable
To have finer control of IR uwtable attribute generation. For target code generation,
IR nounwind and uwtable may have some interaction. However, for the frontend, there are
no semantic interactions, so this new `nouwtable` attribute is marked "SimpleHandler = 1".

Differential Revision: https://reviews.llvm.org/D132592
2022-08-29 12:12:19 -07:00
Kazu Hirata 86bc4587e1 Use std::clamp (NFC)
This patch replaces clamp idioms with std::clamp where the range is
obviously valid from the source code (that is, low <= high) to avoid
introducing undefined behavior.
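
A minimal sketch of the difference, assuming a hand-written min/max idiom as the thing being replaced (the helper names are hypothetical and not from the patch). The hand-written form stays defined when low > high, whereas `std::clamp` requires low <= high:

```cpp
#include <algorithm>

// Hypothetical hand-written clamp idiom. It is well defined for any lo/hi,
// including lo > hi, in which case it simply returns hi.
constexpr int clampByHand(int v, int lo, int hi) {
  return std::min(std::max(v, lo), hi);
}

// std::clamp has undefined behavior if lo > hi, so it is only a safe
// replacement where the bounds are visibly ordered, as with 0 and 100 here.
constexpr int clampPercent(int v) { return std::clamp(v, 0, 100); }
```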
2022-08-27 09:53:13 -07:00
Jun Zhang a4f84f1b2e [CodeGen] Track DeferredDecls that have been emitted
If we run into a first usage or definition of a mangled name, and
there's a DeferredDecl associated with it, we should remember that we
need to emit it later on.

Without this patch, clang-repl hits a JIT symbol not found error:
clang-repl> extern "C" int printf(const char *, ...);
clang-repl> auto l1 = []() { printf("ONE\n"); return 42; };
clang-repl> auto l2 = []() { printf("TWO\n"); return 17; };
clang-repl> auto r1 = l1();
ONE
clang-repl> auto r2 = l2();
TWO
clang-repl> auto r3 = l2();
JIT session error: Symbols not found: [ l2 ]
error: Failed to materialize symbols: { (main,
{ r3, orc_init_func.incr_module_5, $.incr_module_5.inits.0 }) }

Signed-off-by: Jun Zhang <jun@junz.org>

Differential Revision: https://reviews.llvm.org/D130831
2022-08-27 22:32:47 +08:00
Leonard Chan cdb30f7a26 [clang] Do not instrument the rtti_proxies under hwasan
We run into a duplicate symbol error when instrumenting the rtti_proxies
generated as part of the relative vtables ABI with hwasan:

```
ld.lld: error: duplicate symbol: typeinfo for icu_71::UObject
(.rtti_proxy)
>>> defined at brkiter.cpp
>>>
arm64-hwasan-shared/obj/third_party/icu/source/common/libicuuc.brkiter.cpp.o:(typeinfo
for icu_71::UObject (.rtti_proxy))
>>> defined at locavailable.cpp
>>>
arm64-hwasan-shared/obj/third_party/icu/source/common/libicuuc.locavailable.cpp.o:(.data.rel.ro..L_ZTIN6icu_717UObjectE.rtti_proxy.hwasan+0xE00000000000000)
```

The issue here is that the hwasan alias carries over the visibility and
linkage of the original proxy, so we have duplicate external symbols
that participate in linking. Similar to D132425 we can just disable
hwasan for the proxies for now.

Differential Revision: https://reviews.llvm.org/D132691
2022-08-26 18:22:17 +00:00
Leonard Chan 93e5cf6b9c [clang] Do not instrument relative vtables under hwasan
Full context in
https://bugs.fuchsia.dev/p/fuchsia/issues/detail?id=107017.

Instrumenting globals with hwasan results in a linker error under the
relative vtables abi:

```
ld.lld: error:
libunwind.cpp:(.rodata..L_ZTVN9libunwind12UnwindCursorINS_17LocalAddressSpaceENS_15Registers_arm64EEE.hwasan+0x8):
relocation R_AARCH64_PLT32 out of range: 6845471433603167792 is not in
[-2147483648, 2147483647]; references
libunwind::AbstractUnwindCursor::~AbstractUnwindCursor()
>>> defined in
libunwind/src/CMakeFiles/unwind_shared.dir/libunwind.cpp.obj
```

This is because the tag is included in the vtable address when
calculating the offset between the vtable and virtual function. A
temporary solution until we can resolve this is to just disable hwasan
instrumentation on relative vtables specifically, which can be done in
the frontend.
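
A rough sketch of why a tagged address breaks the 32-bit relative offset. The addresses and helper names are made up for illustration, and the assumption that the tag occupies the top byte of a 64-bit address is ours:

```cpp
#include <cstdint>
#include <limits>

// Assumption: the tag lives in the top byte of a 64-bit address.
constexpr std::uint64_t withTag(std::uint64_t addr, std::uint8_t tag) {
  return (addr & 0x00FFFFFFFFFFFFFFULL) | (std::uint64_t{tag} << 56);
}

// Signed distance between two addresses (hypothetical helper).
constexpr std::int64_t offsetBetween(std::uint64_t a, std::uint64_t b) {
  return static_cast<std::int64_t>(a - b);
}

// A relative vtable entry must fit in a signed 32-bit field.
constexpr bool fitsInInt32(std::int64_t offset) {
  return offset >= std::numeric_limits<std::int32_t>::min() &&
         offset <= std::numeric_limits<std::int32_t>::max();
}
```

With an untagged vtable at 0x10000 and a function at 0x12000 the offset fits easily; once the vtable address carries a tag, the difference no longer fits. The out-of-range value in the log above, 6845471433603167792, equals (0x5F << 56) + 13872, which is consistent with a small offset plus a tag in the top byte.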

Differential Revision: https://reviews.llvm.org/D132425
2022-08-26 18:21:40 +00:00
Xiang Li a0ecb4a299 [HLSL] Move DXIL validation version out of ModuleFlags
Put the DXIL validation version into a separate NamedMetadata to avoid updating ModuleFlags.

Currently the DXIL validation version is saved in ModuleFlags in clang CodeGen.
Then, in the DirectX backend, the data is extracted from ModuleFlags, which forces a rebuild of ModuleFlags.
This patch builds a NamedMetadata for the DXIL validation version and removes the code that rebuilds ModuleFlags.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D130207
2022-08-26 09:20:45 -07:00
Corentin Jabot 463e30f51f [Clang] Fix crash in coverage of if consteval.
Clang crashes when encountering an `if consteval` statement.
This is the minimum fix not to crash.
The fix is consistent with the current behavior of if constexpr,
which does generate coverage data for the discarded branches.
This is of course not correct and a better solution is
needed for both if constexpr and if consteval.
See https://github.com/llvm/llvm-project/issues/54419.

Fixes #57377

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D132723
2022-08-26 17:46:53 +02:00
Chris Bieneman 22c477f934 [HLSL] Initial codegen for SV_GroupIndex
Semantic parameters aren't passed as actual parameters, instead they are
populated from intrinsics which are generally lowered to reads from
dedicated hardware registers.

This change modifies clang CodeGen to emit the intrinsic calls and
populate the parameter's LValue with the result of the intrinsic call
for SV_GroupIndex.

The result of this is to make the actual passed argument ignored, which
will make it easy to clean up later in an IR pass.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D131203
2022-08-25 11:17:54 -05:00
David Majnemer bd28bd59a3 [clang-cl] /kernel should toggle bit 30 in @feat.00
The linker is supposed to detect when an object with /kernel is linked
with another object which is not compiled with /kernel. The linker
detects this by checking bit 30 in @feat.00.
2022-08-25 14:17:26 +00:00
Zahira Ammarguellat 5def954a5b Support of expression granularity for _Float16.
Differential Revision: https://reviews.llvm.org/D113107
2022-08-25 08:26:53 -04:00
Sami Tolvanen cff5bef948 KCFI sanitizer
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.

Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.

KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.

A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:

```
.c:
  int f(void);
  int (*p)(void) = f;
  p();

.s:
  .4byte __kcfi_typeid_f
  .global f
  f:
    ...
```

Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.

As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.

Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.

Relands 67504c9549 with a fix for
32-bit builds.

Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay

Differential Revision: https://reviews.llvm.org/D119296
2022-08-24 22:41:38 +00:00
Sami Tolvanen a79060e275 Revert "KCFI sanitizer"
This reverts commit 67504c9549 as using
PointerEmbeddedInt to store 32 bits breaks 32-bit arm builds.
2022-08-24 19:30:13 +00:00
Sami Tolvanen 67504c9549 KCFI sanitizer
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.

Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.

KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.

A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:

```
.c:
  int f(void);
  int (*p)(void) = f;
  p();

.s:
  .4byte __kcfi_typeid_f
  .global f
  f:
    ...
```

Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.

As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.

Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.

Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay

Differential Revision: https://reviews.llvm.org/D119296
2022-08-24 18:52:42 +00:00
Vitaly Buka b5a9adf1f5 [clang] Create alloca to pass into static lambda
"this" parameter of the lambda is undef, nonnull and dereferenceable.
So we need to pass something consistent.

Any alloca will work. It will be eliminated as unused later by optimizer.

Otherwise we generate code which Msan is expected to catch.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D132275
2022-08-23 13:53:17 -07:00
Joseph Huber 2b8f722e63 [OpenMP] Add option to assert no nested OpenMP parallelism on the GPU
The OpenMP device runtime needs to support the OpenMP standard. However,
constructs like nested parallelism are very uncommon in real applications
yet lead to complexity in the runtime that is sometimes difficult to
optimize out. As a stop-gap for performance we should supply an argument
that selectively disables this feature. This patch adds the
`-fopenmp-assume-no-nested-parallelism` argument which explicitly
disables the use of nested parallelism in OpenMP.

Reviewed By: carlo.bertolli

Differential Revision: https://reviews.llvm.org/D132074
2022-08-23 14:09:51 -05:00
utsumi 2e2caea37f [Clang][OpenMP] Make copyin clause on combined and composite construct work (patch by Yuichiro Utsumi (utsumi.yuichiro@fujitsu.com))
Make copyin clause on the following constructs work.

- parallel for
- parallel for simd
- parallel sections

Fixes https://github.com/llvm/llvm-project/issues/55547

Patch by Yuichiro Utsumi (utsumi.yuichiro@fujitsu.com)

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D132209
2022-08-23 07:58:35 -07:00
David Majnemer 2c923b8863 [clang-cl] Expose the /volatile:{iso,ms} choice via _ISO_VOLATILE
MSVC allows interpreting volatile loads and stores, when combined with
/volatile:ms, as having acquire/release semantics. MSVC also exposes a
define, _ISO_VOLATILE, which allows users to enquire if this feature is
enabled or disabled.
2022-08-23 14:29:52 +00:00
Yuanfang Chen f9969a3d28 [CodeGen] Sort llvm.global_ctors by lexing order before emission
Fixes https://github.com/llvm/llvm-project/issues/55804

The lexing order is already recorded in DelayedCXXInitPosition but we
were not using it, based on the wrong assumption that inline variables are
unordered. This patch fixes it by ordering entries in llvm.global_ctors
by their order in DelayedCXXInitPosition.

For llvm.global_ctors entries without a lexing order, order them by
insertion order.

(This *mostly* orders the template instantiation in
https://reviews.llvm.org/D126341 intuitively, minus one tweak for which I'll
submit a separate patch.)
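
The ordering rule described above could be sketched roughly as follows. `CtorEntry` and `sortCtors` are hypothetical names, and placing entries with a lexing order ahead of those without one is an assumption of this sketch, not something the patch description states:

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an llvm.global_ctors entry.
struct CtorEntry {
  std::string name;
  std::optional<unsigned> lexOrder; // position in lexing order, if known
};

// Entries with a lexing order are sorted by it. Entries without one keep
// their insertion order, which std::stable_sort preserves.
// Assumption: ordered entries are placed before unordered ones.
void sortCtors(std::vector<CtorEntry> &ctors) {
  std::stable_sort(ctors.begin(), ctors.end(),
                   [](const CtorEntry &a, const CtorEntry &b) {
                     if (a.lexOrder && b.lexOrder)
                       return *a.lexOrder < *b.lexOrder;
                     return a.lexOrder.has_value() && !b.lexOrder.has_value();
                   });
}
```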

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D127233
2022-08-22 16:00:14 -07:00
Yaxun (Sam) Liu 9f6cb3e9fd [AMDGPU] Add builtin s_sendmsg_rtn
Reviewed by: Brian Sumner, Artem Belevich

Differential Revision: https://reviews.llvm.org/D132140

Fixes: SWDEV-352017
2022-08-22 18:29:23 -04:00
Chris Bieneman 9a478d5232 [NFC] Rename dx.shader to hlsl.shader
This metadata annotation is HLSL-specific not DirectX specific. It will
need to be attached for shaders regardless of whether they are targeting
DXIL.
2022-08-22 16:03:40 -05:00
Kazu Hirata 8b1b0d1d81 Revert "Use std::is_same_v instead of std::is_same (NFC)"
This reverts commit c5da37e42d.

This patch seems to break builds with some versions of MSVC.
2022-08-20 23:00:39 -07:00
Kazu Hirata c5da37e42d Use std::is_same_v instead of std::is_same (NFC) 2022-08-20 22:36:26 -07:00
Kazu Hirata 8e494b85a5 Use llvm::drop_begin (NFC) 2022-08-20 21:18:30 -07:00
Alex Bradbury bc53832080 [clang][RISCV] Fix incorrect ABI lowering for inherited structs under hard-float ABIs
The hard float ABIs have a rule that if a flattened struct contains
either a single fp value, or an int+fp, or fp+fp then it may be passed
in a pair of registers (if sufficient GPRs+FPRs are available).
detectFPCCEligibleStruct and the helper it calls,
detectFPCCEligibleStructHelper examine the type of the argument/return
value to determine if it complies with the requirements for this ABI
rule.

As reported in bug #57084, this logic produces incorrect results for C++
structs that inherit from other structs. This is because only the fields
of the struct were examined, but enumerating RD->fields misses any
fields in inherited C++ structs. This patch corrects that issue by
adding appropriate logic to enumerate any included base structs.

Differential Revision: https://reviews.llvm.org/D131677
2022-08-19 20:31:06 +01:00
Craig Topper 1a60e003df [RISCV] Use Triple::isRISCV/isRISCV32/isRISCV64 helpers in some places. NFC
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D132197
2022-08-19 09:11:22 -07:00
Caroline Concatto 9f21d6e953 [Clang][AArch64] Use generic extract/insert vector for svget/svset/svcreate tuples
This patch replaces svget, svset and svcreate aarch64 intrinsics for tuple
types with the generic llvm-ir intrinsics extract/insert vector

Differential Revision: https://reviews.llvm.org/D131547
2022-08-19 12:58:59 +01:00
Caroline Concatto 4ef1f014a1 [Clang][AArch64] Replace aarch64_sve_ldN intrinsic by aarch64_sve_ldN.sret
Differential Revision: https://reviews.llvm.org/D131687
2022-08-19 11:42:18 +01:00
Yonghong Song 481d67d310 [Clang][BPF] Support record argument with direct values
Currently, record arguments are always passed by reference by allocating
space for record values in the caller. This is less efficient for
small records which may take one or two registers. For example,
for x86_64 and aarch64, for a record size up to 16 bytes, the record
values can be passed by values directly on the registers.

This patch added BPF support of record argument with direct values
for up to 16 byte record size. If record size is 0, that record
will not take any register, which is the same behavior for x86_64
and aarch64. If the record size is greater than 16 bytes, the
record argument will be passed by reference.
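
The size thresholds above can be summarized in a small sketch. The enum and function names are hypothetical and do not correspond to the actual clang ABI classification code:

```cpp
#include <cstdint>

// Hypothetical classification of how a record argument is passed.
enum class ArgPassing { Ignored, Direct, Indirect };

// Size 0: takes no register. Up to 16 bytes: passed by value in registers.
// Larger: passed by reference.
constexpr ArgPassing classifyRecordArg(std::uint64_t sizeInBytes) {
  if (sizeInBytes == 0)
    return ArgPassing::Ignored;
  if (sizeInBytes <= 16)
    return ArgPassing::Direct;
  return ArgPassing::Indirect;
}
```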

Differential Revision: https://reviews.llvm.org/D132144
2022-08-18 19:11:50 -07:00
Prabhdeep Singh Soni bce94ea551 [OMPIRBuilder] Add support for safelen clause
This patch adds OMPIRBuilder support for the safelen clause for the
simd directive.

Reviewed By: shraiysh, Meinersbur

Differential Revision: https://reviews.llvm.org/D131526
2022-08-18 15:43:08 -04:00
Wolfgang Pieb 8564e2fea5 [Inlining] Add a clang option to limit inlining of functions
Add the clang option -finline-max-stacksize=<N> to suppress inlining
of functions whose stack size exceeds the given value.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D131986
2022-08-18 11:56:24 -07:00
Ties Stuij 27cbfa7cc8 [Clang] Propagate const context info when emitting compound literal
This patch fixes a crash when trying to emit a constant compound literal.

For C++ Clang evaluates either casts or binary operations at translation time,
but doesn't pass on the InConstantContext information that was inferred when
parsing the statement.  Because of this, strict FP evaluation (-ftrapping-math)
which shouldn't be in effect yet, then causes checkFloatingpointResult to return
false, which in tryEmitGlobalCompoundLiteral will trigger an assert that the
compound literal wasn't constant.

The discussion here around 'manifestly constant evaluated contexts' was very
helpful to me when trying to understand what LLVM's position is on what
evaluation context should be in effect, together with the explanatory text in
that patch itself:
https://reviews.llvm.org/D87528

Reviewed By: rjmccall, DavidSpickett

Differential Revision: https://reviews.llvm.org/D131555
2022-08-18 11:25:20 +01:00
Vitaly Buka 36c9f5a58b [NFC][OpenMP] Simplify 2f9be69d84 2022-08-17 18:59:48 -07:00
David Blaikie 06c70e9b99 DebugInfo: Remove auto return type representation support
Seems this complicated lldb sufficiently for some cases that it hasn't
been worth supporting/fixing there - and it so far hasn't provided any
new use cases/value for debug info consumers, so let's remove it until
someone has a use case for it.

(side note: the original implementation of this still had a bug (I
should've caught it in review): we still didn't produce
auto-returning function declarations in types where the function wasn't
instantiated. Fixing that requires removing the `if
getContainedAutoType` condition in
`CGDebugInfo::CollectCXXMemberFunctions`. Without that, auto-returning
functions were still being handled the same as member function templates
and special member functions: never added to the member list, only
attached to the type via the declaration chain from the definition.)

Further discussion about this in D123319

This reverts commit 5ff992bca208a0e37ca6338fc735aec6aa848b72: [DEBUG-INFO] Change how we handle auto return types for lambda operator() to be consistent with gcc

This reverts commit c83602fdf51b2692e3bacb06bf861f20f74e987f: [DWARF5][clang]: Added support for DebugInfo generation for auto return type for C++ member functions.

Differential Revision: https://reviews.llvm.org/D131933
2022-08-17 00:35:05 +00:00
Yonghong Song d9198f64d9 [Clang][BPF]: Force sign/zero extension for return values in caller
Currently bpf supports calling kernel functions (x86_64, arm64, etc.)
in bpf programs. Tejun discovered a problem where the x86_64 func
return value (an unsigned char type) is stored in the 8-bit subregister %al
and the other 56 bits in %rax might be garbage. But based on the current
bpf ABI, the bpf program assumes the whole %rax holds the correct value
as the callee is supposed to do the necessary sign/zero extension.
This mismatch between bpf and x86_64 caused incorrect results.

To resolve this problem, this patch forces the caller to do the needed
sign/zero extension for 8/16-bit return values as well. Note that
32-bit return values already had sign/zero extension even without
this patch.

For example, for the test case attached to this patch:

  $  cat t.c
  _Bool bar_bool(void);
  unsigned char bar_char(void);
  short bar_short(void);
  int bar_int(void);
  int foo_bool(void) {
        if (bar_bool() != 1) return 0; else return 1;
  }
  int foo_char(void) {
        if (bar_char() != 10) return 0; else return 1;
  }
  int foo_short(void) {
        if (bar_short() != 10) return 0; else return 1;
  }
  int foo_int(void) {
        if (bar_int() != 10) return 0; else return 1;
  }

Without this patch, the generated call insns in IR look like:
    %call = call zeroext i1 @bar_bool()
    %call = call zeroext i8 @bar_char()
    %call = call signext i16 @bar_short()
    %call = call i32 @bar_int()
So it is assumed that zero extension has been done for the return values of
bar_bool() and bar_char(). Sign extension has been done for the return
value of bar_short(). No assumption is made about the return value of
bar_int(), so the caller needs to do the necessary shifting to get the
correct 32-bit value.

With this patch, the generated call insns in IR look like:
    %call = call i1 @bar_bool()
    %call = call i8 @bar_char()
    %call = call i16 @bar_short()
    %call = call i32 @bar_int()
There are no assumptions for the return values of the above four function calls,
so shifting is necessary for all of them.

The following is the objdump file difference for function foo_char().
Without this patch:
  0000000000000010 <foo_char>:
       2:       85 10 00 00 ff ff ff ff call -1
       3:       bf 01 00 00 00 00 00 00 r1 = r0
       4:       b7 00 00 00 01 00 00 00 r0 = 1
       5:       15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
       6:       b7 00 00 00 00 00 00 00 r0 = 0
  0000000000000038 <LBB1_2>:
       7:       95 00 00 00 00 00 00 00 exit

With this patch:
  0000000000000018 <foo_char>:
       3:       85 10 00 00 ff ff ff ff call -1
       4:       bf 01 00 00 00 00 00 00 r1 = r0
       5:       57 01 00 00 ff 00 00 00 r1 &= 255
       6:       b7 00 00 00 01 00 00 00 r0 = 1
       7:       15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
       8:       b7 00 00 00 00 00 00 00 r0 = 0
  0000000000000048 <LBB1_2>:
       9:       95 00 00 00 00 00 00 00 exit
The zero extension of the return 'char' value is done here.
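
As a small illustration of the caller-side extension (a sketch only: the helper names are hypothetical, and this models the register as a plain 64-bit integer rather than showing BPF codegen), masking the low byte mirrors the `r1 &= 255` instruction above:

```cpp
#include <cstdint>

// Caller-side zero extension of an 8-bit return value: keep the low byte
// and discard whatever the callee left in the upper 56 bits.
constexpr std::uint64_t zeroExtend8(std::uint64_t reg) { return reg & 0xFF; }

// Caller-side sign extension of a 16-bit return value, written with plain
// arithmetic so that it does not rely on narrowing conversions.
constexpr std::int64_t signExtend16(std::uint64_t reg) {
  const std::int64_t low = static_cast<std::int64_t>(reg & 0xFFFF);
  return low >= 0x8000 ? low - 0x10000 : low;
}
```

For example, a register holding 0xDEADBEEF0A after a call that returns `unsigned char` 10 compares equal to 10 only after the mask is applied.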

Differential Revision: https://reviews.llvm.org/D131598
2022-08-16 16:08:01 -07:00
Saleem Abdulrasool 585f62be1a CodeGen: correct handling of debug info generation for aliases
When aliasing a static array, the aliasee is going to be a GEP which
points to the value.  We should strip pointer casts before forming the
reference.  This was occluded by the use of opaque pointers.

This problem has existed since the introduction of the debug info
generation for aliases in b1ea0191a4.  The
test case would assert due to the invalid cast with or without
`-no-opaque-pointers` at that revision.

Fixes: #57179
2022-08-16 21:27:05 +00:00
Arthur Eubanks 9181ce623f [Windows] Put init_seg(compiler/lib) in llvm.global_ctors
Currently we treat initializers with init_seg(compiler/lib) like
any other init_seg: they simply have a global variable in the proper
section (".CRT$XCC" for compiler/".CRT$XCL" for lib) and are added to
llvm.used. However, this doesn't match how LLVM sees normal (or
init_seg(user)) initializers via llvm.global_ctors. This
causes issues like incorrect init_seg(compiler) vs init_seg(user)
ordering due to GlobalOpt evaluating constructors, and the
inability to remove init_seg(compiler/lib) initializers at all.

Currently we use 'A' for priorities less than 200. Use 200 for
init_seg(compiler) (".CRT$XCC") and 400 for init_seg(lib) (".CRT$XCL"),
which do not append the priority to the section name. Priorities
between 200 and 400 use ".CRT$XCC${Priority}". This allows for
some wiggle room for people/future extensions that want to add
initializers between compiler and lib.

Fixes #56922

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D131910
2022-08-16 08:16:18 -07:00
Kazu Hirata 2b43bd0bd9 Remove unused forward declarations (NFC) 2022-08-13 12:55:47 -07:00
Vitaly Buka 2f9be69d84 [OpenMP] Fix another use after scope after D129608
https://lab.llvm.org/buildbot/#/builders/5/builds/26770
2022-08-13 12:13:54 -07:00
Vitaly Buka f385eaf48f [OpenMP] Fix use after scope after D129608
Broken builder https://lab.llvm.org/buildbot/#/builders/5/builds/26764
2022-08-13 09:40:51 -07:00
Jennifer Yu 2ca27206f9 [OpenMP] Fix segmentation fault when data field is used in is_device_pt
Currently, for a field we just emit map info for this pointer variable, which
fails at run time. For fields, the PartialStruct is created and it
needs a call to emitCombinedEntry, which creates the base that covers all
the pieces.

The change is to generate map info as regular fields.

Differential Revision: https://reviews.llvm.org/D129608
2022-08-12 17:10:26 -07:00
Aaron Ballman b48fb85fe6 Fix crash-on-valid with consteval temporary construction through list initialization
Clang currently crashes when lowering a consteval list initialization
of a temporary. This is partially working around an issue in the
template instantiation code (TreeTransform::TransformCXXTemporaryObjectExpr())
that does not yet know how to handle list initialization of temporaries
in all cases. However, it's also helping reduce fragility by ensuring
we always have a valid QualType when trying to emit a constant
expression during IR generation.

Fixes #55871

Differential Revision: https://reviews.llvm.org/D131194
2022-08-11 13:44:24 -04:00
Florian Hahn ef110a491f [Builtins] Do not claim most libfuncs are readnone with trapping math.
At the moment, Clang only considers errno when deciding if a builtin
is const. This ignores the fact that some library functions may raise
floating point exceptions, which may modify global state, e.g. when
updating FP status registers.

To model the fact that some library functions/builtins may raise
floating point exceptions, this patch adds a new 'g' modifier for
builtins. If a builtin is marked with 'g', it cannot be considered
const, unless FP exceptions are ignored.

So far I've not added CHECK lines for all calls in math-libcalls.c. I'll
do that once we agree on the overall direction.

A consequence seems to be that we fail to select some of the constrained
math builtins now, but I am not entirely sure what's going on there.

Reviewed By: john.brawn

Differential Revision: https://reviews.llvm.org/D129231
2022-08-11 12:29:01 +01:00
Freddy Ye e4888a37d3 [X86][BF16] Enable __bf16 for x86 targets.
The X86 psABI has been updated to support the __bf16 type, the ABI of which is the
same as FP16. See https://discourse.llvm.org/t/patch-add-optional-bfloat16-support/63149

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D130964
2022-08-10 09:00:47 +08:00
Fangrui Song 32197830ef [clang][clang-tools-extra] LLVM_NODISCARD => [[nodiscard]]. NFC 2022-08-09 07:11:18 +00:00
Fangrui Song 3f18f7c007 [clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D131346
2022-08-08 09:12:46 -07:00
Sergei Barannikov 87dc7d4b61 [clang][CodeGen] Factor out Swift ABI hooks (NFCI)
Swift calling conventions stand out in that they are lowered in a
mostly target-independent manner, with very few customization points.
As such, swift-related methods of ABIInfo do not reference the rest of
ABIInfo and vice versa.
This change follows the interface segregation principle; it removes the
dependency of SwiftABIInfo on ABIInfo. Targets must now implement
SwiftABIInfo separately if they support Swift calling conventions.

Almost all targets implemented `shouldPassIndirectly` the same way. This
de-facto default implementation has been moved into the base class.

`isSwiftErrorInRegister` used to be virtual, now it is not. It didn't
accept any arguments which could have an effect on the returned value.
This is now a static property of the target ABI.

Reviewed By: rusyaev-roman, inclyc

Differential Revision: https://reviews.llvm.org/D130394
2022-08-08 00:23:23 +08:00
Shilei Tian e21202dac1 [Clang][OpenMP] Fix the issue that `llvm.lifetime.end` is emitted too early for variables captured in linear clause
Currently if an OpenMP program uses `linear` clause, and is compiled with
optimization, `llvm.lifetime.end` for variables listed in `linear` clause are
emitted too early such that there could still be uses after that. Let's take the
following code as example:
```
// loop.c
int j;
int *u;

void loop(int n) {
  int i;
#pragma omp simd linear(j)
  for (i = 0; i < n; ++i) {
    ++j;
    u = &j;
  }
}
```
We compile using the command:
```
clang -cc1 -fopenmp-simd -O3 -x c -triple x86_64-apple-darwin10 -emit-llvm loop.c -o loop.ll
```
The following IR (simplified) will be generated:
```
@j = local_unnamed_addr global i32 0, align 4
@u = local_unnamed_addr global ptr null, align 8

define void @loop(i32 noundef %n) local_unnamed_addr {
entry:
  %j = alloca i32, align 4
  %cmp = icmp sgt i32 %n, 0
  br i1 %cmp, label %simd.if.then, label %simd.if.end

simd.if.then:                                     ; preds = %entry
  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %j)
  store ptr %j, ptr @u, align 8
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j)
  %0 = load i32, ptr %j, align 4
  store i32 %0, ptr @j, align 4
  br label %simd.if.end

simd.if.end:                                      ; preds = %simd.if.then, %entry
  ret void
}
```
The most important part is:
```
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j)
  %0 = load i32, ptr %j, align 4
  store i32 %0, ptr @j, align 4
```
`%j` is still loaded after `@llvm.lifetime.end.p0(i64 4, ptr nonnull %j)`. This
could cause the backend to optimize the code incorrectly and generate
incorrect code. The root cause is that when we emit a construct that could have
a `linear` clause, it usually has the following pattern:
```
EmitOMPLinearClauseInit(S)
{
  OMPPrivateScope LoopScope(*this);
  ...
  EmitOMPLinearClause(S, LoopScope);
  ...
  (void)LoopScope.Privatize();
  ...
}
EmitOMPLinearClauseFinal(S, [](CodeGenFunction &) { return nullptr; });
```
Variables that need to be privatized are added into `LoopScope`, which also
serves as an RAII object. When `LoopScope` is destructed and optimization is
enabled, a `@llvm.lifetime.end` is also emitted for each privatized variable.
However, the write-back to the original variables in the `linear` clause happens
after the scope, in `EmitOMPLinearClauseFinal`, causing the issue we see above.

A quick "fix" would seem to be moving `EmitOMPLinearClauseFinal` inside the scope.
However, that doesn't work, because the local variable map has been updated
by `LoopScope` such that a variable declaration is mapped to the privatized
variable instead of the actual one. As a result, the following code would be
generated:
```
  %0 = load i32, ptr %j, align 4
  store i32 %0, ptr %j, align 4
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j)
```
Now the lifetime is correct, but the write-back is clearly broken.

In this patch, a new function `OMPPrivateScope::restoreMap` is added and called
before calling `EmitOMPLinearClauseFinal`. This makes sure that
`EmitOMPLinearClauseFinal` can find the original variables to write back.
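
The interaction between privatization, the variable map, and the write-back can be modelled with a small toy program. This is a sketch only, not clang's implementation: `VarMap`, `PrivateScope`, and `writeBackDemo` are hypothetical names, variables are plain `int` storage keyed by name, and it assumes the write-back is performed inside the scope right after the map has been restored.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for the local variable map: declaration name -> storage.
using VarMap = std::map<std::string, int *>;

// Toy RAII scope, loosely modelled on the description of OMPPrivateScope.
class PrivateScope {
  VarMap &Map;
  VarMap Saved; // original mappings, kept so they can be restored

public:
  explicit PrivateScope(VarMap &M) : Map(M) {}

  // Redirect Name to private storage, remembering the original mapping.
  void privatize(const std::string &Name, int *PrivateStorage) {
    Saved.emplace(Name, Map.at(Name));
    Map[Name] = PrivateStorage;
  }

  // Put the original mappings back; safe to call more than once.
  void restoreMap() {
    for (const auto &Entry : Saved)
      Map[Entry.first] = Entry.second;
    Saved.clear();
  }

  ~PrivateScope() { restoreMap(); }
};

// Runs a "loop" on a private copy of j and writes the result back.
int writeBackDemo() {
  int J = 0;        // the original variable
  int PrivateJ = 0; // its privatized copy
  VarMap Vars{{"j", &J}};
  {
    PrivateScope Scope(Vars);
    Scope.privatize("j", &PrivateJ);
    *Vars.at("j") = 42;         // loop body updates the private copy
    Scope.restoreMap();         // "j" names the original variable again
    assert(Vars.at("j") == &J); // without restoreMap this would be &PrivateJ
    *Vars.at("j") = PrivateJ;   // write-back: original <- private copy
  } // the private copy's lifetime would end here
  return J;
}
```

Without the `restoreMap()` call, the write-back line would copy the private value onto itself, which is the toy equivalent of the `load %j` / `store %j` pair shown earlier.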

Fixes #56913.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D131272
2022-08-06 16:50:37 -04:00
Xiang Li b2c9ff7273 [NFC][HLSL] Fix build error caused by a missing typo update.
setHLSLFnuctionAttributes to setHLSLFunctionAttributes.

Differential Revision: https://reviews.llvm.org/D131240
2022-08-04 23:20:25 -07:00
Xiang Li 6134629af0 [NFC][HLSL] Fix typo in CGHLSLRuntime.
Change setHLSLFnuctionAttributes to setHLSLFunctionAttributes.

Differential Revision: https://reviews.llvm.org/D131238
2022-08-04 23:08:40 -07:00
Xiang Li 906e41f4e3 [HLSL] clang codeGen for HLSLShaderAttr.
Translate HLSLShaderAttr to IR level.
 1. Skip mangle for hlsl entry functions.
 2. Add function attribute for hlsl entry functions.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D124752
2022-08-04 21:23:57 -07:00
Ellis Hoag 6f4c3c0f64 [InstrProf][attempt 2] Add new format for -fprofile-list=
In D130807 we added the `skipprofile` attribute. This commit
changes the format so we can either `forbid` or `skip` profiling
functions by adding the `noprofile` or `skipprofile` attributes,
respectively. The behavior of the original format remains
unchanged.

Also, add the `skipprofile` attribute when using
`-fprofile-function-groups`.

This was originally landed as https://reviews.llvm.org/D130808 but was
reverted due to a Windows test failure.

Differential Revision: https://reviews.llvm.org/D131195
2022-08-04 17:12:56 -07:00
Matt Arsenault c5b36ab1d6 AMDGPU/clang: Remove dead code
The order has to be a constant and should be enforced by the builtin
definition. The fallthrough behavior would have been broken anyway.

There's still an existing issue/assert if you try to use garbage for the
ordering. The IRGen should be broken, but we also hit another assert
before that.

Fixes issue 56832
2022-08-04 19:02:56 -04:00
Nico Weber 0eb7d86f58 Revert "[InstrProf] Add new format for -fprofile-list="
This reverts commit b692312ca4.
Breaks tests on Windows, see https://reviews.llvm.org/D130808#3699952
2022-08-04 13:04:59 -04:00
Ellis Hoag b692312ca4 [InstrProf] Add new format for -fprofile-list=
In D130807 we added the `skipprofile` attribute. This commit
changes the format so we can either `forbid` or `skip` profiling
functions by adding the `noprofile` or `skipprofile` attributes,
respectively. The behavior of the original format remains
unchanged.

Also, add the `skipprofile` attribute when using
`-fprofile-function-groups`.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D130808
2022-08-04 08:49:43 -07:00
Ellis Hoag 12e78ff881 [InstrProf] Add the skipprofile attribute
As discussed in [0], this diff adds the `skipprofile` attribute to
prevent the function from being profiled while allowing profiled
functions to be inlined into it. The `noprofile` attribute remains
unchanged.

The `noprofile` attribute is used for functions where it is
dangerous to add instrumentation, while the `skipprofile` attribute is
used to reduce code size or performance overhead.

[0] https://discourse.llvm.org/t/why-does-the-noprofile-attribute-restrict-inlining/64108

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D130807
2022-08-04 08:45:27 -07:00
Matt Jacobson c8b2f3f51b [ObjC] type method metadata `_imp`, messenger routine at callsite with program address space
On targets with non-default program address space (e.g., Harvard
architectures), clang crashes when emitting Objective-C method metadata,
because the address of the method IMP cannot be bitcast to i8*. It similarly
crashes at messenger callsite with a failed bitcast.

Define the _imp field instead as i8 addrspace(1)* (or whatever the target's
program address space is). And in getMessageSendInfo(), create signatureType by
specifying the program address space.

Add a regression test using the AVR target. Test failed previously and passes
now. Checked codegen of the test for x86_64-apple-darwin19.6.0 and saw no
difference, as expected.

Reviewed By: rjmccall, dylanmckay

Differential Revision: https://reviews.llvm.org/D112113
2022-08-04 05:40:32 -04:00
Corentin Jabot 127bf44385 [Clang][C++20] Support capturing structured bindings in lambdas
This completes the implementation of P1091R3 and P1381R1.

This patch allows the capture of structured bindings
both for C++20+ and C++17, with extension/compat warning.

In addition, capturing an anonymous union member,
a bitfield, or a structured binding thereof now has a
better diagnostic.

We only support structured bindings - as opposed to other kinds
of structured statements/blocks. We still emit an error for those.

In addition, support for structured bindings capture is entirely disabled in
OpenMP mode as this needs more investigation - a specific diagnostic indicates the feature is not yet supported there.

Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented.

At the request of @shafik, I can confirm the correct behavior of lldb with this change.

Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/52720

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D122768
2022-08-04 10:12:53 +02:00
Phoebe Wang 6f867f9102 [X86] Support ``-mindirect-branch-cs-prefix`` for call and jmp to indirect thunk
This is to address feature request from https://github.com/ClangBuiltLinux/linux/issues/1665

Reviewed By: nickdesaulniers, MaskRay

Differential Revision: https://reviews.llvm.org/D130754
2022-08-04 15:12:15 +08:00
Corentin Jabot a274219600 Revert "[Clang][C++20] Support capturing structured bindings in lambdas"
This reverts commit 44f2baa380.

Breaks self builds and seems to have conformance issues.
2022-08-03 21:00:29 +02:00
Corentin Jabot 44f2baa380 [Clang][C++20] Support capturing structured bindings in lambdas
This completes the implementation of P1091R3 and P1381R1.

This patch allows the capture of structured bindings
both for C++20+ and C++17, with extension/compat warning.

In addition, capturing an anonymous union member,
a bitfield, or a structured binding thereof now has a
better diagnostic.

We only support structured bindings - as opposed to other kinds
of structured statements/blocks. We still emit an error for those.

In addition, support for structured bindings capture is entirely disabled in
OpenMP mode as this needs more investigation - a specific diagnostic indicates the feature is not yet supported there.

Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented.

At the request of @shafik, I can confirm the correct behavior of lldb with this change.

Fixes https://github.com/llvm/llvm-project/issues/54300
Fixes https://github.com/llvm/llvm-project/issues/52720

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D122768
2022-08-03 20:00:01 +02:00
Yuanfang Chen 92c1bc6158 [CodeGen][inlineasm] assume the flag output of inline asm is boolean value
The GCC inline asm documentation says that
"... the general rule is that the output variable must be a scalar
integer, and the value is boolean."

Commit e5c37958f9 lowers the flag output of
inline asm on X86 with setcc, so the flag is guaranteed to be a
boolean value. Clang does not support ARM inline asm flag output
yet, so there is nothing to worry about for ARM.

See "Flag Output" section at
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#OutputOperands

Fixes https://github.com/llvm/llvm-project/issues/56568

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129954
2022-08-02 11:49:01 -07:00
Alok Kumar Sharma 5ec6ea3dfd [clang][OpenMP][DebugInfo] Mark OpenMP generated functions as artificial
The Clang compiler generates internal functions for OpenMP. This
patch marks these functions as artificial.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D111521
2022-08-02 21:24:46 +05:30
Chuanqi Xu 6d10733d44 [C++20] [Modules] Handle initializer for Header Units
Previously, when we added module initializers, we forgot to handle header
units. As a result, we couldn't compile a Hello World example with
header units. This patch tries to fix this.

Reviewed By: iains

Differential Revision: https://reviews.llvm.org/D130871
2022-08-02 11:24:46 +08:00
Chuanqi Xu 39cfde2366 Revert "[C++20] [Modules] Handle initializer for Header Units"
This reverts commit db6152ad66.

This commit fails on ppc64. Since we want to backport it to 15.x,
revert it now to keep the patch complete.
2022-08-02 11:09:38 +08:00
Chuanqi Xu db6152ad66 [C++20] [Modules] Handle initializer for Header Units
Previously, when we added module initializers, we forgot to handle header
units. As a result, we couldn't compile a Hello World example with
header units. This patch tries to fix this.

Reviewed By: iains

Differential Revision: https://reviews.llvm.org/D130871
2022-08-02 10:27:02 +08:00
Zakk Chen 71fd66161d [RISCV][Clang] Support RVV policy functions.
1. Add policy functions support and tests for vadd, vmv, vfmv and all load
   instructions except segment load. I didn't add all combinations of policy
   functions in the tests because it does not seem to make sense.
2. Rename HasUnMaskedOverloaded to SupportOverloading.
3. vmv.s.x for ta policy could not have overloaded API.
4. This patch does not support all operations; follow-up patches will
   add support for the rest.

[RFC] https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/137

Reviewed By: kito-cheng, fakepaper56, fakepaper56

Differential Revision: https://reviews.llvm.org/D126742
2022-08-01 17:32:08 +00:00
Gabriel Ravier 5674a3c880 Fixed a number of typos
I went over the output of the following mess of a command:

(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z |
 parallel --xargs -0 cat | aspell list --mode=none --ignore-case |
 grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n |
 grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)

and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).

Differential Revision: https://reviews.llvm.org/D130827
2022-08-01 13:13:18 -04:00
Chris Bieneman 5dbb92d8cd [HLSL] CodeGen HLSL Resource annotations
HLSL Resource types need special annotations that the backend will use
to build out metadata and resource annotations that are required by
DirectX and Vulkan drivers in order to provide correct data bindings
for shader execution.

This patch adds some of the required data for unordered-access-views
(UAV) resource binding into the module flags. This data will evolve
over time to cover all the required use cases, but this should get
things started.

Depends on D130018.

Differential Revision: https://reviews.llvm.org/D130019
2022-08-01 11:19:43 -05:00
Dominik Adamski d90b7bf2c5 Add support for lowering simd if clause to LLVM IR
Scope of changes:
  1) Added new function to generate loop versioning
  2) Added support for if clause to applySimd function
  3) Added tests which confirm that lowering is successful

If ifCond is specified, then the collapsed loop is duplicated and an if branch
is added. The duplicated loop is executed if the simd ifCond evaluates to false.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D129368

Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
2022-08-01 04:43:32 -05:00
Chuanqi Xu bacdf80f42 Use @llvm.threadlocal.address intrinsic to access TLS variable
This is the successor to D125291. This revision uses
@llvm.threadlocal.address in clang to access TLS variables. The reason
why the OpenMP tests contain a lot of changes is that they use
utils/update_cc_test_checks.py to update their tests.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129833
2022-08-01 11:05:00 +08:00
Jun Zhang 3da1395383 [CodeGen][NFC] Use isa_and_nonnull instead of explicit check
Signed-off-by: Jun Zhang <jun@junz.org>
2022-07-31 13:03:24 +08:00
skc7 09c4121123 Revert "Revert "[Clang][Attribute] Introduce maybe_undef attribute for function arguments which accepts undef values""
This reverts commit 4e1fe96.

Revert this commit and fix the tests that failed due to
a35c64c.
2022-07-29 19:07:07 +00:00
Amy Kwan 4e1fe968c9 Revert "[Clang][Attribute] Introduce maybe_undef attribute for function arguments which accepts undef values"
This reverts commit a35c64ce23.

Reverting this commit as it causes various failures on LE and BE PPC bots.
2022-07-29 13:28:48 -05:00
skc7 a35c64ce23 [Clang][Attribute] Introduce maybe_undef attribute for function arguments which accepts undef values
Add the ability to put __attribute__((maybe_undef)) on function arguments.
Clang codegen introduces a freeze instruction on the argument.

Differential Revision: https://reviews.llvm.org/D130224
2022-07-29 02:27:26 +00:00
Shafik Yaghmour b364535304 [Clang] Diagnose ill-formed constant expression when setting a non fixed enum to a value outside the range of the enumeration values
DR2338 clarified that it was undefined behavior to set the value outside the
range of the enumeration values for an enum without a fixed underlying type.

We should diagnose this in a constant expression context.

Differential Revision: https://reviews.llvm.org/D130058
2022-07-28 15:27:50 -07:00
David Blaikie 4e719e0f16 DebugInfo: Prefer vtable homing over ctor homing.
Vtables will be emitted in fewer places than ctors (every ctor
references the vtable, so at worst it's the same places - but at best
the type has a non-inline key function and the vtable is emitted in one
place)

Pulling this fix out of 517bbc64db which
was reverted in 4821508d4d
2022-07-28 00:07:35 +00:00
Shafik Yaghmour 28cd7f86ed Revert "[Clang] Diagnose ill-formed constant expression when setting a non fixed enum to a value outside the range of the enumeration values"
This reverts commit a3710589f2.
2022-07-27 15:31:41 -07:00
Shafik Yaghmour a3710589f2 [Clang] Diagnose ill-formed constant expression when setting a non fixed enum to a value outside the range of the enumeration values
DR2338 clarified that it was undefined behavior to set the value outside the
range of the enumeration values for an enum without a fixed underlying type.

We should diagnose this in a constant expression context.

Differential Revision: https://reviews.llvm.org/D130058
2022-07-27 14:59:35 -07:00
Matheus Izvekov 15f3cd6bfc [clang] Implement ElaboratedType sugaring for types written bare
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.

The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.

An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to a pre-existing bug in the TypeLoc buffer
handling.

---

Troubleshooting list to deal with any breakage seen with this patch:

1) The most likely effect one would see by this patch is a change in how
   a type is printed. The type printer will, by design and default,
   print types as written. There are customization options there, but
   not that many, and they mainly apply to how to print a type that we
   somehow failed to track how it was written. This patch fixes a
   problem where we failed to distinguish between a type
   that was written without any elaborated-type qualifiers,
   such as 'struct'/'class' tags and name specifiers such as 'std::',
   and one that has been stripped of any 'metadata' that identifies such,
   the so-called canonical types.
   Example:
   ```
   namespace foo {
     struct A {};
     A a;
   };
   ```
   If one were to print the type of `foo::a`, prior to this patch, this
   would result in `foo::A`. This is how the type printer would have,
   by default, printed the canonical type of A as well.
   As soon as you add any name qualifiers to A, the type printer would
   suddenly start accurately printing the type as written. This patch
   will make it print it accurately even when written without
   qualifiers, so we will just print `A` for the initial example, as
   the user did not really write that `foo::` namespace qualifier.

2) This patch could expose a bug in some AST matcher. Matching types
   is harder to get right when there is sugar involved. For example,
   if you want to match a type against being a pointer to some type A,
   then you have to account for getting a type that is sugar for a
   pointer to A, or being a pointer to sugar to A, or both! Usually
   you would get the second part wrong, and this would work for a
   very simple test where you don't use any name qualifiers, but
   you would discover it is broken when you do. The usual fix is to
   either use the matcher which strips sugar, which is annoying
   to use as for example if you match an N level pointer, you have
   to put N+1 such matchers in there, beginning to end and between
   all those levels. But in a lot of cases, if the property you want
   to match is present in the canonical type, it's easier and faster
   to just match on that... This goes with what is said in 1), if
   you want to match against the name of a type, and you want
   the name string to be something stable, perhaps matching on
   the name of the canonical type is the better choice.

3) This patch could expose a bug in how you get the source range of some
   TypeLoc. For some reason, a lot of code is using getLocalSourceRange(),
   which only looks at the given TypeLoc node. This patch introduces a new,
   and more common TypeLoc node which contains no source locations on itself.
   This is not an innovation here, and some other, more rare TypeLoc nodes could
   also have this property, but if you use getLocalSourceRange on them, it's not
   going to return any valid locations, because it doesn't have any. The right fix
   here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive
   into the inner TypeLoc to get the source range if it doesn't find it on the
   top level one. You can use getLocalSourceRange if you are really into
   micro-optimizations and you have some outside knowledge that the TypeLocs you are
   dealing with will always include some source location.

4) This patch could expose a bug somewhere in the use of the normal clang type class API, where you
   have some type, you want to see if that type is some particular kind, you try a
   `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an
   ElaboratedType which has a TypedefType inside of it, which is what you wanted to match.
   Again, like 2), this would usually have been tested poorly with some simple tests with
   no qualifications, and would have been broken had there been any other kind of type sugar,
   be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType.
   The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper
   into the type. Or use `getAsAdjusted` when dealing with TypeLocs.
   For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast.

5) It could be a bug in this patch perhaps.
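
The difference described in item 4 between a plain `dyn_cast` and `getAs` can be illustrated with a toy model of type sugar. This is a sketch under loud assumptions: the `Type`, `TypedefType`, and `ElaboratedType` structs below are simplified stand-ins and not clang's classes, and `castOuterOnly` and `getAsLike` are hypothetical helpers that only mimic the behaviour described in the text.

```cpp
#include <cassert>

// Toy type hierarchy; these are NOT clang's classes, only a model of sugar.
struct Type {
  virtual ~Type() = default;
  // Returns the type one level of sugar down, or nullptr if there is none.
  virtual const Type *desugarOnce() const { return nullptr; }
};

struct TypedefType : Type {};

// Sugar node that wraps another type without changing its meaning.
struct ElaboratedType : Type {
  const Type *Named;
  explicit ElaboratedType(const Type *T) : Named(T) {}
  const Type *desugarOnce() const override { return Named; }
};

// Like a plain dyn_cast: inspects only the outermost node.
template <typename T> const T *castOuterOnly(const Type *Ty) {
  return dynamic_cast<const T *>(Ty);
}

// Like getAs as described in item 4: peels sugar until a T is found.
template <typename T> const T *getAsLike(const Type *Ty) {
  for (; Ty; Ty = Ty->desugarOnce())
    if (const T *Found = dynamic_cast<const T *>(Ty))
      return Found;
  return nullptr;
}

// True if the outermost-only cast sees the typedef through the sugar.
bool outerCastFindsTypedef() {
  TypedefType TD;
  ElaboratedType ET(&TD);
  return castOuterOnly<TypedefType>(&ET) != nullptr;
}

// True if the sugar-peeling lookup reaches the wrapped typedef.
bool getAsFindsTypedef() {
  TypedefType TD;
  ElaboratedType ET(&TD);
  const TypedefType *Found = getAsLike<TypedefType>(&ET);
  assert(Found == nullptr || Found == &TD);
  return Found == &TD;
}
```

In this model the outermost-only cast misses the typedef as soon as an `ElaboratedType` wrapper appears, while the sugar-peeling lookup still finds it, which matches the breakage pattern the troubleshooting list warns about.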

Let me know if you need any help!

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D112374
2022-07-27 11:10:54 +02:00
Argyrios Kyrtzidis 8dfaecc4c2 [CGDebugInfo] Access the current working directory from the `VFS`
...instead of calling `llvm::sys::fs::current_path()` directly.

Differential Revision: https://reviews.llvm.org/D130443
2022-07-26 13:48:39 -07:00
Fangrui Song de1b5c9145 [AArch64] Simplify BTI/PAC-RET module flags
These module flags use the Min merge behavior with a default value of
zero, so we don't need to emit them if zero.

Reviewed By: danielkiss

Differential Revision: https://reviews.llvm.org/D130145
2022-07-26 09:48:36 -07:00
Stefan Gränitz 1e30820483 [WinEH] Apply funclet operand bundles to nounwind intrinsics that lower to function calls in the course of IR transforms
WinEHPrepare marks any function call from EH funclets as unreachable, if it's not a nounwind intrinsic or has no proper funclet bundle operand. This
affects ARC intrinsics on Windows, because they are lowered to regular function calls in the PreISelIntrinsicLowering pass. It caused silent binary truncations and crashes during unwinding with the GNUstep ObjC runtime: https://github.com/gnustep/libobjc2/issues/222

This patch adds a new function `llvm::IntrinsicInst::mayLowerToFunctionCall()` that aims to collect all affected intrinsic IDs.
* Clang CodeGen uses it to determine whether or not it must emit a funclet bundle operand.
* PreISelIntrinsicLowering asserts that the function returns true for all ObjC runtime calls it lowers.
* LLVM uses it to determine whether or not a funclet bundle operand must be propagated to inlined call sites.

Reviewed By: theraven

Differential Revision: https://reviews.llvm.org/D128190
2022-07-26 17:52:43 +02:00
Arthur Eubanks 2eade1dba4 [WPD] Use new llvm.public.type.test intrinsic for potentially publicly visible classes
Turning on opaque pointers has uncovered an issue with WPD where we currently pattern match away `assume(type.test)` in WPD so that a later LTT doesn't resolve the type test to undef and introduce an `assume(false)`. The pattern matching can fail in cases where we transform two `assume(type.test)`s into `assume(phi(type.test.1, type.test.2))`.

Currently we create `assume(type.test)` for all virtual calls that might be devirtualized. This is to support `-Wl,--lto-whole-program-visibility`.

To prevent this, all virtual calls that may not be in the same LTO module instead use a new `llvm.public.type.test` intrinsic in place of the `llvm.type.test`. Then when we know if `-Wl,--lto-whole-program-visibility` is passed or not, we can either replace all `llvm.public.type.test` with `llvm.type.test`, or replace all `llvm.public.type.test` with `true`. This prevents WPD from trying to pattern match away `assume(type.test)` for public virtual calls when failing the pattern matching will result in miscompiles.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D128955
2022-07-26 08:01:08 -07:00
Kazu Hirata 3f3930a451 Remove redundant virtual specifiers (NFC)
Identified with tidy-modernize-use-override.
2022-07-25 23:00:59 -07:00
Jun Zhang 58c9480845 [CodeGen] Consider MangleCtx when move lazy emission States
Also move MangleCtx when moving some lazy emission states in
CodeGenModule. Without this patch clang-repl hits an invalid address
access when passing `-Xcc -O2` flag.

Signed-off-by: Jun Zhang <jun@junz.org>

Differential Revision: https://reviews.llvm.org/D130420
2022-07-26 12:34:03 +08:00
Kazu Hirata 95a932fb15 Remove redundant override specifiers (NFC)
Identified with modernize-use-override.
2022-07-24 22:28:11 -07:00
Kazu Hirata 3650615fb2 [clang] Remove unused forward declarations (NFC) 2022-07-24 20:51:06 -07:00
David Chisnall 94c3b16978 Fix crash in ObjC codegen introduced with 5ab6ee7599
5ab6ee7599 assumed that if `RValue::isScalar()` returns true then `RValue::getScalarVal` will return a valid value.  This is not the case when the return value is `void` and so void message returns would crash if they hit this path.  This is triggered only for cases where the nil-handling path needs to do something non-trivial (destroy arguments that should be consumed by the callee).

Reviewed By: triplef

Differential Revision: https://reviews.llvm.org/D123898
2022-07-24 13:59:45 +01:00
Dmitri Gribenko aba43035bd Use llvm::sort instead of std::sort where possible
llvm::sort is beneficial even when we use the iterator-based overload,
since it can optionally shuffle the elements (to detect
non-determinism). However llvm::sort is not usable everywhere, for
example, in compiler-rt.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D130406
2022-07-23 15:19:05 +02:00
Jun Zhang 1a3a2eec71 [NFC] Move function definition to cpp file
Signed-off-by: Jun Zhang <jun@junz.org>
2022-07-23 13:43:42 +08:00
Shangwu Yao 31d8dbd1e5 [CUDA/SPIR-V] Force passing aggregate type byval
This patch forces aggregate-type kernel arguments to be copied by value when
compiling CUDA targeting SPIR-V. The original behavior was to not pass by value
when a user-defined destructor, copy constructor, or move constructor is
present. This patch makes the behavior of SPIR-V generated from CUDA follow
the CUDA spec
(https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#global-function-argument-processing),
and matches the NVPTX
implementation (
41958f76d8/clang/lib/CodeGen/TargetInfo.cpp (L7241)).

Differential Revision: https://reviews.llvm.org/D130387
2022-07-22 20:30:15 +00:00
Sergei Barannikov 37502e042f [clang][CodeGen] Only include ABIInfo.h where required (NFC)
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D130322
2022-07-22 10:45:02 -07:00
Iain Sandoe afda39a566 re-land [C++20][Modules] Build module static initializers per P1874R1.
The re-land fixes module map module dependencies seen on Greendragon, but
not in the clang test suite.

---

Currently we only implement this for the Itanium ABI since the correct
mangling for the initializers in other ABIs is not yet known.

Intended result:

For a module interface [which includes partition interface and implementation
units] (instead of the generic CXX initializer) we emit a module init that:

 - wraps the contained initializations in a control variable to ensure that
   the inits only happen once, even if a module is imported many times by
   imports of the main unit.

 - calls module initializers for imported modules first.  Note that the
   order of module import is not significant, and therefore neither is the
   order of imported module initializers.

 - We then call initializers for the Global Module Fragment (if present)
 - We then call initializers for the current module.
 - We then call initializers for the Private Module Fragment (if present)

For a module implementation unit, or a non-module TU that imports at least one
module we emit a regular CXX init that:

 - Calls the initializers for any imported modules first.
 - Then proceeds as normal with remaining inits.

For all module unit kinds we include a global constructor entry, this allows
for the (in most cases unusual) possibility that a module object could be
included in a final binary without a specific call to its initializer.
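
The ordering described above can be modeled with ordinary functions. This is a sketch only: `init_module_foo`, `init_module_bar`, and `init_log` are hypothetical names, and the real initializers are compiler-generated with ABI-specific mangled names.

```cpp
#include <string>
#include <vector>

// Records the order in which initializers run, for illustration only.
inline std::vector<std::string> &init_log() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-in for the initializer of an imported module "Bar".
inline void init_module_bar() {
  static bool guard = false; // control variable: the body runs only once
  if (guard)
    return;
  guard = true;
  init_log().push_back("Bar");
}

// Hypothetical stand-in for the initializer of a module interface "Foo"
// that imports "Bar".
inline void init_module_foo() {
  static bool guard = false;
  if (guard)
    return;
  guard = true;
  init_module_bar();               // imported modules first
  init_log().push_back("Foo.gmf"); // Global Module Fragment, if present
  init_log().push_back("Foo");     // the current module
  init_log().push_back("Foo.pmf"); // Private Module Fragment, if present
}
```

Calling `init_module_foo()` more than once, or calling `init_module_bar()` again afterwards, adds nothing further to the log because each guard is already set.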

Implementation:

 - We provide the module pointer in the AST Context so that CodeGen can act
   on it and its sub-modules.

 - We need to account for module build lines like this:
  ` clang -cc1 -std=c++20 Foo.pcm -emit-obj -o Foo.o` or
  ` clang -cc1 -std=c++20 -xc++-module Foo.cpp -emit-obj -o Foo.o`

 - in order to do this, we add to ParseAST to set the module pointer in
   the ASTContext, once we establish that this is a module build and we
   know the module pointer. To be able to do this, we make the query for
   current module public in Sema.

 - In CodeGen, we determine if the current build requires a CXX20-style module
   init and, if so, we defer any module initializers during the "Eagerly
   Emitted" phase.

 - We then walk the module initializers at the end of the TU but before
   emitting deferred inits (which adds any hidden and static ones, fixing
   https://github.com/llvm/llvm-project/issues/51873 ).

 - We then proceed to emit the deferred inits and continue to emit the CXX
   init function.

Differential Revision: https://reviews.llvm.org/D126189
2022-07-22 08:38:07 +01:00
Shraiysh Vaishay 61fa7a88c7 [clang][OpenMP] Add IRBuilder support for taskgroup
This patch makes use of OMPIRBuilder support for codegen of taskgroup
construct in clang.

Depends on D128203

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D129992
2022-07-21 11:13:57 +05:30
Fangrui Song 23ba688f02 [X86] Use Min behavior for cf-protection-{return,branch}/ibt-seal module flags
These features require that all object files are compiled with the support. When
the feature is disabled for an object file, the merge behavior should treat the
file as having a value of 0 (see D129911).
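
A minimal model of this merge rule, assuming each object file either carries the flag with a value or lacks it entirely. `FlagValue` and `merge_min` are hypothetical names and do not correspond to LLVM's module-flag API.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical record of one object file's module flag.
struct FlagValue {
  bool present;   // false when the feature was disabled for this file
  unsigned value; // meaningful only when present is true
};

// Merge with "min" behavior: an absent flag counts as 0, so the feature
// stays enabled only if every object file enabled it.
inline unsigned merge_min(const std::vector<FlagValue> &flags) {
  if (flags.empty())
    return 0;
  unsigned result = flags.front().present ? flags.front().value : 0;
  for (const FlagValue &f : flags)
    result = std::min(result, f.present ? f.value : 0u);
  return result;
}
```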

Reviewed By: xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D130065
2022-07-19 21:20:02 -07:00
serge-sans-paille f764dc99b3 [clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays
Some code [0] considers trailing arrays to be flexible, whatever their size.
Support for this legacy code was introduced in
f8f6324983, but it prevents evaluation of
__builtin_object_size and __builtin_dynamic_object_size in some legitimate cases.

Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is
desirable.

n = 0: current behavior, any trailing array member is a flexible array. The default.
n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member
n = 2: any trailing array member of undefined or 0 size is a flexible array member
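
The three levels can be written as a small predicate. This is a sketch of the rule as listed, not clang's implementation: `is_flexible_array_member` and `kUnsized` are hypothetical names, the macro and layout caveats are not modeled, and returning false for levels outside 0 to 2 is an assumption.

```cpp
// kUnsized stands for a trailing array declared without a bound (`T a[]`).
constexpr long kUnsized = -1;

// Returns true if a trailing array member with declared bound `size` is
// treated as a flexible array member at the given strictness level.
inline bool is_flexible_array_member(int level, long size) {
  switch (level) {
  case 0:
    return true; // any trailing array member
  case 1:
    return size == kUnsized || size == 0 || size == 1;
  case 2:
    return size == kUnsized || size == 0;
  default:
    return false; // assumption: levels not listed above are not modeled
  }
}
```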

This takes into account two clang-specific behaviors: an array bound written as
a macro id disqualifies a member from being a FAM, as does non-standard layout.

A similar patch for gcc is discussed here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

[0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions
2022-07-18 12:45:52 +02:00
Fangrui Song 0d5a62faca [sanitizer] Add "mainfile" prefix to sanitizer special case list
When an issue exists in the main file (caller) instead of an included file
(callee), using a `src` pattern applying to the included file may be
inappropriate if it's the caller's responsibility. Add `mainfile` prefix to check
the main filename.

For the example below, the issue may reside in a.c (foo should not be called
with a misaligned pointer or foo should switch to an unaligned load), but with
`src` we can only apply to the innocent callee a.h. With this patch we can use
the more appropriate `mainfile:a.c`.
```
//--- a.h
// internal linkage
static inline int load(int *x) { return *x; }

//--- a.c, -fsanitize=alignment
#include "a.h"
int foo(void *x) { return load(x); }
```

See the updated clang/docs/SanitizerSpecialCaseList.rst for a caveat due
to C++ vague linkage functions.

Reviewed By: #sanitizers, kstoimenov, vitalybuka

Differential Revision: https://reviews.llvm.org/D129832
2022-07-15 10:39:26 -07:00
Nikita Popov 2a721374ae [IR] Don't use blockaddresses as callbr arguments
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:

    ; Before:
    %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
    to label %asm.fallthrough [label %foo]
    ; After:
    %res = callbr i8* asm "", "=r,r,!i"(i8* %x)
    to label %asm.fallthrough [label %foo]

The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:

* Allow unrolling/peeling/rotation of callbr, or any other
  clone-based optimizations
  (https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
  (https://github.com/llvm/llvm-project/issues/45248)

This is just the IR representation change though, I will follow up
with patches to remove limitations in various transformation passes
that are no longer needed.

Differential Revision: https://reviews.llvm.org/D129288
2022-07-15 10:18:17 +02:00
Jonas Devlieghere 888673b6e3
Revert "[clang] Implement ElaboratedType sugaring for types written bare"
This reverts commit 7c51f02eff because it
still breaks the LLDB tests. This was re-landed without addressing the
issue or even agreement on how to address the issue. More details and
discussion in https://reviews.llvm.org/D112374.
2022-07-14 21:17:48 -07:00
Matheus Izvekov 7c51f02eff
[clang] Implement ElaboratedType sugaring for types written bare
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.

The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.

An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.

---

Troubleshooting list to deal with any breakage seen with this patch:

1) The most likely effect one would see by this patch is a change in how
   a type is printed. The type printer will, by design and default,
   print types as written. There are customization options there, but
   not that many, and they mainly apply to how to print a type that we
   somehow failed to track how it was written. This patch fixes a
   problem where we failed to distinguish between a type
   that was written without any elaborated-type qualifiers,
   such as 'struct'/'class' tags and name specifiers such as 'std::',
   and one that has been stripped of any 'metadata' that identifies such,
   the so called canonical types.
   Example:
   ```
   namespace foo {
     struct A {};
     A a;
   };
   ```
   If one were to print the type of `foo::a`, prior to this patch, this
   would result in `foo::A`. This is how the type printer would have,
   by default, printed the canonical type of A as well.
   As soon as you add any name qualifiers to A, the type printer would
   suddenly start accurately printing the type as written. This patch
   will make it print it accurately even when written without
   qualifiers, so we will just print `A` for the initial example, as
   the user did not really write that `foo::` namespace qualifier.

2) This patch could expose a bug in some AST matcher. Matching types
   is harder to get right when there is sugar involved. For example,
   if you want to match a type against being a pointer to some type A,
   then you have to account for getting a type that is sugar for a
   pointer to A, or being a pointer to sugar to A, or both! Usually
   you would get the second part wrong, and this would work for a
   very simple test where you don't use any name qualifiers, but
   you would discover it is broken when you do. The usual fix is to
   either use the matcher which strips sugar, which is annoying
   to use as for example if you match an N level pointer, you have
   to put N+1 such matchers in there, beginning to end and between
   all those levels. But in a lot of cases, if the property you want
   to match is present in the canonical type, it's easier and faster
   to just match on that... This goes with what is said in 1), if
   you want to match against the name of a type, and you want
   the name string to be something stable, perhaps matching on
   the name of the canonical type is the better choice.

3) This patch could expose a bug in how you get the source range of some
   TypeLoc. For some reason, a lot of code is using getLocalSourceRange(),
   which only looks at the given TypeLoc node. This patch introduces a new,
   and more common TypeLoc node which contains no source locations on itself.
   This is not an innovation here, and some other, more rare TypeLoc nodes could
   also have this property, but if you use getLocalSourceRange on them, it's not
   going to return any valid locations, because it doesn't have any. The right fix
   here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive
   into the inner TypeLoc to get the source range if it doesn't find it on the
   top level one. You can use getLocalSourceRange if you are really into
   micro-optimizations and you have some outside knowledge that the TypeLocs you are
   dealing with will always include some source location.

4) Exposed a bug somewhere in the use of the normal clang type class API, where you
   have some type, you want to see if that type is some particular kind, you try a
   `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an
   ElaboratedType which has a TypedefType inside of it, which is what you wanted to match.
   Again, like 2), this would usually have been tested poorly with some simple tests with
   no qualifications, and would have been broken had there been any other kind of type sugar,
   be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType.
   The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper
   into the type. Or use `getAsAdjusted` when dealing with TypeLocs.
   For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast.

5) It could be a bug in this patch perhaps.
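
The difference described in 4) can be shown with a toy model. These are not clang's classes: `ToyType`, `cast_top_level`, and `get_as` are hypothetical stand-ins for a type node, a `dyn_cast`-style check, and a `getAs`-style lookup.

```cpp
// Toy model: a type node is either a typedef type or an elaborated-type
// sugar node wrapping another type.
enum class Kind { Typedef, Elaborated };

struct ToyType {
  Kind kind;
  const ToyType *inner; // wrapped type for sugar nodes, nullptr otherwise
};

// Like dyn_cast: inspects only the outermost node.
inline const ToyType *cast_top_level(const ToyType *t, Kind k) {
  return t->kind == k ? t : nullptr;
}

// Like getAs: looks through sugar until a node of the wanted kind is found.
inline const ToyType *get_as(const ToyType *t, Kind k) {
  for (; t != nullptr; t = t->inner)
    if (t->kind == k)
      return t;
  return nullptr;
}
```

With an elaborated node wrapping a typedef node, the top-level check fails while the sugar-stripping lookup finds the typedef, which is the failure mode the item describes.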

Let me know if you need any help!

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D112374
2022-07-15 04:16:55 +02:00
Ellis Hoag af58684f27 [InstrProf] Add options to profile function groups
Add two options, `-fprofile-function-groups=N` and `-fprofile-selected-function-group=i`, used to partition functions into `N` groups and only instrument the functions in group `i`. Similar options were added to xray in https://reviews.llvm.org/D87953 and the goal is the same: to reduce instrumented size overhead by spreading the overhead across multiple builds. Raw profiles from different groups can be added like normal using the `llvm-profdata merge` command.
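
One way to picture the partitioning is a stable hash of the function name taken modulo `N`. This is a sketch under that assumption: the message does not say how clang assigns groups, and `fnv1a` and `in_selected_group` are hypothetical helpers.

```cpp
#include <cstdint>
#include <string>

// FNV-1a is used only as an example of a stable hash; the hash clang uses
// is not stated in the text above.
inline std::uint32_t fnv1a(const std::string &s) {
  std::uint32_t h = 2166136261u;
  for (unsigned char c : s) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// True if `function_name` falls in group `selected` out of `num_groups`.
// Requires num_groups > 0.
inline bool in_selected_group(const std::string &function_name,
                              std::uint32_t num_groups,
                              std::uint32_t selected) {
  return fnv1a(function_name) % num_groups == selected;
}
```

Because every name maps to exactly one group, building once per group instruments each function exactly once across the builds.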

Reviewed By: ianlevesque

Differential Revision: https://reviews.llvm.org/D129594
2022-07-14 11:41:30 -07:00
Nick Desaulniers 140bfdca60 [clang][CodeGen] add fn_ret_thunk_extern to synthetic fns
Follow up fix to
commit 2240d72f15 ("[X86] initial -mfunction-return=thunk-extern
support")
https://reviews.llvm.org/D129572

@nathanchance reported that -mfunction-return=thunk-extern was failing
to annotate the asan and tsan constructors.
https://lore.kernel.org/llvm/Ys7pLq+tQk5xEa%2FB@dev-arch.thelio-3990X/

I then noticed the same occurring for gcov synthetic functions.

Similar to
commit 2786e67 ("[IR][sanitizer] Add module flag "frame-pointer" and set
it for cc1 -mframe-pointer={non-leaf,all}")
define a new module level MetaData, "fn_ret_thunk_extern", then when set
adds the fn_ret_thunk_extern IR Fn Attr to synthetically created
Functions.

Fixes https://github.com/llvm/llvm-project/issues/56514

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D129709
2022-07-14 11:25:24 -07:00
Kazu Hirata cb2c8f694d [clang] Use value instead of getValue (NFC) 2022-07-13 23:39:33 -07:00
Joseph Huber b370be37cc [CUDA] Allow the new driver to compile CUDA in non-RDC mode
The new driver primarily allows us to support RDC-mode compilations with
proper linking. This is not needed for non-RDC mode compilation, but we
still would like the new driver to be able to handle this mode so we can
transition away from the old driver in the future. This patch adds the
necessary code to support creating a fatbinary for CUDA code generation
as well as removing old assumptions and errors about RDC-mode with the
new driver.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D129655
2022-07-13 21:49:15 -04:00
Jonas Devlieghere 3968936b92
Revert "[clang] Implement ElaboratedType sugaring for types written bare"
This reverts commit bdc6974f92 because it
breaks all the LLDB tests that import the std module.

  import-std-module/array.TestArrayFromStdModule.py
  import-std-module/deque-basic.TestDequeFromStdModule.py
  import-std-module/deque-dbg-info-content.TestDbgInfoContentDequeFromStdModule.py
  import-std-module/forward_list.TestForwardListFromStdModule.py
  import-std-module/forward_list-dbg-info-content.TestDbgInfoContentForwardListFromStdModule.py
  import-std-module/list.TestListFromStdModule.py
  import-std-module/list-dbg-info-content.TestDbgInfoContentListFromStdModule.py
  import-std-module/queue.TestQueueFromStdModule.py
  import-std-module/stack.TestStackFromStdModule.py
  import-std-module/vector.TestVectorFromStdModule.py
  import-std-module/vector-bool.TestVectorBoolFromStdModule.py
  import-std-module/vector-dbg-info-content.TestDbgInfoContentVectorFromStdModule.py
  import-std-module/vector-of-vectors.TestVectorOfVectorsFromStdModule.py

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45301/
2022-07-13 09:20:30 -07:00
Mitch Phillips 7045519359 Add missing sanitizer metadata plumbing from CFE.
clang misses attaching sanitizer metadata for external globals.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D129492
2022-07-13 08:54:41 -07:00
Mitch Phillips 90e5a8ac47 Remove 'no_sanitize_memtag'. Add 'sanitize_memtag'.
For MTE globals, we should have clang emit the attribute for all GV's
that it creates, and then use that in the upcoming AArch64 global
tagging IR pass. We need a positive attribute for this sanitizer (rather
than implicit sanitization of all globals) because it needs to interact
with other parts of LLVM, including:

  1. Suppressing certain global optimisations (like merging),
  2. Emitting extra directives by the ASM writer, and
  3. Putting extra information in the symbol table entries.

While this does technically make the LLVM IR / bitcode format
non-backwards-compatible, nobody should have used this attribute yet,
because it's a no-op.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D128950
2022-07-13 08:54:41 -07:00
Jun Zhang 8082a00286
[CodeGen] Keep track of decls that were deferred and have been emitted.
This patch adds a new field called EmittedDeferredDecls in CodeGenModule
that keeps track of decls that were deferred and have been emitted.

The intention of this patch is to solve an issue in incremental C++:
we lose info about decls that are lazily emitted when we undo their
usage.

See example below:

clang-repl> inline int foo() { return 42;}
clang-repl> int bar = foo();
clang-repl> %undo
clang-repl> int baz = foo();
JIT session error: Symbols not found: [ _Z3foov ]
error: Failed to materialize symbols: { (main, { baz, $.incr_module_2.inits.0,
orc_init_func.incr_module_2 }) }

Signed-off-by: Jun Zhang <jun@junz.org>

Differential Revision: https://reviews.llvm.org/D128782
2022-07-13 20:00:59 +08:00
Matheus Izvekov bdc6974f92
[clang] Implement ElaboratedType sugaring for types written bare
Without this patch, clang will not wrap in an ElaboratedType node types written
without a keyword and nested name qualifier, which goes against the intent that
we should produce an AST which retains enough details to recover how things are
written.

The lack of this sugar is incompatible with the intent of the type printer
default policy, which is to print types as written, but to fall back and print
them fully qualified when they are desugared.

An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still
requires pointer alignment due to pre-existing bug in the TypeLoc buffer
handling.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D112374
2022-07-13 02:10:09 +02:00
Nick Desaulniers 2240d72f15 [X86] initial -mfunction-return=thunk-extern support
Adds support for:
* `-mfunction-return=<value>` command line flag, and
* `__attribute__((function_return("<value>")))` function attribute

Where the supported <value>s are:
* keep (disable)
* thunk-extern (enable)

thunk-extern enables clang to change ret instructions into jmps to an
external symbol named __x86_return_thunk, implemented as a new
MachineFunctionPass named "x86-return-thunks", keyed off the new IR
attribute fn_ret_thunk_extern.

The symbol __x86_return_thunk is expected to be provided by the runtime
the compiled code is linked against and is not defined by the compiler.
Enabling this option alone doesn't provide mitigations without
corresponding definitions of __x86_return_thunk!

This new MachineFunctionPass is very similar to "x86-lvi-ret".

The <value>s "thunk" and "thunk-inline" are currently unsupported. It's
not clear yet that they are necessary: whether the thunk pattern they
would emit is beneficial or used anywhere.

Should the <value>s "thunk" and "thunk-inline" become necessary,
x86-return-thunks could probably be merged into x86-retpoline-thunks
which has pre-existing machinery for emitting thunks (which could be
used to implement the <value> "thunk").

Has been found to build+boot with corresponding Linux
kernel patches. This helps the Linux kernel mitigate RETBLEED.
* CVE-2022-23816
* CVE-2022-28693
* CVE-2022-29901

See also:
* "RETBLEED: Arbitrary Speculative Code Execution with Return
Instructions."
* AMD SECURITY NOTICE AMD-SN-1037: AMD CPU Branch Type Confusion
* TECHNICAL GUIDANCE FOR MITIGATING BRANCH TYPE CONFUSION REVISION 1.0
  2022-07-12
* Return Stack Buffer Underflow /
  CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702

SystemZ may eventually want to support "thunk-extern" and "thunk"; both
options are used by the Linux kernel's CONFIG_EXPOLINE.

This functionality has been available in GCC since the 8.1 release, and
was backported to the 7.3 release.

Many thanks to the folks who provided discreet review off list due to the
embargoed nature of this hardware vulnerability. Many Bothans died to
bring us this information.

Link: https://www.youtube.com/watch?v=IF6HbCKQHK8
Link: https://github.com/llvm/llvm-project/issues/54404
Link: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01197.html
Link: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html
Link: https://arstechnica.com/information-technology/2022/07/intel-and-amd-cpus-vulnerable-to-a-new-speculative-execution-attack/?comments=1
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce114c866860aa9eae3f50974efc68241186ba60
Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00702.html
Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00707.html

Reviewed By: aaron.ballman, craig.topper

Differential Revision: https://reviews.llvm.org/D129572
2022-07-12 09:17:54 -07:00
Xiang1 Zhang a45dd3d814 [X86] Support -mstack-protector-guard-symbol
Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D129346
2022-07-12 10:17:00 +08:00
Xiang1 Zhang 643786213b Revert "[X86] Support -mstack-protector-guard-symbol"
This reverts commit efbaad1c4a
because the review info was missing.
2022-07-12 10:14:32 +08:00
Xiang1 Zhang efbaad1c4a [X86] Support -mstack-protector-guard-symbol 2022-07-12 10:13:48 +08:00
Joseph Huber e88d53d25f [HIP] Generate offloading entries for HIP with the new driver.
This patch adds the small change required to output offloading entries
for HIP instead of CUDA. These should be placed in different sections
because they need to be distinct to the offloading toolchain; otherwise
we'd have HIP trying to register CUDA kernels or vice versa. This patch will
precede support for HIP in the linker wrapper.

Reviewed By: yaxunl, tra

Differential Revision: https://reviews.llvm.org/D128850
2022-07-11 15:49:21 -04:00
Mitch Phillips f18de7619e Update DynInit generation for ASan globals.
Address a follow-up TODO for Sanitizer Metadata.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D128672
2022-07-11 12:23:37 -07:00