Commit Graph

15545 Commits

Author SHA1 Message Date
Bradley Smith a83aa33d1b [IR] Move vector.insert/vector.extract out of experimental namespace
These intrinsics are now fundemental for SVE code generation and have been
present for a year and a half, hence move them out of the experimental
namespace.

Differential Revision: https://reviews.llvm.org/D127976
2022-06-27 10:48:45 +00:00
Craig Topper 016342e319 [RISCV] Evaluate ICE operands to builtins using getIntegerConstantExpr.
Some RISC-V builtins requires ICE operands. We should call
getIntegerConstantExpr instead of EmitScalarExpr to match other
targets.

This was made a little trickier by the vector intrinsics not having
a valid type string, but there are two that have ICE operands so
I specified them manually.
2022-06-26 13:51:17 -07:00
Kazu Hirata 97afce08cb [clang] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 22:26:24 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Kazu Hirata b8df4093e4 [clang, clang-tools-extra] Don't use Optional::{hasValue,getValue} (NFC) 2022-06-25 11:55:33 -07:00
Fangrui Song 572b08790a [clang] Add back -fsanitize=array-bounds workaround for size-1 array after -fstrict-flex-arrays change
Before C99 introduced flexible array member, common practice uses size-1 array
to emulate FAM, e.g. https://github.com/python/cpython/issues/94250
As a result, -fsanitize=array-bounds instrumentation skipped such structures
as a workaround (from 539e4a77bb).

D126864 accidentally dropped the workaround. Add it back with tests.
2022-06-24 22:15:47 -07:00
David Blaikie 4821508d4d Revert "DebugInfo: Fully integrate ctor type homing into 'limited' debug info"
Reverting to simplify some Google-internal rollout issues. Will recommit
in a week or two.

This reverts commit 517bbc64db.
2022-06-24 17:07:47 +00:00
Fazlay Rabbi 42bb88e2aa [OpenMP] Initial parsing and sema support for 'masked taskloop' construct
This patch gives basic parsing and semantic support for "masked taskloop"
construct introduced in OpenMP 5.1 (section 2.16.7)

Differential Revision: https://reviews.llvm.org/D128478
2022-06-24 10:00:08 -07:00
Eli Friedman e11bf8de72 [clang codegen] Add dso_local/hidden/etc. markings to VTT declarations
We were marking definitions, but not declarations. Marking declarations
makes computing the address more efficient.

Fixes issue reported at https://discourse.llvm.org/t/63090

Differential Revision: https://reviews.llvm.org/D128482
2022-06-24 09:58:31 -07:00
Yaxun (Sam) Liu 8ad4c6e4b1 [HIP] add -fhip-kernel-arg-name
Add option -fhip-kernel-arg-name to emit kernel argument
name metadata, which is needed for certain HIP applications.

Reviewed by: Artem Belevich, Fangrui Song, Brian Sumner

Differential Revision: https://reviews.llvm.org/D128022
2022-06-24 11:15:36 -04:00
serge-sans-paille 886715af96 [clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays
Some code [0] consider that trailing arrays are flexible, whatever their size.
Support for these legacy code has been introduced in
f8f6324983 but it prevents evaluation of
__builtin_object_size and __builtin_dynamic_object_size in some legit cases.

Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is
desirable.

n = 0: current behavior, any trailing array member is a flexible array. The default.
n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member
n = 2: any trailing array member of undefined or 0 size is a flexible array member
n = 3: any trailing array member of undefined size is a flexible array member (strict c99 conformance)

Similar patch for gcc discuss here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

[0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions
2022-06-24 16:13:29 +02:00
David Blaikie 517bbc64db DebugInfo: Fully integrate ctor type homing into 'limited' debug info
Simplify debug info back to just "limited" or "full" by rolling the ctor
type homing fully into the "limited" debug info.

Also fix a bug I found along the way that was causing ctor type homing
to kick in even when something could be vtable homed (where vtable
homing is stronger/more effective than ctor homing) - fixing at the same
time as it keeps the tests (that were testing only "limited non ctor"
homing and now test ctor homing) passing.
2022-06-23 20:15:00 +00:00
Jeroen Dobbelaere 8999b745bc Revert "[tbaa] Handle base classes in struct tbaa"
This reverts commit cdc59e2202.

The Verifier finds a problem in a stage2 build. Reverting so Bruno can investigate.
2022-06-23 14:18:49 +02:00
Bruno De Fraine cdc59e2202 [tbaa] Handle base classes in struct tbaa
This is a fix for the miscompilation reported in https://github.com/llvm/llvm-project/issues/55384

Not adding a new test case since existing test cases already cover base classes (including new-struct-path tbaa).

Reviewed By: jeroen.dobbelaere

Differential Revision: https://reviews.llvm.org/D126956
2022-06-23 13:39:49 +02:00
Guillaume Gomez d0a4450ecd Rename GCCBuiltin into ClangBuiltin
This patch is needed because developers expect "GCCBuiltin" items to be the GCC intrinsics equivalent and not the Clang internals.

Reviewed By: #libc_abi, RKSimon, xbolva00

Differential Revision: https://reviews.llvm.org/D127460
2022-06-22 19:49:20 +01:00
Serge Pavlov 706e89db97 Fix interaction of pragma FENV_ACCESS with other pragmas
Previously `#pragma STDC FENV_ACCESS ON` always set dynamic rounding
mode and strict exception handling. It is not correct in the presence
of other pragmas that also modify rounding mode and exception handling.
For example, the effect of previous pragma FENV_ROUND could be
cancelled, which is not conformant with the C standard. Also
`#pragma STDC FENV_ACCESS OFF` turned off only FEnvAccess flag, leaving
rounding mode and exception handling unchanged, which is incorrect in
general case.

Concrete rounding and exception mode depend on a combination of several
factors like various pragmas and command-line options. During the review
of this patch an idea was proposed that the semantic actions associated
with such pragmas should only set appropriate flags. Actual rounding
mode and exception handling should be calculated taking into account the
state of all relevant options. In such implementation the pragma
FENV_ACCESS should not override properties set by other pragmas but
should set them if such setting is absent.

To implement this approach the following main changes are made:

- Field `FPRoundingMode` is removed from `LangOptions`. Actually there
  are no options that set it to arbitrary rounding mode, the choice was
  only `dynamic` or `tonearest`. Instead, a new boolean flag
  `RoundingMath` is added, with the same meaning as the corresponding
  command-line option.

- Type `FPExceptionModeKind` now has possible value `FPE_Default`. It
  does not represent any particular exception mode but indicates that
  such mode was not set and default value should be used. It allows to
  distinguish the case:

    {
        #pragma STDC FENV_ACCESS ON
	...
    }

  where the pragma must set FPE_Strict, from the case:

    {
        #pragma clang fp exceptions(ignore)
        #pragma STDC FENV_ACCESS ON
        ...
    }

  where exception mode should remain `FPE_Ignore`.

  - Class `FPOptions` has now methods `getRoundingMode` and
  `getExceptionMode`, which calculates the respective properties from
  other specified FP properties.

  - Class `LangOptions` has now methods `getDefaultRoundingMode` and
  `getDefaultExceptionMode`, which calculates default modes from the
  specified options and should be used instead of `getRoundingMode` and
  `getFPExceptionMode` of the same class.

Differential Revision: https://reviews.llvm.org/D126364
2022-06-22 15:13:54 +07:00
Kazu Hirata ca4af13e48 [clang] Don't use Optional::getValue (NFC) 2022-06-20 22:59:26 -07:00
Kazu Hirata 064a08cd95 Don't use Optional::hasValue (NFC) 2022-06-20 20:05:16 -07:00
Kazu Hirata ad7ce1e769 Don't use Optional::hasValue (NFC) 2022-06-20 11:49:10 -07:00
Kazu Hirata 452db157c9 [clang] Don't use Optional::hasValue (NFC) 2022-06-20 10:51:34 -07:00
Kazu Hirata 06decd0b41 [clang] Use value_or instead of getValueOr (NFC) 2022-06-18 23:21:34 -07:00
Jun Zhang cd64a427ef
Reland "[CodeGen] Keep track info of lazy-emitted symbols in ModuleBuilder"
This reverts commits:
d3ddc251ac
d90eecff5c

It turned out there're some options turned on that leaks the memory
intentionally, which fires the asan builds after the patch being
applied. The issue has been fixed in
7bc00ce5cd, so reland it.

Below is the original commit message:

The intent of this patch is to selectively carry some states over to
the Builder so we won't lose the information of the previous symbols.

This used to be several downstream patches of Cling, it aims to fix
errors in Clang Interpreter when trying to use inline functions.
Before this patch:

clang-repl> inline int foo() { return 42;}
clang-repl> int x = foo();

JIT session error: Symbols not found: [ _Z3foov ]
error: Failed to materialize symbols:
{ (main, { x, $.incr_module_1.__inits.0, __orc_init_func.incr_module_1 }) }

Co-authored-by: Axel Naumann <Axel.Naumann@cern.ch>
Signed-off-by: Jun Zhang <jun@junz.org>
2022-06-18 20:27:21 +08:00
Pavel Iliin 6e070c3c91 [NFC] Specifing clang namespace for builtins. 2022-06-18 10:44:25 +01:00
Akira Hatanaka 8fc3d719ee Stop wrapping GCCAsmStmts inside StmtExprs to destruct temporaries
Instead, just pop the cleanups at the end of the asm statement.

This fixes an assertion failure in BuildStmtExpr. It also fixes a bug
where blocks and C compound literals were destructed at the end of the
asm statement instead of at the end of the enclosing scope.

Differential Revision: https://reviews.llvm.org/D125936
2022-06-17 17:28:00 -07:00
Jennifer Yu bb83f8e70b [OpenMP] Initial parsing and sema for 'parallel masked' construct
Differential Revision: https://reviews.llvm.org/D127454
2022-06-16 18:01:15 -07:00
Lei Huang dba2ff500d fix x86 sanitizer failure due to use of or 2022-06-16 17:20:31 -05:00
Maryam Moghadas a9ddb7d54e [PowerPC] Fixing implicit castings in altivec for -fno-lax-vector-conversions
XL considers different vector types to be incompatible with each other.
For example assignment between variables of types vector float and vector
long long or even vector signed int and vector unsigned int are diagnosed.
clang, however does not diagnose such cases and does a simple bitcast between
the two types. This could easily result in program errors. This patch is to
fix the implicit casts in altivec.h so that there is no incompatible vector
type errors whit -fno-lax-vector-conversions, this is the prerequisite patch
to switch the default to -fno-lax-vector-conversions later.

Reviewed By: nemanjai, amyk

Differential Revision: https://reviews.llvm.org/D124093
2022-06-16 17:07:03 -05:00
Arthur Eubanks a70b39abff [clang] Don't emit type test/assume for virtual classes that should never participate in WPD
Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D127876
2022-06-16 09:38:14 -07:00
Jun Zhang 44f0a2658d
Revert "Reland "[CodeGen] Keep track info of lazy-emitted symbols in ModuleBuilder""
This reverts commit 781ee538da.

Asan build is still broken :(
2022-06-14 19:53:17 +08:00
Guillaume Chatelet d9b8d13f8b [NFC][Alignment] Use MaybeAlign in CGCleanup/CGExpr 2022-06-14 10:56:36 +00:00
Jun Zhang 781ee538da
Reland "[CodeGen] Keep track info of lazy-emitted symbols in ModuleBuilder"
This reverts commits:
d3ddc251ac
d90eecff5c

This relands below commit with asan fix:

The intent of this patch is to selectively carry some states over to
the Builder so we won't lose the information of the previous symbols.

This used to be several downstream patches of Cling, it aims to fix
errors in Clang Interpreter when trying to use inline functions.
Before this patch:

clang-repl> inline int foo() { return 42;}
clang-repl> int x = foo();

JIT session error: Symbols not found: [ _Z3foov ]
error: Failed to materialize symbols:
{ (main, { x, $.incr_module_1.__inits.0, __orc_init_func.incr_module_1 }) }

Co-authored-by: Axel Naumann <Axel.Naumann@cern.ch>
Signed-off-by: Jun Zhang <jun@junz.org>

Differential Revision: https://reviews.llvm.org/D127730
2022-06-14 18:36:03 +08:00
Chuanqi Xu 735e6c40b5 [Coroutines] Convert coroutine.presplit to enum attr
This is required by @nikic in https://reviews.llvm.org/D127383 to
decrease the cost to check whether a function is a coroutine and this
fixes a FIXME too.

Reviewed By: rjmccall, ezhulenev

Differential Revision: https://reviews.llvm.org/D127471
2022-06-14 14:23:46 +08:00
Mitch Phillips 77475ffd22 Reland "Add sanitizer metadata attributes to clang IR gen."
RE-LAND (reverts a revert):
This reverts commit 8e1f47b596.

This patch adds generation of sanitizer metadata attributes (which were
added in D126100) to the clang frontend.

We still currently generate the llvm.asan.globals that's consumed by
the IR pass, but the plan is to eventually migrate off of that onto
purely debuginfo and these IR attributes.

Reviewed By: vitalybuka, kstoimenov

Differential Revision: https://reviews.llvm.org/D126929
2022-06-13 12:23:27 -07:00
Mitch Phillips 8e1f47b596 Revert "Add sanitizer metadata attributes to clang IR gen."
This reverts commit e7766972a6.

Broke the Windows buildbots.
2022-06-13 12:11:13 -07:00
Mitch Phillips e7766972a6 Add sanitizer metadata attributes to clang IR gen.
This patch adds generation of sanitizer metadata attributes (which were
added in D126100) to the clang frontend.

We still currently generate the `llvm.asan.globals` that's consumed by
the IR pass, but the plan is to eventually migrate off of that onto
purely debuginfo and these IR attributes.

Reviewed By: vitalybuka, kstoimenov

Differential Revision: https://reviews.llvm.org/D126929
2022-06-13 11:19:15 -07:00
David Tenty 6a8673038b Reland [clang][AIX] add option mdefault-visibility-export-mapping
The option mdefault-visibility-export-mapping is created to allow
mapping default visibility to an explicit shared library export
(e.g. dllexport). Exactly how and if this is manifested is target
dependent (since it depends on how they map dllexport in the IR).

Three values are provided for the option:

* none: the default and behavior without the option, no additional export linkage information is created.
* explicit: add the export for entities with explict default visibility from the source, including RTTI
* all: add the export for all entities with default visibility

This option is useful for targets which do not export symbols as part of
their usual default linkage behaviour (e.g. AIX), such targets
traditionally specified such information in external files (e.g. export
lists), but this mapping allows them to use the visibility information
typically used for this purpose on other (e.g. ELF) platforms.

This relands commit: 8c8a2679a2

with fixes for the compile time and assert problems that were reported
by:

* making shouldMapVisibilityToDLLExport inline and provide an early return
in the case where no mapping is in effect (aka non-AIX platforms)
* don't try to export RTTI types which we will give internal linkage to

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D126340
2022-06-13 13:43:46 -04:00
Mitch Phillips d3ddc251ac Revert "[CodeGen] Keep track info of lazy-emitted symbols in ModuleBuilder"
This reverts commit b8f9459715.

Broke the ASan buildbot. See https://reviews.llvm.org/D126781 for more
information.
2022-06-13 10:12:38 -07:00
Mitch Phillips d90eecff5c Revert "Also move WeakRefReferences in CodeGenModule::moveLazyEmssionStates"
This reverts commit 0ecbedc098.

Parent change broke the ASan buildbot. See
https://reviews.llvm.org/D126781 for more information.
2022-06-13 10:12:38 -07:00
Jez Ng d4bcb45db7 [MC][re-land] Omit DWARF unwind info if compact unwind is present where eligible
This reverts commit d941d59783.

Differential Revision: https://reviews.llvm.org/D122258
2022-06-12 17:24:19 -04:00
Nuno Lopes 4dd1bffc9d [clang][CodeGen] Switch a few placeholders from UndefValue to PoisonValue
This change is cosmetic, as these are dummy values that are not observable, but it
gets us closer to removing undef.
NFC
2022-06-12 19:07:59 +01:00
Jez Ng d941d59783 Revert "[MC] Omit DWARF unwind info if compact unwind is present where eligible"
This reverts commit ef501bf85d.
2022-06-12 10:47:08 -04:00
Jez Ng ef501bf85d [MC] Omit DWARF unwind info if compact unwind is present where eligible
Previously, omitting unnecessary DWARF unwinds was only done in two
cases:
* For Darwin + aarch64, if no DWARF unwind info is needed for all the
  functions in a TU, then the `__eh_frame` section would be omitted
  entirely. If any one function needed DWARF unwind, then MC would emit
  DWARF unwind entries for all the functions in the TU.
* For watchOS, MC would omit DWARF unwind on a per-function basis, as
  long as compact unwind was available for that function.

This diff makes it so that we omit DWARF unwind on a per-function basis
for Darwin + aarch64 as well. In addition, we introduce the flag
`--emit-dwarf-unwind=` which can toggle between `always`,
`no-compact-unwind` (only emit DWARF when CU cannot be emitted for a
given function), and the target platform `default`.  `no-compact-unwind`
is particularly useful for newer x86_64 platforms: we don't want to omit
DWARF unwind for x86_64 in general due to possible backwards compat
issues, but we should make it possible for people to opt into this
behavior if they are only targeting newer platforms.

**Motivation:** I'm working on adding support for `__eh_frame` to LLD,
but I'm concerned that we would suffer a perf hit. Processing compact
unwind is already expensive, and that's a simpler format than EH frames.
Given that MC currently produces one EH frame entry for every compact
unwind entry, I don't think processing them will be cheap. I tried to do
something clever on LLD's end to drop the unnecessary EH frames at parse
time, but this made the code significantly more complex. So I'm looking
at fixing this at the MC level instead.

**Addendum:** It turns out that there was a latent bug in the X86
backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is
not too surprising given that this combination has not been heretofore
used.

For functions that have unwind info that cannot be encoded with CU, MC
would end up dropping both the compact unwind entry (OK; existing
behavior) as well as the DWARF entries (not OK).  This diff fixes things
so that we emit the DWARF entry, as well as a CU entry with encoding
`UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for
the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry
is necessary, this was the simplest fix. ld64 seems to be able to handle
both the absence and presence of this CU entry. Ultimately ld64 (and
LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there
is no impact to the final binary size.

Reviewed By: davide, lhames

Differential Revision: https://reviews.llvm.org/D122258
2022-06-12 10:03:56 -04:00
Kazu Hirata 5ee3876905 Use isa instead of dyn_cast (NFC) 2022-06-11 11:15:52 -07:00
Guillaume Chatelet 38637ee477 [clang] Add support for __builtin_memset_inline
In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`.

The idea is to get support from the compiler to easily write efficient memory function implementations.

This patch could be split in two:
 - one for the LLVM part adding the `llvm.memset.inline.*` intrinsics.
 - and another one for the Clang part providing the instrinsic as a builtin.

Differential Revision: https://reviews.llvm.org/D126903
2022-06-10 13:13:59 +00:00
Simon Tatham ceb21fa4e4 [ARM] Fix how size-0 bitfields affect homogeneous aggregates.
By both AAPCS32 and AAPCS64, the test for whether an aggregate
qualifies as homogeneous (either HFA or HVA) is based on the data
layout alone. So any logical member of the structure that does not
affect the data layout also should not affect homogeneity. In
particular, an empty bitfield ('int : 0') should make no difference.

In fact, clang considered it to make a difference in C but not in C++,
and justified that policy as compatible with gcc. But that's
considered a bug in gcc as well (at least for Arm targets), and it's
fixed in gcc 12.1.

This fix mimics gcc's: zero-sized bitfields are now ignored in all
languages for the Arm (32- and 64-bit) ABIs. But I've left the
previous behaviour unchanged in other ABIs, by means of adding an
ABIInfo::isZeroLengthBitfieldPermittedInHomogeneousAggregate query
method which the Arm subclasses override.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D127197
2022-06-10 11:27:24 +01:00
Jun Zhang 0ecbedc098
Also move WeakRefReferences in CodeGenModule::moveLazyEmssionStates
I forgot this field in b8f9459715
Signed-off-by: Jun Zhang <jun@junz.org>
2022-06-10 13:11:09 +08:00
Jun Zhang b8f9459715
[CodeGen] Keep track info of lazy-emitted symbols in ModuleBuilder
The intent of this patch is to selectively carry some states over to
the Builder so we won't lose the information of the previous symbols.

This used to be several downstream patches of Cling, it aims to fix
errors in Clang Interpreter when trying to use inline functions.
Before this patch:

clang-repl> inline int foo() { return 42;}
clang-repl> int x = foo();

JIT session error: Symbols not found: [ _Z3foov ]
error: Failed to materialize symbols:
{ (main, { x, $.incr_module_1.__inits.0, __orc_init_func.incr_module_1 }) }

Co-authored-by: Axel Naumann <Axel.Naumann@cern.ch>
Signed-off-by: Jun Zhang <jun@junz.org>

Differential Revision: https://reviews.llvm.org/D126781
2022-06-09 23:12:21 +08:00
Bruno Cardoso Lopes e6a76a4935 [Clang][CoverageMapping] Fix compile time explosions by adjusting only appropriated skipped ranges
D83592 added comments to be part of skipped regions, and as part of that, it
also shrinks a skipped range if it spans a line that contains a non-comment
token. This is done by `adjustSkippedRange`.

The `adjustSkippedRange` currently runs on skipped regions that are not
comments, causing a 5min regression while building a big C++ files without any
comments.

Fix the compile time introduced in D83592 by tagging SkippedRange with kind
information and use that to decide what needs additional processing.

Differential Revision: https://reviews.llvm.org/D127338
2022-06-08 23:13:39 -07:00
Thomas Lively aff679a48c [WebAssembly] Implement remaining relaxed SIMD instructions
Add codegen, intrinsics, and builtins for the i16x8.relaxed_q15mulr_s,
i16x8.dot_i8x16_i7x16_s, and i32x4.dot_i8x16_i7x16_add_s instructions. These are
the last instructions from the relaxed SIMD proposal[1] that had not been
implemented.

[1]:
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md.

Differential Revision: https://reviews.llvm.org/D127170
2022-06-08 10:32:10 -07:00
Vitaly Buka d7df3f0a4b [NFC] Exctract getNoSanitizeMask lambda 2022-06-07 14:08:43 -07:00
Vitaly Buka f32ad5703e [NFC] Move part of SanitizerMetadata into private method 2022-06-07 14:08:43 -07:00
Mitch Phillips 5d9de5f446 [NFC] Clang-format parts of D126929 and D126100 2022-06-07 14:08:43 -07:00
Mitch Phillips f49a5844b6 [NFC][CodeGen] Rename method
Extracted from D84652.
2022-06-07 14:08:42 -07:00
Erich Keane 5c3bde9625 [CodeGen] Fix an issue when the 'extern C' replacement names broke
Originally broken by me in D122608, this is a regression where we
attempt to replace an extern-C thing with 'itself'.  The problem is that
we end up deleting it, causing the value to fail when it gets put into
llvm.used.
2022-06-07 11:30:59 -07:00
Akira Hatanaka 3ba6ace3cc [gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in
DebugTypeVisitor

This recommits d1346e2. I've added a line to the test case to enable it
only on assert builds.

Differential Revision: https://reviews.llvm.org/D125839
2022-06-06 19:12:26 -07:00
Akira Hatanaka 834e5d12c7 Revert "[gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in"
This reverts commit d1346e2ee2.

The commit broke a few bots.
2022-06-06 18:48:24 -07:00
Sam Clegg 47039a1a4b [WebAssembly] Remove restriction on main name mangling
Summary: Emscripten now handles/supports this new mode.

Subscribers: dschuff, jgravelle-google, aheejin, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75277
2022-06-06 14:04:27 -07:00
Akira Hatanaka d1346e2ee2 [gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in
DebugTypeVisitor

Differential Revision: https://reviews.llvm.org/D125839
2022-06-06 12:51:36 -07:00
Fangrui Song d86a206f06 Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options 2022-06-05 00:31:44 -07:00
Anders Waldenborg dd2362a8ba [clang] Allow const variables with weak attribute to be overridden
A variable with `weak` attribute signifies that it can be replaced with
a "strong" symbol link time. Therefore it must not emitted with
"weak_odr" linkage, as that allows the backend to use its value in
optimizations.

The frontend already considers weak const variables as
non-constant (note_constexpr_var_init_weak diagnostic) so this change
makes frontend and backend consistent.

This commit reverses the
  f49573d1 weak globals that are const should get weak_odr linkage.
commit from 2009-08-05 which introduced this behavior. Unfortunately
that commit doesn't provide any details on why the change was made.

This was discussed in
https://discourse.llvm.org/t/weak-attribute-semantics-on-const-variables/62311

Differential Revision: https://reviews.llvm.org/D126324
2022-06-03 23:44:15 +02:00
Guillaume Chatelet c698189696 [NFC] Format CGBuilder.h 2022-06-03 07:54:01 +00:00
Shilei Tian c4a90db720 [Clang][OpenMP] Add the codegen support for `atomic compare capture`
This patch adds the codegen support for `atomic compare capture` in clang.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D120290
2022-06-02 21:38:21 -04:00
Shilei Tian 3a96256b7e [Clang][OpenMP] Avoid using `IgnoreImpCasts` if possible
This patch removes all `IgnoreImpCasts` in Sema, and only uses it if necessary. If the expression is not of the same type as the pointer value, a cast is inserted.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D126602
2022-06-02 17:45:02 -04:00
Paul Robinson dc5175adef [PS5] Make passing unions in registers match PS4 ABI 2022-06-02 11:00:54 -07:00
Paul Robinson cc756f91c3 [PS5] Classify __m64 as integer, matching PS4 ABI 2022-06-02 11:00:53 -07:00
Hans Wennborg d42fe9aa84 Revert "[clang][AIX] add option mdefault-visibility-export-mapping"
This caused assertions, see comment on the code review:

llvm/clang/lib/AST/Decl.cpp:1510:
clang::LinkageInfo clang::LinkageComputer::getLVForDecl(const clang::NamedDecl *, clang::LVComputationKind):
Assertion `D->getCachedLinkage() == LV.getLinkage()' failed.

> The option mdefault-visibility-export-mapping is created to allow
> mapping default visibility to an explicit shared library export
> (e.g. dllexport). Exactly how and if this is manifested is target
> dependent (since it depends on how they map dllexport in the IR).
>
> Three values are provided for the option:
>
> * none: the default and behavior without the option, no additional export linkage information is created.
> * explicit: add the export for entities with explict default visibility from the source, including RTTI
> * all: add the export for all entities with default visibility
>
> This option is useful for targets which do not export symbols as part of
> their usual default linkage behaviour (e.g. AIX), such targets
> traditionally specified such information in external files (e.g. export
> lists), but this mapping allows them to use the visibility information
> typically used for this purpose on other (e.g. ELF) platforms.
>
> Reviewed By: MaskRay
>
> Differential Revision: https://reviews.llvm.org/D126340

This reverts commit 8c8a2679a2.
2022-06-02 15:09:39 +02:00
Martin Storsjö f730749e85 [clang] [ARM] Add __builtin_sponentry like on aarch64
This is used for calling the SEH aware setjmp on MinGW.

Differential Revision: https://reviews.llvm.org/D126764
2022-06-02 12:29:59 +03:00
Shilei Tian eb673be5ac [OMPIRBuilder] Add the support for compare capture
This patch adds the support for `compare capture` in `OMPIRBuilder`.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120007
2022-06-01 19:53:43 -04:00
Joseph Huber afd2f7e991 [Binary] Promote OffloadBinary to inherit from Binary
We use the `OffloadBinary` to create binary images of offloading files
and their corresonding metadata. This patch changes this to inherit from
the base `Binary` class. This allows us to create and insepect these
more generically. This patch includes all the necessary glue to
implement this as a new binary format, along with added the magic bytes
we use to distinguish the offloading binary to the `file_magic`
implementation.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D126812
2022-06-01 18:40:57 -04:00
David Tenty 8c8a2679a2 [clang][AIX] add option mdefault-visibility-export-mapping
The option mdefault-visibility-export-mapping is created to allow
mapping default visibility to an explicit shared library export
(e.g. dllexport). Exactly how and if this is manifested is target
dependent (since it depends on how they map dllexport in the IR).

Three values are provided for the option:

* none: the default and behavior without the option, no additional export linkage information is created.
* explicit: add the export for entities with explict default visibility from the source, including RTTI
* all: add the export for all entities with default visibility

This option is useful for targets which do not export symbols as part of
their usual default linkage behaviour (e.g. AIX), such targets
traditionally specified such information in external files (e.g. export
lists), but this mapping allows them to use the visibility information
typically used for this purpose on other (e.g. ELF) platforms.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D126340
2022-06-01 18:07:17 -04:00
Nikita Popov 858e6273d9 [Clang] Always set opaque pointers mode
Always set the opaque pointers mode, to make sure that
-no-opaque-pointers continues working when the default on the LLVM
side is flipped.
2022-05-31 15:43:05 +02:00
Zi Xuan Wu (Zeson) 563cc3fda9 [Clang][CSKY] Add support about CSKYABIInfo
According to the CSKY ABIv2 document, https://github.com/c-sky/csky-doc/blob/master/C-SKY_V2_CPU_Applications_Binary_Interface_Standards_Manual.pdf
construct the ABIInfo to handle argument passing and return of clang data type. It also includes how to emit and expand VAArg intrinsic.

Differential Revision: https://reviews.llvm.org/D126451
2022-05-31 10:53:30 +08:00
Joel E. Denny d2e3cb7374 [OpenMP][Clang] Fix atomic compare for signed vs. unsigned
Without this patch, arguments to the
`llvm::OpenMPIRBuilder::AtomicOpValue` initializer are reversed.

Reviewed By: ABataev, tianshilei1992

Differential Revision: https://reviews.llvm.org/D126619
2022-05-30 11:02:20 -04:00
Enna1 52992f136b Add !nosanitize to FixedMetadataKinds
This patch adds !nosanitize metadata to FixedMetadataKinds.def, !nosanitize indicates that LLVM should not insert any sanitizer instrumentation.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D126294
2022-05-27 09:46:13 +08:00
Bruno Cardoso Lopes ce54b22657 [Clang][CoverageMapping] Fix switch counter codegen compile time explosion
C++ generated code with huge amount of switch cases chokes badly while emitting
coverage mapping, in our specific testcase (~72k cases), it won't stop after hours.
After this change, the frontend job now finishes in 4.5s and shrinks down `@__covrec_`
by 288k when compared to disabling simplification altogether.

There's probably no good way to create a testcase for this, but it's easy to
reproduce, just add thousands of cases in the below switch, and build with
`-fprofile-instr-generate -fcoverage-mapping`.

```
enum type : int {
 FEATURE_INVALID = 0,
 FEATURE_A = 1,
 ...
};

const char *to_string(type e) {
  switch (e) {
  case type::FEATURE_INVALID: return "FEATURE_INVALID";
  case type::FEATURE_A: return "FEATURE_A";}
  ...
  }

```

Differential Revision: https://reviews.llvm.org/D126345
2022-05-26 11:05:15 -07:00
Mike Rice 0a5cfbf7b2 [OpenMP] Use the align clause value from 'omp allocate' for globals
Refactor the code that handles the align clause of 'omp allocate' so
it can be used with globals as well as local variables.

Differential Revision: https://reviews.llvm.org/D126426
2022-05-26 09:51:48 -07:00
Joseph Huber 1bae02b773 [Cuda] Use fallback method to mangle externalized decls if no CUID given
CUDA requires that static variables be visible to the host when
offloading. However, The standard semantics of a stiatc variable dictate
that it should not be visible outside of the current file. In order to
access it from the host we need to perform "externalization" on the
static variable on the device. This requires generating a semi-unique
name that can be affixed to the variable as to not cause linker errors.

This is currently done using the CUID functionality, an MD5 hash value
set up by the clang driver. This allows us to achieve is mostly unique
ID that is unique even between multiple compilations of the same file.
However, this is not always availible. Instead, this patch uses the
unique ID from the file to generate a unique symbol name. This will
create a unique name that is consistent between the host and device side
compilations without requiring the CUID to be entered by the driver. The
one downside to this is that we are no longer stable under multiple
compilations of the same file. However, this is a very niche use-case
and is not supported by Nvidia's CUDA compiler so it likely to be good
enough.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D125904
2022-05-26 09:18:22 -04:00
Aaron Ballman 9368bf9023 Removing this as part of the revert done in 69da3b6aea
This appears to have been added in a follow-up commit that I missed.
2022-05-25 13:45:17 -04:00
Adrian Kuegel 9698a445c6 Fix warning by handling OMPC_fail in switch statement. 2022-05-25 09:33:41 +02:00
Mike Rice 239094cdee [OpenMP] Add codegen for 'omp_all_memory' reserved locator.
This creates an entry with address=nullptr and flag=0x80.
When an 'omp_all_memory' entry is specified any other 'out' or
'inout' entries are not needed and are not passed to the runtime.

Differential Revision: https://reviews.llvm.org/D126321
2022-05-24 15:26:23 -07:00
Mike Rice 9ba937112f [OpenMP] Add parsing/sema support for omp_all_memory reserved locator
Adds support for the reserved locator 'omp_all_memory' for use
in depend clauses with 'out' or 'inout' dependence-types.

Differential Revision: https://reviews.llvm.org/D125828
2022-05-24 10:28:59 -07:00
Stephen Long 4f1e64b54f [MSVC, ARM64] Add __readx18 intrinsics
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170

  unsigned char __readx18byte(unsigned long)
  unsigned short __readx18word(unsigned long)
  unsigned long __readx18dword(unsigned long)
  unsigned __int64 __readx18qword(unsigned long)

Given the lack of documentation of the intrinsics, we chose to align the offset with just
`CharUnits::One()` when calling `IRBuilderBase::CreateAlignedLoad()`

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D126024
2022-05-23 10:59:12 -07:00
Stephen Long 3e0be5610f [MSVC, ARM64] Add __writex18 intrinsics
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170

  void __writex18byte(unsigned long, unsigned char)
  void __writex18word(unsigned long, unsigned short)
  void __writex18dword(unsigned long, unsigned long)
  void __writex18qword(unsigned long, unsigned __int64)

Given the lack of documentation of the intrinsics, we chose to align the offset with just
`CharUnits::One()` when calling `IRBuilderBase::CreateAlignedStore()`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D126023
2022-05-23 07:01:11 -07:00
Stephen Long ae80024fbe [clang] Honor __attribute__((no_builtin("foo"))) on functions
Support for `__attribute__((no_builtin("foo")))` was added in https://reviews.llvm.org/D68028,
but builtins were still being used even when the attribute was placed on a function.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D124701
2022-05-20 06:41:47 -07:00
Jon Chesterfield 83c431fb9e [amdgpu] Add amdgpu_kernel calling conv attribute to clang
Allows emitting define amdgpu_kernel void @func() IR from C or C++.

This replaces the current workflow which is to write a stub in opencl that
calls an external C function implemented in C++ combined through llvm-link.

Calling the resulting function still requires a manual implementation of the
ABI from the host side. The primary application is for more rapid debugging
of the amdgpu backend by permuting a C or C++ test file instead of manually
updating an IR file.

Implementation closely follows D54425. Non-amd reviewers from there.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D125970
2022-05-20 08:50:37 +01:00
Yaxun (Sam) Liu cefe472c51 [clang] Fix __has_builtin
Fix __has_builtin to return 1 only if the requested target features
of a builtin are enabled by refactoring the code for checking
required target features of a builtin and use it in evaluation
of __has_builtin.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D125829
2022-05-19 11:34:42 -04:00
Jay Foad 6bec3e9303 [APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.

The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.

Differential Revision: https://reviews.llvm.org/D125557
2022-05-19 11:23:13 +01:00
Mitch Phillips 7aa1fa0a0a Reland "[dwarf] Emit a DIGlobalVariable for constant strings."
An upcoming patch will extend llvm-symbolizer to provide the source line
information for global variables. The goal is to move AddressSanitizer
off of internal debug info for symbolization onto the DWARF standard
(and doing a clean-up in the process). Currently, ASan reports the line
information for constant strings if a memory safety bug happens around
them. We want to keep this behaviour, so we need to emit debuginfo for
these variables as well.

Reviewed By: dblaikie, rnk, aprantl

Differential Revision: https://reviews.llvm.org/D123534
2022-05-18 13:56:45 -07:00
Mitch Phillips ed2c3218f5 Revert "[dwarf] Emit a DIGlobalVariable for constant strings."
This reverts commit 4680982b36.

Broke a fuchsia windows bot. More details in the review:
https://reviews.llvm.org/D123534
2022-05-16 19:07:38 -07:00
Mitch Phillips 4680982b36 [dwarf] Emit a DIGlobalVariable for constant strings.
An upcoming patch will extend llvm-symbolizer to provide the source line
information for global variables. The goal is to move AddressSanitizer
off of internal debug info for symbolization onto the DWARF standard
(and doing a clean-up in the process). Currently, ASan reports the line
information for constant strings if a memory safety bug happens around
them. We want to keep this behaviour, so we need to emit debuginfo for
these variables as well.

Reviewed By: dblaikie, rnk, aprantl

Differential Revision: https://reviews.llvm.org/D123534
2022-05-16 16:52:16 -07:00
Egor Zhdan 2f04e703bf [Clang] Add DriverKit support
This is the second patch that upstreams the support for Apple's DriverKit.

The first patch: https://reviews.llvm.org/D118046.

Differential Revision: https://reviews.llvm.org/D121911
2022-05-13 20:34:57 +01:00
Joseph Huber af757f8980 [OpenMP] Don't set device runtime debugging flags if using '-nogpulib'
We use globals to configure debugging at compile-time for the device
runtime. Because these are only used by the OpenMP runtime we shouldn't
define them if we aren't using the device runtime. When a user passes in
'-nogpulib' this indicates that we are not using the device runtime, so
we should check for the precense of this flag and not emit these globals
if used.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125314
2022-05-13 14:38:43 -04:00
Mike Rice 772b0c44a4 [OpenMP] Fix mangling for linear parameters with negative stride
The 'n' character is used in place of '-' in the mangled name.

Differential Revision: https://reviews.llvm.org/D125406
2022-05-11 14:02:09 -07:00
Joseph Huber 26eb04268f [Clang] Introduce clang-offload-packager tool to bundle device files
In order to do offloading compilation we need to embed files into the
host and create fatbainaries. Clang uses a special binary format to
bundle several files along with their metadata into a single binary
image. This is currently performed using the `-fembed-offload-binary`
option. However this is not very extensibile since it requires changing
the command flag every time we want to add something and makes optional
arguments difficult. This patch introduces a new tool called
`clang-offload-packager` that behaves similarly to CUDA's `fatbinary`.
This tool takes several input files with metadata and embeds it into a
single image that can then be embedded in the host.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D125165
2022-05-11 09:39:13 -04:00
Matt Devereau 75bb815231 [AArch64][SVE] Add aarch64_sve_pcs attribute to Clang
Enable function attribute aarch64_sve_pcs at the C level, which correspondes to
aarch64_sve_vector_pcs at the LLVM IR level.

This requirement was created by this addition to the ARM C Language Extension:
https://github.com/ARM-software/acle/pull/194

Differential Revision: https://reviews.llvm.org/D124998
2022-05-11 13:33:56 +00:00
Joseph Huber 0035f7154c [CUDA] Create offloading entries when using the new driver
The changes made in D123460 generalized the code generation for OpenMP's
offloading entries. We can use the same scheme to register globals for
CUDA code. This patch adds the code generation to create these
offloading entries when compiling using the new offloading driver mode.
The offloading entries are simple structs that contain the information
necessary to register the global. The struct used is as follows:

```
Type struct __tgt_offload_entry {
  void    *addr;      // Pointer to the offload entry info.
                      // (function or global)
  char    *name;      // Name of the function or global.
  size_t  size;       // Size of the entry info (0 if it a function).
  int32_t flags;
  int32_t reserved;
};
```

Currently CUDA handles RDC code generation by deferring the registration
of globals in the current TU to a callback function containing the
modules ID. Later all the module IDs will be used to register all of the
globals at once. Rather than mimic this, offloading entries allow us to
mimic the way OpenMP registers globals. That is, we create a simple
global struct for each device global to be registered. These are placed
at a special section `cuda_offloading_entires`. Because this section is
a valid C-identifier, the linker will profide a `__start` and `__stop`
pointer that we can use to iterate and register all globals at runtime.

the registration requires a flag variable to indicate which registration
function to use. I have assigned the flags somewhat arbitrarily, but
these use the following values.

Kernel: 0
Variable: 0
Managed: 1
Surface: 2
Texture: 3

Depends on D120272

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D123471
2022-05-11 07:30:21 -04:00
Mike Rice 0dbaef61b5 [OpenMP] Fix mangling for linear modifiers with variable stride
This adds support for variable stride with the val, uval, and ref linear
modifiers.  Previously only the no modifer type ls<argno> was supported.

  val  -> Ls<argno>
  uval -> Us<argno>
  ref  -> Rs<argno>

Differential Revision: https://reviews.llvm.org/D125330
2022-05-10 14:12:44 -07:00
Mike Rice 1a02519bc5 [OpenMP] Add mangling support for linear modifiers (ref,uval,val)
Add mangling for linear parameters specified with ref, uval, and val
for 'omp declare simd' vector functions.

Add missing stride for linear this parameters.

Differential Revision: https://reviews.llvm.org/D125269
2022-05-10 09:56:55 -07:00
Daniel Bertalan 93a8225da1 [CodeGen] Use ABI alignment for C++ new expressions
In case of placement new, if we do not know the alignment of the
operand, we can't assume it has the preferred alignment. It might be
e.g. a pointer to a struct member which follows ABI alignment rules.

This makes UBSAN no longer report "constructor call on misaligned
address" when constructing a double into a struct field of type double
on i686. The psABI specifies an alignment of 4 bytes, but the preferred
alignment used by Clang is 8 bytes.

We now use ABI alignment for allocating new as well, as the preferred
alignment should be used for over-aligning e.g. local variables, which
isn't relevant for ABI code dealing with operator new. AFAICT there
wouldn't be problems either way though.

Fixes #54845.

Differential Revision: https://reviews.llvm.org/D124736
2022-05-10 16:02:23 +01:00
Simon Pilgrim ec6024d081 [X86] Replace avx512f integer mul reduction builtins with generic builtin
D117829 added the generic "__builtin_reduce_mul" which we can use to replace the x86 specific integer mul reduction builtins - internally these were mapping to the same intrinsic already so there are no test changes required.

Differential Revision: https://reviews.llvm.org/D125222
2022-05-09 14:10:28 +01:00
Simon Pilgrim 8a92c45e07 [Clang] Add integer mul reduction builtin
Similar to the existing bitwise reduction builtins, this lowers to a llvm.vector.reduce.mul intrinsic call.

For other reductions, we've tried to share builtins for float/integer vectors, but the fmul reduction intrinsic also take a starting value argument and can either do unordered or serialized, but not reduction-trees as specified for the builtins. However we address fmul support this shouldn't affect the integer case.

Differential Revision: https://reviews.llvm.org/D117829
2022-05-09 12:12:53 +01:00
Richard Smith c4f95ef86a Reimplement `__builtin_dump_struct` in Sema.
Compared to the old implementation:

* In C++, we only recurse into aggregate classes.
* Unnamed bit-fields are not printed.
* Constant evaluation is supported.
* Proper conversion is done when passing arguments through `...`.
* Additional arguments are supported and are injected prior to the
  format string; this directly supports use with `fprintf`, for example.
* An arbitrary callable can be passed rather than only a function
  pointer. In particular, in C++, a function template or overload set is
  acceptable.
* All text generated by Clang is printed via `%s` rather than directly;
  this avoids issues where Clang's pretty-printing output might itself
  contain a `%` character.
* Fields of types that we don't know how to print are printed with a
  `"*%p"` format and passed by address to the print function.
* No return value is produced.

Reviewed By: aaron.ballman, erichkeane, yihanaa

Differential Revision: https://reviews.llvm.org/D124221
2022-05-05 14:55:47 -07:00
Yaxun (Sam) Liu 62501bc45a [NFC][CUDA][HIP] rework mangling number for aux target
CUDA/HIP needs to mangle for aux target. When mangling for aux target,
the mangler should use mangling number for aux target. Previously
in https://reviews.llvm.org/D122734 a state was introduced in
ASTContext to let the mangler get mangling number for aux target
from ASTContext. This patch removes that state from ASTConext
and add an IsAux member to MangleContext to indicate that
the mangle context is for aux target. This reflects the reality that
the mangle context is created for mangling aux target and makes
ASTContext cleaner.

Reviewed by: Artem Belevich, Reid Kleckner

Differential Revision: https://reviews.llvm.org/D124842
2022-05-04 13:05:33 -04:00
David Pagan 37471cf2c3 [clang][OpenMP] Local variable alignment incorrect with align clause
If alignment specified with align clause is less than natural alignment for
list item type, the alignment should be set to the natural alignment.

See OMP5.1 specification, page 185, lines 7-10

Differential Revision: https://reviews.llvm.org/D124676
2022-05-03 13:10:01 -07:00
Shilei Tian 9c1085c7e2 [Clang][OpenMP] Add the support for floating-point variables for specific atomic clauses
Currently when using `atomic update` with floating-point variables, if
the operation is add or sub, `cmpxchg`, instead of `atomicrmw` is emitted, as
shown in [1].  In fact, about three years ago, llvm-svn: 351850 added the
support for FP operations. This patch adds the support in OpenMP as well.

[1] https://godbolt.org/z/M7b4ba9na

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D124724
2022-05-03 11:30:54 -04:00
David Truby 8bc29d1427 [clang][AArch64][SVE] Implement conditional operator for SVE vectors
This patch adds support for the conditional (ternary) operator on SVE
scalable vector types in C++, matching the behaviour for NEON vector
types. Like the conditional operator for NEON types, this is disabled in
C mode.

Differential Revision: https://reviews.llvm.org/D124091
2022-05-03 13:10:32 +00:00
Fangrui Song 4d34c4e0e6 [OpenMP] Fix -Wswitch (due to new OMPC_cancellation_construct_type) after D123828 2022-05-02 12:10:09 -07:00
Simon Pilgrim 9a14c369c4 [X86] Replace avx512f integer add reduction builtins with generic builtin
D124741 added the generic "__builtin_reduce_add" which we can use to replace the x86 specific integer add reduction builtins - internally these were mapping to the same intrinsic already so there are no test changes required.

Differential Revision: https://reviews.llvm.org/D124757
2022-05-02 14:39:17 +01:00
Simon Pilgrim a23291b7db [Clang] Add integer add reduction builtin
Similar to the existing bitwise reduction builtins, this lowers to a llvm.vector.reduce.add intrinsic call.

For other reductions, we've tried to share builtins for float/integer vectors, but the fadd reduction intrinsics also take a starting value argument and can either do unordered or serialized, but not reduction-trees as specified for the builtins. However we address fadd support this shouldn't affect the integer case.

(Split off from D117829)

Differential Revision: https://reviews.llvm.org/D124741
2022-05-02 11:03:25 +01:00
joker881 19978e0874 [RISCV]Add CTZ Intrinsic for ZBB in Clang
Add Intrinsics and test for B extension (updating coming soon (:

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D124348
2022-04-30 08:18:10 +08:00
python3kgae 73417c5176 [HLSL][clang][Driver] Support validator version command line option.
The DXIL validator version option(/validator-version) decide the validator version when compile hlsl.
The format is major.minor like 1.0.

In normal case, the value of validator version should be got from DXIL validator. Before we got DXIL validator ready for llvm/main, DXIL validator version option is added first to set validator version.

It will affect code generation for DXIL, so it is treated as a code gen option.

A new member std::string DxilValidatorVersion is added to clang::CodeGenOptions.

Then CGHLSLRuntime is added to clang::CodeGenModule.
It is used to translate clang::CodeGenOptions::DxilValidatorVersion into a ModuleFlag under key "dx.valver" at end of clang code generation.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D123884
2022-04-29 16:48:08 -07:00
Joe Nash 8bdfc73f63 [AMDGPU][clang] Definition of gfx11 subtarget
Contributors:
Jay Foad <jay.foad@amd.com>
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Patch 2/N for upstreaming of AMDGPU gfx11 architecture

Depends on D124536

Reviewed By: foad, kzhuravl, #amdgpu, arsenm

Differential Revision: https://reviews.llvm.org/D124537
2022-04-29 13:55:56 -04:00
Joseph Huber 643c9b22ef [OpenMP] Make generating offloading entries more generic
This patch moves the logic for generating the offloading entries to the
OpenMPIRBuilder. This makes it easier to re-use in other places, such as
for OpenMP support in Flang or using the same method for generating
offloading entires for other languages like Cuda.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D123460
2022-04-29 09:14:31 -04:00
jonasyhwang eaca933c59 [Clang][CodeGen]Fix __builtin_dump_struct missing record type field name
Thanks for @rsmith to point this. I'm sorry for introducing this bug.
See @rsmith 's comment in https://reviews.llvm.org/D122248
Eg:(By @rsmith ) https://godbolt.org/z/o7vcbWaEf
I have added a test case
struct:
```
struct U19A {
    int a;
  };
  struct U19B {
    struct U19A a;
  };

  struct U19B a = {
    .a.a = 2022
  };
```
Dump result:
```
struct U19B {
    struct U19A a = {
        int a = 2022
    }
}
```

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122920
2022-04-29 12:58:53 +08:00
Yaxun (Sam) Liu 11d3e31c60 [CUDA][HIP] Fix mangling number for local struct
MSVC and Itanium mangling use different mangling numbers
for function-scope structs, which causes inconsistent
mangled kernel names in device and host compilations.

This patch uses Itanium mangling number for structs
in for mangling device side names in CUDA/HIP host
compilation on Windows to fix this issue.

A state is added to ASTContext to indicate whether the
current name mangling is for device side names in host
compilation. Device and host mangling number
are encoded/decoded as upper and lower half of 32 bit
unsigned integer to fit into the original mangling number
field for AST. Diagnostic will be emitted if a manglining
number exceeds limit.

Reviewed by: Artem Belevich, Reid Kleckner

Differential Revision: https://reviews.llvm.org/D122734

Fixes: SWDEV-328515
2022-04-28 19:54:43 -04:00
Alexey Bataev 1462e63f67 [OPENMP]PR53344: Emit code for final update of the inscan reduction vars in worksharing loops.
Need to emit final update of the inscan reduction variables. For
worksharing loops, the reduction values are stored in the temp array,
need to copy the last element to the original var at the end of the
construct.

Differential Revision: https://reviews.llvm.org/D121156
2022-04-28 10:41:28 -07:00
Yaxun (Sam) Liu 57a210e5b7 [CUDA][HIP] Fix linkage of __clang_gpu_used_external
Different TU's may have this globl var. appending linkage can
only be used with lld recognized special variables.

Change it to internal linkage.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D124466
2022-04-26 20:43:39 -04:00
Michael Kruse ff289feeba [OpenMPIRBuilder] Remove ContinuationBB argument from Body callback.
The callback is expected to create a branch to the ContinuationBB (sometimes called FiniBB in some lambdas) argument when finishing. This creates problems:

 1. The InsertPoint used for CodeGenIP does not need to be the end of a block. If it is not, a naive callback will insert a branch instruction into the middle of the block.

 2. The BasicBlock the CodeGenIP is pointing to may or may not have a terminator. There is an conflict where to branch to if the block already has a terminator.

 3. Some API functions work only with block having a terminator. Some workarounds have been used to insert a temporary terminator that is removed again.

 4. Some callbacks are sensitive to whether the BasicBlock has a terminator or not. This creates a callback ordering problem where different callback may have different behaviour depending on whether a previous callback created a terminator or not. The problem also exists for FinalizeCallbackTy where some callbacks do create branch to another "continue" block, but unlike BodyGenCallbackTy does not receive the target as argument. This is not addressed in this patch.

With this patch, the callback receives an CodeGenIP into a BasicBlock where to insert instructions. If it has to insert control flow, it can split the block at that position as needed but otherwise no separate ContinuationBB is needed. In particular, a callback can be empty without breaking the emitted IR. If the caller needs the control flow to branch to a specific target, it can insert the branch instruction itself and pass an InsertPoint before the terminator to the callback.

Certain frontends such as Clang may expect the current IRBuilder position to be at the end of a basic block. In this case its callbacks must split the block at CodeGenIP before setting the IRBuilder position such that the instructions after CodeGenIP are moved to another basic block and before returning create a new branch instruction to the split block.

Some utility functions such as `splitBB` are supporting correct splitting of BasicBlocks, independent of whether they have a terminator or not, returning/setting the InsertPoint of an IRBuilder to the end of split predecessor block, and optionally omitting creating a branch to the split successor block to be added later.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D118409
2022-04-26 16:35:01 -05:00
Andrew Savonichev 0a27622a1d [NVPTX] Disable DWARF .file directory for PTX
Default behavior for .file directory was changed in D105856, but
ptxas (CUDA 11.5 release) refuses to parse it:

    $ llc -march=nvptx64 llvm/test/DebugInfo/NVPTX/debug-file-loc.ll
    $ ptxas debug-file-loc.s
    ptxas debug-file-loc.s, line 42; fatal   : Parsing error near
    '"foo.h"': syntax error

Added a new field to MCAsmInfo to control default value of
UseDwarfDirectory. This value is used if -dwarf-directory command line
option is not specified.

Differential Revision: https://reviews.llvm.org/D121299
2022-04-26 21:40:36 +03:00
Jonas Paulsson 9b38e2efa0 [SystemZ] Fix C++ ABI for passing args of structs containing zero width bitfield.
A struct like { float a; int :0; } should per the SystemZ ABI be passed in a
GPR, but to match a bug in GCC it has been passed in an FPR (see 759449c).

GCC has now corrected the C++ ABI for this case, and this patch for clang
follows suit.

Reviewed By: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122388
2022-04-26 17:16:14 +02:00
David Green 9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Alok Kumar Sharma a48300aee5 [clang][OpenMP][DebugInfo] Debug support for TLS variables present in OpenMP consruct
In case of OpenMP programs, thread local variables can be present in
any clause pertaining to OpenMP constructs, as we know that compiler
generates artificial functions and in some cases values are passed to
those artificial functions thru parameters. For an example, if thread
local variable is present in copyin clause (testcase attached with the
patch), parameter with same name is generated as parameter to artificial
function. When user inquires the thread Local variable, its debug info
is hidden by the parameter. User never gets the actual TLS variable
when inquires it, instead gets the artificial parameter.

Current patch suppresses the debug info for such artificial parameter to
enable correct debugging of TLS variables.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D123787
2022-04-23 12:29:32 +05:30
Yaxun (Sam) Liu 04fb81674e [CUDA][HIP] Externalize kernels with internal linkage
This patch is a continuation of https://reviews.llvm.org/D123353.

Not only kernels in anonymous namespace, but also template
kernels with template arguments in anonymous namespace
need to be externalized.

To be more generic, this patch checks the linkage of a kernel
assuming the kernel does not have __global__ attribute. If
the linkage is internal then clang will externalize it.

This patch also fixes the postfix for externalized symbol
since nvptx does not allow '.' in symbol name.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D124189

Fixes: https://github.com/llvm/llvm-project/issues/54560
2022-04-22 17:05:36 -04:00
Ying Yi b09ba42620 Bug 51277: [DWARF] DW_AT_alignment incorrect when
attribute((__aligned__)) is present but ignored`

In the original code, the 'getDeclAlignIfRequired' function is used.
The 'getDeclAlignIfRequired' function will return the max alignment
of all aligned attributes if the type has aligned attributes. The
function doesn't consider the type at all.

The 'getTypeAlignIfRequired' function uses the type's alignment value,
which also used by the 'alignof' function. I think we should use the
function of 'getTypeAlignIfRequired'.

Reviewed By: dblaikie, jmorse, wolfgangp

Differential Revision: https://reviews.llvm.org/D124006
2022-04-22 12:15:00 +01:00
Shafik Yaghmour 5ff992bca2 [DEBUG-INFO] Change how we handle auto return types for lambda operator() to be consistent with gcc
D70524 added support for auto return types for C++ member functions. I was
implementing support on the LLDB side for looking up the deduced type.

I ran into trouble with some cases with respect to lambdas. I looked into
how gcc was handling these cases and it appears gcc emits the deduced return type for lambdas.

So I am changing out behavior to match that.

Differential Revision: https://reviews.llvm.org/D123319
2022-04-21 14:58:50 -07:00
Chuanqi Xu 483efc9ad0 [Pipelines] Remove Legacy Passes in Coroutines
The legacy passes are deprecated now and would be removed in near
future. This patch tries to remove legacy passes in coroutines.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D123918
2022-04-21 10:59:11 +08:00
Richard Smith 72315d02c4 Treat `std::move`, `forward`, etc. as builtins.
This is extended to all `std::` functions that take a reference to a
value and return a reference (or pointer) to that same value: `move`,
`forward`, `move_if_noexcept`, `as_const`, `addressof`, and the
libstdc++-specific function `__addressof`.

We still require these functions to be declared before they can be used,
but don't instantiate their definitions unless their addresses are
taken. Instead, code generation, constant evaluation, and static
analysis are given direct knowledge of their effect.

This change aims to reduce various costs associated with these functions
-- per-instantiation memory costs, compile time and memory costs due to
creating out-of-line copies and inlining them, code size at -O0, and so
on -- so that they are not substantially more expensive than a cast.
Most of these improvements are very small, but I measured a 3% decrease
in -O0 object file size for a simple C++ source file using the standard
library after this change.

We now automatically infer the `const` and `nothrow` attributes on these
now-builtin functions, in particular meaning that we get a warning for
an unused call to one of these functions.

In C++20 onwards, we disallow taking the addresses of these functions,
per the C++20 "addressable function" rule. In earlier language modes, a
compatibility warning is produced but the address can still be taken.

The same infrastructure is extended to the existing MSVC builtin
`__GetExceptionInfo`, which is now only recognized in namespace `std`
like it always should have been.

This is a re-commit of
  fc30901096,
  a571f82a50,
  64c045e25b, and
  de6ddaeef3,
and reverts aa643f455a.
This change also includes a workaround for users using libc++ 3.1 and
earlier (!!), as apparently happens on AIX, where std::move sometimes
returns by value.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D123345

Revert "Fixup D123950 to address revert of D123345"

This reverts commit aa643f455a.
2022-04-20 17:58:31 -07:00
David Tenty aa643f455a Fixup D123950 to address revert of D123345
Since D123345 got reverted Builtin::BIaddressof and Builtin::BI__addressof don't exist and cause build breaks.
2022-04-20 19:59:07 -04:00
David Tenty 98d911e01f Revert "Treat `std::move`, `forward`, etc. as builtins."
This reverts commit b27430f9f4 as the
    parent https://reviews.llvm.org/D123345 breaks the AIX CI:

    https://lab.llvm.org/buildbot/#/builders/214/builds/819
2022-04-20 19:14:37 -04:00
Pengxuan Zheng 38612fbc89 Reland "[COFF, ARM64] Add __break intrinsic"
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170

Reland after fixing the test failure. The failure was due to conflict with a
change (D122983) which was merged right before this patch.

Reviewed By: rnk, mstorsjo

Differential Revision: https://reviews.llvm.org/D124032
2022-04-20 13:01:30 -07:00
Pengxuan Zheng bff8356b19 Revert "[COFF, ARM64] Add __break intrinsic"
This reverts commit 8a9b4fb4aa.
2022-04-20 11:57:49 -07:00
Eli Friedman ecc8479a01 Look through calls to std::addressof to compute pointer alignment.
This is sort of a followup to D37310; that basically fixed the same
issue, but then the libstdc++ implementation of <atomic> changed. Re-fix
the the issue in essentially the same way: look through the addressof
operation to find the alignment of the underlying object.

Differential Revision: https://reviews.llvm.org/D123950
2022-04-20 11:30:11 -07:00
Pengxuan Zheng 8a9b4fb4aa [COFF, ARM64] Add __break intrinsic
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170

Reviewed By: rnk, mstorsjo

Differential Revision: https://reviews.llvm.org/D124032
2022-04-20 11:20:26 -07:00
Fangrui Song a57d16bf80 [CodeGen] Fix -Wswitch after D116462 2022-04-19 17:33:15 -07:00
Fangrui Song 8b0e7f2293 [CodeGen] Fix -Wswitch after D116462 2022-04-19 17:28:54 -07:00
Paul Kirth bac6cd5bf8 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-04-19 21:23:48 +00:00
Yaxun (Sam) Liu cac4e2fe25 [CUDA][HIP] Fix gpu.used.external
Rename gpu.used.external as __clang_gpu_used_external as ptxas does not
allow . in global variable name.

Fixes: https://github.com/llvm/llvm-project/issues/54934

Reviewed by: Joseph Huber, Artem Belevich

Differential Revision: https://reviews.llvm.org/D123946
2022-04-18 23:10:31 -04:00
Michael Kruse 2d92ee97f1 Reapply "[OpenMP] Refactor OMPScheduleType enum."
This reverts commit af0285122f.

The test "libomp::loop_dispatch.c" on builder
openmp-gcc-x86_64-linux-debian fails from time-to-time.
See #54969. This patch is unrelated.
2022-04-18 21:56:47 -05:00
Michael Kruse af0285122f Revert "[OpenMP] Refactor OMPScheduleType enum."
This reverts commit 9ec501da76.

It may have caused the openmp-gcc-x86_64-linux-debian buildbot to fail.
https://lab.llvm.org/buildbot/#/builders/4/builds/20377
2022-04-18 14:38:31 -05:00
Michael Kruse 9ec501da76 [OpenMP] Refactor OMPScheduleType enum.
The OMPScheduleType enum stores the constants from libomp's internal sched_type in kmp.h and are used by several kmp API functions. The enum values have an internal structure, namely each scheduling algorithm (e.g.) exists in four variants: unordered, orderend, normerge unordered, and nomerge ordered.

This patch (basically a followup to D114940) splits the "ordered" and "nomerge" bits into separate flags, as was already done for the "monotonic" and "nonmonotonic", so we can apply bit flags operations on them. It also now contains all possible combinations according to kmp's sched_type. Deriving of the OMPScheduleType enum from clause parameters has been moved form MLIR's OpenMPToLLVMIRTranslation.cpp to OpenMPIRBuilder to make available for clang as well. Since the primary purpose of the flag is the binary interface to libomp, it has been made more private to LLVMFrontend. The primary interface for generating worksharing-loop using OpenMPIRBuilder code becomes `applyWorkshareLoop` which derives the OMPScheduleType automatically and calls the appropriate emitter function.

While this is mostly a NFC refactor, it still applies the following functional changes:
 * The logic from OpenMPToLLVMIRTranslation to derive the OMPScheduleType also applies to clang. Most notably, it now applies the nonmonotonic flag for non-static schedules by default.
 * In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was previously not applied if the simd modifier was used. I assume this was a bug, since the effect was due to `loop.schedule_modifier()` returning `mlir::omp::ScheduleModifier::none` instead of `llvm::Optional::None`.
 * In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was set even if ordered was specified, in breach to what the comment before citing the OpenMP specification says. I assume this was an oversight.

The ordered flag with parameter was not considered in this patch. Changes will need to be made (e.g. adding/modifying function parameters) when support for it is added. The lengthy names of the enum values can be discussed, for the moment this is avoiding reusing previously existing enum value names such as `StaticChunked` to avoid confusion.

Reviewed By: peixin

Differential Revision: https://reviews.llvm.org/D123403
2022-04-18 14:03:17 -05:00
Richard Smith b27430f9f4 Treat `std::move`, `forward`, etc. as builtins.
This is extended to all `std::` functions that take a reference to a
value and return a reference (or pointer) to that same value: `move`,
`forward`, `move_if_noexcept`, `as_const`, `addressof`, and the
libstdc++-specific function `__addressof`.

We still require these functions to be declared before they can be used,
but don't instantiate their definitions unless their addresses are
taken. Instead, code generation, constant evaluation, and static
analysis are given direct knowledge of their effect.

This change aims to reduce various costs associated with these functions
-- per-instantiation memory costs, compile time and memory costs due to
creating out-of-line copies and inlining them, code size at -O0, and so
on -- so that they are not substantially more expensive than a cast.
Most of these improvements are very small, but I measured a 3% decrease
in -O0 object file size for a simple C++ source file using the standard
library after this change.

We now automatically infer the `const` and `nothrow` attributes on these
now-builtin functions, in particular meaning that we get a warning for
an unused call to one of these functions.

In C++20 onwards, we disallow taking the addresses of these functions,
per the C++20 "addressable function" rule. In earlier language modes, a
compatibility warning is produced but the address can still be taken.

The same infrastructure is extended to the existing MSVC builtin
`__GetExceptionInfo`, which is now only recognized in namespace `std`
like it always should have been.

This is a re-commit of
  fc30901096,
  a571f82a50, and
  64c045e25b
which were reverted in
  e75d8b7037
due to a crasher bug where CodeGen would emit a builtin glvalue as an
rvalue if it constant-folds.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D123345
2022-04-17 13:26:16 -07:00
Vitaly Buka e75d8b7037 Revert "Treat `std::move`, `forward`, and `move_if_noexcept` as builtins."
Revert "Extend support for std::move etc to also cover std::as_const and"
Revert "Update test to handle opaque pointers flag flip."

It crashes on libcxx tests https://lab.llvm.org/buildbot/#/builders/85/builds/8174

This reverts commit fc30901096.
This reverts commit a571f82a50.
This reverts commit 64c045e25b.
2022-04-16 00:27:51 -07:00
Joseph Huber 984a0dc386 [OpenMP] Use new offloading binary when embedding offloading images
The previous patch introduced the offloading binary format so we can
store some metada along with the binary image. This patch introduces
using this inside the linker wrapper and Clang instead of the previous
method that embedded the metadata in the section name.

Differential Revision: https://reviews.llvm.org/D122683
2022-04-15 20:35:26 -04:00
Richard Smith fc30901096 Extend support for std::move etc to also cover std::as_const and
std::addressof, plus the libstdc++-specific std::__addressof.

This brings us to parity with the corresponding GCC behavior.

Remove STDBUILTIN macro that ended up not being used.
2022-04-15 16:31:39 -07:00
Richard Smith 64c045e25b Treat `std::move`, `forward`, and `move_if_noexcept` as builtins.
We still require these functions to be declared before they can be used,
but don't instantiate their definitions unless their addresses are
taken. Instead, code generation, constant evaluation, and static
analysis are given direct knowledge of their effect.

This change aims to reduce various costs associated with these functions
-- per-instantiation memory costs, compile time and memory costs due to
creating out-of-line copies and inlining them, code size at -O0, and so
on -- so that they are not substantially more expensive than a cast.
Most of these improvements are very small, but I measured a 3% decrease
in -O0 object file size for a simple C++ source file using the standard
library after this change.

We now automatically infer the `const` and `nothrow` attributes on these
now-builtin functions, in particular meaning that we get a warning for
an unused call to one of these functions.

In C++20 onwards, we disallow taking the addresses of these functions,
per the C++20 "addressable function" rule. In earlier language modes, a
compatibility warning is produced but the address can still be taken.

The same infrastructure is extended to the existing MSVC builtin
`__GetExceptionInfo`, which is now only recognized in namespace `std`
like it always should have been.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D123345
2022-04-15 14:09:45 -07:00
Eli Friedman 4802edd1ac Fix size of flexible array initializers, and re-enable assertions.
In D123649, I got the formula for getFlexibleArrayInitChars slightly
wrong: the flexible array elements can be contained in the tail padding
of the struct.  Fix the formula to account for that.

With the fixed formula, we run into another issue: in some cases, we
were emitting extra padding for flexible arrray initializers. Fix
CGExprConstant so it uses a packed struct when necessary, to avoid this
extra padding.

Differential Revision: https://reviews.llvm.org/D123826
2022-04-15 12:09:57 -07:00
Jan Svoboda 9d98f58959 [clang][CodeGen] NFCI: Use FileEntryRef
This patch removes use of the deprecated `DirectoryEntry::getName()` from clangCodeGen by using `{File,Directory}EntryRef` instead.

Reviewed By: bnbarham

Differential Revision: https://reviews.llvm.org/D123768
2022-04-15 15:16:17 +02:00
Eli Friedman 6cf0b1b3da Comment out assertions about initializer size added in D123649.
They're causing failures in LLVM test-suite.  Added some regression
tests that explain the issue.
2022-04-14 13:58:17 -07:00
Eli Friedman 5955a0f937 Allow flexible array initialization in C++.
Flexible array initialization is a C/C++ extension implemented in many
compilers to allow initializing the flexible array tail of a struct type
that contains a flexible array. In clang, this is currently restricted
to C. But this construct is used in the Microsoft SDK headers, so I'd
like to extend it to C++.

For now, this doesn't handle dynamic initialization; probably not hard
to implement, but it's extra code, and I don't think it's necessary for
the expected uses.  And we explicitly fail out of constant evaluation.

I've added some additional code to assert that initializers have the
correct size, with or without flexible array init. This might catch
issues unrelated to flexible array init.

Differential Revision: https://reviews.llvm.org/D123649
2022-04-14 11:56:40 -07:00
David Truby 53fd8db791 [Clang][AArch64][SVE] Allow subscript operator for SVE types
Undefined behaviour is just passed on to extract_element when the
index is out of bounds. Subscript on svbool_t is not allowed as
this doesn't really have meaningful semantics.

Differential Revision: https://reviews.llvm.org/D122732
2022-04-14 13:20:50 +01:00
Jan Svoboda d79ad2f1db [clang][lex] NFCI: Use FileEntryRef in PPCallbacks::InclusionDirective()
This patch changes type of the `File` parameter in `PPCallbacks::InclusionDirective()` from `const FileEntry *` to `Optional<FileEntryRef>`.

With the API change in place, this patch then removes some uses of the deprecated `FileEntry::getName()` (e.g. in `DependencyGraph.cpp` and `ModuleDependencyCollector.cpp`).

Reviewed By: dexonsmith, bnbarham

Differential Revision: https://reviews.llvm.org/D123574
2022-04-14 10:46:12 +02:00
joker881 a4f47a99aa RISCV] Add clang builtins for CLZ instruction.
add intrinsic for CLZ

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121915
2022-04-14 12:29:15 +08:00
Eli Friedman d791de0e25 Restrict lvalue-to-rvalue conversions in CGExprConstant.
We were generating wrong code for cxx20-consteval-crash.cpp: instead of
loading a value of a variable, we were using its address as the
initializer.

Found while adding code to verify the size of constant initializers.

Differential Revision: https://reviews.llvm.org/D123648
2022-04-13 12:34:57 -07:00
Erich Keane 38823b7f5f Fix Werror build issue from 6f20744b7f 2022-04-13 11:00:09 -07:00
Erich Keane 6f20744b7f Add support for ignored bitfield conditional codegen.
Currently we emit an error in just about every case of conditionals
with a 'non simple' branch if treated as an LValue.  This patch adds
support for the special case where this is an 'ignored' lvalue, which
permits the side effects from happening.

It also splits up the emit for conditional LValue in a way that should
be usable to handle simple assignment expressions in similar situations.

Differential Revision: https://reviews.llvm.org/D123680
2022-04-13 10:33:55 -07:00
Yaxun (Sam) Liu 0424b5115c [CUDA][HIP] Fix host used external kernel in archive
For -fgpu-rdc, a host function may call an external kernel
which is defined in an archive of bitcode. Since this external
kernel is only referenced in host function, the device
bitcode does not contain reference to this external
kernel, then the linker will not try to resolve this external
kernel in the archive.

To fix this issue, host-used external kernels and device
variables are tracked. A global array containing pointers
to these external kernels and variables is emitted which
serves as an artificial references to the external kernels
and variables used by host.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D123441
2022-04-13 10:47:16 -04:00
Nikita Popov 2978d02681 [Clang] Remove support for legacy pass manager
This removes the -flegacy-pass-manager and
-fno-experimental-new-pass-manager options, and the corresponding
support code in BackendUtil. The -fno-legacy-pass-manager and
-fexperimental-new-pass-manager options are retained as no-ops.

Differential Revision: https://reviews.llvm.org/D123609
2022-04-13 10:21:42 +02:00
Daniel Kiss b0343a38a5 Support the min of module flags when linking, use for AArch64 BTI/PAC-RET
LTO objects might compiled with different `mbranch-protection` flags which will cause an error in the linker.
Such a setup is allowed in the normal build with this change that is possible.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D123493
2022-04-13 09:31:51 +02:00
Quinn Pham 7d7022fb0c [PowerPC] Fix EmitPPCBuiltinExpr to emit arguments once
This patch changes `EmitPPCBuiltinExpr` in `CGBuiltin.cpp` to remove
the loop at the beginning of the function that emits the arguments and
to delay emitting the arguments until inside the switch statement. These
changes will put `EmitPPCBuiltinExpr` in line with the strategy of the
target independent function `EmitBuiltinExpr`. Also, this patch
ensures that arguments are only emitted once.

Tests that included builtins affected by these changes have been
modified to match expected behaviour.

Reviewed By: #powerpc, nemanjai, amyk

Differential Revision: https://reviews.llvm.org/D121637
2022-04-12 15:33:20 -05:00
Nikita Popov b72fd1a84d [CGCall] Check store type in findDominatingStoreToReturnValue()
We need to make sure that the stored type matches the return type.
2022-04-11 12:08:29 +02:00
Yaxun (Sam) Liu 4ea1d43509 [CUDA][HIP] Externalize kernels in anonymous name space
kernels in anonymous name space needs to have unique name
to avoid duplicate symbols.

Fixes: https://github.com/llvm/llvm-project/issues/54560

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D123353
2022-04-10 21:56:28 -04:00
Jonas Hahnfeld e4903d8be3 [CUDA/HIP] Remove argument from module ctor/dtor signatures
In theory, constructors can take arguments when called via .init_array
where at least glibc passes in (argc, argv, envp). This isn't used in
the generated code and if it was, the first argument should be an
integer, not a pointer. For destructors registered via atexit, the
function should never take an argument.

Differential Revision: https://reviews.llvm.org/D123370
2022-04-09 12:34:41 +02:00
Jennifer Yu 187ccc66fa [clang][OpenMP5.1] Initial parsing/sema for has_device_addr
Added basic parsing/sema/ support for the 'has_device_addr' clause.

Differential Revision: https://reviews.llvm.org/D123402
2022-04-08 21:19:38 -07:00
Mitch Phillips fa34951fbc Reland "[MTE] Add -fsanitize=memtag* and friends."
Differential Revision: https://reviews.llvm.org/D118948
2022-04-08 14:28:33 -07:00
Aaron Ballman 4aaf25b4f7 Revert "[MTE] Add -fsanitize=memtag* and friends."
This reverts commit 8aa1490513.

Broke testing: https://lab.llvm.org/buildbot/#/builders/109/builds/36233
2022-04-08 16:15:58 -04:00
Mitch Phillips 8aa1490513 [MTE] Add -fsanitize=memtag* and friends.
Currently, enablement of heap MTE on Android is specified by an ELF note, which
signals to the linker to enable heap MTE. This change allows
-fsanitize=memtag-heap to synthesize these notes, rather than adding them
through the build system. We need to extend this feature to also signal the
linker to do special work for MTE globals (in future) and MTE stack (currently
implemented in the toolchain, but not implemented in the loader).

Current Android uses a non-backwards-compatible ELF note, called
".note.android.memtag". Stack MTE is an ABI break anyway, so we don't mind that
we won't be able to run executables with stack MTE on Android 11/12 devices.

The current expectation is to support the verbiage used by Android, in
that "SYNC" means MTE Synchronous mode, and "ASYNC" effectively means
"fast", using the Kernel auto-upgrade feature that allows
hardware-specific and core-specific configuration as to whether "ASYNC"
would end up being Asynchronous, Asymmetric, or Synchronous on that
particular core, whichever has a reasonable performance delta. Of
course, this is platform and loader-specific.

Differential Revision: https://reviews.llvm.org/D118948
2022-04-08 12:13:15 -07:00
Nikita Popov 692a147bf4 [CGCall] Make findDominatingStoreToReturnValue() more robust
This was skipping specific lifetime + bitcast patterns, but with
opaque pointers the bitcast will not be present, and we did not
perform this fold.

Instead skip over lifetime.end and bitcasts generally, without
trying to correlate them.
2022-04-08 15:18:12 +02:00
serge-sans-paille 301e0d9135 [Clang][Fortify] drop inline decls when redeclared
When an inline builtin declaration is shadowed by an actual declaration, we must
reference the actual declaration, even if it's not the last, following GCC
behavior.

This fixes #54715

Differential Revision: https://reviews.llvm.org/D123308
2022-04-08 09:31:51 +02:00
Chuanqi Xu 74b56e02bd [NFC] Remove unused variable in CodeGenModules
This eliminates an unused-variable warning
2022-04-08 11:52:31 +08:00
David Blaikie 1cee3d9db7 DebugInfo: Consider the type of NTTP when simplifying template names
Since the NTTP may need to be cast to the type when rebuilding the name,
check that the type can be rebuilt when determining whether a template
name can be simplified.
2022-04-08 00:00:46 +00:00
Quinn Pham fef56f79ac Revert "[PowerPC] Fix EmitPPCBuiltinExpr to emit arguments once"
This reverts commit 2aae5b1fac. Because it
breaks tests on windows.
2022-04-07 16:45:19 -05:00
Quinn Pham 2aae5b1fac [PowerPC] Fix EmitPPCBuiltinExpr to emit arguments once
This patch changes `EmitPPCBuiltinExpr` in `CGBuiltin.cpp` to remove
the loop at the beginning of the function that emits the arguments and
to delay emitting the arguments until inside the switch statement. These
changes will put `EmitPPCBuiltinExpr` in line with the strategy of the
target independent function `EmitBuiltinExpr`. Also, this patch
ensures that arguments are only emitted once.

Tests that included builtins affected by these changes have been
modified to match expected behaviour.

Reviewed By: #powerpc, nemanjai, amyk

Differential Revision: https://reviews.llvm.org/D121637
2022-04-07 16:00:12 -05:00
Pavel Samolysov b4ac84901e [clang][NFC] Extract EmitAssemblyHelper::shouldEmitRegularLTOSummary
The code to check if the regular LTO summary should be emitted and to
add the corresponding module flags was duplicated in the
'EmitAssemblyHelper::EmitAssemblyWithLegacyPassManager' and
'EmitAssemblyHelper::RunOptimizationPipeline' methods.

In order to eliminate these code duplications, the
'EmitAssemblyHelper::shouldEmitRegularLTOSummary' method has been
extracted. The method returns a bool value, the value is 'true' if the
module summary should be emitted. The patch keeps the setting of the
module flags inline.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D123026
2022-04-07 10:38:46 -07:00
Kavitha Natarajan b1ea0191a4 [clang][DebugInfo] Support debug info for alias variable
clang to emit DWARF information for global alias variable as
DW_TAG_imported_declaration. This change also handles nested
(recursive) imported declarations.

Reviewed by: dblaikie, aprantl

Differential Revision: https://reviews.llvm.org/D120989
2022-04-07 17:15:40 +05:30
David Blaikie 6b306233f7 DebugInfo: Make the simplified template names prefix more unique 2022-04-06 18:25:46 +00:00
Ting Wang b389354b28 [Clang][PowerPC] Add max/min intrinsics to Clang and PPC backend
Add support for builtin_[max|min] which has below prototype:
A builtin_max (A1, A2, A3, ...)
All arguments must have the same type; they must all be float, double, or long double.
Internally use SelectCC to get the result.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D122478
2022-04-05 22:43:48 -04:00
Tom Honermann 5531abaf71 [clang] Corrections for target_clones multiversion functions.
This change merges code for emit of target and target_clones multiversion
resolver functions and, in doing so, corrects handling of target_clones
functions that are declared but not defined. Previously, a use of such
a target_clones function would result in an attempted emit of an ifunc
that referenced an undefined resolver function. Ifunc references to
undefined resolver functions are not allowed and, when the LLVM verifier
is not disabled (via '-disable-llvm-verifier'), resulted in the verifier
issuing a "IFunc resolver must be a definition" error and aborting the
compilation. With this change, ifuncs and resolver function definitions
are always emitted for used target_clones functions regardless of whether
the target_clones function is defined (if the function is defined, then
the ifunc and resolver are emitted regardless of whether the function is
used).

This change has the side effect of causing target_clones variants and
resolver functions to be emitted in a different order than they were
previously. This is harmless and is reflected in the updated tests.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122958
2022-04-05 19:50:22 -04:00
Tom Honermann 40af8df6fe [clang] NFC: Preparation for merging code to emit target and target_clones resolvers.
This change modifies CodeGenModule::emitMultiVersionFunctions() in preparation
for a change that will merge support for emitting target_clones resolvers into
this function. This change mostly serves to isolate indentation changes from
later behavior modifying changes.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122957
2022-04-05 19:50:22 -04:00
Tom Honermann 0ace0100ae [clang] NFC: Simplify the interface to CodeGenModule::GetOrCreateMultiVersionResolver().
Previously, GetOrCreateMultiVersionResolver() required the caller to provide
a GlobalDecl along with an llvm::type and FunctionDecl. The latter two can be
cheaply obtained from the first, and the llvm::type parameter is not always
used, so requiring the caller to provide them was unnecessary and created the
possibility that callers would pass an inconsistent set. This change simplifies
the interface to only require the GlobalDecl value.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122956
2022-04-05 19:50:22 -04:00
Tom Honermann bed5ee3f4b [clang] NFC: Enhance comments in CodeGen for multiversion function support.
Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122955
2022-04-05 19:50:22 -04:00
Shangwu Yao 15a1769631 Emit OpenCL metadata when targeting SPIR-V
This is required for converting function calls such as get_global_id()
into SPIR-V builtins.

Differential Revision: https://reviews.llvm.org/D123049
2022-04-05 20:58:32 +00:00
Tom Honermann 7c53fc4fe1 [clang] Emit target_clones resolver functions as COMDAT.
Previously, resolver functions synthesized for target_clones multiversion
functions were not emitted as COMDAT. Now fixed.
2022-04-05 15:34:35 -04:00
David Blaikie bb3980ae9f DebugInfo: Don't use enumerators in template names for debug info as they are not canonical
Since enumerators may not be available in every translation unit they
can't be reliably used to name entities. (this also makes simplified
template name roundtripping infeasible - since the expected name could
only be rebuilt if the enumeration definition could be found (or only if
it couldn't be found, depending on the context of the original name))
2022-04-05 17:16:42 +00:00
David Truby 4be1ec9fb5 [clang][AArc64][SVE] Add support for comparison operators on SVE types
Comparison operators on SVE types return a signed integer vector
of the same width as the incoming SVE type. This matches the existing
behaviour for NEON types.

Differential Revision: https://reviews.llvm.org/D122404
2022-04-05 13:56:27 +01:00
Nikita Popov 46cfbe561b [LLVMContext] Replace enableOpaquePointers() with setOpaquePointers()
This allows both explicitly enabling and explicitly disabling
opaque pointers, in anticipation of the default switching at some
point.

This also slightly changes the rules by allowing calls if either
the opaque pointer mode has not yet been set (explicitly or
implicitly) or if the value remains unchanged.
2022-04-05 12:02:48 +02:00
Nikita Popov ff18b158ed [CodeGen] Avoid unnecessary ConstantExpr cast
With opaque pointers, this is not necessarily a ConstantExpr. And
we don't need one here either, just Constant is sufficient.
2022-04-05 11:28:40 +02:00
Nikita Popov d69e9f9d89 [OpaquePtrs][Clang] Add -opaque-pointers/-no-opaque-pointers cc1 options
This adds cc1 options for enabling and disabling opaque pointers
on the clang side. This is not super useful now (because
-mllvm -opaque-pointers and -Xclang -opaque-pointers have the same
visible effect) but will be important once opaque pointers are
enabled by default in clang. In that case, it will only be
possible to disable them using the cc1 -no-opaque-pointers option.

Differential Revision: https://reviews.llvm.org/D123034
2022-04-05 10:15:41 +02:00
Pavel Samolysov 87b28f5092 [clang][NFC] Extract the EmitAssemblyHelper::TargetTriple member
Few times in different methods of the EmitAssemblyHelper class the following
code snippet is used to get the TargetTriple and then use it's single method
to check some conditions:

TargetTriple(TheModule->getTargetTriple())

The parsing of a target triple string is not a trivial operation and it takes
time to repeat the parsing many times in different methods of the class and
even numerous times in one method just to call a getter
(llvm::Triple(TheModule->getTargetTriple()).getVendor()), for example.
The patch extracts the TargetTriple member of the EmitAssemblyHelper class to
parse the triple only once in the class' constructor.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D122587
2022-04-04 12:16:39 +03:00
Luo, Yuanke 979d876bb4 [X86][AMX] enable amx cast intrinsics in FE.
We have some discission in D99152 and llvm-dev and finially come up with
a solution to add amx specific cast intrinsics. We've support the
intrinsics in llvm IR. This patch is to replace bitcast with amx cast
intrinsics in code emitting in FE.

Differential Revision: https://reviews.llvm.org/D122567
2022-04-02 14:02:35 +08:00
Erich Keane 9ba8c4024b Fix behavior of ifuncs with 'used' extern "C" static functions
We expect that `extern "C"` static functions to be usable in things like
inline assembly, as well as ifuncs:
See the bug report here: https://github.com/llvm/llvm-project/issues/54549

However, we were diagnosing this as 'not defined', because the
ifunc's attempt to look up its resolver would generate a declared IR
function.

Additionally, as background, the way we allow these static extern "C"
functions to work in inline assembly is by making an alias with the C
mangling in MOST situations to the version we emit with
internal-linkage/mangling.

The problem here was multi-fold: First- We generated the alias after the
ifunc was checked, so the function by that name didn't exist yet.
Second, the ifunc's generation caused a symbol to exist under the name
of the alias already (the declared function above), which suppressed the
alias generation.

This patch fixes all of this by moving the checking of ifuncs/CFE aliases
until AFTER we have generated the extern-C alias.  Then, it does a
'fixup' around the GlobalIFunc to make sure we correct the reference.

Differential Revision: https://reviews.llvm.org/D122608
2022-04-01 13:00:59 -07:00
Jorge Gorbe Moya fc7573f29c Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 46774df307.
2022-03-31 14:54:41 -07:00
Paul Kirth 46774df307 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-31 17:38:21 +00:00
Aaron Ballman 2267549296 Fix the build after cd26190a10
These variables were being used uninitialized and it caused a
significant number of test failures on Windows.
2022-03-31 12:03:53 -04:00
wangyihan 907d3acefc [Clang][CodeGen]Beautify dump format, add indent for nested struct and struct members
Beautify dump format, add indent for nested struct and struct members, also fix test cases in dump-struct-builtin.c
for example:
struct:
```
  struct A {
    int a;
    struct B {
      int b;
      struct C {
        struct D {
          int d;
          union E {
            int x;
            int y;
          } e;
        } d;
        int c;
      } c;
    } b;
  };
```
Before:
```
struct A {
int a = 0
struct B {
    int b = 0
struct C {
struct D {
            int d = 0
union E {
                int x = 0
                int y = 0
                }
            }
        int c = 0
        }
    }
}
```
After:
```
struct A {
    int a = 0
    struct B {
        int b = 0
        struct C {
            struct D {
                int d = 0
                union E {
                    int x = 0
                    int y = 0
                }
            }
            int c = 0
        }
    }
}
```

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D122704
2022-03-31 07:38:37 +08:00
Chris Bieneman dfde354958 NFC. Fixing warnings from adding DXContainer
Adds DXContainer to switch statements in Clang and LLDB to silence
warnings.
2022-03-29 14:46:24 -05:00
wangyihan de7cd3ccf5 [Clang][CodeGen]Remove anonymous tag locations
Remove anonymous tag locations, powered by 'PrintingPolicy',
@aaron.ballman once suggested removing this extra information in
https://reviews.llvm.org/D122248
struct:

  struct S {
    int a;
    struct /* Anonymous*/ {
      int x;
    } b;
    int c;
  };

Before:

  struct S {
    int a = 0
      struct S::(unnamed at ./builtin_dump_struct.c:20:3) {
        int x = 0
      }
    int c = 0
  }

After:

struct S {
  int a = 0
    struct S::(unnamed) {
      int x = 0
    }
  int c = 0
}

Differntial Revision: https://reviews.llvm.org/D122670
2022-03-29 11:38:29 -07:00
Paul Kirth 90cb325abd Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 2add3fbd97.
2022-03-29 06:20:30 +00:00
Phoebe Wang cd26190a10 [X86][regcall] Support passing / returning structures
Currently, the regcall calling conversion in Clang doesn't match with
ICC when passing / returning structures. https://godbolt.org/z/axxKMKrW7

This patch tries to fix the problem to match with ICC.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D122104
2022-03-29 11:29:57 +08:00
Paul Kirth 2add3fbd97 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-28 23:30:04 +00:00
James Y Knight d614874900 [Clang] Implement __builtin_source_location.
This builtin returns the address of a global instance of the
`std::source_location::__impl` type, which must be defined (with an
appropriate shape) before calling the builtin.

It will be used to implement std::source_location in libc++ in a
future change. The builtin is compatible with GCC's implementation,
and libstdc++'s usage. An intentional divergence is that GCC declares
the builtin's return type to be `const void*` (for
ease-of-implementation reasons), while Clang uses the actual type,
`const std::source_location::__impl*`.

In order to support this new functionality, I've also added a new
'UnnamedGlobalConstantDecl'. This artificial Decl is modeled after
MSGuidDecl, and is used to represent a generic concept of an lvalue
constant with global scope, deduplicated by its value. It's possible
that MSGuidDecl itself, or some of the other similar sorts of things
in Clang might be able to be refactored onto this more-generic
concept, but there's enough special-case weirdness in MSGuidDecl that
I gave up attempting to share code there, at least for now.

Finally, for compatibility with libstdc++'s <source_location> header,
I've added a second exception to the "cannot cast from void* to T* in
constant evaluation" rule. This seems a bit distasteful, but feels
like the best available option.

Reviewers: aaron.ballman, erichkeane

Differential Revision: https://reviews.llvm.org/D120159
2022-03-28 18:29:02 -04:00
Joseph Huber 9d3550c517 [OpenMP] Add AMDGPU calling convention to ctor / dtor functions
This patch adds the necessary AMDGPU calling convention to the ctor /
dtor kernels. These are fundamentally device kenels called by the host
on image load. Without this calling convention information the AMDGPU
plugin is unable to identify them.

Depends on D122504

Fixes #54091

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D122515
2022-03-25 22:44:20 -04:00
Joseph Huber 3c6d32ec6c [OpenMP] Make Ctor / Dtor functions have external visibility
The default construction of constructor functions by LLVM tends to make
them have internal linkage. When we call a ctor / dtor function in the
target region we are actually creating a kernel that is called at
registration. Because the ctor is a kernel we need to make sure it's
externally visible so we can actually call it. This prevented AMDGPU
from correctly using constructors while NVPTX could use them simply
because it ignored internal visibility.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D122504
2022-03-25 22:44:17 -04:00
William S. Moses 89525cbf28 [Clang] Add helper method to determine if a nonvirtual base has an entry in the LLVM struct
This patch adds a helper method to determine if a nonvirtual base has an entry in the LLVM struct. Such a base may not have an entry
if the base does not have any fields/bases itself that would change the size of the struct. This utility method is useful for other frontends (Polygeist) that use Clang as an API to generate code.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122502
2022-03-25 16:32:12 -04:00
Joseph Huber b9f67d44ba [OpenMP] Replace device kernel linkage with weak_odr
Currently the device kernels all have weak linkage to prevent linkage
errors on multiple defintions. However, this prevents some optimizations
from adequately analyzing them because of the nature of weak linkage.
This patch replaces the weak linkage with weak_odr linkage so we can
statically assert that multiple declarations of the same kernel will
have the same definition.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D122443
2022-03-25 11:29:15 -04:00
Jennifer Yu a6cdac48ff Eliminate extra set of simd variant function attribute.
Current clang generates extra set of simd variant function attribute
with extra 'v' encoding.
For example:
_ZGVbN2v__Z5add_1Pf vs _ZGVbN2vv__Z5add_1Pf
The problem is due to declaration of ParamAttrs following:
    llvm::SmallVector<ParamAttrTy, 8> ParamAttrs(ParamPositions.size());
where ParamPositions.size() is grown after following assignment:
    Pos = ParamPositions[PVD];
So the PVD is not find in ParamPositions.

The problem is ParamPositions need to set for each FD decl. To fix this

Move ParamPositions's init inside while loop for each FD.

Differential Revision: https://reviews.llvm.org/D122338
2022-03-24 13:27:28 -07:00
wangyihan 7faa95624e [clang][CodeGen]Fix clang crash and add bitfield support in __builtin_dump_struct
Fix clang crash and add bitfield support in __builtin_dump_struct.

In clang13.0.x, a struct with three or more members and a bitfield at
the same time will cause a crash. In clang15.x, as long as the struct
has one bitfield, it will cause a crash in clang.

Open issue: https://github.com/llvm/llvm-project/issues/54462

Differential Revision: https://reviews.llvm.org/D122248
2022-03-24 12:23:29 -07:00
David Blaikie 7b498beef0 DebugInfo: Classify noreturn function types as non-reconstructible
This information isn't preserved in the DWARF description of function
types (though probably should be - it's preserved on the function
declarations/definitions themselves through the DW_AT_noreturn attribute
- but we should move or also include that in the subroutine type itself
too - but for now, with it not being there, the DWARF is lossy and
can't be reconstructed)
2022-03-24 18:53:14 +00:00
Mike Rice f82ec5532b [OpenMP] Initial parsing/sema for the 'omp target parallel loop' construct
Adds basic parsing/sema/serialization support for the
 #pragma omp target parallel loop directive.

Differential Revision: https://reviews.llvm.org/D122359
2022-03-24 09:19:00 -07:00
Aaron Ballman 488c772920 Fix a crash with variably-modified parameter types in a naked function
Naked functions have no prolog, so it's not valid to emit prolog code
to evaluate the variably-modified type. This fixes Issue 50541.
2022-03-24 10:39:14 -04:00
Dávid Bolvanský a683ba4ff5 [NFCI] Fix set-but-unused warning in CGOpenMPRuntime.cpp 2022-03-24 07:49:21 +01:00
Ben Shi 51585aa240 [clang][AVR] Implement standard calling convention for AVR and AVRTiny
This patch implements avr-gcc's calling convention:
https://gcc.gnu.org/wiki/avr-gcc#Calling_Convention

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D120720
2022-03-24 02:08:22 +00:00
Julian Lettner 64902d335c Reland "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121736
2022-03-23 18:36:55 -07:00
Zequan Wu 581dc3c729 Revert "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
This reverts commit 22570bac69.
2022-03-23 16:11:54 -07:00
Joseph Huber 0d16c23af1 [OpenMP] Do not create offloading entries for internal or hidden symbols
Currently we create offloading entries to register device variables with
the host. When we register a variable we will look up the symbol in the
device image and map the device address to the host address. This is a
problem when the symbol is declared with hidden visibility or internal
linkage. This means the symbol is not accessible externally and we
cannot get its address. We should still allow static variables to be
declared on the device, but ew should not create an offloading entry for
them so they exist independently on the host and device.

Fixes #54309

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D122352
2022-03-23 18:27:16 -04:00
Erich Keane 3fb101a691 [NFC] Replace a not-null-check && isa with isa_and_nonnull 2022-03-23 13:09:28 -07:00
Nikita Popov a8690ba9d0 [CGExpr] Perform bitcast unconditionally
The way the check is written is not compatible with opaque
pointers -- while we don't need to change the IR pointer type,
we do need to change the element type stored in the Address.
2022-03-23 15:39:39 +01:00
Nikita Popov 5c6752d4ad [CGObjCMac] Check global value type instead of poitner type
As we're going to reassign the initializer, we actually need the
value types to match, not just the pointer types. This is only
relevant with opaque pointers.
2022-03-23 15:39:39 +01:00
Nikita Popov beee09687f [CGBlocks] Don't assume presence of bitcast
With opaque pointers, the bitcast constexpr will not be present.
2022-03-23 15:39:39 +01:00
David Truby 683fc6203c [clang][AArc64][SVE] Implement vector-scalar operators
This patch extends the support for C/C++ operators for SVE
types to allow one of the arguments to be a scalar, in which
case a vector splat is performed.

Differential Revision: https://reviews.llvm.org/D121829
2022-03-23 14:20:48 +00:00
Nikita Popov c070d5ceff [CGOpenMPRuntime] Remove uses of deprecated Address constructor
And as these are the last remaining uses, also remove the
constructor itself.
2022-03-23 12:40:44 +01:00
Nikita Popov 8b62dd3cd6 Reapply [CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer()
This requires some adjustment in caller code, because there was
a confusion regarding the meaning of the PtrTy argument: This
argument is the type of the pointer being loaded, not the addresses
being loaded from.

Reapply after fixing the specified pointer type for one call in
47eb4f7dcd, where the used type is
important for determining alignment.
2022-03-23 12:06:11 +01:00
Nikita Popov 47eb4f7dcd [CGOpenMPRuntime] Specify correct type in EmitLoadOfPointerLValue()
Perform a bitcast first, so we can specify the correct pointer type
inf EmitLoadOfPointerLValue(), rather than using a dummy void pointer.
2022-03-23 11:51:14 +01:00
Nikita Popov ba2be802b0 [CGOpenMPRuntime] Reuse getDepobjElements() (NFC)
There were two more places repeating this code, reuse the helper.
This requires moving the static functions into the class.
2022-03-23 11:31:49 +01:00
Nikita Popov 27f6cee12d Revert "[CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer()"
This reverts commit 767ec883e3.

This results in a some incorrect alignments which are not covered
by existing tests.
2022-03-23 10:24:39 +01:00
Phoebe Wang 32103608fc [Inline-asm] Add diagnosts for unsupported inline assembly arguments
GCC supports power-of-2 size structures for the arguments. Clang supports fewer than GCC. But Clang always crashes for the unsupported cases.

This patch adds sema checks to do the diagnosts to solve these crashes.

Reviewed By: jyu2

Differential Revision: https://reviews.llvm.org/D107141
2022-03-23 11:25:19 +08:00
Akira Hatanaka 818e72d1b0 [NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in
TargetInfo.cpp

Differential Revision: https://reviews.llvm.org/D122199
2022-03-22 18:39:16 -07:00
Mike Rice 2cedaee6f7 [OpenMP] Initial parsing/sema for the 'omp parallel loop' construct
Adds basic parsing/sema/serialization support for the
  #pragma omp parallel loop directive.

 Differential Revision: https://reviews.llvm.org/D122247
2022-03-22 13:55:47 -07:00
Nikita Popov cd6d9ae263 [CGOpenMPRuntime] Remove some uses of deprecated Adddress ctor 2022-03-22 16:29:35 +01:00
Nikita Popov 4f5640cad3 [CGOpenMPRuntime] Remove some uses of deprecated Address ctor 2022-03-22 15:35:45 +01:00
Nikita Popov 73c0d05e6a [CGOpenMPRuntimeGPU] Remove uses of deprecated address constructor
Worth noting that the code marked with FIXME is dead and would
produce invalid IR if hit. Someone familiar with this code should
probably look into that.
2022-03-22 15:02:45 +01:00
Djordje Todorovic 73777b4c35 [Debugify] Optimize debugify original mode
Before we start addressing the issue with having
a lot of false positives when using debugify in
the original mode, we have made a few patches that
should speed up the execution of the testing
utility Passes.

For example, when testing a large project
(let's say LLVM project itself), we can face
a lot of potential DI issues. Usually, we use
-verify-each-debuginfo-preserve (that is very
similar to -debugify-each) -- it collects
DI metadata before each Pass, and after the Pass
it checks if the Pass preserved the DI metadata.
However, we can speed up this process, since we
don't need to collect DI metadata before each
Pass -- we could use the DI metadata that are
collected after the previous Pass from
the pipeline as an input for the next Pass.

This patch speeds up the utility for ~2x.

Differential Revision: https://reviews.llvm.org/D115622
2022-03-22 12:14:00 +01:00
Nikita Popov 51ba13b1ae [CGStmtOpenMP] Remove uses of deprecated Address constructor 2022-03-22 11:00:08 +01:00
Nikita Popov b8f0e12847 [CodeGen] Remove some uses of deprecated Address constructor
Remove two stray uses in CodeGenModule and CGCUDANV.
2022-03-22 10:02:35 +01:00
Nikita Popov 767ec883e3 [CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer()
This requires some adjustment in caller code, because there was
a confusion regarding the meaning of the PtrTy argument: This
argument is the type of the pointer being loaded, not the addresses
being loaded from.
2022-03-22 09:42:31 +01:00
Nikita Popov a9656bd1bc [CodeGen][OpenMP] Make EmitLoadOfPointer() type consistent
If necessary insert a bitcast beforehand, so the LLVM-level pointer
type and the Clang-level pointer type line up.
2022-03-22 09:37:48 +01:00
Nikita Popov 7a2e12e0a7 [CodeGen][OpenMP] Use correct type in EmitLoadOfPointer()
The EmitLoadOfPointer() call already specified the right pointer
type, but it did not match the Address we're loading from, so we
need to insert a bitcast first.
2022-03-21 15:22:37 +01:00
Nikita Popov b6f85d8539 [CodeGen][OpenMP] Use correct type in EmitLoadOfPointer()
Rather than using a dummy void pointer type, we should specify the
correct private type and perform the bitcast beforehand rather than
afterwards. This way, the Address will have correct alignment
information.
2022-03-21 12:08:05 +01:00
Mike Rice 6bd8dc91b8 [OpenMP] Initial parsing/sema for the 'omp target teams loop' construct
Adds basic parsing/sema/serialization support for the
 #pragma omp target teams loop directive.

Differential Revision: https://reviews.llvm.org/D122028
2022-03-18 13:48:32 -07:00
Alan Zhao 8cd8bd4a5c Implement __cpuid and __cpuidex as Clang builtins
https://reviews.llvm.org/D23944 implemented the #pragma intrinsic from
MSVC. This causes the statement #pragma intrinsic(cpuid) to fail [0]
on Clang because cpuid is currently implemented in intrin.h instead
of a Clang builtin. Reimplementing cpuid (as well as it's releated
function, cpuidex) should resolve this.

[0]: https://crbug.com/1279344

Differential revision: https://reviews.llvm.org/D121653
2022-03-18 18:13:52 +01:00
Nikita Popov 52cc65d474 [OpenMPRuntime] Specify correct pointer type
Rather than specifying a dummy type in EmitLoadOfPointer() and
then casting it to the correct one, we should instead specify the
correct type and cast beforehand. Otherwise the computed alignment
will be incorrect.
2022-03-18 14:25:51 +01:00
Nikita Popov 74992f4a5b [CodeGen] Store element type in DominatingValue<RValue>
For aggregate rvalues, we need to store the element type in the
dominating value, so we can recover the element type for the
address.
2022-03-18 11:13:25 +01:00
Nikita Popov 33d020d010 [CodeGen] Remove some uses of deprecated Address constructor 2022-03-18 11:01:25 +01:00
Benjamin Kramer 5d2ce7663b Use llvm::append_range instead of push_back loops where applicable. NFCI. 2022-03-18 01:25:34 +01:00
Paul Kirth 964398ccb1 Revert "Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics"""
This reverts commit 6cf560d69a.
2022-03-18 00:21:33 +00:00
Paul Kirth 6cf560d69a Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics""
I mistakenly reverted my commit, so I'm relanding it.

This reverts commit 10866a1df4.
2022-03-18 00:04:22 +00:00
Paul Kirth 10866a1df4 Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit e7749d4713.
2022-03-17 23:54:26 +00:00
Paul Kirth e7749d4713 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Differential Revision: https://reviews.llvm.org/D115907
2022-03-17 23:46:23 +00:00
Changpeng Fang dd5895cc39 AMDGPU: Use the implicit kernargs for code object version 5
Summary:
  Specifically, for trap handling, for targets that do not support getDoorbellID,
we load the queue_ptr from the implicit kernarg, and move queue_ptr to s[0:1].
To get aperture bases when targets do not have aperture registers, we load
private_base or shared_base directly from the implicit kernarg. In clang, we use
implicitarg_ptr + offsets to implement __builtin_amdgcn_workgroup_size_{xyz}.

Reviewers: arsenm, sameerds, yaxunl

Differential Revision: https://reviews.llvm.org/D120265
2022-03-17 14:12:36 -07:00
Johannes Doerfert f02550bdd9 Reapply "[OpenMP][FIX] Allow device constructors for AMD GPU"
This reverts commit a597d6a780 and
reapplies 07b1766461.

In AMD GPU device code the globals are in AS(1). Before, we crashed if
the global was a structure. Now we simply cast away the AS before we
generate the code to initialize the global.

Differential Revision: https://reviews.llvm.org/D121837

Fixes: https://github.com/llvm/llvm-project/issues/54421
2022-03-17 12:53:47 -05:00
Julian Lettner 22570bac69 Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121736
2022-03-17 10:47:13 -07:00
Nikita Popov 6e1e99dc07 [CodeGen] Avoid pointer element type access for blocks
Pass the block struct type down to the TargetInfo hooks.
2022-03-17 16:56:31 +01:00
Nikita Popov 6c0af92612 [CodeGen] Avoid some pointer element type accesses 2022-03-17 16:36:14 +01:00
Nikita Popov 2edac9d962 [CodeGen] Avoid some pointer element type accesses 2022-03-17 16:32:45 +01:00
Nikita Popov bf1a99861c [CodeGen] Avoid some pointer element type accesses 2022-03-17 15:25:55 +01:00
Nikita Popov 799643f7f0 [CGObjCGNU] Remove pointer element type uses 2022-03-17 14:53:34 +01:00
Evgenii Stepanov cb96464f12 Stricter use-after-dtor detection for trivial members.
Poison trivial class members one-by-one in the reverse order of their
construction, instead of all-at-once at the very end.

For example, in the following code access to `x` from `~B` will
produce an undefined value.

struct A {
  struct B b;
  int x;
};

Reviewed By: kda

Differential Revision: https://reviews.llvm.org/D119600
2022-03-16 18:20:27 -07:00
Evgenii Stepanov c5ea8e9138 Use-after-dtor detection for trivial base classes.
-fsanitize-memory-use-after-dtor detects memory access after a
subobject is destroyed but its memory is not yet deallocated.
This is done by poisoning each object memory near the end of its destructor.

Subobjects (members and base classes) do this in their respective
destructors, and the parent class does the same for its members with
trivial destructors.

Inexplicably, base classes with trivial destructors are not handled at
all. This change fixes this oversight by adding the base class poisoning logic
to the parent class destructor.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D119300
2022-03-16 18:20:27 -07:00
Eli Friedman 04ba344176 [CodeGen] Inline _byteswap_* builtins.
As discussed in D57915.

Fixes https://github.com/llvm/llvm-project/issues/39999 .

Differential Revision: https://reviews.llvm.org/D121865
2022-03-16 16:18:51 -07:00
Johannes Doerfert a597d6a780 Revert "[OpenMP][FIX] Allow device constructors for AMD GPU"
This reverts commit 07b1766461 as it broke
the buildbots:
    https://lab.llvm.org/buildbot#builders/193/builds/8594
2022-03-16 17:35:54 -05:00
Johannes Doerfert 07b1766461 [OpenMP][FIX] Allow device constructors for AMD GPU
In AMD GPU device code the globals are in AS(1). Before, we crashed if
the global was a structure. Now we simply cast away the AS before we
generate the code to initialize the global.

Differential Revision: https://reviews.llvm.org/D121837
2022-03-16 17:04:28 -05:00
Mike Rice 79f661edc1 [OpenMP] Initial parsing/sema for the 'omp teams loop' construct
Adds basic parsing/sema/serialization support for the #pragma omp teams loop
directive.

Differential Revision: https://reviews.llvm.org/D121713
2022-03-16 14:39:18 -07:00
Arthur Eubanks 2371c5a0e0 [OpaquePtr][ARM] Use elementtype on ldrex/ldaex/stlex/strex
Includes verifier changes checking the elementtype, clang codegen
changes to emit the elementtype, and ISel changes using the elementtype.

Basically the same as D120527.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D121847
2022-03-16 14:11:53 -07:00
Thomas Lively 7e8913d775 [WebAssembly] Fix names of SIMD instructions containing '_zero'
Fix the instruction names to match the WebAssembly spec:

 - `i32x4.trunc_sat_zero_f64x2_{s,u}` => `i32x4.trunc_sat_f64x2_{s,u}_zero`
 - `f32x4.demote_zero_f64x2` => `f32x4.demote_f64x2_zero`

Also rename related things like intrinsics, builtins, and test functions to
match.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D121661
2022-03-16 13:34:57 -07:00
Yonghong Song 3251ba2d0f [Attr] Fix a btf_type_tag AST generation
Current ASTContext.getAttributedType() takes attribute kind,
ModifiedType and EquivType as the hash to decide whether an AST node
has been generated or note. But this is not enough for btf_type_tag
as the attribute might have the same ModifiedType and EquivType, but
still have different string associated with attribute.

For example, for a data structure like below,
  struct map_value {
        int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *a;
        int __attribute__((btf_type_tag("tag2"))) __attribute__((btf_type_tag("tag4"))) *b;
  };
The current ASTContext.getAttributedType() will produce
an AST similar to below:
  struct map_value {
        int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *a;
        int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *b;
  };
and this is incorrect.

It is very difficult to use the current AttributedType as it is hard to
get the tag information. To fix the problem, this patch introduced
BTFTagAttributedType which is similar to AttributedType
in many ways but with an additional BTFTypeTagAttr. The tag itself can
be retrieved with BTFTypeTagAttr.
With the new BTFTagAttributed type, the debuginfo code can be greatly
simplified compared to previous TypeLoc based approach.

Differential Revision: https://reviews.llvm.org/D120296
2022-03-16 08:46:52 -07:00
Simon Moll 0aab344104 [Clang] Allow "ext_vector_type" applied to Booleans
This is the `ext_vector_type` alternative to D81083.

This patch extends Clang to allow 'bool' as a valid vector element type
(attribute ext_vector_type) in C/C++.

This is intended as the canonical type for SIMD masks and facilitates
clean vector intrinsic declarations.  Vectors of i1 are supported on IR
level and below down to many SIMD ISAs, such as AVX512, ARM SVE (fixed
vector length) and the VE target (NEC SX-Aurora TSUBASA).

The RFC on cfe-dev: https://lists.llvm.org/pipermail/cfe-dev/2020-May/065434.html

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D88905
2022-03-16 11:10:32 +01:00
Keith Smiley a2db7d5e9c reland: [clang] Don't append the working directory to absolute paths
This fixes a bug that happens when using -fdebug-prefix-map to remap an
absolute path to a relative path. Since the path was absolute before
remapping, it is safe to assume that concatenating the remapped working
directory would be wrong.

This was originally submitted as https://reviews.llvm.org/D113718, but
reverted because when testing with dwarf 5 enabled, the tests were too
strict.

Differential Revision: https://reviews.llvm.org/D121663
2022-03-15 13:42:35 -07:00
Simon Pilgrim 7262eacd41 Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
Mane of the build bots are complaining: Unknown command line argument '-lower-global-dtors'
2022-03-15 13:01:35 +00:00
Keith Smiley cb22d71806 [clang] Fix DIFile directory root on Windows
On unix systems this logic would not separate the file and directory of
the DIFile unless they shared more components at the start than just the
root path character. The logic to do this was unix specific so it didn't
work on Windows. Now we check if the entire root_path is the same as
what you were going to set as the Dir and use the full filepath in that
case.

Differential Revision: https://reviews.llvm.org/D111579
2022-03-14 20:07:01 -07:00
Julian Lettner 9c542a5a4e Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121327
2022-03-14 17:51:18 -07:00
Joseph Huber 806bbc49dc [OpenMP] Try to embed offloading objects after codegen
Currently we use the `-fembed-offload-object` option to embed a binary
file into the host as a named section. This is currently only used as a
codegen action, meaning we only handle this option correctly when the
input is a bitcode file. This patch adds the same handling to embed an
offloading object after we complete code generation. This allows us to
embed the object correctly if the input file is source or bitcode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120270
2022-03-14 20:08:24 -04:00
Dávid Bolvanský 003c0b9307 [Clang] always_inline statement attribute
Motivation:

```
int test(int x, int y) {
    int r = 0;
    [[clang::always_inline]] r += foo(x, y); // force compiler to inline this function here
    return r;
}
```

In 2018, @kuhar proposed "Introduce per-callsite inline intrinsics" in https://reviews.llvm.org/D51200 to solve this motivation case (and many others).

This patch solves this problem with call site attribute. "noinline" statement attribute already landed in D119061. Also, some LLVM Inliner fixes landed so call site attribute is stronger than function attribute.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D120717
2022-03-14 21:45:31 +01:00
Arthur Eubanks 250620f76e [OpaquePtr][AArch64] Use elementtype on ldxr/stxr
Includes verifier changes checking the elementtype, clang codegen
changes to emit the elementtype, and ISel changes using the elementtype.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D120527
2022-03-14 10:09:59 -07:00
Erich Keane dc152659b4 Have cpu-specific variants set 'tune-cpu' as an optimization hint
Due to various implementation constraints, despite the programmer
choosing a 'processor' cpu_dispatch/cpu_specific needs to use the
'feature' list of a processor to identify it. This results in the
identified processor in source-code not being propogated to the
optimizer, and thus, not able to be tuned for.

This patch changes to use the actual cpu as written for tune-cpu so that
opt can make decisions based on the cpu-as-spelled, which should better
match the behavior expected by the programmer.

Note that the 'valid' list of processors for x86 is in
llvm/include/llvm/Support/X86TargetParser.def. At the moment, this list
contains only Intel processors, but other vendors may wish to add their
own entries as 'alias'es (or with different feature lists!).

If this is not done, there is two potential performance issues with the
patch, but I believe them to be worth it in light of the improvements to
behavior and performance.

1- In the event that the user spelled "ProcessorB", but we only have the
features available to test for "ProcessorA" (where A is B minus
features),
AND there is an optimization opportunity for "B" that negatively affects
"A", the optimizer will likely choose to do so.

2- In the event that the user spelled VendorI's processor, and the
feature
list allows it to run on VendorA's processor of similar features, AND
there
is an optimization opportunity for VendorIs that negatively affects
"A"s,
the optimizer will likely choose to do so. This can be fixed by adding
an
alias to X86TargetParser.def.

Differential Revision: https://reviews.llvm.org/D121410
2022-03-14 06:14:30 -07:00
Kazushi (Jam) Marukawa b1b4b6f366 [Clang][VE] Add vector load intrinsics
Add vector load intrinsic instructions for VE.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D121049
2022-03-12 09:09:57 +09:00
Akira Hatanaka aa4ea0ee54 [NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in a couple
more files

Differential Revision: https://reviews.llvm.org/D121135
2022-03-11 09:30:31 -08:00
Simon Pilgrim d258196f5f [clang] ScalarExprEmitter::VisitCastExpr - use castAs<> instead of getAs<> to avoid dereference of nullptr
The pointers are always dereferenced, so assert the cast is correct instead of returning nullptr
2022-03-09 11:40:37 +00:00
Ryan Senanayake b3dae59b9d [clang] Fix CodeGenAction for LLVM IR MemBuffers
Replaces use of getCurrentFile with getCurrentFileOrBufferName
in CodeGenAction. This avoids an assertion error or an incorrect
name chosen for the output file when assertions are disabled.
This error previously occurred when the FrontendInputFile was a
MemoryBuffer instead of a file.

Reviewed By: jlebar

Differential Revision: https://reviews.llvm.org/D121259
2022-03-09 00:39:48 +00:00
Akira Hatanaka 9bb8c80bea [NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in
CGBuiltin.cpp

Differential Revision: https://reviews.llvm.org/D121153
2022-03-08 09:45:15 -08:00
Stanislav Mekhanoshin 932f628121 [AMDGPU] new gfx940 fp atomics
Differential Revision: https://reviews.llvm.org/D121028
2022-03-07 12:32:02 -08:00
David Blaikie c0a6433f2b Simplify OpenMP Lambda use
* Use default ref capture for non-escaping lambdas (this makes
  maintenance easier by allowing new uses, removing uses, having
  conditional uses (such as in assertions) not require updates to an
  explicit capture list)
* Simplify addPrivate API not to take a lambda, since it calls it
  unconditionally/immediately anyway - most callers are simply passing
  in a named value or short expression anyway and the lambda syntax just
  adds noise/overhead

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D121077
2022-03-07 18:23:20 +00:00
Qiu Chaofan b2497e5435 [PowerPC] Add generic fnmsub intrinsic
Currently in Clang, we have two types of builtins for fnmsub operation:
one for float/double vector, they'll be transformed into IR operations;
one for float/double scalar, they'll generate corresponding intrinsics.

But for the vector version of builtin, the 3 op chain may be recognized
as expensive by some passes (like early cse). We need some way to keep
the fnmsub form until code generation.

This patch introduces ppc.fnmsub.* intrinsic to unify four fnmsub
intrinsics.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D116015
2022-03-07 13:00:06 +08:00
Shao-Ce SUN fa9c8bab0c [RISCV] Support k-ext clang intrinsics
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D112774
2022-03-05 13:57:18 +08:00
Akira Hatanaka 3717b9661f [NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in
CGBlocks.cpp

Differential Revision: https://reviews.llvm.org/D120856
2022-03-03 08:54:46 -08:00
Aakanksha 840695814a [AMDGPU] Add gfx1036 target
Differential Revision: https://reviews.llvm.org/D120846
2022-03-02 23:26:38 +00:00
Stanislav Mekhanoshin 2e2e64df4a [AMDGPU] Add gfx940 target
This is target definition only.

Differential Revision: https://reviews.llvm.org/D120688
2022-03-02 13:54:48 -08:00
Tong Zhang f76d3b800f [clang][CGStmt] fix crash on invalid asm statement
Clang is crashing on the following statement

  char var[9];
  __asm__ ("" : "=r" (var) : "0" (var));

This is similar to existing test: crbug_999160_regtest

The issue happens when EmitAsmStmt is trying to convert input to match
output type length. However, that is not guaranteed to be successful all the
time and if the statement itself is invalid like having an array type in
the example, we should give a regular error message here instead of
using assert().

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D120596
2022-03-02 11:18:55 -08:00
Akira Hatanaka d112cc2756 [NFC][Clang][OpaquePtr] Remove the call to Address::deprecated in
CreatePointerBitCastOrAddrSpaceCast

Differential Revision: https://reviews.llvm.org/D120757
2022-03-02 08:58:00 -08:00
Tong Zhang 17ce89fa80 [SanitizerBounds] Add support for NoSanitizeBounds function
Currently adding attribute no_sanitize("bounds") isn't disabling
-fsanitize=local-bounds (also enabled in -fsanitize=bounds). The Clang
frontend handles fsanitize=array-bounds which can already be disabled by
no_sanitize("bounds"). However, instrumentation added by the
BoundsChecking pass in the middle-end cannot be disabled by the
attribute.

The fix is very similar to D102772 that added the ability to selectively
disable sanitizer pass on certain functions.

In this patch, if no_sanitize("bounds") is provided, an additional
function attribute (NoSanitizeBounds) is attached to IR to let the
BoundsChecking pass know we want to disable local-bounds checking. In
order to support this feature, the IR is extended (similar to D102772)
to make Clang able to preserve the information and let BoundsChecking
pass know bounds checking is disabled for certain function.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D119816
2022-03-01 18:47:02 +01:00
Michael Kruse a66f7769a3 [OpenMPIRBuilder] Implement static-chunked workshare-loop schedules.
Add applyStaticChunkedWorkshareLoop method implementing static schedule when chunk-size is specified. Unlike a static schedule without chunk-size (where chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads.

This patch includes the following related changes:
 * Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder.
 * Remove the chunk parameter from applyStaticWorkshareLoop, it is ignored by the runtime. Change the value for the value passed to the init function to 0, as without OpenMPIRBuilder.
 * Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar as used by both, applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop.
 * Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause.

Differential Revision: https://reviews.llvm.org/D114413
2022-02-28 18:18:33 -06:00
Dávid Bolvanský 223b824022 [Clang] noinline call site attribute
Motivation:

```
int foo(int x, int y) { // any compiler will happily inline this function
    return x / y;
}

int test(int x, int y) {
    int r = 0;
    [[clang::noinline]] r += foo(x, y); // for some reason we don't want any inlining here
    return r;
}

```

In 2018, @kuhar proposed "Introduce per-callsite inline intrinsics"  in https://reviews.llvm.org/D51200 to solve this motivation case (and many others).

This patch solves this problem with call site attribute. The implementation is "smaller" wrt approach which uses new intrinsics and thanks to https://reviews.llvm.org/D79121 (Add nomerge statement attribute to clang), we have got some basic infrastructure to deal with attrs on statements with call expressions.

GCC devs are more inclined to call attribute solution as well, as builtins are problematic for them - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104187. But they have no patch proposal yet so..  We have free hands here.

If this approach makes sense, next future steps would be support for call site attributes for always_inline / flatten.

Reviewed By: aaron.ballman, kuhar

Differential Revision: https://reviews.llvm.org/D119061
2022-02-28 21:21:17 +01:00
Itay Bookstein f3480390be [clang][CodeGen] Avoid emitting ifuncs with undefined resolvers
The purpose of this change is to fix the following codegen bug:

```
// main.c
__attribute__((cpu_specific(generic)))
int *foo(void) { static int z; return &z;}
int main() { return *foo() = 5; }

// other.c
__attribute__((cpu_dispatch(generic))) int *foo(void);

// run:
clang main.c other.c -o main; ./main
```

This will segfault prior to the change, and return the correct
exit code 5 after the change.

The underlying cause is that when a translation unit contains
a cpu_specific function without the corresponding cpu_dispatch
the generated code binds the reference to foo() against a
GlobalIFunc whose resolver is undefined. This is invalid: the
resolver must be defined in the same translation unit as the
ifunc, but historically the LLVM bitcode verifier did not check
that. The generated code then binds against the resolver rather
than the ifunc, so it ends up calling the resolver rather than
the resolvee. In the example above it treats its return value as
an int *, therefore trying to write to program text.

The root issue at the representation level is that GlobalIFunc,
like GlobalAlias, does not support a "declaration" state. The
object which provides the correct semantics in these cases
is a Function declaration, but unlike Functions, changing a
declaration to a definition in the GlobalIFunc case constitutes
a change of the object type, as opposed to simply emitting code
into a Function.

I think this limitation is unlikely to change, so I implemented
the fix by returning a function declaration rather than an ifunc
when encountering cpu_specific, and upgrading it to an ifunc
when emitting cpu_dispatch.
This uses `takeName` + `replaceAllUsesWith` in similar vein to
other places where the correct IR object type cannot be known
locally/up-front, like in `CodeGenModule::EmitAliasDefinition`.

Previous discussion in: https://reviews.llvm.org/D112349

Signed-off-by: Itay Bookstein <ibookstein@gmail.com>

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D120266
2022-02-26 11:17:49 +02:00
Adrian Prantl bc7aeea854 Revert "Don't append the working directory to absolute paths"
This reverts commit 2cd9a86da5.
2022-02-25 17:00:10 -08:00
Adrian Prantl 2cd9a86da5 Don't append the working directory to absolute paths
This fixes a bug that happens when using -fdebug-prefix-map to remap
an absolute path to a relative path. Since the path was absolute
before remapping, it is safe to assume that concatenating the remapped
working directory would be wrong.

Differential Revision: https://reviews.llvm.org/D113718
2022-02-25 13:03:59 -08:00
Alexey Bataev d04d9220e1 [OPENMP]Fix PR50347: Mapping of global scope deep object fails.
Changed the we handle llvm::Constants in sizes arrays. ConstExprs and
GlobalValues cannot be used as initializers, need to put them at the
runtime, otherwise there wight be the compilation errors.

Differential Revision: https://reviews.llvm.org/D105297
2022-02-25 10:54:24 -08:00
Shangwu Yao c2f501f395
[CUDA][SPIRV] Assign global address space to CUDA kernel arguments
(resubmit https://reviews.llvm.org/D119207 after fixing the test for
some build settings)

This patch converts CUDA pointer kernel arguments with default address
space to CrossWorkGroup address space (__global in OpenCL). This is
because Generic or Function (OpenCL's private) is not supported as
storage class for kernel pointer types.

Differential revision: https://reviews.llvm.org/D120366
2022-02-24 20:51:43 -08:00
Alexey Bataev ca6fa71b7e Revert "[OPENMP]Fix PR50347: Mapping of global scope deep object fails."
This reverts commit 638938117a. Need to
fix reported fail https://lab.llvm.org/buildbot/#/builders/193/builds/7496
2022-02-24 12:04:39 -08:00
Alexey Bataev 638938117a [OPENMP]Fix PR50347: Mapping of global scope deep object fails.
Changed the we handle llvm::Constants in sizes arrays. ConstExprs and
GlobalValues cannot be used as initializers, need to put them at the
runtime, otherwise there wight be the compilation errors.

Differential Revision: https://reviews.llvm.org/D105297
2022-02-24 11:49:14 -08:00
Joseph Huber 7aef8b3754 [OpenMP] Make section variable external to prevent collisions
Summary:
We use a section to embed offloading code into the host for later
linking. This is normally unique to the translation unit as it is thrown
away during linking. However, if the user performs a relocatable link
the sections will be merged and we won't be able to access the files
stored inside. This patch changes the section variables to have external
linkage and a name defined by the section name, so if two sections are
combined during linking we get an error.
2022-02-24 10:57:09 -05:00
Yaxun (Sam) Liu 9d899d8f01 [HIP] Support `-fgpu-default-stream`
Introduce -fgpu-default-stream={legacy|per-thread} option to
support per-thread default stream for HIP runtime.

When -fgpu-default-stream=per-thread, HIP kernels are
launched through hipLaunchKernel_spt instead of
hipLaunchKernel. Also HIP_API_PER_THREAD_DEFAULT_STREAM=1
is defined by the preprocessor to enable other per-thread stream
API's.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D120298
2022-02-23 22:28:29 -05:00