15669 Commits

Author SHA1 Message Date
Zequan Wu 387620aa8c Reland "[LTO][COFF] Use bitcode file names in lto native object file names."
This reverts commit eef5405f74.
2022-11-22 11:26:18 -08:00
Zequan Wu eef5405f74 Revert "[LTO][COFF] Use bitcode file names in lto native object file names."
This reverts commit 531ed6d5aa.
2022-11-22 10:55:05 -08:00
Zequan Wu 531ed6d5aa [LTO][COFF] Use bitcode file names in lto native object file names.
Currently the lto native object files have names like main.exe.lto.1.obj. In
PDB, those names are used as names for each compiland. Microsoft’s tool
SizeBench uses those names to present to users the size of each object file.
So, names like main.exe.lto.1.obj are not user friendly.

This patch makes the lto native object file names more readable by using
the bitcode file names as part of the file names. For example, if the input
bitcode file has path like "path/to/foo.obj", its corresponding lto native
object file path would be "path/to/main.exe.lto.foo.obj". Since the lto native
object file name only matters for PDB, this patch only changes lld-link's
behavior.
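
The naming scheme described above can be sketched roughly as follows. This is
only an illustration, not lld's actual code: `ltoNativeObjectPath` and its
parameters are hypothetical names, and the sketch assumes the native object is
placed in the same directory as the input bitcode file, as in the example path.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of lld): builds the native object path from
// the linker output name and the path of the input bitcode file.
std::string ltoNativeObjectPath(const std::string &OutputName,
                                const std::string &BitcodePath) {
  // Split the bitcode path into a directory part (kept with its trailing
  // separator) and a file name part.
  const std::size_t Slash = BitcodePath.find_last_of("/\\");
  const bool HasDir = Slash != std::string::npos;
  const std::string Dir = HasDir ? BitcodePath.substr(0, Slash + 1) : std::string();
  const std::string File = HasDir ? BitcodePath.substr(Slash + 1) : BitcodePath;
  // "<dir>/<output>.lto.<bitcode file name>"
  return Dir + OutputName + ".lto." + File;
}
```

With an output name of "main.exe" and the input "path/to/foo.obj", the sketch
yields "path/to/main.exe.lto.foo.obj".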

Reviewed By: tejohnson, MaskRay, #lld-macho

Differential Revision: https://reviews.llvm.org/D137217
2022-11-22 10:19:58 -08:00
Krzysztof Parzyszek b805853ccb [Hexagon] Make local array static in getIntrinsicForHexagonNonClangBuiltin
It should not be created on every call; the omission of `static` was a bug
in the patch that introduced it.
2022-11-22 09:48:01 -08:00
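
A minimal sketch of the pattern this fix is about. The function name, the
struct and the table contents are made up for illustration; the real Hexagon
table is not reproduced here.

```cpp
// Placeholder entry type; the IDs used below are arbitrary.
struct BuiltinEntry {
  unsigned BuiltinID;
  unsigned IntrinsicID;
};

// Hypothetical lookup function standing in for the real one.
unsigned lookupIntrinsic(unsigned BuiltinID) {
  // `static` gives the table static storage duration, so it is set up once
  // instead of being rebuilt each time the function is called.
  static const BuiltinEntry Table[] = {{10, 100}, {11, 101}, {12, 102}};
  for (const BuiltinEntry &E : Table)
    if (E.BuiltinID == BuiltinID)
      return E.IntrinsicID;
  return 0; // 0 means "no mapping" in this sketch
}
```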
Jan Sjodin 969d787a47 [OpenMP][OMPIRBuilder] Add a configuration class that captures flags that affect codegen
This patch introduces the OpenMPIRBuilderConfig class which contains various
flags that are needed to lower OMP constructs to LLVM-IR. The purpose is to
keep the flags in one place so they do not have to be passed in every time.
The flags can be set optionally since some use cases don't rely on functions
that depend on these flags.
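
One way such a configuration object could look, as a sketch only: the class
name `BuilderConfig`, its two flags and the accessor names are assumptions made
for illustration and are not taken from the patch. Each flag is stored as a
`std::optional<bool>` so that callers who never need a flag can leave it unset.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a codegen configuration class.
class BuilderConfig {
  std::optional<bool> IsEmbedded;      // unset until a client provides it
  std::optional<bool> IsTargetCodegen; // unset until a client provides it

public:
  void setIsEmbedded(bool V) { IsEmbedded = V; }
  void setIsTargetCodegen(bool V) { IsTargetCodegen = V; }

  bool hasIsEmbedded() const { return IsEmbedded.has_value(); }
  bool hasIsTargetCodegen() const { return IsTargetCodegen.has_value(); }

  // Reading a flag that was never set is treated as a programming error.
  bool isEmbedded() const {
    assert(IsEmbedded.has_value() && "IsEmbedded is not set");
    return *IsEmbedded;
  }
  bool isTargetCodegen() const {
    assert(IsTargetCodegen.has_value() && "IsTargetCodegen is not set");
    return *IsTargetCodegen;
  }
};
```

A client that only lowers host code could set `IsTargetCodegen` and never touch
`IsEmbedded`, as long as it avoids functions that read the unset flag.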

Reviewed By: jdoerfert, tschuett

Differential Revision: https://reviews.llvm.org/D138220
2022-11-22 09:25:04 -05:00
Stefan Gränitz 9a9d636cae [CGObjC] Open cleanup scope before SaveAndRestore CurrentFuncletPad and push CatchRetScope early
Pushing the `CatchRetScope` early causes cleanups for catch parameters to be emitted in the basic block of the catch handler instead of the `catchret.dest` block. This is important because the latter is not part of the catchpad and this caused code truncations due to ARC PreISel intrinsics in WinEHPrepare.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D137939
2022-11-22 12:02:53 +01:00
Kazu Hirata 6ba4b62af8 Return None instead of Optional<T>() (NFC)
This patch replaces:

  return Optional<T>();

with:

  return None;

to make the migration from llvm::Optional to std::optional easier.
Specifically, I can deprecate None (in my source tree, that is) to
identify all the instances of None that should be replaced with
std::nullopt.
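
For illustration, here is the same kind of rewrite expressed with the
migration's destination types, `std::optional` and `std::nullopt`, since
`llvm::Optional` and `None` need LLVM headers. The function names are made up
for this sketch.

```cpp
#include <optional>

// Verbose form: spells out the optional type to return an empty value.
std::optional<int> parseDigitVerbose(char C) {
  if (C < '0' || C > '9')
    return std::optional<int>();
  return C - '0';
}

// Preferred form: the empty value is named directly, which is easy to find
// and replace mechanically.
std::optional<int> parseDigit(char C) {
  if (C < '0' || C > '9')
    return std::nullopt;
  return C - '0';
}
```

Both functions behave identically; only the spelling of the empty return
differs.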

Note that "return None" far outnumbers "return Optional<T>();".  There
are more than 2000 instances of "return None" in our source tree.

All of the instances in this patch come from functions that return
Optional<T> except Archive::findSym and ASTNodeImporter::import, where
we return Expected<Optional<T>>.  Note that we can construct
Expected<Optional<T>> from any parameter convertible to Optional<T>,
which None certainly is.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716

Differential Revision: https://reviews.llvm.org/D138464
2022-11-21 19:06:42 -08:00
Thomas Lively ae96b5bd2d [WebAssembly] Update relaxed-simd instruction names
Including builtin and intrinsic names. These should be the final names for the
proposal.
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md

Reviewed By: aheejin, maratyszcza

Differential Revision: https://reviews.llvm.org/D138249
2022-11-21 12:40:15 -08:00
gonglingqin c2ec455f18 [LoongArch] Add intrinsics for ibar, break and syscall
Diagnostics for intrinsic input parameters have also been added.

Differential Revision: https://reviews.llvm.org/D138094
2022-11-21 09:31:26 +08:00
Kazu Hirata 30f9eb1eb8 [clang] Remove unused forward declarations (NFC) 2022-11-20 14:32:17 -08:00
Fangrui Song d1163784b5 Remove unused llvm/IRPrinter/IRPrintingPasses.h or reorder #include after D137768 2022-11-19 22:09:05 +00:00
Alex Richardson 0745b0c035 Fix incorrect cast in VisitSYCLUniqueStableNameExpr
Clang language-level address spaces and LLVM pointer address spaces are
not the same thing (even though they will both have a numeric value of
zero in many cases). LangAS is an enum class to avoid implicit conversions,
but eba69b59d1 avoided the compiler error by
adding a `static_cast<>`. While touching this code, simplify it by using
CreatePointerBitCastOrAddrSpaceCast() which is already a no-op if the types
match.

This changes the code generation for spir64 to place the globals in
the sycl_global address space, which maps to `addrspace(1)`.

Reviewed By: bader

Differential Revision: https://reviews.llvm.org/D138284
2022-11-19 11:43:17 +00:00
yronglin 80f444646c [CodeGen][ARM] Fix ARMABIInfo::EmitVAArg crash with empty record type variadic arg
Fix ARMABIInfo::EmitVAArg crash with empty record type variadic arg

Open issue: https://github.com/llvm/llvm-project/issues/58794

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D138137
2022-11-19 15:14:10 +08:00
Jennifer Yu 9d90cf2fca [OPENMP5.1] Initial support for message clause. 2022-11-18 17:59:23 -08:00
Xing Xue fa7477eb87 [Clang][CodeGen][AIX] Map __builtin_frexpl, __builtin_ldexpl, and __builtin_modfl to 'double' version lib calls in 64-bit 'long double' mode
Summary:
AIX library functions frexpl(), ldexpl(), and modfl() are for 128-bit IBM long double, i.e. __ibm128. Other *l() functions, e.g., acosl(), are for 64-bit long double. The AIX Clang compiler currently maps builtin functions __builtin_frexpl(), __builtin_ldexpl(), and __builtin_modfl() to frexpl(), ldexpl(), and modfl() in 64-bit long double mode, which results in segfaults or incorrect return values. This patch instead maps __builtin_frexpl(), __builtin_ldexpl(), and __builtin_modfl() to the double-version library functions frexp(), ldexp(), and modf() in 64-bit long double mode.
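
The mapping described in the summary can be sketched as a small name-selection
function. This is not Clang's implementation: `aixLibCallFor` and its
parameters are hypothetical, and the sketch only models the three builtins the
summary names.

```cpp
#include <string>

// Hypothetical helper: picks the library function name for a builtin, given
// the width of `long double` in bits.
std::string aixLibCallFor(const std::string &Builtin, unsigned LongDoubleBits) {
  const std::string Prefix = "__builtin_";
  // Strip the "__builtin_" prefix if present.
  std::string Name = Builtin.rfind(Prefix, 0) == 0
                         ? Builtin.substr(Prefix.size())
                         : Builtin;
  // In 64-bit long double mode, frexpl/ldexpl/modfl must not be called,
  // because the library versions expect a 128-bit argument. Use the
  // `double` versions by dropping the trailing 'l'.
  if (LongDoubleBits == 64 &&
      (Name == "frexpl" || Name == "ldexpl" || Name == "modfl"))
    Name.pop_back();
  return Name;
}
```

Other `*l()` functions, such as `acosl`, pass through unchanged in this sketch.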

Reviewed by: hubert.reinterpretcast, daltenty

Differential Revision: https://reviews.llvm.org/D137986
2022-11-18 11:36:56 -05:00
Alexander Shaposhnikov f102fe7304 Revert "Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm""
This reverts commit 7f608a2497
and removes the dependency of Object on IRPrinter.
2022-11-18 08:58:31 +00:00
Mikhail Goncharov 7f608a2497 Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"
This reverts commit 34ab474348.

as it has introduced circular dependency lib - analysis
2022-11-18 09:25:45 +01:00
Alexander Shaposhnikov 34ab474348 [opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm
Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D137768
2022-11-18 05:04:07 +00:00
Alexander Shaposhnikov 7059a6c32c [IR] Split out IR printing passes into IRPrinter
This diff splits out (from LLVMCore) IR printing passes into IRPrinter.
This structure is similar to what we already have for IRReader and
enables us to avoid circular dependencies between LLVMCore and Analysis
(this is a preparation for https://reviews.llvm.org/D137768).
The legacy interface is left unchanged, once the legacy pass manager
is removed (in the future) we will be able to clean it up further.
The bazel build configuration has been updated as well.

Test plan:
1/ Tested the following cmake configurations: static/dynamic linking * lld/gold * clang/gcc
2/ bazel build --config=generic_clang @llvm-project//...

Differential revision: https://reviews.llvm.org/D138081
2022-11-18 01:47:56 +00:00
Jennifer Yu 1e054e6b52 [OPENMP5.1] Initial support for severity clause
Differential Revision: https://reviews.llvm.org/D138227
2022-11-17 16:05:02 -08:00
Doru Bercea 98bfd7f976 Fix declare target implementation to support enter. 2022-11-17 17:35:53 -06:00
Fangrui Song fc91c70593 Revert D135411 "Add generic KCFI operand bundle lowering"
This reverts commit eb2a57ebc7.

llvm/include/llvm/Transforms/Instrumentation/KCFI.h including
llvm/CodeGen is a layering violation. We should use an approach where
Instrumentation/ doesn't need to include CodeGen/.
Sorry for not spotting this in the review.
2022-11-17 22:45:30 +00:00
Sami Tolvanen eb2a57ebc7 Add generic KCFI operand bundle lowering
The KCFI sanitizer emits "kcfi" operand bundles to indirect
call instructions, which the LLVM back-end lowers into an
architecture-specific type check with a known machine instruction
sequence. Currently, KCFI operand bundle lowering is supported only
on 64-bit X86 and AArch64 architectures.

As a lightweight forward-edge CFI implementation that doesn't
require LTO is also useful for non-Linux low-level targets on
other machine architectures, add a generic KCFI operand bundle
lowering pass that's only used when back-end lowering support is not
available and allows -fsanitize=kcfi to be enabled in Clang on all
architectures.

Reviewed By: nickdesaulniers, MaskRay

Differential Revision: https://reviews.llvm.org/D135411
2022-11-17 21:55:00 +00:00
Ben Shi 84ef723573 [clang] Fix wrong ABI of AVRTiny.
A scalar which exceeds 4 bytes should be returned via a stack slot,
on an AVRTiny device.
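
The rule can be written as a one-line size check. This is a sketch, not Clang's
ABI code: the function name is hypothetical, and the 8-byte limit used for
non-tiny AVR devices is an assumption of this example rather than something
stated in the commit message.

```cpp
#include <cstddef>

// Hypothetical predicate: does a scalar of the given size get returned via a
// stack slot instead of in registers?
bool returnedViaStackSlot(std::size_t ScalarBytes, bool IsAVRTiny) {
  // 4 bytes on AVRTiny (from the commit message); 8 bytes otherwise
  // (assumed here for illustration).
  const std::size_t MaxRegisterBytes = IsAVRTiny ? 4 : 8;
  return ScalarBytes > MaxRegisterBytes;
}
```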

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D138125
2022-11-17 08:38:44 +08:00
Eli Friedman 0fcb26c5b6 [clang] Fix __try/__finally blocks in C++ constructors.
We were crashing trying to convert a GlobalDecl from a
CXXConstructorDecl.  Instead of trying to do that conversion, just pass
down the original GlobalDecl.

I think we could actually compute the correct constructor/destructor
kind from the context, given the way Microsoft mangling works, but it's
simpler to just pass through the correct constructor/destructor kind.

Differential Revision: https://reviews.llvm.org/D136776
2022-11-16 15:13:33 -08:00
Tom Honermann 3e25ae605e [Clang] Correct when Itanium ABI guard variables are set for non-block variables with static or thread storage duration.
Previously, Itanium ABI guard variables were set after initialization was
complete for non-block declared variables with static and thread storage
duration. That resulted in initialization of such variables being restarted
in cases where the variable was referenced while it was still under
construction. Per C++20 [class.cdtor]p2, such references are permitted
(though the value obtained by such an access is unspecified). The late
initialization resulted in recursive reinitialization loops for cases like
this:
  template<typename T>
  struct ct {
    struct mc {
      mc() { ct<T>::smf(); }
      void mf() const {}
    };
    thread_local static mc tlsdm;
    static void smf() { tlsdm.mf(); }
  };
  template<typename T>
  thread_local typename ct<T>::mc ct<T>::tlsdm;
  int main() {
    ct<int>::smf();
  }

With this change, guard variables are set before initialization is started
so as to avoid such reinitialization loops.
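
The difference between the two orderings can be modelled without any compiler
support. The sketch below is a simulation, not the Itanium ABI mechanism
itself: `GuardedVariable` and its members are invented names, and the recursion
is capped at `MaxRuns` so that the "late guard" case terminates.

```cpp
// Simulates one guarded variable whose initializer refers back to the
// variable while it is still under construction.
struct GuardedVariable {
  static constexpr int MaxRuns = 5; // cap so the late-guard case terminates

  bool SetGuardBeforeInit; // true models the behavior after this change
  bool Guard = false;      // the guard variable
  int InitRuns = 0;        // how many times initialization was started

  explicit GuardedVariable(bool Early) : SetGuardBeforeInit(Early) {}

  // Every access checks the guard and initializes on demand.
  void access() {
    if (!Guard)
      initialize();
  }

  void initialize() {
    if (SetGuardBeforeInit)
      Guard = true; // new behavior: mark as initialized up front
    ++InitRuns;
    if (InitRuns < MaxRuns)
      access(); // the initializer references the variable itself
    Guard = true; // old behavior: guard only set once initialization is done
  }
};
```

With the guard set first, one call to `access()` starts initialization exactly
once. With the guard set last, every nested access restarts it until the cap is
reached.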

Fixes https://github.com/llvm/llvm-project/issues/57828

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D135919
2022-11-16 16:31:35 -05:00
Jennifer Yu 628fdc3f57 [OPENMP]Initial support for at clause
Error directive is allowed in both declared and executable contexts.
The function ActOnOpenMPAtClause is called in both places during
parsing.

Add a param "bool InExContext" to identify the context, which is used to
emit the error message.

Differential Revision: https://reviews.llvm.org/D137851
2022-11-15 14:06:50 -08:00
Michele Scandale b7d7c448df Fix `unsafe-fp-math` attribute emission.
The conditions under which Clang emits the `unsafe-fp-math` function
attribute were modified as part of
`84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`.
In the backend code generators `"unsafe-fp-math"="true"` enables floating
point contraction for the whole function.
The intent of the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`
was to prevent backend code generators performing contractions when that
is not expected.
However the change is inaccurate and incomplete because it allows
`unsafe-fp-math` to be set also when only in-statement contraction is
allowed.

Consider the following example
```
float foo(float a, float b, float c) {
  float tmp = a * b;
  return tmp + c;
}
```
and compile it with the command line
```
clang -fno-math-errno -funsafe-math-optimizations -ffp-contract=on \
  -O2 -mavx512f -S -o -
```
The resulting assembly has a `vfmadd213ss` instruction which corresponds
to a fused multiply-add. From the user perspective there shouldn't be
any contraction because the multiplication and the addition are not in
the same statement.
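
Why contraction matters to users can be shown numerically. The sketch below is
not part of the patch: `unfused` and `fused` are illustrative names, and the
`volatile` temporary is a device of this example to make sure the product is
rounded to `float` before the addition, whatever the compiler's contraction
setting is.

```cpp
#include <cmath>

// Multiply, round the product to float, then add: two roundings.
float unfused(float a, float b, float c) {
  volatile float tmp = a * b; // volatile blocks fusing across the statements
  return tmp + c;
}

// Fused multiply-add: the exact product is added to c, then rounded once.
float fused(float a, float b, float c) {
  return std::fma(a, b, c);
}
```

For example, with a = b = 1 + 2^-13 and c = -(1 + 2^-12), the exact product is
1 + 2^-12 + 2^-26. Rounding it to `float` drops the 2^-26 term, so `unfused`
returns 0, while `fused` returns 2^-26.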

The optimized IR is:
```
define float @test(float noundef %a, float noundef %b, float noundef %c) #0 {
  %mul = fmul reassoc nsz arcp afn float %b, %a
  %add = fadd reassoc nsz arcp afn float %mul, %c
  ret float %add
}

attributes #0 = {
  [...]
  "no-signed-zeros-fp-math"="true"
  "no-trapping-math"="true"
  [...]
  "unsafe-fp-math"="true"
}
```
The `"unsafe-fp-math"="true"` function attribute allows the backend code
generator to perform `(fadd (fmul a, b), c) -> (fmadd a, b, c)`.

In the current IR representation there is no way to determine the
statement boundaries from the original source code.
Because of this for in-statement only contraction the generated IR
doesn't have instructions with the `contract` fast-math flag and
`llvm.fmuladd` is being used to represent contractions opportunities
that occur within a single statement.
Therefore `"unsafe-fp-math"="true"` can only be emitted when contraction
across statements is allowed.

Moreover the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7` doesn't
take into account that the floating point math function attributes can
be refined during IR code generation of a function to handle the cases
where the floating point math options are modified within a compound
statement via pragmas (see `CGFPOptionsRAII`).
For consistency `unsafe-fp-math` needs to be disabled if the contraction
mode for any scope/operation is not `fast`.
Similarly, for consistency the initialization of `UnsafeFPMath` in
`TargetOptions` for the backend code generation should take into
account the contraction mode as well.

Reviewed By: zahiraam

Differential Revision: https://reviews.llvm.org/D136786
2022-11-14 20:40:57 -08:00
Fangrui Song 77bf0df376 Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"
This reverts commit bf8381a8bc.

There is a layering violation: LLVMAnalysis depends on LLVMCore, so
LLVMCore should not include LLVMAnalysis header
llvm/Analysis/ModuleSummaryAnalysis.h
2022-11-14 15:51:03 -08:00
Alexander Shaposhnikov bf8381a8bc [opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm
Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D137768
2022-11-14 23:24:08 +00:00
Alexander Shaposhnikov 8c15c17e3b Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"
This reverts commit ef9e624694
for further investigation offline.
It appears to break the buildbot
llvm-clang-x86_64-sie-ubuntu-fast.
2022-11-14 21:31:30 +00:00
Alexander Shaposhnikov ef9e624694 [opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm
Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D137768
2022-11-14 21:11:07 +00:00
Arthur Eubanks cbcf123af2 [LegacyPM] Remove cl::opts controlling optimization pass manager passes
Move these to the new PM if they're used there.

Part of removing the legacy pass manager for optimization pipeline.

Reland with UseNewGVN usage in clang removed.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D137915
2022-11-14 09:38:17 -08:00
Akash Banerjee 87f652d31f Migrate getOrCreateInternalVariable from Clang to OMPIRBuilder.
This patch removes getOrCreateInternalVariable from Clang OMP CodeGen and replaces its uses with OMPBuilder::getOrCreateInternalVariable. Also refactors OMPBuilder::getOrCreateInternalVariable to change type of name from Twine to StringRef.

Differential Revision: https://reviews.llvm.org/D137720
2022-11-14 17:18:10 +00:00
Matt Arsenault 76db4e3c43 clang: Fix unnecessary truncation of resource limit values 2022-11-11 16:38:51 -08:00
Joshua Batista a5d14f757b Add builtin_elementwise_sin and builtin_elementwise_cos
Add codegen for llvm cos and sin elementwise builtins
The sin and cos elementwise builtins are necessary for HLSL codegen.
Tests were added to make sure that the expected errors are encountered
when these functions are given inputs of incompatible types.
The new builtins are restricted to floating point types only.

Reviewed By: craig.topper, fhahn

Differential Revision: https://reviews.llvm.org/D135011
2022-11-10 23:30:27 -08:00
gonglingqin da34aff90d [Clang][LoongArch] Implement __builtin_loongarch_crc_w_d_w builtin and add diagnostics
This patch adds support to prevent __builtin_loongarch_crc_w_d_w from compiling
on loongarch32 in the front end and adds diagnostics accordingly.

Reference: https://github.com/gcc-mirror/gcc/blob/master/gcc/config/loongarch/larchintrin.h#L175-L184

Depends on D136906

Differential Revision: https://reviews.llvm.org/D137316
2022-11-11 09:16:57 +08:00
gonglingqin 85f08c4197 [Clang][LoongArch] Implement __builtin_loongarch_dbar builtin
Differential Revision: https://reviews.llvm.org/D136906
2022-11-10 17:27:44 +08:00
Matt Jacobson dd9f7963e4 [ObjC] avoid crashing when emitting synthesized getter/setter and ptrdiff_t is smaller than long
On targets where ptrdiff_t is smaller than long, clang crashes when emitting
synthesized getters/setters that call objc_[gs]etProperty.  Explicitly emit a
zext/trunc of the ivar offset value (which is defined to long) to ptrdiff_t,
which objc_[gs]etProperty takes.

Add a test using the AVR target, where ptrdiff_t is smaller than long. Test
failed previously and passes now.

Differential Revision: https://reviews.llvm.org/D112049
2022-11-10 02:10:30 -05:00
OCHyams 4b6b2b1a42 Reapply: [Assignment Tracking][7/*] Add assignment tracking functionality to clang
Reverted in 98fa95492f.

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

This patch plumbs the AssignmentTrackingPass (AKA declare-to-assign), added in
the previous patch in this set, into the optimisation pipeline from
clang. clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp is the
main test for this patch.

Note: while clang (with the help of the declare-to-assign pass) can now emit
Assignment Tracking metadata, the llvm middle and back ends don't yet
understand it.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D132226
2022-11-09 09:28:41 +00:00
OCHyams 98fa95492f Revert "[Assignment Tracking][7/*] Add assignment tracking functionality to clang"
This reverts commit 28f9636edd.

Bot failure: https://lab.llvm.org/buildbot/#/builders/109/builds/50251
2022-11-08 18:43:05 +00:00
OCHyams 28f9636edd [Assignment Tracking][7/*] Add assignment tracking functionality to clang
The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

This patch plumbs the AssignmentTrackingPass (AKA declare-to-assign), added in
the previous patch in this set, into the optimisation pipeline from
clang. clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp is the
main test for this patch.

Note: while clang (with the help of the declare-to-assign pass) can now emit
Assignment Tracking metadata, the llvm middle and back ends don't yet
understand it.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D132226
2022-11-08 17:49:08 +00:00
Rageking8 94738a5ac3 Fix duplicate word typos; NFC
This revision fixes typos where there are 2 consecutive words which are
duplicated. There should be no code changes in this revision (only
changes to comments and docs). Do let me know if there are any
undesirable changes in this revision. Thanks.
2022-11-08 07:21:23 -05:00
Grace Jennings 86674f66cc [HLSL] Added HLSL this as a reference
This change makes `this` a reference instead of a pointer in
HLSL. HLSL does not have the `->` operator, and accesses through `this`
are with the `.` syntax.

Tests were added and altered to make sure
the AST accurately reflects the types.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D135721
2022-11-07 13:50:08 -08:00
Matthias Braun cafe50daf5 Explicitly initialize opaque pointer mode in CodeGenAction
Explicitly call `LLVMContext::setOpaquePointers` in `CodeGenAction`
before loading any IR files. With this we use the mode specified on the
command-line rather than lazily initializing it based on the contents of
the IR.

This helps when using `-fthinlto-index` which may end up mixing files
with typed and opaque pointer types which fails when the first file
happened to use typed pointers since we cannot downgrade IR with opaque
pointer types to typed pointer types.

Differential Revision: https://reviews.llvm.org/D137475
2022-11-07 12:31:28 -08:00
Jennifer Yu de14befa77 Remove redundant loads.
The redundant loads come from regenerating the captured var value when
processing has_device_addr; the captured var value has already been generated
in GenerateOpenMPCapturedVars and passed as Arg to generateInfoForCapture.
The fix uses Arg instead of regenerating the value, the same as for is_device_ptr.
2022-11-04 15:22:25 -07:00
Mike Rice c954cfeb57 Some uses of the preprocessor can result in multiple target regions on the
same line. Cases such as those in the associated lit tests can now be
supported.

This adds a 'Count' field to TargetRegionEntryInfo to differentiate
regions with the same source position.

The OffloadEntriesInfoManager routines are updated to maintain a count of
regions seen at a location. The registration of regions proceeds the same as
before, but now the next available count is always determined and used in the
offload entry.
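
The counting scheme can be sketched as follows. The types here are simplified
stand-ins invented for illustration (`RegionInfo`, `RegionRegistry`,
`registerRegion`); they are not the actual TargetRegionEntryInfo or
OffloadEntriesInfoManager interfaces.

```cpp
#include <map>
#include <string>
#include <utility>

// Simplified stand-in for the per-region info: a source position plus a
// count that separates regions sharing that position.
struct RegionInfo {
  std::string File;
  unsigned Line;
  unsigned Count;
};

// Hypothetical registry that hands out the next available count per location.
class RegionRegistry {
  std::map<std::pair<std::string, unsigned>, unsigned> SeenAtLocation;

public:
  RegionInfo registerRegion(const std::string &File, unsigned Line) {
    // operator[] value-initializes the counter to 0 on first use; the
    // post-increment returns the count for this region and advances it.
    const unsigned Count = SeenAtLocation[{File, Line}]++;
    return {File, Line, Count};
  }
};
```

Two regions produced by a macro expansion on the same line get counts 0 and 1,
while a region on a different line starts again at 0.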

Fixes: https://github.com/llvm/llvm-project/issues/52707

Differential Revision: https://reviews.llvm.org/D134816
2022-11-04 12:54:22 -07:00
Nikita Popov 304f1d59ca [IR] Switch everything to use memory attribute
This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.

The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.

High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.

Differential Revision: https://reviews.llvm.org/D135780
2022-11-04 10:21:38 +01:00
Freddy Ye a806fc2767 [X86] Support -march=raptorlake, meteorlake
Reviewed By: pengfei, skan, MaskRay

Differential Revision: https://reviews.llvm.org/D135937
2022-11-04 09:32:17 +08:00
Jan Sjodin 9ea2b150b5 [OpenMP][OMPIRBuilder] Migrate createOffloadEntriesAndInfoMetadata from clang to OpenMPIRBuilder
This patch moves createOffloadEntriesAndInfoMetadata to OpenMPIRBuilder,
along with the createOffloadEntry helper function. The clang-specific error handling is
invoked using a callback. This code will also be used by flang in the future.
2022-11-03 10:27:44 -04:00
Jennifer Yu ea64e66f7b [OPENMP]Initial support for error directive.
Differential Revision: https://reviews.llvm.org/D137209
2022-11-02 14:25:28 -07:00
Martin Storsjö 9b3834ef67 [clang] Fix inline builtin functions of an __asm__ renamed function with symbol prefixes
If a function is renamed with `__asm__`, the name provided is the
exact symbol name, without any extra implicit symbol prefixes.
If the target does use symbol prefixes, the IR level symbol gets
an `\01` prefix to indicate that it's a literal symbol name to be
taken as is.

When a builtin function is specialized by providing an inline
version of it, that inline function is named `<funcname>.inline`.

When the base function has been renamed due to `__asm__`, the inline
function ends up named `<asmname>.inline`. Up to this point,
things worked as expected.

However, for targets with symbol prefixes, one codepath that produced
the combined name `<asmname>.inline` used the mangled `asmname` with
`\01` prefix, while others didn't. This patch fixes this.

This fixes the combination of asm renamed builtin function, with
inline override of the function, on any target with symbol
prefixes (such as i386 windows and any Darwin target).

Differential Revision: https://reviews.llvm.org/D137073
2022-11-02 22:24:42 +02:00
Akash Banerjee a3463a9f5c [OpenMP][OpenMPIRBuilder] Migrate loadOffloadInfoMetadata from clang to OMPIRbuilder
This patch moves the implementation of the loadOffloadInfoMetadata to the OMPIRbuilder.

Differential Revision: https://reviews.llvm.org/D136872
2022-11-02 18:54:25 +00:00
Krzysztof Parzyszek 13918432cf [Hexagon] Add builtins and intrinsics for V6_v[add|sub]carryo 2022-10-31 13:41:31 -07:00
Jan Sjodin 67f8521cd4 [OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info
Re-apply of: 3d0e9edd8e
Reverted in: 0cb65b0a58

A function parameter was using the wrong type 'llvm::TargetRegion' instead of
'const llvm::TargetRegion&', which caused the error in the address sanitizer.
The correct type is now used.

This patch puts the individual target region information attributes into a
struct so that the nested mappings are not needed and passing the information
around is simplified.

Reviewed By: jdoerfert, mikerice

Differential Revision: https://reviews.llvm.org/D136601
2022-10-31 10:49:44 -04:00
Matt Arsenault 0ebd4638af clang: Improve errors for DiagnosticInfoResourceLimit
Print source location info and demangle the name, compared
to the default behavior.

Several observations:

1. Specially handling this seems to give source locations
without enabling debug info, and also gives columns compared
to the backend diagnostic.

2. We're duplicating diagnostic effort in DiagnosticInfo
and clang. This feels wrong, but clang can demangle and I guess
have better debug info available? Should clang really have any of this
code? For the purposes of this diagnostic, the important piece
is just reading the source location out of the llvm::Function.

3. lld is not duplicating the same effort as clang with LTO, and
just directly printing the DiagnosticInfo as-is. e.g.

  $ clang -fgpu-rdc
	lld: error: local memory (480000) exceeds limit (65536) in function '_Z12use_huge_ldsIiEvv'
	lld: error: local memory (960000) exceeds limit (65536) in function '_Z12use_huge_ldsIdEvv'

  $ clang -fno-gpu-rdc
	backend-resource-limit-diagnostics.hip:8:17: error: local memory (480000) exceeds limit (65536) in 'void use_huge_lds<int>()'
	__global__ void use_huge_lds() {
                ^
	backend-resource-limit-diagnostics.hip:8:17: error: local memory (960000) exceeds limit (65536) in 'void use_huge_lds<double>()'
	2 errors generated when compiling for gfx90a.

4. Backend errors are not observed with -save-temps and -fno-gpu-rdc or -flto,
and the compile incorrectly succeeds.

5. The backend version prints error: <location info>; clang prints <location info>: error:

6. -emit-codegen-only is totally broken for AMDGPU. MC
gets a null target streamer. I do not understand why this
is a thing. This just creates a horrible edge case.
Just work around this by emitting actual code instead of blocking
this patch.
2022-10-28 21:42:57 -07:00
Ben Langmuir e1f9983022 Move getenv for AS_SECURE_LOG_FILE to clang
Avoid calling getenv in the MC layer and let the clang driver do it so
that it is reflected in the command-line as an -mllvm option.

rdar://101558354

Differential Revision: https://reviews.llvm.org/D136888
2022-10-28 16:08:04 -07:00
Eduard Zingerman 524c640090 [clang][DebugInfo] Emit DISubprogram for extern functions with reserved names
Callsite `DISubprogram` entries are not generated for:
- builtin functions;
- external functions with reserved names (e.g. names starting with "__").

This limitation was added by the commit [1] as a workaround for the
situation described in [2] that triggered the IR verifier error.
The goal of the present commit is to lift this limitation by adjusting
the IR verifier logic.

The logic behind [1] is to avoid the following situation:
- a `DISubprogram` is added for some builtin function;
- there is some location where this builtin is also emitted by a
  transformation (w/o debug location);
- the `Verifier::visitCallBase` sees a call to a function with
  `DISubprogram` but w/o debug location and emits an error.

Here is an updated example of such situation taken from [2]:

```
extern "C" int memcmp(void *, void *, long);

struct a { int b; int c; int d; };

struct e { int f[1000]; };

bool foo(e g, e &h) {
  // DISubprogram for memcmp is created here when [1] is commented out
  return memcmp(&g, &h, sizeof(e));
}

bool bar(a &g, a &h) {
  // memcmp might be generated here by MergeICmps
  return g.b == h.b && g.c == h.c && g.d == h.d;
}
```

This triggers the verifier error when:
- compiled for AArch64:
  `clang++ -c -g -Oz -target aarch64-unknown-linux-android21 test.cpp`;
- [1] check is commented out.

Instead of forbidding generation of `DISubprogram` entries as in [1]
one can instead adjust the verifier to additionally check if callee
has a body. Functions w/o bodies cannot be inlined and thus verifier
warning is not necessary.

E.g. `llvm::InlineFunction` requires functions for which
`GlobalValue::isDeclaration() == false`.

[1] 568db780bb
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=1022296

Differential Revision: https://reviews.llvm.org/D136041
2022-10-28 08:07:54 -07:00
Kevin Athey 0cb65b0a58 Revert "[OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info"
This reverts commit 3d0e9edd8e.

Breaking HWASAN buildbot:
https://lab.llvm.org/buildbot/#/builders/236/builds/786

Shown by targeted builds breaking at this patch:
Built at this patch: https://lab.llvm.org/buildbot/#/builders/236/builds/803
Built at prior patch: https://lab.llvm.org/buildbot/#/builders/236/builds/804
2022-10-27 13:57:25 -07:00
Jan Sjodin 3d0e9edd8e [OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info
This patch puts the individual target region information attributes into a
struct so that the nested mappings are not needed and passing the information
around is simplified.

Reviewed By: jdoerfert, mikerice

Differential Revision: https://reviews.llvm.org/D136601
2022-10-25 11:15:36 -04:00
David Green af1bb287b4 [AArch64][ARM] Alter v8.3a complex neon intrinsics to be target-based, not preprocessor based
This alters the 8.3 complex intrinsics to be target-gated, as opposed to
hidden behind preprocessor macros. This is the last of arm_neon.h, and
follows the same formula as before.

Differential Revision: https://reviews.llvm.org/D135647
2022-10-25 14:35:11 +01:00
David Green 9c48b7f0e7 [AArch64][ARM] Alter v8.1a neon intrinsics to be target-based, not preprocessor based
As a continuation of D132034, this switches the QRDMX v8.1a neon
intrinsics over from preprocessor defines to be target-gated. As there
is no "rdma" or "qrdmx" target feature, they use the "v8.1a"
architecture feature directly.

This works well for AArch64, but something needs to be done for Arm at
the same time, as they both use the same header and tablegen emitter.
This patch opts for adding "v8.1a" and all dependent target features to
the Arm TargetParser, similar to what was recently done for AArch64 but
through initFeatureMap when the Architecture is parsed. I attempted to
make the code similar to the AArch64 backend.

Otherwise this is similar to the changes made in D132034.

Differential Revision: https://reviews.llvm.org/D135615
2022-10-25 09:02:52 +01:00
Markus Böck 3637dc601c [clang][CodeGen] Consistently return nullptr Values for void builtins and scalar initialization
A common postcondition of the various visitor functions in CodeGen is that instructions that do not return any value simply return a nullptr Value as a sentinel. However, this has not been the case for calls to some builtins returning void, nor for an initializer expression of the form `void()`. This would then lead to ICEs in CodeGen on code relying on nullptr being returned for void values, which is e.g. the case for conditional expressions [0].
This patch fixes that by returning nullptr Values for intrinsics known not to return any values as well as for a scalar initializer returning void.

Fixes https://github.com/llvm/llvm-project/issues/53127

[0] 266ec801fb/clang/lib/CodeGen/CGExprScalar.cpp (L4849-L4892)

Differential Revision: https://reviews.llvm.org/D136548
2022-10-24 21:41:13 +02:00
Erich Keane 975740bf8d "Reapply "GH58368: Correct concept checking in a lambda defined in concept""
This reverts commit cecc9a92cf.

The problem ended up being how we were handling the lambda-context in
code generation: we were assuming any decl context here would be a
named-decl, but that isn't the case.  Instead, we just replace it with
the concept's owning context.

Differential Revision: https://reviews.llvm.org/D136451
2022-10-24 12:36:54 -07:00
Erich Keane cecc9a92cf Revert "Reapply "GH58368: Correct concept checking in a lambda defined in concept"""
This reverts commit b876f6e2f2.

Still getting build failures on PPC AIX whose cause isn't obvious,
so reverting while I try to figure this out.
2022-10-24 12:20:23 -07:00
Erich Keane b876f6e2f2 Reapply "GH58368: Correct concept checking in a lambda defined in concept""
This reverts commit 5293016287.

Now with updating the ASTBitcodes to show that this AST is incompatible
from the last.
2022-10-24 11:46:54 -07:00
Erich Keane 5293016287 Revert "GH58368: Correct concept checking in a lambda defined in concept"
This reverts commit b7c922607c.

This seems to cause some problems with some modules related things,
which makes me think I should have updated the version-major in
ast-bit-codes?  Going to revert to confirm this was a problem, then
change that and re-try a commit.
2022-10-24 10:21:22 -07:00
Erich Keane b7c922607c GH58368: Correct concept checking in a lambda defined in concept
As that bug reports, the problem here is that the lambda's
'context-decl' was not set to the concept, and the lambda picked up
template arguments from the concept. So, we failed to get the correct
template arguments in SemaTemplateInstantiate.

However, a Concept Specialization is NOT a decl, it's an expression, so
we weren't able to put the concept in the decl tree like we needed.
This patch introduces a ConceptSpecializationDecl, which is the smallest
type possible to use for this purpose, containing only the template
arguments.

The net memory implication of this is turning a
trailing-objects into a pointer to a type with trailing-objects, so it
should be minor.

As future work, we may consider giving this type more responsibility, or
figuring out how to better merge duplicates, but as this is just a
template-argument collection at the moment, there isn't much value to
it.

Differential Revision: https://reviews.llvm.org/D136451
2022-10-24 06:32:18 -07:00
David Green 6f1e430360 [AArch64] Alter v8.5a FRINT neon intrinsics to be target-based, not preprocessor based
This switches the v8.5-a FRINT intrinsics over to be target-gated,
as opposed to hidden behind preprocessor defines. This one is pretty
simple, being AArch64 only.

Differential Revision: https://reviews.llvm.org/D135646
2022-10-24 11:22:06 +01:00
Chris Bieneman 4c7218e770 [HLSL] Remove unused frontend-generated ID
As @python3kgae pointed out we're going to want to assign these IDs
after optimization so that we can remove unused resources. This patch
just removes the unused ID value from the frontend metadata, clang code
generation, and updates associated test cases.

Reviewed By: python3kgae

Differential Revision: https://reviews.llvm.org/D136271
2022-10-21 12:41:09 -05:00
Paulo Matos 39d8597927 [clang] Fix typo in error message 2022-10-21 12:06:28 +02:00
Xiang Li 464926ef44 [HLSL] Disable integer promotion to avoid int16_t being promoted to int for HLSL.
short will be promoted to int in UsualUnaryConversions.
Disable it for HLSL to keep int16_t as 16bit.

Reviewed By: aaron.ballman, rjmccall

Differential Revision: https://reviews.llvm.org/D133668
2022-10-20 16:06:25 -07:00
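
The promotion that 464926ef44 turns off for HLSL is the ordinary C and C++ behaviour, so it can be observed in plain C++. The following is a minimal sketch, assuming a platform where `int` is wider than 16 bits; `negate16` is a hypothetical helper and not part of the patch.

```cpp
#include <cstdint>
#include <type_traits>

// In standard C++, arithmetic on a 16-bit integer first promotes the operand
// to int (assumption: int is wider than 16 bits on this platform).
static_assert(std::is_same_v<decltype(+std::int16_t{1}), int>,
              "unary + on int16_t yields int");
static_assert(std::is_same_v<decltype(std::int16_t{1} + std::int16_t{2}), int>,
              "binary + on two int16_t values yields int");

// Hypothetical helper: the negation is computed in int, so the result must be
// narrowed explicitly to get a 16-bit value back.
inline std::int16_t negate16(std::int16_t V) {
  return static_cast<std::int16_t>(-V);
}
```

With the promotion disabled, as the commit describes for HLSL, the intermediate result would stay 16 bits wide instead of widening to `int`.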
Xiang Li a7183a158d [NFC] [DirectX backend] move ResourceClass into llvm.
Move ResourceClass into llvm/Frontend/HLSL/HLSLResource.h so it could be shared between clang and DirectX backend.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D136134
2022-10-20 13:26:56 -07:00
Phoebe Wang 62ca79102c [X86][1/2] Support PREFETCHI instructions
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D136040
2022-10-20 08:46:01 +08:00
Prabhdeep Singh Soni 6149589127 [OMPIRBuilder] Support depend clause for task
This patch adds support for the `depend` clause for the `task`
construct.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D135695
2022-10-19 13:11:43 -04:00
Phoebe Wang bc1819389f [X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics
This is an alternative of D120395 and D120411.

Previously we used `__bfloat16` as a typedef of `unsigned short`. The
name may give users the impression that it is a brand new type representing
BF16, so they may use it in arithmetic operations, and we don't have
a good way to block that.

To solve the problem, we introduced `__bf16` to X86 psABI and landed the
support in Clang by D130964. Now we can solve the problem by switching
intrinsics to the new type.

Reviewed By: LuoYuanke, RKSimon

Differential Revision: https://reviews.llvm.org/D132329
2022-10-19 23:47:04 +08:00
Xiang Li 14ae5d2b74 [HLSL] Add SV_DispatchThreadID
Support SV_DispatchThreadID attribute.
Translate it into dx.thread.id in clang codeGen.

Reviewed By: beanz, aaron.ballman

Differential Revision: https://reviews.llvm.org/D133983
2022-10-18 16:17:19 -07:00
Joseph Huber 8c1449a84d [OpenMP] Make kernels have protected visibility
This patch changes the kernels generated by OpenMP to have protected
visibility. This is unlikely to change anything functionally. However,
protected visibility better matches the behaviour of these GPU kernels.
We do not expect any pending shared library load to preempt these
kernels so we can specify a more restrictive visibility.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D136198
2022-10-18 16:37:28 -05:00
Dominik Adamski ccd314d320 [OpenMP][OMPIRBuilder] Add generation of SIMD align assumptions to OMPIRBuilder
Currently generation of align assumptions for OpenMP simd construct is done
outside OMPIRBuilder for C code and it is not supported for Fortran.

According to OpenMP 5.0 standard (2.9.3) only pointers and arrays can be
aligned for C code.

If the given aligned variable is a pointer, then Clang generates the following set
of LLVM IR instructions to support the simd align clause:

; memory allocation for pointer address:
%A.addr = alloca ptr, align 8
; some LLVM IR code
; Alignment instructions (alignment is equal to 32):
%0 = load ptr, ptr %A.addr, align 8
call void @llvm.assume(i1 true) [ "align"(ptr %0, i64 32) ]

If the given aligned variable is an array, then Clang generates the following set
of LLVM IR instructions to support the simd align clause:

; memory allocation for array:
%B = alloca [10 x i32], align 16
; some LLVM IR code
; Alignment instructions (alignment is equal to 32):
%arraydecay = getelementptr inbounds [10 x i32], ptr %B, i64 0, i64 0
call void @llvm.assume(i1 true) [ "align"(ptr %arraydecay, i64 32) ]

OMPIRBuilder was modified to generate aligned assumptions. It generates only
llvm.assume calls. Frontend is responsible for generation of aligned pointer
and getting the default alignment value if user does not specify it in aligned
clause.

Unit and regression tests were added to check if aligned clause was handled correctly.

Differential Revision: https://reviews.llvm.org/D133578

Reviewed By: jdoerfert
2022-10-18 02:04:18 -05:00
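
For context on ccd314d320, source code that carries an `aligned` clause on a pointer might look like the sketch below. The function names are hypothetical. Built without OpenMP support, the pragma is ignored and the loop is a plain loop.

```cpp
#include <cstddef>

// Scales a buffer in place. The aligned clause is a promise by the caller
// that A is 32-byte aligned; per the commit message, this promise is lowered
// to an llvm.assume call with an "align" operand bundle.
void scale(float *A, std::size_t N, float Factor) {
#pragma omp simd aligned(A : 32)
  for (std::size_t I = 0; I < N; ++I)
    A[I] *= Factor;
}

// Hypothetical caller: alignas(32) makes the promise in scale() true.
float sumScaled() {
  alignas(32) float Buf[8] = {1, 2, 3, 4, 5, 6, 7, 8};
  scale(Buf, 8, 2.0f);
  float Sum = 0;
  for (float V : Buf)
    Sum += V;
  return Sum;
}
```

Compiling with `-fopenmp-simd` should be enough for the pragma to take effect. Passing a pointer that is not actually 32-byte aligned would make the assumption false, which is why the caller aligns the buffer explicitly.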
Xiang Li 3a671c8e91 [NFC] Use llvm_unreachable instead of return after a switch in which all cases are covered. 2022-10-17 17:47:48 -07:00
Xiang Li 0674f2ec96 [NFC] Fix warning on no return after switch. 2022-10-17 15:52:23 -07:00
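
The pattern behind the two NFC commits above (0674f2ec96 and 3a671c8e91) can be sketched without LLVM headers as follows. The enum and function are hypothetical, and `std::abort()` stands in for `llvm_unreachable` so that the example is self-contained.

```cpp
#include <cstdlib>

// Hypothetical enumeration, for illustration only.
enum class ResourceKind { SRV, UAV, CBuffer };

// Every enumerator is handled, but the compiler cannot prove that Kind always
// holds a valid enumerator, so falling off the end of the function would
// trigger a "control reaches end of non-void function" style warning.
const char *kindName(ResourceKind Kind) {
  switch (Kind) {
  case ResourceKind::SRV:
    return "SRV";
  case ResourceKind::UAV:
    return "UAV";
  case ResourceKind::CBuffer:
    return "CBuffer";
  }
  // Stand-in for llvm_unreachable("unknown resource kind").
  std::abort();
}
```

Leaving out a `default:` label keeps the compiler's "enumerator not handled in switch" warning active, so adding a new enumerator later is still flagged.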
Artem Belevich a10eb07d1a Do not append terminating NUL to the binary string with embedded fatbin.
Extra NUL does not impact functionality of the generated code, but it confuses
various NVIDIA tools used to examine embedded GPU binaries.

Differential Revision: https://reviews.llvm.org/D135832
2022-10-17 15:39:39 -07:00
Xiang Li 13163dd8ab [HLSL] CodeGen hlsl resource binding.
''register(ID, space)'' like register(t3, space1) will be translated into
i32 3, i32 1 as the last 2 operands for resource annotation metadata.

NamedMetadata for CBuffers and SRVs are added as "hlsl.cbufs" and "hlsl.srvs", respectively.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D130951
2022-10-17 14:29:19 -07:00
Ellis Hoag 970e1ea01a [clang] Fix crash with -funique-internal-linkage-names
Calling `getFunctionLinkage(CalleeInfo.getCalleeDecl())` will crash when the declaration does not have a body, e.g., `extern void foo();`. Instead, we can use `isExternallyVisible()` to see if the declaration has internal linkage.

I believe using `!isExternallyVisible()` is correct because the clang linkage must be `InternalLinkage` or `UniqueExternalLinkage`, both of which are "internal linkage" in llvm.
9c26f51f5e/clang/include/clang/Basic/Linkage.h (L28-L40)

Fixes https://github.com/llvm/llvm-project/issues/54139

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D135926
2022-10-17 08:57:23 -07:00
Ting Wang ee703b5cb1 [clang][PowerPC] PPC64 VAArg fix right-alignment for aggregates fit in register
The PPC64 ABI passes aggregates smaller than a register in the least
significant bits of the register. In the case of variadic functions,
they will end up right-aligned in their argument slots in the argument
area on big-endian targets. Apply right-alignment for these aggregates.

Fixes #55900.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D133338
2022-10-16 22:01:47 -04:00
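
The kind of code affected by ee703b5cb1 is a variadic function that reads an aggregate smaller than a register with `va_arg`. This is a minimal sketch with hypothetical names, not the test case from the patch.

```cpp
#include <cstdarg>

// A 3-byte aggregate: smaller than a 64-bit register or argument slot.
struct Small {
  char Bytes[3];
};

// Reads Count arguments of type Small and sums their first byte.
int sumFirstBytes(int Count, ...) {
  va_list Args;
  va_start(Args, Count);
  int Sum = 0;
  for (int I = 0; I < Count; ++I) {
    // The generated code must look for the aggregate at the correct offset
    // within its argument slot for this read to see the right bytes.
    Small S = va_arg(Args, Small);
    Sum += S.Bytes[0];
  }
  va_end(Args);
  return Sum;
}
```

On a big-endian PPC64 target, the commit message says such an aggregate ends up right-aligned in its slot, so the `va_arg` lowering has to account for that offset.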
Kazu Hirata 647e48cf5f [clang] Use std::clamp (NFC)
Note that the constructor of MipsABIInfo guarantees that
MinABIStackAlignInBytes <= StackAlignInBytes, so we can use std::clamp
safely.
2022-10-16 10:11:29 -07:00
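
The precondition that 647e48cf5f relies on can be made explicit in a small sketch. `clampAlign` is a hypothetical helper; `std::clamp(v, lo, hi)` has undefined behaviour when `lo > hi`, which is why the guarantee from the constructor matters.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper mirroring the shape of the change: clamp an alignment
// into [MinAlign, MaxAlign]. The assert documents the std::clamp
// precondition that the lower bound must not exceed the upper bound.
unsigned clampAlign(unsigned Align, unsigned MinAlign, unsigned MaxAlign) {
  assert(MinAlign <= MaxAlign && "std::clamp requires lo <= hi");
  return std::clamp(Align, MinAlign, MaxAlign);
}
```

When the precondition holds, this gives the same result as `std::min(std::max(Align, MinAlign), MaxAlign)`.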
Jan Sjodin dd3d8ddb5f [OpenMP][OpenMPIRBuilder] Migrate OffloadEntriesInfoManager from clang to OMPIRbuilder
This patch moves the implementation of the OffloadEntriesInfoManager
to the OMPIRbuilder. This class will later be used by flang as well.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D135786
2022-10-16 08:32:40 -04:00
Chris Bieneman 911d2dc230 [NFC] [HLSL] Move common metadata to LLVMFrontend
This change pulls some code from the DirectX backend into a new
LLVMFrontendHLSL library to share utility data structures between the
HLSL code generation in Clang and the backend in LLVM.

This is a small refactoring as a first start to get code into the
right structure and get the library built and dependencies correct.

Fixes #58000 (https://github.com/llvm/llvm-project/issues/58000)

Reviewed By: python3kgae

Differential Revision: https://reviews.llvm.org/D135110
2022-10-14 13:40:04 -05:00
Akira Hatanaka 28f7087c91 [CodeGen][ObjC] Call synthesized copy constructor/assignment operator
functions in getter/setter functions of non-trivial C struct properties

This fixes a bug where the getter/setter functions were doing a trivial
copy instead of calling the synthesized functions that copy non-trivial
C struct types.

This fixes https://github.com/llvm/llvm-project/issues/56680.

Differential Revision: https://reviews.llvm.org/D131701
2022-10-14 10:40:24 -07:00
Kazu Hirata 41ac5d258d [clang] Fix a warning
This patch fixes:

  clang/lib/CodeGen/CGCall.cpp:1867:64: error: '&&' within '||'
  [-Werror,-Wlogical-op-parentheses]
2022-10-14 08:36:59 -07:00
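
The diagnostic fixed in 41ac5d258d is about operator precedence. A minimal sketch with hypothetical function names:

```cpp
// `A || B && C` parses as `A || (B && C)` because && binds tighter than ||.
// -Wlogical-op-parentheses asks for that grouping to be written out, which
// keeps the meaning and silences the warning.
bool grouped(bool A, bool B, bool C) { return A || (B && C); }

// The other possible grouping is a different function, which is why the
// warning asks the author to state the intent.
bool regrouped(bool A, bool B, bool C) { return (A || B) && C; }
```

The two functions differ for `A = true, B = false, C = false`: the first returns true and the second returns false.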
Zahira Ammarguellat 84a9ec2ff1 Remove redundant option -menable-unsafe-fp-math.
There are currently two options that are used to tell the compiler to perform
unsafe floating-point optimizations:
'-ffast-math' and '-funsafe-math-optimizations'.

'-ffast-math' is enabled by default. It automatically enables the driver option
'-menable-unsafe-fp-math'.
Below is a table illustrating the special operations enabled automatically by
'-ffast-math', '-funsafe-math-optimizations' and '-menable-unsafe-fp-math'
respectively.

Special Operations  -ffast-math  -funsafe-math-optimizations  -menable-unsafe-fp-math
MathErrno           0            1                            1
FiniteMathOnly      1            0                            0
AllowFPReassoc      1            1                            1
NoSignedZero        1            1                            1
AllowRecip          1            1                            1
ApproxFunc          1            1                            1
RoundingMath        0            0                            0
UnsafeFPMath        1            0                            1
FPContract          fast         on                           on

'-ffast-math' enables '-fno-math-errno', '-ffinite-math-only',
'-funsafe-math-optimizations' and sets 'FPContract' to 'fast'. The driver option
'-menable-unsafe-fp-math' enables the same special operations as
'-funsafe-math-optimizations'. This is redundant.
We propose to remove the driver option '-menable-unsafe-fp-math' and instead
use the settings of the special operations to set the function attribute
'unsafe-fp-math'. This attribute will be enabled only if those special
operations are enabled and if 'FPContract' is either 'fast' or set to the
default value.

Differential Revision: https://reviews.llvm.org/D135097
2022-10-14 10:55:29 -04:00
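
The rule stated in 84a9ec2ff1 for the 'unsafe-fp-math' attribute can be modelled as a small predicate. This is a toy sketch, not clang's implementation. It assumes that "those special operations" means the four flags that are 1 in every column of the table (AllowFPReassoc, NoSignedZero, AllowRecip, ApproxFunc). All type and function names are hypothetical.

```cpp
// Hypothetical model of the relevant flags; not clang's LangOptions.
enum class FPContractMode { Default, Off, On, Fast };

struct FPFlags {
  bool AllowFPReassoc = false;
  bool NoSignedZero = false;
  bool AllowRecip = false;
  bool ApproxFunc = false;
  FPContractMode Contract = FPContractMode::Default;
};

// Returns true when, under the assumptions above, the function attribute
// 'unsafe-fp-math' would be set.
bool wantsUnsafeFPMathAttr(const FPFlags &F) {
  const bool AllOpsEnabled =
      F.AllowFPReassoc && F.NoSignedZero && F.AllowRecip && F.ApproxFunc;
  const bool ContractOk = F.Contract == FPContractMode::Fast ||
                          F.Contract == FPContractMode::Default;
  return AllOpsEnabled && ContractOk;
}
```

Deriving the attribute from the individual flags removes the need for a separate driver option that restates them.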
Aaron Ballman 19e984ef8f Properly print unnamed TagDecl objects in diagnostics
The diagnostics engine is very smart about being passed a NamedDecl to
print as part of a diagnostic; it gets the "right" form of the name,
quotes it properly, etc. However, the result of using an unnamed tag
declaration was to print '' instead of anything useful.

This patch causes us to print the same information we'd have gotten if
we had printed the type of the declaration rather than the name of it,
as that's the most relevant information we can display.

Differential Revision: https://reviews.llvm.org/D134813
2022-10-14 08:18:28 -04:00
Benjamin Kramer c5d950f469 [HLSL] Simplify code and fix unused variable warnings. NFC. 2022-10-13 09:46:32 +02:00
Xiang Li ebe9c7f3e2 [HLSL] CodeGen hlsl cbuffer/tbuffer.
cbuffer A {
  float a;
  float b;
}

will be translated to a global variable.

Something like

struct CB_Ty {
  float a;
  float b;
};

CB_Ty A;

And all use of a and b will be replaced with A.a and A.b.

Only the non-legacy cbuffer layout is supported for now.
CodeGen for Resource binding will be in separate patch.
In the separate patch, resource binding will map the resource information to the global variable.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D130131
2022-10-12 21:17:38 -07:00
Nikita Popov 01bbe87fbb [CGStmt] Use helper functions to set memory attributes (NFC) 2022-10-12 16:38:39 +02:00
Michael Wyman 1fbb6d8b34 Fix assert in generated `direct` property getter/setters due to removal of `_cmd` parameter.
This fixes a bug from https://reviews.llvm.org/D131424 that removed the implicit `_cmd` parameter as an argument to `objc_direct` method implementations. In many cases the generated getter/setter will call `objc_getProperty` or `objc_setProperty`, both of which require the selector of the getter/setter; since `_cmd` didn't automatically have backing storage, attempting to load the address asserted.

For direct property generated getters/setters, this now passes an undefined/uninitialized/poison value as the `_cmd` argument to `objc_getProperty`/`objc_setProperty`. Prior to removing the `_cmd` argument from the ABI of direct methods, it was left uninitialized/undefined; although references within hand-implemented methods would load the selector in the method prologue, generated getters/setters never did and just forwarded the undefined value that was passed as the argument.

This change keeps the generated code mostly similar to before, passing an uninitialized/undefined/poison value; for setters, the value argument may be moved to another register.

Added a test that triggers the assert prior to the implementation code.

Differential Revision: https://reviews.llvm.org/D135091
2022-10-11 21:15:53 -07:00
Eli Friedman 1079662d2f [clang][codegen] Don't emit atomic loads for threadsafe init if they aren't inline
Performing a load before calling __cxa_guard_acquire is supposed to be
an optimization, but it isn't much of one if we're just going to emit a
call to __atomic_load_1 instead.  Instead, just skip the load, and
let __cxa_guard_acquire do whatever it wants.

(In practice, on such targets, the C++ library is just built with
threading turned off, so the result isn't actually threadsafe, but
there's not really anything clang can do about that.)

The alternative here is that we try to define some ABI for threadsafe
init that allows the speculative load without full atomics.  Almost any
target without full atomics has a load that's "atomic enough" for this
purpose. But it's not clear how we emit an "atomic enough" load in LLVM
IR, and there isn't any ABI document we can refer to.

Or I guess we could turn off -fthreadsafe-statics by default on
Cortex-M0, but that seems like it would be surprising.

Fixes https://github.com/llvm/llvm-project/issues/58184

Differential Revision: https://reviews.llvm.org/D135628
2022-10-11 14:00:33 -07:00
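
The two code shapes discussed in 1079662d2f can be modelled in portable C++. This is a conceptual sketch only: the mutex stands in for `__cxa_guard_acquire`/`__cxa_guard_release`, and all names are hypothetical. It is not the code clang emits.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-ins for the guard variable and the runtime's lock.
std::atomic<unsigned char> Guard{0};
std::mutex GuardMutex;
int Value = 0;
int InitCount = 0;

int computeValue() {
  ++InitCount;
  return 42;
}

// Slow path shared by both variants: models the guard acquire/release pair.
void initOnceSlow() {
  std::lock_guard<std::mutex> Lock(GuardMutex);
  if (Guard.load(std::memory_order_relaxed) == 0) {
    Value = computeValue();
    Guard.store(1, std::memory_order_release);
  }
}

// Target with inline atomics: a cheap acquire load skips the slow path once
// initialization has completed.
int getWithFastPath() {
  if (Guard.load(std::memory_order_acquire) == 0)
    initOnceSlow();
  return Value;
}

// Target without inline atomics, as in the commit: the speculative load
// would itself be a library call, so go straight to the slow path.
int getWithoutFastPath() {
  initOnceSlow();
  return Value;
}
```

Both variants run the initializer once. They differ only in whether a repeated call pays for the lock or for a single load.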
David Green b879f99f0e [AArch64][ARM] Alter most of arm_neon.h to be target-based, not preprocessor based.
Similar to D131064, this alters most of the intrinsics in arm_neon.h to
be target based, not preprocessor based. The intrinsics that are changed
are the ones with obvious target features (fp16, fp16fml, cryptos, i8mm
and bf16). The ones that are not yet altered are the ones without target
features like rdma (8.1) and complex (8.3). Those will be switched in a
followup patch that allows targeting architecture versions.

The existing ArchGuard in arm_neon.td is split into ArchGuard that still
adds ifdef defines (for example for intrinsics that require __aarch64__),
and TargetGuards for intrinsics dependent on target features. From there
the TargetGuards are used in two ways:
 - For intrinsics emitted as functions, __attribute__((target(TargetGuard)))
   is added to the definition of the function. Along with the existing
   always_inline attribute, this will give a compile time error if the
   function is used in a context where the target feature is not available.
 - For intrinsics emitted as macros, the __builtins are emitted into
   arm_neon.inc using TARGET_BUILTIN as opposed to BUILTIN, which includes
   the target feature and gives an error if the builtin is found in a
   function without the required features, similar to arm_sve.h.

The second method requires that the intrinsics be separable from the
existing _v intrinsics used in other types. For example
__builtin_neon_splat_lane_bf16 is used as opposed to
__builtin_neon_splat_lane_v. There are some adjustments to the CGBuiltin
to account for intrinsics that can be treated similarly, except for
their target features.

Differential Revision: https://reviews.llvm.org/D132034
2022-10-11 09:09:16 +01:00
Aaron Ballman 9ced729c2c Repair a confusing standards reference; NFC
There is no 6.9 in C++11, the quote actually lives in
[intro.multithread] for that revision. However, the words moved in
C++17 to [intro.progress] so I added that information as well.
2022-10-10 14:10:39 -04:00
Anton Bikineev 7b85e76500 [PGO] Consider parent context when weighing branches with likelihood.
Generally, with PGO enabled the C++20 likelihood attributes shall be
dropped assuming the profile has a good coverage. However, currently
this is not the case for the following code:

 if (always_false()) [[likely]] {
   ...
 }

The patch fixes this and drops the attribute, if the parent context was
executed in the profile. The patch still preserves the attribute, if the
parent context was not executed, e.g. to support the cases when the
profile has insufficient coverage.

Differential Revision: https://reviews.llvm.org/D134456
2022-10-08 23:49:27 +02:00