Commit Graph

16 Commits

Author SHA1 Message Date
Fangrui Song fd739804e0 [test] Add {{.*}} to make ELF tests immune to dso_local/dso_preemptable/(none) differences
For a default visibility external linkage definition, dso_local is set for ELF
-fno-pic/-fpie and COFF and Mach-O. Since default clang -cc1 for ELF is similar
to -fpic ("PIC Level" is not set), this nuance causes unneeded binary format differences.

To make emitted IR similar, ELF -cc1 -fpic will default to -fno-semantic-interposition,
which sets dso_local for default visibility external linkage definitions.

To make this flip smooth and enable future (dso_local as definition default),
this patch replaces (function) `define ` with `define{{.*}} `,
(variable/constant/alias) `= ` with `={{.*}} `, or inserts appropriate `{{.*}} `.
2020-12-31 00:27:11 -08:00
Matt Arsenault 0a7cd99a70 Reapply "OpaquePtr: Add type to sret attribute"
This reverts commit eb9f7c28e5.

Previously this was incorrectly handling linking of the contained
type, so this merges the fixes from D88973.
2020-10-16 11:05:02 -04:00
Tres Popp eb9f7c28e5 Revert "OpaquePtr: Add type to sret attribute"
This reverts commit 55c4ff91bd.

Issues were introduced as discussed in https://reviews.llvm.org/D88241
where this change made previous bugs in the linker and BitCodeWriter
visible.
2020-09-29 10:31:04 +02:00
Matt Arsenault 55c4ff91bd OpaquePtr: Add type to sret attribute
Make the corresponding change that was made for byval in
b7141207a4. Like byval, this requires a
bulk update of the test IR tests to include the type before this can
be mandatory.
2020-09-25 14:07:30 -04:00
Matt Arsenault 30eeb742f1 clang: Use byref for aggregate kernel arguments
Add address space to indirect abi info and use it for kernels.

Previously, indirect arguments assumed assumed a stack passed object
in the alloca address space using byval. A stack pointer is unsuitable
for kernel arguments, which are passed in a separate, constant buffer
with a different address space.

Start using the new byref for aggregate kernel arguments. Previously
these were emitted as raw struct arguments, and turned into loads in
the backend. These will lower identically, although with byref you now
have the option of applying an explicit alignment. In the future, a
reasonable implementation would use byref for all kernel arguments
(this would be a practical problem at the moment due to losing things
like noalias on pointer arguments).

This is mostly to avoid fighting the optimizer's treatment of
aggregate load/store. SROA and instcombine both turn aggregate loads
and stores into a long sequence of element loads and stores, rather
than the optimizable memcpy I would expect in this situation. Now an
explicit memcpy will be introduced up-front which is better understood
and helps eliminate the alloca in more situations.

This skips using byref in the case where HIP kernel pointer arguments
in structs are promoted to global pointers. At minimum an additional
patch is needed to allow coercion with indirect arguments. This also
skips using it for OpenCL due to the current workaround used to
support kernels calling kernels. Distinct function bodies would need
to be generated up front instead of emitting an illegal call.
2020-08-06 15:52:26 -04:00
Erik Pilkington de98cf92e3 [CodeGen] Add an alignment attribute to all sret parameters
This fixes a miscompile when the parameter is actually underaligned.
rdar://58316406

Differential revision: https://reviews.llvm.org/D74183
2020-03-24 15:31:57 -04:00
Tim Northover c46827c7ed LLVM IR: Generate new-style byval-with-Type from Clang
LLVM IR recently added a Type parameter to the byval Attribute, so that
when pointers become opaque and no longer have an element type the
information will still be present in IR.

For now the Type parameter is optional (which is why Clang didn't need
this change at the time), but it will become mandatory soon.

llvm-svn: 362652
2019-06-05 21:12:14 +00:00
Tim Northover fcb00d4aec Reapply: LLVM IR: update Clang tests for byval being a typed attribute.
Since byval is now a typed attribute it gets sorted slightly differently by
LLVM when the order of attributes is being canonicalized. This updates the few
Clang tests that depend on the old order.

Clang patch is unchanged.

llvm-svn: 362129
2019-05-30 18:49:19 +00:00
Tim Northover 4b281755ae Revert "LLVM IR: update Clang tests for byval being a typed attribute."
The underlying LLVM change couldn't cope with llvm-link and broke LTO builds.

llvm-svn: 362028
2019-05-29 20:45:32 +00:00
Tim Northover 45e8cc6639 LLVM IR: update Clang tests for byval being a typed attribute.
Since byval is now a typed attribute it gets sorted slightly differently by
LLVM when the order of attributes is being canonicalized. This updates the few
Clang tests that depend on the old order.

llvm-svn: 362013
2019-05-29 19:13:29 +00:00
Alexey Sotkin 20f65928e1 [OpenCL] Add '-cl-uniform-work-group-size' compile option
Summary:
OpenCL 2.0 specification defines '-cl-uniform-work-group-size' option,
which requires that the global work-size be a multiple of the work-group
size specified to clEnqueueNDRangeKernel and allows optimizations that
are made possible by this restriction.

The patch introduces the support of this option.

To keep information about whether an OpenCL kernel has uniform work
group size or not, clang generates 'uniform-work-group-size' function
attribute for every kernel:
- "uniform-work-group-size"="true" for OpenCL 1.2 and lower,
- "uniform-work-group-size"="true" for OpenCL 2.0 and higher if
 '-cl-uniform-work-group-size' option was specified,
- "uniform-work-group-size"="false" for OpenCL 2.0 and higher if no
 '-cl-uniform-work-group-size' options was specified.

If the function is not an OpenCL kernel, 'uniform-work-group-size'
attribute isn't generated.

Patch by: krisb

Reviewers: yaxunl, Anastasia, b-sumner

Reviewed By: yaxunl, Anastasia

Subscribers: nhaehnle, yaxunl, Anastasia, cfe-commits

Differential Revision: https://reviews.llvm.org/D43570

llvm-svn: 325771
2018-02-22 11:54:14 +00:00
Yaxun Liu f5f45e5e63 [AMDGPU] Switch to the new addr space mapping by default
This requires corresponding llvm change.

Differential Revision: https://reviews.llvm.org/D40956

llvm-svn: 324102
2018-02-02 16:08:24 +00:00
Yaxun Liu c325d30d2c CodeGen: Fix invalid bitcasts for memcpy
CreateCoercedLoad/CreateCoercedStore assumes pointer argument of
memcpy is in addr space 0, which is not correct and causes invalid
bitcasts for triple amdgcn---amdgiz.

It is fixed by using alloca addr space instead.

Differential Revision: https://reviews.llvm.org/D40806

llvm-svn: 320000
2017-12-07 01:39:52 +00:00
Matt Arsenault 3fe7395fbc AMDGPU: Use direct struct returns and arguments
This is an improvement over always using byval for
structs.

This will use registers until ~16 are used, and then
switch back to byval. This needs more work, since I'm
not sure it ever really makes sense to use byval. If
the register limit is exceeded, the arguments still
end up passed on the stack, but with a different ABI.
It also may make sense to base this on number of
registers used for non-struct arguments, rather than
just arguments that appear first in the argument list.

llvm-svn: 310527
2017-08-09 21:44:58 +00:00
Jan Vesely a6f369c727 [OpenCL] r600 needs OpenCL kernel calling convention
Differential Revision: https://reviews.llvm.org/D30236

llvm-svn: 295843
2017-02-22 15:01:42 +00:00
Matt Arsenault 88d7da01ca AMDGPU: Handle structs directly in AMDGPUABIInfo
Structs are currently handled as pointer + byval, which makes AMDGPU
LLVM backend generate incorrect code when structs are used. This patch
changes struct argument to be handled directly and without flattening,
which Clover (Mesa 3D Gallium OpenCL state tracker) will be able to
handle. Flattening would expand the struct to individual elements and
pass each as a separate argument, which Clover can not
handle. Furthermore, such expansion does not fit the OpenCL
programming model which requires to explicitely specify each argument
index, size and memory location.

Patch by Vedran Miletić

llvm-svn: 279463
2016-08-22 19:25:59 +00:00