llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	01adf1f3e5	AtomicExpand: Add some more overaligned atomic tests	2022-09-28 12:51:30 -04:00
Matt Arsenault	a61c3455c0	AtomicExpand: Use llvm.ptrmask instead of ptrtoint This removes the ptrtoint from the load's pointer operand, although we can't entirely eliminate these to get the LSB shift. In a future patch, this will avoid ptrtoint in the case where the atomic is overaligned to the word size.	2022-09-28 12:51:30 -04:00
Petar Avramovic	5cee9047d5	AMDGPU: Improve atomicrmw fadd selection Use same atomicrmw fadd expansion rules for gfx908, gfx940 and gfx11 as for gfx90a. Add missing globalisel legalizer support for flat atomicrmw fadd f32 on gfx940 and gfx11. Isel support for gfx11 will be added in D130579. Differential Revision: https://reviews.llvm.org/D131560	2022-09-23 17:52:10 +02:00
Petar Avramovic	48968c47b0	AMDGPU: Add detailed buffer, global and flat atomic fadd tests Precommit for D130579 that will remove manual selection and use patterns from td files. Tests are grouped based on target features. All patterns have rtn and no-rtn versions. Buffer atomics patterns are selected based on the intrinsic used (raw or struct) and the offset operand (imm or vgpr): _offset: raw with imm offset; _offen: raw with vgpr offset (or large imm offset); _idxen: struct with imm offset; _bothen: struct with vgpr offset (or large imm offset). Global and flat atomics are selected via intrinsic or the atomicrmw fadd. atomicrmw tests have amdgpu-unsafe-fp-atomics=true and non-system scope since they get expanded otherwise. atomicrmw fadd does not support vector type, test float and double. Global atomics patterns are selected based on address type via (global or flat) intrinsic or atomicrmw fadd with global address (addrspace(1)): 'no suffix': vgpr addrspace(1) address; _saddr: sgpr addrspace(1)* address. Flat atomics patterns are selected via (flat) intrinsic or atomicrmw fadd with flat address (* - address space 0). Differential Revision: https://reviews.llvm.org/D131561	2022-09-23 17:52:10 +02:00
Matt Arsenault	b9a371f6d1	AtomicExpand: Use correct pointer size for integer This was using the default address space.	2022-09-20 16:51:05 -04:00
Matt Arsenault	4d322ba77b	AMDGPU: Add baseline test for expansion of 16-bit local atomics The expansion is currently using the wrong pointer size.	2022-09-20 16:51:05 -04:00
Matt Arsenault	784d2930c0	AtomicExpand: Switch test to generated checks	2022-09-20 16:51:05 -04:00
Matt Arsenault	28e03692ae	AMDGPU: Fix expansion of 16-bit atomicrmw Fixes issue 57830	2022-09-20 14:47:40 -04:00
Matt Arsenault	a4b1f7a8b5	AMDGPU: Add some tests for atomics with excess alignment	2022-09-19 19:27:21 -04:00
Matt Arsenault	3f77df8e29	AMDGPU: Update baseline test checks	2022-09-19 18:57:33 -04:00
Marco Elver	f0d6709e4a	[AtomicExpandPass] Always copy pcsections Metadata to expanded atomics When expanding IR atomics to target-specific atomics, copy all !pcsections Metadata to expanded atomics automatically. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130885	2022-09-07 11:36:01 +02:00
Kai Luo	ad2f7fd286	[AtomicExpand] Make floating point conversion happens before fence insertion IIUC, the conversion part is not part of atomic operations and fences should be put around converted atomic operations. This also fixes atomic load of floating point values which requires fence on PowerPC. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D127609	2022-08-31 09:54:58 +08:00
gonglingqin	e9a4b8e397	[LoongArch] Optimize the atomic store with amswap_db.[w/d] When AtomicOrdering is release or stronger, use amswap_db.[w/d] $zero, $a1, $a0 instead of dbar 0 st.[w/d] $a0, $a1, 0 Thanks to @xry111 for the suggestion: https://reviews.llvm.org/D128901#3626635 Differential Revision: https://reviews.llvm.org/D129838	2022-08-23 17:11:57 +08:00
gonglingqin	47f3dc6d49	[LoongArch] Add codegen support for atomic fence, atomic load and atomic store Differential Revision: https://reviews.llvm.org/D128901	2022-07-13 15:25:45 +08:00
Kai Luo	6710b21d46	[PowerPC] Allow llvm.ppc.cfence to accept pointer types In the context of atomic load, integer, pointer and float point types are allowed, thus we should allow llvm.ppc.cfence to accept any type mentioned. Fixes https://github.com/llvm/llvm-project/issues/55983. Reviewed By: shchenz, vchuravy Differential Revision: https://reviews.llvm.org/D127554	2022-06-24 10:55:32 +08:00
Kai Luo	8091f7120c	[PowerPC] Correct test RUN line. NFC.	2022-06-14 14:56:00 +08:00
Kai Luo	029fc37270	[PowerPC][AtomicExpand] Precommit IR tests for D127609. NFC.	2022-06-14 14:24:21 +08:00
Kai Luo	18679ac0d7	[PowerPC] Adjust `MaxAtomicSizeInBitsSupported` on PPC64 AtomicExpandPass uses this variable to determine emitting libcalls or not. The default value is 1024 and if we don't specify it for PPC64 explicitly, AtomicExpandPass won't emit `__atomic_` libcalls for those target unable to inline atomic ops and finally the backend emits `__sync_` libcalls. Thanks @efriedma for pointing it out. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122868	2022-04-09 00:03:09 +00:00
Kai Luo	dc77769fc6	[PowerPC] Add cmpxchg test for pwr7 in atomic expand pass. NFC.	2022-04-01 13:27:54 +08:00
Kai Luo	31906a6090	[AtomicExpand][PowerPC] Fix all-one mask value When generating an all-one mask value whose bitwidth is larger than 64, signed extension should be used rather than zero extension. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D120865	2022-03-18 13:35:54 +08:00
Arthur Eubanks	2371c5a0e0	[OpaquePtr][ARM] Use elementtype on ldrex/ldaex/stlex/strex Includes verifier changes checking the elementtype, clang codegen changes to emit the elementtype, and ISel changes using the elementtype. Basically the same as D120527. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D121847	2022-03-16 14:11:53 -07:00
Arthur Eubanks	250620f76e	[OpaquePtr][AArch64] Use elementtype on ldxr/stxr Includes verifier changes checking the elementtype, clang codegen changes to emit the elementtype, and ISel changes using the elementtype. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D120527	2022-03-14 10:09:59 -07:00
Kai Luo	1cfcbf197c	[PowerPC][atomics] Precommit test cases for i128 cmpxchg. NFC.	2022-03-03 10:47:52 +08:00
Kai Luo	1453f048cf	[PowerPC] Add lit.local.cfg in AtomicExpand tests Fixed build errors on other platforms.	2021-07-20 09:13:50 +00:00
Kai Luo	e2ee27b20b	[PowerPC] Fallback to base's implementation of shouldExpandAtomicCmpXchgInIR and shouldExpandAtomicCmpXchgInIR If we can't decide `shouldExpandAtomicCmpXchgInIR` or `shouldExpandAtomicCmpXchgInIR` in PPC's implementation after https://reviews.llvm.org/rGb9c3941cd61de1e1b9e4f3311ddfa92394475f4b, resort to base's implementation. This fixes internal build of OpenMP which uses atomic operations on float. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D106234	2021-07-20 06:14:24 +00:00
LemonBoy	b577ec4956	[AtomicExpandPass][AArch64] Promote xchg with floating-point types to integer ones Follow the same strategy used for atomic loads/stores by converting the operands to equally-sized integer types. This change prevents the atomic expansion pass from generating illegal LL/SC pairs when targeting AArch64: `expand-atomicrmw-xchg-fp.ll` would previously instantiate intrinsics such as `llvm.aarch64.ldaxr.p0f32` that cannot be lowered. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D103232	2021-05-29 08:57:27 +02:00
Tomas Matheson	9d86095ff8	Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" This reverts commit `753185031d`.	2021-05-03 21:48:20 +01:00
Tomas Matheson	753185031d	[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164 Originally submitted as `3338290c18`. Reverted in `c7df6b1223`.	2021-05-03 20:25:15 +01:00
LemonBoy	4751cadcca	[AArch64] Prevent spilling between ldxr/stxr pairs Apply the same logic used to check if CMPXCHG nodes should be expanded at -O0: the register allocator may end up spilling some register in between the atomic load/store pairs, breaking the atomicity and possibly stalling the execution. Fixes PR48017 Reviewed By: efriedman Differential Revision: https://reviews.llvm.org/D101163	2021-05-01 17:17:05 +02:00
Tomas Matheson	c7df6b1223	Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" This reverts commit `3338290c18`. Broke expensive checks on debian.	2021-04-30 16:53:14 +01:00
Tomas Matheson	3338290c18	[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164	2021-04-30 16:40:33 +01:00
Stanislav Mekhanoshin	30b3aab329	Copy syncscope when expanding atomicrmw into cmpxchg loop Fixes: SWDEV-280070 Differential Revision: https://reviews.llvm.org/D99902	2021-04-05 17:29:38 -07:00
Konstantin Zhuravlyov	6054a456da	AMDGPU: Add support for amdgpu-unsafe-fp-atomics attribute If amdgpu-unsafe-fp-atomics is specified, allow {flat|global}_atomic_add_f32 even if atomic modes don't match. Differential Revision: https://reviews.llvm.org/D95391	2021-02-04 08:09:34 -05:00
Pavel Iliin	4d7df43ffd	[AArch64] Out-of-line atomics (-moutline-atomics) implementation. This patch implements out of line atomics for LSE deployment mechanism. Details how it works can be found in llvm/docs/Atomics.rst Options -moutline-atomics and -mno-outline-atomics to enable and disable it were added to clang driver. This is clang and llvm part of out-of-line atomics interface, library part is already supported by libgcc. Compiler-rt support is provided in separate patch. Differential Revision: https://reviews.llvm.org/D91157	2020-11-20 13:30:12 +00:00
Alex Richardson	5bc438efcf	[AtomicExpand] Avoid creating an unnamed libcall I recently modified this pass to better support CHERI-RISC-V and while doing so I noticed that this pass was calling M->getOrInsertFunction() with the result of TLI->getLibcallName(RTLibType). However, AMDGPU fills the libcalls array with nullptr, so this creates an anonymous function instead. This patch changes expandAtomicOpToLibcall to return false in case the libcall does not exist and changes the assert() in the callees to a report_fatal_error() instead. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88800	2020-11-02 17:52:37 +00:00
Matt Arsenault	af0207f2ba	AMDGPU: Check global FP atomics match default FP mode We would always select global FP atomics from atomicrmw fadd, although they have a hardcoded FP mode.	2020-09-23 09:07:50 -04:00
Krzysztof Parzyszek	25a4b1904c	Handle part-word LL/SC in atomic expansion pass Differential Revision: https://reviews.llvm.org/D77213	2020-04-28 10:07:39 -05:00
Jonathan Roelofs	7c5d2bec76	[llvm] Fix missing FileCheck directive colons https://reviews.llvm.org/D77352	2020-04-06 09:59:08 -06:00
Matt Arsenault	32137699f7	AMDGPU: Fix copy-pasted test name error	2019-12-11 19:44:47 +05:30
Matt Arsenault	e16a71382d	AMDGPU: Select global atomicrmw fadd This only works if there is no use of the return value.	2019-11-06 16:06:38 -08:00
Fangrui Song	ac14f7b10c	[lit] Delete empty lines at the end of lit.local.cfg NFC llvm-svn: 363538	2019-06-17 09:51:07 +00:00
Matt Arsenault	c5830f5f05	AtomicExpand: Don't crash on non-0 alloca This now produces garbage on AMDGPU with a call to a nonexistent, anonymous libcall but won't assert. llvm-svn: 363022	2019-06-11 01:35:07 +00:00
Matt Arsenault	383e72fcfe	AMDGPU: Expand < 32-bit atomics Also fix AtomicExpand asserting on atomicrmw fadd/fsub. llvm-svn: 363021	2019-06-11 01:35:00 +00:00
Eric Christopher	cee313d288	Revert "Temporarily Revert "Add basic loop fusion pass."" The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552	2019-04-17 04:52:47 +00:00
Eric Christopher	a863435128	Temporarily Revert "Add basic loop fusion pass." As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546	2019-04-17 02:12:23 +00:00
Philip Reames	9549f7560f	[AtomicExpand] Allow libcall expansion for non-zero address spaces (try 2) Restore a reverted commit, with the silly mistake fixed. Sorry for the previous breakage. Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead. Differential Revision: https://reviews.llvm.org/D58760 llvm-svn: 355540	2019-03-06 19:27:13 +00:00
Mitch Phillips	f0c21e2ff5	Revert "[AtomicExpand] Allow libcall expansion for non-zero address spaces" for buildbot failures. llvm-svn: 355461	2019-03-06 00:25:40 +00:00
Philip Reames	1e4c5d3611	[AtomicExpand] Allow libcall expansion for non-zero address spaces Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead. Differential Revision: https://reviews.llvm.org/D58760 llvm-svn: 355453	2019-03-05 23:00:14 +00:00
Matt Arsenault	a5840c3c39	Codegen support for atomicrmw fadd/fsub llvm-svn: 351851	2019-01-22 18:36:06 +00:00
Matt Arsenault	39508331ef	Reapply "IR: Add fp operations to atomicrmw" This reapplies commits r351778 and r351782 with RISCV test fixes. llvm-svn: 351850	2019-01-22 18:18:02 +00:00

70 Commits