llvm-project

Commit Graph

Author	SHA1	Message	Date
Jessica Paquette	844d8e0337	[GlobalISel] Combine icmp eq/ne x, 0/1 -> x when x == 0 or 1 This adds the following combines: ``` x = ... 0 or 1 c = icmp eq x, 1 -> c = x ``` and ``` x = ... 0 or 1 c = icmp ne x, 0 -> c = x ``` When the target's true value for the relevant types is 1. This showed up in the following situation: https://godbolt.org/z/M5jKexWTW SDAG currently supports the `ne` case, but not the `eq` case. This can probably be further generalized, but I don't feel like thinking that hard right now. This gives some minor code size improvements across the board on CTMark at -Os for AArch64. (0.1% for 7zip and pairlocalalign in particular.) Differential Revision: https://reviews.llvm.org/D109130	2021-09-02 15:05:31 -07:00
Konstantin Schwarz	4b4bc1ea16	[GlobalISel] Do not generate illegal G_SEXTLOADs after legalization The sext_inreg_of_load combine did not have the isLegalOrBeforeLegalizer check, leading to the generation of potentially illegal G_SEXTLOADs when run after legalization. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D108626	2021-08-25 10:13:39 +02:00
Sebastian Neubauer	fbae34635d	[GlobalISel] Add combine for PTR_ADD with regbanks Combine two G_PTR_ADDs, but keep the register bank of the constant. That way, the combine can be used in post-regbank-select combines. Introduce two helper methods in CombinerHelper, getRegBank and setRegBank that get and set an optional register bank to a register. That way, they can be used before and after register bank selection. Differential Revision: https://reviews.llvm.org/D103326	2021-08-17 13:58:16 +02:00
Jessica Paquette	50efbf9cbe	[GlobalISel] Narrow binops feeding into G_AND with a mask This is a fairly common pattern: ``` %mask = G_CONSTANT iN <mask val> %add = G_ADD %lhs, %rhs %and = G_AND %add, %mask ``` We have combines to eliminate G_AND with a mask that does nothing. If we combined the above to this: ``` %mask = G_CONSTANT iN <mask val> %narrow_lhs = G_TRUNC %lhs %narrow_rhs = G_TRUNC %rhs %narrow_add = G_ADD %narrow_lhs, %narrow_rhs %ext = G_ZEXT %narrow_add %and = G_AND %ext, %mask ``` We'd be able to take advantage of those combines using the trunc + zext. For this to work (or be beneficial in the best case) - The operation we want to narrow then widen must only be used by the G_AND - The G_TRUNC + G_ZEXT must be free - Performing the operation at a narrower width must not produce a different value than performing it at the original width after masking. Example comparison between SDAG + GISel: https://godbolt.org/z/63jzb1Yvj At -Os for AArch64, this is a 0.2% code size improvement on CTMark/pairlocalalign. Differential Revision: https://reviews.llvm.org/D107929	2021-08-13 18:31:13 -07:00
Amara Emerson	7ec4ce157b	[AArch64][GlobalISel] Relax oneuse restriction for PTR_ADD chain combining to check addressing legality. With contributions by Sebastian Neubauer Differential Revision: https://reviews.llvm.org/D105676	2021-08-10 16:41:18 -07:00
Amara Emerson	4c2e01232c	[GlobalISel] Fix a combine causing DBG_VALUE with dangling vregs. We should use MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval() instead of eraseFromParent(). We should probably use that in other places too but fix this issue which affects clang bootstrap builds for now.	2021-08-07 01:41:02 -07:00
Petar Avramovic	66de26b1f9	GlobalISel: Fix matchEqualDefs for instructions with multiple defs Instructions that produceSameValue produce same values for operands with same index. matchEqualDefs used to return true for any two values from different instructions that produce same values. Fix this by checking if values are defined by operands with the same index. Differential Revision: https://reviews.llvm.org/D107362	2021-08-05 15:05:45 +02:00
Dominik Montada	cc947e29ea	[GlobalISel] Combine shr(shl x, c1), c2 to G_SBFX/G_UBFX Reviewed By: foad Differential Revision: https://reviews.llvm.org/D107330	2021-08-05 13:52:10 +02:00
Amara Emerson	c54d5c9756	[GlobalISel] Use GMergeLikeOp to simplify a combine. NFC.	2021-07-29 13:53:16 -07:00
Amara Emerson	532c458fa8	[GlobalISel] Add GPtrAdd and use it in some combines.	2021-07-29 12:04:02 -07:00
Amara Emerson	c658b472f3	[GlobalISel] Add a constant folding combine. Use it in the AArch64 post-legal combiner. These don't always get folded because when the instructions are created the constants are obscured by artifacts. Differential Revision: https://reviews.llvm.org/D106776	2021-07-26 14:53:33 -07:00
Amara Emerson	dec34104bf	[GlobalISel] Add combine for merge(unmerge) and use AArch64 postlegal-combiner. Differential Revision: https://reviews.llvm.org/D106761	2021-07-26 10:37:31 -07:00
Amara Emerson	03cdb5221d	[GlobalISel] Fix load-or combine moving loads across potential aliasing stores. Although this combine checks that there's no load folding barriers between the loads that it's trying to merge, it was inserting the load at the MIRBuilder's default insertion point, which is the G_OR use inst. This was causing a miscompile in the test suite's SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-bswap-2 Differential Revision: https://reviews.llvm.org/D106251	2021-07-19 10:23:23 -07:00
Matt Arsenault	5a0d940f2a	GlobalISel: Preserve memory type for memset expansion	2021-07-16 11:41:32 -04:00
Amara Emerson	4e3dc6b8dd	GlobalISel: Introduce GenericMachineInstr classes and derivatives for idiomatic LLVM RTTI. This adds some level of type safety, allows helper functions to be added for specific opcodes for free, and also allows us to succinctly check for class membership with the usual dyn_cast/isa/cast functions. To start off with, add variants for the different load/store operations with some places using it. Differential Revision: https://reviews.llvm.org/D105751	2021-07-15 15:21:57 -07:00
Jessica Paquette	5da0f9ab61	[GlobalISel] Fix infinite loop in reassociationCanBreakAddressingModePattern It didn't update the opcode while walking through G_INTTOPTR/G_PTRTOINT. Differential Revision: https://reviews.llvm.org/D106080	2021-07-15 10:09:07 -07:00
Amara Emerson	f30251f527	[GlobalISel] Clean up CombinerHelper::apply* functions to return void. For some reason we/I started writing these as returning bool when the return value is actually ignored by the combiner.	2021-07-02 13:17:06 -07:00
Amara Emerson	0111da2ef8	[GlobalISel] Add re-association combine for G_PTR_ADD to allow better addressing mode usage. We're trying to match a few pointer computation patterns here for re-association opportunities. 1) Isolating a constant operand to be on the RHS, e.g.: G_PTR_ADD(BASE, G_ADD(X, C)) -> G_PTR_ADD(G_PTR_ADD(BASE, X), C) 2) Folding two constants in each sub-tree as long as such folding doesn't break a legal addressing mode. G_PTR_ADD(G_PTR_ADD(BASE, C1), C2) -> G_PTR_ADD(BASE, C1+C2) AArch64 code size improvements on CTMark with -Os: Program before after diff pairlocalalign 251048 251044 -0.0% consumer-typeset 421820 421812 -0.0% kc 431348 431320 -0.0% SPASS 413404 413300 -0.0% clamscan 384396 384220 -0.0% tramp3d-v4 370640 370412 -0.1% lencod 432096 431772 -0.1% bullet 479400 478796 -0.1% sqlite3 288504 288072 -0.1% 7zip-benchmark 573796 570768 -0.5% Geomean difference -0.1% Differential Revision: https://reviews.llvm.org/D105069	2021-07-02 12:31:21 -07:00
Matt Arsenault	28f2f66200	GlobalISel: Use LLT in memory legality queries This enables proper lowering of non-byte sized loads. We still aren't faithfully preserving memory types everywhere, so the legality checks still only consider the size.	2021-06-30 17:44:13 -04:00
Jon Roelofs	a642872476	[GISel] Support llvm.memcpy.inline Differential revision: https://reviews.llvm.org/D105072	2021-06-30 12:39:05 -07:00
Brendon Cahoon	f9f5d41545	[AMDGPU][GlobalISel] Legalize and select G_SBFX and G_UBFX Adds legalizer, register bank select, and instruction select support for G_SBFX and G_UBFX. These opcodes generate scalar or vector ALU bitfield extract instructions for AMDGPU. The instructions allow both constant or register values for the offset and width operands. The 32-bit scalar version is expanded to a sequence that combines the offset and width into a single register. There are no 64-bit vgpr bitfield extract instructions, so the operations are expanded to a sequence of instructions that implement the operation. If the width is a constant, then the 32-bit bitfield extract instructions are used. Moved the AArch64 specific code for creating G_SBFX to CombinerHelper.cpp so that it can be used by other targets. Only bitfield extracts with constant offset and width values are handled currently. Differential Revision: https://reviews.llvm.org/D100149	2021-06-28 09:06:44 -04:00
Eli Friedman	74909e4b6e	Rename MachineMemOperand::getOrdering -> getSuccessOrdering. Since this method can apply to cmpxchg operations, make sure it's clear what value we're actually retrieving. This will help ensure we don't accidentally ignore the failure ordering of cmpxchg in the future. We could potentially introduce a getOrdering() method on AtomicSDNode that asserts the operation isn't cmpxchg, but not sure that's worthwhile. Differential Revision: https://reviews.llvm.org/D103338	2021-06-21 16:49:27 -07:00
Jon Roelofs	a2ab765029	[GISel] Eliminate redundant bitmasking This was a GISel vs SDAG regression that showed up at -Os on arm64 in: SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding.test https://llvm.godbolt.org/z/aecjodsjG Differential revision: https://reviews.llvm.org/D103334	2021-06-17 12:53:00 -07:00
Jessica Paquette	e7f501b5e7	[GlobalISel][AArch64] Combine and (lshr x, cst), mask -> ubfx x, cst, width Also add a target hook which allows us to get around custom legalization on AArch64. Differential Revision: https://reviews.llvm.org/D99283	2021-06-01 10:56:17 -07:00
Amara Emerson	57ea5d4f48	[GlobalISel] Fix div+rem -> divrem combine causing use-def violation.	2021-05-19 23:13:41 -07:00
Jessica Paquette	84ae1cf8ed	Recommit "[GlobalISel] Simplify G_ICMP to true/false when the result is known" Add missing REQUIRES line to prelegalizer-combiner-icmp-to-true-false-known-bits.	2021-05-19 09:29:19 -07:00
Nico Weber	52a7797626	Revert "[GlobalISel] Simplify G_ICMP to true/false when the result is known" This reverts commit `892497c806`. Breaks tests, see comments on https://reviews.llvm.org/D102542	2021-05-19 09:02:27 -04:00
Jessica Paquette	892497c806	[GlobalISel] Simplify G_ICMP to true/false when the result is known Use existing KnownBits helpers from KnownBits.h to simplify G_ICMPs. E.g. x == x -> true x != x -> false load(x) > 1 -> true (when the load is known to be greater than 1) And so on. Differential Revision: https://reviews.llvm.org/D102542	2021-05-18 09:26:41 -07:00
Amara Emerson	808bc11d9e	[GlobalISel] Don't form zero/sign extending loads for atomics. For importing patterns, we only support matching G_LOAD, not G_ZEXTLOAD or G_SEXTLOAD. Differential Revision: https://reviews.llvm.org/D101932	2021-05-07 16:41:48 -07:00
Amara Emerson	1ccebb18ef	[GlobalISel] Micro-optimize the conditional branch optimization. Convert a check into an assert and pass an MI instead of recomputing in the apply function.	2021-05-07 00:03:09 -07:00
Amara Emerson	96ec6d91e4	[AArch64][GlobalISel] Simplify out of range rotate amount. Differential Revision: https://reviews.llvm.org/D101005	2021-04-29 14:05:58 -07:00
Yang Fan	0d7fd9f0d0	[GlobalISel] Fix Wint-in-bool-context warning (NFC) GCC warning: ``` /llvm-project/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp: In member function ‘bool llvm::CombinerHelper::matchFunnelShiftToRotate(llvm::MachineInstr&)’: /llvm-project/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:3882:35: warning: ?: using integer constants in boolean context, the expression will always evaluate to ‘true’ [-Wint-in-bool-context] 3882 \| Opc == TargetOpcode::G_FSHL ? TargetOpcode::G_ROTL : TargetOpcode::G_ROTR; \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ```	2021-03-31 09:59:43 +08:00
Amara Emerson	91887cd4ec	[AArch64][GlobalISel] Combine funnel shifts to rotates. Differential Revision: https://reviews.llvm.org/D99388	2021-03-30 11:00:36 -07:00
Jessica Paquette	700431128e	[GlobalISel][AArch64] Combine G_SEXT_INREG + right shift -> G_SBFX Basically a port of isBitfieldExtractOpFromSExtInReg in AArch64ISelDAGToDAG. This is only done post-legalization for now. Once the legalizer knows how to decompose these back into shifts, this requirement can probably be removed. Differential Revision: https://reviews.llvm.org/D99230	2021-03-30 10:14:30 -07:00
Tomas Matheson	a9968c0a33	[NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions Currently needsStackRealignment returns false if canRealignStack returns false. This means that the behavior of needsStackRealignment does not correspond to its name and description; a function might need stack realignment, but if it is not possible then this function returns false. Furthermore, needsStackRealignment is not virtual and therefore some backends have made use of canRealignStack to indicate whether a function needs stack realignment. This patch attempts to clarify the situation by separating them and introducing new names: - shouldRealignStack - true if there is any reason the stack should be realigned - canRealignStack - true if we are still able to realign the stack (e.g. we can still reserve/have reserved a frame pointer) - hasStackRealignment = shouldRealignStack && canRealignStack (not target customisable) Targets can now override shouldRealignStack to indicate that stack realignment is required. This change will make it easier in a future change to handle the case where we need to realign the stack but can't do so (for example when the register allocator creates an aligned spill after the frame pointer has been eliminated). Differential Revision: https://reviews.llvm.org/D98716 Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87	2021-03-30 17:31:39 +01:00
Christudasan Devadasan	4c6ab48fb1	GlobalISel: Try to combine G_[SU]DIV and G_[SU]REM It is good to have a combined `divrem` instruction when the `div` and `rem` are computed from identical input operands. Some targets can lower them through a single expansion that computes both division and remainder. This effectively reduces the number of instructions compared with expanding them individually. Reviewed By: arsenm, paquette Differential Revision: https://reviews.llvm.org/D96013	2021-03-10 18:46:07 +05:30
Amara Emerson	55e760769b	[GlobalISel] Fold away G_BUILD_VECTOR with all elements extracted. If every element is extracted from a G_BUILD_VECTOR, pass through the source registers. This is different to the extract(build_vector) combine because this one tolerates multiple users as long as they're exhaustive. Differential Revision: https://reviews.llvm.org/D97890	2021-03-09 11:34:26 -08:00
Amara Emerson	e60ab72137	[AArch64][GlobalISel] Add combine for extract_vector_elt(build_vector, cst) Differential Revision: https://reviews.llvm.org/D97835	2021-03-09 11:08:02 -08:00
Petar Avramovic	d44f61f81c	Reland [GlobalISel] Combine zext(trunc x) to x Recommit `4112299ee7`. Depends on `4c8fb7ddd6` which was reverted. Combine zext(trunc x) to x when truncated bits are known to be zero. Differential Revision: https://reviews.llvm.org/D96031	2021-03-05 11:05:37 +01:00
Nico Weber	59beb1ef6d	Revert "[GlobalISel] Combine zext(trunc x) to x" This reverts commit `4112299ee7`. Seems to depend on `4c8fb7ddd6` which is being reverted.	2021-03-04 10:13:40 -05:00
Petar Avramovic	4112299ee7	[GlobalISel] Combine zext(trunc x) to x Combine zext(trunc x) to x when truncated bits are known to be zero. Differential Revision: https://reviews.llvm.org/D96031	2021-03-04 15:05:23 +01:00
Jay Foad	a6be26710b	[GlobalISel] Make more use of replaceSingleDefInstWithReg. NFC.	2021-02-23 17:08:34 +00:00
Amara Emerson	5d6d9b63a3	[GlobalISel] Propagate extends through G_PHIs into the incoming value blocks. This combine tries to do inter-block hoisting of extends of G_PHIs, into the originating blocks of the phi's incoming value. The idea is to expose further optimization opportunities that are normally obscured by the PHI. Some basic heuristics, and a target hook for AArch64 is added, to allow tuning. E.g. if the extend is used by a G_PTR_ADD, it doesn't perform this combine since it may be folded into the addressing mode during selection. There are very minor code size improvements on AArch64 -Os, but the real benefit is that it unlocks optimizations like AArch64 conditional compares on some benchmarks. Differential Revision: https://reviews.llvm.org/D95703	2021-02-12 11:52:52 -08:00
Amara Emerson	de035c18cf	[GlobalISel] Fix sext_inreg(load) combine to not move the originating load. The builder was using the extend user as the insertion point, which meant that we were incorrectly "moving" the load from its original position, and therefore could violate memory operation ordering.	2021-02-11 19:27:09 -08:00
Craig Topper	11ef356d9e	[TargetLowering] Use Align in allowsMisalignedMemoryAccesses. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D96097	2021-02-04 19:22:06 -08:00
Jessica Paquette	02d4b365bf	[GlobalISel] Check if branches use the same MBB in matchOptBrCondByInvertingCond If the G_BR + G_BRCOND in this combine use the same MBB, then it will infinite loop. Don't allow that to happen. Differential Revision: https://reviews.llvm.org/D95895	2021-02-02 15:38:48 -08:00
Jessica Paquette	cbf5246359	Fix buildbot after `cfc6073017` Windows buildbots were not happy with using find_if + instructionsWithoutDebug. In `cfc6073017`, instructionsWithoutDebug is not technically necessary. So, just iterate over the block directly. http://lab.llvm.org:8011/#/builders/127/builds/4732/steps/7/logs/stdio	2021-01-19 10:38:04 -08:00
Jessica Paquette	cfc6073017	[GlobalISel] Combine (a[0]) \| (a[1] << k1) \| ...\| (a[m] << kn) into a wide load This is a restricted version of the combine in `DAGCombiner::MatchLoadCombine`. (See D27861) This tries to recognize patterns like below (assuming a little-endian target): ``` s8* x = ... s32 val = a[0] \| (a[1] << 8) \| (a[2] << 16) \| (a[3] << 24) -> s32 val = *((i32)a) s8* x = ... s32 val = a[3] \| (a[2] << 8) \| (a[1] << 16) \| (a[0] << 24) -> s32 val = BSWAP(*((s32)a)) ``` (This patch also handles the big-endian target case as well, in which the first example above has a BSWAP, and the second example above does not.) To recognize the pattern, this searches from the last G_OR in the expression tree. E.g. ``` Reg Reg \ / OR_1 Reg \ / OR_2 \ Reg .. / Root ``` Each non-OR register in the tree is put in a list. Each register in the list is then checked to see if it's an appropriate load + shift logic. If every register is a load + potentially a shift, the combine checks if those loads + shifts, when OR'd together, are equivalent to a wide load (possibly with a BSWAP.) To simplify things, this patch (1) Only handles G_ZEXTLOADs (which appear to be the common case) (2) Only works in a single MachineBasicBlock (3) Only handles G_SHL as the bit twiddling to stick the small load into a specific location An IR example of this is here: https://godbolt.org/z/4sP9Pj (lifted from test/CodeGen/AArch64/load-combine.ll) At -Os on AArch64, this is a 0.5% code size improvement for CTMark/sqlite3, and a 0.4% improvement for CTMark/7zip-benchmark. Also fix a bug in `isPredecessor` which caused it to fail whenever `DefMI` was the first instruction in the block. Differential Revision: https://reviews.llvm.org/D94350	2021-01-19 10:24:27 -08:00
Matt Arsenault	1f9b6ef91f	GlobalISel: Add combine for G_UREM by power of 2 Really I want this in the legalizer, but this is a start.	2021-01-07 16:36:35 -05:00
Amara Emerson	7df3544e80	[GlobalISel] Fix assertion failures after "GlobalISel: Return APInt from getConstantVRegVal" landed. APInt binary ops don't promote types but instead assert, which a combine was relying on.	2020-12-26 23:51:44 -08:00
Matt Arsenault	581d13f8ae	GlobalISel: Return APInt from getConstantVRegVal Returning int64_t was arbitrarily limiting for wide integer types, and the functions should handle the full generality of the IR. Also changes the full form which returns the originally defined vreg. Add another wrapper for the common case of just immediately converting to int64_t (arguably this would be useful for the full return value case as well). One possible issue with this change is some of the existing uses did break without conversion to getConstantVRegSExtVal, and it's possible some without adequate test coverage are now broken.	2020-12-22 22:23:58 -05:00
Jessica Paquette	b184a2eccf	[GlobalISel] Add matchers for specific constants and a matcher for negations It's fairly common to need matchers for a specific constant value, or for common idioms like finding a negated register. Add - `m_SpecificICst`, which returns true when matching a specific value. - `m_ZeroInt`, which returns true when an integer 0 is matched. - `m_Neg`, which returns true when a register is negated. Also update a few places which use idioms related to the new matchers. Differential Revision: https://reviews.llvm.org/D91397	2020-11-13 09:24:54 -08:00
Mirko Brkusanin	a75d6178b8	[GlobalISel] Add combine for (x \| mask) -> x when (x \| mask) == x If we have a mask, and a value x, where (x \| mask) == x, we can drop the OR and just use x. Differential Revision: https://reviews.llvm.org/D90952	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	fb36ab0a42	[GlobalISel] Expand combine for (x & mask) -> x when (x & mask) == x We can use KnownBitsAnalysis to cover cases when mask is not trivial. It can also help with cases when mask is not constant but can still be folded into one. Since 'and' is commutative we should treat both operands as possible replacements. Differential Revision: https://reviews.llvm.org/D90674	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	53ae95c946	[AMDGPU][GlobalISel] Combine shift + logic + shift with constant operands This sequence of instructions can be simplified if they are single use and some operands are constants. Additional combines may be applied afterwards. Differential Revision: https://reviews.llvm.org/D90223	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	de719586a8	[AMDGPU][GlobalISel] Fold a chain of two shift instructions with constant operands Sequence of same shift instructions with constant operands can be combined into a single shift instruction. Differential Revision: https://reviews.llvm.org/D90217	2020-11-10 11:32:12 +01:00
Aditya Nandakumar	bed8394047	[GISel]: Few InsertVecElt combines https://reviews.llvm.org/D88060 This adds the following combines 1) build_vector formation from insert_vec_elts 2) insert_vec_elts (build_vector) -> build_vector	2020-10-28 12:27:07 -07:00
Aditya Nandakumar	ef3d17482f	[GISel] Add combine for constant G_PTR_ADD offsets. https://reviews.llvm.org/D88865 This adds a single combine for GlobalISel to fold: ptradd (inttoptr C1) C2 Into: C1 + C2 Additionally, a small test for AArch64 is added. Patch by pnappa.	2020-10-13 17:26:12 -07:00
Mirko Brkusanin	52ba4fa6aa	[GlobalISel] Avoid making G_PTR_ADD with nullptr When the first operand is a null pointer we can avoid making a G_PTR_ADD and make a G_INTTOPTR with the offset operand. This helps us avoid making add with 0 later on for targets such as AMDGPU. Differential Revision: https://reviews.llvm.org/D87140	2020-10-13 13:02:55 +02:00
Jessica Paquette	a52e78012a	[GlobalISel] Combine (xor (and x, y), y) -> (and (not x), y) When we see this: ``` %and = G_AND %x, %y %xor = G_XOR %and, %y ``` Produce this: ``` %not = G_XOR %x, -1 %new_and = G_AND %not, %y ``` as long as we are guaranteed to eliminate the original G_AND. Also matches all commuted forms. E.g. ``` %and = G_AND %y, %x %xor = G_XOR %y, %and ``` will be matched as well. Differential Revision: https://reviews.llvm.org/D88104	2020-09-28 10:08:14 -07:00
Matt Arsenault	c463fd136e	GlobalISel: Fix truncating shift amount in trunc (shl) combine The shift amount type does not necessarily match the result type. This was inserting a trunc from s32 to s32, which asserted. Just preserve the original shift amount type which can be legalized later.	2020-09-23 09:07:50 -04:00
Michael Kitzan	c4e589b795	[GISel] Add new combines for unary FP instrs with constant operand https://reviews.llvm.org/D86393 Patch adds five new `GICombinerRules`, one for each of the following unary FP instrs: `G_FNEG`, `G_FABS`, `G_FPTRUNC`, `G_FSQRT`, and `G_FLOG2`. The combine rules perform the FP operation on the constant operand and replace the original instr with the result. Patch additionally adds new combiner tests for the AArch64 target to test these new combiner rules.	2020-09-16 10:34:15 -07:00
Volkan Keles	79378b1b75	GlobalISel: Fix a failing combiner test test/CodeGen/AArch64/GlobalISel/combine-trunc.mir was failing due to the different order for evaluating function arguments. This patch updates the related code to fix the issue.	2020-09-15 16:40:38 -07:00
Aditya Nandakumar	97203cfd6b	[GISel] Add new GISel combiners for G_MUL https://reviews.llvm.org/D87668 Patch adds two new GICombinerRules, one for G_MUL(X, 1) and another for G_MUL(X, -1). G_MUL(X, 1) is an identity combine, and G_MUL(X, -1) gets replaced with G_SUB(0, X). Patch additionally adds new combiner tests for the AArch64 target to test these new combiner rules, as well as updates AMDGPU GISel tests. Patch by mkitzan	2020-09-15 16:08:47 -07:00
Volkan Keles	a4e35cc2ec	GlobalISel: Add combines for G_TRUNC https://reviews.llvm.org/D87050	2020-09-15 15:50:34 -07:00
Quentin Colombet	b3afad0463	[GlobalISel] Add a `X, Y = G_UNMERGE(G_ZEXT Z)` -> X = G_ZEXT Z; Y = 0 combine Add a combiner helper to transform unmerge of zext into one zext and a constant 0 Differential Revision: https://reviews.llvm.org/D87427	2020-09-14 17:27:23 -07:00
Quentin Colombet	d2321129bd	[GlobalISel] Add `X,Y<dead> = G_UNMERGE Z` -> X = G_TRUNC Z Add a combiner helper that replaces G_UNMERGE where all the destination lanes are dead except the first one with a G_TRUNC. Differential Revision: https://reviews.llvm.org/D87174	2020-09-14 17:27:23 -07:00
Quentin Colombet	a36278c2f8	[GlobalISel] Add G_UNMERGE(Cst) -> Cst1, Cst2, ... combine Add a combiner helper that replaces G_UNMERGE of big constants into direct use of smaller constants. Differential Revision: https://reviews.llvm.org/D87166	2020-09-14 16:30:18 -07:00
Aditya Nandakumar	46f9137e43	[GISel]: Add combine for G_FABS to G_FABS https://reviews.llvm.org/D87554 Patch adds one new GICombinerRule for G_FABS. The combine rule folds G_FABS(G_FABS(X)) to G_FABS(X). Patch additionally adds new combiner tests for the AArch64 target to test this new combiner rule. Patch by mkitzan.	2020-09-14 15:56:24 -07:00
Quentin Colombet	670c276232	[GlobalISel] Add G_UNMERGE_VALUES(G_MERGE_VALUES) combine Add the matching and applying function to the combiner helper for G_UNMERGE_VALUES(G_MERGE_VALUES). This combine also supports any merge-like input nodes, like G_BUILD_VECTORS and is robust against bitcasts in between int unmerge and merge nodes. When the input type of the merge node and the output type of the unmerge node are not the same, but the sizes are, the combine still applies but creates bitcasts between the sources and the destinations instead of reusing the destinations directly. Long term, the artifact combiner should probably reuse that helper, but as of today, it doesn't use any outside helper, so I kept it this way. Differential Revision: https://reviews.llvm.org/D87117	2020-09-14 15:45:06 -07:00
Volkan Keles	d4bf90271f	GlobalISel: Combine fneg(fneg x) to x https://reviews.llvm.org/D87473	2020-09-10 12:57:38 -07:00
Amara Emerson	cc76da7ada	[GlobalISel] Rewrite the elide-br-by-swapping-icmp-ops combine to do less. This combine previously tried to take sequences like: %cond = G_ICMP pred, a, b G_BRCOND %cond, %truebb G_BR %falsebb %truebb: ... %falsebb: ... and by inverting the compare predicate and swapping branch targets, delete the G_BR and instead have a single conditional branch to the falsebb. Since in an earlier patch we have a combine to fold not(icmp) into just an inverted icmp, we don't need this combine to do as much. This patch instead generalizes the combine by just looking for: G_BRCOND %cond, %truebb G_BR %falsebb %truebb: ... %falsebb: ... and then inverting the condition using a not (xor). The xor can be folded away in a separate combine. This change also lets us avoid some optimization code in the IRTranslator. I also think that deleting G_BRs in the combiner is unnecessary. That's something that targets can decide to do at selection time and could simplify generic code in future. Differential Revision: https://reviews.llvm.org/D86664	2020-09-09 13:08:16 -07:00
Volkan Keles	1242dd330d	GlobalISel: Combine `op undef, x` to 0 https://reviews.llvm.org/D86611	2020-09-08 09:46:38 -07:00
Jay Foad	713c2ad60c	[GlobalISel] Extend not_cmp_fold to work on conditional expressions Differential Revision: https://reviews.llvm.org/D86709	2020-09-07 09:31:08 +01:00
Amara Emerson	d0abc75749	[GlobalISel] Disable the indexed loads combine completely unless forced. NFC. The post-index matcher, before it queries the target legality, walks uses of some instructions which in pathological cases can be massive. Since no targets actually support indexed loads yet, disable this to stop wasting compile time on something which is going to fail anyway.	2020-09-05 21:04:03 -07:00
Amara Emerson	520ab710fb	Revert "Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.")" This reverts commit `8693ddc743`. Re-committing with the test requiring asserts.	2020-09-01 14:29:04 -07:00
Jordan Rupprecht	8693ddc743	Revert "[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _)" (and dependent patch "Optimize away a Not feeding a brcond by using tbz instead of tbnz.") This reverts commit `8ad8f484b6`. It causes crashes when running `ninja check-llvm-codegen-aarch64-globalisel`, e.g. http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132/steps/test-stage1-compiler/logs/stdio. Note that the crash does not seem to reproduce in debug builds. `5ded444252` depends on this, so revert that too.	2020-09-01 13:31:57 -07:00
Amara Emerson	8ad8f484b6	[GlobalISel] Fold xor(cmp(pred, _, _), 1) -> cmp(inverse(pred), _, _) This is needed for an upcoming change to how we translate conditional branches which might generate these. Differential Revision: https://reviews.llvm.org/D86383	2020-09-01 10:57:17 -07:00
Volkan Keles	061182b7ba	GlobalISel: Add combines for extend operations https://reviews.llvm.org/D86516	2020-09-01 08:50:06 -07:00
Matt Arsenault	1b201914b5	GlobalISel: Combine out redundant sext_inreg The scalar tests don't work yet, since computeNumSignBits apparently doesn't handle sextload yet, and sext folds into the load first.	2020-08-28 17:57:31 -04:00
Aditya Nandakumar	db464a3dbf	[GISel] Add new GISel combiners for G_SELECT https://reviews.llvm.org/D83833 Patch adds two new GICombinerRules for G_SELECT. The rules include: combining selects with undef comparisons into their first selectee value, and to combine away selects with constant comparisons. Patch additionally adds a new combiner test for the AArch64 target to test these new G_SELECT combiner rules and the existing select_same_val combiner rule. Patch by mkitzan	2020-08-27 09:40:15 -07:00
Matt Arsenault	0b7f6cc71a	GlobalISel: Add generic instructions for memory intrinsics AArch64, X86 and Mips currently consume these directly and use custom lowering to produce a libcall, but really these should follow the normal legalization process through the libcall/lower action.	2020-08-26 20:08:45 -04:00
Matt Arsenault	eb074088c9	GlobalISel: Combine G_ADD of G_PTRTOINT to G_PTR_ADD This produces less work for addressing mode matching. I think this is safe since I don't think machine IR is supposed to give the same aliasing properties as getelementptr in the IR.	2020-08-26 08:57:15 -04:00
Matt Arsenault	e1644a3779	GlobalISel: Reduce G_SHL width if source is extension shl ([sza]ext x, y) => zext (shl x, y). Turns expensive 64 bit shifts into 32 bit if it does not overflow the source type. This is a port of an AMDGPU DAG combine added in `5fa289f0d8`. InstCombine does this already, but we need to do it again here to apply it to shifts introduced for lowered getelementptrs. This will help matching addressing modes that use 32-bit offsets in a future patch. TableGen annoyingly assumes only a single match data operand, so introduce a reusable struct. However, this still requires defining a separate GIMatchData for every combine, which is still annoying. Adds a morally equivalent function to the existing getShiftAmountTy. Without this, we would have to try to repeatedly query the legalizer info and guess at what type to use for the shift.	2020-08-24 09:42:40 -04:00
Jessica Paquette	d25b12bdc3	[GlobalISel] Add combine for (x & mask) -> x when (x & mask) == x If we have a mask, and a value x, where (x & mask) == x, we can drop the AND and just use x. This is about a 0.4% geomean code size improvement on CTMark at -O3 for AArch64. In AArch64, this is most useful post-legalization. Patterns like this often show up when legalizing s1s, which must be extended to larger types. e.g. ``` %cmp:_(s32) = G_ICMP ... %and:_(s32) = G_AND %cmp, 1 ``` Since G_ICMP only produces a single bit, there's no reason to mask it with the G_AND. Differential Revision: https://reviews.llvm.org/D85463	2020-08-19 10:20:57 -07:00
Amara Emerson	ed35344524	Use std::make_tuple instead of initializer lists to make a bot happy: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux	2020-08-18 14:55:52 -07:00
Amara Emerson	04a6ea5d77	[GlobalISel] Add a combine for sext_inreg(load x), c --> sextload x This is restricted to single use loads; folding these to sextloads lets us find more optimal addressing modes on AArch64. This also fixes an overload of the MachineFunction::getMachineMemOperand() method which was incorrectly using the MF alignment instead of the MMO alignment. Differential Revision: https://reviews.llvm.org/D85966	2020-08-18 10:42:15 -07:00
Amara Emerson	40e269ea6d	[GlobalISel] Add a combine for ashr(shl x, c), c --> sext_inreg x, c' By detecting this sign extend pattern early, we can uncover opportunities for more optimizations. Differential Revision: https://reviews.llvm.org/D85965	2020-08-18 10:42:15 -07:00
Jessica Paquette	bebe6a6449	[GlobalISel] Combine (logic_op (op x...), (op y...)) -> (op (logic_op x, y)) This implements ``` (logic_op (op x...), (op y...)) -> (op (logic_op x, y)) ``` when `op` is an extend, a shift, or an and. This is similar to `DAGCombiner::hoistLogicOpWithSameOpcodeHands` (with a bunch of missing cases, e.g. G_TRUNC, G_BITCAST, etc.) This is implemented so it works both pre and post-legalization. This also adds a general way to add a series of instructions in a combine. (`applyBuildInstructionSteps`). Differential Revision: https://reviews.llvm.org/D85050	2020-08-11 10:40:06 -07:00
Aditya Nandakumar	2144a3bdbb	[GISel] Add combiners for G_INTTOPTR and G_PTRTOINT https://reviews.llvm.org/D84909 Patch adds two new GICombinerRules, one for G_INTTOPTR and one for G_PTRTOINT. The G_INTTOPTR elides ptr2int(int2ptr(x)) to a copy of x, if the cast is within the same address space. The G_PTRTOINT elides int2ptr(ptr2int(x)) to a copy of x. Patch additionally adds new combiner tests for the AArch64 target to test these new combiner rules. Patch by mkitzan	2020-07-31 10:13:36 -07:00
Amara Emerson	645e7fc542	[GlobalISel] Use existing MIR builder instead of creating one in combiner.	2020-07-23 14:16:45 -07:00
Amara Emerson	3b10e42ba1	[AArch64][GlobalISel] Add post-legalize combine for sext(trunc(sextload)) -> trunc/copy On AArch64 we generate redundant G_SEXTs or G_SEXT_INREGs because of this. Differential Revision: https://reviews.llvm.org/D81993	2020-07-23 12:06:35 -07:00
Amara Emerson	791544422a	Revert "[AArch64][GlobalISel] Add post-legalize combine for sext_inreg(trunc(sextload)) -> copy" This reverts commit `64eb3a4915`. It caused miscompiles with optimizations enabled. Reverting while I investigate.	2020-07-21 16:01:18 -07:00
Amara Emerson	64eb3a4915	[AArch64][GlobalISel] Add post-legalize combine for sext_inreg(trunc(sextload)) -> copy On AArch64 we generate redundant G_SEXTs or G_SEXT_INREGs because of this. Differential Revision: https://reviews.llvm.org/D81993	2020-07-13 20:27:45 -07:00
Jessica Paquette	5a4c3f6b06	[GlobalISel] Look through extends etc in CombinerHelper::matchConstantOp It's possible to end up with a zext or something in the way of a G_CONSTANT, even pre-legalization. This can happen with memsets. e.g. https://godbolt.org/z/Bjc8cw To make sure we can catch these cases, use `getConstantVRegValWithLookThrough` instead of `mi_match`. Differential Revision: https://reviews.llvm.org/D81875	2020-06-15 16:34:25 -07:00
Amara Emerson	fc905ae003	[GlobalISel] Don't emit multiply by magic constant for zero memset values.	2020-06-15 14:42:14 -07:00
Jessica Paquette	1ac8451a9b	[GlobalISel] Simplify G_ADD when it has (0-X) on the LHS or RHS This implements the following combines: ((0-A) + B) -> B-A (A + (0-B)) -> A-B Porting over the basic algebraic combines from the DAGCombiner. There are several combines which fold adds away into subtracts. This is just the simplest one. I noticed that add combines are some of the most commonly hit across CTMark, (via print statements when they fire), so I'm porting over some of the obvious ones. This gives some minor code size improvements on CTMark at -O3 on AArch64. Differential Revision: https://reviews.llvm.org/D77453	2020-06-15 09:43:24 -07:00
Guillaume Chatelet	3b6196c9b3	[Alignment][NFC] TargetLowering::allowsMisalignedMemoryAccesses Summary: Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::allowsMisalignedMemoryAccesses` without marking it override. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81374	2020-06-09 10:17:42 +00:00
Christopher Tetreault	caa2fddce7	[SVE] Eliminate calls to default-false VectorType::get() from CodeGen Reviewers: efriedma, c-rhodes, david-arm, spatel, craig.topper, aqjune, paquette, arsenm, gchatelet Reviewed By: spatel, gchatelet Subscribers: wdng, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80313	2020-06-08 10:26:10 -07:00
Stanislav Mekhanoshin	f6a6de288b	GlobalISel: fix CombinerHelper::matchEqualDefs() This matcher was always returning true for different results of the same instruction. Differential Revision:	2020-05-29 09:30:02 -07:00

221 Commits
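
Several of the combines listed above rest on small bit-level identities. The following is a minimal sketch, not LLVM code, that checks a few of them exhaustively at a 6-bit width using plain Python integers. The helper names (`mask`, `shl`, `ashr`, `sext_inreg`, `check_identities`) and the `INVERSE` table are hypothetical and chosen only for illustration; the sketch also assumes the target's "true" value is 1, as the commit messages do.

```python
import operator

BITS = 6  # small width so every value can be checked exhaustively


def mask(bits):
    """All-ones value for a bits-wide integer."""
    return (1 << bits) - 1


def shl(value, amount, bits):
    """Logical shift left, truncated to bits."""
    return (value << amount) & mask(bits)


def ashr(value, amount, bits):
    """Arithmetic shift right of a bits-wide value."""
    signed = value - (1 << bits) if value >> (bits - 1) else value
    return (signed >> amount) & mask(bits)


def sext_inreg(value, width, bits):
    """Sign-extend the low `width` bits of value within a bits-wide register."""
    sign = 1 << (width - 1)
    low = value & mask(width)
    return ((low ^ sign) - sign) & mask(bits)


# Hypothetical table pairing each unsigned predicate with its inverse.
INVERSE = {
    operator.eq: operator.ne,
    operator.ne: operator.eq,
    operator.lt: operator.ge,
    operator.ge: operator.lt,
    operator.gt: operator.le,
    operator.le: operator.gt,
}


def check_identities(bits=BITS):
    """Return True if every identity holds for all bits-wide inputs."""
    m = mask(bits)
    values = range(1 << bits)
    for a in values:
        for b in values:
            # ((0 - A) + B) -> B - A, in wrapping arithmetic
            assert ((((0 - a) & m) + b) & m) == ((b - a) & m)
            # xor(cmp(pred, a, b), 1) -> cmp(inverse(pred), a, b)
            for pred, inv in INVERSE.items():
                assert (int(pred(a, b)) ^ 1) == int(inv(a, b))
    for x in values:
        for c in range(1, bits):
            # ashr(shl x, c), c -> sext_inreg x, (bits - c)
            assert ashr(shl(x, c, bits), c, bits) == sext_inreg(x, bits - c, bits)
    for x in (0, 1):
        assert (x & 1) == x        # (x & mask) -> x when (x & mask) == x
        assert int(x == 1) == x    # icmp eq x, 1 -> x when x is 0 or 1
        assert int(x != 0) == x    # icmp ne x, 0 -> x when x is 0 or 1
    return True
```

Calling `check_identities()` runs every check and returns `True` if none fails. The real combines additionally have to consider legality, use counts, and known-bits analysis, none of which this sketch models.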