llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	23da702d69	GlobalISel: Translate llvm.fshl/llvm.fshr	2020-03-30 11:34:42 -07:00
Guillaume Chatelet	bdf77209b9	[Alignment][NFC] Use Align version of getMachineMemOperand Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jyknight, sdardis, nemanjai, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, jfb, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77059	2020-03-30 15:46:27 +00:00
Matt Arsenault	cc3b5590d2	GlobalISel: Minor cleanups	2020-03-30 11:26:22 -04:00
Jessica Paquette	98d05f88d5	[GlobalISel] Fix equality for copies from physregs in matchEqualDefs When we see this: ``` %a = COPY $physreg ... SOMETHING implicit-def $physreg ... %b = COPY $physreg ``` The two copies are not equivalent, and so we shouldn't perform any folding on them. When we have two instructions which use a physical register check that they define the same virtual register(s) as well. e.g., if we run into this case ``` %a = COPY $physreg ... %b = COPY %a ``` we can say that the two copies are the same, and can be folded. Differential Revision: https://reviews.llvm.org/D76890	2020-03-27 17:52:21 -07:00
Guillaume Chatelet	74eac9031a	[Alignment][NFC] MachineMemOperand::getAlign/getBaseAlign Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, dschuff, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, jrtc27, atanasyan, jfb, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76925	2020-03-27 15:49:13 +00:00
Dominik Montada	9fedb6900d	[GlobalISel] add helper function to create arbitrary libcalls Summary: The existing helper function can only create a libcall to functions available in RTLIB. Add a helper function that can create a libcall to a given function name using the provided calling convention. Reviewers: aditya_nandakumar, t.p.northover, rovka, arsenm, dsanders Reviewed By: arsenm Subscribers: wdng, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76845	2020-03-26 16:11:13 +01:00
Matt Arsenault	39c55cef21	GlobalISel: Introduce bitcast legalize action For some operations, the type is unimportant and only the number of bits matters. For example I don't want to treat <4 x s8> as a legal type, but I also don't want to decompose loads of this into smaller pieces to get legal register types. On AMDGPU in SelectionDAG, we legalize a number of operations (most notably load and store) by coercing all types to vectors of i32. For GlobalISel, I'm trying very hard to avoid doing this for every type, but I don't think this strategy can be completely avoided. I'm trying to avoid bitcasts for any legitimately legal type we can operate on, since the intervening bitcasts have proven to be a hassle. For loads, I think I can get away without ever casting the result type, and handling any arbitrary bitwidth during selection (I will eventually want new tablegen support to help with this, rather than having to add every possible type as legal). The unmerge required to do anything with the value should expand to the expected shifts. This is trickier for stores, since it would now require handling a wide array of truncates during selection which I don't want. Future potentially interesting case are for vector indexing, where sub-dword type should be indexed in s32 pieces.	2020-03-24 19:33:33 -04:00
Jessica Paquette	02187ed45a	[GlobalISel] Combine G_SELECTs of the form (cond ? x : x) into x When we find something like this: ``` %a:_(s32) = G_SOMETHING ... ... %select:_(s32) = G_SELECT %cond(s1), %a, %a ``` We can remove the select and just replace it entirely with `%a` because it's always going to result in `%a`. Same if we have ``` %select:_(s32) = G_SELECT %cond(s1), %a, %b ``` where we can deduce that `%a == %b`. This implements the following cases: - `%select:_(s32) = G_SELECT %cond(s1), %a, %a` -> `%a` - `%select:_(s32) = G_SELECT %cond(s1), %a, %some_copy_from_a` -> `%a` - `%select:_(s32) = G_SELECT %cond(s1), %a, %b` -> `%a` when `%a` and `%b` are defined by identical instructions This gives a few minor code size improvements on CTMark at -O3 for AArch64. Differential Revision: https://reviews.llvm.org/D76523	2020-03-23 16:46:03 -07:00
Matt Arsenault	aa63eb6a46	GlobalISel: Add computeKnownBitsForTargetInstr I think we can save the MRI argument from these since it's in GISelKnownBits already, but currently not accessible. Implementation deferred to avoid dependency on other patches.	2020-03-23 15:02:30 -04:00
Jay Foad	0444d16a16	[GlobalISel] Add generic opcodes for saturating add/subtract Summary: Add new generic MIR opcodes G_SADDSAT etc. Add support in IRTranslator for translating the saturating add/subtract intrinsics to the new opcodes. Reviewers: aemerson, dsanders, paquette, arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76600	2020-03-23 15:16:45 +00:00
Dominik Montada	ccf49b9ef0	[GlobalISel] support widen unmerge if WideTy > SrcTy Summary: Widening G_UNMERGE_VALUES to a type which is larger than the original source type is the same as widening it to the same type as the source type: in both cases, G_UNMERGE_VALUES has to be replaced with bit arithmetic which. Although the arithmetic itself is independent of whether the source type is smaller or equal to the widen type, widening the source type to the widen type should result in less artifacts being emitted, since this is the type that the user explicitly requested. Reviewers: arsenm, dsanders, aemerson, aditya_nandakumar Reviewed By: arsenm, dsanders Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76494	2020-03-23 09:16:45 +01:00
Adrian Kuegel	baa6f6a782	Revert "[TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes" This reverts commit `e9f22fd429`. When building with -DLLVM_USE_SANITIZER="Thread", check-llvm has 70 failing tests with this revision, and 29 without this revision.	2020-03-20 11:02:50 +01:00
Jessica Paquette	c999084619	[GlobalISel] Port some basic shufflevector undef combines from the DAGCombiner Port over the following: - shuffle undef, undef, any_mask -> undef - shuffle anything, anything, undef_mask -> undef This sort of thing shows up a lot when you try to bugpoint code containing shufflevector. Differential Revision: https://reviews.llvm.org/D76382	2020-03-19 16:46:06 -07:00
lewis-revill	e9f22fd429	[TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes This patch generates TableGen descriptions for the specified register banks which contain a list of register sizes corresponding to the available HwModes. The appropriate size is used during codegen according to the current HwMode. As this HwMode was not available on generation, it is set upon construction of the RegisterBankInfo class. Targets simply need to provide the HwMode argument to the <target>GenRegisterBankInfo constructor. The RISC-V RegisterBankInfo constructor has been updated accordingly (plus an unused argument removed). Differential Revision: https://reviews.llvm.org/D76007	2020-03-18 19:52:23 +00:00
Jessica Paquette	dc5f982639	[GlobalISel] Port some basic undef combines from DAGCombiner.cpp This ports some combines from DAGCombiner.cpp which perform some trivial transformations on instructions with undef operands. Not having these can make it extremely annoying to find out where we differ from SelectionDAG by looking at existing lit tests. Without them, we tend to produce pretty bad code generation when we run into instructions which use undef operands. Also remove the nonpow2_store_narrowing testcase from arm64-fallback.ll, since we no longer fall back on the add. Differential Revision: https://reviews.llvm.org/D76339	2020-03-18 11:05:44 -07:00
Matt Arsenault	2e77362626	GlobalISel: Fix lower bswap for vectors This would hit an assertion from trying to use the wrong bitwidth for the constants.	2020-03-16 13:59:08 -04:00
Matt Arsenault	19a0350187	GlobalISel: Fix round lowering I used the implementation for floor instead of round. It also turns out the OpenCL builtin library wasn't using the round builtin, but implemented the expanded form.	2020-03-16 11:37:30 -04:00
Dominik Montada	8ff2dcb18b	[GlobalISel] add additional lowering support for G_INSERT Summary: Add lowering support for inserting pointers or scalars into scalars, vectors or pointers Reviewers: arsenm, dsanders Reviewed By: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75994	2020-03-16 16:27:17 +01:00
Dominik Montada	6b96623dcb	[GlobalISel] fix crash in narrowScalarExtract if DstRegs only has one register Summary: When narrowing a scalar G_EXTRACT where the destination lines up perfectly with a single result of the emitted G_UNMERGE_VALUES a COPY should be emitted instead of unconditionally trying to emit a G_MERGE_VALUES. Reviewers: arsenm, dsanders Reviewed By: arsenm Subscribers: wdng, rovka, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75743	2020-03-12 09:14:35 +01:00
Matt Arsenault	c0ad75e758	GlobalISel: Don't try to narrow extending loads/trunc store If the loaded memory size was smaller than the result size, this would produce out of bounds memory accesses. I'm wondering if we need a distinct narrow memory legalize action type, since a case I care about is decomposing a 4-byte unaligned access into 4 extending loads, which would leave the original result register type. I'm currently awkwardly using narrowScalar to handle unaligned accesses that need to be split.	2020-03-10 23:34:10 -04:00
Matt Arsenault	b17a81f8b2	GlobalISel: Add missing add/sub with carries to MachineIRBuilder	2020-03-10 22:39:55 -04:00
Matt Arsenault	ce8a1f7294	GlobalISel: Implement fewerElementsVector for G_TRUNC Extend fewerElementsVectorBasic to handle operands with different element types.	2020-03-10 15:17:20 -07:00
Amara Emerson	c1a97e992d	Revert "Revert "[GlobalISel][Localizer] Enable intra-block localization of already-local uses."" This reverts commit `5583c2f2fb`. The lldb bot failure was a test that was fragile and sensitive to irrelevant changes in instruction ordering. Re-committing this as the test should have been skipped for AArch64 now. Differential Revision: https://reviews.llvm.org/D75555	2020-03-06 21:35:08 -08:00
Dominik Montada	feb20a1594	[GlobalISel] add missing libcalls and 128-bit support for floating points Add libcall support for G_FMINNUM, G_FMAXNUM, G_FSQRT, G_FRINT, G_FNEARBYINT. Add 128-bit libcall support for all simple libcalls. Reviewers: arsenm, Petar.Avramovic, dsanders, petarj, paquette Subscribers: wdng, rovka, hiraditya, volkan, llvm-commits Differential Revision: https://reviews.llvm.org/D75516	2020-03-06 09:06:13 +01:00
Muhammad Omair Javaid	5583c2f2fb	Revert "[GlobalISel][Localizer] Enable intra-block localization of already-local uses." This reverts commit `e91e1df6ab`.	2020-03-05 03:12:28 +05:00
Matt Arsenault	b71203a751	GlobalISel: Move some legalizer functions to utils	2020-03-04 16:40:00 -05:00
Matt Arsenault	fb0c35fa34	GlobalISel: Set alignment on function argument stack load/store	2020-03-04 16:38:46 -05:00
Amara Emerson	e91e1df6ab	[GlobalISel][Localizer] Enable intra-block localization of already-local uses. This changes the localizer to attempt intra-block localizer of instructions that have local uses. This is useful because sometimes the entry block itself has many uses of constant-like instructions, which would benefit from shortening live ranges. Previously if an inst had no non-local uses, we wouldn't add it to the list of instructions to attempt further intra-block localization. This gives a 0.7% geomean code size improvement on CTMark. Differential Revision: https://reviews.llvm.org/D75555	2020-03-03 18:14:57 -08:00
Volkan Keles	4167645d1e	GlobalISel: Move Localizer::shouldLocalize(..) to TargetLowering Add a new target hook for shouldLocalize so that targets can customize the logic. https://reviews.llvm.org/D75207	2020-03-02 09:15:40 -08:00
Matt Arsenault	6fc0d00823	GlobalISel: Fix lowering for G_UADDE/G_USUBE The type parameter passed into lower is invalid and should be removed from the function.	2020-02-26 19:10:52 -08:00
Matt Arsenault	c7e8d8b13e	GlobalISel: Cleanup code with MachineIRBuilder features	2020-02-26 19:10:34 -08:00
Quentin Colombet	5bf0023b0d	[GISel][KnownBits] Update a comment regarding the effect of cache on PHIs Unlike what I claimed in my previous commit. The caching is actually not NFC on PHIs. When we put a big enough max depth, we end up simulating loops. The cache is effectively cutting the simulation short and we get less information as a result. E.g., ``` v0 = G_CONSTANT i8 0xC0 jump v1 = G_PHI i8 v0, v2 v2 = G_LSHR i8 v1, 1 ``` Let say we want the known bits of v1. - With cache: Set v1 cache to we know nothing v1 is v0 & v2 v0 gives us 0xC0 v2 gives us known bits of v1 >> 1 v1 is in the cache => v1 is 0, thus v2 is 0x80 Finally v1 is v0 & v2 => 0x80 - Without cache and enough depth to do two iteration of the loop: v1 is v0 & v2 v0 gives us 0xC0 v2 gives us known bits of v1 >> 1 v1 is v0 & v2 v0 is 0xC0 v2 is v1 >> 1 Reach the max depth for v1... unwinding v1 is know nothing v2 is 0x80 v0 is 0xC0 v1 is 0x80 v2 is 0xC0 v0 is 0xC0 v1 is 0xC0 Thus now v1 is 0xC0 instead of 0x80. I've added a unittest demonstrating that. NFC	2020-02-25 15:56:15 -08:00
Jay Foad	ccee390767	GlobalISel: NFC minor cleanup to avoid a couple of fixed size local arrays	2020-02-25 09:49:19 +00:00
Matt Arsenault	11e3dde625	GlobalISel: Reimplement fewerElementsVectorBasic Changes the handling of odd breakdowns, and avoids using G_EXTRACT/G_INSERT. Pad with undef to a wider size, and unmerge. Also avoid introducing instructions for the fully undef components.	2020-02-24 21:19:47 -05:00
Quentin Colombet	b6d63c92ec	[GISel][KnownBits] Suppress unused warning on the dump method NFC	2020-02-21 21:07:04 -08:00
Quentin Colombet	618dec2aef	[GISel][KnownBits] Add a cache mechanism to speed compile time This patch adds a cache that is valid only for the duration of a call to getKnownBits. With such short lived cache we avoid all the problems of cache invalidation while still getting the benefits of reusing the information we already computed. This cache is useful whenever an instruction occurs more than once in a chain of computation. E.g., v0 = G_ADD v1, v2 v3 = G_ADD v0, v1 Previously we would compute the known bits for: v1, v2, v0, then v1 again and finally v3. With the patch, now we won't have to recompute v1 again. NFC	2020-02-21 14:31:42 -08:00
Jay Foad	cab39e4b8c	GlobalISel: Fix narrowing of (G_ASHR i64:x, 32) Reviewers: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74950	2020-02-21 16:51:03 +00:00
Quentin Colombet	e4a9225f5d	[GISel][KnownBits] Give up on PHI analysis as soon as we don't know anything When analyzing PHIs, we gather the known bits for every operand and merge them together to get the known bits of the result of the PHI. It is not unusual that merging the information leads to know nothing on the result (e.g., phi a: i8 3, b: i8 unknown, ..., after looking at the second argument we know we will know nothing on the result), thus, as soon as we reach that state, stop analyzing the following operand (i.e., on the previous example, we won't process anything after looking at `b`). This improves compile time in particular with PHIs with a large number of operands. NFC.	2020-02-20 11:34:01 -08:00
Aditya Nandakumar	b91d9ec0bb	[GlobalISel]: Fix some non determinism exposed in CSE due to not notifying observers about mutations + add verification for CSE https://reviews.llvm.org/D67133 While investigating some non determinism (CSE doesn't produce wrong code, it just doesn't CSE some times) in GISel CSE on an out of tree target, I realized that the core issue was that there were lots of code that mutates (setReg, setRegClass etc), but doesn't notify observers (CSE in this case but this could be any other observer). In order to make the Observer be available in various parts of code and to avoid having to thread it through various API, the MachineFunction now has the observer as field. This allows it to be easily used in helper functions such as constrainOperandRegClass. Also added some invariant verification method in CSEInfo which can catch these issues (when CSE is enabled).	2020-02-18 14:54:57 -08:00
Matt Arsenault	0e2eb357e0	GlobalISel: Extend narrowing to G_ASHR	2020-02-17 10:42:59 -08:00
Matt Arsenault	8550859535	GlobalISel: Extend shift narrowing to G_SHL	2020-02-17 09:13:37 -08:00
Matt Arsenault	78d455adf0	GlobalISel: Add combine to narrow G_LSHR Produce an unmerge to a narrower type and introduce a narrower shift if needed. I wasn't sure if there was a better way to parameterize the target's preferred shift type for the GICombineRule, so manually call the combine helper.	2020-02-17 08:04:52 -08:00
Matt Arsenault	3bb0ff8341	GlobalISel: Remove unused function argument	2020-02-14 15:57:39 -08:00
Matt Arsenault	bfbfa18591	GlobalISel: Lower s64->s16 G_FPTRUNC This is more or less directly ported from the AMDGPU custom lowering for FP_TO_FP16. I made a few minor fixups (using G_UNMERGE_VALUES instead of creating shift/trunc to extract the two halves, and zexting an inverted compare instead of select_cc). This also does not include the fast math expansion the DAG which converts to f32 and then to f16. I think that belongs in a pre-legalize combine instead.	2020-02-14 10:46:58 -08:00
Volkan Keles	187686a22f	[GlobalISel] LegalizationArtifactCombiner: Fix a bug in tryCombineMerges Like COPY instructions explained in D70616, we don't check the constraints when combining G_UNMERGE_VALUES. Use the same logic used in D70616 to check if registers can be replaced, or a COPY instruction needs to be built. https://reviews.llvm.org/D70564	2020-02-14 10:45:58 -08:00
Matt Arsenault	de256478e6	GlobalISel: Don't use LLT references These should always be passed by value	2020-02-13 15:25:30 -05:00
Jay Foad	32aac25637	[KnownBits] Introduce anyext instead of passing a flag into zext Summary: This was a very odd API, where you had to pass a flag into a zext function to say whether the extended bits really were zero or not. All callers passed in a literal true or false. I think it's much clearer to make the function name reflect the operation being performed on the value we're tracking (rather than on the KnownBits Zero and One fields), so zext means the value is being zero extended and new function anyext means the value is being extended with unknown bits. NFC. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74482	2020-02-12 19:06:53 +00:00
Amara Emerson	067dd9c6b1	[GlobalISel][CallLowering] Use stripPointerCasts(). A downstream test exposed a simple logic bug with the manual pointer stripping code, fix that by just using stripPointerCasts() on the value. I don't think there's a way to expose this issue upstream.	2020-02-10 15:43:57 -08:00
Amara Emerson	21c9d9ad43	[GlobalISel][CallLowering] Tighten constantexpr check for callee. I'm not sure there's a test case for this, but it's better to be safe.	2020-02-09 22:59:48 -08:00
Matt Arsenault	312a9d1b83	GlobalISel: Fix narrowScalar for G_{CTLZ\|CTTZ}_ZERO_UNDEF Narrow these for 64-bit VALU for AMDGPU.	2020-02-09 19:02:38 -05:00
Matt Arsenault	6135f5eda4	GlobalISel: Fix narrowing of G_CTLZ/G_CTTZ The result type is separate from the source type.	2020-02-09 18:11:43 -05:00
Amara Emerson	35c63d66aa	[GlobalISel][CallLowering] Look through bitcasts from constant function pointers. Calls to ObjC's objc_msgSend function are done by bitcasting the function global to the required function type signature. This patch looks through this bitcast so that we can do a direct call with bl on arm64 instead of using an indirect blr. Differential Revision: https://reviews.llvm.org/D74241	2020-02-07 15:32:54 -08:00
Petar Avramovic	7df5fc9e03	[GlobalISel] Add buildMerge with SrcOp initializer list Allows more flexible use of buildMerge in places where use operands are available as SrcOp since it does not require explicit conversion to Register. Simplify code with new buildMerge. Differential Revision: https://reviews.llvm.org/D74223	2020-02-07 18:43:45 +01:00
Amara Emerson	28d22c2c9c	[GlobalISel][IRTranslator] Add special case support for ~memory inline asm clobber. This is a one off special case, since actually implementing full inline asm support will be much more involved. This lets us compile a lot more code as a common simple case. Differential Revision: https://reviews.llvm.org/D74201	2020-02-07 08:55:23 -08:00
Matt Arsenault	3b198518ad	GlobalISel: Fix narrowing of G_CTPOP The result type is separate from the source type. Tests will be included in a future AMDGPU patch which uses this from RegBankSelect/applyMappingImpl.	2020-02-07 06:58:00 -08:00
Matt Arsenault	8de2dad9e0	GlobalISel: Fix lowering of G_CTLZ/G_CTTZ The type passed to lower was invalid, so I'm not sure how this was even working before. The source and destination type also do not have to match, so make sure to use the right ones.	2020-02-07 06:54:12 -08:00
Guillaume Chatelet	f85d3408e6	[NFC] Introduce an API for MemOp Summary: This patch introduces an API for MemOp in order to simplify and tighten the client code. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73964	2020-02-07 11:32:27 +01:00
Amara Emerson	ac8a12c874	[GlobalISel] Use G_ZEXTLOAD instead of an anyextending load for non-pow-2 legalization. Fixes PR43288	2020-02-06 14:36:36 -08:00
Konstantin Schwarz	76986bdc46	[GlobalISel] Legalize more G_FP(EXT\|TRUNC) libcalls. This adds a new helper function for retrieving the floating point type corresponding to the specified bit-width.	2020-02-06 11:41:34 -08:00
Jeremy Morse	6531a78ac4	Revert "[DebugInfo] Remove some users of DBG_VALUEs IsIndirect field" This reverts commit `ed29dbaafa`. I'm backing out D68945, which as the discussion for D73526 shows, doesn't seem to handle the -O0 path through the codegen backend correctly. I'll reland the patch when a fix is worked out, apologies for all the churn. The two parent commits are part of this revert too. Conflicts: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/test/DebugInfo/X86/dbg-addr-dse.ll SelectionDAGBuilder conflict is due to a nearby change in `e39e2b4a79` that's technically unrelated. dbg-addr-dse.ll conflicted because `41206b61e3` (legitimately) changes the order of two lines. There are further modifications to dbg-value-func-arg.ll: it landed after the patch being reverted, and I've converted indirection to be represented by the isIndirect field rather than DW_OP_deref.	2020-02-06 14:41:40 +00:00
Matt Arsenault	7464e8d6ad	GlobalISel: Remove check for illegal MIR The verifier will catch this.	2020-02-05 18:37:17 -05:00
Matt Arsenault	9087ef0765	GlobalISel: Allow CSE of G_IMPLICIT_DEF The legalizer produces a lot of these, and they make reading legalized MIR annoying. For some reason, this does seem to sometimes introduce copies of implicit def, which is dumb.	2020-02-05 17:47:21 -05:00
Shu-Chun Weng	ce9633633c	[GlobalISel][AArch64] Fix contract cross-bank copies with SIMD instructions contractCrossBankCopyIntoStore() finds the instruction defines the source register and uses its output to replace the register. There are, however, instructions that have multiple outputs, e.g. G_UNMERGE_VALUES. Current implementation hardcodes to operand 0 and has no way of knowing which output should be used. This change adds another function to directly return the register that is the source of the register and use that for folding. This fixes https://bugs.llvm.org/show_bug.cgi?id=44783 Differential Revision: https://reviews.llvm.org/D74005	2020-02-05 10:38:35 -08:00
Matt Arsenault	de8451fe4d	GlobalISel: Fold SmallVector resizes into constructors	2020-02-04 10:28:08 -08:00
Guillaume Chatelet	b8144c0536	[NFC] Encapsulate MemOp logic Summary: This patch simply introduces functions instead of directly accessing the fields. This helps introducing additional check logic. A second patch will add simplifying functions. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73945	2020-02-04 10:36:26 +01:00
Matt Arsenault	cd7650c186	GlobalISel: Implement fewerElementsVector for G_SEXT_INREG Start using a new strategy with a combination of merge and unmerges. This allows scalarizing before lowering, which in cases like <2 x s128> avoids producing giant illegal shifts.	2020-02-03 11:47:33 -08:00
Guillaume Chatelet	fc19465965	[Alignment][NFC] Use Align for code creating MemOp Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73874	2020-02-03 14:10:30 +01:00
Matt Arsenault	bc101ffd77	GlobalISel: Support widening unmerge results with pointer source	2020-02-01 10:47:03 -05:00
Jay Foad	f465b1aff4	[GlobalISel] Tweak lowering of G_SMULO/G_UMULO Summary: Applying this cleanup: - MIRBuilder.buildInstr(TargetOpcode::G_ASHR) - .addDef(Shifted) - .addUse(Res) - .addUse(ShiftAmt); + MIRBuilder.buildAShr(Shifted, Res, ShiftAmt); caused an assertion failure here: llc: /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:404: llvm::MachineInstr *llvm::MachineRegisterInfo::getVRegDef(unsigned int) const: Assertion `(I.atEnd() \|\| std::next(I) == def_instr_end()) && "getVRegDef assumes a single definition or no definition"' failed. #4 0x00000000050a6d96 in llvm::MachineRegisterInfo::getVRegDef (this=0x74606a0, Reg=2147483650) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:403 #5 0x00000000066148f6 in llvm::getConstantVRegValWithLookThrough (VReg=2147483650, MRI=..., LookThroughInstrs=false, HandleFConstant=true) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:244 #6 0x00000000066147da in llvm::getConstantVRegVal (VReg=2147483650, MRI=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:210 #7 0x0000000006615367 in llvm::ConstantFoldBinOp (Opcode=101, Op1=2147483650, Op2=2147483656, MRI=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:341 #8 0x000000000657eee0 in llvm::CSEMIRBuilder::buildInstr (this=0x7465010, Opc=101, DstOps=..., SrcOps=..., Flag=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp:160 #9 0x0000000003645958 in llvm::MachineIRBuilder::buildAShr (this=0x7465010, Dst=..., Src0=..., Src1=..., Flags=...) at /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:1298 #10 0x00000000065c35b1 in llvm::LegalizerHelper::lower (this=0x7fffffffb5f8, MI=..., TypeIdx=0, Ty=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2020 because at this point there are two instructions defining Res: the original G_SMULO/G_UMULO and the new G_MUL that we built. The fix is to modify the original mul in place, so that there is only ever one definition of Res. Reviewers: arsenm, aditya_nandakumar Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72842	2020-01-31 19:21:01 +00:00
Jay Foad	2a1b5af299	[GlobalISel] Tidy up unnecessary calls to createGenericVirtualRegister Summary: As a side effect some redundant copies of constant values are removed by CSEMIRBuilder. Reviewers: aemerson, arsenm, dsanders, aditya_nandakumar Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, hiraditya, jrtc27, atanasyan, volkan, Petar.Avramovic, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73789	2020-01-31 17:07:16 +00:00
Guillaume Chatelet	3c89b75f23	[NFC] Introduce a type to model memory operation Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code. Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73785	2020-01-31 17:29:01 +01:00
Quentin Colombet	cfebd77742	[GISel][KnownBits] Fix a bug where we could run out of stack space One of the exit criteria of computeKnownBits is whether we reach the max recursive call depth. Before this patch we would check that the depth is exactly equal to max depth to exit. Depth may get bigger than max depth if it gets passed to a different GISelKnownBits object. This may happen when say a generic part uses a GISelKnownBits object with some max depth, but then we hit TL.computeKnownBitsForTargetInstr which creates a new GISelKnownBits object with a different and smaller depth. In that situation, when we hit the max depth check for the first time in the target specific GISelKnownBits object, depth may already be bigger than the current max depth. Hence we would continue to compute the known bits, until we ran through the full depth of the chain of computation or ran out of stack space. For instance, let say we have GISelKnownBits Info(/MaxDepth/ = 10); Info.getKnownBits(Foo) // 9 recursive calls to computeKnownBitsImpl. // Then we hit a target specific instruction. // The target specific GISelKnownBits does this: GISelKnownBits TargetSpecificInfo(/MaxDepth/ = 6) TargetSpecificInfo.computeKnownBitsImpl() // <-- next max depth checks would // always return false. This commit does not have any test case, none of the in-tree targets use computeKnownBitsForTargetInstr.	2020-01-30 19:30:39 -08:00
Amara Emerson	84bd851108	[GlobalISel][IRTranslator] When translating vector geps, splat the base pointer if required. We can have geps that have a scalar base pointer, and a vector index value, which means that the base pointer must be splatted into a vector of pointers. This fixes crashes on arm64 GlobalISel with optimizations enabled.	2020-01-30 16:27:27 -08:00
Matt Arsenault	eb7f74e300	CodeGen: Use Register	2020-01-30 15:01:56 -08:00
Matt Arsenault	ea956685a1	GlobalISel: Implement s32->s64 G_FPTOSI lowering Port directly from DAG version. The lowering for G_FPTOUI used to fail on AMDGPU because it uses G_FPTOSI.	2020-01-30 08:47:07 -05:00
Dominik Montada	dc141af755	[GlobalISel] (fix) Use pointer type size for offset constant when lowering stores Commit `9965b12fd1` was supposed to change the offset constant when lowering load/stores, but only introduced this change for loads. This patch adds the same fix for stores.	2020-01-30 08:32:35 -05:00
Matt Arsenault	c5fffa4da3	GlobalISel: Add observer argument to legalizeIntrinsic This is passed to legalizeCustom, but not intrinsic. Also remove the MRI argument, since you can get that from the MachineIRBuilder. I'm not sure why MachineIRBuilder has a private observer member, and this is passed separately.	2020-01-29 18:33:45 -05:00
Amara Emerson	c12f046eb9	[GlobalISel] Add new combine to convert scalar G_MUL to G_SHL. For pow2 constants we should use G_SHL for pattern matching (and perf) purposes later. Vector support not yet implemented. Differential Revision: https://reviews.llvm.org/D73659	2020-01-29 13:39:00 -08:00
Amara Emerson	0da937bb5c	[GlobalISel][IRTranslator] Follow convention and put constant offset of getelementptr arithmetic on RHS. We were needlessly putting known constant values on the LHS of a G_MUL, which is suboptimal. Differential Revision: https://reviews.llvm.org/D73650	2020-01-29 11:37:19 -08:00
Matt Arsenault	b63629a58d	GlobalISel: Fix mask computation in lowerInsert This is supposed to be the high bit index, not the width. Use the wrapping form of getBitsSet and avoid the bitflip.	2020-01-29 08:25:36 -08:00
Matt Arsenault	f717483acd	GlobalISel: Assert on invalid bitcast in MIRBuilder The other casts validate, so this should too.	2020-01-29 07:49:39 -08:00
Matt Arsenault	c5c1bb3374	GlobalISel: Lower G_WRITE_REGISTER	2020-01-29 06:48:24 -08:00
David Stenberg	6a2413c435	[ARM64] Debug info for structure argument missing DW_AT_location Summary: Prevent eliminating dbg_val due to COPY. Fixes this https://bugs.llvm.org/show_bug.cgi?id=40709 Patch by: Kamlesh Kumar (kamleshbhalui) Reviewers: aprantl, dblaikie, vsk, dsanders Reviewed By: dsanders Subscribers: dstenb, kristof.beyls, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73159	2020-01-29 10:56:23 +01:00
Jay Foad	cbbbd5b5f6	[GlobalISel] Make use of KnownBits::computeForAddSub Summary: This is mostly NFC. computeForAddSub may give more precise results in some cases, but that doesn't seem to affect any existing GlobalISel tests. Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73431	2020-01-27 22:22:56 +00:00
Dominik Montada	9965b12fd1	Use pointer type size for offset constant when lowering load/stores	2020-01-27 06:55:32 -08:00
Matt Arsenault	2a160ba5b0	GlobalISel: Reimplement widenScalar for G_UNMERGE_VALUES results Only use shifts if the requested type exactly matches the source type, and create sub-unmerges otherwise.	2020-01-27 06:18:26 -08:00
Matt Arsenault	06d9230fef	GlobalISel: Translate vector GEPs	2020-01-27 05:35:05 -08:00
Petar Avramovic	cbf03aee6d	[MIPS GlobalISel] Select population count (popcount) G_CTPOP is generated from llvm.ctpop.<type> intrinsics, clang generates these intrinsics from __builtin_popcount and __builtin_popcountll. Add lower and narrow scalar for G_CTPOP. Lower G_CTPOP for MIPS32. Differential Revision: https://reviews.llvm.org/D73216	2020-01-27 09:59:50 +01:00
Petar Avramovic	8bc7ba5b9e	[MIPS GlobalISel] Select count trailing zeros llvm.cttz.<type> intrinsic has additional i1 argument is_zero_undef, it tells whether zero as the first argument produces a defined result. G_CTTZ is generated from llvm.cttz.<type> (<type> <src>, i1 false) intrinsics, clang generates these intrinsics from __builtin_ctz and __builtin_ctzll. G_CTTZ_ZERO_UNDEF comes from llvm.cttz.<type> (<type> <src>, i1 true). Clang generates such intrinsics as parts of expansion of builtin_ffs and builtin_ffsll. It is also traditionally part of and many algorithms that are now predicated on avoiding zero-value inputs. Add narrow scalar (algorithm uses G_CTTZ_ZERO_UNDEF) for G_CTTZ. Lower G_CTTZ and G_CTTZ_ZERO_UNDEF for MIPS32. Differential Revision: https://reviews.llvm.org/D73215	2020-01-27 09:51:06 +01:00
Petar Avramovic	2b66d32f3f	[MIPS GlobalISel] Select count leading zeros llvm.ctlz.<type> intrinsic has additional i1 argument is_zero_undef, it tells whether zero as the first argument produces a defined result. MIPS clz instruction returns 32 for zero input. G_CTLZ is generated from llvm.ctlz.<type> (<type> <src>, i1 false) intrinsics, clang generates these intrinsics from __builtin_clz and __builtin_clzll. G_CTLZ_ZERO_UNDEF can also be generated from llvm.ctlz with true as second argument. It is also traditionally part of and many algorithms that are now predicated on avoiding zero-value inputs. Add narrow scalar for G_CTLZ (algorithm uses G_CTLZ_ZERO_UNDEF). Lower G_CTLZ_ZERO_UNDEF and select G_CTLZ for MIPS32. Differential Revision: https://reviews.llvm.org/D73214	2020-01-27 09:43:38 +01:00
Quentin Colombet	5d87b5d202	[GISelKnownBits] Add support for PHIs Teach the GISelKnowBits analysis how to deal with PHI operations. PHIs are essentially COPYs happening on edges, so we can just reuse the code for COPY. This is NFC COPY-wise has we leave Depth untouched when calling computeKnownBitsImpl for COPYs, like it was before this patch. Increasing Depth is however required for PHIs as they may loop back to themselves and we would end up in an infinite loop if we were not increasing Depth. Differential Revision: https://reviews.llvm.org/D73317	2020-01-24 16:43:52 -08:00
Guillaume Chatelet	805c157e8a	[Alignment][NFC] Deprecate Align::None() Summary: This is a follow up on https://reviews.llvm.org/D71473#inline-647262. There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case. Reviewers: xbolva00, courbet, bollu Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73099	2020-01-24 12:53:58 +01:00
Jay Foad	b482e1bfe2	[CodeGen] Make use of MachineInstrBuilder::getReg Reviewers: arsenm Subscribers: wdng, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73262	2020-01-23 13:38:13 +00:00
Quentin Colombet	ff1f3cc1a1	[GISelKnownBits] Make the max depth a parameter of the analysis Allow users of that analysis to define the cut off depth of the analysis instead of hardcoding 6. NFC as the default parameter is 6.	2020-01-21 11:35:31 -08:00
Matt Arsenault	a66d2817ca	GlobalISel: Don't ignore requested ext narrowing type This was assuming the narrow target was the source type. Respect the requested type when these don't match by using intermediate merges. This avoids producing very wide, illegal shift expansions.	2020-01-16 14:29:37 -05:00
Matt Arsenault	be31a7b7ee	GlobalISel: Move extension scalar narrowing to separate function Also rename a few things. Handling a different requested type will require this to become much more complex.	2020-01-16 14:29:37 -05:00
Matt Arsenault	d0943537e1	GlobalISel: Apply target MMO flags to atomics Unify MMO flag handling with SelectionDAG like with loads and stores.	2020-01-16 13:49:43 -05:00
Matt Arsenault	0d0fce42b0	GlobalISel: Preserve load/store metadata in IRTranslator This was dropping the invariant metadata on dead argument loads, so they weren't deleted. Atomics still need to be fixed the same way. Also, apparently store was never preserving dereferencable which should also be fixed.	2020-01-16 13:49:43 -05:00
Jay Foad	885260d5d8	[GlobalISel] Don't arbitrarily limit a mask to 64 bits Reviewers: arsenm Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72853	2020-01-16 16:13:20 +00:00
Jay Foad	63f73545dd	[GlobalISel] Pass MachineOperands into MachineIRBuilder helper methods Reviewers: arsenm, aditya_nandakumar, aemerson Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72849	2020-01-16 16:04:21 +00:00
Jay Foad	28bb43bdf8	[GlobalISel] Use more MachineIRBuilder helper methods Reviewers: arsenm, nhaehnle Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72833	2020-01-16 15:34:51 +00:00
Matt Arsenault	25e9938a45	GlobalISel: Handle more cases of G_SEXT narrowing This now develops the same problem G_ZEXT/G_ANYEXT have where the requested type is assumed to be the source type. This will be fixed separately by creating intermediate merges.	2020-01-15 18:33:15 -05:00
Matt Arsenault	936483fb7d	GlobalISel: Implement lower for G_BITCAST Bitcast only really applies between scalars and vectors. Implement as an unmerge and remerge. The test needs to tolerate failure since one of the unmerges currently fails to legalize.	2020-01-15 08:58:58 -05:00
Matt Arsenault	91715617ad	GlobalISel: Fix narrowScalar for G_ANYEXT results This is nearly the same as G_ZEXT.	2020-01-15 08:58:57 -05:00
Eli Friedman	e68e4cbcc5	[GlobalISel] Change representation of shuffle masks in MachineOperand. We're planning to remove the shufflemask operand from ShuffleVectorInst (D72467); fix GlobalISel so it doesn't depend on that Constant. The change to prelegalizercombiner-shuffle-vector.mir happens because the input contains a literal "-1" in the mask (so the parser/verifier weren't really handling it properly). We now treat it as equivalent to "undef" in all contexts. Differential Revision: https://reviews.llvm.org/D72663	2020-01-13 16:55:41 -08:00
Matt Arsenault	0ea3c7291f	GlobalISel: Handle llvm.read_register Compared to the attempt in `bdcc6d3d26`, this uses intermediate generic instructions.	2020-01-09 17:37:52 -05:00
Matt Arsenault	595ac8c46e	GlobalISel: Move getLLTForMVT/getMVTForLLT As an intermediate step, some TLI functions can be converted to using LLT instead of MVT. Move this somewhere out of GlobalISel so DAG functions can use these.	2020-01-09 16:32:51 -05:00
Matt Arsenault	fba1fbb9c7	GlobalISel: Don't assert on MoreElements creating vectors If the original type was a scalar, it should be valid to add elements to turn it into a vector. Tests included with following legalization change.	2020-01-09 16:29:44 -05:00
Amara Emerson	b6598bcf4b	[AArch64][GlobalISel] Fold a chain of two G_PTR_ADDs of constant offsets. E.g. %addr1 = G_PTR_ADD %base, G_CONSTANT 20 %addr2 = G_PTR_ADD %addr1, G_CONSTANT 8 --> %addr2 = G_PTR_ADD %base, G_CONSTANT 28 Differential Revision: https://reviews.llvm.org/D72351	2020-01-07 14:12:42 -08:00
Matt Arsenault	f3de8ab5cc	GlobalISel: Implement lower for G_INTRINSIC_ROUND Mostly copied from AMDGPU lowering implementation, except used G_SITOFP instead of directly creating a select on -1.0, 0.0.	2020-01-06 18:26:42 -05:00
Matt Arsenault	1060b9e23b	GlobalISel: Correct result type for G_FCMP in lowerFPTOUI Using the final result type doesn't make any sense. Use the natural default boolean type for the select condition.	2020-01-06 17:21:51 -05:00
Matt Arsenault	0b093f0212	GlobalISel: Start adding computeNumSignBits to GISelKnownBits	2020-01-06 17:21:51 -05:00
Matt Arsenault	d12f2a2998	GlobalISel: Scalarize all division operations This only handled G_SDIV, but they all are trivially scalarizable. Also define placeholder AMDGPU division legalizer rules.	2020-01-04 13:47:10 -05:00
Matt Arsenault	1f950ced50	GlobalISel: Define G_READCYCLECOUNTER	2020-01-04 13:10:19 -05:00
Matt Arsenault	21309eafde	GlobalISel: Add type argument to getRegBankFromRegClass AMDGPU can't unambiguously go back from the selected instruction register class to the register bank without knowing if this was used in a boolean context.	2020-01-03 16:25:10 -05:00
Reid Kleckner	9c2b72821b	Move tail call disabling code to target independent code When the "disable-tail-calls" attribute was added, checks were added for it in various backends. Now this code has proliferated, and it is something the target is responsible for checking. Move that responsibility back to the ISels (fast, global, and SD). There's no major functionality change, except for targets that never implemented this check. This LLVM attribute was originally added in `d9699bc7bd` (2015). Reviewers: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D72118	2020-01-03 11:27:41 -08:00
Petar Avramovic	98f72a5107	[MIPS GlobalISel] Select bitreverse. Recommit G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics, clang genrates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64. Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32. Recommit notes: Introduce temporary variables in order to make sure instructions get inserted into MachineFunction in same order regardless of compiler used to build llvm. Differential Revision: https://reviews.llvm.org/D71363	2019-12-30 18:06:29 +01:00
Matt Arsenault	9fd31fdbd3	GlobalISel: moreElementsVector for FP min/max	2019-12-30 10:39:53 -05:00
Dmitri Gribenko	32cc14100e	Revert "[MIPS GlobalISel] Select bitreverse" This reverts commit `dbc136e0fe`. It broke buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/21066	2019-12-30 14:29:47 +01:00
Petar Avramovic	dbc136e0fe	[MIPS GlobalISel] Select bitreverse G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics, clang genrates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64. Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32. Differential Revision: https://reviews.llvm.org/D71363	2019-12-30 11:26:45 +01:00
Petar Avramovic	94a24e7a40	[MIPS GlobalISel] Select bswap G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang genrates these intrinsics from __builtin_bswap32 and __builtin_bswap64. Add lower and narrowscalar for G_BSWAP. Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later. Differential Revision: https://reviews.llvm.org/D71362	2019-12-30 11:13:22 +01:00
Matt Arsenault	0d47399167	GlobalISel: Update syntax in debug printing Physical register names now start with $, not %	2019-12-24 10:37:36 -05:00
Matt Arsenault	9b61641564	GlobalISel: Fix naming variables "brank" instead of "bank"	2019-12-24 10:36:54 -05:00
Martin Storsjö	5a751e747d	[AArch64] [Windows] Use COFF stubs for calls to extern_weak functions As the extern_weak target might be missing, resolving to the absolute address zero, we can't use the normal direct PC-relative branch instructions (as that would result in relocations out of range). Improve the classifyGlobalFunctionReference method to set MO_DLLIMPORT/MO_COFFSTUB, and simplify the existing code in AArch64TargetLowering::LowerCall to use the return value from classifyGlobalFunctionReference for these cases. Add code in both AArch64FastISel and GlobalISel/IRTranslator to bail out for function calls to extern weak functions on windows, to let SelectionDAG handle them. This matches what was done for X86 in `6bf108d77a`. Differential Revision: https://reviews.llvm.org/D71721	2019-12-23 12:13:49 +02:00
Daniel Sanders	c3cb089a87	[gicombiner] Import tryCombineIndexedLoadStore() Summary: Now that arbitrary data is supported, import tryCombineIndexedLoadStore() Depends on D69147 Reviewers: bogner, volkan Reviewed By: volkan Subscribers: hiraditya, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69151	2019-12-18 14:41:38 +00:00
Roman Tereshin	8731799fc6	[Legalizer] Making artifact combining order-independent Legalization algorithm is complicated by two facts: 1) While regular instructions should be possible to legalize in an isolated, per-instruction, context-free manner, legalization artifacts can only be eliminated in pairs, which could be deeply, and ultimately arbitrary nested: { [ () ] }, where which paranthesis kind depicts an artifact kind, like extend, unmerge, etc. Such structure can only be fully eliminated by simple local combines if they are attempted in a particular order (inside out), or alternatively by repeated scans each eliminating only one innermost pair, resulting in O(n^2) complexity. 2) Some artifacts might in fact be regular instructions that could (and sometimes should) be legalized by the target-specific rules. Which means failure to eliminate all artifacts on the first iteration is not a failure, they need to be tried as instructions, which may produce more artifacts, including the ones that are in fact regular instructions, resulting in a non-constant number of iterations required to finish the process. I trust the recently introduced termination condition (no new artifacts were created during as-a-regular-instruction-retrial of artifacts not eliminated on the previous iteration) to be efficient in providing termination, but only performing the legalization in full if and only if at each step such chains of artifacts are successfully eliminated in full as well. Which is currently not guaranteed, as the artifact combines are applied only once and in an arbitrary order that has to do with the order of creation or insertion of artifacts into their worklist, which is a no particular order. In this patch I make a small change to the artifact combiner, making it to re-insert into the worklist immediate (modulo a look-through copies) artifact users of each vreg that changes its definition due to an artifact combine. Here the first scan through the artifacts worklist, while not being done in any guaranteed order, only needs to find the innermost pair(s) of artifacts that could be immediately combined out. After that the process follows def-use chains, making them shorter at each step, thus combining everything that can be combined in O(n) time. Reviewers: volkan, aditya_nandakumar, qcolombet, paquette, aemerson, dsanders Reviewed By: aditya_nandakumar, paquette Tags: #llvm Differential Revision: https://reviews.llvm.org/D71448	2019-12-13 15:45:18 -08:00
Roman Tereshin	18bf9670aa	[Legalizer] Refactoring out legalizeMachineFunction and introducing new unittests/CodeGen/GlobalISel/LegalizerTest.cpp relying on it to unit test the entire legalizer algorithm (including the top-level main loop). See also https://reviews.llvm.org/D71448	2019-12-13 15:45:18 -08:00
Roman Tereshin	8207c81597	[Legalizer] More detailed debugging printing in main loop	2019-12-13 15:45:18 -08:00
Kiran Chandramohan	965ed1e974	[AArch64] Fix issues with large arrays on stack Summary: This patch fixes a few issues when large arrays are allocated on the stack. Currently, clang has inconsistent behaviour, for debug builds there is an assertion failure when the array size on stack is around 2GB but there is no assertion when the stack is around 8GB. For release builds there is no assertion, the compilation succeeds but generates incorrect code. The incorrect code generated is due to using int/unsigned int instead of their 64-bit counterparts. This patch, 1) Removes the assertion in frame legality check. 2) Converts int/unsigned int in some places to the 64-bit variants. This helps in generating correct code and removes the inconsistent behaviour. 3) Adds a test which runs without optimisations. Reviewers: sdesmalen, efriedma, fhahn, aemerson Reviewed By: efriedma Subscribers: eli.friedman, fpetrogalli, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70496	2019-12-10 11:44:41 +00:00
Volkan Keles	bfa3d260b8	[GlobalISel] Localizer: Allow targets not to run the pass conditionally Summary: Previously, it was not possible to skip running the localizer pass conditionally. This patch adds an input function to the pass which decides if the pass should run on the given MachineFunction or not. No test case as there is no upstream target needs this functionality. Reviewers: qcolombet Reviewed By: qcolombet Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71038	2019-12-05 11:09:50 -08:00
Amara Emerson	28f5ad5801	[GlobalISel] Fix compiler crash lowering G_LOAD in AArch64. Patch by Daniel Rodríguez Troitiño. Differential Revision: https://reviews.llvm.org/D70794	2019-12-04 17:04:54 -08:00
Aditya Nandakumar	6da7dbb806	[GlobalISel]: Allow targets to override how to widen constants during legalization https://reviews.llvm.org/D70922 This adds a hook to allow targets to define exactly what extension operation should be performed for widening constants. This handles cases like widening i1 true which would end up becoming -1 which affects code quality during combines. Additionally, in order to stay consistent with how DAG is promoting constants, we now signextend for byte sized types and zero extend otherwise (by default). Targets can of course override this if necessary.	2019-12-03 10:41:10 -08:00
Volkan Keles	3d02fa6da7	[GlobalISel] CombinerHelper: Fix a bug in matchCombineCopy Summary: When combining COPY instructions, we were replacing the destination registers with the source register without checking register constraints. This patch adds a simple logic to check if the constraints match before replacing registers. Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, dsanders, Petar.Avramovic Reviewed By: aditya_nandakumar Subscribers: rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70616	2019-12-02 12:05:09 -08:00
Tom Stellard	ab411801b8	[cmake] Explicitly mark libraries defined in lib/ as "Component Libraries" Summary: Most libraries are defined in the lib/ directory but there are also a few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining "Component Libraries" as libraries defined in lib/ that may be included in libLLVM.so. Explicitly marking the libraries in lib/ as component libraries allows us to remove some fragile checks that attempt to differentiate between lib/ libraries and tools/ libraires: 1. In tools/llvm-shlib, because llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of all libraries defined in the whole project, there was custom code needed to filter out libraries defined in tools/, none of which should be included in libLLVM.so. This code assumed that any library defined as static was from lib/ and everything else should be excluded. With this change, llvm_map_components_to_libnames(LIB_NAMES, "all") only returns libraries that have been added to the LLVM_COMPONENT_LIBS global cmake property, so this custom filtering logic can be removed. Doing this also fixes the build with BUILD_SHARED_LIBS=ON and LLVM_BUILD_LLVM_DYLIB=ON. 2. There was some code in llvm_add_library that assumed that libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or ARG_LINK_COMPONENTS set. This is only true because libraries defined lib lib/ use LLVMBuild.txt and don't set these values. This code has been fixed now to check if the library has been explicitly marked as a component library, which should now make it easier to remove LLVMBuild at some point in the future. I have tested this patch on Windows, MacOS and Linux with release builds and the following combinations of CMake options: - "" (No options) - -DLLVM_BUILD_LLVM_DYLIB=ON - -DLLVM_LINK_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON Reviewers: beanz, smeenai, compnerd, phosek Reviewed By: beanz Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70179	2019-11-21 10:48:08 -08:00
Matt Arsenault	b696b9dba7	DAG: Add function context to isFMAFasterThanFMulAndFAdd AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget. AArch64 had to make this more complicated by using this from an IR hook, so add an IR typed overload.	2019-11-19 19:25:26 +05:30
Quentin Colombet	98ceac4981	[GISel][CombinerHelper] Use uses() instead of operands() when traversing use operands. NFC	2019-11-15 13:54:33 -08:00
Quentin Colombet	304abde077	[GISel][CombinerHelper] Add support for scalar type for the result of shuffle vector LLVM IR of 1-element vectors get lower into scalar in GISel. As a result, shuffle vector may also produce a scalar. This patch teaches the shuffle combiner how to deal with scalars when they are in the destination type of a shuffle vector. For now, we just support the easy case where this can be lowered to a plain copy. For other cases, we leave the shuffle vector as is. This type of IR are seen in O0 pipelines. E.g., as produced with SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c. rdar://problem/57198904	2019-11-15 13:54:33 -08:00
Matt Arsenault	bc276c6379	GlobalISel: Lower s1 source G_SITOFP/G_UITOFP	2019-11-15 13:37:20 +05:30
Daniel Sanders	b2839c442e	[globalisel][irtanslator] The IRTranslator should preserve TBAA information	2019-11-14 12:11:27 -08:00
Reid Kleckner	05da2fe521	Sink all InitializePasses.h includes This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211	2019-11-13 16:34:37 -08:00
Daniel Sanders	e74c5b9661	[globalisel] Rename G_GEP to G_PTR_ADD Summary: G_GEP is rather poorly named. It's a simple pointer+scalar addition and doesn't support any of the complexities of getelementptr. I therefore propose that we rename it. There's a G_PTR_MASK so let's follow that convention and go with G_PTR_ADD Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69734	2019-11-05 10:31:17 -08:00
Hiroshi Yamauchi	0d987e411a	[PGO][PGSO] TargetLowering/TargetTransformationInfo/SwitchLoweringUtils part. Summary: (Split of off D67120) TargetLowering/TargetTransformationInfo/SwitchLoweringUtils changes for profile guided size optimization. Reviewers: davidxl Subscribers: eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69580	2019-10-31 13:22:56 -07:00
Quentin Colombet	f0eeb3c7a7	[GISel][CombinerHelper] Combine shuffle_vector scalar to build_vector Teach the combiner helper how to replace shuffle_vector of scalars into build_vector. I am not particularly happy about having to add this combine, but we currently get those from <1 x iN> from the IR. Bonus: This fixes an assert in the shuffle_vector combines since before this patch, we were expecting vector types.	2019-10-30 18:20:37 -07:00
Greg Bedwell	b1c4b4d5cb	Fix a spelling mistake in a comment. NFC	2019-10-29 12:19:52 +00:00
Andrew Paverd	d157a9bc8b	Add Windows Control Flow Guard checks (/guard:cf). Summary: A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on indirect function calls, using either the check mechanism (X86, ARM, AArch64) or or the dispatch mechanism (X86-64). The check mechanism requires a new calling convention for the supported targets. The dispatch mechanism adds the target as an operand bundle, which is processed by SelectionDAG. Another pass (CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as required by /guard:cf. This feature is enabled using the `cfguard` CC1 option. Reviewers: thakis, rnk, theraven, pcc Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65761	2019-10-28 15:19:39 +00:00
Matt Arsenault	1a276d1e8c	GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELT	2019-10-25 13:55:07 -07:00
Austin Kerbow	c35b358b74	AMDGPU/GlobalISel: Legalize FDIV16 Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69347	2019-10-25 11:07:17 -07:00
Quentin Colombet	6f0ae81512	[GISel][CombinerHelper] Add a combine turning shuffle_vector into concat_vectors Teach the CombinerHelper how to turn shuffle_vectors, that concatenate vectors, into concat_vectors and add this combine to the AArch64 pre-legalizer combiner. Differential Revision: https://reviews.llvm.org/D69149 llvm-svn: 375452	2019-10-21 20:39:58 +00:00
Guillaume Chatelet	5df90cd71c	[Alignment][NFC] TargetCallingConv::setByValAlign Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69248 llvm-svn: 375410	2019-10-21 12:05:33 +00:00
Guillaume Chatelet	bac5f6bd21	[Alignment][NFC] TargetCallingConv::setOrigAlign and TargetLowering::getABIAlignmentForCallingConv Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69243 llvm-svn: 375407	2019-10-21 11:01:55 +00:00
Reid Kleckner	1d7b41361f	Prune two MachineInstr.h includes, fix up deps MachineInstr.h included AliasAnalysis.h, which includes a world of IR constructs mostly unneeded in CodeGen. Prune it. Same for DebugInfoMetadata.h. Noticed with -ftime-trace. llvm-svn: 375311	2019-10-19 00:22:07 +00:00
Daniel Sanders	149a020425	Fix unused variable in r375066 llvm-svn: 375070	2019-10-17 01:21:40 +00:00
Daniel Sanders	329e748c8c	[gicombiner] Add the run-time rule disable option Summary: Each generated helper can be configured to generate an option that disables rules in that helper. This can be used to bisect rulesets. The disable bits are stored in a SparseVector as this is very cheap for the common case where nothing is disabled. It gets more expensive the more rules are disabled but you're generally doing that for debug purposes where performance is less of a concern. Depends on D68426 Reviewers: volkan, bogner Reviewed By: volkan Subscribers: hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68438 llvm-svn: 375067	2019-10-17 00:37:04 +00:00
Quentin Colombet	c319afc903	[GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => build_vector Teach the combiner helper how to flatten concat_vectors of build_vectors into a build_vector. Add this combine as part of AArch64 pre-legalizer combiner. Differential Revision: https://reviews.llvm.org/D69071 llvm-svn: 375066	2019-10-17 00:34:32 +00:00
Daniel Sanders	ec5208fd65	[gicombiner] Hoist pure C++ combine into the tablegen definition Summary: This is just moving the existing C++ code around and will be NFC w.r.t AArch64. Renamed 'CombineBr' to something more descriptive ('ElideByByInvertingCond') at the same time. The remaining combines in AArch64PreLegalizeCombiner require features that aren't implemented at this point and will be hoisted as they are added. Depends on D68424 Reviewers: bogner, volkan Subscribers: kristof.beyls, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68426 llvm-svn: 375057	2019-10-16 23:53:35 +00:00
Matt Arsenault	34ed76e180	GlobalISel: Implement lower for G_SADDO/G_SSUBO Port directly from SelectionDAG, minus the path using ISD::SADDSAT/ISD::SSUBSAT. llvm-svn: 375042	2019-10-16 20:46:32 +00:00
Simon Pilgrim	e2163f96ab	CombinerHelper - silence dead assignment warnings. NFCI. Copy the NewAlignment value to Alignment first and then use that to update the stack frame object alignments. llvm-svn: 375019	2019-10-16 17:21:50 +00:00
Jeremy Morse	ed29dbaafa	[DebugInfo] Remove some users of DBG_VALUEs IsIndirect field This patch kills off a significant user of the "IsIndirect" field of DBG_VALUE machine insts. Brought up in in PR41675, IsIndirect is techncally redundant as it can be expressed by the DIExpression of a DBG_VALUE inst, and it isn't helpful to have two ways of expressing things. Rather than setting IsIndirect, have DBG_VALUE creators add an extra deref to the insts DIExpression. There should now be no appearences of IsIndirect=True from isel down to LiveDebugVariables / VirtRegRewriter, which is ensured by an assertion in LDVImpl::handleDebugValue. This means we also get to delete the IsIndirect handling in LiveDebugVariables. Tests can be upgraded by for example swapping the following IsIndirect=True DBG_VALUE: DBG_VALUE $somereg, 0, !123, !DIExpression(DW_OP_foo) With one where the indirection is in the DIExpression, by _appending_ a deref: DBG_VALUE $somereg, $noreg, !123, !DIExpression(DW_OP_foo, DW_OP_deref) Which both mean the same thing. Most of the test changes in this patch are updates of that form; also some changes in how the textual assembly printer handles these insts. Differential Revision: https://reviews.llvm.org/D68945 llvm-svn: 374877	2019-10-15 10:46:24 +00:00
Joerg Sonnenberger	9681ea9560	Reapply r374743 with a fix for the ocaml binding Add a pass to lower is.constant and objectsize intrinsics This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374784	2019-10-14 16:15:14 +00:00
Dmitri Gribenko	1a21f98ac3	Revert "Add a pass to lower is.constant and objectsize intrinsics" This reverts commit r374743. It broke the build with Ocaml enabled: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218 llvm-svn: 374768	2019-10-14 12:22:48 +00:00
Joerg Sonnenberger	e4300c392d	Add a pass to lower is.constant and objectsize intrinsics This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374743	2019-10-13 23:00:15 +00:00
Simon Pilgrim	944a051ebb	IRTranslator - silence static analyzer null dereference warnings. NFCI. The CmpInst::getType() calls can be replaced by just using User::getType() that it was dyn_cast from, and we then need to assert that any default predicate cases came from the CmpInst. llvm-svn: 374716	2019-10-13 11:29:35 +00:00
Quentin Colombet	9c36ec5941	[GISel][CallLowering] Enable vector support in argument lowering The exciting code is actually already enough to handle the splitting of vector arguments but we were lacking a test case. This commit adds a test case for vector argument lowering involving splitting and enable the related support in call lowering. llvm-svn: 374589	2019-10-11 20:22:57 +00:00
Quentin Colombet	7720f11498	[MachineIRBuilder] Fix an assertion failure with buildMerge Teach buildMerge how to deal with scalar to vector kind of requests. Prior to this patch, buildMerge would issue either a G_MERGE_VALUES when all the vregs are scalars or a G_CONCAT_VECTORS when the destination vreg is a vector. G_CONCAT_VECTORS was actually not the proper instruction when the source vregs were scalars and the compiler would assert that the sources must be vectors. Instead we want is to issue a G_BUILD_VECTOR when we are in this situation. This patch fixes that. llvm-svn: 374588	2019-10-11 20:22:47 +00:00
Marcello Maggioni	a064edf55e	[GISel] Simplifying return from else in function. NFC Forgot to integrate this little change in previous commit llvm-svn: 374463	2019-10-10 21:51:30 +00:00
Marcello Maggioni	0112123eea	[GISel] Allow getConstantVRegVal() to return G_FCONSTANT values. In GISel we have both G_CONSTANT and G_FCONSTANT, but because in GISel we don't really have a concept of Float vs Int value the only difference between the two is where the data originates from. What both G_CONSTANT and G_FCONSTANT return is just a bag of bits with the constant representation in it. By making getConstantVRegVal() return G_FCONSTANTs bit representation as well we allow ConstantFold and other things to operate with G_FCONSTANT. Adding tests that show ConstantFolding to work on mixed G_CONSTANT and G_FCONSTANT sources. Differential Revision: https://reviews.llvm.org/D68739 llvm-svn: 374458	2019-10-10 21:46:26 +00:00
Guillaume Chatelet	ff054b9e32	[Alignment][NFC] Use llv::Align in GISelKnownBits Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68786 llvm-svn: 374369	2019-10-10 15:38:22 +00:00
Matt Arsenault	3cd3959fe2	GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR. llvm-svn: 374252	2019-10-09 22:44:43 +00:00
Matt Arsenault	4bcdcad91b	GlobalISel: Partially implement lower for G_INSERT llvm-svn: 373946	2019-10-07 19:13:27 +00:00
Matt Arsenault	27269054d2	GlobalISel: Add target pre-isel instructions Allows targets to introduce regbankselectable pseudo-instructions. Currently the closet feature to this is an intrinsic. However this requires creating a public intrinsic declaration. This litters the public intrinsic namespace with operations we don't necessarily want to expose to IR producers, and would rather leave as private to the backend. Use a new instruction bit. A previous attempt tried to keep using enum value ranges, but it turned into a mess. llvm-svn: 373937	2019-10-07 18:43:29 +00:00
Jordan Rose	fdaa742174	Second attempt to add iterator_range::empty() Doing this makes MSVC complain that `empty(someRange)` could refer to either C++17's std::empty or LLVM's llvm::empty, which previously we avoided via SFINAE because std::empty is defined in terms of an empty member rather than begin and end. So, switch callers over to the new method as it is added. https://reviews.llvm.org/D68439 llvm-svn: 373935	2019-10-07 18:14:24 +00:00
Matt Arsenault	a5b9c75674	GlobalISel: Partially implement lower for G_EXTRACT Turn into shift and truncate. Doesn't yet handle pointers. llvm-svn: 373838	2019-10-06 01:37:35 +00:00
Dmitri Gribenko	827a7fab78	Revert "GlobalISel: Handle llvm.read_register" This reverts commit r373294. It broke Clang's CodeGen/arm64-microsoft-status-reg.cpp: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/18483 llvm-svn: 373310	2019-10-01 08:24:01 +00:00
Matt Arsenault	bdcc6d3d26	GlobalISel: Handle llvm.read_register SelectionDAG has a bunch of machinery to defer this to selection time for some reason. Just directly emit a copy during IRTranslator. The x86 usage does somewhat questionably check hasFP, which could depend on the whole function being at minimum translated. This does lose the convergent bit if the callsite had it, which may be a problem. We also lose that in general for intrinsics, which may also be a problem. llvm-svn: 373294	2019-10-01 02:07:16 +00:00
Matt Arsenault	ed85b0cee6	GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sources Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU. llvm-svn: 373287	2019-10-01 01:06:48 +00:00
Daniel Sanders	cbe13a1461	[globalisel][knownbits] Allow targets to call GISelKnownBits::computeKnownBitsImpl() Summary: It seems we missed that the target hook can't query the known-bits for the inputs to a target instruction. Fix that oversight Reviewers: aditya_nandakumar Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67380 llvm-svn: 373264	2019-09-30 20:55:53 +00:00
Jessica Paquette	b1c1095fdc	[AArch64][GlobalISel] Support lowering variadic musttail calls This adds support for lowering variadic musttail calls. To do this, we have to... - Detect a musttail call in a variadic function before attempting to lower the call's formal arguments. This is done in the IRTranslator. - Compute forwarded registers in `lowerFormalArguments`, and add copies for those registers. - Restore the forwarded registers in `lowerTailCall`. Because there doesn't seem to be any nice way to wrap these up into the outgoing argument handler, the restore code in `lowerTailCall` is done separately. Also, irritatingly, you have to make sure that the registers don't overlap with any passed parameters. Otherwise, the scheduler doesn't know what to do with the extra copies and asserts. Add call-translator-variadic-musttail.ll to test this. This is pretty much the same as the X86 musttail-varargs.ll test. We didn't have as nice of a test to base this off of, but the idea is the same. Differential Revision: https://reviews.llvm.org/D68043 llvm-svn: 373226	2019-09-30 16:49:13 +00:00
Amara Emerson	509a4947c9	Add an operand to memory intrinsics to denote the "tail" marker. We need to propagate this information from the IR in order to be able to safely do tail call optimizations on the intrinsics during legalization. Assuming it's safe to do tail call opt without checking for the marker isn't safe because the mem libcall may use allocas from the caller. This adds an extra immediate operand to the end of the intrinsics and fixes the legalizer to handle it. Differential Revision: https://reviews.llvm.org/D68151 llvm-svn: 373140	2019-09-28 05:33:21 +00:00
Guillaume Chatelet	18f805a7ea	[Alignment][NFC] Remove unneeded llvm:: scoping on Align types llvm-svn: 373081	2019-09-27 12:54:21 +00:00
Jessica Paquette	8535a8672e	[AArch64][GlobalISel] Choose CCAssignFns per-argument for tail call lowering When checking for tail call eligibility, we should use the correct CCAssignFn for each argument, rather than just checking if the caller/callee is varargs or not. This is important for tail call lowering with varargs. If we don't check it, then basically any varargs callee with parameters cannot be tail called on Darwin, for one thing. If the parameters are all guaranteed to be in registers, this should be entirely safe. On top of that, not checking for this could potentially make it so that we have the wrong stack offsets when checking for tail call eligibility. Also refactor some of the stuff for CCAssignFnForCall and pull it out into a helper function. Update call-translator-tail-call.ll to show that we can now correctly tail call on Darwin. Also add two extra tail call checks. The first verifies that we still respect the caller's stack size, and the second verifies that we still don't tail call when a varargs function has a memory argument. Differential Revision: https://reviews.llvm.org/D67939 llvm-svn: 372897	2019-09-25 16:45:35 +00:00
Amara Emerson	adec1209e6	[GlobalISel][IRTranslator] Fix switch table lowering to use signed LE not unsigned. We were miscompiling switch value comparisons with the wrong signedness, which shows up when we have things like switch case values with i1 types, which end up being legalized incorrectly. Fixes PR43383 llvm-svn: 372675	2019-09-24 00:09:23 +00:00
Guillaume Chatelet	1ae7905fc8	[Alignment][NFC] DataLayout migration to llvm::Align Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67914 llvm-svn: 372596	2019-09-23 12:41:36 +00:00
Simon Pilgrim	f6f6c6ca3b	Localizer - fix "variable used but never read" analyzer warning. NFCI. Simplify the code by separating the modification of the Changed variable from returning it. llvm-svn: 372583	2019-09-23 11:38:10 +00:00
Amara Emerson	7ac1039957	[GlobalISel] Defer setting HasCalls on MachineFrameInfo to selection time. We currently always set the HasCalls on MFI during translation and legalization if we're handling a call or legalizing to a libcall. However, if that call is later optimized to a tail call then we don't need the flag. The flag being set to true causes frame lowering to always save and restore FP/LR, which adds unnecessary code. This change does the same thing as SelectionDAG and ports over some code that scans instructions after selection, using TargetInstrInfo to determine if target opcodes are known calls. Code size geomean improvements on CTMark: -O0 : 0.1% -Os : 0.3% Differential Revision: https://reviews.llvm.org/D67868 llvm-svn: 372443	2019-09-20 23:52:07 +00:00
Matt Arsenault	3ecab8e455	Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338	2019-09-19 16:26:14 +00:00
Hans Wennborg	13bdae8541	Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC. > > Since now intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, simlar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_ instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372314	2019-09-19 12:33:07 +00:00
Matt Arsenault	d8399d12cd	GlobalISel: Don't materialize immarg arguments to intrinsics Encode them directly as an imm argument to G_INTRINSIC. Since now intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, simlar to MachineBasicBlock operands. This should also enable handling of patterns for some G_ instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372285	2019-09-19 01:33:14 +00:00
Amara Emerson	9d64721ca5	[GlobalISel] Partially revert r371901. r371901 was overeager and widenScalarDst() and the like in the legalizer attempt to increment the insert point given in order to add new instructions after the currently legalizing inst. In cases where the insertion point is not exactly the current instruction, then callers need to de-compensate for the behaviour by decrementing the insertion iterator before calling them. It's not a nice state of affairs, for now just undo the problematic parts of the change. llvm-svn: 372050	2019-09-16 23:46:03 +00:00
Simon Pilgrim	a8a4953fdf	[GlobalISel] findGISelOptimalMemOpLowering - remove dead initalization. NFCI. Fixes static analyzer warning that "Value stored to 'NewTySize' during its initialization is never read". llvm-svn: 371937	2019-09-15 16:56:06 +00:00
Amara Emerson	02bcc86b08	[GlobalISel] Fix insertion point of new instructions to be after PHIs. For some reason we sometimes insert new instructions one instruction before the first non-PHI when legalizing. This can result in having non-PHI instructions before PHIs, which mean that PHI elimination doesn't catch them. Differential Revision: https://reviews.llvm.org/D67570 llvm-svn: 371901	2019-09-13 21:49:24 +00:00
Jessica Paquette	727328ab63	[AArch64][GlobalISel] Tail call memory intrinsics Because memory intrinsics are handled differently than other calls, we need to check them for tail call eligiblity in the legalizer. This allows us to still inline them when it's beneficial to do so, but also tail call when possible. This adds simple tail calling support for when the intrinsic is followed by a return. It ports the attribute checks from `TargetLowering::isInTailCallPosition` into a similarly-named function in LegalizerHelper.cpp. The target-specific `isUsedByReturnOnly` hook is not ported here. Update tailcall-mem-intrinsics.ll to show that GlobalISel can now tail call memory intrinsics. Update legalize-memcpy-et-al.mir to have a case where we don't tail call. Differential Revision: https://reviews.llvm.org/D67566 llvm-svn: 371893	2019-09-13 20:25:58 +00:00
Matt Arsenault	4d33918034	AMDGPU/GlobalISel: Legalize G_FMAD Unlike SelectionDAG, treat this as a normally legalizable operation. In SelectionDAG this is supposed to only ever formed if it's legal, but I've found that to be restricting. For AMDGPU this is contextually legal depending on whether denormal flushing is allowed in the use function. Technically we currently treat the denormal mode as a subtarget feature, so custom lowering could be avoided. However I consider this to be a defect, and this should be contextually dependent on the controllable rounding mode of the parent function. llvm-svn: 371800	2019-09-13 00:44:35 +00:00
Jessica Paquette	a42070a6aa	[AArch64][GlobalISel] Support sibling calls with outgoing arguments This adds support for lowering sibling calls with outgoing arguments. e.g ``` define void @foo(i32 %a) ``` Support is ported from AArch64ISelLowering's `isEligibleForTailCallOptimization`. The only thing that is missing is a full port of `TargetLowering::parametersInCSRMatch`. So, if we're using swiftself, we'll never tail call. - Rename `analyzeCallResult` to `analyzeArgInfo`, since the function is now used for both outgoing and incoming arguments - Teach `OutgoingArgHandler` about tail calls. Tail calls use frame indices for stack arguments. - Teach `lowerFormalArguments` to set the bytes in the caller's stack argument area. This is used later to check if the tail call's parameters will fit on the caller's stack. - Add `areCalleeOutgoingArgsTailCallable` to perform the eligibility check on the callee's outgoing arguments. For testing: - Update call-translator-tail-call to verify that we can now tail call with outgoing arguments, use G_FRAME_INDEX for stack arguments, and respect the size of the caller's stack - Remove GISel-specific check lines from speculation-hardening.ll, since GISel now tail calls like the other selectors - Add a GISel test line to tailcall-string-rvo.ll since we can tail call in that test now - Add a GISel test line to tailcall_misched_graph.ll since we tail call there now. Add specific check lines for GISel, since the debug output from the machine-scheduler differs with GlobalISel. The dependency still holds, but the output comes out in a different order. Differential Revision: https://reviews.llvm.org/D67471 llvm-svn: 371780	2019-09-12 22:10:36 +00:00
Amara Emerson	55d86f04c7	[AArch64][GlobalISel] Fall back on attempts to allocate split types on the stack. First we were asserting that the ValNo of a VA was the wrong value. It doesn't actually make a difference for us in CallLowering but fix that anyway to silence the assert. The bigger issue was that after fixing the assert we were generating invalid MIR because the merging/unmerging of values split across multiple registers wasn't also implemented for memory locs. This happens when we run out of registers and have to pass the split types like i128 -> i64 x 2 on the stack. This is do-able, but for now just fall back. llvm-svn: 371693	2019-09-11 23:53:23 +00:00
Jessica Paquette	469d42fcf6	[GlobalISel] When a tail call is emitted in a block, stop translating it This fixes a crash in tail call translation caused by assume and lifetime_end intrinsics. It's possible to have instructions other than a return after a tail call which will still have `Analysis::isInTailCallPosition` return true. (Namely, lifetime_end and assume intrinsics.) If we emit a tail call, we should stop translating instructions in the block. Otherwise, we can end up emitting an extra return, or dead instructions in general. This makes the verifier unhappy, and is generally unfortunate for codegen. This also removes the code from AArch64CallLowering that checks if we have a tail call when lowering a return. This is covered by the new code now. Also update call-translator-tail-call.ll to show that we now properly tail call in the presence of lifetime_end and assume. Differential Revision: https://reviews.llvm.org/D67415 llvm-svn: 371572	2019-09-10 23:34:45 +00:00
Jessica Paquette	2af5b193d5	[AArch64][GlobalISel] Support sibling calls with mismatched calling conventions Add support for sibcalling calls whose calling convention differs from the caller's. - Port over `CCState::resultsCombatible` from CallingConvLower.cpp into CallLowering. This is used to verify that the way the caller and callee CC handle incoming arguments matches up. - Add `CallLowering::analyzeCallResult`. This is basically a port of `CCState::AnalyzeCallResult`, but using `ArgInfo` rather than `ISD::InputArg`. - Add `AArch64CallLowering::doCallerAndCalleePassArgsTheSameWay`. This checks that the calling conventions are compatible, and that the caller and callee preserve the same registers. For testing: - Update call-translator-tail-call.ll to show that we can now handle this. - Add a GISel line to tailcall-ccmismatch.ll to show that we will not tail call when the regmasks don't line up. Differential Revision: https://reviews.llvm.org/D67361 llvm-svn: 371570	2019-09-10 23:25:12 +00:00
Aditya Nandakumar	5112b71126	[GlobalISel]: Fix a bug where we could dereference None getConstantVRegVal returns None when dealing with constants > 64 bits. Don't assume we always have a value in GISelKnownBits. llvm-svn: 371465	2019-09-09 22:51:41 +00:00
Tim Northover	06d93e0a25	GlobalISel: fix unused warnings in release builds. llvm-svn: 371385	2019-09-09 10:36:58 +00:00
Tim Northover	36147adc0b	GlobalISel: add combiner to form indexed loads. Loosely based on DAGCombiner version, but this part is slightly simpler in GlobalIsel because all address calculation is performed by G_GEP. That makes the inc/dec distinction moot so there's just pre/post to think about. No targets can handle it yet so testing is via a special flag that overrides target hooks. llvm-svn: 371384	2019-09-09 10:04:23 +00:00
Matt Arsenault	cf10372119	GlobalISel: Add G_FMAD instruction llvm-svn: 371254	2019-09-06 20:49:10 +00:00
Daniel Sanders	f803237926	[globalisel][knownbits] Account for missing type constraints Now that we look through copies, it's possible to visit registers that have a register class constraint but not a type constraint. Avoid looking through copies when this occurs as the SrcReg won't be able to determine it's bit width or any known bits. Along the same lines, if the initial query is on a register that doesn't have a type constraint then the result is a default-constructed KnownBits, that is, a 1-bit fully-unknown value. llvm-svn: 371116	2019-09-05 20:26:02 +00:00
Jessica Paquette	20e8667098	Recommit "[AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls" Recommit basic sibling call lowering (https://reviews.llvm.org/D67189) The issue was that if you have a return type other than void, call lowering will emit COPYs to get the return value after the call. Disallow sibling calls other than ones that return void for now. Also proactively disable swifterror tail calls for now, since there's a similar issue with COPYs there. Update call-translator-tail-call.ll to include test cases for each of these things. llvm-svn: 371114	2019-09-05 20:18:34 +00:00
Simon Pilgrim	071287c5a9	Revert rL370996 from llvm/trunk: [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls This adds support for basic sibling call lowering in AArch64. The intent here is to only handle tail calls which do not change the ABI (hence, sibling calls.) At this point, it is very restricted. It does not handle - Vararg calls. - Calls with outgoing arguments. - Calls whose calling conventions differ from the caller's calling convention. - Tail/sibling calls with BTI enabled. This patch adds - `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent to the same function in AArch64ISelLowering.cpp (albeit with the restrictions above.) - `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in AArch64ISelLowering.cpp. - `getCallOpcode`, which is exactly what it sounds like. Tail/sibling calls are lowered by checking if they pass target-independent tail call positioning checks, and checking if they satisfy `isEligibleForTailCallOptimization`. If they do, then a tail call instruction is emitted instead of a normal call. If we have a sibling call (which is always the case in this patch), then we do not emit any stack adjustment operations. When we go to lower a return, we check if we've already emitted a tail call. If so, then we skip the return lowering. For testing, this patch - Adds call-translator-tail-call.ll to test which tail calls we currently lower, which ones we don't, and which ones we shouldn't. - Updates branch-target-enforcement-indirect-calls.ll to show that we fall back as expected. Differential Revision: https://reviews.llvm.org/D67189 ........ This fails on EXPENSIVE_CHECKS builds due to a -verify-machineinstrs test failure in CodeGen/AArch64/dllimport.ll llvm-svn: 371051	2019-09-05 10:38:39 +00:00
Jessica Paquette	b78324fc40	[AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls This adds support for basic sibling call lowering in AArch64. The intent here is to only handle tail calls which do not change the ABI (hence, sibling calls.) At this point, it is very restricted. It does not handle - Vararg calls. - Calls with outgoing arguments. - Calls whose calling conventions differ from the caller's calling convention. - Tail/sibling calls with BTI enabled. This patch adds - `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent to the same function in AArch64ISelLowering.cpp (albeit with the restrictions above.) - `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in AArch64ISelLowering.cpp. - `getCallOpcode`, which is exactly what it sounds like. Tail/sibling calls are lowered by checking if they pass target-independent tail call positioning checks, and checking if they satisfy `isEligibleForTailCallOptimization`. If they do, then a tail call instruction is emitted instead of a normal call. If we have a sibling call (which is always the case in this patch), then we do not emit any stack adjustment operations. When we go to lower a return, we check if we've already emitted a tail call. If so, then we skip the return lowering. For testing, this patch - Adds call-translator-tail-call.ll to test which tail calls we currently lower, which ones we don't, and which ones we shouldn't. - Updates branch-target-enforcement-indirect-calls.ll to show that we fall back as expected. Differential Revision: https://reviews.llvm.org/D67189 llvm-svn: 370996	2019-09-04 22:54:52 +00:00
Matt Arsenault	5ff310e298	GlobalISel: Add basic legalization for G_BITREVERSE llvm-svn: 370979	2019-09-04 20:46:15 +00:00
Daniel Sanders	b276a9a51e	[globalisel] Support trivial COPY in GISelKnownBits Summary: Allow GISelKnownBits to look through the trivial case of TargetOpcode::COPY Reviewers: aditya_nandakumar Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67131 llvm-svn: 370955	2019-09-04 18:59:43 +00:00
Matt Arsenault	70becc20fa	GlobalISel: Add G_BITREVERSE This is the first failing pattern for AMDGPU and is trivial to handle. llvm-svn: 370927	2019-09-04 17:06:53 +00:00
Amara Emerson	5d5150f0b4	[GlobalISel] Fix G_SEXT narrowScalar to bail out of unsupported type combination. Similar to the issue with G_ZEXT that was fixed earlier, this is a quick to fall back if the source type is not exactly half of the dest type. Fixes the clang-cmake-aarch64-lld bot build. llvm-svn: 370847	2019-09-04 07:58:45 +00:00
Amara Emerson	2a2c25ba48	[AArch64][GlobalISel] Legalize 128 bit divisions to libcalls. Now that we have the infrastructure to support s128 types as parameters we can expand these to libcalls. Differential Revision: https://reviews.llvm.org/D66185 llvm-svn: 370823	2019-09-03 21:42:32 +00:00
Amara Emerson	fbaf425b79	[GlobalISel][CallLowering] Add support for splitting types according to calling conventions. On AArch64, s128 types have to be split into s64 GPRs when passed as arguments. This change adds the generic support in call lowering for dealing with multiple registers, for incoming and outgoing args. Support for splitting for return types not yet implemented. Differential Revision: https://reviews.llvm.org/D66180 llvm-svn: 370822	2019-09-03 21:42:28 +00:00
Amara Emerson	453ef4e376	[AArch64][GlobalISel] Fix zext narrowScalar to use the right type when creating the merges. Fixes PR43171. llvm-svn: 370627	2019-09-02 08:18:55 +00:00
Matt Arsenault	466ec2d552	GlobalISel: Fix missing pass dependency llvm-svn: 370496	2019-08-30 17:41:58 +00:00
Petar Avramovic	6412b56513	[MIPS GlobalISel] Lower fptoui Add lower for G_FPTOUI. Algorithm is similar to the SDAG version in TargetLowering::expandFP_TO_UINT. Lower G_FPTOUI for MIPS32. Differential Revision: https://reviews.llvm.org/D66929 llvm-svn: 370431	2019-08-30 05:44:02 +00:00
Matt Arsenault	093ebf9275	GlobalISel: Don't compute known bits for non-integral GEP llvm-svn: 370392	2019-08-29 17:55:05 +00:00
Matt Arsenault	b2b9a23758	GlobalISel: Add maskedValueIsZero and signBitIsZero to known bits I dropped the DemandedElts since it seems to be missing from some of the new interfaces, but not others. llvm-svn: 370389	2019-08-29 17:24:36 +00:00
Matt Arsenault	caff0a88dd	GlobalISel: Add known bits to InstructionSelector AMDGPU uses this for some addressing mode selection patterns. The analysis run itself doesn't do anything so it seems easier to just always require this than adding a way to opt in. llvm-svn: 370388	2019-08-29 17:24:32 +00:00
Jessica Paquette	af0bd41e06	[AArch64][GlobalISel] Fall back when translating musttail calls These are currently translated as normal functions calls in AArch64. Until we have proper tail call lowering, we shouldn't translate these. Differential Revision: https://reviews.llvm.org/D66842 llvm-svn: 370225	2019-08-28 16:19:01 +00:00
Amara Emerson	e20b91c265	[GlobalISel] Replace hard coded dynamic alloca handling with G_DYN_STACKALLOC. This change moves the actual stack pointer manipulation into the legalizer, available to targets via lower(). The codegen is slightly different because we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than adding a negative amount via G_GEP. Differential Revision: https://reviews.llvm.org/D66678 llvm-svn: 370104	2019-08-27 19:54:27 +00:00
Petar Avramovic	a393238422	[GlobalISel] Factor narrowScalar for G_ASHR and G_LSHR. NFC Main difference is in the way Hi for Long shift (HiL) is made. G_LSHR fills HiL with zeros, while G_ASHR fills HiL with sign bit value. Differential Revision: https://reviews.llvm.org/D66589 llvm-svn: 370064	2019-08-27 14:33:05 +00:00
Petar Avramovic	d568ed40e0	[GlobalISel] Fix narrowScalar for shifts to match algorithm from SDAG Fix typos. Use Hi and Lo prefixes for Or instead of LHS and RHS to match names of surrounding variables. Differential Revision: https://reviews.llvm.org/D66587 llvm-svn: 370062	2019-08-27 14:22:32 +00:00
Volkan Keles	277631e3b8	[GlobalISel] Legalizer: Retry combining illegal artifacts as long as there new artifacts Summary: Currently, Legalizer aborts if it’s unable to legalize artifacts. However, it’s possible to combine them after processing the rest of the instruction because the legalization is likely to generate more artifacts that allow ArtifactCombiner to combine away them. Instead, move illegal artifacts to another list called RetryList and wait until all of the instruction in InstList are legalized. After that, check if there is any new artifacts and try to combine them again if that’s the case. If not, abort. The idea is similar to D59339, but the approach is a bit different. This patch fixes the issue described above, but the legalizer still may be unable to handle some cases depending on when to legalize artifacts. So, in the long run, we probably need a different legalization strategy that handles this dependency in a better way. Reviewers: dsanders, aditya_nandakumar, qcolombet, arsenm, aemerson, paquette Reviewed By: dsanders Subscribers: jvesely, wdng, nhaehnle, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65894 llvm-svn: 369805	2019-08-23 20:30:35 +00:00
Matt Arsenault	fba82858f2	GlobalISel: Don't create G_UADDE with constant false carry in The x86 tests are now broken (in paticular add-scalar.ll now hits the DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a custom lowering for this, so that will need to be implemented. llvm-svn: 369673	2019-08-22 17:29:17 +00:00
Matt Arsenault	954a012b4c	GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources This is necessary for handling <3 x s16> on AMDGPU, assuming this should be handled as 2 separate legalization actions. The alternative would be for fewerElementsVector to handle 3->2. llvm-svn: 369547	2019-08-21 16:59:10 +00:00
Petar Avramovic	5b4c5c2c54	[MIPS GlobalISel] NarrowScalar G_TRUNC Add NarrowScalar for G_TRUNC when NarrowTy is half the size of source. NarrowScalar G_TRUNC to s32 for MIPS32. Differential Revision: https://reviews.llvm.org/D66202 llvm-svn: 369509	2019-08-21 09:26:39 +00:00
Amara Emerson	56606a4db3	[AArch64][GlobalISel] Add support for narrowScalar of G_ZEXT We do this by merging the source with the high bits set to 0. Differential Revision: https://reviews.llvm.org/D66181 llvm-svn: 369480	2019-08-21 00:12:37 +00:00
Aditya Nandakumar	08bd080872	[GlobalISel] Handle multiple registers in dbg.value intrinsic https://reviews.llvm.org/D66077 The value passed into dbg.value may relate to multiple registers, each of which need a DBG_VALUE. This fix calls MIRBuilder.buildDirectDbgValue for each register. Without this, IR passed in from flang-compiler/flang may fail an assertion in getOrCreateVReg. Patch by : peterwaller-arm. llvm-svn: 369403	2019-08-20 16:28:37 +00:00
Amara Emerson	c809230a69	[AArch64][GlobalISel] Lower G_SHUFFLE_VECTOR with 1 elt src and 1 elt mask. Again, it's weird that these are allowed. Since lowering support was added in r368709 we started crashing on compiling the neon intrinsics test in the test suite. This fixes the lowering to fold the 1 elt src/mask case into copies. llvm-svn: 369135	2019-08-16 18:06:53 +00:00
Volkan Keles	0ae6006bee	[GlobalISel] CSEMIRBuilder: Add support for G_GEP Summary: This patch adds G_GEP to `shouldCSEOpc` so that it can be CSEd. It also refactors `translateGetElementPtr` by replacing `createGenericVirtualRegister` calls with types. Reviewers: aditya_nandakumar, arsenm, dsanders, paquette, aemerson Reviewed By: aditya_nandakumar Subscribers: wdng, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66316 llvm-svn: 369070	2019-08-15 23:45:45 +00:00
Daniel Sanders	0c47611131	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041	2019-08-15 19:22:08 +00:00
Jonas Devlieghere	0eaee545ee	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013	2019-08-15 15:54:37 +00:00
Aditya Nandakumar	c65ac865c3	[GlobalISel]: Fix lowering of G_Shuffle_vector where we pick up the wrong source index https://reviews.llvm.org/D66182 llvm-svn: 368781	2019-08-14 01:23:33 +00:00
Aditya Nandakumar	615eee6402	[GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR with scalar sources https://reviews.llvm.org/D66171 llvm-svn: 368753	2019-08-13 21:49:11 +00:00
Matt Arsenault	28215caa60	GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES Odd sized vectors aren't handled yet. llvm-svn: 368713	2019-08-13 16:26:28 +00:00
Matt Arsenault	690645bda0	GlobalISel: Implement lower for G_SHUFFLE_VECTOR llvm-svn: 368709	2019-08-13 16:09:07 +00:00
Matt Arsenault	5af9cf042f	GlobalISel: Change representation of shuffle masks Currently shufflemasks get emitted as any other constant, and you end up with a bunch of virtual registers of G_CONSTANT with a G_BUILD_VECTOR. The AArch64 selector then asserts on anything that doesn't fit this pattern. This isn't an ideal representation, and should avoid legalization and have fewer opportunities for a representational error. Rather than invent a new shuffle mask operand type, similar to what ShuffleVectorSDNode does, just track the original IR Constant mask operand. I don't completely like the idea of adding another link to the IR, but MIR is already quite dependent on IR constants already, and this will allow sharing the shuffle mask utility functions with the IR. llvm-svn: 368704	2019-08-13 15:34:38 +00:00
Amara Emerson	e14c91b71a	[GlobalISel] Make the InstructionSelector instance non-const, allowing state to be maintained. Currently we can't keep any state in the selector object that we get from subtarget. As a result we have to plumb through all our variables through multiple functions. This change makes it non-const and adds a virtual init() method to allow further state to be captured for each target. AArch64 makes use of this in this patch to cache a call to hasFnAttribute() which is expensive to call, and is used on each selection of G_BRCOND. Differential Revision: https://reviews.llvm.org/D65984 llvm-svn: 368652	2019-08-13 06:26:59 +00:00
Aditya Nandakumar	70fdfed45f	[GlobalISel]: Add KnownBits for G_XOR https://reviews.llvm.org/D66119 llvm-svn: 368648	2019-08-13 04:32:33 +00:00
Aditya Nandakumar	55371e697c	[GISel]: Fix a bug in KnownBits where we should have been using SizeInBits https://reviews.llvm.org/D66039 We were using getIndexSize instead of getIndexSizeInBits(). Added test case for G_PTRTOINT and G_INTTOPTR. llvm-svn: 368618	2019-08-12 21:28:12 +00:00
Daniel Sanders	e9a57c2b23	[globalisel] Add G_SEXT_INREG Summary: Targets often have instructions that can sign-extend certain cases faster than the equivalent shift-left/arithmetic-shift-right. Such cases can be identified by matching a shift-left/shift-right pair but there are some issues with this in the context of combines. For example, suppose you can sign-extend 8-bit up to 32-bit with a target extend instruction. %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity) %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_ASHR %2:_(s32), i32 1 would reasonably combine to: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 25 which no longer matches the special case. If your shifts and extend are equal cost, this would break even as a pair of shifts but if your shift is more expensive than the extend then it's cheaper as: %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8 %3:_(s32) = G_ASHR %2:_(s32), i32 1 It's possible to match the shift-pair in ISel and emit an extend and ashr. However, this is far from the only way to break this shift pair and make it hard to match the extends. Another example is that with the right known-zeros, this: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_MUL %2:_(s32), i32 2 can become: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 23 All upstream targets have been configured to lower it to the current G_SHL,G_ASHR pair but will likely want to make it legal in some cases to handle their faster cases. To follow-up: Provide a way to legalize based on the constant. At the moment, I'm thinking that the best way to achieve this is to provide the MI in LegalityQuery but that opens the door to breaking core principles of the legalizer (legality is not context sensitive). That said, it's worth noting that looking at other instructions and acting on that information doesn't violate this principle in itself. It's only a violation if, at the end of legalization, a pass that checks legality without being able to see the context would say an instruction might not be legal. That's a fairly subtle distinction so to give a concrete example, saying %2 in: %1 = G_CONSTANT 16 %2 = G_SEXT_INREG %0, %1 is legal is in violation of that principle if the legality of %2 depends on %1 being constant and/or being 16. However, legalizing to either: %2 = G_SEXT_INREG %0, 16 or: %1 = G_CONSTANT 16 %2:_(s32) = G_SHL %0, %1 %3:_(s32) = G_ASHR %2, %1 depending on whether %1 is constant and 16 does not violate that principle since both outputs are genuinely legal. Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61289 llvm-svn: 368487	2019-08-09 21:11:20 +00:00
Tim Northover	e1a5f668b3	GlobalISel: pack various parameters for lowerCall into a struct. I've now needed to add an extra parameter to this call twice recently. Not only is the signature getting extremely unwieldy, but just updating all of the callsites and implementations is a pain. Putting the parameters in a struct sidesteps both issues. llvm-svn: 368408	2019-08-09 08:26:38 +00:00
Tim Northover	3c10f346dc	GlobalISel: factor common code from translateCall and translateInvoke. NFC. llvm-svn: 368166	2019-08-07 12:43:53 +00:00
Aditya Nandakumar	6bbfde5c48	[GISel]: Fix trivial build breakage llvm-svn: 368067	2019-08-06 17:53:04 +00:00
Aditya Nandakumar	c8ac029d0a	[GISel]: Add GISelKnownBits analysis https://reviews.llvm.org/D65698 This adds a KnownBits analysis pass for GISel. This was done as a pass (compared to static functions) so that we can add other features such as caching queries(within a pass and across passes) in the future. This patch only adds the basic pass boiler plate, and implements a lazy non caching knownbits implementation (ported from SelectionDAG). I've also hooked up the AArch64PreLegalizerCombiner pass to use this - there should be no compile time regression as the analysis is lazy. llvm-svn: 368065	2019-08-06 17:18:29 +00:00
Amara Emerson	bc1172df14	[GlobalISel][CallLowering] Rename isArgumentHandler() -> isIncomingArgumentHandler() Previous name and comment incorrectly implied it was just for formal arg handlers, which is not true. llvm-svn: 367945	2019-08-05 23:05:28 +00:00
Amara Emerson	85e5e28ab4	[AArch64][GlobalISel] Inline tiny memcpy et al at -O0. FastISel already does this since the initial arm64 port was upstreamed, so it seems there are no issues with doing this at -O0 for very small memcpys. Gives a 0.2% geomean code size improvement on CTMark. Differential Revision: https://reviews.llvm.org/D65758 llvm-svn: 367919	2019-08-05 20:02:52 +00:00
Guillaume Chatelet	c97a3d15d2	[LLVM][Alignment] Introduce Alignment Type Summary: This is patch is part of a serie to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jfb, jakehehrlich Reviewed By: jfb Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65514 llvm-svn: 367828	2019-08-05 11:02:05 +00:00
Guillaume Chatelet	65e4b47aad	[LLVM][Alignment] Introduce Alignment Type in DataLayout Summary: This is patch is part of a serie to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jfb, jakehehrlich Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65521 Make getFunctionPtrAlign() return MaybeAlign llvm-svn: 367817	2019-08-05 09:00:43 +00:00
Amara Emerson	c835164a47	Re-commit "[GlobalISel] Add legalization support for non-power-2 loads and stores"" This is an old commit that exposed a bug in the GISel importer, which caused non-truncating stores to be selected for truncating store patterns. Now that's been fixed in r367737 this can go back in. llvm-svn: 367739	2019-08-02 23:44:24 +00:00
Tim Northover	522fb7eedc	GlobalISel: support swiftself attribute llvm-svn: 367683	2019-08-02 14:09:49 +00:00
Daniel Sanders	2bea69bf65	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633	2019-08-01 23:27:28 +00:00
Matt Arsenault	d9d30a408e	GlobalISel: Lower scalarizing unmerge of a vector to shifts AMDGPU sometimes has legal s16 and <2 x s16> operations, but all registers are really 32-bit. An unmerge destination really should ben widened to a 32-bit register. If widening a scalarizing vector with a target size that matches the vector size, bitcast to integer and extract the relevant bits with shifts. I'm not sure if this is the right place for this. This could arguably be part of widenScalar for the result. I also have a growing feeling that we're missing a bitcast legalize action. llvm-svn: 367604	2019-08-01 19:10:05 +00:00
Matt Arsenault	5faa533e47	GlobalISel: Fix widenScalar for G_MERGE_VALUES to pointer AMDGPU testcase isn't broken now, but will be in a future patch without this. llvm-svn: 367591	2019-08-01 18:13:16 +00:00
Matt Arsenault	7bedceb5b2	GlobalISel: moreElementsVector for G_LOAD/G_STORE AMDGPU change and test is a placeholder until a future patch with complete handling. llvm-svn: 367503	2019-08-01 01:44:22 +00:00
Mark Lacey	7b8d3eb9e2	[GISel] Pass MD_callees metadata down in call lowering. Summary: This will make it possible to improve IPRA by taking into account register usage in indirect calls. NFC yet; this is just laying the groundwork to start building up patches to take advantage of the information for improved register allocation. Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65488 llvm-svn: 367476	2019-07-31 20:34:02 +00:00
Matt Arsenault	9cf980d4a7	GlobalISel: Add G_ATOMICRMW_{FADD\|FSUB} llvm-svn: 367369	2019-07-30 23:56:30 +00:00
Austin Kerbow	c99f62e313	[AMDGPU/GlobalISel] Add llvm.amdgcn.fdiv.fast legalization. Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64966 llvm-svn: 367344	2019-07-30 18:49:16 +00:00
Amara Emerson	7bc4fad0fb	[AArch64][GlobalISel] Implement narrowing of G_SEXT. We need this to narrow a sext to s128. Differential Revision: https://reviews.llvm.org/D65357 llvm-svn: 367164	2019-07-26 23:46:38 +00:00
Amara Emerson	13af1ed8e3	[GlobalISel] Support for inlining memcpy, memset and memmove calls. This introduces a new family of combiner helper routines that re-use the target specific cost model from SelectionDAG, and generate inline implementations of the memcpy family of intrinsics. The combines are only enabled at optimization levels higher than -O0, and give very substantial performance improvements. Differential Revision: https://reviews.llvm.org/D65167 llvm-svn: 366951	2019-07-24 22:17:31 +00:00
Amara Emerson	a1997ce2e5	[AArch64][GlobalISel] Fix a crash during s128 G_ICMP legalization due to r366317. r366317 added a legalization for s128 G_ICMP narrow scalar which tried to hard code the result type of the new legalized G_SELECT. Change this to instead use type of the original G_ICMP result and allow the target to legalize it if necessary later. llvm-svn: 366943	2019-07-24 20:46:42 +00:00
Aditya Nandakumar	d7504a1569	[GISel]: Attach missing range metadata while translating G_LOADs https://reviews.llvm.org/D65048 Attach range information to G_LOAD when only defining one register. reviewed by: arsenm llvm-svn: 366656	2019-07-21 14:07:54 +00:00
Amara Emerson	cf12c7815f	[GlobalISel] Translate calls to memcpy et al to G_INTRINSIC_W_SIDE_EFFECTs and legalize later. I plan on adding memcpy optimizations in the GlobalISel pipeline, but we can't do that unless we delay lowering to actual function calls. This patch changes the translator to generate G_INTRINSIC_W_SIDE_EFFECTS for these functions, and then have each target specify that using the new custom legalizer for intrinsics hook that they want it expanded it a libcall. Differential Revision: https://reviews.llvm.org/D64895 llvm-svn: 366516	2019-07-19 00:24:45 +00:00
Matt Arsenault	0966dd0d69	GlobalISel: Handle widenScalar of arbitrary G_MERGE_VALUES sources Extract the sources to the GCD of the original size and target size, padding with implicit_def as necessary. Also fix the case where the requested source type is wider than the original result type. This was ignoring the type, and just using the destination. Do the operation in the requested type and truncate back. llvm-svn: 366367	2019-07-17 20:22:44 +00:00
Matt Arsenault	914a59cad8	GlobalISel: Handle more cases for widenScalar of G_MERGE_VALUES Use an anyext to the requested type for the leftover operand to produce a slightly wider type, and then truncate the final merge. I have another implementation almost ready which handles arbitrary widens, but I think it produces worse code in this example (which I think is 90% due to not folding redundant copies or folding out implicit_def users), so I wanted to add this as a baseline first. llvm-svn: 366366	2019-07-17 20:22:38 +00:00
Petar Avramovic	1e62635d05	[MIPS GlobalISel] ClampScalar and select pointer G_ICMP Add narrowScalar to half of original size for G_ICMP. ClampScalar G_ICMP's operands 2 and 3 to to s32. Select G_ICMP for pointers for MIPS32. Pointer compare is same as for integers, it is enough to declare them as legal type. Differential Revision: https://reviews.llvm.org/D64856 llvm-svn: 366317	2019-07-17 12:08:01 +00:00
Matt Arsenault	1c3f4ec7fc	GlobalISel: Add overload of handleAssignments with CCState AMDGPU needs to allocate special argument registers separately from the user function argument list, so needs direct control over the CCState. The ArgLocs argument is only really necessary because CCState doesn't allow access to it. llvm-svn: 366279	2019-07-16 22:41:34 +00:00
Matt Arsenault	434d664095	GlobalISel: Implement narrowScalar for vector extract/insert indexes llvm-svn: 366113	2019-07-15 19:37:34 +00:00
Fangrui Song	b251cc0d91	Delete dead stores llvm-svn: 365903	2019-07-12 14:58:15 +00:00
Matt Arsenault	7e71902b79	GlobalISel: Use Register llvm-svn: 365780	2019-07-11 14:18:19 +00:00
Amara Emerson	7a4d2df04a	[AArch64][GlobalISel] Optimize compare and branch cases with G_INTTOPTR and unknown values. Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes obstruct attempts to find constant source values. These usually come about when try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough about this operation allows the CBZ/CBNZ optimization to catch more cases. This change also improves the case where we can't find a constant source at all. Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit a cmp and conditional branch, saving an instruction. The cumulative code size improvement of this change plus D64354 is 5.5% geomean on arm64 CTMark -O0. Differential Revision: https://reviews.llvm.org/D64377 llvm-svn: 365690	2019-07-10 19:21:43 +00:00
Matt Arsenault	6ce1b4fec5	GlobalISel: Legalization for G_FMINNUM/G_FMAXNUM llvm-svn: 365658	2019-07-10 16:31:19 +00:00
Matt Arsenault	e595a2c964	GlobalISel: Define the full family of FP min/max instructions llvm-svn: 365657	2019-07-10 16:31:15 +00:00
Matt Arsenault	b1843e130a	GlobalISel: Implement lower for G_FCOPYSIGN In SelectionDAG AMDGPU treated these as legal, but this was mostly because the bitcasts required for FP types were painful. Theoretically the bitpattern should eventually match to bfi, so don't bother trying to get the patterns to import. llvm-svn: 365583	2019-07-09 23:34:29 +00:00
Matt Arsenault	14a4495155	GlobalISel: Combine unmerge of merge with intermediate cast This eliminates some illegal intermediate vectors when operations are scalarized. llvm-svn: 365566	2019-07-09 22:19:13 +00:00
Amara Emerson	6616e269a6	[AArch64][GlobalISel] Optimize conditional branches followed by unconditional branches If we have an icmp->brcond->br sequence where the brcond just branches to the next block jumping over the br, while the br takes the false edge, then we can modify the conditional branch to jump to the br's target while inverting the condition of the incoming icmp. This means we can eliminate the br as an unconditional branch to the fallthrough block. Differential Revision: https://reviews.llvm.org/D64354 llvm-svn: 365510	2019-07-09 16:05:59 +00:00
Petar Avramovic	be20e36107	[MIPS GlobalISel] Register bank select for G_PHI. Select i64 phi Select gprb or fprb when def/use register operand of G_PHI is used/defined by either: copy to/from physical register or instruction with only one mapping available for that use/def operand. Integer s64 phi is handled with narrowScalar when mapping is applied, produced artifacts are combined away. Manually set gprb to all register operands of instructions created during narrowScalar. Differential Revision: https://reviews.llvm.org/D64351 llvm-svn: 365494	2019-07-09 14:36:17 +00:00
Matt Arsenault	079f77b590	GlobalISel: Convert some build functions to using SrcOp/DstOp llvm-svn: 365343	2019-07-08 16:27:47 +00:00
Matt Arsenault	bd791b57f8	GlobalISel: widenScalar for G_BUILD_VECTOR llvm-svn: 365320	2019-07-08 13:48:06 +00:00
Matt Arsenault	43cbca50e4	GlobalISel: Fix widenScalar for pointer typed G_MERGE_VALUES llvm-svn: 365093	2019-07-03 23:08:06 +00:00
Matt Arsenault	ce690544a6	GlobalISel: Add G_FENCE The pattern importer is for some reason emitting checks for G_CONSTANT for the immediate operands. llvm-svn: 364926	2019-07-02 14:16:39 +00:00
Matt Arsenault	c9f14f29f5	GlobalISel: Try to widen merges with other merges If the requested source type an be used as a merge source type, create a merge of merges. This avoids creating large, illegal extensions and bit-ops directly to the result type. llvm-svn: 364841	2019-07-01 19:36:10 +00:00
Aditya Nandakumar	1023a2eca3	[GlobalISel]: Allow backends to custom legalize Intrinsics https://reviews.llvm.org/D31359 Add a hook "legalizeInstrinsic" to allow backends to override this and custom lower/legalize intrinsics. llvm-svn: 364821	2019-07-01 17:53:50 +00:00
Matt Arsenault	6f74f55750	GlobalISel: Implement lower for min/max llvm-svn: 364816	2019-07-01 17:18:03 +00:00
Diana Picus	2ba16011c1	Fixup r364512 Fix stack-use-after-scope errors from r364512. One instance was already fixed in r364611 - this patch simplifies that fix and addresses one more instance of similar code. Discussed in: https://reviews.llvm.org/D63905 llvm-svn: 364778	2019-07-01 15:07:38 +00:00
Fangrui Song	78ee2fbf98	Cleanup: llvm::bsearch -> llvm::partition_point after r364719 llvm-svn: 364720	2019-06-30 11:19:56 +00:00
Matt Arsenault	3018d1845b	GlobalISel: Use Register llvm-svn: 364618	2019-06-28 01:47:44 +00:00
Matt Arsenault	5e66db6b8c	GlobalISel: Convert rest of MachineIRBuilder to using Register llvm-svn: 364615	2019-06-28 01:16:41 +00:00
Amara Emerson	ecb7ac35f9	[GlobalISel][IRTranslator] Fix some PHI bugs related to jump tables when optimizations are used. The new switch lowering code that tries to generate jump tables and range checks were tested at -O0 on arm64, but on -O3 the generic switch lowering code goes to town on trying to generate optimized lowerings, e.g. multiple jump tables, range checks etc. This exposed bugs in the way PHI nodes are handled because the CFG looks even stranger after all of this is done. llvm-svn: 364613	2019-06-27 23:56:34 +00:00
Rumeet Dhindsa	ddc2804e1a	Fix ASAN error caused by commit r364512. This patch intends to fix ASAN stack-use-after-scope error. This is at least a short-term fix to unbreak LLVM's mainline. Differential Revision: https://reviews.llvm.org/D63905 llvm-svn: 364611	2019-06-27 23:37:04 +00:00
Diana Picus	74a50a723b	[GlobalISel] Remove [un]packRegs from IRTranslator Remove the last use of packRegs from IRTranslator and delete pack/unpackRegs. This introduces a fallback to DAGISel for intrinsics with aggregate arguments, since we don't have a testcase for them so it's hard to tell how we'd want to handle them. Discussed in https://reviews.llvm.org/D63551 llvm-svn: 364514	2019-06-27 09:49:07 +00:00
Diana Picus	43fb5ae50c	[GlobalISel] Accept multiple vregs for lowerCall's args Change the interface of CallLowering::lowerCall to accept several virtual registers for each argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63551 llvm-svn: 364512	2019-06-27 09:18:03 +00:00
Diana Picus	8138996128	[GlobalISel] Accept multiple vregs for lowerCall's result Change the interface of CallLowering::lowerCall to accept several virtual registers for the call result, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63550 llvm-svn: 364511	2019-06-27 09:15:53 +00:00
Diana Picus	c3dbe23977	[GlobalISel] Accept multiple vregs in lowerFormalArgs Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches. With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0. AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC. Mips doesn't support aggregates yet, so it's also NFC. x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument. Differential Revision: https://reviews.llvm.org/D63549 llvm-svn: 364510	2019-06-27 08:54:17 +00:00
Diana Picus	69ce1c1319	[GlobalISel] Allow multiple VRegs in ArgInfo. NFC Allow CallLowering::ArgInfo to contain more than one virtual register. This is useful when passes split aggregates into several virtual registers, but need to also provide information about the original type to the call lowering. Used in follow-up patches. Differential Revision: https://reviews.llvm.org/D63548 llvm-svn: 364509	2019-06-27 08:50:53 +00:00
Matt Arsenault	faeaedf8e9	GlobalISel: Remove unsigned variant of SrcOp Force using Register. One downside is the generated register enums require explicit conversion. llvm-svn: 364194	2019-06-24 16:16:12 +00:00
Matt Arsenault	e3a676e9ad	CodeGen: Introduce a class for registers Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191	2019-06-24 15:50:29 +00:00
Fangrui Song	43e14390b0	Make GlobalISel depend on SelectionDAG after D63169 GlobalISel/IRTranslator.cpp now references SelectionDAG/FunctionLoweringInfo.cpp. This fixes a link error in -DBUILD_SHARED_LIBS=on builds: ld.lld: error: undefined symbol: llvm::FunctionLoweringInfo::clear() >>> referenced by IRTranslator.cpp:2198 (../lib/CodeGen/GlobalISel/IRTranslator.cpp:2198) >>> lib/CodeGen/GlobalISel/CMakeFiles/LLVMGlobalISel.dir/IRTranslator.cpp.o:(llvm::IRTranslator::finalizeFunction()) llvm-svn: 364124	2019-06-22 01:30:17 +00:00
Amara Emerson	fe4625fb24	[GlobalISel][IRTranslator] Change switch table translation to generate jump tables and range checks. This change makes use of the newly refactored SwitchLoweringUtils code from SelectionDAG to in order to generate jump tables and range checks where appropriate. Much of this code is ported from SDAG with some modifications. We generate G_JUMP_TABLE and G_BRJT instructions when JT opportunities are found. This means that targets which previously relied on the naive one MBB per case stmt translation will now start falling back until they add support for the new opcodes. For range checks, we don't generate any previously unused operations. This just recognizes contiguous ranges of case values and generates a single block per range. Single case value blocks are just a special case of ranges so we get that support almost for free. There are still some optimizations missing that I haven't ported over, and bit-tests are also unimplemented. This patch series is already complex enough. Actual arm64 support for selection of jump tables is coming in a later patch. Differential Revision: https://reviews.llvm.org/D63169 llvm-svn: 364085	2019-06-21 18:10:38 +00:00
Amara Emerson	bc0d08e0ee	[GlobalISel][Localizer] Allow localization of G_INTTOPTR and chains of instructions. G_INTTOPTR can prevent the localizer from moving G_CONSTANTs, but since it's essentially a side effect free cast instruction we can remat both instructions. This patch changes the localizer to enable localization of the chains by iterating over the entry block instructions in reverse order. That way, uses will localized first, and then the defs are free to be localized as well. This also changes the previous SmallPtrSet of localized instructions to use a SetVector instead. We're dealing with pointers and need deterministic iteration order. Overall, this change improves ARM64 -O0 CTMark code size by around 0.7% geomean. Differential Revision: https://reviews.llvm.org/D63630 llvm-svn: 364001	2019-06-21 00:36:19 +00:00
Petar Avramovic	153bd24eda	[MIPS GlobalISel] Select integer to floating point conversions Select G_SITOFP and G_UITOFP for MIPS32. Differential Revision: https://reviews.llvm.org/D63542 llvm-svn: 363912	2019-06-20 09:05:02 +00:00
Petar Avramovic	4b4dae1c76	[MIPS GlobalISel] Select floating point to integer conversions Select G_FPTOSI and G_FPTOUI for MIPS32. Differential Revision: https://reviews.llvm.org/D63541 llvm-svn: 363911	2019-06-20 08:52:53 +00:00

... 4 5 6 7 8 ...

1404 Commits