llvm-project

Commit Graph

Author	SHA1	Message	Date
Nicolai Hähnle	3fa989d4fd	DomTree: remove explicit use of DomTreeNodeBase::iterator Summary: Almost all uses of these iterators, including implicit ones, really only need the const variant (as it should be). The only exception is in NewGVN, which changes the order of dominator tree child nodes. Change-Id: I4b5bd71e32d71b0c67b03d4927d93fe9413726d4 Reviewers: arsenm, RKSimon, mehdi_amini, courbet, rriddle, aartbik Subscribers: wdng, Prazek, hiraditya, kuhar, rogfer01, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, vkmr, Kayjukh, jurahul, msifontes, cfe-commits, llvm-commits Tags: #clang, #mlir, #llvm Differential Revision: https://reviews.llvm.org/D83087	2020-07-08 18:18:49 +02:00
hsmahesha	0ed2c04636	[AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes Summary: While clustering mem ops, AMDGPU target needs to consider number of clustered bytes to decide on max number of mem ops that can be clustered. This patch adds support to pass number of clustered bytes to target mem ops clustering logic. Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar Reviewed By: foad Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80545	2020-06-01 22:52:34 +05:30
Christopher Tetreault	5a99ec10f5	[SVE] Eliminate calls to default-false VectorType::get() from X86 Reviewers: efriedma, sdesmalen, c-rhodes, craig.topper Reviewed By: craig.topper Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80331	2020-05-29 16:16:07 -07:00
Craig Topper	8c72b0271b	[CodeGen] Use Align in MachineConstantPool.	2020-05-12 10:06:40 -07:00
Craig Topper	e4c454b065	[X86] Add a few more shuffles to hasUndefRegUpdate. Mostly found by asserting on tests that have undef operands. I'm sure this isn't an exhaustive list.	2020-05-10 12:30:16 -07:00
Craig Topper	c7be6a86f4	[X86] Teach getUndefRegClearance that we use undef for inputs to PUNPCK in some cases. This enables the register to be changed from XMM/YMM/ZMM0 to instead match the other source. This prevents a false dependency. I added all the integer unpck instructions, but the tests only show changes for BW and WD. Unfortunately, we can have undef on operand 1 or 2 of the AVX instructions. This breaks the interface with hasUndefRegUpdate which used to tell which operand to check. Now we scan the input operands looking for an undef register and then ask hasUndefRegUpdate if its an instruction we care about and which operands of that instruction we care about. I also had to make some changes to the load folding code to always pass operand 1 to hasUndefRegUpdate. I've updated hasUndefRegUpdate to return false when ForLoadFold is set for instructions that are not explicitly blocked for load folding in isel patterns. Differential Revision: https://reviews.llvm.org/D79615	2020-05-09 12:19:30 -07:00
Craig Topper	446a3be8f1	[X86] Add PACK instructions to hasUndefRegUpdate so the BreakFalseDeps pass will reassign an undef second source to match the first source We generate PACK instructions with an undef second source when we are truncating from a 128-bit vector to something narrower and we don't care about the upper bits of the vector register. The register allocation process will always assign untied undef uses to xmm0. This creates a false dependency on xmm0. By adding these instructions to hasUndefRegUpdate, we can get the BreakFalseDeps pass to reassign the source to match the other input. Normally this interface is used for instructions that might need an xor inserted to break the dependency. But the pass also has a heuristic that tries to use the same register as other sources. That should always be possible for these instructions so we'll never trigger the xor dependency break. Differential Revision: https://reviews.llvm.org/D79032	2020-04-28 15:11:32 -07:00
Nick Desaulniers	bc7f3240e6	[X86] remove derived method w/ same impl as base Summary: While looking into issues with IfConverter, I noticed that X86InstrInfo::isUnpredicatedTerminator matched its overriden implementation in TargetInstrInfo::isUnpredicatedTerminator. Reviewers: craig.topper, hfinkel, MaskRay, echristo Reviewed By: MaskRay, echristo Subscribers: hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62749	2020-04-27 17:41:00 -07:00
Kazuaki Ishizaki	0312b9f550	[llvm] NFC: Fix trivial typo in rst and td files Differential Revision: https://reviews.llvm.org/D77469	2020-04-23 14:26:32 +09:00
Andrew Litteken	8d5024f7fe	fix to outline cfi instruction when can be grouped in a tail call [MachineOutliner] fix test for excluding CFI and add test to include CFI in outlining New test to check that we only outline CFI instruction if all CFI Instructions in the function would be captured by the outlining adding x86 tests analagous to AARCH64 cfi tests Revision: https://reviews.llvm.org/D77852	2020-04-17 22:26:34 -07:00
Matt Arsenault	30ebafaa56	CodeGen: Convert some TII hooks to use Register	2020-04-03 14:52:54 -04:00
Yuanfang Chen	ece79f4708	[X86] make sure POP has implicit def/use of stack pointer when materializing 8-bit immediates for minsize Summary: Otherwise PostRA list scheduler may reorder instruction, such as schedule this ''' pushq $0x8 pop %rbx lea 0x2a0(%rsp),%r15 ''' to ''' pushq $0x8 lea 0x2a0(%rsp),%r15 pop %rbx ''' by mistake. The patch is to prevent this to happen by making sure POP has implicit use of SP. Reviewers: craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77031	2020-03-30 09:25:31 -07:00
Guillaume Chatelet	bdf77209b9	[Alignment][NFC] Use Align version of getMachineMemOperand Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jyknight, sdardis, nemanjai, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, jfb, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77059	2020-03-30 15:46:27 +00:00
Guillaume Chatelet	74eac9031a	[Alignment][NFC] MachineMemOperand::getAlign/getBaseAlign Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, dschuff, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, jrtc27, atanasyan, jfb, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76925	2020-03-27 15:49:13 +00:00
Guillaume Chatelet	3ba550a05a	[Alignment][NFC] Use TFL::getStackAlign() Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: dylanmckay, sdardis, nemanjai, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76551	2020-03-23 13:48:29 +01:00
Djordje Todorovic	3b984641a7	[DebugInfo] Fix build failure on the mingw Add the workaround for the X86::MOV16ri when describing call site parameters.	2020-03-12 08:18:01 +01:00
George Burgess IV	174c3eb69f	[x86][slh] Move isDataInvariant* functions Patch by Zola Bridges! From the review: """ I moved these functions to X86InstrInfo.cpp, so they are available from another pass. In addition, this is a step toward resolving the FIXME to move this metadata to the instruction tables. This is the final step to make these two data invariance checks available for non-SLH passes. The other two steps were here: - https://reviews.llvm.org/D70283 - https://reviews.llvm.org/D75650 Tested via llvm-lit llvm/test/CodeGen/X86/speculative-load-hardening* """ Differential Revision: https://reviews.llvm.org/D75654	2020-03-09 17:07:44 -07:00
Craig Topper	6ca96765c7	[X86] Disable commuting for the first source operand of zero masked scalar fma intrinsic instructions. I believe this is the correct fix for D75506 rather than disabling all commuting. We can still commute the remaining two sources. Differential Revision:m https://reviews.llvm.org/D75526	2020-03-04 14:35:53 -08:00
Sanjay Patel	90fd859f51	[x86] use instruction-level fast-math-flags to drive MachineCombiner The code changes here are hopefully straightforward: 1. Use MachineInstruction flags to decide if FP ops can be reassociated (use both "reassoc" and "nsz" to be consistent with IR transforms; we probably don't need "nsz", but that's a safer interpretation of the FMF). 2. Check that both nodes allow reassociation to change instructions. This is a stronger requirement than we've usually implemented in IR/DAG, but this is needed to solve the motivating bug (see below), and it seems unlikely to impede optimization at this late stage. 3. Intersect/propagate MachineIR flags to enable further reassociation in MachineCombiner. We managed to make MachineCombiner flexible enough that no changes are needed to that pass itself. So this patch should only affect x86 (assuming no other targets have implemented the hooks using MachineIR flags yet). The motivating example in PR43609 is another case of fast-math transforms interacting badly with special FP ops created during lowering: https://bugs.llvm.org/show_bug.cgi?id=43609 The special fadd ops used for converting int to FP assume that they will not be altered, so those are created without FMF. However, the MachineCombiner pass was being enabled for FP ops using the global/function-level TargetOption for "UnsafeFPMath". We managed to run instruction/node-level FMF all the way down to MachineIR sometime in the last 1-2 years though, so we can do better now. The test diffs require some explanation: 1. llvm/test/CodeGen/X86/fmf-flags.ll - no target option for unsafe math was specified here, so MachineCombiner kicks in where it did not previously; to make it behave consistently, we need to specify a CPU schedule model, so use the default model, and there are no code diffs. 2. llvm/test/CodeGen/X86/machine-combiner.ll - replace the target option for unsafe math with the equivalent IR-level flags, and there are no code diffs; we can't remove the NaN/nsz options because those are still used to drive x86 fmin/fmax codegen (special SDAG opcodes). 3. llvm/test/CodeGen/X86/pow.ll - similar to #1 4. llvm/test/CodeGen/X86/sqrt-fastmath.ll - similar to #1, but MachineCombiner does some reassociation of the estimate sequence ops; presumably these are perf wins based on latency/throughput (and we get some reduction of move instructions too); I'm not sure how it affects numerical accuracy, but the test reflects reality better now because we would expect MachineCombiner to be enabled if the IR was generated via something like "-ffast-math" with clang. 5. llvm/test/CodeGen/X86/vec_int_to_fp.ll - this is the test added to model PR43609; the fadds are not reassociated now, so we should get the expected results. 6. llvm/test/CodeGen/X86/vector-reduce-fadd-fast.ll - similar to #1 7. llvm/test/CodeGen/X86/vector-reduce-fmul-fast.ll - similar to #1 Differential Revision: https://reviews.llvm.org/D74851	2020-02-27 15:19:37 -05:00
Sander de Smalen	8fbc925807	Add OffsetIsScalable to getMemOperandWithOffset Summary: Making `Scale` a `TypeSize` in AArch64InstrInfo::getMemOpInfo, has the effect that all places where this information is used (notably, TargetInstrInfo::getMemOperandWithOffset) will need to consider Scale - and derived, Offset - possibly being scalable. This patch adds a new operand `bool &OffsetIsScalable` to TargetInstrInfo::getMemOperandWithOffset and fixes up all the places where this function is used, to consider the offset possibly being scalable. In most cases, this means bailing out because the algorithm does not (or cannot) support scalable offsets in places where it does some form of alias checking for example. Reviewers: rovka, efriedma, kristof.beyls Reviewed By: efriedma Subscribers: wuzish, kerbowa, MatzeB, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, javed.absar, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72758	2020-02-18 15:53:29 +00:00
Craig Topper	e629674176	[X86] Add more scalar intrinsic instructions to isNonFoldablePartialRegisterLoad. I think this covers most if not all of the scalar intrinsic instructions.	2020-02-08 20:41:36 -08:00
Simon Pilgrim	10417ad2e4	[X86] Standardize BROADCAST enum names (PR31079) Tweak EVEX implementation names so it matches the other variants by adding the 'r' prefix. Oddly some of the subvec broadcast ops already matched.	2020-02-08 16:55:00 +00:00
Craig Topper	600f2e1c4d	[X86] Remove SETB_C8r/SETB_C16r pseudo instructions. Use SETB_C32r and EXTRACT_SUBREG instead. Only 32 and 64 bit SBB are dependency breaking instructons on some CPUs. The 8 and 16 bit forms have to preserve upper bits of the GPR. This patch removes the smaller forms and selects the wider form instead. I had to do this with custom code as the tblgen generated code glued the eflags copytoreg to the extract_subreg instead of to the SETB pseudo. Longer term I think we can remove X86ISD::SETCC_CARRY and use (X86ISD::SBB zero, zero). We'll want to keep the pseudo and select (X86ISD::SBB zero, zero) to either a MOV32r0+SBB for targets where there is no dependency break and SETB_C32/SETB_C64 for targets that have a dependency break. May want some way to avoid the MOV32r0 if the instruction that produced the carry flag happened to def a register that we can use for the dependency. I think the flag copy lowering should be using NEG instead of SUB to handle SETB. That would avoid the MOV32r0 there. Or maybe it should use a ADC with -1 to recreate the carry flag and keep the SETB? That would avoid a MOVZX on the input of the SUB. Differential Revision: https://reviews.llvm.org/D74024	2020-02-06 10:22:24 -08:00
Simon Moll	5c8ba508b2	[NFC] unsigned->Register in storeRegTo/loadRegFromStack Summary: This patch makes progress on the 'unsigned -> Register' rewrite for `TargetInstrInfo::loadRegFromStack` and `TII::storeRegToStack`. Reviewers: arsenm, craig.topper, uweigand, jpienaar, atanasyan, venkatra, robertlytton, dylanmckay, t.p.northover, kparzysz, tstellar, k-ishizaka Reviewed By: arsenm Subscribers: wuzish, merge_guards_bot, jyknight, sdardis, nemanjai, jvesely, wdng, nhaehnle, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73870	2020-02-03 14:22:16 +01:00
Jay Foad	b482e1bfe2	[CodeGen] Make use of MachineInstrBuilder::getReg Reviewers: arsenm Subscribers: wdng, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73262	2020-01-23 13:38:13 +00:00
Jay Foad	e0f0d0e55c	[MachineScheduler] Allow clustering mem ops with complex addresses The generic BaseMemOpClusterMutation calls into TargetInstrInfo to analyze the address of each load/store instruction, and again to decide whether two instructions should be clustered. Previously this had to represent each address as a single base operand plus a constant byte offset. This patch extends it to support any number of base operands. The old target hook getMemOperandWithOffset is now a convenience function for callers that are only prepared to handle a single base operand. It calls the new more general target hook getMemOperandsWithOffset. The only requirements for the base operands returned by getMemOperandsWithOffset are: - they can be sorted by MemOpInfo::Compare, such that clusterable ops get sorted next to each other, and - shouldClusterMemOps knows what they mean. One simple follow-on is to enable clustering of AMDGPU FLAT instructions with both vaddr and saddr (base register + offset register). I've left a FIXME in the code for this case. Differential Revision: https://reviews.llvm.org/D71655	2020-01-22 14:28:24 +00:00
Amara Emerson	67a8775322	[AArch64] Don't generate gpr CSEL instructions in early-ifcvt if regclasses aren't compatible. In GlobalISel we may in some unfortunate circumstances generate PHIs with operands that are on separate banks. If-conversion doesn't currently check for that case and ends up generating a CSEL on AArch64 with incorrect register operands. Differential Revision: https://reviews.llvm.org/D72961	2020-01-21 16:51:31 -08:00
Craig Topper	b1dcd84c7e	[X86] Copy the nofpexcept flag when folding a load into an instruction using the load folding tables./	2020-01-13 22:02:45 -08:00
Craig Topper	86f48999f4	[X86] Fix typo in getCMovOpcode. The 64-bit HasMemoryOperand line was using CMOV32rm instead of CMOV64rm. Not sure how to test this. We have no test coverage that passes true for HasMemoryOperand.	2019-12-31 21:50:38 -08:00
Kristof Beyls	870f39d310	Fix assertion failure in getMemOperandWithOffsetWidth This fixes an assertion failure that triggers inside getMemOperandWithOffset when Machine Sinking calls it on a MachineInstr that is not a memory operation. Different backends implement getMemOperandWithOffset differently: some return false on non-memory MachineInstrs, others assert. The Machine Sinking pass in at least SinkingPreventsImplicitNullCheck relies on getMemOperandWithOffset to return false on non-memory MachineInstrs, instead of asserting. This patch updates the documentation on getMemOperandWithOffset that it should return false on any MachineInstr it cannot handle, instead of asserting. It also adapts the in-tree backends accordingly where necessary. Differential Revision: https://reviews.llvm.org/D71359	2019-12-17 10:56:09 +00:00
David Stenberg	6965f835b4	[DebugInfo] Make describeLoadedValue() reg aware Summary: Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes situations in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: ormris, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70431	2019-12-09 10:47:49 +01:00
David Stenberg	f3696533f2	Revert "[DebugInfo] Make describeLoadedValue() reg aware" This reverts commit `3cd93a4efc`. I'll recommit with a well-formatted arcanist commit message.	2019-12-09 10:45:13 +01:00
David Stenberg	3cd93a4efc	[DebugInfo] Make describeLoadedValue() reg aware Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes a case in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.	2019-12-09 10:44:17 +01:00
Wang, Pengfei	c1c673303d	[X86] Model MXCSR for all AVX512 instructions Summary: Model MXCSR for all AVX512 instructions Reviewers: craig.topper, RKSimon, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70881	2019-12-04 08:07:38 +08:00
Matt Arsenault	e6c9a9af39	Use MCRegister in copyPhysReg	2019-11-11 14:42:33 +05:30
Djordje Todorovic	8d2ccd1ac3	Reland: [TII] Use optional destination and source pair as a return value; NFC Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return optional machine operand pair of destination and source registers. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D69622	2019-11-08 13:00:39 +01:00
Simon Pilgrim	3842b94c4e	Revert rG57ee0435bd47f23f3939f402914c231b4f65ca5e - [TII] Use optional destination and source pair as a return value; NFC This is breaking MSVC builds: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20375	2019-10-31 18:00:29 +00:00
Djordje Todorovic	57ee0435bd	[TII] Use optional destination and source pair as a return value; NFC Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return optional machine operand pair of destination and source registers. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D69622	2019-10-31 15:34:49 +01:00
David Candler	92aa0c2dbc	[cfi] Add flag to always generate .debug_frame This adds a flag to LLVM and clang to always generate a .debug_frame section, even if other debug information is not being generated. In situations where .eh_frame would normally be emitted, both .debug_frame and .eh_frame will be used. Differential Revision: https://reviews.llvm.org/D67216	2019-10-31 09:48:30 +00:00
Craig Topper	6cb181f086	[X86] Rewrite hasReassociableOperands and setSpecialOperandAttr to not hardcode number of operands or position of the EFLAGS operand. This makes the code immune to the MXCSR addition in D68121.	2019-10-30 14:34:10 -07:00
David Stenberg	74a72e6848	[DebugInfo] Stop describing imms in TargetInstrInfo's describeLoadedValue() impl Summary: The default implementation of the describeLoadedValue() hook uses the MoveImm property to determine if an instruction moves an immediate. If an instruction has that property the function returns the second operand, assuming that that is the immediate value the instruction moves. As far as I can tell, the MoveImm property does not imply that the second operand is the immediate value, nor that any other operand necessarily holds the immediate value; it just means that the instruction moves some immediate value. One example where the second operand is not the immediate is SystemZ's LZER instruction, which moves a zero immediate implicitly: $f0S = LZER. That case triggered an out-of-bound assertion when getting the operand. I have added a test case for that instruction. Another example is ARM's MVN instruction, which holds the logical bitwise NOT'd value of the immediate that is moved. For the following reproducer: extern void foo(int); int main() { foo(-11); } an incorrect call site value would be emitted: $ clang --target=arm foo.c -O1 -g -Xclang -femit-debug-entry-values \ -c -o - \| ./build/bin/llvm-dwarfdump - \| \ grep -A2 call_site_parameter 0x00000058: DW_TAG_GNU_call_site_parameter DW_AT_location (DW_OP_reg0 R0) DW_AT_GNU_call_site_value (DW_OP_lit10) Another example is the A2_combineii instruction on Hexagon which moves two immediates to a super-register: $d0 = A2_combineii 20, 10. Perhaps these are rare exceptions, and most MoveImm instructions hold the immediate in the second operand, but in my opinion the default implementation of the hook should only describe values that it can, by some contract, guarantee are safe to describe, rather than leaving it up to the targets to override the exceptions, as that can silently result in incorrect call site values. This patch adds X86's relevant move immediate instructions to the target's hook implementation, so this commit should be a NFC for that target. We need to do the same for ARM and AArch64. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: vsk Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D69109	2019-10-23 11:41:29 +02:00
Reid Kleckner	0ad6c191de	Prune Analysis includes from SelectionDAG.h Only forward declarations are needed here. Follow-on to r375311. llvm-svn: 375319	2019-10-19 01:07:48 +00:00
Craig Topper	912870573c	[X86] convertToThreeAddress, make sure second operand of SUB32ri is really an immediate before calling getImm(). It might be a symbol instead. We can't fold those since we can't negate them. Similar for other SUB with immediates. Fixes PR43529. llvm-svn: 373397	2019-10-01 21:55:55 +00:00
Craig Topper	d3f82b8b97	[X86] Add VMOVSSZrrk/VMOVSDZrrk/VMOVSSZrrkz/VMOVSDZrrkz to getUndefRegClearance. We have isel patterns that can put an IMPLICIT_DEF on one of the sources for these instructions. So we should make sure we break any dependencies there. This should be done by just using one of the other sources. llvm-svn: 373025	2019-09-26 22:56:06 +00:00
Simon Pilgrim	5f2d8b2618	[TargetInstrInfo] Let findCommutedOpIndices take const MachineInstr& Neither the base implementation of findCommutedOpIndices nor any in-tree target modifies the instruction passed in and there is no reason why they would in the future. Committed on behalf of @hvdijk (Harald van Dijk) Differential Revision: https://reviews.llvm.org/D66138 llvm-svn: 372882	2019-09-25 14:55:57 +00:00
Craig Topper	769dd59a27	[X86] Allow masked VBROADCAST instructions to be turned into BLENDM with a broadcast load to avoid a copy. The BLENDM instructions allow an 2 sources and an independent destination while masked VBROADCAST has the destination tied to the source. llvm-svn: 372068	2019-09-17 04:41:10 +00:00
Craig Topper	2cc57bedd5	[X86] Add support for commuting EVEX VCMP instructons with any immediate value. Previously we limited to the EQ/NE/TRUE/FALSE/ORD/UNORD immediates. llvm-svn: 372067	2019-09-17 04:41:05 +00:00
Craig Topper	359918dadf	[X86] Enable commuting of EVEX VCMP for all immediate values during isel. llvm-svn: 372065	2019-09-17 04:40:58 +00:00
Craig Topper	72624b0e59	[X86] Use xorps to create fp128 +0.0 constants. This matches what we do for f32/f64. gcc also does this for fp128. llvm-svn: 371357	2019-09-09 01:35:00 +00:00
David Stenberg	5a583665f4	[DebugInfo][X86] Describe call site values for zero-valued imms Summary: Add zero-materializing XORs to X86's describeLoadedValue() hook in order to produce call site values. I have had to change the defs logic in collectCallSiteParameters() a bit to be able to describe the XORs. The XORs implicitly define $eflags, which would cause them to never be considered, due to a guard condition that I->getNumDefs() is one. I have changed that condition so that we now only consider instructions where a forwarded register overlaps with the instruction's single explicit define. We still need to collect the implicit defines of other forwarded registers to remove them from the work list. I'm not sure how to move towards supporting instructions with multiple explicit defines, cases where forwarded register are implicitly defined, and/or cases where an instruction produces values for multiple forwarded registers. Perhaps the describeLoadedValue() hook should take a register argument, and we then leave it up to the hook to describe the loaded value in that register? I have not yet encountered a situation where that would be necessary though. Reviewers: aprantl, vsk, djtodoro, NikolaPrica Reviewed By: vsk Subscribers: ychen, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67225 llvm-svn: 371333	2019-09-08 14:22:06 +00:00
David Stenberg	8b70139e95	[NFC] Make the describeLoadedValue() hook return machine operand objects Summary: This changes the ParamLoadedValue pair which the describeLoadedValue() hook returns so that MachineOperand objects are returned instead of pointers. When describing call site values we may need to describe operands which are not part of the instruction. One such example is zero-materializing XORs on x86, which I have implemented support for in a child revision. Instead of having to return a pointer to an operand stored somewhere outside the instruction, start returning objects directly instead, as that simplifies the code. The MachineOperand class only holds POD members, and on x86-64 it is 32 bytes large. That combined with copy elision means that the overhead of returning a machine operand object from the hook does not become very large. I benchmarked this on a 8-thread i7-8650U machine with 32 GB RAM. The benchmark consisted of building a clang 8.0 binary configured with: -DCMAKE_BUILD_TYPE=RelWithDebInfo \ -DLLVM_TARGETS_TO_BUILD=X86 \ -DLLVM_USE_SANITIZER=Address \ -DCMAKE_CXX_FLAGS="-Xclang -femit-debug-entry-values -stdlib=libc++" The average wall clock time increased by 4 seconds, from 62:05 to 62:09, which is an 0.1% increase. Reviewers: aprantl, vsk, djtodoro, NikolaPrica Reviewed By: vsk Subscribers: hiraditya, ychen, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67261 llvm-svn: 371332	2019-09-08 14:05:10 +00:00
Simon Pilgrim	67991a59cb	[X86] X86InstrInfo::optimizeCompareInstr - fix potential null dereference. Fixes clang static-analyzer warning. Technically the MachineInstr *Sub might still be null if we're comparing zero (IsCmpZero == true), although this probably won't happen as SrcReg2 is probably == 0. llvm-svn: 371047	2019-09-05 10:18:24 +00:00
Craig Topper	3ab210862a	[X86] Add initial support for unfolding broadcast loads from arithmetic instructions to enable LICM hoisting of the load MachineLICM can hoist an invariant load, but if that load is folded it needs to be unfolded. On AVX512 sometimes this load is an broadcast load which we were previously unable to unfold. This patch adds initial support for that with a very basic list of supported instructions as a starting point. Differential Revision: https://reviews.llvm.org/D67017 llvm-svn: 370620	2019-09-01 22:14:36 +00:00
Craig Topper	1329cc6e01	[X86] Compress the flag bits in the folding tables to make room for more bits in an upcoming patch. llvm-svn: 370600	2019-08-31 23:52:21 +00:00
Craig Topper	66f03ba17d	[X86] Merge X86InstrInfo::loadRegFromAddr/storeRegToAddr into their only call site. I'm looking at unfolding broadcast loads on AVX512 which will require refactoring this code to select broadcast opcodes instead of regular load/stores in some cases. Merging them to avoid further complicating their interfaces. llvm-svn: 370484	2019-08-30 16:05:57 +00:00
Craig Topper	160ed4cab4	[X86] Explicitly list all the always trivially rematerializable instructions. Add a default with an llvm_unreachable for anything we don't expect. This seems safer that just blindly returning true for anything missing from the switch. llvm-svn: 370424	2019-08-30 00:54:36 +00:00
Craig Topper	bccd183217	[X86] Mark VPDPWSSD and VPDPWSSDS as commutable. Add stack folding tests. llvm-svn: 369792	2019-08-23 18:05:37 +00:00
Craig Topper	a17d1d2250	[X86] Use Register/MCRegister in more places in X86 This was a quick pass through some obvious places. I haven't tried the clang-tidy check. I also replaced the zeroes in getX86SubSuperRegister with X86::NoRegister which is the real sentinel name. Differential Revision: https://reviews.llvm.org/D66363 llvm-svn: 369151	2019-08-16 20:50:23 +00:00
Daniel Sanders	0c47611131	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041	2019-08-15 19:22:08 +00:00
Daniel Sanders	2bea69bf65	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633	2019-08-01 23:27:28 +00:00
Djordje Todorovic	b9973f87c6	Reland "[DwarfDebug] Dump call site debug info" The build failure found after the rL365467 has been resolved. Differential Revision: https://reviews.llvm.org/D60716 llvm-svn: 367446	2019-07-31 16:51:28 +00:00
Craig Topper	51193871da	[X86] Teach convertToThreeAddress to handle SUB with immediate We mostly avoid sub with immediate but there are a couple cases that can create them. One is the add 128, %rax -> sub -128, %rax trick in isel. The other is when a SUB immediate gets created for a compare where both the flags and the subtract value is used. If we are unable to linearize the SelectionDAG to satisfy the flag user and the sub result user from the same instruction, we will clone the sub immediate for the two uses. The one that produces flags will eventually become a compare. The other will have its flag output dead, and could then be considered for LEA creation. I added additional test cases to add.ll to show the the sub -128 trick gets converted to LEA and a case where we don't need to convert it. This showed up in the current codegen for PR42571. Differential Revision: https://reviews.llvm.org/D64574 llvm-svn: 366151	2019-07-15 23:07:56 +00:00
Craig Topper	9450b0084a	[X86] Remove offset of 8 from the call to FuseInst for UNPCKLPDrr folding added in r365287. This was copy/pasted from above and I forgot to change it. We just need the default offset of 0 here. Fixes PR42616. llvm-svn: 366011	2019-07-14 04:13:33 +00:00
Craig Topper	b828f0b90a	[X86] Use MachineInstr::findRegisterDefOperand to simplify some code in optimizeCompareInstr. NFCI llvm-svn: 365946	2019-07-12 19:26:35 +00:00
Craig Topper	98f931639b	[X86] Add NEG to isUseDefConvertible. We can use the C flag from NEG to detect that the input was zero. Really we could probably use the Z flag too. But C matches what we'd do for usubo 0, X. Haven't found a test case for this due to the usubo formation in CGP. But I verified if I comment out the CGP code this transformation catches some of the same cases. llvm-svn: 365929	2019-07-12 17:52:17 +00:00
Djordje Todorovic	0739ccd3b5	Revert "[DwarfDebug] Dump call site debug info" A build failure was found on the SystemZ platform. This reverts commit 9e7e73578e54cd22b3c7af4b54274d743b6607cc. llvm-svn: 365886	2019-07-12 09:45:12 +00:00
Craig Topper	d916f23b83	[X86] Add BLSR and BLSMSK to isUseDefConvertible. Unfortunately subo formation in CGP prevents obvious ways of testing this. But we already have BLSI in here and the flag behavior is well understood. Might become more useful if we improve PR42571. llvm-svn: 365702	2019-07-10 22:14:39 +00:00
Djordje Todorovic	01eaae6dd1	[DwarfDebug] Dump call site debug info Dump the DWARF information about call sites and call site parameters into debug info sections. The patch also provides an interface for the interpretation of instructions that could load values of a call site parameters in order to generate DWARF about the call site parameters. ([13/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D60716 llvm-svn: 365467	2019-07-09 11:33:56 +00:00
Craig Topper	1deca50ab1	[X86] Allow execution domain fixing to turn SHUFPD into SHUFPS. This can help with code size on SSE targets where SHUFPD requires a 0x66 prefix and SHUFPS doesn't. llvm-svn: 365293	2019-07-08 06:52:49 +00:00
Craig Topper	d8261f0288	[X86] Make movsd commutable to shufpd with a 0x02 immediate on pre-SSE4.1 targets. This can help avoid a copy or enable load folding. On SSE4.1 targets we can commute it to blendi instead. I had to make shufpd with a 0x02 immediate commutable as well since we expect commuting to be reversible. llvm-svn: 365292	2019-07-08 06:52:43 +00:00
Craig Topper	46f2b583a2	[X86] Add MOVSDrr->MOVLPDrm entry to load folding table. Add custom handling to turn UNPCKLPDrr->MOVHPDrm when load is under aligned. If the load is aligned we can turn UNPCKLPDrr into UNPCKLPDrm. llvm-svn: 365287	2019-07-08 02:10:20 +00:00
Craig Topper	317d6093df	[X86] Remove patterns from MOVLPSmr and MOVHPSmr instructions. These patterns are the same as the MOVLPDmr and MOVHPDmr patterns, but with a bitcast at the end. We can just select the PD instruction and let execution domain fixing switch to PS. llvm-svn: 365267	2019-07-06 17:59:51 +00:00
Craig Topper	d22b2d01ca	[X86] Correct the size check in foldMemoryOperandCustom. The Size either needs to be 0 meaning we aren't folding a stack reload. Or the stack slot needs to be at least 16 bytes. I've also added a paranoia check ensure the RCSize is at leat 16 bytes as well. This avoids any FR32/FR64 surprises, but I think we already filtered those earlier. All of our test case have Size as either 0 or 16 and RCSize == 16. So the Size <= 16 check worked for those cases. llvm-svn: 365234	2019-07-05 18:54:00 +00:00
Matt Arsenault	e3a676e9ad	CodeGen: Introduce a class for registers Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191	2019-06-24 15:50:29 +00:00
Craig Topper	9e1665f2d6	[X86] Add BLSI to isUseDefConvertible. Summary: BLSI sets the C flag is the input is not zero. So if its followed by a TEST of the input where only the Z flag is consumed, we can replace it with the opposite check of the C flag. We should be able to do the same for BLSMSK and BLSR, but the naive test case for those is being optimized to a subo by CodeGenPrepare. Reviewers: spatel, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63589 llvm-svn: 363957	2019-06-20 17:52:53 +00:00
Craig Topper	b4ea64570c	[X86] Remove memory instructions form isUseDefConvertible. The caller of this is looking for comparisons of the input to these instructions with 0. But the memory instructions input is an addess not a value input in a register. llvm-svn: 363907	2019-06-20 04:58:40 +00:00
Craig Topper	8582ecd8d9	[X86] Introduce new MOVSSrm/MOVSDrm opcodes that use VR128 register class. Rename the old versions that use FR32/FR64 to MOVSSrm_alt/MOVSDrm_alt. Use the new versions in patterns that previously used a COPY_TO_REGCLASS to VR128. These patterns expect the upper bits to be zero. The current set up appears to work, but I'm not sure we should be enforcing upper bits being zero through a COPY_TO_REGCLASS. I wanted to flip the arrangement and use a COPY_TO_REGCLASS to FR32/FR64 for the patterns that need an f32/f64 result, but that complicated fastisel and globalisel. I've been doing some experiments with reducing some isel patterns and ended up in a situation where I had a (SUBREG_TO_REG (COPY_TO_RECLASS (VMOVSSrm), VR128)) and our post-isel peephole was unable to avoid using an instruction for the SUBREG_TO_REG due to the COPY_TO_REGCLASS. Having a VR128 instruction removes the COPY_TO_REGCLASS that was breaking this. llvm-svn: 363643	2019-06-18 03:23:11 +00:00
Simon Pilgrim	6b56ad164c	[CodeGen] Add getMachineMemOperand + MachineMemOperand::Flags allocator helper wrapper. NFCI. Pre-commit for D62726 on behalf of @luke (Luke Lau) llvm-svn: 363257	2019-06-13 12:58:55 +00:00
Craig Topper	ed4cd44870	[X86] Add VCMPSSZrr_Intk and VCMPSDZrr_Intk to isNonFoldablePartialRegisterLoad. The non-masked versions are already in there. I'm having some trouble coming up with a way to test this right now. Most load folding should happen during isel so I'm not sure how to get peephole pass to do it. llvm-svn: 363125	2019-06-12 06:29:53 +00:00
Craig Topper	627d8168e7	[X86] Add load folding isel patterns to scalar_math_patterns and AVX512_scalar_math_fp_patterns. Also add a FIXME for the peephole pass not being able to handle this. llvm-svn: 363032	2019-06-11 04:30:53 +00:00
Jonas Paulsson	fdc4ea34e3	[SystemZ, RegAlloc] Favor 3-address instructions during instruction selection. This patch aims to reduce spilling and register moves by using the 3-address versions of instructions per default instead of the 2-address equivalent ones. It seems that both spilling and register moves are improved noticeably generally. Regalloc hints are passed to increase conversions to 2-address instructions which are done in SystemZShortenInst.cpp (after regalloc). Since the SystemZ reg/mem instructions are 2-address (dst and lhs regs are the same), foldMemoryOperandImpl() can no longer trivially fold a spilled source register since the reg/reg instruction is now 3-address. In order to remedy this, new 3-address pseudo memory instructions are used to perform the folding only when the dst and lhs virtual registers are known to be allocated to the same physreg. In order to not let MachineCopyPropagation run and change registers on these transformed instructions (making it 3-address), a new target pass called SystemZPostRewrite.cpp is run just after VirtRegRewriter, that immediately lowers the pseudo to a target instruction. If it would have been possibe to insert a COPY instruction and change a register operand (convert to 2-address) in foldMemoryOperandImpl() while trusting that the caller (e.g. InlineSpiller) would update/repair the involved LiveIntervals, the solution involving pseudo instructions would not have been needed. This is perhaps a potential improvement (see Phabricator post). Common code changes: * A new hook TargetPassConfig::addPostRewrite() is utilized to be able to run a target pass immediately before MachineCopyPropagation. * VirtRegMap is passed as an argument to foldMemoryOperand(). Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D60888 llvm-svn: 362868	2019-06-08 06:19:15 +00:00
Craig Topper	6b67dfa54c	[X86] Make masked floating point equality/ordered compares commutable for load folding purposes. Same as what is supported for the unmasked form. llvm-svn: 362717	2019-06-06 16:39:04 +00:00
Craig Topper	d0fff89b81	[X86] Add the vector integer min/max instructions to isAssociativeAndCommutative. As far as I know these should be freely reassociatable just like the floating point MAXC/MINC instructions. The reduce test changes are largely regressions and caused by the "generic" CPU we default to not having a scheduler model. The machine-combiner-int-vec.ll test shows the positive benefits of this change. Differential Revision: https://reviews.llvm.org/D62787 llvm-svn: 362629	2019-06-05 18:25:09 +00:00
Craig Topper	396a915c26	[X86] Add the SSE versions of PMULLW and PMULLD to isAssociativeAndCommutative. llvm-svn: 362309	2019-06-02 00:42:58 +00:00
Pengfei Wang	2e67d0c842	[X86] Add VP2INTERSECT instructions Support Intel AVX512 VP2INTERSECT instructions in llvm Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D62366 llvm-svn: 362188	2019-05-31 02:50:41 +00:00
David Greene	561fcc0d63	[X86-64] Fix 256-bit SET0 lowering for non-VLX targets If we don't have VLX then 256-bit SET0 should be lowered to VPXOR with ZMM registers. This restores functionality accidentally removed by r309926. Differential Revision: https://reviews.llvm.org/D62415 llvm-svn: 361843	2019-05-28 15:37:01 +00:00
Philip Reames	849ef823df	Factor out redzone ABI checks [NFCI] As requested in D58632, cleanup our red zone detection logic in the X86 backend. The existing X86MachineFunctionInfo flag is used to track whether we use the redzone (via a particularly optimization?), but there's no common way to check whether the function has a red zone. I'd appreciate careful review of the uses being updated. I think they are NFC, but a careful eye from someone else would be appreciated. Differential Revision: https://reviews.llvm.org/D61799 llvm-svn: 360479	2019-05-10 22:55:42 +00:00
Simon Pilgrim	3c975a0ab5	[X86] Reduce scope of variables where possible. NFCI. Fixes cppcheck warnings. llvm-svn: 360131	2019-05-07 10:50:11 +00:00
Simon Pilgrim	04dad8f66d	[X86] X86InstrInfo::findThreeSrcCommutedOpIndices - fix unread variable warning. scan-build was reporting that CommutableOpIdx1 never used its original initialized value - move it down to where its first used to make the real initialization more obvious (and matches the comment that's there). llvm-svn: 360028	2019-05-06 10:15:34 +00:00
Bjorn Pettersson	238c9d6308	[CodeGen] Add "const" to MachineInstr::mayAlias Summary: The basic idea here is to make it possible to use MachineInstr::mayAlias also when the MachineInstr is const (or the "Other" MachineInstr is const). The addition of const in MachineInstr::mayAlias then rippled down to the need for adding const in several other places, such as TargetTransformInfo::getMemOperandWithOffset. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, MatzeB, arsenm, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60856 llvm-svn: 358744	2019-04-19 09:08:38 +00:00
Craig Topper	6bf0802738	[X86] In CopyToFromAsymmetricReg, use VR128 instead of FR32 instructions for GR32<->XMM register copies. We have two versions of some instructions, VR128 versions and FR32 versions that are marked as CodeGenOnly. This change switches to using the VR128 versions for these copies. It's after register allocation so the class size no longer matters. This matches how GR64 works. llvm-svn: 358555	2019-04-17 06:09:11 +00:00
Craig Topper	80aa2290fb	[X86] Merge the different Jcc instructions for each condition code into single instructions that store the condition code as an operand. Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. Reviewers: spatel, lebedev.ri, courbet, gchatelet, RKSimon Reviewed By: RKSimon Subscribers: MatzeB, qcolombet, eraman, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60228 llvm-svn: 357802	2019-04-05 19:28:09 +00:00
Craig Topper	7323c2bf85	[X86] Merge the different SETcc instructions for each condition code into single instructions that store the condition code as an operand. Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between SETcc instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. Reviewers: andreadb, courbet, RKSimon, spatel, lebedev.ri Reviewed By: andreadb Subscribers: hiraditya, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60138 llvm-svn: 357801	2019-04-05 19:27:49 +00:00
Craig Topper	e0bfeb5f24	[X86] Merge the different CMOV instructions for each condition code into single instructions that store the condition code as an immediate. Summary: Reorder the condition code enum to match their encodings. Move it to MC layer so it can be used by the scheduler models. This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between CMOV instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. This does complicate the scheduler models a little since we can't assign the A and BE instructions to a separate class now. I plan to make similar changes for SETcc and Jcc. Reviewers: RKSimon, spatel, lebedev.ri, andreadb, courbet Reviewed By: RKSimon Subscribers: gchatelet, hiraditya, kristina, lebedev.ri, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60041 llvm-svn: 357800	2019-04-05 19:27:41 +00:00
Evandro Menezes	85bd3978ae	[IR] Refactor attribute methods in Function class (NFC) Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731	2019-04-04 22:40:06 +00:00
Craig Topper	9224c114a9	[X86] Mark the default case of the X86InstrInfo::convertToThreeAddress switch as unreachable. This function should only be called with instructions that are really convertible. And all convertible instructions need to be handled by the switch. So nothing should use the default. llvm-svn: 357529	2019-04-02 20:52:16 +00:00
Craig Topper	7c9afc35bc	[X86] Add post-isel pseudos for rotate by immediate using SHLD/SHRD Haswell CPUs have special support for SHLD/SHRD with the same register for both sources. Such an instruction will go to the rotate/shift unit on port 0 or 6. This gives it 1 cycle latency and 0.5 cycle reciprocal throughput. When the register is not the same, it becomes a 3 cycle operation on port 1. Sandybridge and Ivybridge always have 1 cyc latency and 0.5 cycle reciprocal throughput for any SHLD. When FastSHLDRotate feature flag is set, we try to use SHLD for rotate by immediate unless BMI2 is enabled. But MachineCopyPropagation can look through a copy and change one of the sources to be different. This will break the hardware optimization. This patch adds psuedo instruction to hide the second source input until after register allocation and MachineCopyPropagation. I'm not sure if this is the best way to do this or if there's some other way we can make this work. Fixes PR41055 Differential Revision: https://reviews.llvm.org/D59391 llvm-svn: 357096	2019-03-27 17:29:34 +00:00
Craig Topper	b4c49255aa	[X86] Make ADD*_DB post-RA pseudos and expand them in expandPostRAPseudo. These are used to help convert OR->LEA when needed to avoid avoid a copy. They aren't need after register allocation. Happens to remove an ugly goto from X86MCCodeEmitter.cpp llvm-svn: 356356	2019-03-18 05:48:18 +00:00
Craig Topper	4cf8cdc51d	[X86] Remove VCVTSI2SDZrrb_Int as it shouldn't exist. This would convert a signed 32-bit integer to double precision with rounding. But there's nothing to round. llvm-svn: 355795	2019-03-11 01:20:37 +00:00
Craig Topper	4a9dd7c39b	[X86] Enable 8-bit SHL to convert to LEA Differential Revision: https://reviews.llvm.org/D58870 llvm-svn: 355425	2019-03-05 18:37:41 +00:00

1 2 3 4 5 ...

1495 Commits