llvm-project

Commit Graph

Author	SHA1	Message	Date
Matthias Braun	d1aabb2813	livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC The block must not be nullptr for the addLiveIns()/addLiveOuts() function. llvm-svn: 268340	2016-05-03 00:24:32 +00:00
Matthias Braun	24f26e6d91	LivePhysRegs: Automatically determine presence of pristine regs. Remove the AddPristinesAndCSRs parameters from addLiveIns()/addLiveOuts(). We need to respect pristine registers after prologue epilogue insertion, Seeing that we got this wrong in at least two commits already, we should rather pay the small price to query MachineFrameInfo for it. There are three cases that did not set AddPristineAndCSRs to true even after register allocation: - ExecutionDepsFix: live-out registers are used as a hint that the register is used soon. This is not true for pristine registers so use the new addLiveOutsNoPristines() to maintain this behaviour. - SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like a bug, should do the right thing automatically now. - StackMapLivenessAnalysis: Not adding pristine registers looks like a bug to me. Added a FIXME comment but maintain the current behaviour as a change may need to get coordinated with GC runtimes. llvm-svn: 268336	2016-05-03 00:08:46 +00:00
Quentin Colombet	4e1d389ac5	[X86] Model FAULTING_LOAD_OP as a terminator and branch. This operation may branch to the handler block and we do not want it to happen anywhere within the basic block. Moreover, by marking it "terminator and branch" the machine verifier does not wrongly assume (because of AnalyzeBranch not knowing better) the branch is analyzable. Indeed, the target was seeing only the unconditional branch and not the faulting load op and thought it was a simple unconditional block. The machine verifier was complaining because of that and moreover, other optimizations could have done wrong transformation! In the process, simplify the representation of the handler block in the faulting load op. Now, we directly reference the handler block instead of using a label. This has the benefits of: 1. MC knows how to issue a label for a BB, so leave that to it. 2. Accessing the target BB from its label is painful, whereas it is direct from a MBB operand. Note: The 2 bytes offset in implicit-null-check.ll comes from the fact the unconditional jumps are not removed anymore, as the whole terminator sequence is not analyzable anymore. Will fix it in a subsequent commit. llvm-svn: 268327	2016-05-02 22:58:54 +00:00
Simon Pilgrim	52f8693263	[X86][SSE] Added placeholder for 128/256-bit wide shuffle combines Begun adding placeholder for future support for vperm2f128/vshuff64x2 style 128/256-bit wide shuffles llvm-svn: 268306	2016-05-02 21:12:48 +00:00
Matt Arsenault	bcdfee7030	AMDGPU: Custom lower v2i32 loads and stores This will allow us to split up 64-bit private accesses when necessary. llvm-svn: 268296	2016-05-02 20:13:51 +00:00
Tom Stellard	154c9cdd24	AMDGPU/SI: Use v_readfirstlane_b32 when restoring SGPRs spilled to scratch We were using v_readlane_b32 with the lane set to zero, but this won't work if thread 0 is not active. Differential Revision: http://reviews.llvm.org/D19745 llvm-svn: 268295	2016-05-02 20:11:44 +00:00
Matt Arsenault	2b957b5a6f	AMDGPU: Make i64 loads/stores promote to v2i32 Now that unaligned access expansion should not attempt to produce i64 accesses, we can remove the hack in PreprocessISelDAG where this is done. This allows splitting i64 private accesses while allowing the new add nodes indexing the vector components to be folded with the base pointer arithmetic. llvm-svn: 268293	2016-05-02 20:07:26 +00:00
Reid Kleckner	0549ab6033	Fix instance of -Winconsistent-missing-override in AMDGPU code llvm-svn: 268289	2016-05-02 19:45:10 +00:00
Tom Stellard	ce5e994887	AMDGPU/SI: Set the kill flag on temp VGPRs used to restore SGPRs from scratch Summary: When we restore an SGPR value from scratch, we first load it into a temporary VGPR and then use v_readlane_b32 to copy the value from the VGPR back into an SGPR. We weren't setting the kill flag on the VGPR in the v_readlane_b32 instruction, so the register scavenger wasn't able to re-use this temp value later. I wasn't able to create a lit test for this. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19744 llvm-svn: 268287	2016-05-02 19:37:56 +00:00
Tim Northover	c08db1840c	ARM: fix handling of SUB immediates in peephole opt. We were negating an immediate that was going to be used in a SUBri form unnecessarily. Since ADD/SUB are very similar we can do that, but we have to change the SUB to an ADD at the same time. This also applies to ADD, and allows us to handle a slightly larger range of immediates for those two operations. rdar://25992245 llvm-svn: 268276	2016-05-02 18:30:08 +00:00
Justin Holewinski	9a6ea2c256	[NVPTX] Fix sign/zero-extending ldg/ldu instruction selection Summary: We don't have sign-/zero-extending ldg/ldu instructions defined, so we need to emulate them with explicit CVTs. We were originally handling the i8 case, but not any other cases. Fixes PR26185 Reviewers: jingyue, jlebar Subscribers: jholewinski Differential Revision: http://reviews.llvm.org/D19615 llvm-svn: 268272	2016-05-02 18:12:02 +00:00
Tom Stellard	27233b727f	AMDGPU: Move R600 specific code out of AMDGPUISelLowering.cpp Reviewers: arsenm Subscribers: jvesely, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19736 llvm-svn: 268267	2016-05-02 18:05:17 +00:00
Tom Stellard	341e293d67	AMDGPU/SI: Fix bug in SIInstrInfo::insertWaitStates() uncovered by r268260 We can't use MI->getDebugLoc() when MI is an iterator that could be MBB.end(). llvm-svn: 268265	2016-05-02 18:02:24 +00:00
Tom Stellard	1f520e5c98	AMDGPU/SI: Use the hazard recognizer to break SMEM soft clauses Summary: Add support for detecting hazards in SMEM soft clauses, so that we only break the clauses when necessary, either by adding s_nop or re-ordering other alu instructions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18870 llvm-svn: 268260	2016-05-02 17:39:06 +00:00
Nicolai Haehnle	119d3d80cb	AMDGPU: llvm.SI.fs.constant is a source of divergence Summary: This intrinsic is used to get flat-shaded fragment shader inputs. Those are uniform across a primitive, but a fragment shader wave may process pixels from multiple primitives (as indicated by the prim_mask), and so that's where divergence can arise. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19747 llvm-svn: 268259	2016-05-02 17:37:01 +00:00
Derek Schuff	31680dd832	[WebAssembly] Rename memory_size intrinsic to current_memory This follows the recent renaming in the wasm spec. llvm-svn: 268255	2016-05-02 17:25:22 +00:00
Tom Stellard	a27007eb4f	AMDGPU/SI: Use hazard recognizer to detect DPP hazards Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18603 llvm-svn: 268247	2016-05-02 16:23:09 +00:00
Simon Pilgrim	e5e04baf95	[X86][SSE] Dropped X86ISD::FGETSIGNx86 and use MOVMSK instead for FGETSIGN lowering movmsk.ll tests are unchanged. llvm-svn: 268237	2016-05-02 14:58:22 +00:00
Chad Rosier	9d1a556125	Cleanup comments. NFC. llvm-svn: 268236	2016-05-02 14:56:21 +00:00
Chad Rosier	7b6001ee0f	Cleanup comments. NFC. llvm-svn: 268235	2016-05-02 14:50:30 +00:00
Aaron Ballman	5c190d056d	Silence unused variable warnings; NFC. llvm-svn: 268234	2016-05-02 14:48:03 +00:00
David L Kreitzer	0fe4632bd7	Enable the X86 call frame optimization for the 64-bit targets that allow it. Fixes PR27241. Differential Revision: http://reviews.llvm.org/D19688 llvm-svn: 268227	2016-05-02 13:45:25 +00:00
Jonas Paulsson	f0344826b9	[SystemZ] Fix in restoreCalleeSavedRegisters() Only add operands for GRs to the LMG. Reviewed by Ulrich Weigand. llvm-svn: 268216	2016-05-02 09:37:44 +00:00
Jonas Paulsson	9028acf0b3	[SystemZ] Mark CC defs as dead whenever possible. Marking implicit CC defs as dead everywhere except when CC is actually defined and used explicitly, is important since the post-ra scheduler will otherwise insert edges between instructions unnecessarily. Also temporarily disable LA(Y)-> AGSI optimization in foldMemoryOperandImpl(), since this introduces a def of the CC reg, which is illegal unless it is known to be dead. Reviewed by Ulrich Weigand. llvm-svn: 268215	2016-05-02 09:37:40 +00:00
Craig Topper	7b5925a5b6	[X86] Fix a bug in LOCK arithmetic operation pattern matching where the wrong immediate predicate check was being used for 64-bit instructions with 8-bit immediates. This didn't cause a bug because the order of the patterns ensured that the 64-bit instructions with 32-bit immediates were selected first. llvm-svn: 268212	2016-05-02 05:44:21 +00:00
Craig Topper	b6da65403a	[AVX512] VPACKUSWB/VPACKSSWB should not be encoded with EVEX.W=1. While there fix the execution domain for VPACKSSDW/VPACKUSDW. llvm-svn: 268200	2016-05-01 17:38:32 +00:00
Igor Breger	131008fbcb	Change AVX512 broadcastsd/ss patterns interaction with spilling. New implementation takes a scalar register and generates a vector without COPY_TO_REGCLASS (turn it into a VR128 register). The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth. Differential Revision: http://reviews.llvm.org/D19579 llvm-svn: 268190	2016-05-01 08:40:00 +00:00
Craig Topper	e430de8be6	[AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when VLX and BWI are supported. llvm-svn: 268189	2016-05-01 06:52:19 +00:00
Craig Topper	5acb5a1caf	[AVX512] Add HasVLX to the 128/256-bit versions of VPACKSSDW/USDW/SSWB/USWB and VPMADDUBSW/VPMADDWD. llvm-svn: 268188	2016-05-01 06:24:57 +00:00
Craig Topper	db290664f6	[AVX512] Make sure 128/256-bit DQI versions of VAND/VANDN/VOR/VXOR are also marked as requiring VLX. llvm-svn: 268186	2016-05-01 05:57:06 +00:00
Craig Topper	f77ca947ce	[X86] Add an AddedComplexity to another pattern to put it near similar in the output file. llvm-svn: 268184	2016-05-01 05:22:15 +00:00
Craig Topper	742977ede8	[X86] Remove a seemingly unused pattern. The same pattern appears elsewhere with an AddedComplexity that made this unreachable. llvm-svn: 268183	2016-05-01 05:22:13 +00:00
Craig Topper	eb9a87918b	[X86] Add AddedComplexity to keep some similar patterns near each other in the output file. llvm-svn: 268181	2016-05-01 04:59:49 +00:00
Craig Topper	7ed84d826e	[X86] Remove some redundant selection patterns. llvm-svn: 268180	2016-05-01 04:59:46 +00:00
Craig Topper	c9b1923358	[AVX512] Replace vector_extract with extractelt in some patterns. They mean the same thing but vector_extract is deprecated. NFC llvm-svn: 268179	2016-05-01 04:59:44 +00:00
Craig Topper	99f6b620cc	[AVX512] Add hasSideEffects/mayLoad/mayStore flags to some instructions. llvm-svn: 268174	2016-05-01 01:03:56 +00:00
Craig Topper	e012ede137	[X86] Reduce memory usage of MemOp2RegOp and RegOp2MemOp folding maps. llvm-svn: 268164	2016-04-30 17:59:49 +00:00
Rafael Espindola	92dd7b82be	Add missing override. llvm-svn: 268163	2016-04-30 15:18:21 +00:00
Tom Stellard	c51e4468b7	AMDGPU/SI: Remove wait state handling for SMRD in SIInsertWaits This was supposed to be part of r268143. llvm-svn: 268154	2016-04-30 04:04:48 +00:00
Hal Finkel	17e9754dd4	[PowerPC/QPX] Fix the load/splat peephole with overlapping reads If, in between the splat and the load (which does an implicit splat), there is a read of the splat register, then that register must have another earlier definition. In that case, we can't replace the load's destination register with the splat's destination register. Unfortunately, I don't have a small or non-fragile test case. llvm-svn: 268152	2016-04-30 01:59:28 +00:00
Tom Stellard	cb6ba62d6f	AMDGPU/SI: Enable the post-ra scheduler Summary: This includes a hazard recognizer implementation to replace some of the hazard handling we had during frame index elimination. Reviewers: arsenm Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18602 llvm-svn: 268143	2016-04-30 00:23:06 +00:00
Matt Arsenault	701c21ea10	AMDGPU: Fix crash with unreachable terminators. If a block has no successors because it ends in unreachable, this was accessing an invalid iterator. Also stop counting instructions that don't emit any real instructions. llvm-svn: 268119	2016-04-29 21:52:13 +00:00
Sriraman Tallam	7da9b445ea	Differential Revision: http://reviews.llvm.org/D19733 llvm-svn: 268106	2016-04-29 21:19:16 +00:00
Matt Arsenault	dc4ebad6d4	AMDGPU: Add kernarg.segment.ptr intrinsic llvm-svn: 268105	2016-04-29 21:16:52 +00:00
Matt Arsenault	cf2744f1c8	AMDGPU/SI: Move post regalloc run of SIShrinkInstructions Move to addPreEmitPass. This is so it runs after post-RA scheduling so we can merge s_nops emitted by the scheduler and hazard recognizer. llvm-svn: 268095	2016-04-29 20:23:42 +00:00
Artem Tamazov	38e496b175	Fixed/Recommitted r267733 "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD." Previously reverted by r267752. r267733 review: Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 268066	2016-04-29 17:04:50 +00:00
Guozhi Wei	fa3e04298b	[PPC] Enable shuffling of VSX vectors This patch fixes PR27078 by enabling shuffling of vectors if VSX is available. llvm-svn: 268064	2016-04-29 17:00:54 +00:00
Daniel Sanders	7225cd52e7	[mips][ias] Move createCpRestoreMemOp to MipsTargetStreamer. NFC. Summary: This removes the temporary call to isIntegratedAssemblerRequired() which was added recently. Its effect is now achieved directly in the MipsTargetStreamer hierarchy. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D19715 llvm-svn: 268058	2016-04-29 16:16:49 +00:00
Krzysztof Parzyszek	173fc57b54	Fix NDEBUG build: variables used only in debug code causing compile error llvm-svn: 268057	2016-04-29 16:14:00 +00:00
Simon Dardis	d8bceb9d3a	[mips][FastISel] A store is not a load. Correct trivial error. One of the failing tests from PR/27458. Reviewers: dsanders, vkalintiris, mcrosier Differential Review: http://reviews.llvm.org/D19726 llvm-svn: 268053	2016-04-29 16:07:47 +00:00


37221 Commits