llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	58e03a09db	[CostModel][X86] Recursive call for cost of imul for packed v16i16 constant shift left. Don't just assume cost = 1. llvm-svn: 330834	2018-04-25 15:22:03 +00:00
Amara Emerson	1f5d994119	[AArch64][GlobalISel] Implement selection for the llvm.trap intrinsic. rdar://38674040 llvm-svn: 330831	2018-04-25 14:43:59 +00:00
Shiva Chen	d58bd8dc4a	[RISCV] Expand function call to "call" pseudoinstruction To do this: 1. Change GlobalAddress SDNode to TargetGlobalAddress to keep the legalizer from splitting the symbol. 2. Change ExternalSymbol SDNode to TargetExternalSymbol to keep the legalizer from splitting the symbol. 3. Let PseudoCALL match direct calls with target operand TargetGlobalAddress or TargetExternalSymbol. Differential Revision: https://reviews.llvm.org/D44885 llvm-svn: 330827	2018-04-25 14:19:12 +00:00
Shiva Chen	98f9389f65	[RISCV] Support "call" pseudoinstruction in the MC layer To do this: 1. Add PseudoCALLIndirct to match indirect function calls. 2. Add PseudoCALL to support parsing and printing the pseudo `call` in assembly. 3. Expand PseudoCALL to the following form with the R_RISCV_CALL relocation type while encoding: auipc ra, func jalr ra, ra, 0 If we expanded PseudoCALL before emitting assembly, we would see the auipc and jalr pair when compiling with -S. It is hard for the assembly parser to parse this pair, identify that its semantics are a function call, and then insert the R_RISCV_CALL relocation type. We could insert the R_RISCV_PCREL_HI20 and R_RISCV_PCREL_LO12_I relocation types instead of R_RISCV_CALL, but due to the RISCV relocation design, an auipc and jalr pair can only be relaxed to jal with the R_RISCV_CALL + R_RISCV_RELAX relocation types. We expand PseudoCALL as late as encoding (RISCVMCCodeEmitter) instead of before emitting assembly (RISCVAsmPrinter) because we want to preserve the call pseudoinstruction in assembly code. It is more readable, and the assembly parser can identify the call and insert the R_RISCV_CALL relocation type. Differential Revision: https://reviews.llvm.org/D45859 llvm-svn: 330826	2018-04-25 14:18:55 +00:00
Simon Dardis	0f2f5976d0	[mips] Teach the delay slot filler to transform 'jal' for microMIPS ISel is currently picking 'JAL' over 'JAL_MM' for calling a function when targeting microMIPS. A later patch will correct this behaviour. This patch extends the mechanism for transforming instructions into their short delay to recognise 'JAL_MM' for transforming into 'JALS_MM'. llvm-svn: 330825	2018-04-25 14:12:57 +00:00
Simon Pilgrim	dbd1ae7ddd	[X86] Split WriteFMA into XMM, Scalar and YMM/ZMM scheduler classes This removes all the FMA InstRW overrides. If we ever get PR36924, then we can remove many of these declarations from models. llvm-svn: 330820	2018-04-25 13:07:58 +00:00
Alexander Timofeev	b934728cd2	[AMDGPU] Revert b0efc4fd6 (https://reviews.llvm.org/D40556 ) llvm-svn: 330818	2018-04-25 12:32:46 +00:00
Simon Pilgrim	6a82e96ed9	[X86][SKX] Setup WriteFAdd and remove unnecessary InstRW scheduler overrides. llvm-svn: 330813	2018-04-25 10:51:19 +00:00
Simon Pilgrim	98e21c5ade	[X86][SNB] Remove unnecessary WriteFBlendLd InstRW scheduler overrides. llvm-svn: 330812	2018-04-25 10:50:39 +00:00
Simon Dardis	eac9301cdb	[mips] Fix the definition of sync, synci Also, fix the disassembly of synci for microMIPS. Reviewers: abeserminji, smaksimovic, atanasyan Differential Revision: https://reviews.llvm.org/D45870 llvm-svn: 330810	2018-04-25 10:19:22 +00:00
Sander de Smalen	eb896b148b	[AArch64][SVE] Asm: Add AsmOperand classes for SVE gather/scatter addressing modes. This patch adds parsing support for 'vector + shift/extend' and corresponding asm operand classes, needed for implementing SVE's gather/scatter addressing modes. The added combinations of vector (ZPR) and Shift/Extend are: Unscaled: ZPR64ExtLSL8: signed 64-bit offsets (z0.d) ZPR32ExtUXTW8: unsigned 32-bit offsets (z0.s, uxtw) ZPR32ExtSXTW8: signed 32-bit offsets (z0.s, sxtw) Unpacked and unscaled: ZPR64ExtUXTW8: unsigned 32-bit offsets (z0.d, uxtw) ZPR64ExtSXTW8: signed 32-bit offsets (z0.d, sxtw) Unpacked and scaled: ZPR64ExtUXTW<scale>: unsigned 32-bit offsets (z0.d, uxtw #<shift>) ZPR64ExtSXTW<scale>: signed 32-bit offsets (z0.d, sxtw #<shift>) Scaled: ZPR32ExtUXTW<scale>: unsigned 32-bit offsets (z0.s, uxtw #<shift>) ZPR32ExtSXTW<scale>: signed 32-bit offsets (z0.s, sxtw #<shift>) ZPR64ExtLSL<scale>: unsigned 64-bit offsets (z0.d, lsl #<shift>) ZPR64ExtLSL<scale>: signed 64-bit offsets (z0.d, lsl #<shift>) Patch [1/3] in series to add support for SVE's gather load instructions that use scalar+vector addressing modes: - Patch [1/3]: https://reviews.llvm.org/D45951 - Patch [2/3]: https://reviews.llvm.org/D46023 - Patch [3/3]: https://reviews.llvm.org/D45958 Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, t.p.northover, echristo, evandro, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D45951 llvm-svn: 330805	2018-04-25 09:26:47 +00:00
Jessica Paquette	4f56428de1	[MachineOutliner] Check for explicit uses of LR/W30 in MI operands Before, the outliner would grab ADRPs that used LR/W30. This patch fixes that by checking for explicit uses of those registers before the special-casing for ADRPs. This also adds a test that ensures that those sorts of ADRPs won't be outlined. llvm-svn: 330783	2018-04-24 22:38:15 +00:00
Warren Ristow	b960d2cb40	[X86] Account for partial stack slot spills (PR30821) Previously, _any_ store or load instruction was considered to be operating on a spill if it had a frameindex as an operand, and thus was fair game for optimisations such as "StackSlotColoring". This usually works, except on architectures where spills can be partially restored, for example on X86 where a spilt vector can have a single component loaded (zeroing the rest of the target register). This can be mis-interpreted and the zero extension unsoundly eliminated, see pr30821. To avoid this, this commit optionally provides the caller to isLoadFromStackSlot and isStoreToStackSlot with the number of bytes spilt/loaded by the given instruction. Optimisations can then determine that a full spill followed by a partial load (or vice versa), for example, cannot necessarily be commuted. Patch by Jeremy Morse! Differential Revision: https://reviews.llvm.org/D44782 llvm-svn: 330778	2018-04-24 22:01:50 +00:00
Tom Stellard	a2be8f4c35	AMDGPU: Remove deprecated llvm.AMDGPU.kilp intrinsic Summary: This is no longer used by mesa since its 18.0.0 release. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D45988 llvm-svn: 330775	2018-04-24 21:37:57 +00:00
Tom Stellard	257882ff72	AMDGPU/GlobalISel: Fall-back to SelectionDAG for non-void functions Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45843 llvm-svn: 330774	2018-04-24 21:29:36 +00:00
Tom Stellard	c7709e1c29	AMDGPU/GlobalISel: Add support for amdgpu_ps calling convention Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45837 llvm-svn: 330767	2018-04-24 20:51:28 +00:00
Simon Pilgrim	c4d25a2922	[X86][SKX] Setup WriteFMul and remove unnecessary InstRW scheduler overrides. llvm-svn: 330760	2018-04-24 19:22:01 +00:00
Simon Pilgrim	27bc83e228	[X86] Split off PHMINPOSUW to their own schedule class This also fixes Jaguar's schedule which was treating it as the WriteVecIMul default. llvm-svn: 330756	2018-04-24 18:49:25 +00:00
Stanislav Mekhanoshin	a4bfb3c446	[AMDGPU] Truncate packed inline constant If a packed inline constant is sign extended it must be truncated after the shift. I.e. a constant (0xH0000, 0xHBC00), will be represented as 0xFFFFFFFFBC000000 in the IR because the immediate is sign extended to 64 bit. After the value shifted right by 16 to use it in a low part with op_sel_hi it becomes 0xFFFFFFFFBC00 and does not qualify as inline constant any longer. Fixed the error and added verification code. Without the fix and with the verification bug is causing pk_max_f16_literal.ll to fail. Differential Revision: https://reviews.llvm.org/D45987 llvm-svn: 330752	2018-04-24 18:17:55 +00:00
Simon Pilgrim	81cb67ad82	[XOP] v4i32 IFMA 'VPMACS' instructions should use the WritePMULLD schedule class llvm-svn: 330751	2018-04-24 18:13:57 +00:00
Simon Pilgrim	cf0199a289	[AVX512] VPERMQ/VPERMPD/VPERMIL single op shuffles are not variable shuffles These variants all take an immediate shuffle mask value and should be scheduled as such. llvm-svn: 330747	2018-04-24 17:59:54 +00:00
Simon Dardis	d2ac0faf3b	Reland "[mips] Guard traps for microMIPS correctly" This is part of fixing the instruction predicates for MIPS. Reviewers: atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D44212 This patch relands r327409, hopefully without the problematic part of the tests that cause FileCheck to assert on the windows expensive checks bot. llvm-svn: 330741	2018-04-24 17:11:37 +00:00
Simon Pilgrim	f0945aa0e0	[X86][F16C] Add WriteCvtF2FSt scheduling class Fixes the classification of VCVTPS2PHmr/VCVTPS2PHYmr which were tagged as WriteCvtF2FLd_WriteRMW (PR36887) llvm-svn: 330737	2018-04-24 16:43:07 +00:00
Simon Pilgrim	828ef9e013	[X86][BtVer2] Fix VCVTPS2PHmr/VCVTPS2PHYmr latencies These are stores, not loads, so don't need to account for load latency. llvm-svn: 330735	2018-04-24 16:26:51 +00:00
Simon Atanasyan	9df3be3ccb	[mips] Show an error if register number is out of range Current code does not check that a register number is in the 0-31 range. Sometimes the parser checks that later for some kinds of instructions, but that leads to unclear / incorrect error messages like that: % cat test.s .text lb $4, 8($32) % llvm-mc test.s -triple=mips64-unknown-linux test.s:2:10: error: expected memory with 16-bit signed offset lb $4, 8($32) ^ Sometimes the parser just crashes: % cat test.s .text lw $4, 8($32) % llvm-mc test.s -triple=mips64-unknown-linux This patch resolves the problem by checking that register number after '$' sign is in the 0-31 range. If the number is out of the range the parser shows the `invalid register number` error, but treats invalid register number as a normal one to continue parsing and catch other possible errors. Differential Revision: https://reviews.llvm.org/D45919 llvm-svn: 330732	2018-04-24 16:14:00 +00:00
Mark Searles	70901b9047	[AMDGPU][Waitcnt] NFC. Cleanup some code/naming consistency: - s/SWaitcnt/Waitcnt s/WaitCnt/Waitcnt llvm-svn: 330730	2018-04-24 15:59:59 +00:00
Simon Pilgrim	16299273d0	[X86] Remove unnecessary FMA reg-mem InstRW scheduler overrides. llvm-svn: 330720	2018-04-24 14:47:11 +00:00
Ulrich Weigand	497c70fff1	[SystemZ] Use preferred 16-byte function alignment While not necessary for correctness, it is preferable for performance reasons on all architectures we currently support to align functions to 16-byte boundaries by default. llvm-svn: 330718	2018-04-24 14:03:21 +00:00
Simon Pilgrim	f7d2a93d5f	[X86] Add vector element insertion/extraction scheduler classes Split off pinsr/pextr and extractps instructions. (Mostly) fixes PR36887. Note: It might be worth adding a WriteFInsertLd class as well in the future. Differential Revision: https://reviews.llvm.org/D45929 llvm-svn: 330714	2018-04-24 13:21:41 +00:00
Alexander Ivchenko	5717fbaf4c	[X86] Replace action Promote with Expand for operation ISD::SINT_TO_FP Summary: If attribute "use-soft-float"="true" is set then X86ISelLowering.cpp sets the 'Promote' action for the ISD::SINT_TO_FP operation on type i32. But the 'Promote' action is not appropriate in this case, since the lib function __floatsidf is available for casting from signed int to float type. Thus the Expand action is more suitable here. The Expand action should be set for ISD::UINT_TO_FP for soft float as well. If the function attribute "use-soft-float"="true" is set then infinite looping can happen in DAG combining: visitSINT_TO_FP() replaces the SINT_TO_FP node with a UINT_TO_FP node and combineUIntToFP() does the reverse, in a cycle. The fix prevents it. Patch by vrybalov Differential Revision: https://reviews.llvm.org/D45572 llvm-svn: 330711	2018-04-24 12:57:51 +00:00
Petar Jovanovic	e2bfcd6394	Correct dwarf unwind information in function epilogue This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts. The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific. The second part is platform independent and ensures that: * CFI instructions do not affect code generation (they are not counted as instructions when tail duplicating or tail merging) * Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary. Added CFIInstrInserter pass: * analyzes each basic block to determine cfa offset and register are valid at its entry and exit * verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors * inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue. Patch by Violeta Vukobrat. Differential Revision: https://reviews.llvm.org/D42848 llvm-svn: 330706	2018-04-24 10:32:08 +00:00
Simon Dardis	fce722e6f8	[mips] Correct the patterns for bswap Guard the MIPS64 variant correctly for i64, mark the MIPS32 version as not in microMIPS and provide the microMIPS version. Additionally, remove a related stale XFAIL'd test as bswap has its own test case providing coverage. Reviewers: smaksimovic, abeserminji, atanasyan Differential Revision: https://reviews.llvm.org/D45816 llvm-svn: 330705	2018-04-24 10:19:29 +00:00
Sander de Smalen	eb1053f9d3	[AArch64][SVE] Asm: Support for contiguous, first-faulting LDFF1 (scalar+scalar) load instructions. Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, t.p.northover, echristo, evandro, javed.absar Reviewed By: rengolin Subscribers: tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45946 llvm-svn: 330697	2018-04-24 08:59:08 +00:00
Craig Topper	19b85103a3	[X86] Add a BSWAP16 instruction using the 32-bit encoding plus a 0x66 prefix. This encoding is recognized by the CPU, but the behavior is undefined. This makes the disassembler handle it correctly so we don't print bswapl with a 16-bit register. llvm-svn: 330682	2018-04-24 04:28:02 +00:00
Eric Christopher	b9733d0f7c	Remove unused function HexagonEarlyIfConversion::replacePhiEdges. NFC. llvm-svn: 330678	2018-04-24 02:10:59 +00:00
Simon Pilgrim	e5e4bf02d6	[X86] Remove unnecessary vector memory folded InstRW overrides. We have test coverage for these with resources-sse/avx llvm-svn: 330662	2018-04-23 22:45:04 +00:00
Simon Pilgrim	eb6090941c	[X86] Remove unnecessary BMI2 InstRW overrides. We have test coverage for these with resources-bmi2.s llvm-svn: 330659	2018-04-23 22:19:55 +00:00
Simon Pilgrim	ed09ebb48d	[X86] Remove unnecessary WriteLEA InstRW overrides. llvm-svn: 330648	2018-04-23 21:04:23 +00:00
Gabor Buella	1a2ce572bf	[X86] Revert r330638 - accidental commit llvm-svn: 330640	2018-04-23 20:05:51 +00:00
Gabor Buella	213a7cda1f	[X86] movdiri and movdir64b instructions Reviewers: craig.topper llvm-svn: 330638	2018-04-23 20:00:59 +00:00
Peter Collingbourne	5ab4a4793e	Reland r329956, "AArch64: Introduce a DAG combine for folding offsets into addresses.", with a fix for the bot failure. This reland includes a check to prevent the DAG combiner from folding an offset that is smaller than the existing one. This can cause oscillations between two possible DAGs, which was the cause of the hang and later assertion failure observed on the lnt-ctmark-aarch64-O3-flto bot. http://green.lab.llvm.org/green/job/lnt-ctmark-aarch64-O3-flto/2024/ Original commit message: > This is a code size win in code that takes offseted addresses > frequently, such as C++ constructors that typically need to compute > an offseted address of a vtable. This reduces the size of Chromium > for Android's .text section by 108KB. Differential Revision: https://reviews.llvm.org/D45199 llvm-svn: 330630	2018-04-23 19:09:34 +00:00
Matt Arsenault	b21f9592be	AMDGPU: Move a flawed assert when spilling SGPRs It's possible to validly spill the frame offset register in a call sequence to a VGPR. There are definitely issues with SGPR spilling to memory, so move the assert later. llvm-svn: 330612	2018-04-23 16:13:30 +00:00
Simon Pilgrim	8cd01aaa0f	[X86] Replace x87 instregex with instrs if they only match one instruction llvm-svn: 330611	2018-04-23 16:10:50 +00:00
Matt Arsenault	adc59d7076	AMDGPU: Assign enum name to stack ID Also assert that it is correct for SGPRs. There is currently a bug where stack slot coloring replaces SGPR spill FIs with one with the default ID, which results in a more confusing assert later about a dead object. llvm-svn: 330607	2018-04-23 15:51:26 +00:00
Simon Pilgrim	455d0b2cfe	[X86] Remove instregex matching from CLAC/STAC. Note - noticed this as the STAC case as it was unintentionally matching against STACK pseudo instructions. llvm-svn: 330588	2018-04-23 13:24:17 +00:00
Nico Weber	77c5471d9f	List cpp file only once (was added in 147117 and 147117 as build fix each). llvm-svn: 330587	2018-04-23 13:11:51 +00:00
Nicolai Haehnle	cbebba4917	AMDGPU: Fix SDWA peephole for V_AND_B32 Summary: Found by inspection. We care about the operand that doesn't contain the immediate. I believe this is currently not hit because we fold 0xff / 0xffff immediates only later. Change-Id: Ic3cf8538bc7da5eff3200d96eccf9d339e6345a7 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45886 llvm-svn: 330586	2018-04-23 13:06:03 +00:00
Nicolai Haehnle	5a995664f0	AMDGPU: Fix a corner case crash in SIOptimizeExecMasking Summary: See the new test case; this is really unlikely to happen with real code, but I ran into this while attempting to bugpoint-reduce a different issue. Change-Id: I9ade1dc1aa8fd9c4d9fc83661d7b80e310b5c4a6 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45885 llvm-svn: 330585	2018-04-23 13:05:50 +00:00
Nico Weber	5d53aed419	Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txt llvm-svn: 330584	2018-04-23 12:49:34 +00:00
Sander de Smalen	7893f722b2	[AArch64][SVE] Asm: Support for contiguous, non-faulting LDNF1 (scalar+imm) load instructions Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: rengolin Subscribers: tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45684 llvm-svn: 330583	2018-04-23 12:43:19 +00:00
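
The arithmetic described in commit a4bfb3c446 ([AMDGPU] Truncate packed inline constant) can be replayed outside the compiler. The following is only a sketch in Python of the numbers quoted in that commit message, not the code from the patch; the helper names `sign_extend` and `half_bits_to_float` are made up for this illustration, and the shift is assumed to be a logical right shift on the 64-bit value, which is what reproduces the quoted result.

```python
import struct


def sign_extend(value, from_bits, to_bits):
    """Sign-extend a from_bits-wide value to to_bits (hypothetical helper).

    The result is returned as an unsigned to_bits-wide integer.
    """
    sign_bit = 1 << (from_bits - 1)
    signed = (value & (sign_bit - 1)) - (value & sign_bit)
    return signed & ((1 << to_bits) - 1)


def half_bits_to_float(bits):
    """Decode a 16-bit IEEE half-precision bit pattern (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


# The packed constant from the commit message: (0xH0000, 0xHBC00).
lo_half, hi_half = 0x0000, 0xBC00
packed = (hi_half << 16) | lo_half      # 32-bit packed value

# The immediate is sign-extended to 64 bits.
imm64 = sign_extend(packed, 32, 64)

# Shifting right by 16 moves the high half into the low position,
# but the sign-extension bits come along with it.
shifted = imm64 >> 16

# Truncating to 16 bits recovers the original half-precision pattern.
truncated = shifted & 0xFFFF

print(hex(imm64), hex(shifted), hex(truncated), half_bits_to_float(truncated))
```

Without the final mask the shifted value still carries the sign-extension bits above bit 15, which matches the commit's observation that it no longer qualifies as an inline constant; with the mask the value is a plain 16-bit half-precision pattern again.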

47199 Commits