llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	1c1335a10d	[X86][BMI1] Fix BLSI/BLSMSK/BLSR BMI1 scheduling on btver2 These have the same behaviour as tzcnt on btver2 - confirmed with AMD 16h SOG, Agner and instlatx64. llvm-svn: 342235	2018-09-14 13:31:14 +00:00
Simon Pilgrim	6a47cdbdec	[X86][BMI1] Add scheduler class for BLSI/BLSMSK/BLSR BMI1 instructions llvm-svn: 342234	2018-09-14 13:09:56 +00:00
David Stuttard	20de3e99b5	[AMDGPU] Ensure trig range reduction only used for subtargets that require it Summary: GFX9 and above support sin/cos instructions with a greater range and thus don't require a fract instruction prior to invocation. Added a subtarget feature to reflect this and added code to take advantage of expanded range on GFX9+ Also updated the tests to check correct behaviour Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D51933 Change-Id: I1c1f1d3726a5ae32116646ca5cfa1ab4ef69e5b0 llvm-svn: 342222	2018-09-14 10:27:19 +00:00
Sam Parker	7b84fd7847	[ARM] bottom-top mul support in ARMParallelDSP On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 342210	2018-09-14 08:09:09 +00:00
Jonas Paulsson	77df2f2f38	[SystemZ] Adjust cost functions for subtargets that use LI + LOC instead of IPM After recent improvements which make better use of LOC instead of IPM, the TTI cost functions also need to be updated to reflect this. This involves sext, zext and xor of i1. The tests were updated so that for z13 the new costs are expected, while the old costs are still checked for on zEC12. Review: Ulrich Weigand https://reviews.llvm.org/D51339 llvm-svn: 342207	2018-09-14 06:46:55 +00:00
Tim Renouf	c8af6a46fa	[AMDGPU] Removed unused method Summary: I accidentally left this behind in D50306, and it causes a build warning when I build with gcc7. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52022 Change-Id: I30f7a47047e9d9d841f652da66d2fea19e74842c llvm-svn: 342189	2018-09-13 21:56:25 +00:00
Nirav Dave	59ad1c8457	[X86] Fix register resizings for inline assembly register operands. When replacing a named register input to the appropriately sized sub/super-register. In the case of a 64-bit value being assigned to a register in 32-bit mode, match GCC's assignment. Reviewers: eli.friedman, craig.topper Subscribers: nickdesaulniers, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D51502 llvm-svn: 342175	2018-09-13 20:33:56 +00:00
Nirav Dave	2060a16dfd	[X86] Cleanup pair returns. NFCI. llvm-svn: 342174	2018-09-13 20:33:27 +00:00
Ana Pazos	065b088759	[RISCV][MC] Reject bare symbols for the simm6 and simm6nonzero operand types Summary: Fixed assertions due to invalid fixup when encoding compressed instructions (c.addi, c.addiw, c.li, c.andi) with bare symbols with/without modifiers. This matches GAS behavior as well. This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer for the RISC-V assembly language. Reviewers: asb Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb Differential Revision: https://reviews.llvm.org/D52005 llvm-svn: 342160	2018-09-13 18:37:23 +00:00
Ana Pazos	b0799dda77	[RISCV] Fix decoding of invalid instruction with C extension enabled. Summary: The illegal instruction 0x00 0x00 is being wrongly decoded as c.addi4spn with 0 immediate. The invalid instruction 0x01 0x61 is being wrongly decoded as c.addi16sp with 0 immediate. This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer for the RISC-V assembly language. Reviewers: asb Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb Differential Revision: https://reviews.llvm.org/D51815 llvm-svn: 342159	2018-09-13 18:21:19 +00:00
Sam Clegg	79c054f6b8	[WebAssembly] Fix signature of `main` in FixFunctionBitcasts Also, add a check to ensure that when main has the expected signature we do not create a wrapper. Differential Revision: https://reviews.llvm.org/D51562 llvm-svn: 342157	2018-09-13 17:13:10 +00:00
Sam Parker	aaec3c6260	[ARM] Allow truncs as sources in ARM CGP We previously only allowed truncs as sinks, but now allow them as sources too. We do this by checking that the result type is the narrow type that we're trying to optimise for. Differential Revision: https://reviews.llvm.org/D51978 llvm-svn: 342141	2018-09-13 15:14:12 +00:00
Sam Parker	96f77f142b	[ARM] Fix FixConst for ARMCodeGenPrepare Part of FixConsts wrongly assumes either a 8- or 16-bit constant which can result in the wrong constants being generated during promotion. Differential Revision: https://reviews.llvm.org/D52032 llvm-svn: 342140	2018-09-13 14:48:10 +00:00
Matt Arsenault	ff987ac6ea	AMDGPU: Fix not preserving alignment in call setups If an argument was passed on the stack, this was using the default alignment. I'm not sure there's an observable change from this. This was observable due to bugs in expansion of unaligned loads and stores, but since that is fixed I don't think this matters much. llvm-svn: 342133	2018-09-13 12:14:31 +00:00
Tim Northover	c15d47bb01	ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4. The Technical Reference Manuals for these two CPUs state that branching to an unaligned 32-bit instruction incurs an extra pipeline reload penalty. That's bad. This also enables the optimization at -Os since it costs on average one byte per loop in return for 1 cycle per iteration, which is pretty good going. llvm-svn: 342127	2018-09-13 10:28:05 +00:00
Alexander Timofeev	4d302f6911	[AMDGPU] Load divergence predicate refactoring Differential revision: https://reviews.llvm.org/D51931 Reviewers: rampitec llvm-svn: 342120	2018-09-13 09:06:56 +00:00
Simon Atanasyan	c49da2e4ed	[mips] Enable the mnemonic spell corrector This implements suggesting alternative mnemonics when an invalid one is specified. For example `addru $9, $6, 17767` leads to the following error message: error: unknown instruction, did you mean: add, addiu, addu, maddu? Differential revision: https://reviews.llvm.org/D40646 llvm-svn: 342119	2018-09-13 08:38:03 +00:00
Alexander Timofeev	2fb44808b1	[AMDGPU] Preliminary patch for divergence driven instruction selection. Load offset inlining pattern changed. Differential revision: https://reviews.llvm.org/D51975 Reviewers: rampitec llvm-svn: 342115	2018-09-13 06:34:56 +00:00
Craig Topper	f107123a88	[X86] Type legalize v2i32 div/rem by scalarizing rather than promoting Summary: Previously we type legalized v2i32 div/rem by promoting to v2i64. But we don't support div/rem of vectors so op legalization would then scalarize it using i64 scalar ops since it doesn't know about the original promotion. 64-bit scalar divides on Intel hardware are known to be slow and in 32-bit mode they require a libcall. This patch switches type legalization to do the scalarizing itself using i32. It looks like the division by power of 2 optimization is still kicking in and leaving the code as a vector. The division by other constant optimization doesn't kick in pre type legalization since it ignores illegal types. And previously, after type legalization we scalarized the v2i64 since we don't have v2i64 MULHS/MULHU support. Another option might be to widen v2i32 to v4i32 so we could do division by constant optimizations, but we'd have to be careful to only do that for constant divisors or we risk scalarizing to 4 scalar divides. Reviewers: RKSimon, spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51325 llvm-svn: 342114	2018-09-13 06:13:37 +00:00
Saleem Abdulrasool	aaa72c547b	ARM: correct the relocation type for `bl` on WoA The `IMAGE_REL_ARM_BRANCH20T` applies only to a `b.w` instruction. A thumb-2 `bl` should be relocated using a `IMAGE_REL_ARM_BRANCH24T`. Correct the relocation that we emit in such a case. Resolves PR38620! Based on the patch by Jordan Rhee! llvm-svn: 342109	2018-09-13 04:55:08 +00:00
Thomas Lively	65825cd7c5	Remove isAsCheapAsAMove from v128.const llvm-svn: 342106	2018-09-13 02:50:57 +00:00
Thomas Lively	17ba6becaa	Remove isAsCheapAsAMove from mem ops llvm-svn: 342105	2018-09-13 02:50:57 +00:00
Thomas Lively	56b34f6c51	[WebAssembly] Add missing SIMD instruction attributes Summary: These attributes are copied from equivalent instructions in WebAssemblyInstrInfo.td. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51518 llvm-svn: 342104	2018-09-13 02:50:56 +00:00
Krzysztof Parzyszek	a6d4fc0e29	[Hexagon] Use shuffles when lowering "gather" shufflevectors Shufflevector instructions in LLVM IR that extract a subset of elements of a longer input into a shorter vector can be done using VECTOR_SHUFFLEs. This will avoid expanding them into costly extracts and inserts. llvm-svn: 342091	2018-09-12 22:14:52 +00:00
Krzysztof Parzyszek	f853741142	[Hexagon] Improve the selection algorithm in scalarizeShuffle Use topological ordering for newly generated nodes. llvm-svn: 342090	2018-09-12 22:10:58 +00:00
Heejin Ahn	300f42fbce	[WebAssembly] Make tied inline asm operands work again Summary: rL341389 broke code with tied register operands in inline assembly. For example, `asm("" : "=r"(var) : "0"(var));` The code above specifies the input operand to be in the same register with the output operand, tying the two register. This patch makes this kind of code work again. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, eraman, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51991 llvm-svn: 342084	2018-09-12 21:34:39 +00:00
Krzysztof Parzyszek	cd95e03cf0	[Hexagon] Use legalized type for extracted elements in scalarizeShuffle Scalarization of a shuffle will break up the source vectors into individual elements, and use them to assemble the resulting vector. An element type of a legal vector type may not necessarily be a legal scalar type, so make sure that the extracted values are extended to a legal scalar type. llvm-svn: 342079	2018-09-12 20:58:48 +00:00
Konstantin Zhuravlyov	6e551e0e49	AMDGPU: Print all kernel descriptor directives (including the ones with default values) Change by Tony Tye Differential Revision: https://reviews.llvm.org/D51954 llvm-svn: 342077	2018-09-12 20:25:39 +00:00
Konstantin Zhuravlyov	71e43ee47d	AMDGPU: Re-apply r341982 after fixing the layering issue Move isa version determination into TargetParser. Also switch away from target features to CPU string when determining isa version. This fixes an issue when we output wrong isa version in the object code when features of a particular CPU are altered (i.e. gfx902 w/o xnack used to result in gfx900). llvm-svn: 342069	2018-09-12 18:50:47 +00:00
Thomas Lively	ebd4c906d8	[WebAssembly] SIMD comparisons Summary: Match the ordering semantics of non-vector comparisons. For floating point comparisons that do not correspond to instructions, the tests check that some vector comparison instruction was emitted but do not care about the full implementation. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51765 llvm-svn: 342064	2018-09-12 17:56:00 +00:00
Diogo N. Sampaio	01b916e188	[ARM] Tighten f64<->f16 conversion requirements Fix missing Requires fields. Patch by Bernard Ogden (bogden) Reviewers: SjoerdMeijer, javed.absar, t.p.northover Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D51631 llvm-svn: 342061	2018-09-12 16:24:43 +00:00
Craig Topper	2262613532	[X86] Remove isel patterns for ADCX instruction There's no advantage to this instruction unless you need to avoid touching other flag bits. Its encoding is longer, it can't fold an immediate, it doesn't write all the flags. I don't think gcc will generate this instruction either. Fixes PR38852. Differential Revision: https://reviews.llvm.org/D51754 llvm-svn: 342059	2018-09-12 15:47:34 +00:00
Sander de Smalen	2d77e788f2	[AArch64] Implement aarch64_vector_pcs codegen support. This patch adds codegen support for the saving/restoring V8-V23 for functions specified with the aarch64_vector_pcs calling convention attribute, as added in patch D51477. Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D51479 llvm-svn: 342049	2018-09-12 12:10:22 +00:00
Sam Parker	1187911b0b	[ARM] Follow-up to rL342033 Fixed typo which can cause segfault. llvm-svn: 342040	2018-09-12 09:58:56 +00:00
Sander de Smalen	7140363cd0	[AArch64] NFC: Refactoring to prepare for vector PCS. This patch refactors several parts of AArch64FrameLowering so that it can be easily extended to support saving/restoring of FPR128 (Q) registers. Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D51478 llvm-svn: 342038	2018-09-12 09:44:46 +00:00
Sam Parker	a023c7a9cb	[ARM] Exchange MAC operands in ARMParallelDSP SMLAD and SMLALD instructions also come in the form of SMLADX and SMLALDX which perform an exchange on their second operand. To support this, more of the loads in the MAC candidates are compared for sequential access and a boolean value has been added to BinOpChain. AddMACCandiate has been refactored into a small pattern matching state machine to reduce the amount of duplicated code, but also to enable the matching to be more flexible. CreateParallelMACPairs now iterates through all the candidates to find parallel ones. Differential Revision: https://reviews.llvm.org/D51424 llvm-svn: 342033	2018-09-12 09:17:44 +00:00
Sam Parker	569b24549e	[ARM] Allow bitcasts in ARMCodeGenPrepare Allow bitcasts in the use-def chains, treating them as sources. Differential Revision: https://reviews.llvm.org/D50758 llvm-svn: 342032	2018-09-12 09:11:48 +00:00
Sander de Smalen	4dbc512676	[AArch64] Add parsing of aarch64_vector_pcs attribute. This patch adds parsing support for the 'aarch64_vector_pcs' calling convention attribute to calls and function declarations. More information describing the vector ABI and procedure call standard can be found here: https://developer.arm.com/products/software-development-tools/hpc/arm-compiler-for-hpc/vector-function-abi Reviewers: t.p.northover, rnk, rengolin, javed.absar, thegameg, SjoerdMeijer Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D51477 llvm-svn: 342030	2018-09-12 08:54:06 +00:00
Ilya Biryukov	95066496d0	Revert "AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser." This reverts commit r341982. The change introduced a layering violation. Reverting to unbreak our integrate. llvm-svn: 342023	2018-09-12 07:05:30 +00:00
Craig Topper	dc32e91bc6	[X86] Teach X86SelectionDAGInfo::EmitTargetCodeForMemcpy about GNUX32 Summary: In GNUX32, is64BitMode returns true, but pointers are 32 bits. So we shouldn't copy pointer values into RSI/RDI since the widths don't match. Fixes PR38865 despite what the title says. I think the llvm_unreachable in the copyPhysReg code tricked the optimizer and made the fatal error trigger. Reviewers: rnk, efriedma, MatzeB, echristo Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51893 llvm-svn: 342015	2018-09-12 01:57:22 +00:00
Konstantin Zhuravlyov	941615e4c8	AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser. Also switch away from target features to CPU string when determining isa version. This fixes an issue when we output wrong isa version in the object code when features of a particular CPU are altered (i.e. gfx902 w/o xnack used to result in gfx900). Differential Revision: https://reviews.llvm.org/D51890 llvm-svn: 341982	2018-09-11 18:56:51 +00:00
Craig Topper	8238580aae	[X86] Prefer unpckhpd over movhlps in isel for fake unary cases In r337348, I changed lowering to prefer X86ISD::UNPCKL/UNPCKH opcodes over MOVLHPS/MOVHLPS for v2f64 {0,0} and {1,1} shuffles when we have SSE2. This enabled the removal of a bunch of weirdly bitcasted isel patterns in r337349. To avoid changing the tests I placed a gross hack in isel to still emit movhlps instructions for fake unary unpckh nodes. A similar hack was not needed for unpckl and movlhps because we do execution domain switching for those. But unpckh and movhlps have swapped operand order. This patch removes the hack. This is a code size increase since unpckhpd requires a 0x66 prefix and movhlps does not. But if that's a big concern we should be using movhlps for all unpckhpd opcodes and let commuteInstruction turn it into unpckhpd when it's an advantage. Differential Revision: https://reviews.llvm.org/D49499 llvm-svn: 341973	2018-09-11 17:57:27 +00:00
Craig Topper	cc9efaffad	[X86] Teach X86FastISel::X86SelectRet to use EAX for the sret pointer in GNUX32 GNUX32 uses 32-bit pointers despite is64BitMode being true. So we should use EAX to return the value. Fixes one of the failures from PR38865. Differential Revision: https://reviews.llvm.org/D51940 llvm-svn: 341972	2018-09-11 17:57:23 +00:00
Josh Stone	aca532f14d	Test commit: remove trailing whitespace llvm-svn: 341966	2018-09-11 17:28:43 +00:00
Craig Topper	d7362a3e5f	[X86] Correct the one use check from r341915. The one use check should be on the bitcast, not the input to the bitcast. llvm-svn: 341956	2018-09-11 16:05:03 +00:00
Simon Atanasyan	16c2311c59	[MIPS] Fix illegal type assert in single-float mode An fp_to_sint node would be incorrectly lowered to a TruncIntFP node in single-float mode. This would trigger an "Unexpected illegal type!" assert. Patch by Dan Ravensloft. Differential revision: https://reviews.llvm.org/D51810 llvm-svn: 341952	2018-09-11 15:32:47 +00:00
Sam Parker	01db2983cd	[ARM] Add smlald support in ARMParallelDSP Search from i64 reducing phis, as well as i32, to allow the generation of smlald instructions. Differential Revision: https://reviews.llvm.org/D51101 llvm-svn: 341941	2018-09-11 14:01:22 +00:00
Sam Parker	945604d511	[ARM] Enable ARMCodeGenPrepare by default We've had the pass enabled downstream for a couple of weeks and it seems to be okay, so enable it by default. Differential Revision: https://reviews.llvm.org/D51920 llvm-svn: 341932	2018-09-11 12:45:43 +00:00
Alexander Timofeev	db7ee7660a	[AMDGPU] Preliminary patch for divergence driven instruction selection. Immediate selection predicate changed Differential revision: https://reviews.llvm.org/D51734 Reviewers: rampitec llvm-svn: 341928	2018-09-11 11:56:50 +00:00
Simon Atanasyan	32d8d1bf04	[mips] Add a pattern for 64-bit GPR variant of the `rdhwr` instruction MIPS ISAs start to support third operand for the `rdhwr` instruction starting from Revision 6. But LLVM generates assembler code with three-operands version of this instruction on any MIPS64 ISA. The third operand is always zero, so in case of direct code generation we get correct code. This patch fixes the bug by adding an instruction alias. The same alias already exists for 32-bit ISA. Ideally, we also need to reject three-operands version of the `rdhwr` instruction in an assembler code if ISA revision is less than 6. That is a task for a separate patch. This fixes PR38861 (https://bugs.llvm.org/show_bug.cgi?id=38861) Differential revision: https://reviews.llvm.org/D51773 llvm-svn: 341919	2018-09-11 09:57:25 +00:00

49048 Commits