llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	0dab7ecc5d	[X86] EltsFromConsecutiveLoads - pull out repeated NumLoadedElts. NFCI.	2020-12-02 16:29:37 +00:00
Kazushi (Jam) Marukawa	dd0159bd81	[VE] Add vand, vor, and vxor intrinsic instructions Add vand, vor, and vxor intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92454	2020-12-02 22:52:54 +09:00
Anirudh Prasad	f03c21df7b	[SystemZ] Adding extra extended mnemonics for SystemZ target This patch consists of the addition of some common additional extended mnemonics to the SystemZ target. - These are jnop, jct, jctg, jas, jasl, jxh, jxhg, jxle, jxleg, bru, brul, br, brl. - These mnemonics and the instructions they map to are defined here, Chapter 4 - Branching with extended mnemonic codes. - Except for jnop (which is a variant of brc 0, label), every other mnemonic is marked as a MnemonicAlias since there is already a "defined" instruction with the same encoding and/or condition mask values. - brc 0, label doesn't have a defined extended mnemonic, thus jnop is defined using an InstAlias. Furthermore, the applyMnemonicAliases function is called in the overridden parseInstruction function in SystemZAsmParser.cpp to ensure any mnemonic aliases are applied before any further processing on the instruction is done. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D92185	2020-12-02 08:25:31 -05:00
Jay Foad	d28624a209	[AMDGPU] Stop adding an implicit def of vcc_hi for wave32 This doesn't seem to be needed for anything. Differential Revision: https://reviews.llvm.org/D92400	2020-12-02 10:11:42 +00:00
Qiu Chaofan	ffa2dce590	[PowerPC] Fix FLT_ROUNDS_ on little endian In lowering of FLT_ROUNDS_, FPSCR content will be moved into FP register and then GPR, and then truncated into word. For subtargets without direct move support, it will store and then load. The load address needs adjustment (+4) only on big-endian targets. This patch fixes it on using generic opcodes on little-endian and subtargets with direct-move. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D91845	2020-12-02 17:16:32 +08:00
QingShan Zhang	47f784ace6	[PowerPC] Promote the i1 to i64 for SINT_TO_FP/FP_TO_SINT i1 is the native type for PowerPC if crbits is enabled. However, we need to promote the i1 to i64 as we didn't have the pattern for i1. Reviewed By: Qiu Chao Fang Differential Revision: https://reviews.llvm.org/D92067	2020-12-02 05:37:45 +00:00
Heejin Ahn	60653e24b6	[WebAssembly] Support select and block for reference types This adds missing `select` instruction support and block return type support for reference types. Also refactors WebAssemblyInstrRef.td and rearranges tests in reference-types.s. Tests don't include `exnref` types, because we currently don't support `exnref` for `ref.null` and the type will be removed soon anyway. Reviewed By: tlively, sbc100, wingo Differential Revision: https://reviews.llvm.org/D92359	2020-12-01 19:16:57 -08:00
Chen Zheng	95d6042dd4	[NFC][PowerPC] code refactor: split IsReassociable to fma and add. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92070	2020-12-01 21:18:57 -05:00
Kazushi (Jam) Marukawa	c1762bcf0a	[VE] Add vcmp, vmax, and vmin intrinsic instructions Add vcmp, vmax, and vmin intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92387	2020-12-02 11:16:52 +09:00
Jessica Paquette	c82f002cea	[AArch64][GlobalISel] Don't write to WZR in non-flag-setting G_BRCOND case We are avoiding writing to WZR just about everywhere else. Also update the code to use MachineIRBuilder for the sake of consistency. We also didn't have a GlobalISel testcase for this path, so add a simple one now. Differential Revision: https://reviews.llvm.org/D90626	2020-12-01 16:45:37 -08:00
Fangrui Song	e27e3ba9c9	[RISCVAsmParser] Allow a SymbolRef operand to be a complex expression So that instructions like `lla a5, (0xFF + end) - 4` (supported by GNU as) can be parsed. Add a missing test that an operand like `foo + foo` is not allowed. Reviewed By: jrtc27 Differential Revision: https://reviews.llvm.org/D92293	2020-12-01 16:08:09 -08:00
Jessica Paquette	6c3fa97d8a	[AArch64][GlobalISel] Select Bcc when it's better than TB(N)Z Instead of falling back to selecting TB(N)Z when we fail to select an optimized compare against 0, select Bcc instead. Also simplify selectCompareBranch a little while we're here, because the logic was kind of hard to follow. At -O0, this is a 0.1% geomean code size improvement for CTMark. A simple example of where this can kick in is here: https://godbolt.org/z/4rra6P In the example above, GlobalISel currently produces a subs, cset, and tbnz. SelectionDAG, on the other hand, just emits a compare and b.le. Differential Revision: https://reviews.llvm.org/D92358	2020-12-01 15:45:14 -08:00
Fangrui Song	f0659c0673	[X86] Support modifier @PLTOFF for R_X86_64_PLTOFF64 `gcc -mcmodel=large` can emit @PLTOFF. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D92294	2020-12-01 08:39:01 -08:00
Sanjay Patel	136f98e523	[x86] adjust cost model values for minnum/maxnum with fast-math-flags Without FMF, we lower these intrinsics into something like this: vmaxsd %xmm0, %xmm1, %xmm2 vcmpunordsd %xmm0, %xmm0, %xmm0 vblendvpd %xmm0, %xmm1, %xmm2, %xmm0 But if we can ignore NANs, the single min/max instruction is enough because there is no need to fix up the x86 logic that corresponds to X > Y ? X : Y. We probably want to make other adjustments for FP intrinsics with FMF to account for specialized codegen (for example, FSQRT). Differential Revision: https://reviews.llvm.org/D92337	2020-12-01 10:45:53 -05:00
David Green	eedf0ed63e	[ARM] Mark select and selectcc of MVE vector operations as expand. We already expand select and select_cc in codegenprepare, but they can still be generated under some situations. Explicitly mark them as expand to ensure they are not produced, leading to a failure to select the nodes. Differential Revision: https://reviews.llvm.org/D92373	2020-12-01 15:05:55 +00:00
Simon Pilgrim	1b209ff9e3	[DAG] Move vselect(icmp_ult, 0, sub(x,y)) -> usubsat(x,y) to DAGCombine (PR40111) Move the X86 VSELECT->USUBSAT fold to DAGCombiner - there's nothing target specific about these folds.	2020-12-01 14:25:29 +00:00
Kazushi (Jam) Marukawa	10b164d2f7	[VE] Add vmul and vdiv intrinsic instructions Add vmul and vdiv intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92377	2020-12-01 23:03:49 +09:00
Simon Pilgrim	6dbd0d36a1	[DAG] Move vselect(icmp_ult, -1, add(x,y)) -> uaddsat(x,y) to DAGCombine (PR40111) Move the X86 VSELECT->UADDSAT fold to DAGCombiner - there's nothing target specific about these folds. The SSE42 test diffs are relatively benign - it's avoiding an extra constant load in exchange for an extra xor operation - there are extra register moves, which is annoying as all those operations should commute them away. Differential Revision: https://reviews.llvm.org/D91876	2020-12-01 11:56:26 +00:00
Simon Pilgrim	c63799fc52	[InstCombine][X86] Fold addsub intrinsic to fadd/fsub depending on demanded elts (PR46277)	2020-12-01 11:27:40 +00:00
Caroline Concatto	4b0ef2b075	[NFC][CostModel]Extend class IntrinsicCostAttributes to use ElementCount Type This patch replaces the attribute `unsigned VF` in the class IntrinsicCostAttributes by `ElementCount VF`. This is a non-functional change to help upcoming patches to compute the cost model for scalable vector inside this class. Differential Revision: https://reviews.llvm.org/D91532	2020-12-01 11:12:51 +00:00
Kazushi (Jam) Marukawa	c3fe6ea22e	[VE] Add vadd and vsub intrinsic instructions Add vadd and vsub intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92332	2020-12-01 19:57:22 +09:00
David Green	7923d71b4a	[ARM] PREDICATE_CAST demanded bits The PREDICATE_CAST node is used to model moves between MVE predicate registers and gpr's, and eventually become a VMSR p0, rn. When moving to a predicate only the bottom 16 bits of the sources register are demanded. This adds a simple fold for that, allowing it to potentially remove instructions like uxth. Differential Revision: https://reviews.llvm.org/D92213	2020-12-01 10:32:24 +00:00
Jay Foad	839c9635ed	[AMDGPU] Simplify some generation checks. NFC.	2020-12-01 10:15:32 +00:00
Craig Topper	40659cd2c6	[RISCV] Rename RISCVGenSystemOperands.inc to RISCVGenSearchableTables.inc to prepare for more tables. NFC D89449 adds more tables so renaming as a pre-commit for that.	2020-11-30 20:47:58 -08:00
Amara Emerson	87ff156414	[AArch64][GlobalISel] Fix crash during legalization of a vector G_SELECT with scalar mask. The lowering of vector selects needs to first splat the scalar mask into a vector. This was causing a crash when building oggenc in the test suite. Differential Revision: https://reviews.llvm.org/D91655	2020-11-30 16:37:49 -08:00
Sjoerd Meijer	630d37dc1b	[AArch64] Enable Cortex-A55 schedmodel The model was committed in `4b8ade837e` but not yet enabled to allow for a few fix ups. This adds a few of these fixes, and also an LLVM MCA test to check most instructions. While I do have plans to look into some more tuning, it's time to enable this as it is better than using the A53 schedule. Differential Revision: https://reviews.llvm.org/D88017	2020-11-30 19:28:34 +00:00
Harald van Dijk	cdac34bd47	[X86] Zero-extend pointers to i64 for x86_64 For LP64 mode, this has no effect as pointers are already 64 bits. For ILP32 mode (x32), this extension is specified by the ABI. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D91338	2020-11-30 18:51:23 +00:00
Simon Pilgrim	e425d0b92a	[InstCombine][X86] Add basic addsub intrinsic SimplifyDemandedVectorElts support (PR46277) Pass through the demanded elts mask to the source operands. The next step will be to add support for folding to add/sub if we only demand odd/even elements.	2020-11-30 18:40:16 +00:00
Fangrui Song	7c4555f60d	[PowerPC] Delete remnant Darwin code in PPCAsmParser Continue the work started at D50989. The code has been long dead since the triple has been removed (D75494). Reviewed By: nickdesaulniers, void Differential Revision: https://reviews.llvm.org/D91836	2020-11-30 10:16:19 -08:00
Kazushi (Jam) Marukawa	3d872cbc2f	[VE][NFC] Update comments Update comments. I forgot to update it previously when I modified code.	2020-12-01 02:56:16 +09:00
Kazushi (Jam) Marukawa	6834b3d6d5	[VE] Optimize prologue/epilogue instructions about GOT Optimize prologue/epilogue instructions if a given function uses GOT but does not call other functions by eliminating FP. Previously, we had wrong implementations taken from other architectures. Update regression tests also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92313	2020-12-01 02:22:31 +09:00
Kazushi (Jam) Marukawa	6fe610535f	[VE] Clean check routines of branch types Previously, these check routines accepted non-generatable instructions. This time, I clean them and add assert for those non-generatable instructions. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92254	2020-12-01 02:19:37 +09:00
Craig Topper	bfc4f29f46	[RISCV] Combine (GORCI (GORCI x, C2), C1) -> (GORCI x, C1\|C2). Unlike GREVI, GORCI stages can't be undone, but they are redundant if done more than once. Differential Revision: https://reviews.llvm.org/D92295	2020-11-30 08:42:46 -08:00
Craig Topper	76d1026b59	[RISCV] Custom legalize bswap/bitreverse to GREVI with Zbp extension to enable them to combine with other GREVI instructions This enables bswap/bitreverse to combine with other GREVI patterns or each other without needing to add more special cases to the DAG combine or new DAG combines. I've also enabled the existing GREVI combine for GREVIW so that it can pick up the i32 bswap/bitreverse on RV64 after they've been type legalized to GREVIW. Differential Revision: https://reviews.llvm.org/D92253	2020-11-30 08:30:40 -08:00
Fangrui Song	25c8fbb3d9	[X86] Don't emit R_X86_64_[REX_]GOTPCRELX for a GOT load with an offset clang may produce `movl x@GOTPCREL+4(%rip), %eax` when loading the high 32 bits of the address of a global variable in -fpic/-fpie mode. If assembled by GNU as, the fixup emits R_X86_64_GOTPCRELX with an addend != -4. The instruction loads from the GOT entry with an offset and thus it is incorrect to relax the instruction. This patch does not emit a relaxable relocation for a GOT load with an offset because R_X86_64_[REX_]GOTPCRELX do not make sense for instructions which cannot be relaxed. The result is good enough for LLD to work. GNU ld relaxes mov+GOTPCREL as well, but it suppresses the relaxation if addend != -4. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D92114	2020-11-30 08:27:31 -08:00
Craig Topper	cbbd7021f1	[RISCV] Only combine (or (GREVI x, shamt), x) -> GORCI if shamt is a power of 2. GORCI performs an OR between each stage. So we need to ensure only one stage is active before doing this combine. Initial attempts at finding a test case for this failed due to the order things get combined. It's most likely that we'll form one stage of GREVI then combine to GORCI before the two stages of GREVI are able to be formed and combined with each other to form a multi stage GREVI. Differential Revision: https://reviews.llvm.org/D92289	2020-11-30 08:10:39 -08:00
Kazushi (Jam) Marukawa	686988a50f	[VE] Optimize prologue/epilogue instructions Optimize eliminate FP mechanism. This time optimize a function which has no call but fixed stack objects. LLVM eliminates FP on such functions now. Also, optimize GOT/PLT registers save/restore instructions if a given function doesn't use them. In addition, remove generating mechanism of `.cfi` instructions since those are taken from other architectures and not inspected yet. Update regression tests, also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92251	2020-11-30 22:22:33 +09:00
Kazushi (Jam) Marukawa	44a679eaa4	[VE] Change the behaviour of truncate Change the way to truncate i64 to i32 in I64 registers. VE assumed sext values previously. Change it to zext values this time to make it match to the LLVM behaviour. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92226	2020-11-30 22:12:45 +09:00
Kazushi (Jam) Marukawa	33eac0f283	[VE] Specify vector alignments Specify alignments for all vector types. Update a regression test also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92256	2020-11-30 22:09:21 +09:00
Sjoerd Meijer	5110ff0817	[AArch64][CostModel] Fix cost for mul <2 x i64> This was modeled to have a cost of 1, but since we do not have a MUL.2d this is scalarized into vector inserts/extracts and scalar muls. Motivating precommitted test is test/Transforms/SLPVectorizer/AArch64/mul.ll, which we don't want to SLP vectorize. Test Transforms/LoopVectorize/AArch64/extractvalue-no-scalarization-required.ll unfortunately needed changing, but the reason is documented in LoopVectorize.cpp:6855: // The cost of executing VF copies of the scalar instruction. This opcode // is unknown. Assume that it is the same as 'mul'. which I will address next as a follow up of this. Differential Revision: https://reviews.llvm.org/D92208	2020-11-30 11:36:55 +00:00
Simon Pilgrim	83d79ca5bf	[X86][AVX512] Only lower to VPALIGNR if we have BWI (PR48322)	2020-11-30 10:51:24 +00:00
Evgeny Leviant	112b3cb6ba	[TableGen][SchedModels] Fix read/write variant substitution Patch fixes multiple issues related to expansion of variant sched reads and writes. Differential revision: https://reviews.llvm.org/D90844	2020-11-30 11:55:55 +03:00
Fangrui Song	e6db1416ae	[RISCV] Remove unused Addend parameter from classifySymbolRef. NFC It is confusing as well since in the case of A - B + Cst, the returned Addend is not Cst.	2020-11-29 19:17:59 -08:00
Craig Topper	84aad9b5da	[RISCV] Change predicate on InstAliases for GORCI/GREVI/SHFLI/UNSHFLI to HasStdExtZbp instead of HasStdExtZbbOrZbp. This matches the predicate on the instructions. Though I think some specific encodings are valid in Zbb, but not all of them.	2020-11-29 11:23:23 -08:00
Harald van Dijk	47e2fafbf3	[X86] Do not allow FixupSetCC to relax constraints The build bots caught two additional pre-existing problems exposed by the test change part of my change https://reviews.llvm.org/D91339, when expensive checks are enabled. https://reviews.llvm.org/D91924 fixes one of them, this fixes the other. FixupSetCC will change code in the form of %setcc = SETCCr ... %ext1 = MOVZX32rr8 %setcc to %zero = MOV32r0 %setcc = SETCCr ... %ext2 = INSERT_SUBREG %zero, %setcc, %subreg.sub_8bit and replace uses of %ext1 with %ext2. The register class for %ext2 did not take into account any constraints on %ext1, which may have been required by its uses. This change ensures that the original constraints are honoured, by instead of creating a new %ext2 register, reusing %ext1 and further constraining it as needed. This requires a slight reorganisation to account for the fact that it is possible for the constraining to fail, in which case no changes should be made. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D91933	2020-11-28 17:46:56 +00:00
Harald van Dijk	47c902ba84	[X86] Have indirect calls take 64-bit operands in 64-bit modes The build bots caught two additional pre-existing problems exposed by the test change part of my change https://reviews.llvm.org/D91339, when expensive checks are enabled. This fixes one of them. X86 has CALL64r and CALL32r opcodes, where CALL64r takes a 64-bit register, and CALL32r takes a 32-bit register. CALL64r can only be used in 64-bit mode, CALL32r can only be used in 32-bit mode. LLVM would assume that after picking the appropriate CALLr opcode, a pointer-sized register would be a valid operand, but in x32 mode, a 64-bit mode, pointers are 32 bits. In this mode, it is invalid to directly pass a pointer to CALL64r, it needs to be extended to 64 bits first. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D91924	2020-11-28 16:46:30 +00:00
Craig Topper	6ee22ca6ce	[RISCV] Add tests for existing (rotr (bswap X), (i32 16))->grevi pattern for RV32. Extend same pattern to rotl and GREVIW. Not sure why bswap was treated specially. This also applies to bitreverse or generic grevi. We can improve this in future patches. For now I just wanted to get the consistency and the test coverage as I plan to make some other changes around bswap.	2020-11-27 18:09:01 -08:00
Kazushi (Jam) Marukawa	3bd78b7cc0	[VE] Optimize emitSPAdjustment function Optimize emitSPAdjustment function to generate as small as possible instructions to adjust SP. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92174	2020-11-28 08:06:31 +09:00
Craig Topper	8709d9d872	[RISCV] Replace getSimpleValueType() with getValueType() in DAG combines to prevent asserts with weird types.	2020-11-27 12:49:12 -08:00
Craig Topper	f325b4bbce	[RISCV] Replace sexti32/zexti32 in isel patterns where only one part of their PatFrags can match. NFCI We had a zexti32 after a sign_extend_inreg. The AND X, 0xffffffff part of the zexti32 should never occur since SimplifyDemandedBits from the sign_extend_inreg would have removed it. We also had sexti32 as the root node of a pattern, but SelectionDAGISel matches assertsext early before the tablegen based patterns are evaluated.	2020-11-27 11:37:25 -08:00
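
The two [DAG] entries above (1b209ff9e3 and 6dbd0d36a1) move select-based folds to usubsat/uaddsat into DAGCombiner. As a rough illustration of why such folds are target independent, the sketch below checks the underlying arithmetic identities exhaustively for 8-bit unsigned values. It assumes the commit titles are shorthand for select(x <u y, 0, x - y) and select((x + y) <u x, -1, x + y); the exact patterns DAGCombiner matches may differ, and all function names here are hypothetical, not LLVM APIs.

```python
# Illustration only: models 8-bit unsigned lanes with plain Python integers.
# The select forms below are an assumed reading of the commit titles.
BITS = 8
MASK = (1 << BITS) - 1  # all-ones value, i.e. "-1" as an unsigned lane


def usubsat(x, y):
    """Unsigned saturating subtract: clamps at 0 instead of wrapping."""
    return max(x - y, 0)


def uaddsat(x, y):
    """Unsigned saturating add: clamps at the all-ones value."""
    return min(x + y, MASK)


def select_sub(x, y):
    """select(x <u y, 0, sub(x, y)) with a wrapping subtract."""
    diff = (x - y) & MASK
    return 0 if x < y else diff


def select_add(x, y):
    """select(add(x, y) <u x, -1, add(x, y)) with a wrapping add."""
    total = (x + y) & MASK
    # A wrapped unsigned sum is smaller than either operand.
    return MASK if total < x else total


def folds_hold():
    """Check both identities for every pair of 8-bit inputs."""
    values = range(MASK + 1)
    return all(
        select_sub(x, y) == usubsat(x, y) and select_add(x, y) == uaddsat(x, y)
        for x in values
        for y in values
    )
```

Calling folds_hold() compares both select forms against the saturating operations over all 65536 input pairs; nothing in the check depends on a particular target, which matches the rationale given in the commit messages.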

60269 Commits