llvm-project

Commit Graph

Author	SHA1	Message	Date
Oliver Stannard	6739805e24	[ARM] Track epilogue instructions with FrameDestroy flag (NFC) Rather than trying to work out which instructions are part of the epilogue by examining them, we can just mark them with the FrameDestroy flag, like we do in the AArch64 backend.	2020-03-18 13:32:59 +00:00
Francesco Petrogalli	9bdcd9bf44	[llvm][SVE] Addressing mode for FF/NF loads. Summary: This patch adds addressing mode computation for the following SVE instructions: * ldff1{s}<T1> { <Zt>.<T2> }, <Pg>/Z, [<Xn\|SP>{, <Xm>{, lsl #imm}}] * ldnf1{s}<T1> { <Zt>.<T2> }, <Pg>/Z, [<Xn\|SP>{, #<imm>, mul vl}] Reviewers: andwar, sdesmalen, rengolin, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76209	2020-03-18 12:46:07 +00:00
Sander de Smalen	4788ca450f	[AArch64][SVE] Change pointer type of nontemporal load/store intrinsics Summary: This fixes a discrepancy between the non-temporal loads/store intrinsics and other SVE load intrinsics (such as nf/ff), so that Clang can use the same code to generate these intrinsics. Reviewers: andwar, kmclaughlin, rengolin, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76237	2020-03-18 12:44:51 +00:00
Danila Malyutin	940ba1465b	Fix possible assertion when using PBQP with debug info Skip debug instructions before calling functions not expecting them. In particular, LIS.getInstructionIndex(*mi) would fail if mi was a debug instr. Differential Revision: https://reviews.llvm.org/D76129	2020-03-18 15:29:42 +03:00
David Stenberg	a0a3a9c5a8	[DebugInfo] Fix multi-byte entry values in call site values Summary: In D67768/D67492 I added support for entry values having blocks larger than one byte, but I now noticed that the DIE implementation I added there was broken. The takeNodes() function, that moves the entry value block from a temporary buffer to the output buffer, would destroy the input iterator when transferring the first node, meaning that only that node was moved. In practice, this meant that when emitting a call site value using a DW_OP_entry_value operation with a DWARF register number larger than 31, that multi-byte DW_OP_regx expression would be truncated. Reviewers: djtodoro, aprantl, vsk Reviewed By: djtodoro Subscribers: llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D76279	2020-03-18 13:23:17 +01:00
Florian Hahn	0db7244295	[SCCP] Precommit some additional tests for integer ranges.	2020-03-18 11:34:04 +00:00
Simon Pilgrim	f4e495a18e	[InstCombine][X86] simplifyX86varShift - convert variable in-range per-element shift amounts to generic shifts (PR40391) AVX2/AVX512 per-element shifts can be replaced with generic shifts if the shift amounts are guaranteed to be in-range (upper bits are known zero).	2020-03-18 11:26:54 +00:00
Simon Tatham	928776de92	[ARM,MVE] Add intrinsics for the VQDMLAH family. Summary: These are complicated integer multiply+add instructions with extra saturation, taking the high half of a double-width product, and optional rounding. There's no sensible way to represent that in standard IR, so I've converted the clang builtins directly to target-specific intrinsics. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: miyuki Subscribers: kristof.beyls, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76123	2020-03-18 10:55:04 +00:00
Simon Tatham	28c5d97bee	[ARM,MVE] Add intrinsics and isel for MVE integer VMLA. Summary: These instructions compute multiply+add in integers, with one of the operands being a splat of a scalar. (VMLA and VMLAS differ in whether the splat operand is a multiplier or the addend.) I've represented these in IR using existing standard IR operations for the unpredicated forms. The predicated forms are done with target- specific intrinsics, as usual. When operating on n-bit vector lanes, only the bottom n bits of the i32 scalar operand are used. So we have to tell that to isel lowering, to allow it to remove a pointless sign- or zero-extension instruction on that input register. That's done in `PerformIntrinsicCombine`, but first I had to enable `PerformIntrinsicCombine` for MVE targets (previously all the intrinsics it handled were for NEON), and make it a method of `ARMTargetLowering` so that it can get at `SimplifyDemandedBits`. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76122	2020-03-18 10:55:04 +00:00
Simon Pilgrim	cda2b0769f	[InstCombine][X86] Tests for variable but in-range per-element shift amounts (PR40391) These shifts are masked to be inrange so we should be able to replace them with generic shifts.	2020-03-18 10:29:47 +00:00
Florian Hahn	5672ae8d86	[SCCP] Use constant ranges for select, if cond is overdefined. For selects with an unknown condition, we can approximate the result by merging the state of both options. This automatically takes care of the case where one operand is undef. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71935	2020-03-18 09:26:02 +00:00
Sam Parker	ef56b55e12	[NFC][ARM] Add thumb triple to test Test the costs of selects for thumbv8m.base too.	2020-03-18 09:15:19 +00:00
Pengfei Wang	974d649f8e	CET for Exception Handle Summary: Bug fix for https://bugs.llvm.org/show_bug.cgi?id=45182 Exception handling may indirectly jump to a catch pad, so we should add an ENDBR instruction before the catch pad instructions. Reviewers: craig.topper, hjl.tools, LuoYuanke, annita.zhang, pengfei Reviewed By: LuoYuanke Subscribers: hiraditya, llvm-commits Patch By: Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D76190	2020-03-17 22:35:05 -07:00
Vitaly Buka	9bca8fc4cf	Revert "AMDGPU/GlobalISel: Fully handle 0 dmask case during legalize" The patch introduced use-after-poison. This reverts commit `d0fe13ecf9`.	2020-03-17 22:04:14 -07:00
QingShan Zhang	d577193c0f	[DAGCombine] Respect the uses when combine FMA for ab+/-cd If it is ab-cd, it could be also folded into fma(a, b, -cd) or fma(-c, d, ab). This patch is trying to respect the uses of ab and cd to make the best choice. Differential Revision: https://reviews.llvm.org/D75982	2020-03-18 03:34:27 +00:00
Jin Lin	7b166d5182	Revert "Support repeated machine outlining" This reverts commit `ab2dcff309`.	2020-03-17 18:33:55 -07:00
Jin Lin	ab2dcff309	Support repeated machine outlining Summary: This change allows machine outlining to be applied N times, where N is specified by a compiler option. By default the value of N is 1. The motivation is that repeated machine outlining can further reduce code size. Please refer to the presentation "Improving Swift Binary Size via Link Time Optimization" in LLVM Developers' Meeting in 2019. Reviewers: aschwaighofer, tellenbach, paquette Reviewed By: paquette Subscribers: tellenbach, hiraditya, llvm-commits, jinlin Tags: #llvm Differential Revision: https://reviews.llvm.org/D71027	2020-03-17 18:11:08 -07:00
Nico Weber	4e0fe038f4	Revert "Avoid emitting unreachable SP adjustments after `throw`" This reverts commit `65b21282c7`. Breaks sanitizer bots (https://reviews.llvm.org/D75712#1927668) and causes https://crbug.com/1062021 (which may or may not be a compiler bug, not clear yet).	2020-03-17 20:49:22 -04:00
Matt Arsenault	c9b454a1b7	AMDGPU/GlobalISel: Fix verifier errors on image atomics	2020-03-17 20:06:25 -04:00
Scott Linder	68f163df0e	[AMDGPU] Print DWARF register numbers in AMDGPUInstPrinter Summary: Explanation is in a comment in the diff, but essentially printing a physical register name here is ambiguous. Until we can implement printing a DWARF register name here just use the encoding directly. Tags: #llvm Differential Revision: https://reviews.llvm.org/D76253	2020-03-17 19:42:10 -04:00
Jian Cai	6a38e0e4f5	[MC] Recalculate fragment offsets after relaxation Summary: The current relaxation implementation is not correctly adjusting the size and offsets of fragments in one section based on changes in size of another if the layout order of the two happened to be such that the former was visited before the latter. Therefore, we need to invalidate the fragments in all sections after each iteration of relaxation, and possibly further relax some of them in the next iteration. This fixes PR#45190. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76114	2020-03-17 14:48:05 -07:00
Simon Pilgrim	68224c1952	[TargetLowering] Only demand a rotation's modulo amount bits ISD::ROTL/ROTR rotation values are guaranteed to act as a modulo amount, so for power-of-2 bitwidths we only need the lowest bits. Differential Revision: https://reviews.llvm.org/D76201	2020-03-17 21:23:46 +00:00
Vedant Kumar	526c51e6fd	[DwarfDebug] Fix an assertion error when emitting call site info that combines two DW_OP_stack_values When compiling ``` struct S { float w; }; void f(long w, long b); void g(struct S s) { int w = s.w; f(w, w*4); } ``` I get Assertion failed: ((!CombinedExpr \|\| CombinedExpr->isValid()) && "Combined debug expression is invalid"). That's because we combine two expressions that both end in DW_OP_stack_value: ``` (lldb) p Expr->dump() !DIExpression(DW_OP_LLVM_convert, 32, DW_ATE_signed, DW_OP_LLVM_convert, 64, DW_ATE_signed, DW_OP_stack_value) (lldb) p Param.Expr->dump() !DIExpression(DW_OP_constu, 4, DW_OP_mul, DW_OP_LLVM_convert, 32, DW_ATE_signed, DW_OP_LLVM_convert, 64, DW_ATE_signed, DW_OP_stack_value) (lldb) p CombinedExpr->isValid() (bool) $0 = false (lldb) p CombinedExpr->dump() !DIExpression(4097, 32, 5, 4097, 64, 5, 16, 4, 30, 4097, 32, 5, 4097, 64, 5, 159, 159) ``` I believe that in this particular case combining two stack values is safe, but I didn't want to sink the special handling into DIExpression::append() because I do want everyone to think about what they are doing. Patch by Adrian Prantl. Fixes PR45181. rdar://problem/60383095 Differential Revision: https://reviews.llvm.org/D76164	2020-03-17 12:51:49 -07:00
Sanjay Patel	be9e3d9416	[InstCombine] reduce demand-limited bool math to logic, part 2 Follow-on suggested in: D75961	2020-03-17 15:18:18 -04:00
Sanjay Patel	586565c514	[InstCombine] add tests for bool math; NFC	2020-03-17 15:18:18 -04:00
Huihui Zhang	1bf0c99375	[ValueTracking][SVE] Fix isGEPKnownNonNull for scalable vector. Summary: DataLayout::getTypeAllocSize() returns TypeSize. For cases where the scalable property doesn't matter, we should explicitly call getKnownMinSize() to avoid implicit type conversion to uint64_t, which is not valid for scalable vector types. Reviewers: sdesmalen, efriedma, apazos, reames Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76260	2020-03-17 11:31:30 -07:00
Jin Lin	b9f1b8be1c	Revert "Support repeated machine outlining" This reverts commit `1f93b162fc`.	2020-03-17 10:03:27 -07:00
Sebastian Neubauer	6e29846b29	[AMDGPU] Fix whole wavefront mode We cannot move wwm over exec copies because the exec register needs an exact exec mask. Differential Revision: https://reviews.llvm.org/D76232	2020-03-17 17:23:23 +01:00
Jin Lin	1f93b162fc	Support repeated machine outlining Summary: This change allows machine outlining to be applied N times, where N is specified by a compiler option. By default the value of N is 1. The motivation is that repeated machine outlining can further reduce code size. Please refer to the presentation "Improving Swift Binary Size via Link Time Optimization" in LLVM Developers' Meeting in 2019. Reviewers: aschwaighofer, tellenbach, paquette Reviewed By: paquette Subscribers: tellenbach, hiraditya, llvm-commits, jinlin Tags: #llvm Differential Revision: https://reviews.llvm.org/D71027	2020-03-17 09:16:11 -07:00
Kang Zhang	9cd8db1c80	[NFC][PowerPC] Add 2 test cases to early-ret.mir to test BLR and BCCLR	2020-03-17 15:52:44 +00:00
Matt Arsenault	039c917b43	AMDGPU/GlobalISel: Fix asserting on gather4 intrinsics	2020-03-17 11:07:30 -04:00
Tyker	e8ac825f5b	[AssumeBundles] Detection of Empty bundles Summary: Prevent InstCombine from removing llvm.assume calls whose argument is true when they have operand bundles with useful information. Reviewers: jdoerfert, nikic, lebedev.ri Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76147	2020-03-17 15:50:15 +01:00
alex-t	48a9cf9043	[AMDGPU] Enable SEXT divergence driven selection. Summary: This change enables the divergence driven selection for the SEXT DAG opcode. Reviewers: vpykhtin, rampitec Reviewed By: vpykhtin Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Differential Revision: https://reviews.llvm.org/D76230	2020-03-17 17:30:11 +03:00
Simon Atanasyan	73b1da1605	[MIPS] Implement MIPS3D vector instructions Patch by Michael Roe. Differential Revision: https://reviews.llvm.org/D76247	2020-03-17 17:17:51 +03:00
Matt Arsenault	d0fe13ecf9	AMDGPU/GlobalISel: Fully handle 0 dmask case during legalize For normal loads, fully eliminate the load. For the TFE case, adjust the dmask value in the instruction so the selector doesn't need to handle it. For the TFE special case, I guess it would be possible to replace the loaded data register with undef, but as-is this will start treating it as a well defined value.	2020-03-17 10:15:30 -04:00
Matt Arsenault	d9a012ed8a	AMDGPU/GlobalISel: Adjust image load register type based on dmask Trim elements that won't be written. The equivalent still needs to be done for writes. Also start widening 3 elements to 4 elements. Selection will get the count from the dmask.	2020-03-17 10:09:18 -04:00
Matt Arsenault	83ffbf2618	AMDGPU/GlobalISel: Legalize non-a16 non-NSA images	2020-03-17 10:02:09 -04:00
Matt Arsenault	2aba9b6cf8	AMDGPU/GlobalISel: Legalize a16 images Pack the address registers in the legalizer. Avoid introducing a huge family of new intermediate operations by filling dead operands with noreg.	2020-03-17 10:02:09 -04:00
Florian Hahn	1d6f919df2	[SCCP] Explicitly mark values as overdefined (NFC). This was part of D60582 but can be committed separately.	2020-03-17 12:13:30 +00:00
John Brawn	c09368313c	[StackProtector] Catch direct out-of-bounds when checking address-takenness With -fstack-protector-strong we check if a non-array variable has its address taken in a way that could cause a potential out-of-bounds access. However what we don't catch is when the address is directly used to create an out-of-bounds memory access. Fix this by examining the offsets of GEPs that are ultimately derived from allocas and checking if the resulting address is out-of-bounds, and by checking that any memory operations using such addresses are not over-large. Fixes PR43478. Differential revision: https://reviews.llvm.org/D75695	2020-03-17 12:09:07 +00:00
Georgii Rymar	4dd5f1ca9b	[yaml2obj] - Add `ELFYAML::YAMLIntUInt` to fix how we parse a relocation `Addend` key. This patch makes `Relocation::Addend` an `ELFYAML::YAMLIntUInt` rather than an `int64_t`. `ELFYAML::YAMLIntUInt` is a new type with the following benefits/features: 1) For a 64-bit object, any hex/decimal addend in the range [INT64_MIN, UINT64_MAX] is accepted. 2) For a 32-bit object, any hex/decimal addend in the range [INT32_MIN, UINT32_MAX] is accepted. 3) Negative hex numbers like -0xffffffff are not accepted. 4) It is printed as decimal, i.e. obj2yaml will print something like "Addend: 125", which matches the current behavior. This fixes all FIXMEs in `relocation-addend.yaml`. Differential revision: https://reviews.llvm.org/D75527	2020-03-17 14:22:19 +03:00
Georgii Rymar	fe134b661b	[yaml2obj][test] - Ensure that dynamic section has sh_entsize correctly set. This updates the existing test because it lacks coverage. Differential revision: https://reviews.llvm.org/D76226	2020-03-17 13:32:01 +03:00
Georgii Rymar	0095200035	[obj2yaml][test] - Remove excessive missing_symtab.test test. This test uses a precompiled object and duplicates the functionality of a modern elf-no-symtab.yaml test that uses yaml2obj for producing inputs. Differential revision: https://reviews.llvm.org/D76217	2020-03-17 12:42:52 +03:00
Georgii Rymar	409cf4b7bf	[llvm-readobj][test] - Remove unused Offset key from reloc-types-*.test tests This is a follow-up for D75608. The `Offset` property is unused and can be removed to reduce tests. This patch does nothing with `reloc-types-elf-i386.test` which has a different structure and kind of tests the `Offset`. I think we might want to split it probably. Differential revision: https://reviews.llvm.org/D76195	2020-03-17 12:10:08 +03:00
Serguei Katkov	80c351cdb6	[InstCombine] Transform to undef incorrect atomic unordered mem intrinsics According to LangRef: If len is not a positive integer multiple of element_size, then the behaviour of the intrinsic is undefined. Add InstCombine rule to transform intrinsic to undef operation. This is a follow-up for D76116. Reviewers: reames Reviewed By: reames Subscribers: hiraditya, jfb, dantrushin, llvm-commits Differential Revision: https://reviews.llvm.org/D76215	2020-03-17 10:20:16 +07:00
Chen Zheng	fa72b29bec	[PowerPC] add test cases for target hook isProfitableToHoist - NFC	2020-03-16 23:07:30 -04:00
Shengchen Kan	39bcc76a92	[X86] Disable nop padding before instruction following hardcode Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: LuoYuanke Subscribers: annita.zhang, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D76176	2020-03-17 09:45:12 +08:00
Craig Topper	85726bbcba	[X86] Disable fast-isel call lowering for functions with vXi1 arguments on avx512. This fails an assert because the type is marked in the calling convention td file as needing promotion, but the code doesn't know how to do it. It is also much more complicated because we try to pass these in xmm/ymm/zmm registers. As of a few weeks ago we do this promotion from getRegisterTypeForCallingConv before the td file generated code gets involved.	2020-03-16 18:20:42 -07:00
Sriraman Tallam	c3f0ceab0f	Add target to test basicblock-sections-mir-parse.mir This test fails on ppc which was unintended. Revision: D73674.	2020-03-16 18:03:23 -07:00
Philip Reames	5f7772004b	[Tests] Add test coverage for prefix selection logic Note that I'm not asserting this code is correct; I'm simply adding coverage for what's there already. I'm reasonably sure the logic works for existing relaxable instructions, but I wouldn't be surprised if there were incorrect cases for other instructions. (i.e. is it legal to add prefixes to all instructions?)	2020-03-16 17:27:44 -07:00
Evgenii Stepanov	2a3723ef11	[memtag] Plug in stack safety analysis. Summary: Run StackSafetyAnalysis at the end of the IR pipeline and annotate proven safe allocas with !stack-safe metadata. Do not instrument such allocas in the AArch64StackTagging pass. Reviewers: pcc, vitalybuka, ostannard Reviewed By: vitalybuka Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, gilang, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73513	2020-03-16 16:35:25 -07:00
Sriraman Tallam	df082ac45a	Basic Block Sections support in LLVM. This is the second patch in a series of patches to enable basic block sections support. This patch adds support for: * Creating direct jumps at the end of basic blocks that have fall through instructions. * New pass, bbsections-prepare, that analyzes placement of basic blocks in sections. * Actual placing of a basic block in a unique section with special handling of exception handling blocks. * Supports placing a subset of basic blocks in a unique section. * Support for MIR serialization and deserialization with basic block sections. Parent patch : D68063 Differential Revision: https://reviews.llvm.org/D73674	2020-03-16 16:06:54 -07:00
Craig Topper	378b1e6080	[X86] Assign avx512bf16 instructions to the SSEPackedSingle ExeDomain.	2020-03-16 14:07:01 -07:00
Nico Weber	623cb95eb3	Revert "[InstSimplify] Simplify calls with "returned" attribute" This reverts commit `45555c3819`. Causes clang crashes in some cases, see comments on https://reviews.llvm.org/D75815 for details (including repro steps).	2020-03-16 15:21:30 -04:00
Francesco Petrogalli	0f2b68d9c7	Implement IR intrinsics for gather prefetch. Summary: Intrinsics and related codegen have been implemented for the following SVE instructions: 1. PRF<T> <prfop>, <Pg>, [<Xn\|SP>, <Zm>.S, <mod>] -> 32-bit scaled offset 2. PRF<T> <prfop>, <Pg>, [<Xn\|SP>, <Zm>.D, <mod>] -> 32-bit unpacked scaled offset 3. PRF<T> <prfop>, <Pg>, [<Xn\|SP>, <Zm>.D] -> 64-bit scaled offset 4. PRF<T> <prfop>, <Pg>, [<Zn>.S{, #<imm>}] -> 32-bit element 5. PRF<T> <prfop>, <Pg>, [<Zn>.D{, #<imm>}] -> 64-bit element The instructions are associated with the following intrinsics, respectively: 1. void @llvm.aarch64.sve.gather.prf<T>.scaled.<mod>.nx4vi32( i8* %base, <vscale x 4 x i32> %offset, <vscale x 4 x i1> %Pg, i32 %prfop) 2. void @llvm.aarch64.sve.gather.prf<T>.scaled.<mod>.nx2vi32( i8* %base, <vscale x 2 x i32> %offset, <vscale x 2 x i1> %Pg, i32 %prfop) 3. void @llvm.aarch64.sve.gather.prf<T>.scaled.nx2vi64( i8* %base, <vscale x 2 x i64> %offset, <vscale x 2 x i1> %Pg, i32 %prfop) 4. void @llvm.aarch64.sve.gather.prf<T>.nx4vi32( <vscale x 4 x i32> %bases, i64 %imm, <vscale x 4 x i1> %Pg, i32 %prfop) 5. void @llvm.aarch64.sve.gather.prf<T>.nx2vi64( <vscale x 2 x i64> %bases, i64 %imm, <vscale x 2 x i1> %Pg, i32 %prfop) The intrinsics are the IR counterpart of the following SVE ACLE functions: * void svprf<T>(svbool_t pg, const void *base, svprfop op) * void svprf<T>_vnum(svbool_t pg, const void *base, int64_t vnum, svprfop op) * void svprf<T>_gather[_u32base](svbool_t pg, svuint32_t bases, svprfop op) * void svprf<T>_gather[_u64base](svbool_t pg, svuint64_t bases, svprfop op) * void svprf<T>_gather_[s32]offset(svbool_t pg, const void *base, svint32_t offsets, svprfop op) * void svprf<T>_gather_[u32]offset(svbool_t pg, const void *base, svint32_t offsets, svprfop op) * void svprf<T>_gather_[s64]offset(svbool_t pg, const void *base, svint64_t offsets, svprfop op) * void svprf<T>_gather_[u64]offset(svbool_t pg, const void *base, svint64_t offsets, svprfop op) * void svprf<T>_gather[_u32base]_offset(svbool_t pg, svuint32_t bases, int64_t offset, svprfop op) * void svprf<T>_gather[_u64base]_offset(svbool_t pg, svuint64_t bases, int64_t offset, svprfop op) Reviewers: andwar, sdesmalen, efriedma, rengolin Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75580	2020-03-16 18:52:35 +00:00
Huihui Zhang	0616e9964b	[InstSimplify][SVE] Fix SimplifyGEPInst for scalable vector. Summary: Skip folds that rely on DataLayout::getTypeAllocSize(). For scalable vector, only minimal type alloc size is known at compile-time. Reviewers: sdesmalen, efriedma, spatel, apazos Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75892	2020-03-16 11:46:12 -07:00
Matt Arsenault	b0bdb186f5	Utils: Always set alignment when expanding mem intrinsics This was creating natural aligned loads and stores, which may not be the case. The target could request a wider type load with less alignment.	2020-03-16 14:34:29 -04:00
Nico Weber	9e48422035	Revert "[llvm-objdump] Display locations of variables alongside disassembly" Makes tests fail on Windows, see https://reviews.llvm.org/D70720#1924542 This reverts commit `3a5ddedadb`, and follow-ups: `f4cb9c919e` `042eb0482a` `c0cf5f5da9` `18649f4813` `f62b898c1f`	2020-03-16 14:04:25 -04:00
Simon Pilgrim	ebb181cf40	[X86] matchScalarReduction - add support for partial reductions Add optional support for opt-in partial reduction cases by providing an optional partial mask to indicate which elements have been extracted for the scalar reduction.	2020-03-16 18:01:02 +00:00
Matt Arsenault	80b627d69d	AMDGPU/GlobalISel: Fix handling of G_ANYEXT with s1 source We were letting G_ANYEXT with a vcc register bank through, which was incorrect and would select to an invalid copy. Fix this up like G_ZEXT and G_SEXT. Also drop old code to fixup the non-boolean case in RegBankSelect. We now have to perform that expansion during selection, so there's no benefit to doing it during RegBankSelect.	2020-03-16 12:59:54 -04:00
Matt Arsenault	c460dc6eeb	AMDGPU/GlobalISel: Fix some illegal scalar argument types Fixes integers that don't evenly divide to i32 pieces. We should probably extract some of the code in the legalizer to start handling argument breakdowns. I'm dissatisfied with the argument lowering's handling of vectors for example, and we should not be producing the weird G_EXTRACTs we do now.	2020-03-16 12:51:23 -04:00
Matt Arsenault	84386b2d8a	AMDGPU: Drop special case f64 fround lowering The result is better if ftrunc is emitted and separately legalized when unavailable.	2020-03-16 12:09:30 -04:00
Matt Arsenault	19a0350187	GlobalISel: Fix round lowering I used the implementation for floor instead of round. It also turns out the OpenCL builtin library wasn't using the round builtin, but implemented the expanded form.	2020-03-16 11:37:30 -04:00
Dominik Montada	8ff2dcb18b	[GlobalISel] add additional lowering support for G_INSERT Summary: Add lowering support for inserting pointers or scalars into scalars, vectors or pointers Reviewers: arsenm, dsanders Reviewed By: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75994	2020-03-16 16:27:17 +01:00
Guillaume Chatelet	4efec6e1c0	Revert "Disable memcpy-inline-fails.ll for windows" This reverts commit `adc2e250a1`.	2020-03-16 16:03:39 +01:00
Matt Arsenault	57d896e838	AMDGPU/GlobalISel: Make some large merges legal We allow up to 1024-bit registers, so we should support merges all the way to the maximum.	2020-03-16 10:49:10 -04:00
Fangrui Song	536ba6373f	[Object] Change ELFObjectFile<ELFT>::getFileFormatName() to use BFD names Follow-up for D74433 What the function returns are almost standard BFD names, except that "ELF" is in uppercase instead of lowercase. This patch changes "ELF" to "elf" and changes ARM/AArch64 to use their BFD names. MIPS and PPC64 have endianness differences as well, but this patch does not intend to address them. Advantages: * llvm-objdump: the "file format " line matches GNU objdump on ARM/AArch64 objects * "file format " line can be extracted and fed into llvm-objcopy -O literally. (https://github.com/ClangBuiltLinux/linux/issues/779 has such a use case) Affected tools: llvm-readobj, llvm-objdump, llvm-dwarfdump, MCJIT (internal implementation detail, not exposed) Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D76046	2020-03-16 07:42:04 -07:00
Simon Pilgrim	2b3b453a82	[TargetLowering] Only demand a funnelshift's modulo amount bits ISD::FSHL/FSHR shift amount values are guaranteed to act as a modulo amount, so for power-of-2 bitwidths we only need the lowest bits.	2020-03-16 13:52:17 +00:00
Dominik Montada	c0241f150d	[GlobalISel] combine G_TRUNC with G_MERGE_VALUES Summary: Truncating the result of a merge means that most likely we could have done without the merge in the first place and just used the merge inputs directly. This can be done in three cases: 1. If the truncation result is smaller than the merge source, we can use the source in the trunc directly 2. If the sizes are the same, we can replace the register or use a copy 3. If the truncation size is a multiple of the merge source size, we can build a smaller merge This gets rid of most of the larger, hard-to-legalize merges. Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, arsenm, Petar.Avramovic Reviewed By: arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, jrtc27, atanasyan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75915	2020-03-16 14:42:01 +01:00
Juneyoung Lee	7aecf2323c	[ExpandMemCmp] Correctly set alignment of generated loads Summary: This is a part of the series of efforts for correcting alignment of memory operations. (Other related bugs: https://bugs.llvm.org/show_bug.cgi?id=44388 , https://bugs.llvm.org/show_bug.cgi?id=44543 ) This fixes https://bugs.llvm.org/show_bug.cgi?id=43880 by setting the default alignment of loads to 1. The test CodeGen/AArch64/bcmp-inline-small.ll should have been changed; it was introduced by https://reviews.llvm.org/D64805 . I talked with @evandro, and confirmed that the test is okay to be changed. Two other tests from PowerPC needed changes as well, but the fixes were straightforward. Reviewers: courbet Reviewed By: courbet Subscribers: nlopes, gchatelet, wuzish, nemanjai, kristof.beyls, hiraditya, steven.zhang, danielkiss, llvm-commits, evandro Tags: #llvm Differential Revision: https://reviews.llvm.org/D76113	2020-03-16 22:39:48 +09:00
Juneyoung Lee	acdcd23b7b	Add tests to ExpandMemCmp/X86/memcmp.ll before submitting D76113	2020-03-16 22:19:37 +09:00
Guillaume Chatelet	adc2e250a1	Disable memcpy-inline-fails.ll for windows	2020-03-16 14:10:08 +01:00
Georgii Rymar	46c34447f8	[yaml2obj][test] - Fix comments in ELF/program-header-address.yaml test. NFC. This addresses post-commit comments suggested by James Henderson for D76131.	2020-03-16 16:07:10 +03:00
Oliver Stannard	f4cb9c919e	Disable llvm-objdump --debug-vars tests on Windows These tests passed in my Windows 10 VM, but are failing on Windows bots with errors which look related to unicode encodings. Disable the tests on Windows for now.	2020-03-16 12:50:01 +00:00
Jonas Paulsson	132f25bcca	[SystemZ] Avoid scalarization of [SU]INT_TO_FP ISD-nodes. The type legalizer will scalarize vector conversions from integer to floating point if the source element size is less than that of the result. This is avoided now by inserting a zero/sign-extension of the source vector before type legalization. Review: Ulrich Weigand Differential revision: https://reviews.llvm.org/D75978	2020-03-16 13:07:42 +01:00
Oliver Stannard	2878c66938	Don't run PowerPC objdump tests when PowerPC backend not built	2020-03-16 12:00:33 +00:00
Oliver Stannard	161f70eae6	Don't run ARM objdump tests when ARM backend not built	2020-03-16 11:24:19 +00:00
David Stenberg	02b6a3c349	[DebugInfo] Handle generic type DW_OP_convert ops in dsymutil Summary: This is a preparatory change for allowing LLVM to emit DW_OP_convert operations converting to the generic type. If DW_OP_convert's operand is 0, it converts the top of the stack to the generic type, as specified by DWARFv5 section 2.5.1.6: "[...] takes one operand, which is an unsigned LEB128 integer that represents the offset of a debugging information entry in the current compilation unit, or value 0 which represents the generic type." This adds support for such operations to dsymutil. Reviewers: aprantl, markus, friss, JDevlieghere Reviewed By: aprantl, JDevlieghere Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D76142	2020-03-16 12:16:37 +01:00
Oliver Stannard	3a5ddedadb	[llvm-objdump] Display locations of variables alongside disassembly This adds the --debug-vars option to llvm-objdump, which prints locations (registers/memory) of source-level variables alongside the disassembly based on DWARF info. A vertical line is printed for each live-range, with a label at the top giving the variable name and location, and the position and length of the line indicating the program counter range in which it is valid. Currently, this only works for object files, not executables or shared libraries. Differential revision: https://reviews.llvm.org/D70720	2020-03-16 10:54:40 +00:00
David Stenberg	c93652517c	[DebugInfo] Handle generic type DW_OP_convert ops in llvm-dwarfdump Summary: This is a preparatory change for allowing LLVM to emit DW_OP_convert operations converting to the generic type. If DW_OP_convert's operand is 0, it converts the top of stack to the generic type, as specified by DWARFv5 section 2.5.1.6: "[...] takes one operand, which is an unsigned LEB128 integer that represents the offset of a debugging information entry in the current compilation unit, or value 0 which represents the generic type." This adds support for such operations to llvm-dwarfdump. Reviewers: aprantl, markus, jdoerfert, jhenderson Reviewed By: aprantl Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D76141	2020-03-16 11:24:01 +01:00
Rui Ueyama	a2923b2a1e	Implement CET Shadow Stack (Intel Controlflow Enforcement Technology) support on Windows Patch by Petr Penzin. Windows support for CET is limited to shadow stack, which is enabled by setting a PE bit in the linker. Docs: MSVC linker flag: https://docs.microsoft.com/en-us/cpp/build/reference/cetcompat?view=vs-2019 IMAGE_DLLCHARACTERISTICS_EX_CET_COMPAT PE bit: https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#extended-dll-characteristics Differential Revision: https://reviews.llvm.org/D70606	2020-03-16 17:51:32 +09:00
Georgii Rymar	2005c60a6b	[obj2yaml][test] - Simplify call-graph-profile-section.yaml. NFCI. Use yaml2obj -D to simplify. We started to use it for other tests recently.	2020-03-16 11:47:10 +03:00
Shengchen Kan	d2b522f173	[NFC][X86] Simplify test cases for branch align	2020-03-16 16:30:29 +08:00
Simon Atanasyan	e0ab0e6a28	[MIPS] Implement PUL.PS and PUU.PS instructions Patch by Michael Roe. Differential Revision: https://reviews.llvm.org/D75812	2020-03-16 09:39:47 +03:00
Serguei Katkov	ad643d5e93	[Verifier] Remove invalid verifier check According to LangRef for unordered atomic memory transfer intrinsics "The first three arguments are the same as they are in the @llvm.memcpy intrinsic, with the added constraint that len is required to be a positive integer multiple of the element_size. If len is not a positive integer multiple of element_size, then the behaviour of the intrinsic is undefined." So a len that is not a multiple of the element size is simply undefined behavior, and the verifier should not complain about it, as undefined behavior is allowed in LLVM IR. This change removes the verifier check for this condition. Reviewers: reames Reviewed By: reames Subscribers: dantrushin, hiraditya, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D76116	2020-03-16 12:00:08 +07:00
Juneyoung Lee	6ad63606ea	[CodeGenPrepare] Freeze condition when transforming select to br Summary: This is a simple fix for CodeGenPrepare that freezes the branch condition when transforming select to branch. If it is not frozen, instsimplify or the later pipeline can potentially exploit undefined behavior. The diff shows the optimized form because D75859 and D76048 already made a few changes to CodeGenPrepare for optimizing freeze(cmp). Reviewers: jdoerfert, spatel, lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76179	2020-03-16 12:46:20 +09:00
Juneyoung Lee	4ffe3ac729	Revert "[CodeGenPrepare] Freeze condition when transforming select to br" This reverts commit `10aa7ea951`.	2020-03-16 12:45:54 +09:00
Philip Reames	a79863f2f7	Support prefix padding for alignment purposes (Relaxable instructions only) Now that D75203 has landed and baked for a few days, extend the basic approach to prefix padding as well. The patch itself is fairly straightforward. For the moment, this patch adds the functional support and some basic testing thereof, but defaults to not enabling prefix padding. I want to be able to phrase a separate patch which adds the target specific reasoning and test it cleanly. I haven't decided whether I want to common it with the nop logic or not. Differential Revision: https://reviews.llvm.org/D75300	2020-03-15 19:53:41 -07:00
QingShan Zhang	f84beee9b8	[NFC][Test] Add three tests to verify the behavior of ab-cd if there is multi-uses	2020-03-16 01:58:49 +00:00
Fangrui Song	ecd6d7254e	[test] llvm/test/: change llvm-objdump single-dash long options to double-dash options As announced here: http://lists.llvm.org/pipermail/llvm-dev/2019-April/131786.html Grouped option syntax (POSIX Utility Conventions) does not play well with -long-option. A subsequent change will reject -long-option.	2020-03-15 17:46:23 -07:00
Craig Topper	b2da1ddaef	[X86] Add a non-zero cost for truncating v32i16->v32i8 on avx512bw.	2020-03-15 17:18:46 -07:00
Fangrui Song	6ed18eaa77	[llvm-objdump][test] Change llvm-objdump tests to use double dash options	2020-03-15 16:01:26 -07:00
Fangrui Song	b1cdada023	[llvm-objdump][test] Move {AArch64,ARM}/* to ELF/ARM/ or MachO/ARM/ and {AMDGPU,Hexagon,Mips,powerPC}/ to ELF/	2020-03-15 15:18:33 -07:00
Fangrui Song	d385133249	[llvm-objdump][test] Move {AArch64,X86}/macho-* to MachO/	2020-03-15 15:05:12 -07:00
Matt Arsenault	ce33926342	AMDGPU/GlobalISel: Remove -global-isel-abort=0 from some tests	2020-03-15 17:22:34 -04:00
Matt Arsenault	fe6037172b	AMDGPU/GlobalISel: Add more tests for G_SADDE/G_SSUBE These don't work, but add baseline tests.	2020-03-15 16:54:40 -04:00
Matt Arsenault	79cda46e49	AMDGPU/GlobalISel: Add baseline test for mul	2020-03-15 14:53:51 -04:00
Matt Arsenault	de5b2cfdd4	AMDGPU/GlobalISel: Add baseline test for mul	2020-03-15 14:50:36 -04:00
Simon Pilgrim	3ffb5ef7b0	[PowerPC] Regenerate rotate tests	2020-03-15 18:29:00 +00:00
Simon Pilgrim	1ec395523d	[Thumb2] Regenerate rotate tests	2020-03-15 18:28:54 +00:00

69776 Commits