llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	61ced4b87a	GlobalISel: Handle 'n' inline asm constraint	2020-07-26 09:30:41 -04:00
Changpeng Fang	9162b70e51	DADCombiner: Don't simplify the token factor if the node's number of operands already exceeds TokenFactorInlineLimit Summary: In parallelizeChainedStores, a TokenFactor was created with the size greater than 3000. We found that DAGCombiner::visitTokenFactor will consume a huge amount of time on such nodes. Since the number of operands already exceeds TokenFactorInlineLimit, we propose to give up simplification with the consideration of compile time. Reviewers: @spatel, @arsenm Differential Revision: https://reviews.llvm.org/D84204	2020-07-25 21:20:59 -07:00
Eric Christopher	18975762c1	Fold StatepointBB into checks as it's only used from an NDEBUG or ASSERT context fixing an unused variable warning.	2020-07-25 18:36:53 -07:00
Philip Reames	55dae9c20c	[Statepoints] Style cleanup after `3da1a963` [NFC] Just fixing a few minor stylistic issues.	2020-07-25 16:40:39 -07:00
Philip Reames	3da1a9634e	[Statepoints] Support lowering gc relocations to virtual registers (Disabled under flag for the moment) This is part of a larger project wherein we are finally integrating lowering of gc live operands with the register allocator. Today, we force spill all operands in SelectionDAG. The code to do so is distinctly non-optimal. The approach this patch is working towards is to instead lower the relocations directly into the MI form, and let the register allocator pick which ones get spilled and which stack slots they get spilled to. In terms of performance, the later part is actually more important as it avoids redundant shuffling of values between stack slots. This particular change adds ISEL support to produce the variadic def STATEPOINT form required by the above. In particular, the first N are lowered to variadic tied def/use pairs. So new statepoint looks like this: reloc1,reloc2,... = STATEPOINT ..., base1, derived1<tied-def0>, base2, derived2<tied-def1>, ... N is limited by the maximal number of tied registers machine instruction can have (15 at the moment). The current patch is restricted to handling relocations within a single basic block. Cross block relocations (e.g. invokes) are handled via the legacy mechanism. This restriction will be relaxed in future patches. Patch By: dantrushin Differential Revision: https://reviews.llvm.org/D81648	2020-07-25 14:26:05 -07:00
Matt Arsenault	4b53072ee5	GlobalISel: Define mulfix/divfix opcodes The full expansion involves the funnel shifts, which depend on another patch to expand those.	2020-07-24 20:02:20 -04:00
Nicolai Hähnle	5934df0c9a	MachineBasicBlock: add printName method Common up some existing MBB name printing logic into a single place. Note that basic block dumping now prints the same set of attributes as the MIRPrinter. Change-Id: I8f022bbd922e831bc96d63143d7472c03282530b Differential Revision: https://reviews.llvm.org/D83253	2020-07-24 18:18:09 +02:00
Djordje Todorovic	6371a0a00e	[DWARF][EntryValues] Emit GNU extensions in the case of DWARF 4 + SCE Emit DWARF 5 call-site symbols even though DWARF 4 is set, only in the case of LLDB tuning. This patch addresses PR46643. Differential Revision: https://reviews.llvm.org/D83463	2020-07-24 14:33:57 +02:00
Simon Pilgrim	0128b9505c	Revert rG5dd566b7c7b78bd- "PassManager.h - remove unnecessary Function.h/Module.h includes. NFCI." This reverts commit `5dd566b7c7`. Causing some buildbot failures that I'm not seeing on MSVC builds.	2020-07-24 13:02:33 +01:00
Simon Pilgrim	5dd566b7c7	PassManager.h - remove unnecessary Function.h/Module.h includes. NFCI. PassManager.h is one of the top headers in the ClangBuildAnalyzer frontend worst offenders list. This exposes a large number of implicit dependencies on various forward declarations/includes in other headers that need addressing.	2020-07-24 12:40:50 +01:00
Djordje Todorovic	cbb3571b0d	[DWARF] Avoid entry_values production for SCE SONY debugger does not prefer debug entry values feature, so the plan is to avoid production of the entry values by default when the tuning is SCE debugger. The feature still can be enabled with the -debug-entry-values option for the testing/development purposes. This patch addresses PR46643. Differential Revision: https://reviews.llvm.org/D83462	2020-07-24 13:34:05 +02:00
Craig Topper	8131e19064	[LegalizeTypes] Teach DAGTypeLegalizer::GenWidenVectorLoads to pad with undef if needed when concatenating small or loads to match a larger load In the included test case the align 16 allowed the v23f32 load to handled as load v16f32, load v4f32, and load v4f32(one element not used). These loads all need to be concatenated together into a final vector. In this case we tried to concatenate the two v4f32 loads to match the type of the v16f32 load so we could do a second concat_vectors, but those loads alone only add up to v8f32. So we need to two v4f32 undefs to pad it. It appears we've tried to hack around a similar issue in this code before by adding undef padding to loads in one of the earlier loops in this function. Originally in r147964 by padding all loads narrower than previous loads to the same size. Later modifed to only the last load in r293088. This patch removes that earlier code and just handles it on demand where we know we need it. Fixes PR46820 Differential Revision: https://reviews.llvm.org/D84463	2020-07-23 19:02:03 -07:00
Matt Arsenault	891759db73	GlobalISel: Add scalarSameSizeAs LegalizeRule Widen or narrow a type to a type with the same scalar size as another. This can be used to force G_PTR_ADD/G_PTRMASK's scalar operand to match the bitwidth of the pointer type. Use this to disallow narrower types for G_PTRMASK.	2020-07-23 21:17:31 -04:00
Amara Emerson	645e7fc542	[GlobalISel] Use existing MIR builder instead of creating one in combiner.	2020-07-23 14:16:45 -07:00
Amara Emerson	3b10e42ba1	[AArch64][GlobalISel] Add post-legalize combine for sext(trunc(sextload)) -> trunc/copy On AArch64 we generate redundant G_SEXTs or G_SEXT_INREGs because of this. Differential Revision: https://reviews.llvm.org/D81993	2020-07-23 12:06:35 -07:00
Nikita Popov	deb4bb2b3a	[IR] Add min/max/abs intrinsics This adds the llvm.abs(), llvm.umin(), llvm.umax(), llvm.smin(), and llvm.smax() intrinsics specified in D81829. For SelectionDAG, the ISD opcodes and all the legalization and lowering already exist, so this just wires them up to the intrinsic in the SDAG builder and adds rudimentary tests. For GlobalISel only the min/max intrinsics are wired up, as llvm.abs() will require the addition of a G_ABS op, and corresponding legalization support. Differential Revision: https://reviews.llvm.org/D84125	2020-07-23 20:56:19 +02:00
Mircea Trofin	302e91baf4	[llvm][NFC] Add comments and common-case API to MachineBlockFrequencyInfo Clarify the relation between a block's BlockFrequency and the getEntryFreq() API, and added an API for the relatively common case of finding a block's frequency relative to the entrypoint. Added / moved some comments to header. Differential Revision: https://reviews.llvm.org/D84357	2020-07-23 08:42:34 -07:00
Evgeny Leviant	dc619f3d7a	[CodeGen][TargetPassConfig] Add unreachable-mbb-elimination pass explicitly Differential revision: https://reviews.llvm.org/D84228	2020-07-23 18:05:11 +03:00
Jay Foad	b35833b84e	[GlobalISel][AMDGPU] Legalize saturating add/subtract Add support in LegalizerHelper for lowering G_SADDSAT etc. either using add/subtract-with-overflow or using max/min instructions. Enable this lowering for AMDGPU so it can be tested. The legalization rules are still approximate and skips out on using the clamp bit to treat these as legal, which has never been used before. This also doesn't yet try to deal with expanding SALU cases.	2020-07-23 09:06:42 -04:00
Simon Pilgrim	1003113ef0	Fix -Wparentheses warning - add missing brackets around the entire assertion condition	2020-07-23 13:33:24 +01:00
Konstantin Schwarz	931488779f	[GlobalISel][InlineAsm] Add register class ID to the flags of register input operands Summary: We do this already for output operands, but missed it for (non-tied) input operands. Reviewers: arsenm, Petar.Avramovic Reviewed By: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, llvm-commits, kerbowa Tags: #llvm Differential Revision: https://reviews.llvm.org/D83763	2020-07-23 13:35:01 +02:00
Florian Hahn	6c9da995fc	[ScheduleDAGRRList] Pacify overload mismatch in std::min. On systems where size() doesn't return unsigned long, this leads to an overloading mismatch. Convert the constant to whatever type is used for Q.size() on the system.	2020-07-23 11:56:50 +01:00
Florian Hahn	2f8e6b5f3c	[ScheduleDAGRRList] Limit number of candidates to explore. Currently popFromQueueImpl iterates over all candidates to find the best one. While the candidate queue is small, this is not a problem. But it becomes a problem once the queue gets larger. For example, the snippet below takes 330s to compile with llc -O0, but completes in 3s with this patch. define void @test(i4000000* %ptr) { entry: store i4000000 0, i4000000* %ptr, align 4 ret void } This patch limits the number of candidates to check to 1000. This limit ensures that it never triggers for test-suite/SPEC2000/SPEC2006 on X86 and AArch64 with -O3, while still drastically limiting the compile-time in case of very large queues. It would be even better to use a binary heap to manage to queue (D83335), but some heuristics change the score of a node in the queue after another node has been scheduled. I plan to address this for backends that use the MachineScheduler in the future, but that requires a more careful evaluation. In the meantime, the limit should help users impacted by this issue. The patch includes a slightly smaller version of the motivating example as test case, to guard against the issue. Reviewers: efriedma, paquette, niravd Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84328	2020-07-23 11:35:33 +01:00
Sourabh Singh Tomar	8998f8ab66	[DebugInfo] Attempt to fix regression test failure after `59a76d957a` Test case `test/CodeGen/WebAssembly/stackified-debug.ll` was failing due to malformed DwarfExpression. This failure has been seen in lot of bots, for instance in: http://lab.llvm.org:8011/builders/lld-x86_64-ubuntu-fast/builds/18794 : 'RUN: at line 1' /home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/build/bin/llc /home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/build/bin/FileCheck /home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/llvm-project/llvm/test/CodeGen/WebAssembly/stackified-debug.ll home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/llvm-project/llvm/test/CodeGen/WebAssembly/stackified-debug.ll:26:10: error: CHECK: expected string not found in input CHECK: .int16 4 # Loc expr size ^ <stdin>:34:2: note: scanning from here .int16 3 # Loc expr size Differential Revision: https://reviews.llvm.org/D83560	2020-07-23 14:55:30 +05:30
Sourabh Singh Tomar	59a76d957a	Re-apply:" Emit DW_OP_implicit_value for Floating point constants" This patch was reverted in `9d2da6759b` due to assertion failure seen in `test/DebugInfo/Sparc/subreg.ll`. Assertion failure was happening due to malformed/unhandeled DwarfExpression. Differential Revision: https://reviews.llvm.org/D83560	2020-07-23 13:56:20 +05:30
Sourabh Singh Tomar	9d2da6759b	Revert "[DebugInfo] Emit DW_OP_implicit_value for Floating point constants" This reverts commit `6b55a95898`. Temporal revert due to a failing/assertion in test case in Sparc backend. `test/DebugInfo/Sparc/subreg.ll` Seen in lot of bots, for instance in: `http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/24679`	2020-07-23 08:50:01 +05:30
Sourabh Singh Tomar	6b55a95898	[DebugInfo] Emit DW_OP_implicit_value for Floating point constants Summary: llvm is missing support for DW_OP_implicit_value operation. DW_OP_implicit_value op is indispensable for cases such as optimized out long double variables. For intro refer: DWARFv5 Spec Pg: 40 2.6.1.1.4 Implicit Location Descriptions Consider the following example: ``` int main() { long double ld = 3.14; printf("dummy\n"); ld *= ld; return 0; } ``` when compiled with tunk `clang` as `clang test.c -g -O1` produces following location description of variable `ld`: ``` DW_AT_location (0x00000000: [0x0000000000201691, 0x000000000020169b): DW_OP_constu 0xc8f5c28f5c28f800, DW_OP_stack_value, DW_OP_piece 0x8, DW_OP_constu 0x4000, DW_OP_stack_value, DW_OP_bit_piece 0x10 0x40, DW_OP_stack_value) DW_AT_name ("ld") ``` Here one may notice that this representation is incorrect(DWARF4 stack could only hold integers(and only up to the size of address)). Here the variable size itself is `128` bit. GDB and LLDB confirms this: ``` (gdb) p ld $1 = <invalid float value> (lldb) frame variable ld (long double) ld = <extracting data from value failed> ``` GCC represents/uses DW_OP_implicit_value in these sort of situations. Based on the discussion with Jakub Jelinek regarding GCC's motivation for using this, I concluded that DW_OP_implicit_value is most appropriate in this case. Link: https://gcc.gnu.org/pipermail/gcc/2020-July/233057.html GDB seems happy after this patch:(LLDB doesn't have support for DW_OP_implicit_value) ``` (gdb) p ld p ld $1 = 3.14000000000000012434 ``` Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D83560	2020-07-23 07:21:49 +05:30
Christopher Tetreault	ae35c09c34	[MVT] Fix getTypeForEVT for v64f16 and v128f16 Summary: These should have half float as the element type Reviewers: cameron.mcinally, efriedma, sdesmalen, paulwalker-arm Reviewed By: paulwalker-arm Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D84211	2020-07-22 14:27:08 -07:00
David Blaikie	5c2451785d	DebugInfo: Use debug_line.dwo for debug_macro.dwo This is an alternative proposal to D81476 (and D82084) - the details were sufficiently confusing to me it seemed easier to write some code and see how it looks. Reviewers: SouraVX Differential Revision: https://reviews.llvm.org/D84278	2020-07-22 14:06:33 -07:00
Mircea Trofin	111a018b36	[llvm][NFC] const-ed MachineBlockFrequencyInfo::isIrrLoopHeader	2020-07-22 13:06:34 -07:00
Andrew Litteken	bcbc6117b5	[CGP] Add Pass Dependencies Add pass dependecies: - TargetTransformInfoWrapperPass - TargetPassConfig - LoopInfoWrapperPass - TargetLibraryInfoWrapperPass To fix inconsistencies when passes are added to the pipeline. Reviewers: efriedma, kmclaughlin, paquette Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84346	2020-07-22 12:02:53 -07:00
Simon Pilgrim	1c060aa988	DwarfCompileUnit.cpp - remove duplicate includes that already exist in DwarfCompileUnit.h. NFC. Also remove DIE.h include from DwarfCompileUnit.h and replace with forward declarations.	2020-07-22 19:25:27 +01:00
Simon Pilgrim	cd0a36bbda	CodeViewDebug.cpp - remove duplicate includes that already exist in CodeViewDebug.h. NFC.	2020-07-22 19:25:27 +01:00
Matt Arsenault	b98f902f18	GlobalISel: Restructure argument lowering loop in handleAssignments This was structured in a way that implied every split argument is in memory, or in registers. It is possible to pass an original argument partially in registers, and partially in memory. Transpose the logic here to only consider a single piece at a time. Every individual CCValAssign should be treated independently, and any merge to original value needs to be handled later. This is in preparation for merging some preprocessing hacks in the AMDGPU calling convention lowering into the generic code. I'm also not sure what the correct behavior for memlocs where the promoted size is larger than the original value. I've opted to clamp the memory access size to not exceed the value register to avoid the explicit trunc/extend/vector widen/vector extract instruction. This happens for AMDGPU for i8 arguments that end up stack passed, which are promoted to i16 (I think this is a preexisting DAG bug though, and they should not really be promoted when in memory).	2020-07-22 13:31:11 -04:00
jasonliu	b98b1700ef	[XCOFF] Enable symbol alias for AIX Summary: AIX assembly's .set directive is not usable for aliasing purpose. We need to use extra-label-at-defintion strategy to generate symbol aliasing on AIX. Reviewed By: DiggerLin, Xiangling_L Differential Revision: https://reviews.llvm.org/D83252	2020-07-22 14:03:55 +00:00
Simon Pilgrim	fa95688237	SelectionDAGBuilder.cpp - remove duplicate includes that already exist in SelectionDAGBuilder.h. NFC.	2020-07-22 14:19:41 +01:00
OCHyams	ce6de3747b	[DebugInfo] Drop location ranges for variables which exist entirely outside the variable's scope Summary: This patch reduces file size in debug builds by dropping variable locations a debugger user will not see. After building the debug entity history map we loop through it. For each variable we look at each entry. If the entry opens a location range which does not intersect any of the variable's scope's ranges then we mark it for removal. After visiting the entries for each variable we also mark any clobbering entries which will no longer be referenced for removal, and then finally erase the marked entries. This all requires the ability to query the order of instructions, so before this runs we number them. Tests: Added llvm/test/DebugInfo/X86/trim-var-locs.mir Modified llvm/test/DebugInfo/COFF/register-variables.ll Branch folding merges the tails of if.then and if.else into if.else. Each blocks' debug-locations point to different scopes so when they're merged we can't use either. Because of this the variable 'c' ends up with a location range which doesn't cover any instructions in its scope; with the patch applied the location range is dropped and its flag changes to IsOptimizedOut. Modified llvm/test/DebugInfo/X86/live-debug-variables.ll Modified llvm/test/DebugInfo/ARM/PR26163.ll In both tests an out of scope location is now removed. The remaining location covers the entire scope of the variable allowing us to emit it as a single location. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D82129	2020-07-22 12:45:21 +01:00
Matt Arsenault	bf6bc62d1f	GlobalISel: Use Register and update comment physical register syntax	2020-07-21 19:11:57 -04:00
Amara Emerson	791544422a	Revert "[AArch64][GlobalISel] Add post-legalize combine for sext_inreg(trunc(sextload)) -> copy" This reverts commit `64eb3a4915`. It caused miscompiles with optimizations enabled. Reverting while I investigate.	2020-07-21 16:01:18 -07:00
Matt Arsenault	7cd8a0256d	GlobalISel: Legalize G_FPOWI	2020-07-21 18:13:04 -04:00
Matt Arsenault	7941dc5041	GlobalISel: Translate llvm.powi intrinsic There are a few questionable things about this intrinsic and existing DAG implementation. For some reason the intrinsic hardcodes the second operand to be scalar-only i32, and SelectionDAG builder makes a legalization decision based on whether the operand is constant.	2020-07-21 18:13:04 -04:00
Matt Arsenault	f659c44016	CodeGen: Add support for lowering byref attribute	2020-07-21 17:38:15 -04:00
Matt Arsenault	2fe0ea8261	DAG: Handle expanding strict_fsub into fneg and strict_fadd The AMDGPU handling of f16 vectors is terrible still since it gets scalarized even when the vector operation is legal. The code is is essentially duplicated between the non-strict and strict case. Apparently no other expansions are currently trying to do this. This is mostly because I found the behavior of getStrictFPOperationAction to be confusing. In the ARM case, it would expand strict_fsub even though it shouldn't due to the later check. At that point, the logic required to check for legality was more complex than just duplicating the 2 instruction expansion.	2020-07-21 16:17:10 -04:00
Guozhi Wei	28759e9fcc	[MBP] Use profile count to compute tail dup cost if it is available Current tail duplication in machine block placement pass uses block frequency information in cost model. But frequency number has only relative meaning compared to other basic blocks in the same function. A large frequency number doesn't mean it is hot and a small frequency number doesn't mean it is cold. To overcome this problem, this patch uses profile count in cost model if it's available. So we can tail duplicate real hot basic blocks. Differential Revision: https://reviews.llvm.org/D83265	2020-07-21 11:18:06 -07:00
David Blaikie	38fbba4cb8	DebugInfo: Move getMD5AsBytes from DwarfUnit to DwarfDebug It wasn't using any state from DwarfUnit anyway.	2020-07-20 19:21:39 -07:00
Matt Arsenault	1ef3ed0eb4	GlobalISel: Rewrite getLCMType Try to make the behavior more consistent with getGCDType, and bias towards returning something closer to the source type whenever there's an ambiguity.	2020-07-20 21:06:30 -04:00
Matt Arsenault	12d5bec8c7	GlobalISel: Handle more cases in getGCDType Try harder to find a canonical unmerge type when trying to cover the desired target type. Handle finding a compatible unmerge type for two vectors with different element types. This will return the largest multiple of the source vector element that will evenly divide the target vector type. Also make the handling mixing scalars and vectors, and prefer the source element type as the unmerge target type.	2020-07-20 20:53:35 -04:00
Eli Friedman	b8f765a1e1	[AArch64][SVE] Add support for trunc to <vscale x N x i1>. This isn't a natively supported operation, so convert it to a mask+compare. In addition to the operation itself, fix up some surrounding stuff to make the testcase work: we need concat_vectors on i1 vectors, we need legalization of i1 vector truncates, and we need to fix up all the relevant uses of getVectorNumElements(). Differential Revision: https://reviews.llvm.org/D83811	2020-07-20 13:11:02 -07:00
Yuanfang Chen	efcb8a1903	[NFC] remove unneeded TargetLoweringObjectFile init after `85c30f3374`	2020-07-20 10:43:28 -07:00
Yuanfang Chen	589c646a7e	[llc] (almost) remove `--print-machineinstrs` Its effect could be achieved by `-stop-after`,`-print-after`,`-print-after-all`. But a few tests need to print MIR after ISel which could not be done with `-print-after`/`-stop-after` since isel pass does not have commandline name. That's the reason `--print-machineinstrs` is downgraded to `--print-after-isel` in this patch. `--print-after-isel` could be removed after we switch to new pass manager since isel pass would have a commandline text name to use `print-after` or equivalent switches. The motivation of this patch is to reduce tests dependency on would-be-deprecated feature. Reviewed By: arsenm, dsanders Differential Revision: https://reviews.llvm.org/D83275	2020-07-20 10:43:28 -07:00
Alok Kumar Sharma	2d10258a31	[DebugInfo] Support for DW_AT_associated and DW_AT_allocated. Summary: This support is needed for the Fortran array variables with pointer/allocatable attribute. This support enables debugger to identify the status of variable whether that is currently allocated/associated. for pointer array (before allocation/association) without DW_AT_associated (gdb) pt ptr type = integer (140737345375288:140737354129776) (gdb) p ptr value requires 35017956 bytes, which is more than max-value-size with DW_AT_associated (gdb) pt ptr type = integer (:) (gdb) p ptr $1 = <not associated> for allocatable array (before allocation) without DW_AT_allocated (gdb) pt arr type = integer (140737345375288:140737354129776) (gdb) p arr value requires 35017956 bytes, which is more than max-value-size with DW_AT_allocated (gdb) pt arr type = integer, allocatable (:) (gdb) p arr $1 = <not allocated> Testing - unit test cases added - check-llvm - check-debuginfo Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D83544	2020-07-20 19:54:35 +05:30
Petar Avramovic	6a1030aa0e	AMDGPU/GlobalISel: Legalize s16->s64 G_FPEXT Legalize using narrowScalar as s16->s32 G_FPEXT followed by s32->s64 G_FPEXT. Differential Revision: https://reviews.llvm.org/D84030	2020-07-20 16:12:19 +02:00
Matt Arsenault	5cbd4e415e	GlobalISel: Don't handle widenScalar for vector G_INSERT This handling didn't make any sense for vectors.	2020-07-20 10:06:18 -04:00
Matt Arsenault	a679f27e98	GlobalISel: Consistently get TII from MIRBuilder	2020-07-20 10:06:18 -04:00
Petar Avramovic	ba938f6388	AMDGPU/GlobalISel: Legalize s16->s64 G_FPTOSI/G_FPTOUI Add narrowScalarFor action. Add narrow scalar for typeIndex == 0 for G_FPTOSI/G_FPTOUI. Legalize using narrowScalarFor as s16->s32 G_FPTOSI/G_FPTOUI followed by s32->s64 G_SEXT/G_ZEXT. Differential Revision: https://reviews.llvm.org/D84010	2020-07-20 11:06:11 +02:00
Evgeny Leviant	24089928be	[CodeGen][TargetPassConfig] Add TargetTransformInfo pass correctly Patch adds tti pass directly enforcing its execution with correctly set TargetTransformInfo. Differential revision: https://reviews.llvm.org/D84047	2020-07-18 14:11:40 +03:00
Aditya Nandakumar	63c081e73d	[GISel: Add support for CSEing SrcOps which are immediates https://reviews.llvm.org/D84072 Add G_EXTRACT to CSEConfigFull and add unit test as well.	2020-07-17 16:04:24 -07:00
Sam Tebbs	6c348e4067	[HWLoops] Stop converting to a while loop when it would be unsafe to There were cases where a do-while loop would be converted to a while loop before finding out that it would be unsafe to expand the SCEV in this situation and then bailing out of hardware loop conversion. This patch checks if it would be unsafe to expand the SCEV and if so stops converting the do-while into a while, allowing conversion to a hardware loop. Differential Revision: https://reviews.llvm.org/D83953	2020-07-17 11:47:08 +01:00
Jay Foad	62fd7f767c	[MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics tryLatency compares two sched candidates. For the top zone it prefers the one with lesser depth, but only if that depth is greater than the total latency of the instructions we've already scheduled -- otherwise its latency would be hidden and there would be no stall. Unfortunately it only tests the depth of one of the candidates. This can lead to situations where the TopDepthReduce heuristic does not kick in, but a lower priority heuristic chooses the other candidate, whose depth is greater than the already scheduled latency, which causes a stall. The fix is to apply the heuristic if the depth of either candidate is greater than the already scheduled latency. All this also applies to the BotHeightReduce heuristic in the bottom zone. Differential Revision: https://reviews.llvm.org/D72392	2020-07-17 11:02:13 +01:00
Florian Hahn	e297006d6f	[ScheduleDAG] Move DBG_VALUEs after first term forward. MBBs are not allowed to have non-terminator instructions after the first terminator. Currently in some cases (see the modified test), EmitSchedule can add DBG_VALUEs after the last terminator, for example when referring a debug value that gets folded into a TCRETURN instruction on ARM. This patch updates EmitSchedule to move inserted DBG_VALUEs just before the first terminator. I am not sure if there are terminators produce values that can in turn be used by a DBG_VALUE. In that case, moving the DBG_VALUE might result in referencing an undefined register. But in any case, it seems like currently there is no way to insert a proper DBG_VALUEs for such registers anyways. Alternatively it might make sense to just remove those extra DBG_VALUES. I am not too familiar with the details of debug info in the backend and would appreciate any suggestions on how to address the issue in the best possible way. Reviewers: vsk, aprantl, jpaquette, efriedma, paquette Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D83561	2020-07-17 10:27:43 +01:00
Igor Kudrin	f76a0cd97a	[DebugInfo] Fix a misleading usage of DWARF forms with DIEExpr. NFCI. For now, DIEExpr is used only in two places: 1) in the debug info library unit test suite to emit a DW_AT_str_offsets_base attribute with the DW_FORM_sec_offset form, see dwarfgen::DIE::addStrOffsetsBaseAttribute(); 2) in DwarfCompileUnit::addLocationAttribute() to generate the location attribute for a TLS variable. The later case used an incorrect DWARF form of DW_FORM_udata, which implies storing an uleb128 value, not a 4/8 byte constant. The generated result was as expected because DIEExpr::SizeOf() did not handle the used form, but returned the size of the code pointer by default. The patch fixes the issue by using more appropriate DWARF forms for the problematic case and making DIEExpr::SizeOf() more straightforward. Differential Revision: https://reviews.llvm.org/D83958	2020-07-17 13:49:27 +07:00
Denis Antrushin	e04fe9aefd	[Statepoint] Fix bug found by sanitaizer. Statepoint has no static operands, so it cannot be verified against MCInstrDescr. Revert NumDefs change introduced by `ef658ebd62`.	2020-07-16 23:06:53 +03:00
Nadav Rotem	a394aa1b97	[LiveVariables] Replace std::vector with SmallVector. Replace std::vector with SmallVector to reduce the number of mallocs. This method is frequently executed, and the number of elements in the vector is typically small. https://reviews.llvm.org/D83920	2020-07-16 11:39:54 -07:00
Matt Arsenault	9d3e56e2ee	DAG: Try scalarizing when expanding saturating add/sub In an upcoming AMDGPU patch, the scalar cases will be legal and vector ops should be scalarized, rather than producing a long sequence of vector ops which will also need to be scalarized. Use a lazy heuristic that seems to work and improves the thumb2 MVE test.	2020-07-16 14:05:16 -04:00
Denis Antrushin	ef658ebd62	MIR Statepoint refactoring. Part 1: Basic MI level changes. Basic support for variadic-def MIR Statepoint: - Change TableGen STATEPOINT description to variadic out list (For self-documentation purpose; by itself it does not affect code generation in any way). - Update StatepointOpers helper class to handle variadic defs. - Update MachineVerifier to properly handle them, too. With this change, new Statepoint instruction can be passed through backend (excluding ISEL) without errors. Full change set is available at D81603. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D81645	2020-07-17 00:57:21 +07:00
Matt Arsenault	023883a834	IR: Rename Argument::hasPassPointeeByValueAttr to prepare for byref When the byref attribute is added, there will need to be two similar functions for the existing cases which have an associate value copy, and byref which does not. Most, but not all of the existing uses will use the existing version. The associated size function added by D82679 also needs to contextually differ, and will help eliminate a few places still relying on pointee element types.	2020-07-16 13:50:49 -04:00
Petar Avramovic	6850033ca6	AMDGPU/GlobalISel: Legalize s64->s16 G_SITOFP/G_UITOFP Add widenScalar for TypeIdx == 0 for G_SITOFP/G_UITOFP. Legailize, using widenScalar, as s64->s32 G_SITOFP/G_UITOFP followed by s32->s16 G_FPTRUNC. Differential Revision: https://reviews.llvm.org/D83880	2020-07-16 16:31:57 +02:00
James Y Knight	60433c63ac	Remove TwoAddressInstructionPass::sink3AddrInstruction. This function has a bug which will incorrectly reschedule instructions after an INLINEASM_BR (which can branch). (The bug may also allow scheduling past a throwing-CALL, I'm not certain.) I could fix that bug, but, as the removed FIXME notes, it's better to attempt rescheduling before converting to 3-addr form, as that may remove the need to convert in the first place. In fact, the code to do such reordering was added to this pass only a few months later, in 2011, via the addition of the function rescheduleMIBelowKill. That code does not contain the same bug. The removal of the sink3AddrInstruction function is not a no-op: in some cases it would move an instruction post-conversion, when rescheduleMIBelowKill would not move the instruction pre-converison. However, this does not appear to be important: the machine instruction scheduler can reorder the after-conversion instructions, in any case. This patch fixes a kernel panic 4.4 LTS x86_64 Linux kernels, when built with clang after `4b0aa5724f`. Link: https://github.com/ClangBuiltLinux/linux/issues/1085 Differential Revision: https://reviews.llvm.org/D83708	2020-07-16 10:02:52 -04:00
Kerry McLaughlin	2762da0a16	[SVE][CodeGen] Legalisation of masked loads and stores Summary: This patch modifies IncrementMemoryAddress to use a vscale when calculating the new address if the data type is scalable. Also adds tablegen patterns which match an extract_subvector of a legal predicate type with zip1/zip2 instructions Reviewers: sdesmalen, efriedma, david-arm Reviewed By: efriedma, david-arm Subscribers: tschuett, hiraditya, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83137	2020-07-16 10:55:45 +01:00
Quentin Colombet	294be6b5d3	[CalcSpillWeights] Propagate the fact that a live-interval is not spillable When we calculate the weight of a live-interval, add some code to check if the original live-interval was markied as not spillable and if so, progagate that information down to the new interval. Previously we would just recompute a weight for the new interval, thus, we could in theory just spill live-intervals marked as not spillable by just splitting them. That goes against the spirit of a non-spillable live-interval. E.g., previously we could do: v1 = // v1 must not be spilled ... = v1 Split: v1 = // v1 must not be spilled ... v2 = v1 // v2 can be spilled ... v3 = v2 // v3 can be spilled = v3 There's no test case for that one as we would need to split a non-spillable live-interval without using LiveRangeEdit to see this happening. RegAlloc inserts non-spillable intervals only as part of the spilling mechanism, thus at this point the intervals are not splittable anymore. On top of that, RegAlloc uses the LiveRangeEdit API, which already properly propagate that information. In other words, this could only happen if a target was to mark a live-interval as not spillable before register allocation and split it without using LRE, e.g., through LiveIntervals::splitSeparateComponent.	2020-07-15 17:57:36 -07:00
Hiroshi Yamauchi	f233b92f92	[PGO][PGSO] Add profile guided size optimization to LegalizeDAG. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83333	2020-07-15 10:03:38 -07:00
Cameron McInally	ae51a70030	[Legalize] Hoist invariant condition in ExpandVectorBuildThroughStack(...) The operands of a BUILD_VECTOR must all have the same type, so we can hoist this invariant condition out of the loop. Differential Revision: https://reviews.llvm.org/D83882	2020-07-15 11:05:20 -05:00
Tim Northover	37b96d51d0	CodeGenPrep: remove AssertingVH references before deleting dead instructions. CodeGenPrepare keeps fairly close track of various instructions it's seen, particularly GEPs, in maps and vectors. However, sometimes those instructions become dead and get removed while it's still executing. This triggers AssertingVH references to them in an asserts build and could lead to miscompiles in a release build (I've only seen a later segfault though). So this patch adds a callback to RecursivelyDeleteTriviallyDeadInstructions which can make sure the instruction about to be deleted is removed from CodeGenPrepare's data structures.	2020-07-15 15:19:21 +01:00
Tim Northover	5165b2b5fd	AArch64+ARM: make LLVM consider system registers volatile. Some of the system registers readable on AArch64 and ARM platforms return different values with each read (for example a timer counter), these shouldn't be hoisted outside loops or otherwise interfered with, but the normal @llvm.read_register intrinsic is only considered to read memory. This introduces a separate @llvm.read_volatile_register intrinsic and maps all system-registers on ARM platforms to use it for the __builtin_arm_rsr calls. Registers declared with asm("r9") or similar are unaffected.	2020-07-15 09:47:36 +01:00
Roger Ferrer Ibanez	14bc5e149d	[DAGCombiner] Rebuild (setcc x, y, ==) from (xor (xor x, y), 1) The existing code already considered this case. Unfortunately a typo in the condition prevents it from triggering. Also the existing code, had it run, forgot to do the folding. This fixes PR42876. Differential Revision: https://reviews.llvm.org/D65802	2020-07-15 07:34:22 +00:00
Krzysztof Pszeniczny	c3e6555616	Call Frame Information (CFI) Handling for Basic Block Sections This patch handles CFI with basic block sections, which unlike DebugInfo does not support ranges. The DWARF standard explicitly requires emitting separate CFI Frame Descriptor Entries for each contiguous fragment of a function. Thus, the CFI information for all callee-saved registers (possibly including the frame pointer, if necessary) have to be emitted along with redefining the Call Frame Address (CFA), viz. where the current frame starts. CFI directives are emitted in FDE’s in the object file with a low_pc, high_pc specification. So, a single FDE must point to a contiguous code region unlike debug info which has the support for ranges. This is what complicates CFI for basic block sections. Now, what happens when we start placing individual basic blocks in unique sections: * Basic block sections allow the linker to randomly reorder basic blocks in the address space such that a given basic block can become non-contiguous with the original function. * The different basic block sections can no longer share the cfi_startproc and cfi_endproc directives. So, each basic block section should emit this independently. * Each (cfi_startproc, cfi_endproc) directive will result in a new FDE that caters to that basic block section. * Now, this basic block section needs to duplicate the information from the entry block to compute the CFA as it is an independent entity. It cannot refer to the FDE of the original function and hence must duplicate all the stuff that is needed to compute the CFA on its own. * We are working on a de-duplication patch that can share common information in FDEs in a CIE (Common Information Entry) and we will present this as a follow up patch. This can significantly reduce the duplication overhead and is particularly useful when several basic block sections are created. * The CFI directives are emitted similarly for registers that are pushed onto the stack, like callee saved registers in the prologue. There are cfi directives that emit how to retrieve the value of the register at that point when the push happened. This has to be duplicated too in a basic block that is floated as a separate section. Differential Revision: https://reviews.llvm.org/D79978	2020-07-14 12:54:12 -07:00
Logan Smith	a19461d9e1	[NFC] Add 'override' keyword where missing in include/ and lib/. This fixes warnings raised by Clang's new -Wsuggest-override, in preparation for enabling that warning in the LLVM build. This patch also removes the virtual keyword where redundant, but only in places where doing so improves consistency within a given file. It also removes a couple unnecessary virtual destructor declarations in derived classes where the destructor inherited from the base class is already virtual. Differential Revision: https://reviews.llvm.org/D83709	2020-07-14 09:47:29 -07:00
Paul Walker	6e198aae1d	[SelectionDAG] Prevent warnings when extracting fixed length vector from scalable. ComputeNumSignBits and computeKnownBits both trigger "Scalable flag may be dropped" warnings when a fixed length vector is extracted from a scalable vector. This patch assumes nothing about the demanded elements thus matching the behaviour when extracting a scalable vector from a scalable vector. Differential Revision: https://reviews.llvm.org/D83642	2020-07-14 11:12:56 +00:00
Sam Elliott	1d15bbb9d9	Revert "[RISCV] Avoid Splitting MBB in RISCVExpandPseudo" This reverts commit `97106f9d80`. This is based on feedback from https://reviews.llvm.org/D82988#2147105	2020-07-14 11:15:01 +01:00
David Sherwood	3b8eaf26db	[SVE][CodeGen] Fix implicit TypeSize->uint64_t conversion in TransformFPLoadStorePair In DAGCombiner::TransformFPLoadStorePair we were dropping the scalable property of TypeSize when trying to create an integer type of equivalent size. In fact, this optimisation makes no sense for scalable types since we don't know the size at compile time. I have changed the code to bail out when encountering scalable type sizes. I've added a test to llvm/test/CodeGen/AArch64/sve-fp.ll that exercises this code path. The test already emits an error if it encounters warnings due to implicit TypeSize->uint64_t conversions. Differential Revision: https://reviews.llvm.org/D83572	2020-07-14 08:07:30 +01:00
Amara Emerson	64eb3a4915	[AArch64][GlobalISel] Add post-legalize combine for sext_inreg(trunc(sextload)) -> copy On AArch64 we generate redundant G_SEXTs or G_SEXT_INREGs because of this. Differential Revision: https://reviews.llvm.org/D81993	2020-07-13 20:27:45 -07:00
zuojian lin	fefe6a6642	Fix undefined behavior in DWARF emission Caused by uninitialized load of llvm::DwarfDebug::PrevCU: llvm::DwarfCompileUnit::addRange () at ../lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:276 llvm::DwarfDebug::endFunctionImpl () at ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:1586 llvm::DebugHandlerBase::endFunction () at ../lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp:319 llvm::AsmPrinter::EmitFunctionBody () at ../lib/CodeGen/AsmPrinter/AsmPrinter.cpp:1230 llvm::ARMAsmPrinter::runOnMachineFunction () at ../lib/Target/ARM/ARMAsmPrinter.cpp:161 Most of the DebugInfo tests under `LLVM_LIT_ARGS:STRING=-sv --vg` prior to this fix, and pass with the fix applied. Reviewed By: aprantl, dblaikie Differential Revision: https://reviews.llvm.org/D81631	2020-07-13 18:32:36 -07:00
Matt Arsenault	23ec773d19	GlobalISel: Implement fewerElementsVector for saturating add/sub	2020-07-13 14:46:40 -04:00
Matt Arsenault	6a8c11a11f	GlobalISel: Implement widenScalar for saturating add/sub Add a placeholder legality rule for AMDGPU until the rest of the actions are handled.	2020-07-13 14:46:40 -04:00
Sanjay Patel	8779b11410	[DAGCombiner] rot i16 X, 8 --> bswap X We have this generic transform in IR (instcombine), but as shown in PR41098: http://bugs.llvm.org/PR41098 ...the pattern may emerge in codegen too. x86 has a potential refinement/reversal opportunity here, but that should come later or needs a target hook to avoid the transform. Converting to bswap is the more specific form, so we should use it if it is available.	2020-07-13 12:01:53 -04:00
Sanjay Patel	2df46a5743	[DAGCombiner] allow load/store merging if pairs can be rotated into place This carves out an exception for a pair of consecutive loads that are reversed from the consecutive order of a pair of stores. All of the existing profitability/legality checks for the memops remain between the 2 altered hunks of code. This should give us the same x86 base-case asm that gcc gets in PR41098 and PR44895: http://bugs.llvm.org/PR41098 http://bugs.llvm.org/PR44895 I think we are missing a potential subsequent conversion to use "movbe" if the target supports that. That might be similar to what AArch64 would use to get "rev16". Differential Revision: https://reviews.llvm.org/D83567	2020-07-13 08:57:00 -04:00
Sanjay Patel	f1bbf3acb4	Revert "[DAGCombiner] allow load/store merging if pairs can be rotated into place" This reverts commit `591a3af5c7`. The commit message was cut off and failed to include the review citation.	2020-07-13 08:55:29 -04:00
Sanjay Patel	591a3af5c7	[DAGCombiner] allow load/store merging if pairs can be rotated into place This carves out an exception for a pair of consecutive loads that are reversed from the consecutive order of a pair of stores. All of the existing profitability/legality checks for the memops remain between the 2 altered hunks of code. This should give us the same x86 base-case asm that gcc gets in PR41098 and PR44895:i http://bugs.llvm.org/PR41098 http://bugs.llvm.org/PR44895 I think we are missing a potential subsequent conversion to use "movbe" if the target supports that. That might be similar to what AArch64 would use to get "rev16". Differential Revision:	2020-07-13 08:53:06 -04:00
Kerry McLaughlin	afcc9a81d2	[SVE][Codegen] Add a helper function for pointer increment logic Summary: Helper used when splitting load & store operations to calculate the pointer + offset for the high half of the split Reviewers: efriedma, sdesmalen, david-arm Reviewed By: efriedma Subscribers: tschuett, hiraditya, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83577	2020-07-13 10:53:40 +01:00
Petar Avramovic	fd85b40aee	[GlobalISel][InlineAsm] Fix buildCopy for inputs Check that input size matches size of destination reg class. Attempt to extend input size when needed. Differential Revision: https://reviews.llvm.org/D83384	2020-07-13 10:52:33 +02:00
Sanjay Patel	39009a8245	[DAGCombiner] tighten fast-math constraints for fma fold fadd (fma A, B, (fmul C, D)), E --> fma A, B, (fma C, D, E) This is only allowed when "reassoc" is present on the fadd. As discussed in D80801, this transform goes beyond what is allowed by "contract" FMF (-ffp-contract=fast). That is because we are fusing the trailing add of 'E' with a multiply, but without "reassoc", the code mandates that the products AB and CD are added together before adding in 'E'. I've added this example to the LangRef to try to clarify the meaning of "contract". If that seems reasonable, we should probably do something similar for the clang docs because there does not appear to be any formal spec for the behavior of -ffp-contract=fast. Differential Revision: https://reviews.llvm.org/D82499	2020-07-12 08:51:49 -04:00
Alexandre Ganea	b71499ac9e	Revert "Re-land [CodeView] Add full repro to LF_BUILDINFO record" This reverts commit `add59ecb34` and `41d2813a5f`.	2020-07-10 19:46:16 -04:00
Alexandre Ganea	add59ecb34	Re-land [CodeView] Add full repro to LF_BUILDINFO record This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable). Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding. For more information see PR36198 and D43002. Differential Revision: https://reviews.llvm.org/D80833	2020-07-10 13:59:28 -04:00
Sanjay Patel	02fec9d2a5	[DAGCombiner] move/rename variables for readability; NFC	2020-07-10 11:28:51 -04:00
David Sherwood	da731894a2	[CodeGen] Replace calls to getVectorNumElements() in DAGTypeLegalizer::SetSplitVector In DAGTypeLegalizer::SetSplitVector I have changed calls in the assert from getVectorNumElements() to getVectorElementCount(), since this code path works for both fixed and scalable vectors. This fixes up one warning in the test: sve-sext-zext.ll Differential Revision: https://reviews.llvm.org/D83196	2020-07-10 08:29:17 +01:00
David Sherwood	229dfb4728	[CodeGen] Replace calls to getVectorNumElements() in SelectionDAG::SplitVector This patch replaces some invalid calls to getVectorNumElements() with calls to getVectorMinNumElements() instead, since the code paths changed in this patch work for both fixed and scalable vector types. Fixes warnings in this test: sve-sext-zext.ll Differential Revision: https://reviews.llvm.org/D83203	2020-07-10 08:11:30 +01:00
Sanjay Patel	a46cf40240	[DAGCombiner] convert if-chain in store merging to switch; NFC	2020-07-09 17:20:04 -04:00
Sanjay Patel	b476e6a642	[DAGCombiner] add helper function for store merging of loaded values; NFC	2020-07-09 17:20:04 -04:00
Sanjay Patel	f98a602c2e	[DAGCombiner] add helper function for store merging of extracts; NFC	2020-07-09 17:20:03 -04:00
Sanjay Patel	8d74cb01b7	[DAGCombiner] add helper function for store merging of constants; NFC	2020-07-09 17:20:03 -04:00
Sanjay Patel	6890e2a17b	[DAGCombiner] add helper function to manage list of consecutive stores; NFC	2020-07-09 17:20:03 -04:00
Christopher Tetreault	ff5b9a7b3b	[SVE] Remove calls to VectorType::getNumElements from CodeGen Reviewers: efriedma, fpetrogalli, sdesmalen, RKSimon, arsenm Reviewed By: RKSimon Subscribers: wdng, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82210	2020-07-09 12:43:36 -07:00
Sam Elliott	97106f9d80	[RISCV] Avoid Splitting MBB in RISCVExpandPseudo Since the `RISCVExpandPseudo` pass has been split from `RISCVExpandAtomicPseudo` pass, it would be nice to run the former as early as possible (The latter has to be run as late as possible to ensure correctness). Running earlier means we can reschedule these pairs as we see fit. Running earlier in the machine pass pipeline is good, but would mean teaching many more passes about `hasLabelMustBeEmitted`. Splitting the basic blocks also pessimises possible optimisations because some optimisations are MBB-local, and others are disabled if the block has its address taken (which is notionally what `hasLabelMustBeEmitted` means). This patch uses a new approach of setting the pre-instruction symbol on the AUIPC instruction to a temporary symbol and referencing that. This avoids splitting the basic block, but allows us to reference exactly the instruction that we need to. Notionally, this approach seems more correct because we do actually want to address a specific instruction. This then allows the pass to be moved much earlier in the pass pipeline, before both scheduling and register allocation. However, to do so we must leave the MIR in SSA form (by not redefining registers), and so use a virtual register for the intermediate value. By using this virtual register, this pass now has to come before register allocation. Reviewed By: luismarques, asb Differential Revision: https://reviews.llvm.org/D82988	2020-07-09 13:54:13 +01:00
Lucas Prates	fc39a9ca0e	[CodeGen] Matching promoted type for 16-bit integer bitcasts from fp16 operand Summary: When legalizing a biscast operation from an fp16 operand to an i16 on a target that requires both input and output types to be promoted to 32-bits, an assertion can fail when building the new node due to a mismatch between the the operation's result size and the type specified to the node. This patches fix the issue by making sure the bit width of the types match for the FP_TO_FP16 node, covering the difference with an extra ANYEXTEND operation. Reviewers: ostannard, efriedma, pirama, jmolloy, plotfi Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82552	2020-07-09 09:46:17 +01:00
serge-sans-paille	a60c31fd62	Fix return status of AtomicExpandPass Correctly reflect change in the return status. Differential Revision: https://reviews.llvm.org/D83457	2020-07-09 10:27:48 +02:00
Qiu Chaofan	4254ed5c32	[Legalizer] Fix wrong operand in split vector helper This should be a typo introduced in D69275, which may cause an unknown segment fault in getNode. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D83376	2020-07-09 09:57:29 +08:00
Matt Arsenault	18bd821f02	DAG: Remove redundant finalizeLowering call 9cac4e6d1403554b06ec2fc9d834087b1234b695/D32628 intended to eliminate this, and move all isel pseudo expansion to FinalizeISel. This was a bad rebase or something, and failed to actually delete this call. GlobalISel also has a redundant call of finalizeLowering. However, it requires more work to remove it since it currently triggers a lot of verifier errors in tests.	2020-07-08 18:48:20 -04:00
Matt Arsenault	2ec5fc0c61	DAG: Remove redundant handling of reg fixups It looks like `9cac4e6d14` accidentally added a second copy of this from a bad rebase or something. This second copy was added, and the finalizeLowering call was not deleted as intended.	2020-07-08 18:32:43 -04:00
Matt Arsenault	74a148ad39	GlobalISel: Verify G_BITCAST changes the type Updated the AArch64 tests the best I could with my vague, inferred understanding of AArch64 register banks. As far as I can tell, there is only one 32-bit/64-bit type which will use the gpr register bank, so we have to use the fpr bank for the other operand.	2020-07-08 17:16:27 -04:00
Sanjay Patel	1265eb2d5f	[DAGCombiner] clean up in mergeConsecutiveStores(); NFC	2020-07-08 14:48:05 -04:00
Sanjay Patel	12c2271e53	[DAGCombiner] fix code comment and improve readability; NFC	2020-07-08 14:48:05 -04:00
Sanjay Patel	683a7f7025	[DAGCombiner] fix function-name formatting; NFC	2020-07-08 12:49:59 -04:00
Sanjay Patel	39329d5724	[DAGCombiner] add enum for store source value; NFC This removes existing code duplication and allows us to assert that we are handling the expected cases. We have a list of outstanding bugs that could benefit by handling truncated source values, so that's a possible addition going forward.	2020-07-08 12:49:59 -04:00
Evgeny Leviant	a074984250	[MIR] Speedup parsing of function with large number of basic blocks Patch eliminates string length calculation when lexing a token. Speedup can be up to 1000x. Differential revision: https://reviews.llvm.org/D83389	2020-07-08 18:50:00 +03:00
Paul Walker	bb35f0fd89	[SelectionDAG] Fix incorrect offset when expanding CONCAT_VECTORS. ExpandVectorBuildThroughStack is also used for CONCAT_VECTORS. However, when calculating the offsets for each of the operands we incorrectly use the element size rather than actual size and thus the stores overlap. Differential Revision: https://reviews.llvm.org/D83303	2020-07-08 15:39:25 +00:00
Ties Stuij	26a22478cd	[CodeGen] Don't combine extract + concat vectors with non-legal types Summary: The following combine currently breaks in the DAGCombiner: ``` extract_vector_elt (concat_vectors v4i16:a, v4i16:b), x -> extract_vector_elt a, x ``` This happens because after we have combined these nodes we have inserted nodes that use individual instances of the vector element type. In the above example i16. However this isn't a legal type on all backends, and when the combining pass calls the legalizer it breaks as it expects types to already be legal. The type legalizer has already been run, and running it again would make a mess of the nodes. In the example code at least, the generated code is still efficient after the change. Reviewers: miyuki, arsenm, dmgreen, lebedev.ri Reviewed By: miyuki, lebedev.ri Subscribers: lebedev.ri, wdng, hiraditya, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83231	2020-07-08 15:29:57 +01:00
Petar Avramovic	419c92a749	[GlobalISel][InlineAsm] Fix matching input constraints to mem operand Mark matching input constraint to mem operand as not supported. Differential Revision: https://reviews.llvm.org/D83235	2020-07-08 12:32:17 +02:00
Jeremy Morse	b9d977b0ca	[DWARF] Add cuttoff guarding quadratic validThroughout behaviour Occasionally we see absolutely massive basic blocks, typically in global constructors that are vulnerable to heavy inlining. When these blocks are dense with DBG_VALUE instructions, we can hit near quadratic complexity in DwarfDebug's validThroughout function. The problem is caused by: * validThroughout having to step through all instructions in the block to examine their lexical scope, * and a high proportion of instructions in that block being DBG_VALUEs for a unique variable fragment, Leading to us stepping through every instruction in the block, for (nearly) each instruction in the block. By adding this guard, we force variables in large blocks to use a location list rather than a single-location expression, as shown in the added test. This shouldn't change the meaning of the output DWARF at all: instead we use a less efficient DWARF encoding to avoid a poor-performance code path. Differential Revision: https://reviews.llvm.org/D83236	2020-07-08 10:30:09 +01:00
David Sherwood	9e66e9c30a	[CodeGen] Fix wrong use of getVectorNumElements() in DAGTypeLegalizer::SplitVecRes_ExtendOp In DAGTypeLegalizer::SplitVecRes_ExtendOp I have replaced an invalid call to getVectorNumElements() with a call to getVectorMinNumElements(), since the code path works for both fixed and scalable vectors. This fixes up a warning in the following test: sve-sext-zext.ll Differential Revision: https://reviews.llvm.org/D83197	2020-07-08 09:53:20 +01:00
David Sherwood	5b14f5051f	[CodeGen] Fix wrong use of getVectorNumElements in PromoteIntRes_EXTRACT_SUBVECTOR Calling getVectorNumElements() is not safe for scalable vectors and we should normally use getVectorElementCount() instead. However, for the code changed in this patch I decided to simply move the instantiation of the variable 'OutNumElems' lower down to the place where only fixed-width vectors are used, and hence it is safe to call getVectorNumElements(). Fixes up one warning in this test: sve-sext-zext.ll Differential Revision: https://reviews.llvm.org/D83195	2020-07-08 09:36:34 +01:00
David Sherwood	15aeb805dc	[CodeGen] Fix warnings in sve-ld1-addressing-mode-reg-imm.ll For the GetElementPtr case in function AddressingModeMatcher::matchOperationAddr I've changed the code to use the TypeSize class instead of relying upon the implicit conversion to a uint64_t. As part of this we now check for scalable types and if we encounter one just bail out for now as the subsequent optimisations doesn't currently support them. This changes fixes up all warnings in the following tests: llvm/test/CodeGen/AArch64/sve-ld1-addressing-mode-reg-imm.ll llvm/test/CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll Differential Revision: https://reviews.llvm.org/D83124	2020-07-08 09:16:00 +01:00
Heejin Ahn	7e6793aa33	[WebAssembly] Generate unreachable after __stack_chk_fail `__stack_chk_fail` does not return, but `unreachable` was not generated following `call __stack_chk_fail`. This had a possibility to generate an invalid binary for functions with a return type, because `__stack_chk_fail`'s return type is void and `call __stack_chk_fail` can be the last instruction in the function whose return type is non-void. Generating `unreachable` after it makes sure CFGStackify's `fixEndsAtEndOfFunction` handles it correctly. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D83277	2020-07-08 01:02:05 -07:00
serge-sans-paille	edc7da2405	Upgrade TypePromotionTransaction to be able to report changes in CodeGenPrepare optimizeMemoryInst was reporting no change while still modifying the IR. Inspect the status of TypePromotionTransaction to get a better status. Related to https://reviews.llvm.org/D80916 Differential Revision: https://reviews.llvm.org/D81256	2020-07-08 08:35:44 +02:00
Philip Reames	22596e7b2f	[Statepoint] Use early return to reduce nesting and clarify comments [NFC]	2020-07-07 16:19:05 -07:00
Philip Reames	9955876d74	[Statepoint] Reduce intendation and change a variable name [NFC]	2020-07-07 16:19:05 -07:00
Matt Arsenault	23157f3bdb	GlobalISel: Handle EVT argument lowering correctly handleAssignments was assuming every argument type is an MVT, and assignArg would always fail. This fixes one of the hacks in the current AMDGPU calling convention code that pre-processes the arguments.	2020-07-07 16:36:14 -04:00
Philip Reames	b172cd7812	[Statepoint] Factor out logic for non-stack non-vreg lowering [almost NFC] This is inspired by D81648. The basic idea is to have the set of SDValues which are lowered as either constants or direct frame references explicit in one place, and to separate them clearly from the spilling logic. This is not NFC in that the handling of constants larger than > 64 bit has changed. The old lowering would crash on values which could not be encoded as a sign extended 64 bit value. The new lowering just spills all constants > 64 bits. We could be consistent about doing the sext(Con64) optimization, but I happen to know that this code path is utterly unexercised in practice, so simple is better for now.	2020-07-07 13:34:28 -07:00
Stanislav Mekhanoshin	7c03872645	LIS: fix handleMove to properly extend main range handleMoveDown or handleMoveUp cannot properly repair a main range of a LiveInterval since they only get LiveRange. There is a problem if certain use has moved few segments away and there is a hole in the main range in between of these two locations. We may get a SubRange with a very extended Segment spanning several Segments of the main range and also spanning that hole. If that happens then we end up with the main range not covering its SubRange which is an error. It might be possible to attempt fixing the main range in place just between of the old and new index by extending all of its Segments in between, but it is unclear this logic will be faster than just straight constructMainRangeFromSubranges, which itself is pretty cheap since it only contains interval logic. That will also require shrinkToUses() call after which is probably even more expensive. In the test second move is from 64B to 92B for the sub1. Subrange is correctly fixed: L000000000000000C [16r,32B:0)[32B,92r:1) 0@16r 1@32B-phi But the main range has a hole in between 80d and 88r after updateRange(): %1 [16r,32B:0)[32B,80r:4)[80r,80d:3)[88r,96r:1)[96r,160B:2) Since source position is 64B this segment is not even considered by the updateRange(). Differential Revision: https://reviews.llvm.org/D82916	2020-07-07 11:52:32 -07:00
Kerry McLaughlin	cdf2eef613	[SVE][CodeGen] Legalisation of unpredicated store instructions Summary: When splitting a store of a scalable type, the new address is calculated in SplitVecOp_STORE using a vscale and an add instruction. Reviewers: sdesmalen, efriedma, david-arm Reviewed By: david-arm Subscribers: tschuett, hiraditya, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83041	2020-07-07 11:47:10 +01:00
Kerry McLaughlin	5e8084beba	[SVE][CodeGen] Legalisation of unpredicated load instructions Summary: When splitting a load of a scalable type, the new address is calculated in SplitVecRes_LOAD using a vscale and an add instruction. This patch also adds a DAG combiner fold to visitADD for vscale: - Fold (add (vscale(C0)), (vscale(C1))) to (add (vscale(C0 + C1))) Reviewers: sdesmalen, efriedma, david-arm Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82792	2020-07-07 11:05:03 +01:00
David Sherwood	79d34a5a1b	[SVE][CodeGen] Fix bug when falling back to DAG ISel In an earlier commit `584d0d5c17` I added functionality to allow AArch64 CodeGen support for falling back to DAG ISel when Global ISel encounters scalable vector types. However, it seems that we were not falling back early enough as llvm::getLLTForType was still being invoked for scalable vector types. I've added a new fallback function to the call lowering class in order to catch this problem early enough, rather than wait for lowerFormalArguments to reject scalable vector types. Differential Revision: https://reviews.llvm.org/D82524	2020-07-07 09:23:04 +01:00
David Sherwood	c061e56e88	[CodeGen] Fix warnings in sve-vector-splat.ll and sve-trunc.ll This patch fixes all remaining warnings in: llvm/test/CodeGen/AArch64/sve-trunc.ll llvm/test/CodeGen/AArch64/sve-vector-splat.ll I hit some warnings related to getCopyPartsToVector. I fixed two issues: 1. In widenVectorToPartType() we assumed that we'd always be using BUILD_VECTOR nodes to expand from one vector type to another, which is incorrect for scalable vector types. I've fixed this for now by simply bailing out immediately for scalable vectors. 2. In getCopyToPartsVector() I've changed the code to compare the element counts of different types. Differential Revision: https://reviews.llvm.org/D83028	2020-07-07 09:21:47 +01:00
Sanjay Patel	ea71ba11ab	[DAGCombiner] reassociate reciprocal sqrt expression to eliminate FP division X / (fabs(A) * sqrt(Z)) --> X / sqrt(AAZ) --> X * rsqrt(AAZ) In the motivating case from PR46406: https://bugs.llvm.org/show_bug.cgi?id=46406 ...this is restoring the sequence that was originally in the source code. We extracted a term from within the sqrt because we do not know in instcombine whether a target will expand a sqrt call. Note: we could say that the transform in IR should be restricted, but that would not solve the problem if the source was originally in the pattern shown here. This is a gray area for fast-math-flag requirements. I think we should at least check fast-math-flags on the fdiv and fmul because I view this transform as 2 pieces: reassociate the fmul operands and form reciprocal from the fdiv (as with the existing transform). We could argue that the sqrt also needs FMF, but that was not required before, so we should change that in a follow-up patch if that seems better. We don't currently have a way to check that the target will produce a sqrt or recip estimate without actually creating nodes (the APIs are SDValue getSqrtEstimate() and SDValue getRecipEstimate()), so we clean up speculatively created nodes if we are not able to create an estimate. The x86 test with doubles verifies that we are not changing a test with no estimate sequence. Differential Revision: https://reviews.llvm.org/D82716	2020-07-06 19:12:21 -04:00
Yuanfang Chen	1e495e10e6	[NFC] change getLimitedCodeGenPipelineReason to static function	2020-07-06 15:39:27 -07:00
Nicolai Hähnle	76c5cb05a3	DomTree: Remove getChildren() accessor Summary: Avoid exposing details about how children are stored. This will enable subsequent type-erasure changes. New methods are introduced to cover common access patterns. Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f Reviewers: arsenm, RKSimon, mehdi_amini, courbet Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83083	2020-07-06 21:58:11 +02:00
jasonliu	6d3ae365bd	[XCOFF][AIX] Give symbol an internal name when desired symbol name contains invalid character(s) Summary: When a desired symbol name contains invalid character that the system assembler could not process, we need to emit .rename directive in assembly path in order for that desired symbol name to appear in the symbol table. Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L Differential Revision: https://reviews.llvm.org/D82481	2020-07-06 15:49:15 +00:00
Matt Arsenault	521ebc1681	GlobalISel: Move finalizeLowering call later This matches the DAG behavior where this is called after the loop checking for calls. The AMDGPU implementation depends on knowing if there are calls in the function or not, so move this later. Another problem is finalizeLowering is actually called twice; I was seeing weird inconsistencies since the first call would produce unexpected results and the second run would correct them in some contexts. Since this requires disabling the verifier, and it's useful to serialize the MIR immediately after selection, FinalizeISel should probably not be a real pass.	2020-07-06 09:19:40 -04:00
Jay Foad	babbeafa00	[TargetLowering] Improve expansion of FSHL/FSHR by non-zero amount Use a simpler code sequence when the shift amount is known not to be zero modulo the bit width. Nothing much uses this until D77152 changes the translation of fshl and fshr intrinsics. Differential Revision: https://reviews.llvm.org/D82540	2020-07-06 12:07:14 +01:00
Jay Foad	e7a4a24dc5	[TargetLowering] Improve expansion of ROTL/ROTR Using a negation instead of a subtraction from a constant can save an instruction on some targets. Nothing much uses this until D77152 changes the translation of fshl and fshr intrinsics. Differential Revision: https://reviews.llvm.org/D82539	2020-07-06 12:07:14 +01:00
Craig Topper	76123d338d	[DAGCombiner] visitSIGN_EXTEND_INREG should fold sext_vector_inreg(undef) to 0 not undef. We need to ensure that the sign bits of the result all match so we can't fold to undef. Similar to PR46585. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D83163	2020-07-04 14:35:49 -07:00
Craig Topper	120c5f1057	[DAGCombiner] Don't fold zext_vector_inreg/sext_vector_inreg(undef) to undef. Fold to 0. zext_vector_inreg needs to produces 0s in the extended bits and sext_vector_inreg needs to produce upper bits that are all the same. So we should fold them to a 0 vector instead of undef. Fixes PR46585.	2020-07-04 11:42:53 -07:00
Simon Pilgrim	56a8a5c9fe	[DAG] matchBinOpReduction - match subvector reduction patterns beyond a matched shufflevector reduction Currently matchBinOpReduction only handles shufflevector reduction patterns, but in many cases these only occur in the final stages of a reduction, once we're down to legal vector widths. Before this its likely that we are performing reductions using subvector extractions to repeatedly split the source vector in half and perform the binop on the halves. Assuming we've found a non-partial reduction, this patch continues looking for subvector reductions as far as it can beyond the last shufflevector. Fixes PR37890	2020-07-04 15:28:15 +01:00
David Green	9e03547cab	[ARM][HWLoops] Create hardware loops for sibling loops Given a loop with two subloops, it should be possible for both to be converted to hardware loops. That's what this patch does, simply enough. It slightly alters the loop iterating order to try and convert all subloops. If one (or more) succeeds, it stops as before. Differential Revision: https://reviews.llvm.org/D78502	2020-07-03 17:20:02 +01:00
Guillaume Chatelet	87e2751cf0	[Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D83082	2020-07-03 08:06:43 +00:00
Sanjay Patel	bc110de78a	[SelectionDAG] don't split branch on logic-of-vector-compares SelectionDAGBuilder converts logic-of-compares into multiple branches based on a boolean TLI setting in isJumpExpensive(). But that probably never considered the pattern of extracted bools from a vector compare - it seems unlikely that we would want to turn vector logic into control-flow. The motivating x86 reduction case is shown in PR44565: https://bugs.llvm.org/show_bug.cgi?id=44565 ...and that test shows the expected improvement from using pmovmsk codegen. For AArch64, I modified the test to include an extra op because the simpler test gets transformed by a codegen invocation of SimplifyCFG. Differential Revision: https://reviews.llvm.org/D82602	2020-07-02 17:05:24 -04:00
Sander de Smalen	143e324e75	[CodeGen][SVE] Don't drop scalable flag in DAGCombiner::visitEXTRACT_SUBVECTOR There was a rogue 'assert' in AArch64ISelLowering for the tuple.get intrinsics, that shouldn't really have been there (I suspect this was a remnant from when we expected the wider vector always to have come from a vector CONCAT). When I tried to create a more minimal reproducer, I found a bug in DAGCombiner where it drops the scalable flag when trying to fold: extract_subv (bitcast X), Index --> bitcast (extract_subv X, Index') This patch fixes both issues. Reviewers: david-arm, efriedma, spatel Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D82910	2020-07-02 10:16:43 +01:00
David Sherwood	c7df35d2b2	[CodeGen] Fix warnings in getCopyToPartsVector Whilst trying to assemble the following test: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_set2.c I discovered we were hitting some warnings about possible invalid calls to getVectorNumElements() in getCopyToPartsVector(). I've tried to fix these by using ElementCount types where possible and I've made the assumption that we don't support using a fixed width vector to copy parts of a scalable vector, and vice versa. Looking at how the copy is implemented I think that's the right thing for now. Differential Revision: https://reviews.llvm.org/D82744	2020-07-02 09:08:20 +01:00
Krzysztof Pszeniczny	e4b3c138de	This patch adds basic debug info support with basic block sections. This patch uses ranges for debug information when a function contains basic block sections rather than using [lowpc, highpc]. This is also the first in a series of patches for debug info and does not contain the support for linker relaxation. That will be done as a follow up patch. Differential Revision: https://reviews.llvm.org/D78851	2020-07-01 23:53:00 -07:00
Matt Arsenault	afb3bd9914	RegAllocGreedy: Use TargetInstrInfo already in the class	2020-07-01 18:58:59 -04:00
Craig Topper	51e92b223b	[X86] Speculatively apply the same fix from `361853c96f` to PromoteIntOp_MGATHER. The UpdateNodeOperands here is also subject to CSE.	2020-07-01 11:57:59 -07:00
Craig Topper	361853c96f	[LegalizeTypes] Properly handle the case when UpdateNodeOperands in PromoteIntOp_MLOAD triggers CSE instead of updating the node in place. The caller can't handle the node having multiple results like a masked load does. So we need to detect the case and do our own result replacement. Fixes PR46532.	2020-07-01 11:48:50 -07:00
David Sherwood	f11305780f	[CodeGen] Fix warnings in DAGCombiner::visitSCALAR_TO_VECTOR In visitSCALAR_TO_VECTOR we try to optimise cases such as: scalar_to_vector (extract_vector_elt %x) into vector shuffles of %x. However, it led to numerous warnings when %x is a scalable vector type, so for now I've changed the code to only perform the combination on fixed length vectors. Although we probably could change the code to work with scalable vectors in certain cases, without a proper profit analysis it doesn't seem worth it at the moment. This change fixes up one of the warnings in: llvm/test/CodeGen/AArch64/sve-merging-stores.ll I've also added a simplified version of the same test to: llvm/test/CodeGen/AArch64/sve-fp.ll which already has checks for no warnings. Differential Revision: https://reviews.llvm.org/D82872	2020-07-01 18:47:13 +01:00
James Y Knight	4b0aa5724f	Change the INLINEASM_BR MachineInstr to be a non-terminating instruction. Before this instruction supported output values, it fit fairly naturally as a terminator. However, being a terminator while also supporting outputs causes some trouble, as the physreg->vreg COPY operations cannot be in the same block. Modeling it as a non-terminator allows it to be handled the same way as invoke is handled already. Most of the changes here were created by auditing all the existing users of MachineBasicBlock::isEHPad() and MachineBasicBlock::hasEHPadSuccessor(), and adding calls to isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate. Reviewed By: nickdesaulniers, void Differential Revision: https://reviews.llvm.org/D79794	2020-07-01 12:51:50 -04:00
Yuanfang Chen	78c69a00a4	[NFC] Clean up uses of MachineModuleInfoWrapperPass	2020-07-01 09:45:05 -07:00
David Green	ca4c1ad854	[Outliner] Set nounwind for outlined functions This prevents the outlined functions from pulling in a lot of unnecessary code in our downstream libraries/linker. Which stops outlining making codesize worse in c++ code with no-exceptions. Differential Revision: https://reviews.llvm.org/D57254	2020-07-01 17:18:34 +01:00
Guillaume Chatelet	ef36f5143d	[Alignment] TargetLowering::hasPairedLoad must use Align for RequiredAlignment As per documentation of `hasPairLoad`: "`RequiredAlignment` gives the minimal alignment constraints that must be met to be able to select this paired load." In this sense, `0` is strictly equivalent to `1`. We make this obvious by using `Align` instead of unsigned. There is only one implementor of this interface. Differential Revision: https://reviews.llvm.org/D82958	2020-07-01 14:32:30 +00:00
Guillaume Chatelet	d3085c2501	[Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82956	2020-07-01 14:31:56 +00:00
Guillaume Chatelet	27bbc8ede1	[Alignment][NFC] Migrate TargetTransformInfo::CreateVariableSizedObject to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82939	2020-07-01 14:31:21 +00:00
David Sherwood	97a7a9abb2	[CodeGen] Fix up warnings in visitEXTRACT_SUBVECTOR It's perfectly valid to do certain DAG combines where we extract subvectors from a concat vector when we have scalable vector types. However, we can do this in a way that avoids generating compiler warnings by replacing calls to getVectorNumElements() with getVectorMinNumElements(). Due to the way subvector extracts are designed to work with scalable vector types this is ok. This eliminates some warnings from existing tests in this file: llvm/test/CodeGen/AArch64/sve-intrinsics-loads.ll Differential Revision: https://reviews.llvm.org/D82655	2020-07-01 15:10:53 +01:00
David Stenberg	85460c4ea2	[DebugInfo] Do not emit entry values for composite locations Summary: This is a fix for PR45009. When working on D67492 I made DwarfExpression emit a single DW_OP_entry_value operation covering the whole composite location description that is produced if a register does not have a valid DWARF number, and is instead composed of multiple register pieces. Looking closer at the standard, this appears to not be valid DWARF. A DW_OP_entry_value operation's block can only be a DWARF expression or a register location description, so it appears to not be valid for it to hold a composite location description like that. See DWARFv5 sec. 2.5.1.7: "The DW_OP_entry_value operation pushes the value that the described location held upon entering the current subprogram. It has two operands: an unsigned LEB128 length, followed by a block containing a DWARF expression or a register location description (see Section 2.6.1.1.3 on page 39)." Here is a dwarf-discuss mail thread regarding this: http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2020-March/004610.html There was not a strong consensus reached there, but people seem to lean towards that operations specified under 2.6 (e.g. DW_OP_piece) may not be part of a DWARF expression, and thus the DW_OP_entry_value operation can't contain those. Perhaps we instead want to emit a entry value operation per each DW_OP_reg* operation, e.g.: - DW_OP_entry_value(DW_OP_regx sub_reg0), DW_OP_stack_value, DW_OP_piece 8, - DW_OP_entry_value(DW_OP_regx sub_reg1), DW_OP_stack_value, DW_OP_piece 8, [...] The question then becomes how the call site should look; should a composite location description be emitted there, and we then leave it up to the debugger to match those two composite location descriptions? Another alternative could be to emit a call site parameter entry for each sub-register, but firstly I'm unsure if that is even valid DWARF, and secondly it seems like that would complicate the collection of call site values quite a bit. As far as I can tell GCC does not emit any entry values / call sites in these cases, so we do not have something to compare with, but the former seems like the more reasonable approach. Currently when trying to emit a call site entry for a parameter composed of multiple DWARF registers a (DwarfRegs.size() == 1) assert is triggered in addMachineRegExpression(). Until the call site representation is figured out, and until there is use for these entry values in practice, this commit simply stops the invalid DWARF from being emitted. Reviewers: djtodoro, vsk, aprantl Reviewed By: djtodoro, vsk Subscribers: jyknight, hiraditya, fedor.sergeev, jrtc27, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D75270	2020-07-01 10:50:55 +02:00
Guillaume Chatelet	7f37d88306	[Alignment][NFC] Migrate MachineFrameInfo::CreateSpillStackObject to Align iThis patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82934	2020-07-01 08:49:28 +00:00
Sam Parker	3ee580d017	[ARM][LowOverheadLoops] Handle reductions While validating live-out values, record instructions that look like a reduction. This will comprise of a vector op (for now only vadd), a vorr (vmov) which store the previous value of vadd and then a vpsel in the exit block which is predicated upon a vctp. This vctp will combine the last two iterations using the vmov and vadd into a vector which can then be consumed by a vaddv. Once we have determined that it's safe to perform tail-predication, we need to change this sequence of instructions so that the predication doesn't produce incorrect code. This involves changing the register allocation of the vadd so it updates itself and the predication on the final iteration will not update the falsely predicated lanes. This mimics what the vmov, vctp and vpsel do and so we then don't need any of those instructions. Differential Revision: https://reviews.llvm.org/D75533	2020-07-01 08:31:49 +01:00
Guillaume Chatelet	28de229bc6	[Alignment][NFC] Migrate MachineFrameInfo::CreateStackObject to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82894	2020-07-01 07:28:11 +00:00
JF Bastien	ca134e4c52	[NFC] fix diagnostic It's pretty silly to diagnose on a scalar copy but the build does that: loop variable 'SibReg' of type 'const llvm::Register' creates a copy from type 'const llvm::Register' [-Wrange-loop-analysis]	2020-06-30 21:49:01 -07:00
Matt Arsenault	e9eab30339	GlobalISel: Disallow undef generic virtual register uses With an undef operand, it's possible for getVRegDef to fail and return null. This is an edge case very little code bothered to consider. Proper gMIR should use G_IMPLICIT_DEF instead. I initially tried to apply this restriction to all SSA MIR, so then getVRegDef would never fail anywhere. However, ProcessImplicitDefs does technically run while the function is in SSA. ProcessImplicitDefs and DetectDeadLanes would need to either move, or a new pseudo-SSA type of function property would need to be introduced.	2020-06-30 19:18:01 -04:00
Hendrik Greving	50ac7ce94f	[ModuloSchedule] Make PeelingModuloScheduleExpander inheritable. Basically a NFC, but allows subclasses access to the entire PeelingModuloScheduleExpander class. We are doing this to allow backends, particularly one that are not necessarily upstreamed, to inherit from PeelingModuloScheduleExpander and access its basic structures. Renames Info into LoopInfo for consistency in PeelingModuloScheduleExpander. Differential Revision: https://reviews.llvm.org/D82673	2020-06-30 15:56:13 -07:00
Hsiangkai Wang	a7b0f39185	[MVT] Add new MVT types for RISC-V vector. In RISC-V vector extension, users could group multiple vector registers as one pseudo register. In mixed width operations, users could use partial vector registers to reduce the register pressure. The parameter to control register grouping and partial use is called LMUL. LMUL is a part of the type. So, we have a bunch of vector types. In order to support all these types, we need new MVT types in LLVM. In this patch, I added several MVT types that are used in RISC-V vector implementation. This is a standalone patch for MVT types without RISC-V related implementation. Differential revision: https://reviews.llvm.org/D81724	2020-07-01 01:07:50 +08:00
Matt Arsenault	b7f6ecf0c7	RegAlloc: Start using Register	2020-06-30 12:13:08 -04:00
Matt Arsenault	af1eeaf380	BranchFolding: Use Register	2020-06-30 12:13:08 -04:00
Matt Arsenault	edb4a5cb36	TailDuplicator: Use Register	2020-06-30 12:13:08 -04:00
Guillaume Chatelet	423458ec09	[Alignment][NFC] TargetLowering::allowsMemoryAccessForAlignment First patch of a series to adapt TargetLowering::allowsXXX functions This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D81372	2020-06-30 15:31:24 +00:00
Guillaume Chatelet	c1cd61e02a	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemcpy to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82849	2020-06-30 13:12:31 +00:00
Guillaume Chatelet	306d7c6929	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemmove to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82850	2020-06-30 12:46:59 +00:00
Guillaume Chatelet	6a6af30d43	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemset to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82851	2020-06-30 12:46:26 +00:00
Guillaume Chatelet	2c5ff48e61	[Alignment][NFC] Migrate AtomicExpandPass to Align This is a followup on D78403. I'm unsure about `getAtomicOpAlign` overloads that take `AtomicRMWInst` and `AtomicCmpXchgInst`, shouldn't `getAlign` provide the correct answer already? Differential Revision: https://reviews.llvm.org/D81369	2020-06-30 09:54:45 +00:00
Petar Avramovic	4b980cc9ca	[GlobalISel][InlineAsm] Add support for matching input constraints Find def operand that corresponds to matching constraint and tie input to that operand. Differential Revision: https://reviews.llvm.org/D82651	2020-06-30 10:49:05 +02:00
Guillaume Chatelet	5f8bdb3e6a	[Alignment][NFC] TargetLowering::allowsMemoryAccess Second patch of a series to adapt TargetLowering::allowsXXX functions This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82785	2020-06-30 08:17:00 +00:00
David Sherwood	c02332a693	[CodeGen] Fix warning in getNode for EXTRACT_SUBVECTOR Fix a warning in getNode() when extracting a subvector from a concat vector. We can simply replace the call to getVectorNumElements with getVectorMinNumElements as this follows the defined behaviour for EXTRACT_SUBVECTOR. Differential Revision: https://reviews.llvm.org/D82746	2020-06-30 08:11:41 +01:00
David Sherwood	46a7f4d6f4	[SVE][CodeGen] Fix bug in DAGCombiner::reduceBuildVecToShuffle When trying to reduce a BUILD_VECTOR to a SHUFFLE_VECTOR it's important that we carefully check the vector types that led to that BUILD_VECTOR. In the test I have attached to this commit there is a case where the results of two SVE faddv instructions are being stored to consecutive memory locations. With my fix, as part of merging those stores we discover that each BUILD_VECTOR element came from an extract of a SVE vector element and therefore bail out. Differential Revision: https://reviews.llvm.org/D82564	2020-06-30 07:28:15 +01:00
Guillaume Chatelet	368a5e3a66	[Alignment][NFC] migrate DataLayout::getPreferredAlignment This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82752	2020-06-29 11:24:36 +00:00
Simon Pilgrim	3521ecf1f8	[X86] Add vector support to targetShrinkDemandedConstant for OR/XOR opcodes If a constant is only allsignbits in the demanded/active bits, then sign extend it to an allsignbits bool pattern for OR/XOR ops. This also requires SimplifyDemandedBits XOR handling to be modified to call ShrinkDemandedConstant on any (non-NOT) XOR pattern to account for non-splat cases. Next step towards fixing PR45808 - with this patch we now get a <-1,-1,0,0> v4i64 constant instead of <1,1,0,0>. Differential Revision: https://reviews.llvm.org/D82257	2020-06-29 12:19:05 +01:00
Simon Pilgrim	973685fc78	[TargetLowering] Add DemandedElts arg to ShrinkDemandedConstant Pre-commit for D82257, this adds a DemandedElts arg to ShrinkDemandedConstant/targetShrinkDemandedConstant which will allow future patches to (optionally) add vector support.	2020-06-29 11:46:58 +01:00
Guillaume Chatelet	3500d9ec95	Fix invalid alignment in DAGCombiner::isLegalNarrowLdSt `ShAmt / 8` can be a non power of two, this can lead to an invalid alignment. context: https://reviews.llvm.org/D41350#inline-749165 Differential Revision: https://reviews.llvm.org/D82565	2020-06-29 09:22:15 +00:00
madhur13490	299dee91b3	Revert accidentally landed patch citing o build errors Summary: This reverts commit `c73966c2f7`. Reviewers: Subscribers:	2020-06-28 11:52:33 +00:00
madhur13490	c73966c2f7	Improve stack object printing. NFC. Reviewers: madhur13490 Reviewed By: madhur13490 Subscribers: qcolombet, arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82712	2020-06-28 11:43:33 +00:00
Simon Pilgrim	6bdb3ce452	[DAG] reduceBuildVecExtToExtBuildVec - don't combine if it would break a splat. reduceBuildVecExtToExtBuildVec was breaking a splat(zext(x)) pattern into buildvector(x, 0, x, 0, ..) resulting in much more complex insert+shuffle codegen. We already go to some lengths to avoid this in SimplifyDemandedVectorElts etc. when we encounter splat buildvectors. It should be OK to fold all splat(aext(x)) patterns - we might need to tighten this if we find a case where we mustn't introduce a buildvector(x, undef, x, undef, ..) but I can't find one. Fixes PR46461.	2020-06-27 11:03:57 +01:00
Matt Arsenault	c2e403c19d	GlobalISel: Don't fail translate on weak cmpxchg The translation of cmpxchg added by `9481399c0f` specifically skipped weak cmpxchg due to not understanding the meaning. Weak cmpxchg was added in `420a216817`. As explained in the commit message, the weak mode is implicit in how ATOMIC_CMP_SWAP_WITH_SUCCESS is lowered. If it's expanded to a regular ATOMIC_CMP_SWAP, it's replaced with a strong cmpxchg. This handling seems weird to me, but this was already following the DAG behavior. I would expect the strong IR instruction to not have the boolean output. Failing that, I might expect the IRTranslator to emit ATOMIC_CMP_SWAP and a constant for the boolean.	2020-06-26 17:52:18 -04:00
Sanjay Patel	e7f7715eb9	[DAGCombiner] rename variables for readability; NFC PR46406 shows a pattern where we can do better, so try to clean this up before adding more code.	2020-06-26 14:22:11 -04:00
Guillaume Chatelet	fdc7c7fb87	[Alignment][NFC] Migrate TTI::getInterleavedMemoryOpCost to Align This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82573	2020-06-26 11:00:53 +00:00
Simon Pilgrim	da426ead73	LiveRangeEdit.h - reduce AliasAnalysis.h include to forward declaration. NFC. Move include to LiveRangeEdit.cpp and replace legacy AliasAnalysis typedef with AAResults where necessary.	2020-06-26 09:58:21 +01:00
Sjoerd Meijer	243a5329d4	[SelectionDAG] Lower @llvm.get.active.lane.mask to setcc This lowers intrinsic @llvm.get.active.lane.mask to a setcc node, i.e. an icmp ule, and creates vectors for its 2 arguments on which the comparison is performed. Differential Revision: https://reviews.llvm.org/D82292	2020-06-26 07:46:38 +01:00
Igor Kudrin	70165bb7e9	[DebugInfo] Fix emitting offsets to CUs with -dwarf-sections-as-references=Enable. The size of the field depends on the DWARF format, not the address size of the target. Differential Revision: https://reviews.llvm.org/D82311	2020-06-26 12:12:26 +07:00
Wouter van Oortmerssen	b9a539c010	[WebAssembly] Adding 64-bit versions of __stack_pointer and other globals We have 6 globals, all of which except for __table_base are 64-bit under wasm64. Differential Revision: https://reviews.llvm.org/D82130	2020-06-25 15:52:44 -07:00
Paul Walker	2c09e91054	[MVT] Add missing floating point types for 1024/2048-bit vectors. Summary: This patch adds entries for: v64f16 v128f16 v64bf16 v128bf16 v32f64 Subscribers: dschuff, hiraditya, aheejin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82466	2020-06-25 21:13:31 +00:00
Simon Pilgrim	1815b77c3e	LiveIntervals.h.h - reduce AliasAnalysis.h include to forward declaration. NFC. Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.	2020-06-25 14:22:21 +01:00
Simon Pilgrim	792e4a8c97	CodeGenPrepare.cpp - remove unused IntrinsicsX86.h header. NFC.	2020-06-25 14:22:19 +01:00
Simon Pilgrim	172c36a100	Fix typos in CodeGenPrepare::splitLargeGEPOffsets comments.	2020-06-25 14:22:19 +01:00
Scott Linder	4d81aec40c	[MIR] Fix CFI_INSTRUCTION escape printing Summary: The printer seems to intend to not print the trailing comma but has a copy-paste error for the last value in the escape, and the parser enforces having no trailing comma, but somehow a test was never included to actually confirm it. Reviewers: thegameg, arsenm Reviewed By: thegameg, arsenm Subscribers: wdng, arsenm, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82478	2020-06-24 18:15:28 -04:00
Simon Pilgrim	a53dddb3e9	Local.h - reduce includes to forward declarations. NFC. Fix implicit include dependencies in source files and replace legacy AliasAnalysis typedef with AAResults where necessary.	2020-06-24 19:27:37 +01:00
Simon Pilgrim	bf77c7ef2d	Loads.h - reduce AliasAnalysis.h include to forward declarations. NFC. Fix implicit include dependencies in source files.	2020-06-24 13:49:04 +01:00
Eli Friedman	a2caa3b614	Remove GlobalValue::getAlignment(). This function is deceptive at best: it doesn't return what you'd expect. If you have an arbitrary GlobalValue and you want to determine the alignment of that pointer, Value::getPointerAlignment() returns the correct value. If you want the actual declared alignment of a function or variable, GlobalObject::getAlignment() returns that. This patch switches all the users of GlobalValue::getAlignment to an appropriate alternative. Differential Revision: https://reviews.llvm.org/D80368	2020-06-23 19:13:42 -07:00
Eli Friedman	e9d4e34ab8	[AArch64][SVE] Add legalization support for i32/i64 vector srem/urem Implement them on top of sdiv/udiv, similar to what we do for integer types. Potential future work: implementing i8/i16 srem/urem, optimizations for constant divisors, optimizing the mul+sub to mls. Differential Revision: https://reviews.llvm.org/D81511	2020-06-23 16:27:52 -07:00
hsmahesha	5832950adb	[AMDGPU/MemOpsCluster] Compute `width` for `MIMG` instruction class. Summary: `width` computation is missing for newly added `MIMG` instruction class. Add it. Reviewers: foad, rampitec, arsenm Reviewed By: foad Subscribers: MatzeB, javed.absar, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81649	2020-06-23 17:32:17 +05:30
Kerry McLaughlin	5080503174	[SVE][CodeGen] Legalisation of vsetcc with scalable types Summary: Changes SplitVecOp_VSETCC to use getVectorElementCount() Reviewers: sdesmalen, efriedma, dancgr Reviewed By: efriedma Subscribers: david-arm, tschuett, hiraditya, rkruppe, psnobl, huihuiz, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79167	2020-06-23 11:56:29 +01:00
Simon Pilgrim	bcc0dc3832	[DAG] visitSIGN_EXTEND_INREG - rename EVT variable. NFCI. We had a EVT type variable called EVT, which isn't a good idea....	2020-06-23 10:45:27 +01:00
Paul Walker	499c63288f	[SVE] Code generation for fixed length vector loads & stores. Summary: This patch adds base support for code generating fixed length vector operations targeting a known SVE vector length. To achieve this we lower fixed length vector operations to equivalent scalable vector operations, whereby SVE predication is used to limit the elements processed to those present within the fixed length vector. Specifically this patch implements load and store operations, which get lowered to their masked counterparts thusly: V = load(Addr) => V = extract_fixed_vector(masked_load(make_pred(V.NumElts), Addr)) store(V, (Addr)) => masked_store(insert_fixed_vector(V), make_pred(V.NumElts), Addr)) Reviewers: rengolin, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80385	2020-06-23 09:39:03 +00:00
Simon Pilgrim	0acd22b8fb	StatepointLowering.cpp - fix implicit CommandLine.h dependency. NFC. StatepointLowering defines a cl::opt but don't include CommandLine.h.	2020-06-23 09:43:39 +01:00
Michael Liao	b1360caa82	[SDAG] Add new AssertAlign ISD node. Summary: - AssertAlign node records the guaranteed alignment on its source node, where these alignments are retrieved from alignment attributes in LLVM IR. These tracked alignments could help DAG combining and lowering generating efficient code. - In this patch, the basic support of AssertAlign node is added. So far, we only generate AssertAlign nodes on return values from intrinsic calls. - Addressing selection in AMDGPU is revised accordingly to capture the new (base + offset) patterns. Reviewers: arsenm, bogner Subscribers: jvesely, wdng, nhaehnle, tpr, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81711	2020-06-23 00:51:11 -04:00
stozer	539381da26	[DebugInfo] Update MachineInstr to help support variadic DBG_VALUE instructions Following on from this RFC[0] from a while back, this is the first patch towards implementing variadic debug values. This patch specifically adds a set of functions to MachineInstr for performing operations specific to debug values, and replacing uses of the more general functions where appropriate. The most prevalent of these is replacing getOperand(0) with getDebugOperand(0) for debug-value-specific code, as the operands corresponding to values will no longer be at index 0, but index 2 and upwards: getDebugOperand(x) == getOperand(x+2). Similar replacements have been added for the other operands, along with some helper functions to replace oft-repeated code and operate on a variable number of value operands. [0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html<Paste> Differential Revision: https://reviews.llvm.org/D81852	2020-06-22 16:01:12 +01:00
Simon Pilgrim	48d1a2d6d0	[DAG] Add SimplifyMultipleUseDemandedVectorElts helper for SimplifyMultipleUseDemandedBits. NFCI. We have many cases where we call SimplifyMultipleUseDemandedBits and demand specific vector elements, but all the bits from them - this adds a helper wrapper to handle this.	2020-06-22 14:24:39 +01:00
Simon Pilgrim	ecc5d7ee0d	[DAG] SimplifyMultipleUseDemandedBits - drop unnecessary *_EXTEND_VECTOR_INREG cases For little endian targets, if we only need the lowest element and none of the extended bits then we can just use the (bitcasted) source vector directly. We already do this in SimplifyDemandedBits, this adds the SimplifyMultipleUseDemandedBits equivalent.	2020-06-22 12:35:32 +01:00
Tres Popp	09d72ad399	Revert "[CGP] Enable CodeGenPrepares phi type convertion." This reverts commit `67121d7b82`. This is causing compile times to be 2x slower on some large binaries.	2020-06-22 13:06:18 +02:00
David Green	67121d7b82	[CGP] Enable CodeGenPrepares phi type convertion.	2020-06-21 16:46:16 +01:00
David Green	730ecb63ec	[CGP] Convert phi types If a collection of interconnected phi nodes is only ever loaded, stored or bitcast then we can convert the whole set to the bitcast type, potentially helping to reduce the number of register moves needed as the phi's are passed across basic block boundaries. This has to be done in CodegenPrepare as it naturally straddles basic blocks. The alorithm just looks from phi nodes, looking at uses and operands for a collection of nodes that all together are bitcast between float and integer types. We record visited phi nodes to not have to process them more than once. The whole subgraph is then replaced with a new type. Loads and Stores are bitcast to the correct type, which should then be folded into the load/store, changing it's type. This comes up in the biquad testcase due to the way MVE needs to keep values in integer registers. I have also seen it come up from aarch64 partner example code, where a complicated set of sroa/inlining produced integer phis, where float would have been a better choice. I also added undef and extract element handling which increased the potency in some cases. This adds it with an option that defaults to off, and disabled for 32bit X86 due to potential issues around canonicalizing NaNs. Differential Revision: https://reviews.llvm.org/D81827	2020-06-21 15:54:17 +01:00
David Sherwood	584d0d5c17	[SVE] Fall back on DAG ISel at -O0 when encountering scalable types At the moment we use Global ISel by default at -O0, however it is currently not capable of dealing with scalable vectors for two reasons: 1. The register banks know nothing about SVE registers. 2. The LLT (Low Level Type) class knows nothing about scalable vectors. For now, the easiest way to avoid users hitting issues when using the SVE ACLE is to fall back on normal DAG ISel when encountering instructions that operate on scalable vector types. I've added a couple of RUN lines to existing SVE tests to ensure we can compile at -O0. I've also added some new tests to CodeGen/AArch64/GlobalISel/arm64-fallback.ll that demonstrate we correctly fallback to DAG ISel at -O0 when lowering formal arguments or translating instructions that involve scalable vector types. Differential Revision: https://reviews.llvm.org/D81557	2020-06-19 10:57:00 +01:00
Jay Foad	7cdf4326a8	[LiveIntervals] Fix early-clobber handling in handleMoveUp Without this fix, handleMoveUp can create an invalid live range like this: [98904e,98908r:0)[98908e,227504r:1) where the two segments overlap, but only because we have lost the "e" (early-clobber) on the end point of the first segment. Differential Revision: https://reviews.llvm.org/D82110	2020-06-19 10:17:04 +01:00
David Sherwood	7edc7f6edb	[CodeGen] Fix SimplifyDemandedBits for scalable vectors For now I have changed SimplifyDemandedBits and it's various callers to assume we know nothing for scalable vectors and to ignore the demanded bits completely. I have also done something similar for SimplifyDemandedVectorElts. These changes fix up lots of warnings due to calls to EVT::getVectorNumElements() for types with scalable vectors. These functions are all used for optimisations, rather than functional requirements. In future we can revisit this code if there is a need to improve code quality for SVE. Differential Revision: https://reviews.llvm.org/D80537	2020-06-19 07:59:35 +01:00
David Sherwood	9e811b0d93	[CodeGen] Fix ComputeNumSignBits for scalable vectors When trying to calculate the number of sign bits for scalable vectors we should just bail out for now and pretend we know nothing. Differential Revision: https://reviews.llvm.org/D81093	2020-06-19 07:58:42 +01:00
Vitaly Buka	fcd67665a8	[StackSafety] Add "Must Live" logic Summary: Extend StackLifetime with option to calculate liveliness where alloca is only considered alive on basic block entry if all non-dead predecessors had it alive at terminators. Depends on D82043. Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82124	2020-06-18 16:53:37 -07:00
Nathan James	8b0df1c1a9	[NFC] Refactor Registry loops to range for	2020-06-19 00:40:10 +01:00
Matt Arsenault	95605b784b	AMDGPU/GlobalISel: Implement computeKnownAlignForTargetInstr We probably need to move where intrinsics are lowered to copies to make this useful.	2020-06-18 17:28:00 -04:00
Matt Arsenault	7f8b2e1b91	GlobalISel: Pass LegalizerHelper to custom legalize callbacks This was passing in all the parameters needed to construct a LegalizerHelper in the custom legalization, when it's simpler to just pass in the existing helper. This is slightly more annoying to use in the common case where you don't need the legalizer helper, but we could add back the common parameters back in addition to the helper. I didn't propagate this to all the internal target changes that this logically implies, but did update a sample one for legalizeMinNumMaxNum. This is in preparation for moving AMDGPU load/store legalization entirely into custom lowering. The current set of legalization actions is really constraining and not really capable of expressing all the actions needed to legalize loads/stores. In particular there's no way to express when the memory access itself needs to change size vs. the result type. There's also a lot of redundancy since the same split/widen actions need to be applied in both vector and scalar cases. All of the sub-cases logically belong as steps in the legalizer helper, but it will be easier to consider everything at once in custom lowering.	2020-06-18 17:17:38 -04:00
Alexandre Ganea	2ae0df5be7	[CodeView] Revert `8374bf4363` and `403f953792` This reverts: `8374bf4363` [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string. `403f953792` [CodeView] Add full repro to LF_BUILDINFO record This is causing the lld/test/COFF/pdb-relative-source-lines.test to fail: http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/1096/steps/test-check-all/logs/FAIL%3A%20lld%3A%3Apdb-relative-source-lines.test And clang/test/CodeGen/debug-info-codeview-buildinfo.c fails as well: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/33346/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Adebug-info-codeview-buildinfo.c	2020-06-18 16:18:46 -04:00
Simon Pilgrim	2474421398	[TargetLowering] SimplifyMultipleUseDemandedBits - drop already extended ISD::SIGN_EXTEND_INREG nodes. If the source of the SIGN_EXTEND_INREG node is already sign extended, use the source directly.	2020-06-18 16:41:08 +01:00
Alexandre Ganea	8374bf4363	[CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string. Previously, the DIA SDK didn't like the empty reference in the 'pdb' entry.	2020-06-18 10:07:30 -04:00
Alexandre Ganea	403f953792	[CodeView] Add full repro to LF_BUILDINFO record This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable). Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding. For more information see PR36198 and D43002. Differential Revision: https://reviews.llvm.org/D80833	2020-06-18 09:17:15 -04:00
Lucas Prates	a255931c40	[ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend Summary: Half-precision floating point arguments and returns are currently promoted to either float or int32 in clang's CodeGen and there's no existing support for the lowering of `half` arguments and returns from IR in AArch32's backend. Such frontend coercions, implemented as coercion through memory in clang, can cause a series of issues in argument lowering, as causing arguments to be stored on the wrong bits on big-endian architectures and incurring in missing overflow detections in the return of certain functions. This patch introduces the handling of half-precision arguments and returns in the backend using the actual "half" type on the IR. Using the "half" type the backend is able to properly enforce the AAPCS' directions for those arguments, making sure they are stored on the proper bits of the registers and performing the necessary floating point convertions. Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer Reviewed By: ostannard Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D75169	2020-06-18 13:15:13 +01:00
Jeremy Morse	3626eba11f	[NFC][LiveDebugValues] Document how LiveDebugValues operates We're missing a plain English explanation of how this pass is supposed to operate -- add one to the file comment. Differential Revision: https://reviews.llvm.org/D80929	2020-06-18 10:54:09 +01:00
David Sherwood	7e30ef77f6	[CodeGen] Fix warnings in getVectorTypeBreakdown Added NextPowerOf2() routine to TypeSize and rewritten the code in getVectorTypeBreakdown to avoid warnings being generated. Differential Revision: https://reviews.llvm.org/D81578	2020-06-18 09:54:16 +01:00
David Sherwood	65912a9768	[CodeGen] Fix warnings in foldCONCAT_VECTORS Instead of asserting the number of elements is the same, we should be comparing the element counts instead. In addition, when looking at concats of extract_subvectors it's fine to use getVectorMinNumElements() for scalable vectors. I discovered these warnings when compiling the structured loads tests in this file: test/CodeGen/AArch64/sve-intrinsics-loads.ll Differential Revision: https://reviews.llvm.org/D81936	2020-06-18 09:29:37 +01:00
Nick Desaulniers	e7816f263b	[InlineSpiller] add assert about spills post terminators Summary: This invariant is being violated in the test case https://reviews.llvm.org/D77849, related to the use of the relatively new ability for callbr to have return values, and MachineBasicBlocks with INLINEASM_BR terminators to emit live out register defs. As noted in the comment, this triggers invariant violations in MachineVerifier via `llc -verify-machineinstrs` or `llc -verify-regalloc`, since only MachineInstrs that are terminators are allowed to follow the first terminator. https://reviews.llvm.org/D75098 may rework this very assertion if we're spilling via a (proposed) TCOPY MachineInstr. Reviewers: void, efriedma, arsenm Reviewed By: efriedma Subscribers: qcolombet, wdng, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D78166	2020-06-17 11:51:58 -07:00
Davide Italiano	1cbaf847ab	[CGP] Reset the debug location when promoting zext(s). When the zext gets promoted, it used to retain the original location, which pessimizes the debugging experience causing an unexpected jump in stepping at -Og. Fixes https://bugs.llvm.org/show_bug.cgi?id=46120 (which also contains a full C repro). Differential Revision: https://reviews.llvm.org/D81437	2020-06-17 11:13:13 -07:00
Ian Levesque	7c7c8e0da4	[xray] Option to omit the function index Summary: Add a flag to omit the xray_fn_idx to cut size overhead and relocations roughly in half at the cost of reduced performance for single function patching. Minor additions to compiler-rt support per-function patching without the index. Reviewers: dberris, MaskRay, johnislarry Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D81995	2020-06-17 13:49:01 -04:00
Vitaly Buka	d812efb121	[SafeStack,NFC] Fix names after files move Summary: Depends on D81831. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81832	2020-06-17 01:08:40 -07:00
Vitaly Buka	6754a0e2ed	[SafeStack,NFC] Move SafeStackColoring code Summary: This code is going to be used in StackSafety. This patch is file move with minimal changes. Identifiers will be fixed in the followup patch. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81831	2020-06-17 01:07:47 -07:00
Aaron Smith	7e01675ea5	[SelectionDAG] Add MVT::bf16 to getConstantFP() Summary: This was probably overlooked in recent bfloat patches. Needed to handle bf16 constants in SelectionDAG. ConstantFP:bf16<APFloat(0)> Reviewers: stuij Reviewed By: stuij Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81779	2020-06-16 15:10:05 -07:00
Matt Arsenault	e4f19d1dda	GlobalISel: Fix not failing on widening G_INSERT_VECTOR_ELT This doesn't actually handled type idx 0, but was reporting Legalized on it. No test changes because nothing was trying to use this.	2020-06-16 15:48:57 -04:00
Matt Arsenault	8a3340d25d	GlobalISel: Use early return and reduce indentation	2020-06-16 14:47:08 -04:00
Fangrui Song	4799fb63b5	[GlobalISel] Delete unused variable after r353432	2020-06-16 08:32:09 -07:00
Jessica Paquette	5a4c3f6b06	[GlobalISel] Look through extends etc in CombinerHelper::matchConstantOp It's possible to end up with a zext or something in the way of a G_CONSTANT, even pre-legalization. This can happen with memsets. e.g. https://godbolt.org/z/Bjc8cw To make sure we can catch these cases, use `getConstantVRegValWithLookThrough` instead of `mi_match`. Differential Revision: https://reviews.llvm.org/D81875	2020-06-15 16:34:25 -07:00
Amara Emerson	fc905ae003	[GlobalISel] Don't emit multiply by magic constant for zero memset values.	2020-06-15 14:42:14 -07:00
Davide Italiano	c2dccf9d5e	[CodeGenPrepare] Reset the debug location when promoting trunc(s) The promotion machinery in CGP moves instructions retaining debug locations. When the transformation is local, this is mostly correct, but when instructions are moved cross-BBs, this is not always true and causes jumpiness in line tables. This is the first of a series of commits. sext(s) and zext(s) need to be treated similarly. Differential Revision: https://reviews.llvm.org/D81879	2020-06-15 14:25:43 -07:00
Jessica Paquette	1ac8451a9b	[GlobalISel] Simplify G_ADD when it has (0-X) on the LHS or RHS This implements the following combines: ((0-A) + B) -> B-A (A + (0-B)) -> A-B Porting over the basic algebraic combines from the DAGCombiner. There are several combines which fold adds away into subtracts. This is just the simplest one. I noticed that add combines are some of the most commonly hit across CTMark, (via print statements when they fire), so I'm porting over some of the obvious ones. This gives some minor code size improvements on CTMark at -O3 on AArch64. Differential Revision: https://reviews.llvm.org/D77453	2020-06-15 09:43:24 -07:00
Dominik Montada	87e5742654	[NFC] Add braces to if-statement in MachineVerifier	2020-06-15 16:33:56 +02:00
Matt Arsenault	33e9086501	GlobalISel: Support lowering vector->vector G_BITCAST Extract subvectors and cast to the result element type before remerging.	2020-06-15 07:36:30 -04:00
Dominik Montada	c87bf29149	[MachineVerifier][GlobalISel] Check that branches have a MBB operand or are declared indirect. Add missing properties to G_BRJT, G_BRINDIRECT Summary: Teach MachineVerifier to check branches for MBB operands if they are not declared indirect. Add `isBarrier`, `isIndirectBranch` to `G_BRINDIRECT` and `G_BRJT`. Without these, `MachineInstr.isConditionalBranch()` was giving a false-positive for those instructions. Reviewers: aemerson, qcolombet, dsanders, arsenm Reviewed By: dsanders Subscribers: hiraditya, wdng, simoncook, s.egerton, arsenm, rovka, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81587	2020-06-15 11:17:09 +02:00
Vitaly Buka	ca2dcbd030	[SafeStack,NFC] Make StackColoring read-only Move core which removes markers out of StackColoring.	2020-06-14 23:05:43 -07:00
Vitaly Buka	c6426e2657	[SafeStack,NFC] Remove unneded branch	2020-06-14 23:05:43 -07:00
Vitaly Buka	7282da1ea8	[SafeStack,NFC] Fix naming style	2020-06-14 23:05:42 -07:00
Vitaly Buka	2f5e535a84	[SafeStack,NFC] Cleanup LiveRange interface	2020-06-14 23:05:42 -07:00
Vitaly Buka	adefa9ca2e	[SafeStack,NFC] "const" cleanup	2020-06-14 23:05:42 -07:00
Vitaly Buka	fb1e0f324f	[SafeStack,NFC] Add BlockLifetimeInfo constructor	2020-06-14 23:05:42 -07:00
Vitaly Buka	645058036a	[SafeStack,NFC] Use IntrinsicInst instead of Instruction	2020-06-14 23:05:41 -07:00
Vitaly Buka	f8e411656e	[SafeStack,NFC] Move ClColoring into SafeStack.cpp This allows to reuse the code in other components.	2020-06-14 23:05:41 -07:00
Vitaly Buka	05590a9cb8	[SafeStack,NFC] Move unconditional code into constructor Prepare to move ClColoring from SafeStackCode to SafeStackLayout. This will allow to reuse the code in other components.	2020-06-14 23:05:41 -07:00
Chen Zheng	bd7096b977	[PowerPC] fma chain break to expose more ILP This patch tries to reassociate two patterns related to FMA to expose more ILP on PowerPC. // Pattern 1: // A = FADD X, Y (Leaf) // B = FMA A, M21, M22 (Prev) // C = FMA B, M31, M32 (Root) // --> // A = FMA X, M21, M22 // B = FMA Y, M31, M32 // C = FADD A, B // Pattern 2: // A = FMA X, M11, M12 (Leaf) // B = FMA A, M21, M22 (Prev) // C = FMA B, M31, M32 (Root) // --> // A = FMUL M11, M12 // B = FMA X, M21, M22 // D = FMA A, M31, M32 // C = FADD B, D Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D80175	2020-06-15 00:00:04 -04:00
Qiu Chaofan	f8ef7c99a0	[DAGCombiner] Require ninf for division estimation Current implementation of division estimation isn't correct for some cases like 1.0/0.0 (result is nan, not expected inf). And this change exposes a potential infinite loop: we use isConstOrConstSplatFP in combineRepeatedFPDivisors to look up if the divisor is some constant. But it doesn't work after legalized on some platforms. This patch restricts the method to act before LegalDAG. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D80542	2020-06-14 22:58:22 +08:00
Amanieu d'Antras	6973125cb7	Fix FastISel dropping srcloc metadata from InlineAsm Summary: Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46060 I've also added the Extra_IsConvergent flag which was missing from FastISel. Reviewers: echristo Reviewed By: echristo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80759	2020-06-13 16:52:37 +01:00
Roman Lebedev	17f7654152	[NFCI][MachineCopyPropagation] invalidateRegister(): use SmallSet<8> instead of DenseSet. This decreases the time consumed by the pass [during RawSpeed unity build] by 25% (0.0586 s -> 0.04388 s). While that isn't really impressive overall, that wasn't the goal here. The memory results here are noticeable. The baseline results are: ``` total runtime: 55.65s. calls to allocation functions: 19754254 (354960/s) temporary memory allocations: 4951609 (88974/s) peak heap memory consumption: 239.13MB peak RSS (including heaptrack overhead): 463.79MB total memory leaked: 198.01MB ``` While with this patch the results are: ``` total runtime: 55.37s. calls to allocation functions: 19068237 (344403/s) # -3.47 % temporary memory allocations: 4261772 (76974/s) # -13.93 % (!!!) peak heap memory consumption: 239.13MB peak RSS (including heaptrack overhead): 463.73MB total memory leaked: 198.01MB ``` So we get rid of a lot of temporary allocations. Using `SmallSet<8>` makes sense to me because at least here for x86 BdVer2, the size of that set is never more than 3, over all of llvm test-suite + RawSpeed. The story might be different on other targets, not sure if it will ever justify whole DenseSet, but if it does SmallDenseSet might be a compromise.	2020-06-12 23:10:54 +03:00
Michael Liao	e7b920e6fe	[DAGCombine] Generalize the case (add (or x, c1), c2) -> (add x, (c1 + c2)) Reviewers: arsenm Subscribers: sdardis, wdng, hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, ecnelises, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81708	2020-06-12 13:53:08 -04:00
Matt Arsenault	350ee7fb3f	GlobalISel: Fix not erasing old instruction in sitofp/uitofp lowering	2020-06-12 10:33:23 -04:00
Simon Pilgrim	5509e2cc2e	[DAG] foldAddSubOfSignBit - add support for non-uniform vector constants	2020-06-12 14:58:15 +01:00
diggerlin	c6be3ea524	[NFC] clean up the AsmPrinter::emitLinkage for AIX part SUMMARY: Since we deal with aix emitLinkage in the PPCAIXAsmPrinter::emitLinkage() in the patch https://reviews.llvm.org/D75866. It do not go to AsmPrinter::emitLinkage() any more, we clean up some aix related code in the AsmPrinter::emitLinkage() Reviewers: Jason liu Differential Revision: https://reviews.llvm.org/D81613	2020-06-11 13:33:51 -04:00
Petar Avramovic	bd3d951b8b	AMDGPU/GlobalISel: Fix lower for f64->f16 G_FPTRUNC Put AND before ADD in LegalizerHelper::lowerFPTRUNC_F64_TO_F16 in order to match algorithm from AMDGPUTargetLowering::LowerFP_TO_FP16. Differential Revision: https://reviews.llvm.org/D81666	2020-06-11 18:19:27 +02:00
Dominik Montada	f24e2e9eeb	[GlobalISel] fix crash in IRTranslator, MachineIRBuilder when translating @llvm.dbg.value intrinsic and using -debug Summary: Fix crash when using -debug caused by the GlobalISel observer trying to print an incomplete DBG_VALUE instruction. This was caused by the MachineIRBuilder using buildInstr, which immediately inserts the instruction causing print, instead of using BuildMI to first build up the instruction and using insertInstr when finished. Add RUN-line to existing debug-insts.ll test with -debug flag set to make sure no crash is happening. Also fixed a missing %s in the 2nd RUN-line of the same test. Reviewers: t.p.northover, aditya_nandakumar, aemerson, dsanders, arsenm Reviewed By: arsenm Subscribers: wdng, arsenm, rovka, hiraditya, volkan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76934	2020-06-11 10:47:49 +02:00
David Sherwood	bd97342a0c	[CodeGen] Let computeKnownBits do something sensible for scalable vectors Until we have a real need for computing known bits for scalable vectors I have simply changed the code to bail out for now and pretend we know nothing. I've also fixed up some simple callers of computeKnownBits too. Differential Revision: https://reviews.llvm.org/D80437	2020-06-11 08:17:11 +01:00
Matt Arsenault	0671a4c508	RegAllocFast: Avoid unused method warning in release builds	2020-06-10 15:23:56 -04:00
Matt Arsenault	0f2af15c1b	GlobalISel: Make default implementation of legalizeCustom unreachable If the target explicitly requested custom legalization, it should be required to implement this. Also move default legalizeIntrinsic implementation into the header so it's next to the related legalizeCustom.	2020-06-10 11:05:59 -04:00
Wang, Pengfei	6eb9eae010	[MS] Copy the symbols assigned to the former instruction when memory folding. The memory folding raplaced the old instruction without copying the symbols assigned. Which will resulted in built fail due to the lost symbols. Reviewed by craig.topper Differential Revision: https://reviews.llvm.org/D78471	2020-06-10 15:38:32 +08:00
diggerlin	edd819c757	[AIX] supporting the visibility attribute for aix assembly SUMMARY: in the aix assembly , it do not have .hidden and .protected directive. in current llvm. if a function or a variable which has visibility attribute, it will generate something like the .hidden or .protected , it can not recognize by aix as. in aix assembly, the visibility attribute are support in the pseudo-op like .extern Name [ , Visibility ] .globl Name [, Visibility ] .weak Name [, Visibility ] in this patch, we implement the visibility attribute for the global variable, function or extern function . for example. extern __attribute__ ((visibility ("hidden"))) int bar(int* ip); __attribute__ ((visibility ("hidden"))) int b = 0; __attribute__ ((visibility ("hidden"))) int foo(int* ip){ return (*ip)++; } the visibility of .comm linkage do not support , we will have a separate patch for it. we have the unsupported cases ("default" and "internal") , we will implement them in a a separate patch for it. Reviewers: Jason Liu ,hubert.reinterpretcast,James Henderson Differential Revision: https://reviews.llvm.org/D75866	2020-06-09 16:15:06 -04:00
Matt Arsenault	32823091c3	GlobalISel: Set instr/debugloc before any legalizer action It was annoying enough that every custom lowering needed to set the insert point, but this was made worse since now these all needed to be updated to setInstrAndDebugLoc. Consolidate these so every legalization action has the right insert position by default. This should fix dropping debug info in every custom AMDGPU legalization.	2020-06-09 15:37:02 -04:00
Matt Arsenault	b94c9e3b55	GlobalISel: Improve MachineIRBuilder construction The current relationship between LegalizerHelper and MachineIRBuilder confuses me, because the LegalizerHelper modifies the MachineIRBuilder which it does not own. Constructing a LegalizerHelper destroys the insert point, since the constructor calls setMF, which clears all the fields. Try to separate these functions, so it's possible to construct a LegalizerHelper from an existing MachineIRBuilder without losing the insert point/debug loc.	2020-06-09 15:05:04 -04:00
Matt Arsenault	babbf4441b	GlobalISel: Move some trivial MIRBuilder methods into the header The construction APIs for MachineIRBuilder don't make much sense, and it's been annoying to sort through it with these trivial functions separate from the declaration.	2020-06-09 15:04:48 -04:00
Matt Arsenault	bb6cb6bfe4	GlobalISel: Remove redundant check in verifier This was already checked earlier for all instructions.	2020-06-09 15:04:27 -04:00
Matt Arsenault	6eeac6ae33	GlobalISel: Fix double printing new instructions in legalizer New instructions were getting printed both in createdInstr, and in the final printNewInstrs, so it made it look like the same instructions were created twice. This overall made reading the debug output harder. Stop printing the initial construction and only print new instructions in the summary at the end. This avoids printing the less useful case where instructions are sometimes initially created with no operands. I'm not sure this is the correct instance to remove; now the visible ordering is different. Now you will typically see the one erased instruction message before all the new instructions in order. I think this is the more logical view of typical legalization changes, although it's mechanically backwards from the normal insert-new-erase-old pattern.	2020-06-09 15:02:31 -04:00
David Green	2fea3fe41c	[MachineScheduler] Update available queue on the first mop of a new cycle If a resource can be held for multiple cycles in the schedule model then an instruction can be placed into the available queue, another instruction can be scheduled, but the first will not be taken back out if the two instructions hazard. To fix this make sure that we update the available queue even on the first MOp of a cycle, pushing available instructions back into the pending queue if they now conflict. This happens with some downstream schedules we have around MVE instruction scheduling where we use ResourceCycles=[2] to show the instruction executing over two beats. Apparently the test changes here are OK too. Differential Revision: https://reviews.llvm.org/D76909	2020-06-09 19:13:53 +01:00
Sanjay Patel	702cf93356	[DAGCombiner] allow more folding of fadd + fmul into fma If fmul and fadd are separated by an fma, we can fold them together to save an instruction: fadd (fma A, B, (fmul C, D)), N1 --> fma(A, B, fma(C, D, N1)) The fold implemented here is actually a specialization - we should be able to peek through >1 fma to find this pattern. That's another patch if we want to try that enhancement though. This transform was guarded by the TLI hook enableAggressiveFMAFusion(), so it was done for some in-tree targets like PowerPC, but not AArch64 or x86. The hook is protecting against forming a potentially more expensive computation when fma takes longer to execute than a single fadd. That hook may be needed for other transforms, but in this case, we are replacing fmul+fadd with fma, and the fma should never take longer than the 2 individual instructions. 'contract' FMF is all we need to allow this transform. That flag corresponds to -ffp-contract=fast in Clang, so we are allowed to form fma ops freely across expressions. Differential Revision: https://reviews.llvm.org/D80801	2020-06-09 10:41:27 -04:00
Guillaume Chatelet	800e100588	Revert "[Alignment][NFC] Migrate TargetLowering::allowsMemoryAccess" This reverts commit `f21c52667e`.	2020-06-09 10:43:59 +00:00
Simon Wallis	4dba59689d	[ARM] prologue instructions emitted for naked function with >64 byte argument Summary: The naked function attribute is meant to suppress all function prologue/epilogue instructions. On ARM, some are still emitted if an argument greater than 64 bytes in size (the threshold for using the byval attribute in IR) is passed partially in registers. Perform the check for Attribute::Naked and early exit in SelectionDAGISel::LowerArguments(). Checking in ARMFrameLowering::determineCalleeSaves() is too late. A test case is included. Reviewers: llvm-commits, olista01, danielkiss Reviewed By: danielkiss Subscribers: kristof.beyls, hiraditya, danielkiss Tags: #llvm Differential Revision: https://reviews.llvm.org/D80715 Change-Id: Icedecf2a4ad31bc3c35ab0df7489a9d346e1f7cc	2020-06-09 11:33:03 +01:00
Guillaume Chatelet	3b6196c9b3	[Alignment][NFC] TargetLowering::allowsMisalignedMemoryAccesses Summary: Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::allowsMisalignedMemoryAccesses` without marking it override. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81374	2020-06-09 10:17:42 +00:00
Guillaume Chatelet	f21c52667e	[Alignment][NFC] Migrate TargetLowering::allowsMemoryAccess Summary: Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::allowsMemoryAccess` without marking it override. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81379	2020-06-09 10:11:07 +00:00
Guillaume Chatelet	e26ed6bdae	Fix unused variable warning	2020-06-09 08:56:05 +00:00
Kang Zhang	1b6602275d	[MachineVerifier] Add TiedOpsRewritten flag to fix verify two-address error Summary: Currently, MachineVerifier will attempt to verify that tied operands satisfy register constraints as soon as the function is no longer in SSA form. However, PHIElimination will take the function out of SSA form while TwoAddressInstructionPass will actually rewrite tied operands to match the constraints. PHIElimination runs first in the pipeline. Therefore, whenever the MachineVerifier is run after PHIElimination, it will encounter verification errors on any tied operands. This patch adds a function property called TiedOpsRewritten that will be set by TwoAddressInstructionPass and will control when the verifier checks tied operands. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D80538	2020-06-09 07:39:42 +00:00
David Sherwood	cc8872400c	[CodeGen] Ensure callers of CreateStackTemporary use sensible alignments In two instances of CreateStackTemporary we are sometimes promoting alignments beyond the stack alignment. I have introduced a new function called getReducedAlign that will return the alignment for the broken down parts of illegal vector types. For example, on NEON a <32 x i8> type is made up of two <16 x i8> types - in this case the sensible alignment is 16 bytes, not 32. In the legalization code wherever we create stack temporaries I have started using the reduced alignments instead for illegal vector types. I added a test to CodeGen/AArch64/build-one-lane.ll that tries to insert an element into an illegal fixed vector type that involves creating a temporary stack object. Differential Revision: https://reviews.llvm.org/D80370	2020-06-09 08:10:17 +01:00
Yonghong Song	3eb465a329	[DebugInfo] Fix assertion for extern void type Commit `d77ae1552f` ("[DebugInfo] Support to emit debugInfo for extern variables") added support to emit debuginfo for extern variables. Currently, only BPF target enables to emit debuginfo for extern variables. But if the extern variable has "void" type, the compilation will fail. -bash-4.4$ cat t.c extern void bla; void test() { void x = &bla; return x; } -bash-4.4$ clang -target bpf -g -O2 -S t.c missing global variable type !1 = distinct !DIGlobalVariable(name: "bla", scope: !2, file: !3, line: 1, isLocal: false, isDefinition: false) ... fatal error: error in backend: Broken module found, compilation aborted! PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: ... The IR requires a DIGlobalVariable must have a valid type and the "void" type does not generate any type, hence the above fatal error. Note that if the extern variable is defined as "const void", the compilation will succeed. -bash-4.4$ cat t.c extern const void bla; const void test() { const void x = &bla; return x; } -bash-4.4$ clang -target bpf -g -O2 -S t.c -bash-4.4$ cat t.ll ... !1 = distinct !DIGlobalVariable(name: "bla", scope: !2, file: !3, line: 1, type: !6, isLocal: false, isDefinition: false) !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: null) ... Since currently, "const void extern_var" is supported by the debug info, it is natural that "void extern_var" should also be supported. This patch disabled assertion of "void extern_var" in IR verifier and add proper guarding when emiting potential null debug info type to dwarf types. Differential Revision: https://reviews.llvm.org/D81131	2020-06-08 13:43:18 -07:00
Andrew Litteken	bb677cacc8	[SuffixTree][MachOpt] Factoring out Suffix Tree and adding Unit Tests This moves the SuffixTree test used in the Machine Outliner and moves it into Support for use in other outliners elsewhere in the compilation pipeline. Differential Revision: https://reviews.llvm.org/D80586	2020-06-08 12:44:18 -07:00
Hendrik Greving	f3d8a93970	[ModuloSchedule] Support instructions with > 1 destination when walking canonical use. Fixes a minor bug that led to finding the wrong register if the definition had more than one register destination.	2020-06-08 11:43:59 -07:00
Jan-Willem Maessen	3610d31e7a	[NFC] Fix quadratic LexicalScopes::constructScopeNest We sometimes have functions with large numbers of sibling basic blocks (usually with an error path exit from each one). This was triggering the qudratic behavior in this function - after visiting each child llvm would re-scan the parent from the beginning again. We modify the work stack to record the next index to be worked on alongside the pointer. This avoids the need to linearly search for the next unfinished child. Differential Revision: https://reviews.llvm.org/D80029	2020-06-08 18:40:56 +01:00
Christopher Tetreault	caa2fddce7	[SVE] Eliminate calls to default-false VectorType::get() from CodeGen Reviewers: efriedma, c-rhodes, david-arm, spatel, craig.topper, aqjune, paquette, arsenm, gchatelet Reviewed By: spatel, gchatelet Subscribers: wdng, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80313	2020-06-08 10:26:10 -07:00
Guillaume Chatelet	54076610dc	[Alignment][NFC] Deprecate dead code from CallingConvLower.h Summary: This is a followup on D81196. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81362	2020-06-08 14:49:39 +00:00
Matt Arsenault	5f7e38d8f4	GlobalISel: Use Register	2020-06-08 10:15:53 -04:00
Matt Arsenault	f13ba22227	GlobalISel: Remove unused header	2020-06-08 10:15:53 -04:00
Matt Arsenault	f41994f85b	GlobalISel: Make it clearer that regbank/class are mutually exclusive	2020-06-08 10:15:53 -04:00
Matt Arsenault	c1d771dc4b	GlobalISel: Simplify debug printing	2020-06-08 10:15:53 -04:00
Guillaume Chatelet	94b0c32a0b	[Alignment][NFC] Migrate HandleByVal to Align Summary: Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::HandleByVal` without marking it `override`. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81365	2020-06-08 10:50:27 +00:00
Sander de Smalen	ae09670ee4	[CodeGen][SVE] CopyToReg: Split scalable EVTs that are not powers of 2 Scalable vectors cannot use 'BUILD_VECTOR', so it is necessary to properly split and widen scalable vectors when passing them to CopyToReg/CopyFromReg. This functionality is added to TargetLoweringBase::getVectorTypeBreakdown(). This patch only adds support for 'splitting' scalable vectors that are a multiple of some legal type, e.g. <vscale x 6 x i64> -> 3 x <vscale x 2 x i64> Reviewers: efriedma, c-rhodes Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D80139	2020-06-08 10:39:18 +01:00
James Y Knight	748d92b4d3	Simplify MachineVerifier's block-successor verification. There's two properties we want to verify: 1. That the successors returned by analyzeBranch are in the CFG successor list, and 2. That there are no extraneous successors are in the CFG successor list. The previous implementation mostly accomplished this, but in a very convoluted manner. Differential Revision: https://reviews.llvm.org/D79793	2020-06-06 22:30:51 -04:00
James Y Knight	1978309db1	MachineBasicBlock::updateTerminator now requires an explicit layout successor. Previously, it tried to infer the correct destination block from the successor list, but this is a rather tricky propspect, given the existence of successors that occur mid-block, such as invoke, and potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in particular would be problematic, because its successor blocks are not distinct from "normal" successors, as EHPads are.) Instead, require the caller to pass in the expected fallthrough successor explicitly. In most callers, the correct block is immediately clear. But, in MachineBlockPlacement, we do need to record the original ordering, before starting to reorder blocks. Unfortunately, the goal of decoupling the behavior of end-of-block jumps from the successor list has not been fully accomplished in this patch, as there is currently no other way to determine whether a block is intended to fall-through, or end as unreachable. Further work is needed there. Differential Revision: https://reviews.llvm.org/D79605	2020-06-06 22:30:51 -04:00
Simon Pilgrim	f14d4c9c54	EHPersonalities.h - reduce Triple.h include to forward declaration. NFC. Move implicit include dependencies down to source files.	2020-06-06 15:48:31 +01:00
Sanjay Patel	302cc8a121	[DAGCombiner] clean-up FMA+FMUL folds; NFC D80801 suggests some readability improvements before mocing this block.	2020-06-06 10:32:54 -04:00
Nikita Popov	cb5724c71e	[CGP] Remove unnecessary MaybeAlign use (NFC) Stores now always have an alignment.	2020-06-05 23:18:26 +02:00
Matt Arsenault	eaa8af9322	GlobalISel: Add helper for constructing load from offset	2020-06-05 15:06:03 -04:00
Matt Arsenault	45e1a22a92	GlobalISel: Make known bits/alignment API more consistent Just computing the alignment makes sense without caring about the general known bits, such as for non-integral pointers. Separate the two and start calling into the TargetLowering hooks for frame indexes. Start calling the TargetLowering implementation for FrameIndexes, which improves the AMDGPU matching for stack addressing modes. Also introduce a new hook for returning known alignment of target instructions. For AMDGPU, it would be useful to report the known alignment implied by certain intrinsic calls. Also stop using MaybeAlign.	2020-06-05 14:57:22 -04:00
Nikita Popov	d370088611	[LiveDebugValues] Fix output stream (NFC) This should dump to the provided Out, rather than dbgs(), though they coincide in current usage.	2020-06-05 20:02:22 +02:00
Nikita Popov	6a53264926	[LiveDebugValues] Remove PendingInLocs (NFC) PendingInLocs ends up having the same value as InLocs, just computed a bit more indirectly. It is a leftover of a previous implementation approach. This patch drops PendingInLocs, as well as the Diff and Removed calulations, which are no longer needed. Differential Revision: https://reviews.llvm.org/D80868	2020-06-05 20:01:29 +02:00
Sander de Smalen	937cb7a8c7	Reland D80640: [CodeGen][SVE] Calculate correct type legalization for scalable vectors. This reverts commit `9bcef270d7`.	2020-06-05 18:09:31 +01:00
Sander de Smalen	9bcef270d7	Revert "[CodeGen][SVE] Calculate correct type legalization for scalable vectors." Seems to break some buildbots, reverting the patch for now. This reverts commit `164f4b9d26`.	2020-06-05 16:03:52 +01:00
Sander de Smalen	164f4b9d26	[CodeGen][SVE] Calculate correct type legalization for scalable vectors. This patch updates TargetLoweringBase::computeRegisterProperties and TargetLoweringBase::getTypeConversion to support scalable vectors, and make the right calls on how to legalise them. These changes are required to legalise both MVTs and EVTs. Reviewers: efriedma, david-arm, ctetreau Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D80640	2020-06-05 15:20:34 +01:00
Denis Antrushin	dae64d8f42	Fix build breakage caused by `66a1b83bf9`	2020-06-05 15:53:09 +03:00
Denis Antrushin	66a1b83bf9	[TargetLowering][NFC] More efficient emitPatchpoint(). Current implementation of emitPatchpoint() is very inefficient: for every FrameIndex operand if creates new MachineInstr with that operand expanded and all other copied as is. Since PATCHPOINT/STATEPOINT instructions may have a lot of FrameIndex operands, we end up creating and erasing many machine instructions. But we can do it in single pass, with only one new machine instruction generated. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D81181	2020-06-05 14:57:29 +03:00
Kerry McLaughlin	89fc0166f5	[CodeGen][SVE] Legalisation of extends with scalable types Summary: This patch adds legalisation of extensions where the operand of the extend is a legal scalable type but the result is not. EXTRACT_SUBVECTOR is used to split the result, before being replaced by target-specific [S\|U]UNPK[HI\|LO] operations. For example: ``` zext <vscale x 16 x i8> %a to <vscale x 16 x i16> ``` should emit: ``` uunpklo z2.h, z0.b uunpkhi z1.h, z0.b ``` Reviewers: sdesmalen, efriedma, david-arm Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, huihuiz, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79587	2020-06-05 12:08:42 +01:00
Philip Reames	4c735439fd	[Statepoint] Migrate a few tests to gc-live bundle format and fix assert The assert was missed in `0e7c7705`, migrating the test revealed the problem.	2020-06-04 18:15:58 -07:00
Vedant Kumar	198762680e	[LiveDebugValues] Cache LexicalScopes::getMachineBasicBlocks, NFCI Summary: Cache the results from getMachineBasicBlocks in LexicalScopes to speed up UserValueScopes::dominates queries. This replaces the caching done in UserValueScopes. Compared to the old caching method, this reduces memory traffic when a VarLoc is copied (e.g. when a VarLocMap grows), and enables caching across basic blocks. When compiling sqlite 3.5.7 (CTMark version), this patch reduces the number of calls to getMachineBasicBlocks from 10,207 to 1,093. I also measured a small compile-time reduction (~ 0.1% of total wall time, on average, on my machine). As a drive-by, I made the DebugLoc in UserValueScopes a const reference to cut down on MetadataTracking traffic. Reviewers: jmorse, Orlando, aprantl, nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80957	2020-06-04 16:58:45 -07:00
Matt Arsenault	af867b7850	DAG: Change computeKnownBitsForFrameIndex to be usable by GISel This wasn't getting much value from the DAG or depth arguments, since it's only called on the frame index root nodes. FrameIndexes can also only return a scalar value, so it also didn't need DemandedElts.	2020-06-04 10:50:26 -04:00
Matt Arsenault	931a68f26b	RegAllocFast: Remove dead code	2020-06-04 09:38:31 -04:00
Sanjay Patel	652b3757c8	[x86] add test/code comment for chain value use (PR46195); NFC	2020-06-04 09:15:17 -04:00
Simon Pilgrim	adf10dcf2e	[DAG] scalarizeBinOpOfSplats - extract from the source of splat vector (PR46189) D79003/rG9fa58d1bf2f8 exposed an issue with scalarizeBinOpOfSplats that we were extracting from the splatted vector result instead of the source, the splat index is only valid for the source vector not the result, which may contain undefs, including at the splat index.	2020-06-04 11:58:59 +01:00
Tim Northover	87e24c3200	Revert "[DAGCombiner] avoid unnecessary indirection from SDNode/SDValue; NFCI" This reverts commit `21dadd774f`. In at least PromoteIntBinOps, they wanted to know about users of all values produced by the node not just the integer being promoted. For example not replacing chain users if the operation was a load breaks the ordering of the DAG.	2020-06-04 11:53:14 +01:00
Madhur Amilkanthwar	b3cff3c720	Utility to dump .dot representation of SelectionDAG without firing viewer Summary: This patch adds support for dumping .dot representation of SelectionDAG. It is inspired from the fact that, a developer may want to just dump the graph at a predictable path with a simple name to compare. The exisitng utility (i.e. viewGraph) are overkill for this motive hence this patch adds the requires support while using the core routines from GraphWriter. Example usage: DAG.dumpDotGraph("/tmp/graph.dot", "MyGraph") will create /tmp/graph.dot file when DAG is an object of SelectionDAG class. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D80711	2020-06-04 11:51:48 +05:30
Philip Reames	ab6779bbd8	[Statepoint] Remove last of old ImmutableStatepoint code To do so, I had to sink the old school inline operand handling into GCStatepointInst which is non ideal. This code should be removed shortly and I was able to at least clean it up a bunch.	2020-06-03 20:31:17 -07:00
Philip Reames	91dd2f2536	[Statepoint] Delete more dead code from old wrappers The verify() routine duplicates IR/Verifier.cpp checks, so while not technically dead it doesn't add any value either.	2020-06-03 20:10:30 -07:00
Matt Arsenault	ed5017e153	GlobalISel: Start defining strict FP instructions The AMDGPU lowering for unconstrained G_FDIV sometimes needs to introduce a mode switch in the middle, so it's helpful to have constrained instructions available to legalize this. Right now nothing is preventing reordering of the mode switch with the other instructions in the expansion.	2020-06-03 20:46:37 -04:00
Quentin Colombet	ccb3c8e861	[RegisterCoalescer] Update empty subranges when rematerializing When we rematerialize a value as part of the coalescing, we may widen the register class of the destination register. When this happens, updateRegDefUses may create additional subranges to account for the wider register class. The created subranges are empty and if they are not defined by the rematerialized instruction we clean them up. However, if they are defined by the rematerialized instruction but unused, we failed to flag them as dead definition and would leave them as empty live-range. This is wrong because empty live-ranges don't interfere with anything, thus if we don't fix them, we would fail to account that the rematerialized instruction clobbers some lanes. E.g., let us consider the following pseudo code: def.lane_low64:reg128 = ldimm newdef:reg32 = COPY def.lane_low64_low32 When rematerialization happens for newdef, we end up with: newdef.lane_low64:reg128 = ldimm = use newdef.lane_low64_low32 Let's look at the live interval of newdef. Before rematerialization, we would get: newdef [defIdx, useIdx:0) 0@defIdx Right after updateRegDefUses, newdef register class is widen to reg128 and the subrange definitions will be augmented to fill the subreg that is used at the definition point, here lane_low64. The resulting live interval would be: newdef [newDefIdx, useIdx:0) 0@newDefIdx * lane_low64_high32 EMPTY * lane_low64_low32 [newDefIdx, useIdx:0) Before this patch this would be the final status of the live interval. Therefore we miss that lane_low64_high32 is actually live on the definition point of newdef. With this patch, after rematerializing, we check all the added subranges and for the ones that are defined but empty, we flag them as dead def. Thus, in that case, newdef would look like this: newdef [newDefIdx, useIdx:0) 0@newDefIdx * lane_low64_high32 [newDefIdx, newDefIdxDead) ; <-- instead of EMPTY * lane_low64_low32 [newDefIdx, useIdx:0) This fixes https://www.llvm.org/PR46154	2020-06-03 17:10:55 -07:00
Matt Arsenault	3866e0a563	GlobalISel: Fail expansion of G_DYN_STACKALLOC for StackGrowsUp	2020-06-03 19:56:07 -04:00
Philip Reames	382b3023cb	[Statepoints][CGP] Minor parameter type cleanup	2020-06-03 16:00:38 -07:00
Matt Arsenault	66251f7e1d	RegAllocFast: Record internal state based on register units Record internal state based on register units. This is often more efficient as there are typically fewer register units to update compared to iterating over all the aliases of a register. Original patch by Matthias Braun, but I've been rebasing and fixing it for almost 2 years and fixed a few bugs causing intermediate failures to make this patch independent of the changes in https://reviews.llvm.org/D52010.	2020-06-03 16:51:46 -04:00
Victor Huang	3abe7aca45	[CodeGen] Enable tail call position check for speculatable functions In the function "Analysis.cpp:isInTailCallPosition", it only checks whether a call is in a tail call position if the call has side effects, access memory or it is not safe to speculative execute. Therefore, a speculatable function will not go through tail call position check and improperly tail called when it is not in a tail-call position. This patch enables tail call position check for speculatable functions. Differential Revision: https://reviews.llvm.org/D80661	2020-06-03 10:37:45 -05:00
Kang Zhang	2cc77b2b8a	[LiveVariables] Don't set undef reg PHI used as live for FromMBB Summary: In the patch D73152, it adds a new function LiveVariables::addNewBlock. This new function will add the reg which PHI used to the MBB which reg is from. But the new function may cause LiveVariable Verification failed when the Src reg in PHI is undef. Reviewed By: bjope Differential Revision: https://reviews.llvm.org/D80077	2020-06-03 15:25:30 +00:00
Henry Kao	c57e41c000	[CodeGen][SVE] Replace deprecated calls in getCopyFromPartsVector() Summary: Replaced getVectorNumElements() with getVectorElementCount(). Added operator overloads for class ElementCount. Fixes warning in several AArch64 unit tests. Reviewers: sdesmalen, kmclaughlin, dancgr, efriedma, each, andwar, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80826	2020-06-03 11:20:02 -04:00
Simon Pilgrim	ea80b40669	[DAG] SimplifyDemandedBits - peek through SHL if we only demand sign bits. If we're only demanding the (shifted) sign bits of the shift source value, then we can use the value directly. This handles SimplifyDemandedBits/SimplifyMultipleUseDemandedBits for both ISD::SHL and X86ISD::VSHLI. Differential Revision: https://reviews.llvm.org/D80869	2020-06-03 16:11:54 +01:00
Simon Pilgrim	c438b257f1	[DAG] GetDemandedBits - don't bother asserting for a non-null cast<> result. NFC. cast<> will assert on failure anyhow. This lets us fold the cast<> with the getAPIntValue() that uses it.	2020-06-03 12:43:07 +01:00
Simon Pilgrim	7a96c181d0	TargetFrameLowering.h - remove unnecessary includes. NFC. Move TargetFrameLowering.h include to the top of the TargetFrameLoweringImpl.cpp includes (clang-format doesn't do this by default as the filenames don't match).	2020-06-03 11:12:42 +01:00
Kadir Cetinkaya	c5468253aa	[llvm] Fix unused variable warnings	2020-06-03 11:49:01 +02:00
Djordje Todorovic	dd1bc59b72	[CSInfo][MIPS][DwarfDebug] Add support for delay slots This adds call site info support for call instructions with delay slot. Search for instructions inside call delay slot, which load value into parameter forwarding registers. Return address of the call points to instruction after call delay slot, which is not the one, immediately after the call instruction. Patch by Nikola Tesic Differential revision: https://reviews.llvm.org/D78107	2020-06-03 11:25:17 +02:00
Eric Christopher	153a24ab0f	Undo initialization of TRI in CGP as this is unconditionally initialized later.	2020-06-02 15:08:54 -07:00
Kadir Cetinkaya	af86a10bad	[llvm] Fix unused variable warning	2020-06-02 22:46:24 +02:00
Eric Christopher	971459c3ef	Fix up clang-tidy warnings around null and pointers.	2020-06-02 13:24:20 -07:00
Amy Kwan	a3ada630d8	[DAGCombiner] Combine shifts into multiply-high This patch implements a target independent DAG combine to produce multiply-high instructions from shifts. This DAG combine will combine shifts for any type as long as the MULH on the narrow type is legal. For now, it is enabled on PowerPC as PowerPC is the only target that has an implementation of the isMulhCheaperThanMulShift TLI hook introduced in D78271. Moreover, this DAG combine focuses on catching the pattern: (shift (mul (ext <narrow_type>:$a to <wide_type>), (ext <narrow_type>:$b to <wide_type>)), <narrow_width>) to produce mulhs when we have a sign-extend, and mulhu when we have a zero-extend. The patch performs the following checks: - Operation is a right shift arithmetic (sra) or logical (srl) - Input to the shift is a multiply - Both operands to the shift are sext/zext nodes - The extends into the multiply are both the same - The narrow type is half the width of the wide type - The shift amount is the width of the narrow type - The respective mulh operation is legal Differential Revision: https://reviews.llvm.org/D78272	2020-06-02 15:22:48 -05:00
Djordje Todorovic	4e8e5d60b4	[CSInfo][NFC] Interpret loaded parameter value separately The collectCallSiteParameters() method searches for instructions which load values into registers used for parameters passing. Previously, interpretation of those values, loaded by one such instruction, was implemented inside collectCallSiteParameters() method. This patch moves the interpretation code from collectCallSiteParameters() method into a separate static method named interpretValue. New method is called from collectCallSiteParameters() to process each instruction from targeted instruction scope. The collectCallSiteParameters() searches for loaded parameter value among instructions which precede the call instruction, inside the same basic block. When needed, new method (interpretValue) could be used for searching any instruction scope. This is preparation for search of parameter value, loaded inside call delay slot. Patch by Nikola Tesic Differential revision: https://reviews.llvm.org/D78106	2020-06-02 13:05:04 +02:00
Sriraman Tallam	e0bca46b08	Options for Basic Block Sections, enabled in D68063 and D73674. This patch adds clang options: -fbasic-block-sections={all,<filename>,labels,none} and -funique-basic-block-section-names. LLVM Support for basic block sections is already enabled. + -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic block sections for all or a subset of basic blocks. "labels" only enables basic block symbols. + -funique-basic-block-section-names: Enables unique section names for basic block sections, disabled by default. Differential Revision: https://reviews.llvm.org/D68049	2020-06-02 00:23:32 -07:00
Denis Antrushin	fa818ded24	[StatepointLowering] Handle UNDEF gc values. Do not spill UNDEF GC values. Instead, replace corresponding gc.relocate intrinsic with an (arbitrary, but recognizable) constant. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D80714	2020-06-02 10:18:33 +03:00
Richard Smith	4ccb6c36a9	Fix violations of [basic.class.scope]p2. These cases all follow the same pattern: struct A { friend class X; //... class X {}; }; But 'friend class X;' injects 'X' into the surrounding namespace scope, rather than introducing a class member. So the second 'class X {}' is a completely different type, which changes the meaning of the earlier name 'X' from '::X' to 'A::X'. Additionally, the friend declaration is pointless -- members of a class don't need to be befriended to be able to access private members.	2020-06-01 22:03:05 -07:00
Vedant Kumar	776708b00b	[LiveDebugValues] Remove early-exit when testing regmasks, NFC In transferRegisterDef, if the instruction has a regmask attached, we'll check if any currently used register is clobbered by the regmask. The early exit in this scan isn't necessary, costs a set lookup, and is almost never taken [1]. Delete it. [1] http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp.html#L1136	2020-06-01 15:16:10 -07:00
Vedant Kumar	11c617c417	[LiveDebugValues] Add LocIndex::u32_{location,index}_t types for readability, NFC This is per Adrian's suggestion in https://reviews.llvm.org/D80684.	2020-06-01 11:02:36 -07:00
Vedant Kumar	2ecaf93525	[LiveDebugValues] Speed up removeEntryValue, NFC Summary: Instead of iterating over all VarLoc IDs in removeEntryValue(), just iterate over the interval reserved for entry value VarLocs. This changes the iteration order, hence the test update -- otherwise this is NFC. This appears to give an ~8.5x wall time speed-up for LiveDebugValues when compiling sqlite3.c 3.30.1 with a Release clang (on my machine): ``` ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- Before: 2.5402 ( 18.8%) 0.0050 ( 0.4%) 2.5452 ( 17.3%) 2.5452 ( 17.3%) Live DEBUG_VALUE analysis After: 0.2364 ( 2.1%) 0.0034 ( 0.3%) 0.2399 ( 2.0%) 0.2398 ( 2.0%) Live DEBUG_VALUE analysis ``` The change in removeEntryValue() is the only one that appears to affect wall time, but for consistency (and to resolve a pending TODO), I made the analogous changes for iterating over SpillLocKind VarLocs. Reviewers: nikic, aprantl, jmorse, djtodoro Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80684	2020-06-01 11:02:36 -07:00
Matt Arsenault	836c7dcf12	DAG: Fix getNode dropping flags if there's a glue output The AMDGPU non-strict fdiv lowering needs to introduce an FP mode switch in some cases, and has custom nodes to provide chain/glue for the intermediate FP operations. We need to propagate nofpexcept here, but getNode was dropping the flags. Adding nofpexcept in the AMDGPU custom lowering is left to a future patch. Also fix a second case where flags were dropped, but in this case it seems it just didn't handle this number of operands. Test will be included in future AMDGPU patch.	2020-06-01 13:48:02 -04:00
hsmahesha	0ed2c04636	[AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes Summary: While clustering mem ops, AMDGPU target needs to consider number of clustered bytes to decide on max number of mem ops that can be clustered. This patch adds support to pass number of clustered bytes to target mem ops clustering logic. Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar Reviewed By: foad Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80545	2020-06-01 22:52:34 +05:30
Chen Zheng	2a24d350db	[MachineCombine] add a hook for resource length limit	2020-05-31 23:21:04 -04:00
Matt Arsenault	95f65a7c6c	AArch64/GlobalISel: Fix incorrect ptrmask usage for alignment I inverted the mask when I ported to the new form of G_PTRMASK in `8bc03d2168`. I don't think this really broke anything, since G_VASTART isn't handled for types with an alignment higher than the stack alignment.	2020-05-31 10:56:55 -04:00
Florian Hahn	ec25a71eb7	[ScheduleDAG] Avoid unnecessary recomputation of topological order. In some cases ScheduleDAGRRList has to add new nodes to resolve problems with interfering physical registers. When new nodes are added, it completely re-computes the topological order, which can take a long time, but is unnecessary. We only add nodes one by one, and initially they do not have any predecessors. So we can just insert them at the end of the vector. Later we add predecessors, but the helper function properly updates the topological order much more efficiently. With this change, the compile time for the program below drops from 300s to 30s on my machine. define i11129 @test1() { %L1 = load i11129, i11129* undef %B30 = ashr i11129 %L1, %L1 store i11129 %B30, i11129* undef ret i11129 %L1 } This should be generally beneficial, as we can skip a large amount of work. Theoretically there are some scenarios where we might not safe much, e.g. when we add a dependency between the first and last node. Then we would have to shift all nodes. But we still do not have to spend the time re-computing the initial order. Reviewers: MatzeB, atrick, efriedma, niravd, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D59722	2020-05-31 11:04:35 +01:00

... 5 6 7 8 9 ...

29326 Commits