llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	6d855ea024	[ConstantRange] Rename isWrappedSet() to isUpperWrapped() Split out from D59749. The current implementation of isWrappedSet() doesn't do what it says on the tin, and treats ranges like [X, Max] as wrapping, because they are represented as [X, 0) when using half-inclusive ranges. This also makes it inconsistent with the semantics of isSignWrappedSet(). This patch renames isWrappedSet() to isUpperWrapped(), in preparation for the introduction of a new isWrappedSet() method with corrected behavior. llvm-svn: 357107	2019-03-27 18:19:33 +00:00
Matt Arsenault	2e9ddcc30e	RegPressure: Fix crash on blocks with only dbg_value If there were only dbg_values in the block, recede would hit the beginning of the block and try to use that dbg_value as a real instruction. llvm-svn: 357105	2019-03-27 18:14:02 +00:00
Amara Emerson	381188f1f3	[GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead instructions. The artifact combiners push instructions which have been marked for deletion onto a list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc). This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction. As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead. Differential Revision: https://reviews.llvm.org/D59892 llvm-svn: 357101	2019-03-27 17:47:42 +00:00
Quentin Colombet	89daf49e5c	[PeepholeOpt] Don't stop simplifying copies on sequence of subregs This patch removes an overly conservative check that would prevent simplifying copies when the value we were tracking would go through several subregister indices. Indeed, the intent of this check was to not track values whenever we have to compose subregister, but actually what the check was doing was bailing anytime we see a second subreg, even if that second subreg would actually be the new source of truth (as opposed to a part of that subreg). Differential Revision: https://reviews.llvm.org/D59891 llvm-svn: 357095	2019-03-27 17:27:56 +00:00
Matt Arsenault	b19361243b	PEI: Delay checking requiresFrameIndexReplacementScavenging Currently this is called before the frame size is set on the function. For AMDGPU, the scavenger is used for large frames where part of the offset needs to be materialized in a register, so estimating the frame size is useful for knowing whether the scavenger is useful. llvm-svn: 357087	2019-03-27 16:37:31 +00:00
Matt Arsenault	733b8571b4	MIR: Freeze reserved regs after parsing everything The AMDGPU implementation of getReservedRegs depends on MachineFunctionInfo fields that are parsed from the YAML section. This was reserving the wrong register since it was setting the reserved regs before parsing the correct one. Some tests were relying on the default reserved set for the assumed default calling convention. llvm-svn: 357083	2019-03-27 16:12:26 +00:00
Nirav Dave	b5630a2ab1	[DAGCombiner] Unify Lifetime and memory Op aliasing. Rework BaseIndexOffset and isAlias to fully work with lifetime nodes and fold in lifetime alias analysis. This is mostly NFC. Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59794 llvm-svn: 357070	2019-03-27 14:14:46 +00:00
Nirav Dave	96a264e053	[DAGCombine] Refactor GatherAllAliases. NFCI. llvm-svn: 357069	2019-03-27 14:14:35 +00:00
Hans Wennborg	5c0d7a24e8	Re-commit r355490 "[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default" Original commit by Ayonam Ray. This commit adds a regression test for the issue discovered in the previous commit: that the range check for the jump table can only be omitted if the fall-through destination of the jump table is unreachable, which isn't necessarily true just because the default of the switch is unreachable. This addresses the missing optimization in PR41242. > During the lowering of a switch that would result in the generation of a > jump table, a range check is performed before indexing into the jump > table, for the switch value being outside the jump table range and a > conditional branch is inserted to jump to the default block. In case the > default block is unreachable, this conditional jump can be omitted. This > patch implements omitting this conditional branch for unreachable > defaults. > > Differential Revision: https://reviews.llvm.org/D52002 > Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev llvm-svn: 357067	2019-03-27 14:10:11 +00:00
Jonas Paulsson	38342a5185	[DAGCombiner] Don't allow addcarry if the carry producer is illegal. getAsCarry() checks that the input argument is a carry-producing node before allowing a transformation to addcarry. This patch adds a check to make sure that the carry-producing node is legal. If it is not, it may not remain in a form that is manageable by the target backend. The test case caused a compilation failure during instruction selection for this reason on SystemZ. Patch by Ulrich Weigand. Review: Sanjay Patel https://reviews.llvm.org/D59822 llvm-svn: 357052	2019-03-27 08:41:46 +00:00
Francis Visoiu Mistrih	ee1a6e70fa	[Remarks] Emit a section containing remark diagnostics metadata A section containing metadata on remark diagnostics will be emitted if the flag (-mllvm) -remarks-section is present. For now, the metadata is: * a magic number for remarks: "REMARKS\0" * the version number: a little-endian uint64_t * the absolute file path to the serialized remark diagnostics: a null-terminated string. Differential Revision: https://reviews.llvm.org/D59571 llvm-svn: 357043	2019-03-27 01:13:59 +00:00
Quentin Colombet	c74271c537	[LiveRange] Reset the VNIs when splitting subranges When splitting a subrange we end up with two different subranges covering two different, non overlapping, lanes. As part of this splitting the VNIs of the original live-range need to be dispatched to the subranges according to which lanes they are actually defining. Prior to this patch we were assuming that all values were defining all lanes. This was wrong as demonstrated by llvm.org/PR40835. Differential Revision: https://reviews.llvm.org/D59731 llvm-svn: 357032	2019-03-26 21:27:15 +00:00
Sanjay Patel	bb5cba3cca	[SDAG] add simplifications for FP at node creation time We have the folds for fadd/fsub/fmul already in DAGCombiner, so it may be possible to remove that code if we can guarantee that these ops are zapped before they can exist. llvm-svn: 357029	2019-03-26 20:54:15 +00:00
Ali Tamur	02e96648d7	Revert "[llvm] Reapply "Prevent duplicate files in debug line header in dwarf 5."" This reverts commit rL357020. The commit broke the test llvm/test/tools/llvm-objdump/embedded-source.test on some builds including clang-ppc64be-linux-multistage, clang-s390x-linux, clang-with-lto-ubuntu, clang-x64-windows-msvc, llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast (and others). llvm-svn: 357026	2019-03-26 20:05:27 +00:00
Ali Tamur	2f5cd03a3f	[llvm] Reapply "Prevent duplicate files in debug line header in dwarf 5." Reapply rL356941 after regenerating the object file in the failing test llvm/test/tools/llvm-objdump/embedded-source.test from source. Original commit message: [llvm] Prevent duplicate files in debug line header in dwarf 5. Motivation: In previous dwarf versions, file name indexes started from 1, and the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes the primary source file to be explicitly given an entry with an index number 0. The current implementation honors the specification by just duplicating the main source file, once with index number 0, and later maybe with another index number. While this is compliant with the letter of the standard, the duplication causes problems for consumers of this information such as lldb. (Some files are duplicated, where only some of them have a line table although all refer to the same file) With this change, dwarf 5 debug line section files always start from 0, and the zeroth entry is not duplicated whenever possible. This requires different handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5) However, I think the minor complication is worth it, because it enables all consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the file name list homogenously. Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D59515 llvm-svn: 357018	2019-03-26 18:53:23 +00:00
Nirav Dave	a28c514581	[DAG] Avoid smart constructor-based dangling nodes. Various SelectionDAG non-combine operations (e.g. the getNode smart constructor and legalization) may leave dangling nodes by applying optimizations or not fully pruning unused result values. This can result in nodes that are never added to the worklist and therefore can not be pruned. Add a node inserter as the current node deleter to make sure such nodes have the chance of being pruned. Many minor changes, mostly positive. llvm-svn: 356996	2019-03-26 15:08:14 +00:00
Simon Pilgrim	e24441aab0	[TargetLowering] Add SimplifyDemandedBits support for ISD::INSERT_VECTOR_ELT This helps us relax the extension of a lot of scalar elements before they are inserted into a vector. It exposes an issue in DAGCombiner::convertBuildVecZextToZext as some/all the zero-extensions may be relaxed to ANY_EXTEND, so we need to handle that case to avoid a couple of AVX2 VPMOVZX test regressions. Once this is in it should be easier to fix a number of remaining failures to fold loads into VBROADCAST nodes. Differential Revision: https://reviews.llvm.org/D59484 llvm-svn: 356989	2019-03-26 12:32:01 +00:00
Yi Kong	74b874ac4c	Fix nondeterminism introduced in r353954 DenseMap iteration order is not guaranteed, use MapVector instead. Fix provided by srhines. Differential Revision: https://reviews.llvm.org/D59807 llvm-svn: 356988	2019-03-26 12:18:08 +00:00
Ali Tamur	fdce82a814	Revert "[llvm] Prevent duplicate files in debug line header in dwarf 5." This reverts commit `312ab05887`. My commit broke the build; I will revert and find out what happened. llvm-svn: 356951	2019-03-25 21:09:07 +00:00
Ali Tamur	312ab05887	[llvm] Prevent duplicate files in debug line header in dwarf 5. Summary: Motivation: In previous dwarf versions, file name indexes started from 1, and the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes the primary source file to be explicitly given an entry with an index number 0. The current implementation honors the specification by just duplicating the main source file, once with index number 0, and later maybe with another index number. While this is compliant with the letter of the standard, the duplication causes problems for consumers of this information such as lldb. (Some files are duplicated, where only some of them have a line table although all refer to the same file) With this change, dwarf 5 debug line section files always start from 0, and the zeroth entry is not duplicated whenever possible. This requires different handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5) However, I think the minor complication is worth it, because it enables all consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the file name list homogenously. Reviewers: dblaikie, probinson, aprantl, espindola Reviewed By: probinson Subscribers: emaste, jvesely, nhaehnle, aprantl, javed.absar, arichardson, hiraditya, MaskRay, rupprecht, jdoerfert, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D59515 llvm-svn: 356941	2019-03-25 20:08:00 +00:00
Simon Pilgrim	167af1bafb	[SelectionDAG] Add icmp UNDEF handling to SelectionDAG::FoldSetCC First half of PR40800, this patch adds DAG undef handling to icmp instructions to match the behaviour in llvm::ConstantFoldCompareInstruction and SimplifyICmpInst, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.). This involved a lot of tweaking to reduced tests as bugpoint loves to reduce icmp arguments to undef........ Differential Revision: https://reviews.llvm.org/D59363 llvm-svn: 356938	2019-03-25 18:51:57 +00:00
Teresa Johnson	3bd4b5a925	[CGP] Build the DominatorTree lazily Summary: In r355512 CGP was changed to build the DominatorTree only once per function traversal, to avoid repeatedly building it each time it was accessed. This solved one compile time issue but introduced another. In the second case, we now were building the DT unnecessarily many times when we performed many function traversals (i.e. more than once per function when running CGP because of changes made each time). Change to saving the DT in the CodeGenPrepare object, and building it lazily when needed. It is reset whenever we need to rebuild it. The case that exposed the issue there are 617 functions, and we walk them (i.e. execute the "while (MadeChange)" loop in runOnFunction) a total of 12083 times (so previously we were building the DT 12083 times). With this patch we only build the DT 844 times (average of 1.37 times per function). We dropped the total time to compile this file from 538.11s without this patch to 339.63s with it. There is still an issue as CGP is taking much longer than all other passes even with this patch, and before a recent compiler release cut at r355392 the total time to this compile was only 97 sec with a huge reduction in CGP time. I suspect that one of the other recent changes to CGP led to iterating each function many more times on average, but I need to do some more investigation. Reviewers: spatel Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59696 llvm-svn: 356937	2019-03-25 18:38:48 +00:00
Matt Arsenault	b27e4974d0	MISched: Don't schedule regions with 0 instructions I think this is correct, but may not necessarily be the correct fix for the assertion I'm really trying to solve. If a scheduling region was found that only has dbg_value instructions, the RegPressure tracker would end up in an inconsistent state because it would skip over any debug instructions and point to an instruction outside of the scheduling region. It may still be possible for this to happen if there are some real schedulable instructions between dbg_values, but I haven't managed to break this. The testcase is extremely sensitive and I'm not sure how to make it more resistant to future scheduler changes that would avoid stressing this situation. llvm-svn: 356926	2019-03-25 17:15:44 +00:00
Craig Topper	07e3071854	[LegalizeDAG] Expand i16 bswap directly to a rotate by 8 instead of relying on DAG combine. An i16 bswap can be implemented with an i16 rotate by 8. We previously emitted a shift and OR sequence that DAG combine should be able to turn back into rotate. But we might as well go there directly. If rotate isn't legal, LegalizeDAG should further legalize it to either the opposite rotate, or the shift and OR pattern. I don't know of any way to get the existing DAG combine reliance to fail. So I don't know any way to add new tests for this that wouldn't have worked previously. llvm-svn: 356860	2019-03-24 17:02:14 +00:00
Teresa Johnson	4dc851964c	[CGP] Make several static functions member functions (NFC) This is extracted from D59696 as suggested in the review. It is preparation for making the DominatorTree a member variable. llvm-svn: 356857	2019-03-24 15:18:50 +00:00
Simon Pilgrim	94e8f152c1	[TargetLowering] SimplifyDemandedBits trunc(srl(x, C1)) - early out for out of range C1. NFCI. llvm-svn: 356810	2019-03-22 20:53:49 +00:00
Matt Arsenault	b34afa311d	GlobalISel: Fix RegBankSelect for REG_SEQUENCE The AArch64 test was broken since the result register already had a set register class, so this test was a no-op. The mapping verify call would fail because the result size is not the same as the inputs like in a copy or phi. The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR copies which need much more work to handle correctly (same for phis), but add them as a baseline. llvm-svn: 356713	2019-03-21 20:45:36 +00:00
Craig Topper	9f0b17a248	[ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and compressstore intrinsics. This adds support for scalarizing these intrinsics as well as the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle. I've omitted handling special cases for constant masks for this first pass, though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway. Fixes PR40994 and covers most of PR3666. Might want to implement constant masks to close that. Differential Revision: https://reviews.llvm.org/D59180 llvm-svn: 356687	2019-03-21 17:38:52 +00:00
Florian Hahn	71033f2987	[DAGCombiner] Use getTokenFactor in a few more cases. SDNodes can only have 64k operands and for some inputs (e.g. large number of stores), we can reach this limit when creating TokenFactor nodes. This patch is a follow up to D56740 and updates a few more places that potentially can create TokenFactors with too many operands. Reviewers: efriedma, craig.topper, aemerson, RKSimon Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D59156 llvm-svn: 356668	2019-03-21 14:32:09 +00:00
Simon Pilgrim	da4992bf8d	[DAGCombine] SimplifySelectCC - call FoldSetCC with the setcc result type We were calling FoldSetCC with the compare operand type instead of the result type. Found by OSS-Fuzz #13838 (https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13838) llvm-svn: 356667	2019-03-21 14:07:18 +00:00
Sanjay Patel	d47eac59ef	[CodeGenPrepare] limit formation of overflow intrinsics (PR41129) This is probably a bigger limitation than necessary, but since we don't have any evidence yet that this transform led to real-world perf improvements rather than regressions, I'm making a quick, blunt fix. In the motivating x86 example from: https://bugs.llvm.org/show_bug.cgi?id=41129 ...and shown in the regression test, we want to avoid an extra instruction in the dominating block because that could be costly. The x86 LSR test diff is reversing the changes from D57789. There's no evidence that 1 version is any better than the other yet. Differential Revision: https://reviews.llvm.org/D59602 llvm-svn: 356665	2019-03-21 13:57:07 +00:00
Simon Pilgrim	54ed653870	[SelectionDAG] Add scalarization of ABS node (PR41149) Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D59577 llvm-svn: 356656	2019-03-21 11:18:54 +00:00
Craig Topper	8de7bc0bff	[ScalarizeMaskedMemIntrinsics] Reverse some if conditions to reduce indentations to remove curly braces. Pre-commit for D59180 llvm-svn: 356646	2019-03-21 05:54:37 +00:00
Stanislav Mekhanoshin	0a11829ab2	Allow machine dce to remove uses in the same instruction Machine DCE cannot remove a dead definition if there are non-dbg uses. A use however can be in the same instruction: dead %0 = INST %0 Such instructions are sometimes created by the Detect dead lanes pass. Allow this instruction to be deleted despite the use if the only use belongs to the same instruction. Differential Revision: https://reviews.llvm.org/D59565 llvm-svn: 356619	2019-03-20 21:42:05 +00:00
Sanjay Patel	a2250e923b	[CGP] fix formatting; NFC llvm-svn: 356572	2019-03-20 16:47:53 +00:00
Sanjay Patel	d1ce455f7b	[CGP] convert chain of 'if' to 'switch'; NFC This should be extended, but CGP does some strange things, so I'm intentionally not changing the potential order of any transforms yet. llvm-svn: 356566	2019-03-20 15:53:06 +00:00
Simon Pilgrim	51f65171e9	Remove out of date comment. NFCI. DAGCombiner::convertBuildVecZextToZext just requires the extractions to be sequential, they don't have to start from 0'th index. llvm-svn: 356552	2019-03-20 12:24:15 +00:00
Clement Courbet	238af52ded	[ExpandMemCmp] Trigger on bcmp too. Summary: Fixes 41150. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits, ckennelly, sbenza, jyknight Tags: #llvm Differential Revision: https://reviews.llvm.org/D59593 llvm-svn: 356550	2019-03-20 11:51:11 +00:00
Florian Hahn	1663c9466f	[DwarfDebug] Skip entries too big for 16 bit size field in Dwarf < 5. Nothing prevents entries from being bigger than the 16 bit size field in Dwarf < 5. For entries that are too big, just emit an empty entry instead of crashing. This fixes PR41038. Reviewers: probinson, aprantl, davide Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D59518 llvm-svn: 356514	2019-03-19 20:37:06 +00:00
Matt Arsenault	cf55a657f0	CodeGen: Refactor regallocator command line and target selection This will allow targets more flexibility to replace the register allocator core passes. In a future commit, AMDGPU will run the core register assignment passes twice, and will also want to disallow using the standard -regalloc option. llvm-svn: 356506	2019-03-19 19:33:12 +00:00
Matt Arsenault	3c98cdd218	RegAllocFast: Do not allocate registers for undef uses Do not actually allocate a register for an undef use. Previously we would create an unnecessary reload instruction for undef uses where the register wasn't live. Patch by Matthias Braun llvm-svn: 356501	2019-03-19 19:16:04 +00:00
Matt Arsenault	c2e35a6f32	RegAllocFast: Remove early selection loop, the spill calculation will report cost 0 anyway for free regs The 2nd loop calculates spill costs but reports free registers as cost 0 anyway, so there is little benefit from having a separate early loop. Surprisingly this is not NFC, as many registers are marked regDisabled so the first loop often picks up later registers unnecessarily instead of the first one available in the allocation order... Patch by Matthias Braun llvm-svn: 356499	2019-03-19 19:01:34 +00:00
Simon Pilgrim	77482120da	Fix for ABS legalization on PPC buildbot. llvm-svn: 356498	2019-03-19 18:55:46 +00:00
Philip Reames	db65a5b776	Allow unordered loads to be considered invariant in CodeGen The actual code change is fairly straightforward, but exercising it isn't. First, it turned out we weren't adding the appropriate flags in SelectionDAG. Second, it turned out that we've got some optimization gaps, so obvious test cases don't work. My first attempt (in atomic-unordered.ll) points out a deficiency in our peephole-opt folding logic which I plan to fix separately. Instead, I'm exercising this through MachineLICM. Differential Revision: https://reviews.llvm.org/D59375 llvm-svn: 356494	2019-03-19 18:27:18 +00:00
Philip Reames	2153c4b828	[AtomicExpand] Fix a crash bug when lowering unordered loads to cmpxchg Add tests for wider atomic loads and stores. In the process, fix a crasher where we apparently handled unordered stores, but not loads, when lowering to cmpxchg idioms. llvm-svn: 356482	2019-03-19 17:20:49 +00:00
Justin Bogner	b353d6887e	[DAGCombine] Fix a miscompile when reducing BUILD_VECTORs to a shuffle In r311255 we added a case where we split vectors whose elements are all derived from the same input vector so that we could shuffle it more efficiently. In doing so, createBuildVecShuffle was taught to adjust for the fact that all indices would be based off of the first vector when this happens, but it's possible for the code that checked that to fire incorrectly if we happen to have a BUILD_VECTOR of extracts from subvectors and don't hit this new optimization. Instead of trying to detect if we've split the vector by checking if we have extracts from the same base vector, we can just pass that information into createBuildVecShuffle, avoiding the miscompile. Differential Revision: https://reviews.llvm.org/D59507 llvm-svn: 356476	2019-03-19 16:52:00 +00:00
Simon Pilgrim	a56f2822d0	[SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect These changes are related to PR37743 and include: SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node. Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner. Add promoting the integer ABS node in the LegalizeIntegerType. Expand-based legalization of integer result for the ABS nodes. Expand-based legalization of ABS vector operations. Add some integer abs testcases for different typesizes for Thumb arch. Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to: tmp = (SRA, Hi, 31) Lo = (UADDO tmp, Lo) Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1)) Lo = (XOR tmp, Lo) The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given an ABS node, detect the following pattern: (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))). Change integer abs testcases for codegen with the ABS node support for AArch64. Indicate that the ABS is legal for the i64 type when the NEON is supported. Change the integer abs testcases to show changing of codegen. Add combine and legalization of ABS nodes for Thumb arch. Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743 Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D49837 llvm-svn: 356468	2019-03-19 16:24:55 +00:00
Markus Lavin	b86ce219f4	[DebugInfo] Introduce DW_OP_LLVM_convert Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows for a convenient way to perform type conversions on the Dwarf expression stack. As an additional bonus it paves the way for using other Dwarf v5 ops that need to reference a base_type. The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp to perform sext/zext on debug values but mainly the patch is about preparing terrain for adding other Dwarf v5 ops that need to reference a base_type. For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a complex shift & mask pattern is generated to emulate sext/zext. This is a recommit of r356442 with trivial fixes for the failing tests. Differential Revision: https://reviews.llvm.org/D56587 llvm-svn: 356451	2019-03-19 13:16:28 +00:00
Markus Lavin	ad78768d59	Revert "[DebugInfo] Introduce DW_OP_LLVM_convert" This reverts commit 1cf4b593a7ebd666fc6775f3bd38196e8e65fafe. Build bots found failing tests not detected locally. Failing Tests (3): LLVM :: DebugInfo/Generic/convert-debugloc.ll LLVM :: DebugInfo/Generic/convert-inlined.ll LLVM :: DebugInfo/Generic/convert-linked.ll llvm-svn: 356444	2019-03-19 09:17:28 +00:00
Markus Lavin	cd8a940b37	[DebugInfo] Introduce DW_OP_LLVM_convert Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows for a convenient way to perform type conversions on the Dwarf expression stack. As an additional bonus it paves the way for using other Dwarf v5 ops that need to reference a base_type. The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp to perform sext/zext on debug values but mainly the patch is about preparing terrain for adding other Dwarf v5 ops that need to reference a base_type. For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a complex shift & mask pattern is generated to emulate sext/zext. Differential Revision: https://reviews.llvm.org/D56587 llvm-svn: 356442	2019-03-19 08:48:19 +00:00
Amara Emerson	a140276a1e	[GlobalISel] Include missing change from r356396 Forgot to add a change to relax some asserts in r356396. llvm-svn: 356411	2019-03-18 21:29:21 +00:00
Amara Emerson	8627178d46	Revert r356304: remove subreg parameter from MachineIRBuilder::buildCopy() After review comments, it was preferred to not teach MachineIRBuilder about non-generic instructions beyond using buildInstr(). For AArch64 I've changed the buildCopy() calls to buildInstr() + a separate addReg() call. This also relaxes the MachineIRBuilder's COPY checking more because it may not always have a SrcOp given to it. llvm-svn: 356396	2019-03-18 19:20:10 +00:00
Adhemerval Zanella	664c1ef528	[TargetLowering] Add code size information on isFPImmLegal. NFC This allows better code size for aarch64 floating point materialization in a future patch. Reviewers: evandro Differential Revision: https://reviews.llvm.org/D58690 llvm-svn: 356389	2019-03-18 18:40:07 +00:00
Nirav Dave	55c921f4bf	[DAG] Cleanup unused node in SimplifySelectCC. Delete temporarily constructed node uses for analysis after its use, holding onto original input nodes. Ideally this would be rewritten without making nodes, but this appears relatively complex. Reviewers: spatel, RKSimon, craig.topper Subscribers: jdoerfert, hiraditya, deadalnix, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57921 llvm-svn: 356382	2019-03-18 17:02:38 +00:00
David Stenberg	8a2e4af7e7	[DebugInfo] Ignore bitcasts when lowering stack arg dbg.values Summary: Look past bitcasts when looking for parameter debug values that are described by frame-index loads in `EmitFuncArgumentDbgValue()`. In the attached test case we would be left with an undef `DBG_VALUE` for the parameter without this patch. A similar fix was done for parameters passed in registers in D13005. This fixes PR40777. Reviewers: aprantl, vsk, jmorse Reviewed By: aprantl Subscribers: bjope, javed.absar, jdoerfert, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D58831 llvm-svn: 356363	2019-03-18 11:27:32 +00:00
Tim Renouf	c4e128e221	[CodeGen] Defined MVTs v3i32, v3f32, v5i32, v5f32 AMDGPU would like to use these MVTs. Differential Revision: https://reviews.llvm.org/D58901 Change-Id: I6125fea810d7cc62a4b4de3d9904255a1233ae4e llvm-svn: 356351	2019-03-17 22:56:38 +00:00
Tim Renouf	c302b9b5fe	[CodeGen] Prepare for introduction of v3 and v5 MVTs AMDGPU would like to have MVTs for v3i32, v3f32, v5i32, v5f32. This commit does not add them, but makes preparatory changes: * Exclude non-legal non-power-of-2 vector types from ComputeRegisterProp mechanism in TargetLoweringBase::getTypeConversion. * Cope with SETCC and VSELECT for odd-width i1 vector when the other vectors are legal type. Some of this patch is from Matt Arsenault, also of AMD. Differential Revision: https://reviews.llvm.org/D58899 Change-Id: Ib5f23377dbef511be3a936211a0b9f94e46331f8 llvm-svn: 356350	2019-03-17 21:43:12 +00:00
Matt Arsenault	884a18d792	RegAllocFast: Add hint to debug printing llvm-svn: 356348	2019-03-17 21:31:40 +00:00
Nikita Popov	9a4453592b	[DAGCombine] Fold (x & ~y) | y patterns Fold (x & ~y) | y and its four commuted variants to x | y. This pattern can in particular appear when a vselect c, x, -1 is expanded to (x & ~c) | (-1 & c) and combined to (x & ~c) | c. This change has some overlap with D59066, which avoids creating a vselect of this form in the first place during uaddsat expansion. Differential Revision: https://reviews.llvm.org/D59174 llvm-svn: 356333	2019-03-17 15:45:38 +00:00
Sanjay Patel	6a6e808b69	[TargetLowering] improve the default expansion of uaddsat/usubsat This is a subset of what was proposed in: D59006 ...and may overlap with test changes from: D59174 ...but it seems like a good general optimization to turn selects into bitwise-logic when possible because we never know exactly what can happen at this stage of DAG combining depending on how the target has defined things. Differential Revision: https://reviews.llvm.org/D59066 llvm-svn: 356332	2019-03-17 14:57:40 +00:00
Simon Pilgrim	3b0a6c69ee	[DAGCombine] combineShuffleOfScalars - handle non-zero SCALAR_TO_VECTOR indices (PR41097) rL356292 reduces the size of scalar_to_vector if we know the upper bits are undef - which means that shuffles may find they are suddenly referencing scalar_to_vector elements other than zero - so make sure we handle this as undef. llvm-svn: 356327	2019-03-16 17:36:26 +00:00
Heejin Ahn	66ce419468	[WebAssembly] Make rethrow take an except_ref type argument Summary: In the new wasm EH proposal, `rethrow` takes an `except_ref` argument. This change was missing in r352598. This patch adds `llvm.wasm.rethrow.in.catch` intrinsic. This is an intrinsic that's gonna eventually be lowered to wasm `rethrow` instruction, but this intrinsic can appear only within a catchpad or a cleanuppad scope. Also this intrinsic needs to be invokable - otherwise EH pad successor for it will not be correctly generated in clang. This also adds lowering logic for this intrinsic in `SelectionDAGBuilder::visitInvoke`. This routine is basically a specialized and simplified version of `SelectionDAGBuilder::visitTargetIntrinsic`, but we can't use it because it is only for `CallInst`s. This deletes the previous `llvm.wasm.rethrow` intrinsic and related tests, which was meant to be used within a `__cxa_rethrow` library function. Turned out this needs some more logic, so the intrinsic for this purpose will be added later. LateEHPrepare takes a result value of `catch` and inserts it into matching `rethrow` as an argument. `RETHROW_IN_CATCH` is a pseudo instruction that serves as a link between `llvm.wasm.rethrow.in.catch` and the real wasm `rethrow` instruction. To generate a `rethrow` instruction, we need an `except_ref` argument, which is generated from `catch` instruction. But `catch` instructions are added in LateEHPrepare pass, so we use `RETHROW_IN_CATCH`, which takes no argument, until we are able to correctly lower it to `rethrow` in LateEHPrepare. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59352 llvm-svn: 356316	2019-03-16 05:38:57 +00:00
Amara Emerson	7097e83dab	[GlobalISel] Make isel verification checks of vregs run under NDEBUG only. llvm-svn: 356309	2019-03-16 01:02:10 +00:00
Amara Emerson	3739a20875	[GlobalISel] Allow MachineIRBuilder to build subregister copies. This relaxes some asserts about sizes, and adds an optional subreg parameter to buildCopy(). Also update AArch64 instruction selector to use this in places where we previously used MachineInstrBuilder manually. Differential Revision: https://reviews.llvm.org/D59434 llvm-svn: 356304	2019-03-15 21:59:50 +00:00
Simon Pilgrim	8fbe439345	[SelectionDAG] Add SimplifyDemandedBits handling for ISD::SCALAR_TO_VECTOR Fixes a lot of constant folding mismatches between i686 and x86_64 llvm-svn: 356273	2019-03-15 17:00:55 +00:00
Mikael Holmen	339daae806	[CodeGenPrepare] avoid crashing from replacing a phi twice Summary: This is a fix to bug 41052: https://bugs.llvm.org/show_bug.cgi?id=41052 While trying to optimize a memory instruction in a dead basic block, we end up registering the same phi for replacement twice. This patch avoids registering more than the first replacement candidate for a phi. Patch by: JesperAntonsson Reviewers: skatkov, aprantl Reviewed By: aprantl Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59358 llvm-svn: 356260	2019-03-15 13:51:05 +00:00
Sanjay Patel	2c9275a790	[CGP] add another bailout for degenerate code (PR41064) This is almost the same as: rL355345 ...and should prevent any potential crashing from examples like: https://bugs.llvm.org/show_bug.cgi?id=41064 ...although the bug was masked by: rL355823 ...and I'm not sure how to repro the problem after that change. llvm-svn: 356218	2019-03-14 23:14:31 +00:00
Matt Arsenault	bc6d07ca46	MIR: Allow targets to serialize MachineFunctionInfo This has been a very painful missing feature that has made producing reduced testcases difficult. In particular the various registers determined for stack access during function lowering were necessary to avoid undefined register errors in a large percentage of cases. Implement a subset of the important fields that need to be preserved for AMDGPU. Most of the changes are to support targets parsing register fields and properly reporting errors. The biggest sort-of bug remaining is for fields that can be initialized from the IR section will be overwritten by a default initialized machineFunctionInfo section. Another remaining bug is the machineFunctionInfo section is still printed even if empty. llvm-svn: 356215	2019-03-14 22:54:43 +00:00
Philip Reames	70d156991c	Allow code motion (and thus folding) for atomic (but unordered) memory operands Building on the work done in D57601, now that we can distinguish between atomic and volatile memory accesses, go ahead and allow code motion of unordered atomics. As seen in the diffs, this allows much better folding of memory operations into using instructions. (Mostly done by the PeepholeOpt pass.) Note: I have not reviewed all callers of hasOrderedMemoryRef since one of them - isSafeToMove - is very widely used. I'm relying on the documented semantics of each method to judge correctness. Differential Revision: https://reviews.llvm.org/D59345 llvm-svn: 356170	2019-03-14 17:20:59 +00:00
Adrian Prantl	e69917f166	Add IR debug info support for Elemental, Pure, and Recursive Procedures. Patch by Eric Schweitz! Differential Revision: https://reviews.llvm.org/D54043 llvm-svn: 356163	2019-03-14 16:29:54 +00:00
Matt Arsenault	133716929c	GlobalISel: Use multiple returns for intrinsic structs This is consistent with what SelectionDAG does and is much easier to work with than the extract sequence with an artificial wide register. For the AMDGPU control flow intrinsics, this was producing an s128 for the i64, i1 tuple return. Any legalization that should apply to a real s128 value would badly obscure the direct values that need to be seen. llvm-svn: 356147	2019-03-14 14:18:56 +00:00
Quentin Colombet	e77e5f44b8	[GlobalISel][Utils] Add a getConstantVRegVal variant that looks through instrs getConstantVRegVal used to only look for G_CONSTANT when looking at unboxing the value of a vreg. However, constants are sometimes not directly used and are hidden behind trunc, s\|zext or copy chain of computation. In particular this may be introduced by the legalization process that doesn't want to simplify these patterns because it can lead to an infinite loop when legalizing a constant. To circumvent that problem, add a new variant of getConstantVRegVal, named getConstantVRegValWithLookThrough, that allows looking through extensions. Differential Revision: https://reviews.llvm.org/D59227 llvm-svn: 356116	2019-03-14 01:37:13 +00:00
Craig Topper	66df7361ff	[ResetMachineFunctionPass] Add visited functions statistics info Adding a "NumFunctionsVisited" for collecting the visited function number. It can be used to collect function pass rate in some tests, the pass rate = (NumberVisited - NumberReset)/NumberVisited. e.g. it can be used for calculating GlobalISel pass rate in Test-Suite. Patch by Tianyang Zhu (zhutianyang) Differential Revision: https://reviews.llvm.org/D59285 llvm-svn: 356114	2019-03-14 01:13:15 +00:00
Nirav Dave	ee5183c796	[DAGCombiner] Fix Comment. NFC. llvm-svn: 356069	2019-03-13 17:44:40 +00:00
Nirav Dave	d6351340bb	[DAGCombiner] If a TokenFactor would be merged into its user, consider the user later. Summary: A number of optimizations are inhibited by single-use TokenFactors not being merged into the TokenFactor using it. This makes us consider if we can do the merge immediately. Most tests changes here are due to the change in visitation causing minor reorderings and associated reassociation of paired memory operations. CodeGen tests with non-reordering changes: X86/aligned-variadic.ll -- memory-based add folded into stored leaq value. X86/constant-combiners.ll -- Optimizes out overlap between stores. X86/pr40631_deadstore_elision -- folds constant byte store into preceding quad word constant store. Reviewers: RKSimon, craig.topper, spatel, efriedma, courbet Reviewed By: courbet Subscribers: dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, eraman, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59260 llvm-svn: 356068	2019-03-13 17:07:09 +00:00
Clement Courbet	3bb5d0bb9b	Re-land r354244 "[DAGCombiner] Eliminate dead stores to stack." Always check candidates for hasOtherUses(), not only stores. llvm-svn: 356050	2019-03-13 13:56:23 +00:00
Simon Pilgrim	360ce82db2	[DAG] Move integer setcc %x, %x folding into FoldSetCC First step towards PR40800 - I intend to move the float case in a separate future patch. I had to tweak the (overly reduced) thumb2 test and the x86 widening test change is annoying (no longer rematerializable) but we should address this separately. Differential Revision: https://reviews.llvm.org/D59244 llvm-svn: 356040	2019-03-13 11:08:57 +00:00
Philip Reames	21a50ccf9c	[ImplicitNullChecks] Support unordered atomic accesses Update the INC pass to allow folding unordered atomics. This is the first optimization unblocked by the changes landed from D57601. llvm-svn: 356006	2019-03-13 03:25:20 +00:00
Matt Arsenault	bdfb6cfdf1	MIR: Stop reinitializing target information for every use Every time a physical register reference was parsed, this would initialize a string map for every register in the target, and discard it for the next. The same applies for the other fields initialized from target information. Follow along with how the function state is tracked, and add a new tracking class for target information. The string->register class/register bank for some reason were kept separately, so track them in the same place. llvm-svn: 355970	2019-03-12 20:42:12 +00:00
Philip Reames	18408d5e79	[CodeGen] Add MMOs to statepoint nodes during SelectionDAG The existing statepoint lowering code does something odd; it adds machine memory operands post instruction selection. This was copied from the stackmap/patchpoint implementation, but appears to be non-idiomatic. This change is largely NFC. It moves the MMO creation logic into SelectionDAG building. It ends up not quite being NFC because the size of the stack slot is reflected in the MMO. The old code blindly used pointer size for the MMO size, which appears to have always been incorrect for larger values. It just happened nothing actually relied on the MMOs, so it worked out okay. For context, I'm planning on removing the MOVolatile flag from these in a future commit, and then removing the MOStore flag from deopt spill slots in a separate one. Doing so is motivated by a small test case where we should be able to better schedule spill slots, but don't do so due to a memory use/def implied by the statepoint. Differential Revision: https://reviews.llvm.org/D59106 llvm-svn: 355953	2019-03-12 19:12:33 +00:00
Nikita Popov	149bc099f6	[SDAG] Expand pow2 mulo using shifts Expand MULO with constant power of two operand into a shift. The overflow is checked with (x << shift) >> shift == x, where the right shift will be logical for umulo and arithmetic for smulo (with exception for multiplications by signed_min). Differential Revision: https://reviews.llvm.org/D59041 llvm-svn: 355937	2019-03-12 16:57:25 +00:00
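The overflow check described in this commit, (x << shift) >> shift == x, can be sketched on 8-bit values in plain C++. This is an illustration under stated assumptions, not the SelectionDAG expansion itself: the struct and function names are hypothetical, the signed variant assumes two's-complement conversion and an arithmetic right shift for negative values (guaranteed from C++20, common practice before), and the signed check deliberately stops short of shift 7, which would correspond to the signed_min multiplier the commit treats as an exception.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result pair: the truncated product and the overflow flag.
struct MulOResult {
  uint8_t Value;
  bool Overflow;
};

// umulo by 2^Shift: the right shift is logical.
MulOResult umuloPow2(uint8_t X, unsigned Shift) {
  uint8_t Shifted = static_cast<uint8_t>(X << Shift);
  uint8_t Back = static_cast<uint8_t>(Shifted >> Shift);
  return {Shifted, Back != X};
}

// smulo by 2^Shift: the right shift is arithmetic (assumed, see lead-in).
MulOResult smuloPow2(int8_t X, unsigned Shift) {
  uint8_t Shifted = static_cast<uint8_t>(static_cast<uint8_t>(X) << Shift);
  int8_t Back = static_cast<int8_t>(static_cast<int8_t>(Shifted) >> Shift);
  return {Shifted, Back != X};
}

// Cross-check the unsigned expansion against a wide multiplication.
bool umuloMatchesWideMultiply() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Shift = 0; Shift < 8; ++Shift) {
      unsigned Wide = X << Shift;
      MulOResult R = umuloPow2(static_cast<uint8_t>(X), Shift);
      if (R.Overflow != (Wide > 255u) ||
          R.Value != static_cast<uint8_t>(Wide))
        return false;
    }
  return true;
}

// Cross-check the signed expansion for positive power-of-two multipliers.
bool smuloMatchesWideMultiply() {
  for (int X = -128; X < 128; ++X)
    for (unsigned Shift = 0; Shift < 7; ++Shift) {
      int Wide = X * (1 << Shift);
      MulOResult R = smuloPow2(static_cast<int8_t>(X), Shift);
      if (R.Overflow != (Wide < -128 || Wide > 127) ||
          R.Value != static_cast<uint8_t>(Wide))
        return false;
    }
  return true;
}
```

The round trip loses exactly the bits that the left shift pushed out of the type, so comparing against the original operand detects overflow without a wider multiply.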
Simon Pilgrim	9f0a5ca843	[DAGCombine] Pull out repeated demanded bitmask generation. NFCI. llvm-svn: 355932	2019-03-12 15:58:28 +00:00
Tim Northover	8935aca9c7	CodeGenPrep: preserve inbounds attribute when sinking GEPs. Targets can potentially emit more efficient code if they know address computations never overflow. For example ILP32 code on AArch64 (which only has 64-bit address computation) can ignore the possibility of overflow with this extra information. llvm-svn: 355926	2019-03-12 15:22:23 +00:00
Eugene Leviant	1e249caaec	[CGP] Fix UB when GEP is bound to trivial PHINode Differential revision: https://reviews.llvm.org/D59140 llvm-svn: 355904	2019-03-12 10:10:29 +00:00
Sanjoy Das	3f5ce18658	Reland "Relax constraints for reduction vectorization" Change from original commit: move test (that uses an X86 triple) into the X86 subdirectory. Original description: Gating vectorizing reductions on all fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355889	2019-03-12 01:31:44 +00:00
Nathan Lanza	cc51dc649a	Add Swift enumerator value for CodeView::SourceLanguage Summary: Swift now generates PDBs for debugging on Windows. llvm and lldb need a language enumerator value to properly handle the output emitted by swiftc. Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59231 llvm-svn: 355882	2019-03-11 23:27:59 +00:00
Sanjoy Das	2136a5bc49	Revert "Relax constraints for reduction vectorization" This reverts commit r355868. Breaks hexagon. llvm-svn: 355873	2019-03-11 22:37:31 +00:00
Evgeniy Stepanov	aedec3f684	Remove ASan asm instrumentation. Summary: It is incomplete and has no users AFAIK. Reviewers: pcc, vitalybuka Subscribers: srhines, kubamracek, mgorny, krytarowski, eraman, hiraditya, jdoerfert, #sanitizers, llvm-commits, thakis Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D59154 llvm-svn: 355870	2019-03-11 21:50:10 +00:00
Sanjoy Das	93f8cc186a	Relax constraints for reduction vectorization Summary: Gating vectorizing reductions on all fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355868	2019-03-11 21:36:41 +00:00
Jessica Paquette	42d16501e6	[GlobalISel][AArch64] Always fall back on aarch64.neon.addp.* Overloaded intrinsics aren't necessarily safe for instruction selection. One such intrinsic is aarch64.neon.addp.*. This is a temporary workaround to ensure that we always fall back on that intrinsic. Eventually this will be replaced with a proper solution. https://bugs.llvm.org/show_bug.cgi?id=40968 Differential Revision: https://reviews.llvm.org/D59062 llvm-svn: 355865	2019-03-11 20:51:17 +00:00
Nikita Popov	aa7cfa75f9	[SDAG][AArch64] Legalize VECREDUCE Fixes https://bugs.llvm.org/show_bug.cgi?id=36796. Implement basic legalizations (PromoteIntRes, PromoteIntOp, ExpandIntRes, ScalarizeVecOp, WidenVecOp) for VECREDUCE opcodes. There are more legalizations missing (esp float legalizations), but there's no way to test them right now, so I'm not adding them. This also includes a few more changes to make this work somewhat reasonably: * Add support for expanding VECREDUCE in SDAG. Usually experimental.vector.reduce is expanded prior to codegen, but if the target does have native vector reduce, it may of course still be necessary to expand due to legalization issues. This uses a shuffle reduction if possible, followed by a naive scalar reduction. * Allow the result type of integer VECREDUCE to be larger than the vector element type. For example we need to be able to reduce a v8i8 into an (nominally) i32 result type on AArch64. * Use the vector operand type rather than the scalar result type to determine the action, so we can control exactly which vector types are supported. Also change the legalize vector op code to handle operations that only have vector operands, but no vector results, as is the case for VECREDUCE. * Default VECREDUCE to Expand. On AArch64 (only target using VECREDUCE), explicitly specify for which vector types the reductions are supported. This does not handle anything related to VECREDUCE_STRICT_*. Differential Revision: https://reviews.llvm.org/D58015 llvm-svn: 355860	2019-03-11 20:22:13 +00:00
Jonas Paulsson	8b8dc50e79	[RegAlloc] Avoid compile time regression with multiple copy hints. As a fix for https://bugs.llvm.org/show_bug.cgi?id=40986 ("excessive compile time building opencollada"), this patch makes sure that no phys reg is hinted more than once from getRegAllocationHints(). This handles the case where many virtual registers are assigned to the same physreg. The previous compile time fix (r343686) in weightCalcHelper() only made sure that physical/virtual registers are passed no more than once to addRegAllocationHint(). Review: Dimitry Andric, Quentin Colombet https://reviews.llvm.org/D59201 llvm-svn: 355854	2019-03-11 19:00:37 +00:00
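The idea of hinting each physical register at most once can be illustrated with an order-preserving de-duplication. This is a generic sketch, not the code in getRegAllocationHints(); `PhysReg` and `uniqueHints` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a physical register number.
using PhysReg = unsigned;

// Keep the first occurrence of each candidate and drop later repeats,
// so the preference order of the hints is preserved.
std::vector<PhysReg> uniqueHints(const std::vector<PhysReg> &Candidates) {
  std::unordered_set<PhysReg> Seen;
  std::vector<PhysReg> Hints;
  for (PhysReg R : Candidates)
    if (Seen.insert(R).second) // true only the first time R is seen
      Hints.push_back(R);
  return Hints;
}
```

With many virtual registers assigned to the same physical register, the candidate list can repeat one entry many times; filtering keeps the hint list bounded by the number of distinct registers.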
Simon Pilgrim	f3be93a2ff	[DAG] FoldSetCC - reuse valuetype + ensure its simple. llvm-svn: 355847	2019-03-11 17:56:18 +00:00
Brian Gesiak	4349dc76fa	[Utils] Extract EliminateUnreachableBlocks (NFC) Summary: Extract the functionality of eliminating unreachable basic blocks within a function, previously encapsulated within the -unreachableblockelim pass, and make it available as a function within BlockUtils.h. No functional change intended other than making the logic reusable. Exposing this logic makes it easier to implement https://reviews.llvm.org/D59068, which fixes coroutines bug https://bugs.llvm.org/show_bug.cgi?id=40979. Reviewers: mkazantsev, wmi, davidxl, silvas, davide Reviewed By: davide Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59069 llvm-svn: 355846	2019-03-11 17:51:57 +00:00
Simon Pilgrim	1bb5b56485	[DAG] Move SetCC NaN handling into FoldSetCC llvm-svn: 355845	2019-03-11 17:43:10 +00:00
Simon Pilgrim	53518b45a5	[DAG] TargetLowering::SimplifySetCC - call FoldSetCC early to handle constant/commute folds. Noticed while looking at PR40800 (and also D57921) llvm-svn: 355828	2019-03-11 15:01:31 +00:00
Sam Parker	52760bf435	[CGP] Limit distance between overflow math and cmp Inserting an overflowing arithmetic intrinsic can increase register pressure by producing two values at a point where only one is needed, while the second use may be several blocks away. This increase in pressure is likely to be more detrimental on performance than rematerialising one of the original instructions. So, check that the arithmetic and compare instructions are no further apart than their immediate successor/predecessor. Differential Revision: https://reviews.llvm.org/D59024 llvm-svn: 355823	2019-03-11 13:19:46 +00:00
Benjamin Kramer	6ff32e143a	[MIPS GlobalISel] Silence uninitialized variable warning The control flow here cannot ever use the uninitialized value, but it's too hard for the compiler to figure that out. Clang warns: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: error: variable 'CarrySum' is used uninitialized whenever 'for' loop exits because its condition is false [-Werror,-Wsometimes-uninitialized] for (unsigned i = 2; i < Factors.size(); ++i) ^~~~~~~~~~~~~~~~~~ llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2604:26: note: uninitialized use occurs here CarrySumPrevDstIdx = CarrySum; ^~~~~~~~ llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: note: remove the condition if it is always true for (unsigned i = 2; i < Factors.size(); ++i) ^~~~~~~~~~~~~~~~~~ llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2583:22: note: initialize the variable 'CarrySum' to silence this warning unsigned CarrySum; ^ = 0 llvm-svn: 355818	2019-03-11 10:39:15 +00:00
Petar Avramovic	5229f47f9f	[MIPS GlobalISel] NarrowScalar G_UMULH NarrowScalar G_UMULH in LegalizerHelper using multiplyRegisters helper function. NarrowScalar G_UMULH for MIPS32. Differential Revision: https://reviews.llvm.org/D58825 llvm-svn: 355815	2019-03-11 10:08:44 +00:00
Petar Avramovic	0b17e59b5c	[MIPS GlobalISel] NarrowScalar G_MUL Narrow Scalar G_MUL for MIPS32. Revisit NarrowScalar implementation in LegalizerHelper. Introduce new helper function multiplyRegisters. It performs generic multiplication of values held in multiple registers. Generated instructions use only types NarrowTy and i1. Destination can be same or two times size of the source. Differential Revision: https://reviews.llvm.org/D58824 llvm-svn: 355814	2019-03-11 10:00:17 +00:00
Amaury Sechet	a135fd5562	Remove redundant extractBooleanFlip argument. NFC llvm-svn: 355794	2019-03-11 00:37:01 +00:00
Sanjay Patel	7d8260feb6	[CGP] fix comments; NFC llvm-svn: 355791	2019-03-10 18:42:30 +00:00
Craig Topper	1a872f2b15	Recommit r355224 "[TableGen][SelectionDAG][X86] Add specific isel matchers for immAllZerosV/immAllOnesV. Remove bitcasts from X86 patterns that are no longer necessary." Includes a fix to emit a CheckOpcode for build_vector when immAllZerosV/immAllOnesV is used as a pattern root. This means it can't be used to look through bitcasts when used as a root, but that's probably ok. This extra CheckOpcode will ensure that the first match in the isel table will be a SwitchOpcode which is needed by the caching optimization in the ISel Matcher. Original commit message: Previously we had build_vector PatFrags that called ISD::isBuildVectorAllZeros/Ones. Internally the ISD::isBuildVectorAllZeros/Ones look through bitcasts, but we aren't able to take advantage of that in isel. Instead of we have to canonicalize the types of the all zeros/ones build_vectors and insert bitcasts. Then we have to pattern match those exact bitcasts. By emitting specific matchers for these 2 nodes, we can make isel look through any bitcasts without needing to explicitly match them. We should also be able to remove the canonicalization to vXi32 from lowering, but I've left that for a follow up. This removes something like 40,000 bytes from the X86 isel table. Differential Revision: https://reviews.llvm.org/D58595 llvm-svn: 355784	2019-03-10 05:21:52 +00:00
Amaury Sechet	b62642a115	Refactor isBooleanFlip into extractBooleanFlip so that users do not depend on the pattern matched. NFC llvm-svn: 355769	2019-03-09 02:51:52 +00:00
Craig Topper	69f8c1653d	[ScalarizeMaskedMemIntrin] Use IRBuilder functions that take uint32_t/uint64_t for getelementptr, extractelement, and insertelement. This saves needing to call getInt32 ourselves. Making the code a little shorter. The test changes are because insert/extract use getInt64 internally. Shouldn't be a functional issue. This cleanup because I plan to write similar code for expandload/compressstore. llvm-svn: 355767	2019-03-09 02:08:41 +00:00
Wei Mi	98214347c4	Rename a local variable counter to Counter. llvm-svn: 355759	2019-03-08 23:32:07 +00:00
Wei Mi	fb9693d1c9	[RegisterCoalescer][NFC] bind a DenseMap access to a reference to avoid repeated lookup operations llvm-svn: 355757	2019-03-08 23:29:46 +00:00
Craig Topper	d84f605910	[ScalarizeMaskedMemIntrin] Only set the ModifiedDT flag if new basic blocks were added. There are special cases in the scalarization for constant masks. If we hit one of the special cases we don't need to reset the iteration. Noticed while starting work on adding expandload/compressstore to this pass. llvm-svn: 355754	2019-03-08 23:03:43 +00:00
Rong Xu	ce3be45cac	[CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst r44412 fixed a huge compile time regression but it needed ModifiedDT flag to be maintained correctly in optimizations in optimizeBlock() and optimizeInst(). Function optimizeSelectInst() does not update the flag. This patch propagates the flag in optimizeSelectInst() back to optimizeBlock(). This patch also removes ModifiedDT in CodeGenPrepare class (which is not used). The property of ModifiedDT is now recorded in a ref parameter. Differential Revision: https://reviews.llvm.org/D59139 llvm-svn: 355751	2019-03-08 22:46:18 +00:00
Matt Arsenault	26e76ef0e2	DAG: Don't try to cluster loads with tied inputs This avoids breaking possible value dependencies when sorting loads by offset. AMDGPU has some load instructions that write into the high or low bits of the destination register, and have a tied input for the other input bits. These can easily have the same base pointer, but be a swizzle so the high address load needs to come first. This was inserting glue forcing the opposite ordering, producing a cycle the InstrEmitter would assert on. It may be potentially expensive to look for the dependency between the other loads, so just skip any where this could happen. Fixes bug 40936 by reverting r351379, which added a hacky attempt to fix this by adding chains in this case, which I think was just working around broken glue before the InstrEmitter. The core of the patch is re-implementing the fix for that problem. llvm-svn: 355728	2019-03-08 20:46:15 +00:00
Amaury Sechet	782ac933b5	[DAGCombiner] fold (add (add (xor a, -1), b), 1) -> (sub b, a) Summary: This pattern is sometimes created after legalization. Reviewers: efriedma, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58874 llvm-svn: 355716	2019-03-08 19:39:32 +00:00
Wei Mi	72ec6801b5	[RegisterCoalescer] Limit the number of joins for large live interval with many valnos. Recently we found compile time out problem in several cases when SpeculativeLoadHardening was enabled. The significant compile time was spent in register coalescing pass, where register coalescer tried to join many other live intervals with some very large live intervals with many valnos. Specifically, every time JoinVals::mapValues is called, computeAssignment will be called by getNumValNums() times of the target live interval. If the large live interval has N valnos and has N copies associated with it, trying to coalesce those copies will at least cost N^2 complexity. The patch adds some limit to the effort trying to join those very large live intervals with others. By default, for live interval with > 100 valnos, and when it has been coalesced with other live interval by more than 100 times, we will stop coalescing for the live interval anymore. That put a compile time cap for the N^2 algorithm and effectively solves the compile time problem we saw. Differential revision: https://reviews.llvm.org/D59143 llvm-svn: 355714	2019-03-08 19:25:32 +00:00
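This fold follows from the two's-complement identity ~a + 1 == -a. The following is a small sketch in plain C++ on wrapping 32-bit unsigned arithmetic; the function names are hypothetical and this is not the DAGCombiner implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the pattern before the fold,
// (add (add (xor a, -1), b), 1).
constexpr uint32_t addNotPlusOne(uint32_t A, uint32_t B) {
  return ((A ^ 0xFFFFFFFFu) + B) + 1u;
}

// Hypothetical helper: the pattern after the fold, (sub b, a).
constexpr uint32_t subFolded(uint32_t A, uint32_t B) {
  return B - A;
}

// Spot checks at compile time, including wrap-around cases.
static_assert(addNotPlusOne(0u, 0u) == subFolded(0u, 0u), "zero case");
static_assert(addNotPlusOne(5u, 3u) == subFolded(5u, 3u), "b < a wraps");
static_assert(addNotPlusOne(0xFFFFFFFFu, 1u) == subFolded(0xFFFFFFFFu, 1u),
              "a is all ones");
static_assert(addNotPlusOne(0x80000000u, 0x7FFFFFFFu) ==
                  subFolded(0x80000000u, 0x7FFFFFFFu),
              "sign-bit boundary");
```

Since xor with -1 is bitwise not, the left side is (~a + b) + 1 = b + (~a + 1) = b - a modulo 2^32, which is why the fold is valid for every input.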
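The capping policy described above can be sketched as a small gate that is consulted before each join. This is an assumption-laden illustration, not the RegisterCoalescer code: `JoinBudget`, its fields, and `allowJoin` are hypothetical names, and the exact boundary behaviour (whether the 100th or 101st join is the last one) is a guess at what "more than 100 times" means.

```cpp
#include <cassert>

// Hypothetical gate modelling the two default thresholds from the commit.
struct JoinBudget {
  unsigned ValNoThreshold = 100; // intervals above this size are "large"
  unsigned JoinThreshold = 100;  // joins allowed before a large one is capped

  // Returns true if another join may be attempted. For large intervals the
  // attempt is counted in JoinsSoFar; small intervals are never limited.
  bool allowJoin(unsigned NumValNos, unsigned &JoinsSoFar) const {
    if (NumValNos <= ValNoThreshold)
      return true;
    if (JoinsSoFar > JoinThreshold)
      return false;
    ++JoinsSoFar;
    return true;
  }
};

// Count how many of Attempts joins the gate lets through for one interval.
unsigned countAllowedJoins(unsigned NumValNos, unsigned Attempts) {
  JoinBudget Budget;
  unsigned JoinsSoFar = 0, Allowed = 0;
  for (unsigned I = 0; I < Attempts; ++I)
    if (Budget.allowJoin(NumValNos, JoinsSoFar))
      ++Allowed;
  return Allowed;
}
```

Bounding the number of joins per large interval turns the worst case from roughly N joins of cost N each into a constant number of such joins, which is the compile time cap the commit describes.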
Simon Pilgrim	04e8439f72	[DAGCombine] Merge visitSMULO+visitUMULO into visitMULO. NFCI. llvm-svn: 355690	2019-03-08 11:41:18 +00:00
Simon Pilgrim	c71d6d157f	[DAGCombine] Merge visitSADDO+visitUADDO into visitADDO. NFCI. llvm-svn: 355689	2019-03-08 11:30:33 +00:00
Simon Pilgrim	2c2e76a9e2	[DAGCombine] Merge visitSSUBO+visitUSUBO into visitSUBO. NFCI. llvm-svn: 355688	2019-03-08 11:16:55 +00:00
Brian Gesiak	4e467043fb	[CodeGen] Reuse BlockUtils for -unreachableblockelim pass (NFC) Summary: The logic in the -unreachableblockelim pass does the following: 1. It traverses the function it's given in depth-first order and creates a set of basic blocks that are unreachable from the function's entry node. 2. It iterates over each of those unreachable blocks and (1) removes any successors' references to the dead block, and (2) replaces any uses of instructions from the dead block with null. The logic in (2) above is identical to what the `llvm::DeleteDeadBlocks` function from `BasicBlockUtils.h` does. The only difference is that `llvm::DeleteDeadBlocks` replaces uses of instructions from dead blocks not with null, but with undef. Replace the duplicate logic in the -unreachableblockelim pass with a call to `llvm::DeleteDeadBlocks`. This results in less code but no functional change (NFC). Reviewers: mkazantsev, wmi, davidxl, silvas, davide Reviewed By: davide Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59064 llvm-svn: 355634	2019-03-07 20:40:55 +00:00
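Step 1 of the logic described in this commit, collecting the blocks that cannot be reached from the entry node, can be sketched on a toy control-flow graph. This is a standalone illustration and does not use LLVM's `BasicBlock` or `llvm::DeleteDeadBlocks`; `ToyCFG` and `findUnreachableBlocks` are hypothetical names, and block 0 is assumed to be the entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: Succs[I] lists the successor indices of block I.
using ToyCFG = std::vector<std::vector<std::size_t>>;

// Depth-first walk from the entry block (index 0), then report every
// block that was never reached, in ascending index order.
std::vector<std::size_t> findUnreachableBlocks(const ToyCFG &Succs) {
  std::vector<bool> Reached(Succs.size(), false);
  std::vector<std::size_t> Worklist;
  if (!Succs.empty()) {
    Reached[0] = true;
    Worklist.push_back(0);
  }
  while (!Worklist.empty()) {
    std::size_t BB = Worklist.back();
    Worklist.pop_back();
    for (std::size_t S : Succs[BB])
      if (!Reached[S]) {
        Reached[S] = true;
        Worklist.push_back(S);
      }
  }
  std::vector<std::size_t> Dead;
  for (std::size_t I = 0; I < Succs.size(); ++I)
    if (!Reached[I])
      Dead.push_back(I);
  return Dead;
}
```

Note that a dead block may still branch into live code (block 3 jumping to block 1 in the test below), which is why step 2 in the commit has to remove successors' references to the dead block before deleting it.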
Paul Robinson	05efe0fdc4	[PS4] Emit a trap after a stack-protector fail call. llvm-svn: 355542	2019-03-06 19:57:43 +00:00
Philip Reames	9549f7560f	[AtomicExpand] Allow libcall expansion for non-zero address spaces (try 2) Restore a reverted commit, with the silly mistake fixed. Sorry for the previous breakage. Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead. Differential Revision: https://reviews.llvm.org/D58760 llvm-svn: 355540	2019-03-06 19:27:13 +00:00
Simon Pilgrim	9d6347cfc1	[DAGCombine] Improve select (not Cond), N1, N2 -> select Cond, N2, N1 fold Move the x86 combine from D58974 into the DAGCombine VSELECT code and update the SELECT version to use the isBooleanFlip helper as well. Requested by @spatel on D59006 llvm-svn: 355533	2019-03-06 18:52:52 +00:00
Simon Pilgrim	cdf95f8f07	[DAGCombiner] Enable UADDO/USUBO vector combine support Differential Revision: https://reviews.llvm.org/D58965 llvm-svn: 355517	2019-03-06 16:11:03 +00:00
Sanjay Patel	1fefc30b08	[TargetLowering] simplify code for uaddsat/usubsat expansion; NFC We had 2 local variable names for the same type. llvm-svn: 355516	2019-03-06 16:06:27 +00:00
Alexander Kornienko	3d467a890e	Revert "[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default" This reverts commit `2a0f2c5ef3` (r355490). The commit causes an assertion failure when compiling LLVM code: $ cat repro.cpp class QQQ { public: bool x() const; bool y() const; unsigned getSizeInBits() const { if (y() \|\| x()) return getScalarSizeInBits(); return getScalarSizeInBits() * 2; } unsigned getScalarSizeInBits() const; }; int f(const QQQ &Ty) { switch (Ty.getSizeInBits()) { case 1: case 8: return 0; case 16: return 1; case 32: return 2; case 64: return 3; default: __builtin_unreachable(); } } $ clang -O2 -o repro.o repro.cpp assert.h assertion failed at llvm/include/llvm/ADT/ilist_iterator.h:139 in llvm::ilist_iterator::reference llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, true, false>::operator() const [OptionsT = llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, IsReverse = true, IsConst = false]: !NodePtr->isKnownSentinel() Check failure stack trace: * @ 0x558aab4afc10 __assert_fail @ 0x558aa885479b llvm::ilist_iterator<>::operator() @ 0x558aa8854715 llvm::MachineInstrBundleIterator<>::operator() @ 0x558aa92c33c3 llvm::X86InstrInfo::optimizeCompareInstr() @ 0x558aa9a9c251 (anonymous namespace)::PeepholeOptimizer::optimizeCmpInstr() @ 0x558aa9a9b371 (anonymous namespace)::PeepholeOptimizer::runOnMachineFunction() @ 0x558aa99a4fc8 llvm::MachineFunctionPass::runOnFunction() @ 0x558aab019fc4 llvm::FPPassManager::runOnFunction() @ 0x558aab01a3a5 llvm::FPPassManager::runOnModule() @ 0x558aab01aa9b (anonymous namespace)::MPPassManager::runOnModule() @ 0x558aab01a635 llvm::legacy::PassManagerImpl::run() @ 0x558aab01afe1 llvm::legacy::PassManager::run() @ 0x558aa5914769 (anonymous namespace)::EmitAssemblyHelper::EmitAssembly() @ 0x558aa5910f44 clang::EmitBackendOutput() @ 0x558aa5906135 clang::BackendConsumer::HandleTranslationUnit() @ 0x558aa6d165ad 
clang::ParseAST() @ 0x558aa6a94e22 clang::ASTFrontendAction::ExecuteAction() @ 0x558aa590255d clang::CodeGenAction::ExecuteAction() @ 0x558aa6a94840 clang::FrontendAction::Execute() @ 0x558aa6a38cca clang::CompilerInstance::ExecuteAction() @ 0x558aa4e2294b clang::ExecuteCompilerInvocation() @ 0x558aa4df6200 cc1_main() @ 0x558aa4e1b37f ExecuteCC1Tool() @ 0x558aa4e1a725 main @ 0x7ff20d56abbd __libc_start_main @ 0x558aa4df51c9 _start llvm-svn: 355515	2019-03-06 15:23:50 +00:00
Teresa Johnson	b1daf0aef6	[CGP] Avoid repeatedly building DominatorTree causing long compile-time (NFC) Summary: In r354298 a DominatorTree construction was added via new function combineToUSubWithOverflow, which was subsequently restructured into replaceMathCmpWithIntrinsic in r354689. We are hitting a very long compile time due to this repeated construction, once per math cmp in the function. We shouldn't need to build the DominatorTree more than once per function, except when a transformation invalidates it. There is already a boolean flag that is returned from these methods indicating whether the DT has been modified. We can simply build the DT once per Function walk in CodeGenPrepare::runOnFunction, since any time a change is made we break out of the Function walk and restart it. I modified the code so that both replaceMathCmpWithIntrinsic as well as mergeSExts (which was also building a DT) use the DT constructed by the run method. From -mllvm -time-passes: Before this patch: CodeGen Prepare user time is 328s With this patch: CodeGen Prepare user time is 21s Reviewers: spatel Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58995 llvm-svn: 355512	2019-03-06 14:57:40 +00:00
Francis Visoiu Mistrih	6b622ebea0	Revert "[Remarks] Refactor remark diagnostic emission in a RemarkStreamer" This reverts commit 2e8c4997a2089f8228c843fd81b148d903472e02. Breaks bots. llvm-svn: 355511	2019-03-06 14:52:37 +00:00
Sanjay Patel	89e534746f	[TargetLowering] simplify code for uaddsat/usubsat expansion; NFC llvm-svn: 355508	2019-03-06 14:34:59 +00:00
Francis Visoiu Mistrih	9052f50cb4	[Remarks] Refactor remark diagnostic emission in a RemarkStreamer This allows us to store more info about where we're emitting the remarks without cluttering LLVMContext. This is needed for future support for the remark section. Differential Revision: https://reviews.llvm.org/D58996 llvm-svn: 355507	2019-03-06 14:32:08 +00:00
Simon Pilgrim	1bdc2d1874	[DAGCombiner] Add SADDO/SSUBO combine support Basic constant handling folds, for both scalars and vectors Differential Revision: https://reviews.llvm.org/D58967 llvm-svn: 355506	2019-03-06 14:22:21 +00:00
Simon Pilgrim	642f53d292	[DAGCombiner] Enable SMULO/UMULO vector combine support (PR40442) Differential Revision: https://reviews.llvm.org/D58968 llvm-svn: 355495	2019-03-06 11:04:21 +00:00
Ayonam Ray	2a0f2c5ef3	[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table, for the switch value being outside the jump table range and a conditional branch is inserted to jump to the default block. In case the default block is unreachable, this conditional jump can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Differential Revision: https://reviews.llvm.org/D52002 Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev llvm-svn: 355490	2019-03-06 10:01:02 +00:00
Ayonam Ray	af92b7a3b8	Reversing the commit of revision 355483 since it is giving a regression on a newly added test. llvm-svn: 355487	2019-03-06 07:51:28 +00:00
Ayonam Ray	6025fa8e30	[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table, for the switch value being outside the jump table range and a conditional branch is inserted to jump to the default block. In case the default block is unreachable, this conditional jump can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Differential Revision: https://reviews.llvm.org/D52002 Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev llvm-svn: 355483	2019-03-06 07:27:45 +00:00
Mitch Phillips	f0c21e2ff5	Revert "[AtomicExpand] Allow libcall expansion for non-zero address spaces" for buildbot failures. llvm-svn: 355461	2019-03-06 00:25:40 +00:00
Philip Reames	1e4c5d3611	[AtomicExpand] Allow libcall expansion for non-zero address spaces Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead. Differential Revision: https://reviews.llvm.org/D58760 llvm-svn: 355453	2019-03-05 23:00:14 +00:00
Craig Topper	57fd733140	Revert r355224 "[TableGen][SelectionDAG][X86] Add specific isel matchers for immAllZerosV/immAllOnesV. Remove bitcasts from X86 patterns that are no longer necessary." This caused the first matcher in the isel table for many targets to be Opc_Scope instead of Opc_SwitchOpcode. This leads to a significant increase in isel match failures. llvm-svn: 355433	2019-03-05 19:18:16 +00:00
Craig Topper	2982b846e9	[Subtarget] Merge ProcSched and ProcDesc arrays in MCSubtargetInfo into a single array. These arrays are both keyed by CPU name and go into the same tablegenerated file. Merge them so we only need to store keys once. This also removes a weird space saving quirk where we used ProcDesc.size() to build an ArrayRef for ProcSched. Differential Revision: https://reviews.llvm.org/D58939 llvm-svn: 355431	2019-03-05 18:54:38 +00:00
Craig Topper	ca26808da9	[Subtarget] Create a separate SubtargetSubtargetKV struct for ProcDesc to remove fields from the stack tables that aren't needed for CPUs The description for CPUs was just the CPU name wrapped with "Select the " and " processor". We can just do that directly in the help printer instead of making a separate version in the binary for each CPU. Also remove the Value field that isn't needed and was always 0. Differential Revision: https://reviews.llvm.org/D58938 llvm-svn: 355429	2019-03-05 18:54:34 +00:00
Sanjay Patel	8b72080d4d	[SDAG] move FP constant folding to helper function; NFC llvm-svn: 355411	2019-03-05 16:42:33 +00:00
Sanjay Patel	3b2d0bc7c2	[CodeGenPrepare] avoid crashing on non-canonical/degenerate code The test is reduced from an example in the post-commit thread for: rL354746 http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190304/632396.html While we must avoid dying here, the real question should be: Why is non-canonical and/or degenerate code making it to CGP when using the new pass manager? llvm-svn: 355345	2019-03-04 22:47:13 +00:00
Craig Topper	509a8a3cf1	[DAGCombiner][X86][SystemZ][AArch64] Combine some cases of (bitcast (build_vector constants)) between legalize types and legalize dag. This patch enables combining integer bitcasts of integer build vectors when the new scalar type is legal. I've avoided floating point because the implementation bitcasts float to int along the way and we would need to check the intermediate types for legality Differential Revision: https://reviews.llvm.org/D58884 llvm-svn: 355324	2019-03-04 19:12:16 +00:00
Eugene Leviant	daea28ab64	[DebugInfo] Construct nested types on behalf of owner CU Differential revision: https://reviews.llvm.org/D58786 llvm-svn: 355303	2019-03-04 07:15:36 +00:00
Heejin Ahn	195a62e9ae	[WebAssembly] Delete ThrowUnwindDest map from WasmEHFuncInfo Summary: Before when we implemented the first EH proposal, 'catch <tag>' instruction may not catch an exception so there were multiple EH pads an exception can unwind to. That means a BB could have multiple EH pad successors. Now after we switched to the new proposal, every 'catch' instruction catches an exception, and there is only one catchpad per catchswitch, so we at most have one EH pad successor, making `ThrowUnwindDest` map in `WasmEHInfo` unnecessary. Keeping `ThrowUnwindDest` map in `WasmEHInfo` has its own problems, because other optimization passes can split a BB that contains possibly throwing calls (previously invokes), and we have to update the map every time that happens, which is not easy for common CodeGen passes. This also correctly updates successor info in LateEHPrepare when we add a rethrow instruction. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58486 llvm-svn: 355296	2019-03-03 22:35:56 +00:00
Simon Pilgrim	37a63a748e	Use SDValue::getConstantOperandAPInt helper where possible. NFCI. llvm-svn: 355267	2019-03-02 11:11:22 +00:00
Craig Topper	4cfc39179e	[TableGen][SelectionDAG][X86] Add specific isel matchers for immAllZerosV/immAllOnesV. Remove bitcasts from X86 patterns that are no longer necessary. Previously we had build_vector PatFrags that called ISD::isBuildVectorAllZeros/Ones. Internally the ISD::isBuildVectorAllZeros/Ones look through bitcasts, but we aren't able to take advantage of that in isel. Instead we have to canonicalize the types of the all zeros/ones build_vectors and insert bitcasts. Then we have to pattern match those exact bitcasts. By emitting specific matchers for these 2 nodes, we can make isel look through any bitcasts without needing to explicitly match them. We should also be able to remove the canonicalization to vXi32 from lowering, but I've left that for a follow up. This removes something like 40,000 bytes from the X86 isel table. Differential Revision: https://reviews.llvm.org/D58595 llvm-svn: 355224	2019-03-01 20:18:38 +00:00
Thomas Lively	f3b4f99007	[WebAssembly] Remove uses of ThreadModel Summary: In the clang UI, replaces -mthread-model posix with -matomics as the source of truth on threading. In the backend, replaces -thread-model=posix with the atomics target feature, which is now collected on the WebAssemblyTargetMachine along with all other used features. These collected features will also be used to emit the target features section in the future. The default configuration for the backend is thread-model=posix and no atomics, which was previously an invalid configuration. This change makes the default valid because the thread model is ignored. A side effect of this change is that objects are never emitted with passive segments. It will instead be up to the linker to decide whether sections should be active or passive based on whether atomics are used in the final link. Reviewers: aheejin, sbc100, dschuff Subscribers: mehdi_amini, jgravelle-google, hiraditya, sunfish, steven_wu, dexonsmith, rupprecht, jfb, jdoerfert, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D58742 llvm-svn: 355112	2019-02-28 18:39:08 +00:00
Bjorn Pettersson	d30f308a9f	Add support for computing "zext of value" in KnownBits. NFCI Summary: The descriptions of KnownBits::zext() and KnownBits::zextOrTrunc() have confusingly stated that the operation is equivalent to zero extending the value we're tracking. That has not been true; instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control whether the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099	2019-02-28 15:45:29 +00:00
Matt Arsenault	d3093c2f1f	GlobalISel: Implement fewerElementsVector for phi llvm-svn: 355048	2019-02-28 00:16:32 +00:00
Matt Arsenault	72bcf15dbf	GlobalISel: Implement moreElementsVector for phi llvm-svn: 355047	2019-02-28 00:01:05 +00:00
Philip Reames	288a95fc8c	Separate volatility and atomicity/ordering in SelectionDAG At the moment, we mark every atomic memory access as being also volatile. This is unnecessarily conservative and prohibits many legal transforms (DCE, folding, etc.). This patch removes MOVolatile from the MachineMemOperands of atomic, but not volatile, instructions. This should be strictly NFC after a series of previous patches which have gone in to ensure backend code is conservative about handling of isAtomic MMOs. Once it's in and baked for a bit, we'll start working through removing unnecessary bailouts one by one. We applied this same strategy to the middle end a few years ago, with good success. To make sure this patch itself is NFC, it is built on top of a series of other patches which adjust code to (for the moment) be as conservative for an atomic access as for a volatile access and build up a test corpus (mostly in test/CodeGen/X86/atomics-unordered.ll). Previously landed D57593 Fix a bug in the definition of isUnordered on MachineMemOperand D57596 [CodeGen] Be conservative about atomic accesses as for volatile D57802 Be conservative about unordered accesses for the moment rL353959: [Tests] First batch of cornercase tests for unordered atomics. rL353966: [Tests] RMW folding tests w/unordered atomic operations. rL353972: [Tests] More unordered atomic lowering tests. rL353989: [SelectionDAG] Inline a single use helper function, and remove last non-MMO interface rL354740: [Hexagon, SystemZ] Be super conservative about atomics rL354800: [Lanai] Be super conservative about atomics rL354845: [ARM] Be super conservative about atomics Attention Out of Tree Backend Owners: This patch may break you. If it does, you can use the TLI getMMOFlags hook to restore the MOVolatile to any instruction you need to. (See llvm-dev thread titled "PSA: Changes to how atomics are handled in backends" started Feb 27, 2019.) Differential Revision: https://reviews.llvm.org/D57601 llvm-svn: 355025	2019-02-27 20:20:08 +00:00
Eugene Leviant	7f78d4712f	[DebugInfo] Apply subprogram attributes on behalf of owner CU When using full LTO it is possible that a template function definition DIE is bound to one compilation unit and its declaration to another. We should add function declaration attributes on behalf of its owner CU, otherwise we may end up with a malformed file identifier in the function declaration's DW_AT_decl_file attribute. Differential revision: https://reviews.llvm.org/D58538 llvm-svn: 354978	2019-02-27 14:46:59 +00:00
Petar Avramovic	bd39569913	[MIPS GlobalISel] Select G_UADDO Lower G_UADDO. Legalize G_UADDO for MIPS32 Differential Revision: https://reviews.llvm.org/D58671 llvm-svn: 354900	2019-02-26 17:22:42 +00:00

26144 Commits