llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	62f0d1650a	[SLP] Add support for swapping icmp/fcmp predicates to permit vectorization We should be able to match elements with the swapped predicate as well - as long as we commute the source operands. Differential Revision: https://reviews.llvm.org/D59956 llvm-svn: 357243	2019-03-29 10:41:00 +00:00
Florian Hahn	2b85de4383	Revert Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock." Another buildbot failure http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/20402 clang-9: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/llvm/include/llvm/ADT/DenseMap.h:1228: llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type* llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::operator->() const [with KeyT = const llvm::Instruction; ValueT = unsigned int; KeyInfoT = llvm::DenseMapInfo<const llvm::Instruction>; Bucket = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>; bool IsConst = false; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::pointer = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>; llvm::DenseMapIterator<KeyT, ValueT, KeyInfoT, Bucket, IsConst>::value_type = llvm::detail::DenseMapPair<const llvm::Instruction, unsigned int>]: Assertion `isHandleInSync() && "invalid iterator access!"' failed. 0. Program arguments: /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/bin/clang-9 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -disable-free -main-file-name ArchiveCommandLine.cpp -mrelocation-model static -mthread-model posix -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu skylake-avx512 -dwarf-column-info -debugger-tuning=gdb -momit-leaf-frame-pointer -coverage-notes-file /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip/Output/ArchiveCommandLine.llvm.gcno -resource-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0 -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/include -I ../../../include -D _GNU_SOURCE -D __STDC_LIMIT_MACROS -D NDEBUG -D BREAK_HANDLER -D UNICODE -D _UNICODE -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/C -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/myWindows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/include_windows -I /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP -I . -D _FILE_OFFSET_BITS=64 -D _LARGEFILE_SOURCE -D NDEBUG -D _REENTRANT -D ENV_UNIX -D _7ZIP_LARGE_PAGES -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/backward -internal-isystem /usr/local/include -internal-isystem /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/stage1.install/lib/clang/9.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -O3 -std=gnu++98 -fdeprecated-macro -fdebug-compilation-dir /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/7zip -ferror-limit 19 -fmessage-length 0 -pthread -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -vectorize-loops -vectorize-slp -o Output/ArchiveCommandLine.llvm.o -x c++ /home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/test/test-suite/MultiSource/Benchmarks/7zip/CPP/7zip/UI/Common/ArchiveCommandLine.cpp -faddrsig This reverts r357222 (git commit `64cccfcc72`) llvm-svn: 357227	2019-03-29 00:22:26 +00:00
Florian Hahn	64cccfcc72	Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock." Recommitting after addressing a buildbot failure. This reverts commit `c87869ebea`. llvm-svn: 357222	2019-03-28 23:11:00 +00:00
Florian Hahn	45682fd633	[LSR] Fix signed overflow in GenerateCrossUseConstantOffsets. For the attached test case, unchecked addition of immediate starts and ends overflows, as they can be arbitrary i64 constants. Proof: https://rise4fun.com/Alive/Plqc Reviewers: qcolombet, gilr, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59218 llvm-svn: 357217	2019-03-28 22:17:29 +00:00
Florian Hahn	c87869ebea	Revert [DSE] Preserve basic block ordering using OrderedBasicBlock. This reverts r357208 (git commit `c0bfd37d38`) This causes a buildbot failure: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/16124 FAILED: lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/install/stage2/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/IR -I/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/IR -Iinclude -I/home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/include -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -std=c++11 -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -flto=thin -O3 -UNDEBUG -fno-exceptions -fno-rtti -MD -MT lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o -MF lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o.d -o lib/IR/CMakeFiles/LLVMCore.dir/IRBuilder.cpp.o -c /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/IR/IRBuilder.cpp clang-9: /home/buildslave/ps4-buildslave1/clang-with-thin-lto-ubuntu/llvm.src/lib/Analysis/OrderedBasicBlock.cpp:38: bool llvm::OrderedBasicBlock::comesBefore(const llvm::Instruction , const llvm::Instruction ): Assertion `!(LastInstFound == BB->end() && NextInstPos != 0) && "Instruction supposed to be in NumberedInsts"' failed. llvm-svn: 357211	2019-03-28 20:36:24 +00:00
Florian Hahn	c0bfd37d38	[DSE] Preserve basic block ordering using OrderedBasicBlock. By extending OrderedBB to allow removing and replacing cached instructions, we can preserve OrderedBBs in DSE easily. This eliminates one source of quadratic compile time in DSE. Fixes PR38829. Reviewers: rnk, efriedma, hfinkel Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59789 llvm-svn: 357208	2019-03-28 20:02:33 +00:00
Benjamin Kramer	ba2ea93ad1	Make helper functions static. NFC. llvm-svn: 357187	2019-03-28 17:18:42 +00:00
Pierre Gousseau	a833c2bd3e	[asan] Add options -asan-detect-invalid-pointer-cmp and -asan-detect-invalid-pointer-sub options. This is in preparation to a driver patch to add gcc 8's -fsanitize=pointer-compare and -fsanitize=pointer-subtract. Disabled by default as this is still an experimental feature. Reviewed By: morehouse, vitalybuka Differential Revision: https://reviews.llvm.org/D59220 llvm-svn: 357157	2019-03-28 10:51:24 +00:00
Florian Hahn	e21ed594d8	[VPlan] Determine Vector Width programmatically. With this change, the VPlan native path is triggered with the directive: #pragma clang loop vectorize(enable) There is no need to specify the vectorize_width(N) clause. Patch by Francesco Petrogalli <francesco.petrogalli@arm.com> Differential Revision: https://reviews.llvm.org/D57598 llvm-svn: 357156	2019-03-28 10:37:12 +00:00
Nikita Popov	7462303e06	[InstCombine] Use uadd.sat and usub.sat for canonicalization Start using the uadd.sat and usub.sat intrinsics for the existing canonicalizations. These intrinsics should optimize better than expanded IR, have better handling in the X86 backend and should be no worse than expanded IR in other backends, as far as we know. rL357012 already introduced use of uadd.sat for the add+umin pattern. Differential Revision: https://reviews.llvm.org/D58872 llvm-svn: 357103	2019-03-27 17:56:15 +00:00
Sanjay Patel	81e8d76f5b	[InstCombine] form uaddsat from add+umin (PR14613) This is the last step towards solving the examples shown in: https://bugs.llvm.org/show_bug.cgi?id=14613 With this change, x86 should end up with psubus instructions when those are available. All known codegen issues with expanding the saturating intrinsics were resolved with: D59006 / rL356855 We also have some early evidence in D58872 that using the intrinsics will lead to better perf. If some target regresses from this, custom lowering of the intrinsics (as in the above for x86) may be needed. llvm-svn: 357012	2019-03-26 17:50:08 +00:00
Simon Pilgrim	6f96795b88	[SLPVectorizer] Merge reorderAltShuffleOperands into reorderInputsAccordingToOpcode As discussed on D59738, this generalizes reorderInputsAccordingToOpcode to handle multiple + non-commutative instructions so we can get rid of reorderAltShuffleOperands and make use of the extra canonicalizations that reorderInputsAccordingToOpcode brings. Differential Revision: https://reviews.llvm.org/D59784 llvm-svn: 356939	2019-03-25 20:05:27 +00:00
Simon Pilgrim	ff3abef395	[SLPVectorizer] reorderInputsAccordingToOpcode - remove non-Instruction canonicalization Remove attempts to commute non-Instructions to the LHS - the codegen changes appear to rely on chance more than anything else and also have a tendency to fight existing instcombine canonicalization which moves constants to the RHS of commutable binary ops. This is prep work towards: (a) reusing reorderInputsAccordingToOpcode for alt-shuffles and removing the similar reorderAltShuffleOperands (b) improving reordering to optimized cases with commutable and non-commutable instructions to still find splat/consecutive ops. Differential Revision: https://reviews.llvm.org/D59738 llvm-svn: 356913	2019-03-25 15:53:55 +00:00
Simon Pilgrim	5cd4eb96f6	[SLPVectorizer] shouldReorderOperands - just check for reordering. NFCI. Remove the I.getOperand() calls from inside shouldReorderOperands - reorderInputsAccordingToOpcode should handle the creation of the operand lists and shouldReorderOperands should just check to see whether the i'th element should be commuted. llvm-svn: 356854	2019-03-24 13:36:32 +00:00
Nikita Popov	977934f00f	[ConstantRange] Add getFull() + getEmpty() named constructors; NFC This adds ConstantRange::getFull(BitWidth) and ConstantRange::getEmpty(BitWidth) named constructors as more readable alternatives to the current ConstantRange(BitWidth, /* full */ false) and similar. Additionally private getFull() and getEmpty() member functions are added which return a full/empty range with the same bit width -- these are commonly needed inside ConstantRange.cpp. The IsFullSet argument in the ConstantRange(BitWidth, IsFullSet) constructor is now mandatory for the few usages that still make use of it. Differential Revision: https://reviews.llvm.org/D59716 llvm-svn: 356852	2019-03-24 09:34:40 +00:00
Simon Pilgrim	1466e5c383	Fix unused variable warning on non-asserts builds. NFCI. llvm-svn: 356841	2019-03-23 16:56:23 +00:00
Simon Pilgrim	64feec7977	Remove unused function argument. NFCI. llvm-svn: 356840	2019-03-23 16:20:34 +00:00
Simon Pilgrim	c7ba9555cf	[SLPVectorizer] reorderInputsAccordingToOpcode - use InstructionState directly. NFCI. llvm-svn: 356832	2019-03-23 13:44:06 +00:00
Nikita Popov	0125e4484e	[LowerSwitch] Use ConstantRange::fromKnownBits(); NFC Using an unsigned range to stay NFC, but a signed range would really be more useful here. llvm-svn: 356831	2019-03-23 12:48:54 +00:00
Simon Pilgrim	f4f01f3cff	[SLPVectorizer] Don't repeat VL.size() call. NFCI. llvm-svn: 356830	2019-03-23 12:11:25 +00:00
Simon Pilgrim	b68322f9d0	[SLP] Remove redundancy of performing operand reordering twice: once in buildTree() and later in vectorizeTree(). This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree(). To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order. This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo Patch by: @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59059 llvm-svn: 356814	2019-03-22 21:27:11 +00:00
Daniel Sanders	ef8761fd3b	Fix non-determinism in Reassociate caused by address coincidences Summary: Between building the pair map and querying it there are a few places that erase and create Values. It's rare but the address of these newly created Values is occasionally the same as a just-erased Value that we already have in the pair map. These coincidences should be accounted for to avoid non-determinism. Thanks to Roman Tereshin for the test case. Reviewers: rtereshin, bogner Reviewed By: rtereshin Subscribers: mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59401 llvm-svn: 356803	2019-03-22 20:16:35 +00:00
Tim Renouf	94c163c34e	InstCombineSimplifyDemanded: Allow v3 results for AMDGCN buffer and image intrinsics This helps to avoid the situation where RA spots that only 3 of the v4f32 result of a load are used, and immediately reallocates the 4th register for something else, requiring a stall waiting for the load. Differential Revision: https://reviews.llvm.org/D58906 Change-Id: I947661edfd5715f62361a02b100f14aeeada29aa llvm-svn: 356768	2019-03-22 15:53:50 +00:00
Akira Hatanaka	b576c77a9e	Don't add a tail keyword to calls to ObjC runtime functions if the calls are annotated with notail. r356705 annotated calls to objc_retainAutoreleasedReturnValue with notail on x86-64. This commit teaches ARC optimizer to check the notail marker on the call before turning it into a tail call. rdar://problem/38675807 llvm-svn: 356707	2019-03-21 20:16:09 +00:00
Craig Topper	16dc165046	[InstCombine] Don't transform ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2) if either zext or OP has another use. If they have other users we'll just end up increasing the instruction count. We might be able to weaken this to only one of them having a single use if we can prove that the and will be removed. Fixes PR41164. Differential Revision: https://reviews.llvm.org/D59630 llvm-svn: 356690	2019-03-21 17:50:49 +00:00
Simon Pilgrim	92cbcfc325	Fix -Wmisleading-indentation gcc7 warning. NFCI. llvm-svn: 356658	2019-03-21 11:58:22 +00:00
Mikael Holmen	5b1754f93d	Silence warning about unused variable in builds without asserts [NFC] llvm-svn: 356648	2019-03-21 07:54:44 +00:00
Philip Reames	60212be619	[instcombine] Add some todos, and arrange code for readibility llvm-svn: 356642	2019-03-21 03:23:40 +00:00
Alina Sbirlea	f69f807321	[NFC] Fix brace indentation. llvm-svn: 356596	2019-03-20 19:18:55 +00:00
Robert Lougher	f2158a8ef0	Resubmit r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines" Failing LLD tests have been fixed in r356593. llvm-svn: 356594	2019-03-20 19:08:18 +00:00
Philip Reames	e4588bbf80	Simplify operands of masked stores and scatters based on demanded elements If we know we're not storing a lane, we don't need to compute the lane. This could be improved by using the undef element result to further prune the mask, but I want to separate that into its own change since it's relatively likely to expose other problems. Differential Revision: https://reviews.llvm.org/D57247 llvm-svn: 356590	2019-03-20 18:44:58 +00:00
Alina Sbirlea	5baa72ea74	[LICM & MemorySSA] Don't sink/hoist stores in the presence of ordered loads. Summary: Before this patch, if any Use existed in the loop, with a defining access in the loop, we conservatively decide to not move the store. What this approach was missing, is that ordered loads are not Uses, they're Defs in MemorySSA. So, even when the clobbering walker does not find that volatile load to interfere, we still cannot hoist a store past a volatile load. Resolves PR41140. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59564 llvm-svn: 356588	2019-03-20 18:33:37 +00:00
Nikita Popov	37cf25c3c6	[InstCombine] Fold add nuw + uadd.with.overflow Fold add nuw and uadd.with.overflow with constants if the addition does not overflow. Part of https://bugs.llvm.org/show_bug.cgi?id=38146. Patch by Dan Robertson. Differential Revision: https://reviews.llvm.org/D59471 llvm-svn: 356584	2019-03-20 18:00:27 +00:00
Philip Reames	484d07c828	[instcombine] Add todos describing missing transforms for masked.* intrinsics llvm-svn: 356536	2019-03-20 03:36:05 +00:00
Robert Lougher	c67a759c99	Revert r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines" Due to buildbot failures (LLD tests). llvm-svn: 356516	2019-03-19 20:54:20 +00:00
Robert Lougher	de548ccab9	[TailCallElim] Add tailcall elimination pass to LTO pipelines LTO provides additional opportunities for tailcall elimination due to link-time inlining and visibility of nocapture attribute. Testing showed negligible impact on compilation times. Differential Revision: https://reviews.llvm.org/D58391 llvm-svn: 356511	2019-03-19 20:24:28 +00:00
Philip Reames	70537abe52	Demanded elements support for masked.load and masked.gather Teach instcombine to propagate demanded elements through a masked load or masked gather instruction. This is in the broader context of improving vector pointer instcombine under https://reviews.llvm.org/D57140. Differential Revision: https://reviews.llvm.org/D57372 llvm-svn: 356510	2019-03-19 20:10:00 +00:00
Sanjay Patel	5b820323ca	[InstCombine] fold logic-of-nan-fcmps (PR41069) Combine 2 fcmps that are checking for nan-ness: and (fcmp ord X, 0), (and (fcmp ord Y, 0), Z) --> and (fcmp ord X, Y), Z or (fcmp uno X, 0), (or (fcmp uno Y, 0), Z) --> or (fcmp uno X, Y), Z This is an exact match for a minimal reassociation pattern. If we want to handle this more generally that should go in the reassociate pass and allow removing this code. This should fix: https://bugs.llvm.org/show_bug.cgi?id=41069 llvm-svn: 356471	2019-03-19 16:39:17 +00:00
Markus Lavin	b86ce219f4	[DebugInfo] Introduce DW_OP_LLVM_convert Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows for a convenient way to perform type conversions on the Dwarf expression stack. As an additional bonus it paves the way for using other Dwarf v5 ops that need to reference a base_type. The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp to perform sext/zext on debug values but mainly the patch is about preparing terrain for adding other Dwarf v5 ops that need to reference a base_type. For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a complex shift & mask pattern is generated to emulate sext/zext. This is a recommit of r356442 with trivial fixes for the failing tests. Differential Revision: https://reviews.llvm.org/D56587 llvm-svn: 356451	2019-03-19 13:16:28 +00:00
Markus Lavin	ad78768d59	Revert "[DebugInfo] Introduce DW_OP_LLVM_convert" This reverts commit 1cf4b593a7ebd666fc6775f3bd38196e8e65fafe. Build bots found failing tests not detected locally. Failing Tests (3): LLVM :: DebugInfo/Generic/convert-debugloc.ll LLVM :: DebugInfo/Generic/convert-inlined.ll LLVM :: DebugInfo/Generic/convert-linked.ll llvm-svn: 356444	2019-03-19 09:17:28 +00:00
Markus Lavin	cd8a940b37	[DebugInfo] Introduce DW_OP_LLVM_convert Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows for a convenient way to perform type conversions on the Dwarf expression stack. As an additional bonus it paves the way for using other Dwarf v5 ops that need to reference a base_type. The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp to perform sext/zext on debug values but mainly the patch is about preparing terrain for adding other Dwarf v5 ops that need to reference a base_type. For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a complex shift & mask pattern is generated to emulate sext/zext. Differential Revision: https://reviews.llvm.org/D56587 llvm-svn: 356442	2019-03-19 08:48:19 +00:00
Sanjay Patel	6063393536	[InstCombine] allow general vector constants for funnel shift to shift transforms Follow-up to: rL356338 rL356369 We can calculate an arbitrary vector constant minus the bitwidth, so there's no need to limit this transform to scalars and splats. llvm-svn: 356372	2019-03-18 14:27:51 +00:00
Sanjay Patel	84de8a30a0	[InstCombine] extend rotate-left-by-constant canonicalization to funnel shift Follow-up to: rL356338 Rotates are a special case of funnel shift where the 2 input operands are the same value, but that does not need to be a restriction for the canonicalization when the shift amount is a constant. llvm-svn: 356369	2019-03-18 14:10:11 +00:00
Sanjay Patel	b3bcd95771	[InstCombine] canonicalize rotate right by constant to rotate left This was noted as a backend problem: https://bugs.llvm.org/show_bug.cgi?id=41057 ...and subsequently fixed for x86: rL356121 But we should canonicalize these in IR for the benefit of all targets and improve IR analysis such as CSE. llvm-svn: 356338	2019-03-17 19:08:00 +00:00
Philip Reames	68a2e4d48b	[SimplifyDemandedVec] Strengthen handling all undef lanes (particularly GEPs) A change of two parts: 1) A generic enhancement for all callers of SDVE to exploit the fact that if all lanes are undef, the result is undef. 2) A GEP specific piece to strengthen/fix the vector index undef element handling, and call into the generic infrastructure when visiting the GEP. The result is that we replace a vector gep with at least one undef in each lane with a undef. We can also do the same for vector intrinsics. Once the masked.load patch (D57372) has landed, I'll update to include call tests as well. Differential Revision: https://reviews.llvm.org/D57468 llvm-svn: 356293	2019-03-15 19:54:06 +00:00
Robert Widmann	2f1ebe6ee8	[LLVM-C] Expose the "Add Discriminators" Pass To LLVM-C Summary: Add bindings to create a wrapped "Add Discriminators" pass. Now that we have debug info support, this is a handy transform to have. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: dblaikie, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58624 llvm-svn: 356272	2019-03-15 16:57:23 +00:00
Teresa Johnson	70ec64cb72	[ThinLTO] Restructure AliasSummary to contain ValueInfo of Aliasee Summary: The AliasSummary previously contained the AliaseeGUID, which was only populated when reading the summary from bitcode. This patch changes it to instead hold the ValueInfo of the aliasee, and always populates it. This enables more efficient access to the ValueInfo (specifically in the recent patch r352438 which needed to perform an index hash table lookup using the aliasee GUID). As noted in the comments in AliasSummary, we no longer technically need to keep a pointer to the corresponding aliasee summary, since it could be obtained by walking the list of summaries on the ValueInfo looking for the summary in the same module. However, I am concerned that this would be inefficient when walking through the index during the thin link for various analyses. That can be reevaluated in the future. By always populating this new field, we can remove the guard and special handling for a 0 aliasee GUID when dumping the dot graph of the summary. An additional improvement in this patch is when reading the summaries from LLVM assembly we now set the AliaseeSummary field to the aliasee summary in that same module, which makes it consistent with the behavior when reading the summary from bitcode. Reviewers: pcc, mehdi_amini Subscribers: inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D57470 llvm-svn: 356268	2019-03-15 15:11:38 +00:00
Florian Hahn	d9e88f7b7f	[LSR] Check for signed overflow in NarrowSearchSpaceByDetectingSupersets. We are adding a sign extended IR value to an int64_t, which can cause signed overflows, as in the attached test case, where we have a formula with BaseOffset = -1 and a constant with numeric_limits<int64_t>::min(). If the addition would overflow, skip the simplification for this formula. Note that the target triple is required to trigger the failure. Reviewers: qcolombet, gilr, kparzysz, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59211 llvm-svn: 356256	2019-03-15 12:17:36 +00:00
Matt Arsenault	1d83670dbd	AMDGPU: Remove intrinsic operand assert Before r355981, this was under LLVM_DEBUG. I don't think the assert is quite right, but this really should be a verifier check. Instcombine should not be asserting on this sort of thing. llvm-svn: 356219	2019-03-14 23:45:09 +00:00
Sanjay Patel	de1d5d3675	[InstCombine] canonicalize funnel shift constant shift amount to be modulo bitwidth The shift argument is defined to be modulo the bitwidth, so if that argument is a constant, we can always reduce the constant to its minimal form to allow better CSE and other follow-on transforms. We need to be careful to ignore constant expressions here, or we will likely infinite loop. I'm adding a general vector constant query for that case. Differential Revision: https://reviews.llvm.org/D59374 llvm-svn: 356192	2019-03-14 19:22:08 +00:00
Sam Parker	a86ff8640d	Fix for buildbots Remove unused private field. llvm-svn: 356135	2019-03-14 11:38:55 +00:00
Sam Parker	eb0b8019e8	[NFC][LSR] Cleanup Cost API Create members for Loop, ScalarEvolution, DominatorTree, TargetTransformInfo and Formula. Differential Revision: https://reviews.llvm.org/D58389 llvm-svn: 356131	2019-03-14 11:05:07 +00:00
Matt Arsenault	caf1316f71	IR: Add immarg attribute This indicates an intrinsic parameter is required to be a constant, and should not be replaced with a non-constant value. Add the attribute to all AMDGPU and generic intrinsics that comments indicate it should apply to. I scanned other target intrinsics, but I don't see any obvious comments indicating which arguments are intended to be only immediates. This breaks one questionable testcase for the autoupgrade. I'm unclear on whether the autoupgrade is supposed to really handle declarations which were never valid. The verifier fails because the attributes now refer to a parameter past the end of the argument list. llvm-svn: 355981	2019-03-12 21:02:54 +00:00
Philip Reames	9b6b4fac83	[SROA] Fix a crash when trying to convert a memset to an non-integral pointer type The included test case currently crashes on tip of tree. Rather than adding a bailout, I chose to restructure the code so that the existing helper function could be used. Given that, the majority of the diff is NFC-ish, but the key difference is that canConvertValue returns false when only one side is a non-integral pointer. Thanks to Cherry Zhang for the test case. Differential Revision: https://reviews.llvm.org/D59000 llvm-svn: 355962	2019-03-12 20:15:05 +00:00
Craig Topper	03e93f514a	[SanitizerCoverage] Avoid splitting critical edges when destination is a basic block containing unreachable This patch adds a new option to SplitAllCriticalEdges and uses it to avoid splitting critical edges when the destination basic block ends with unreachable. Otherwise if we split the critical edge, sanitizer coverage will instrument the new block that gets inserted for the split. But since this block itself shouldn't be reachable this is pointless. These basic blocks will stick around and generate assembly, but they don't end in sane control flow and might get placed at the end of the function. This makes it look like one function has code that flows into the next function. This showed up while compiling the linux kernel with clang. The kernel has a tool called objtool that detected the code that appeared to flow from one function to the next. https://github.com/ClangBuiltLinux/linux/issues/351#issuecomment-461698884 Differential Revision: https://reviews.llvm.org/D57982 llvm-svn: 355947	2019-03-12 18:20:25 +00:00
Liang Zou	4a8afeb970	[format] \t => ' ' Summary: 1. \t => ' ' 2. test commit access Reviewers: Higuoxing, liangdzou Reviewed By: Higuoxing, liangdzou Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59243 llvm-svn: 355924	2019-03-12 14:48:32 +00:00
Fangrui Song	b1dfbebe8b	[SimplifyLibCalls] Simplify optimizePuts The code might intend to replace puts("") with putchar('\n') even if the return value is used. It failed because use_empty() was used to guard the whole block. While returning '\n' (putchar('\n')) is technically correct (puts is only required to return a nonnegative number on success), doing this looks weird and there is really little benefit to optimize puts whose return value is used. So don't do that. llvm-svn: 355921	2019-03-12 14:20:22 +00:00
Simon Pilgrim	d3a8fd8bfb	Revert rL355906: [SLP] Remove redundancy of performing operand reordering twice: once in buildTree() and later in vectorizeTree(). This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree(). To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order. This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo Patch by: @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59059 ........ Reverted due to buildbot failures that I don't have time to track down. llvm-svn: 355913	2019-03-12 11:51:59 +00:00
Simon Pilgrim	5db95efdbd	Try to fix SLPVectorizer BoUpSLP::BoEdgeInfo::dump visibility on non-debug builds llvm-svn: 355912	2019-03-12 11:31:06 +00:00
Simon Pilgrim	2086a8894d	[SLP] Remove redundancy of performing operand reordering twice: once in buildTree() and later in vectorizeTree(). This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree(). To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order. This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo Patch by: @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59059 llvm-svn: 355906	2019-03-12 10:51:51 +00:00
Fangrui Song	f260967055	[SimplifyLibCalls] Fix comments about fputs, memchr, and s[n]printf. NFC llvm-svn: 355905	2019-03-12 10:31:52 +00:00
Kristina Brooks	5b1e1c0537	Very minor typo. NFC Typo `we we're` => `we were` in the pass EarlyCSE Patch by liangdzou (Liang ZOU) Differential Revision: https://reviews.llvm.org/D59241 llvm-svn: 355895	2019-03-12 07:08:19 +00:00
Sanjoy Das	3f5ce18658	Reland "Relax constraints for reduction vectorization" Change from original commit: move test (that uses an X86 triple) into the X86 subdirectory. Original description: Gating vectorizing reductions on all fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355889	2019-03-12 01:31:44 +00:00
Sanjoy Das	2136a5bc49	Revert "Relax constraints for reduction vectorization" This reverts commit r355868. Breaks hexagon. llvm-svn: 355873	2019-03-11 22:37:31 +00:00
Sanjoy Das	93f8cc186a	Relax constraints for reduction vectorization Summary: Gating vectorizing reductions on all fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355868	2019-03-11 21:36:41 +00:00
Nico Weber	885b790f89	Remove esan. It hasn't seen active development in years, and it hasn't reached a state where it was useful. Remove the code until someone is interested in working on it again. Differential Revision: https://reviews.llvm.org/D59133 llvm-svn: 355862	2019-03-11 20:23:40 +00:00
Brian Gesiak	d7b68132d8	[coroutines][PR40979] Ignore unreachable uses across suspend points Summary: Depends on https://reviews.llvm.org/D59069. https://bugs.llvm.org/show_bug.cgi?id=40979 describes a bug in which the -coro-split pass would assert that a use was across a suspend point from a definition. Normally this would mean that a value would "spill" across a suspend point and thus need to be stored in the coroutine frame. However, in this case the use was unreachable, and so it would not be necessary to store the definition on the frame. To prevent the assert, simply remove unreachable basic blocks from a coroutine function before computing spills. This avoids the assert reported in PR40979. Reviewers: GorNishanov, tks2103 Reviewed By: GorNishanov Subscribers: EricWF, jdoerfert, llvm-commits, lewissbaker Tags: #llvm Differential Revision: https://reviews.llvm.org/D59068 llvm-svn: 355852	2019-03-11 18:31:28 +00:00
Brian Gesiak	4349dc76fa	[Utils] Extract EliminateUnreachableBlocks (NFC) Summary: Extract the functionality of eliminating unreachable basic blocks within a function, previously encapsulated within the -unreachableblockelim pass, and make it available as a function within BlockUtils.h. No functional change intended other than making the logic reusable. Exposing this logic makes it easier to implement https://reviews.llvm.org/D59068, which fixes coroutines bug https://bugs.llvm.org/show_bug.cgi?id=40979. Reviewers: mkazantsev, wmi, davidxl, silvas, davide Reviewed By: davide Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59069 llvm-svn: 355846	2019-03-11 17:51:57 +00:00
Jeremy Morse	90ede5f4bf	[SimplifyCFG] Retain debug info when threading jumps with critical edges Fixes bug 38023: https://bugs.llvm.org/show_bug.cgi?id=38023 The SimplifyCFG pass will perform jump threading in some cases where doing so is trivial and would simplify the CFG. When folding a series of blocks with redundant conditional branches into an unconditional "critical edge" block, it does not keep the debug location associated with the previous conditional branch. This patch fixes the bug described by copying the debug info from the old conditional branch to the new unconditional branch instruction, and adds a regression test for the SimplifyCFG pass that covers this case. Patch by Stephen Tozer! Differential Revision: https://reviews.llvm.org/D59206 llvm-svn: 355833	2019-03-11 16:23:59 +00:00
Jeremy Morse	b60aea4131	[JumpThreading] Retain debug info when replacing branch instructions Fixes bug 37966: https://bugs.llvm.org/show_bug.cgi?id=37966 The Jump Threading pass will replace certain conditional branch instructions with unconditional branches when it can prove that only one branch can occur. Prior to this patch, it would not carry the debug info from the old instruction to the new one. This patch fixes the bug described by copying the debug info from the conditional branch instruction to the new unconditional branch instruction, and adds a regression test for the Jump Threading pass that covers this case. Patch by Stephen Tozer! Differential Revision: https://reviews.llvm.org/D58963 llvm-svn: 355822	2019-03-11 11:48:57 +00:00
Clement Courbet	8e16d73346	[SelectionDAG] Allow the user to specify a memeq function. Summary: Right now, when we encounter a string equality check, e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a small compile-time constant, and fall back on calling `memcmp()` else. This is sub-optimal because memcmp has to compute much more than equality. This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms that support `bcmp`. `bcmp` can be made much more efficient than `memcmp` because equality compare is trivially parallel while lexicographic ordering has a chain dependency. Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits Differential Revision: https://reviews.llvm.org/D56593 llvm-svn: 355672	2019-03-08 09:07:45 +00:00
David Green	ffc922ec35	[LSR] Attempt to increase the accuracy of LSR's setup cost In some loops, we end up generating loop induction variables that look like: {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1} As opposed to the simpler: {(zext i16 (%i0 * %i1) to i32),+,-1} i.e we count up from -limit to 0, not the simpler counting down from limit to 0. This is because the scores, as LSR calculates them, are the same and the second is filtered in place of the first. We end up with a redundant SUB from 0 in the code. This patch tries to make the calculation of the setup cost a little more thoroughly, recursing into the scev members to better approximate the setup required. The cost function for comparing LSR costs is: return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds, C1.ScaleCost, C1.ImmCost, C1.SetupCost) < std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds, C2.ScaleCost, C2.ImmCost, C2.SetupCost); So this will only alter results if none of the other variables turn out to be different. Differential Revision: https://reviews.llvm.org/D58770 llvm-svn: 355597	2019-03-07 13:44:40 +00:00
Fangrui Song	b0f764c737	[BDCE] Optimize find+insert with early insert llvm-svn: 355583	2019-03-07 06:38:03 +00:00
Nick Desaulniers	212c8ac23f	[LoopRotate] fix crash encountered with callbr Summary: While implementing inlining support for callbr (https://bugs.llvm.org/show_bug.cgi?id=40722), I hit a crash in Loop Rotation when trying to build the entire x86 Linux kernel (drivers/char/random.c). This is a small fix up to r353563. Test case is drivers/char/random.c (with callbr's inlined), then ran through creduce, then `opt -opt-bisect-limit=<limit>`, then bugpoint. Thanks to Craig Topper for immediately spotting the fix, and teaching me how to fish. Reviewers: craig.topper, jyknight Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D58929 llvm-svn: 355564	2019-03-06 23:04:40 +00:00
Nikita Popov	884feb1b69	[InstCombine] Fold add nsw + sadd.with.overflow Fold `add nsw` and `sadd.with.overflow` with constants if the addition does not overflow. Part of https://bugs.llvm.org/show_bug.cgi?id=38146. Patch by Dan Robertson. Differential Revision: https://reviews.llvm.org/D58881 llvm-svn: 355530	2019-03-06 18:30:00 +00:00
Evgeniy Stepanov	53d7c5cd44	[msan] Instrument x86 BMI intrinsics. Summary: They simply shuffle bits. MSan needs to do the same with shadow bits, after making sure that the shuffle mask is fully initialized. Reviewers: pcc, vitalybuka Subscribers: hiraditya, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D58858 llvm-svn: 355348	2019-03-04 22:58:20 +00:00
Jordan Rupprecht	090683b85e	[NFC] Fix PGO link error in shared libs build llvm-svn: 355346	2019-03-04 22:54:44 +00:00
Sanjay Patel	6e32b46b1d	[ConstantHoisting] avoid hang/crash from unreachable blocks (PR40930) I'm not too familiar with this pass, so there might be a better solution, but this appears to fix the degenerate: PR40930 PR40931 PR40932 PR40934 ...without affecting any real-world code. As we've seen in several other passes, when we have unreachable blocks, they can contain semi-bogus IR and/or cause unexpected conditions. We would not typically expect these patterns to make it this far, but we have to guard against them anyway. llvm-svn: 355337	2019-03-04 20:57:14 +00:00
Rong Xu	db29a3a438	[PGO] Context sensitive PGO (part 3) Part 3 of CSPGO changes (mostly related to PassMananger). Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355330	2019-03-04 20:21:27 +00:00
Davide Italiano	672bec223d	[InstCombine] Mark debug values as unavailable after DCE. Fixes PR40838. llvm-svn: 355301	2019-03-04 04:38:58 +00:00
Sanjay Patel	1f65903dc1	[InstCombine] move add after smin/smax Follow-up to rL355221. This isn't specifically called for within PR14613, but we'll get there eventually if it's not already requested in some other bug report. https://rise4fun.com/Alive/5b0 Name: smax Pre: WillNotOverflowSignedSub(C1,C0) %a = add nsw i8 %x, C0 %cond = icmp sgt i8 %a, C1 %r = select i1 %cond, i8 %a, i8 C1 => %c2 = icmp sgt i8 %x, C1-C0 %u2 = select i1 %c2, i8 %x, i8 C1-C0 %r = add nsw i8 %u2, C0 Name: smin Pre: WillNotOverflowSignedSub(C1,C0) %a = add nsw i32 %x, C0 %cond = icmp slt i32 %a, C1 %r = select i1 %cond, i32 %a, i32 C1 => %c2 = icmp slt i32 %x, C1-C0 %u2 = select i1 %c2, i32 %x, i32 C1-C0 %r = add nsw i32 %u2, C0 llvm-svn: 355272	2019-03-02 16:45:10 +00:00
Philip Reames	cf0a978e1f	[InstCombine] Extend saturating idempotent atomicrmw transform to FP I'm assuming that the nan propogation logic for InstructonSimplify's handling of fadd and fsub is correct, and applying the same to atomicrmw. Differential Revision: https://reviews.llvm.org/D58836 llvm-svn: 355222	2019-03-01 19:50:36 +00:00
Sanjay Patel	6e1e7e1c3e	[InstCombine] move add after umin/umax In the motivating cases from PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 ...moving the add enables us to narrow the min/max which eliminates zext/trunc which enables signficantly better vectorization. But that bug is still not completely fixed. https://rise4fun.com/Alive/5KQ Name: umax Pre: C1 u>= C0 %a = add nuw i8 %x, C0 %cond = icmp ugt i8 %a, C1 %r = select i1 %cond, i8 %a, i8 C1 => %c2 = icmp ugt i8 %x, C1-C0 %u2 = select i1 %c2, i8 %x, i8 C1-C0 %r = add nuw i8 %u2, C0 Name: umin Pre: C1 u>= C0 %a = add nuw i32 %x, C0 %cond = icmp ult i32 %a, C1 %r = select i1 %cond, i32 %a, i32 C1 => %c2 = icmp ult i32 %x, C1-C0 %u2 = select i1 %c2, i32 %x, i32 C1-C0 %r = add nuw i32 %u2, C0 llvm-svn: 355221	2019-03-01 19:42:40 +00:00
Philip Reames	2226e9a745	[LICM] Infer proper alignment from loads during scalar promotion This patch fixes an issue where we would compute an unnecessarily small alignment during scalar promotion when no store is not to be guaranteed to execute, but we've proven load speculation safety. Since speculating a load requires proving the existing alignment is valid at the new location (see Loads.cpp), we can use the alignment fact from the load. For non-atomics, this is a performance problem. For atomics, this is a correctness issue, though an incredibly rare one to see in practice. For atomics, we might not be able to lower an improperly aligned load or store (i.e. i32 align 1). If such an instruction makes it all the way to codegen, we may fail to codegen the operation, or we may simply generate a slow call to a library function. The part that makes this super hard to see in practice is that the memory location actually is well aligned, and instcombine knows that. So, to see a failure, you have to have a) hit the bug in LICM, b) somehow hit a depth limit in InstCombine/ValueTracking to avoid fixing the alignment, and c) then have generated an instruction which fails codegen rather than simply emitting a slow libcall. All around, pretty hard to hit. Differential Revision: https://reviews.llvm.org/D58809 llvm-svn: 355217	2019-03-01 18:45:05 +00:00
Philip Reames	77982868c5	[InstCombine] Extend "idempotent" atomicrmw optimizations to floating point An idempotent atomicrmw is one that does not change memory in the process of execution. We have already added handling for the various integer operations; this patch extends the same handling to floating point operations which were recently added to IR. Note: At the moment, we canonicalize idempotent fsub to fadd when ordering requirements prevent us from using a load. As discussed in the review, I will be replacing this with canonicalizing both floating point ops to integer ops in the near future. Differential Revision: https://reviews.llvm.org/D58251 llvm-svn: 355210	2019-03-01 18:00:07 +00:00
Jonas Hahnfeld	e071cd86df	Hide two unused debugging methods, NFCI. GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used in Release builds. Hide them behind 'ifndef NDEBUG'. llvm-svn: 355205	2019-03-01 17:15:21 +00:00
Manman Ren	576124a319	Try to fix NetBSD buildbot breakage introduced in D57463. By including the header file in the source. llvm-svn: 355202	2019-03-01 15:25:24 +00:00
Fangrui Song	f4b25f700a	[ConstantHoisting] Call cleanup() in ConstantHoistingPass::runImpl to avoid dangling elements in ConstIntInfoVec for new PM Summary: ConstIntInfoVec contains elements extracted from the previous function. In new PM, releaseMemory() is not called and the dangling elements can cause segfault in findConstantInsertionPoint. Rename releaseMemory() to cleanup() to deliver the idea that it is mandatory and call cleanup() in ConstantHoistingPass::runImpl to fix this. Reviewers: ormris, zzheng, dmgreen, wmi Reviewed By: ormris, wmi Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58589 llvm-svn: 355174	2019-03-01 05:27:01 +00:00
Reid Kleckner	701593f1db	[sancov] Instrument reachable blocks that end in unreachable Summary: These sorts of blocks often contain calls to noreturn functions, like longjmp, throw, or trap. If they don't end the program, they are "interesting" from the perspective of sanitizer coverage, so we should instrument them. This was discussed in https://reviews.llvm.org/D57982. Reviewers: kcc, vitalybuka Subscribers: llvm-commits, craig.topper, efriedma, morehouse, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D58740 llvm-svn: 355152	2019-02-28 22:54:30 +00:00
Manman Ren	1829512dd3	Add a module pass for order file instrumentation The basic idea of the pass is to use a circular buffer to log the execution ordering of the functions. We only log the function when it is first executed. We use a 8-byte hash to log the function symbol name. In this pass, we add three global variables: (1) an order file buffer: a circular buffer at its own llvm section. (2) a bitmap for each module: one byte for each function to say if the function is already executed. (3) a global index to the order file buffer. At the function prologue, if the function has not been executed (by checking the bitmap), log the function hash, then atomically increase the index. Differential Revision: https://reviews.llvm.org/D57463 llvm-svn: 355133	2019-02-28 20:13:38 +00:00
Rong Xu	a6ff69f6dd	[PGO] Context sensitive PGO (part 2) Part 2 of CSPGO changes (mostly related to ProfileSummary). Note that I use a default parameter in setProfileSummary() and getSummary(). This is to break the dependency in clang. I will make the parameter explicit after changing clang in a separated patch. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355131	2019-02-28 19:55:07 +00:00
Sanjay Patel	4a47f5f550	[InstCombine] fold adds of constants separated by sext/zext This is part of a transform that may be done in the backend: D13757 ...but it should always be beneficial to fold this sooner in IR for all targets. https://rise4fun.com/Alive/vaiW Name: sext add nsw %add = add nsw i8 %i, C0 %ext = sext i8 %add to i32 %r = add i32 %ext, C1 => %s = sext i8 %i to i32 %r = add i32 %s, sext(C0)+C1 Name: zext add nuw %add = add nuw i8 %i, C0 %ext = zext i8 %add to i16 %r = add i16 %ext, C1 => %s = zext i8 %i to i16 %r = add i16 %s, zext(C0)+C1 llvm-svn: 355118	2019-02-28 19:05:26 +00:00
Chijun Sima	586187639a	Make MergeBlockIntoPredecessor conformant to the precondition of calling DTU.applyUpdates Summary: It is mentioned in the document of DTU that "It is illegal to submit any update that has already been submitted, i.e., you are supposed not to insert an existent edge or delete a nonexistent edge." It is dangerous to violet this rule because DomTree and PostDomTree occasionally crash on this scenario. This patch fixes `MergeBlockIntoPredecessor`, making it conformant to this precondition. Reviewers: kuhar, brzycki, chandlerc Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58444 llvm-svn: 355105	2019-02-28 16:47:18 +00:00
Bjorn Pettersson	d30f308a9f	Add support for computing "zext of value" in KnownBits. NFCI Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly been telling that the operation is equivalent to zero extending the value we're tracking. That has not been true, instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control if the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099	2019-02-28 15:45:29 +00:00
Eric Christopher	07944353fc	Temporarily revert "ArgumentPromotion should copy all metadata to new Function" and the dependent patch "Refine ArgPromotion metadata handling" as they're causing segfaults in argument promotion. This reverts commits r354032 and r353537. llvm-svn: 355060	2019-02-28 01:11:12 +00:00
Reid Kleckner	4fb3502bc9	[InstrProf] Use separate comdat group for data and counters Summary: I hadn't realized that instrumentation runs before inlining, so we can't use the function as the comdat group. Doing so can create relocations against discarded sections when references to discarded __profc_ variables are inlined into functions outside the function's comdat group. In the future, perhaps we should consider standardizing the comdat group names that ELF and COFF use. It will save object file size, since __profv_$sym won't appear in the symbol table again. Reviewers: xur, vsk Subscribers: eraman, hiraditya, cfe-commits, #sanitizers, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D58737 llvm-svn: 355044	2019-02-27 23:38:44 +00:00
Alina Sbirlea	fcfa7c5f92	[MemorySSA] Make insertDef insert corresponding phi nodes. Summary: The original assumption for the insertDef method was that it would not materialize Defs out of no-where, hence it will not insert phis needed after inserting a Def. However, when cloning an instruction (use case used in LICM), we do materialize Defs "out of no-where". If the block receiving a Def has at least one other Def, then no processing is needed. If the block just received its first Def, we must check where Phi placement is needed. The only new usage of insertDef is in LICM, hence the trigger for the bug. But the original goal of the method also fails to apply for the move() method. If we move a Def from the entry point of a diamond to either the left or right blocks, then the merge block must add a phi. While this usecase does not currently occur, or may be viewed as an incorrect transformation, MSSA must behave corectly given the scenario. Resolves PR40749 and PR40754. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58652 llvm-svn: 355040	2019-02-27 22:20:22 +00:00
Rong Xu	6cdf3d8086	Recommit r354930 "[PGO] Context sensitive PGO (part 1)" Fixed UBSan failures. llvm-svn: 355005	2019-02-27 17:24:33 +00:00
Vlad Tsyrklevich	c01643087e	Revert "[PGO] Context sensitive PGO (part 1)" This reverts commit r354930, it was causing UBSan failures. llvm-svn: 354953	2019-02-27 03:45:28 +00:00
Vedant Kumar	73522d1678	[HotColdSplit] Disable splitting for sanitized functions Splitting can make sanitizer errors harder to understand, as the trapping instruction may not be in the function where the bug was detected. rdar://48142697 llvm-svn: 354931	2019-02-26 22:55:46 +00:00
Rong Xu	35d2d51369	[PGO] Context sensitive PGO (part 1) Current PGO profile counts are not context sensitive. The branch probabilities for the inlined functions are kept the same for all call-sites, and they might be very different from the actual branch probabilities. These suboptimal profiles can greatly affect some downstream optimizations, in particular for the machine basic block placement optimization. In this patch, we propose to have a post-inline PGO instrumentation/use pass, which we called Context Sensitive PGO (CSPGO). For the users who want the best possible performance, they can perform a second round of PGO instrument/use on the top of the regular PGO. They will have two sets of profile counts. The first pass profile will be manly for inline, indirect-call promotion, and CGSCC simplification pass optimizations. The second pass profile is for post-inline optimizations and code-gen optimizations. A typical usage: // Regular PGO instrumentation and generate pass1 profile. > clang -O2 -fprofile-generate source.c -o gen > ./gen > llvm-profdata merge default.profraw -o pass1.profdata // CSPGO instrumentation. > clang -O2 -fprofile-use=pass1.profdata -fcs-profile-generate -o gen2 > ./gen2 // Merge two sets of profiles > llvm-profdata merge default.profraw pass1.profdata -o profile.profdata // Use the combined profile. Pass manager will invoke two PGO use passes. > clang -O2 -fprofile-use=profile.profdata -o use This change touches many components in the compiler. The reviewed patch (D54175) will committed in phrases. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 354930	2019-02-26 22:37:46 +00:00
Alina Sbirlea	9026404125	[MemorySSA & SimpleLoopUnswitch] Update MemorySSA in ReplaceUsesOfWith. SimpleLoopUnswitch must update MemorySSA when removing instructions. Resolves PR39197. llvm-svn: 354919	2019-02-26 19:44:52 +00:00
Sanjay Patel	e8bf0f79bd	[InstCombine] canonicalize more unsigned saturated add with 'not' Yet another pattern variation suggested by: https://bugs.llvm.org/show_bug.cgi?id=14613 There are 8 more potential commuted patterns here on top of the 8 that were already handled (rL354221, rL354276, rL354393). We have the obvious commute of the 'add' + commute of the cmp predicate/operands (ugt/ult) + commute of the select operands: Name: base %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ult i32 %x, %y %r = select i1 %c, i32 -1, i32 %a => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: ugt %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ugt i32 %y, %x %r = select i1 %c, i32 -1, i32 %a => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: commute select %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ult i32 %y, %x %r = select i1 %c, i32 %a, i32 -1 => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: ugt + commute select %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ugt i32 %x, %y %r = select i1 %c, i32 %a, i32 -1 => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/den llvm-svn: 354887	2019-02-26 15:18:49 +00:00
Simon Pilgrim	a066f1f9e6	[Vectorizer] Add vectorization support for fixed smul/umul intrinsics This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix its the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790	2019-02-25 15:42:02 +00:00
Sanjay Patel	9907d3c8b4	[InstCombine] canonicalize add/sub with bool add A, sext(B) --> sub A, zext(B) We have to choose 1 of these forms, so I'm opting for the zext because that's easier for value tracking. The backend should be prepared for this change after: D57401 rL353433 This is also a preliminary step towards reducing the amount of bit hackery that we do in IR to optimize icmp/select. That should be waiting to happen at a later optimization stage. The seeming regression in the fuzzer test was discussed in: D58359 We were only managing that fold in instcombine by luck, and other passes should be able to deal with that better anyway. llvm-svn: 354748	2019-02-24 16:57:45 +00:00
Matt Arsenault	65b4ab9921	BreakCriticalEdges: Update PostDominatorTree llvm-svn: 354673	2019-02-22 15:01:41 +00:00
Roman Tereshin	99a6672bba	[LowerSwitch][AMDGPU] Do not handle impossible values This patch adds LazyValueInfo to LowerSwitch to compute the range of the value being switched over and reduce the size of the tree LowerSwitch builds to lower a switch. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D58096 llvm-svn: 354670	2019-02-22 14:33:46 +00:00
Chijun Sima	70e97163e0	[DTU] Refine the interface and logic of applyUpdates Summary: This patch separates two semantics of `applyUpdates`: 1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update. 2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated. Logic changes: Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example, ``` DTU(Lazy) and Edge A->B exists. 1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued 2. Remove A->B 3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended) ``` But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue. Interface changes: The second semantic of `applyUpdates` is separated to `applyUpdatesPermissive`. These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`. Reviewers: kuhar, brzycki, dmgreen, grosser Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58170 llvm-svn: 354669	2019-02-22 13:48:38 +00:00
Alina Sbirlea	90d2e3a16d	[MemorySSA & LoopPassManager] Resolve PR40038. The correct edge being deleted is not to the unswitched exit block, but to the original block before it was split. That's the key in the map, not the value. The insert is correct. The new edge is to the .split block. The splitting turns OriginalBB into: OriginalBB -> OriginalBB.split. Assuming the orignal CFG edge: ParentBB->OriginalBB, we must now delete ParentBB->OriginalBB, not ParentBB->OriginalBB.split. llvm-svn: 354656	2019-02-22 07:18:37 +00:00
Chijun Sima	f131d6110e	[DTU] Deprecate insertEdge/deleteEdge Summary: This patch converts all existing `insertEdge/deleteEdge` to `applyUpdates` and marks `insertEdge/deleteEdge` as deprecated. Reviewers: kuhar, brzycki Reviewed By: kuhar, brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58443 llvm-svn: 354652	2019-02-22 05:41:43 +00:00
Alina Sbirlea	97468e9282	[MemorySSA & LoopPassManager] Update MemorySSA in formDedicatedExitBlocks. MemorySSA is now updated when forming dedicated exit blocks. Resolves PR40037. llvm-svn: 354623	2019-02-21 21:13:34 +00:00
Alina Sbirlea	d2d3244363	[LoopSimplifyCFG] Update MemorySSA after r353911. Summary: MemorySSA is not properly updated in LoopSimplifyCFG after recent changes. Use SplitBlock utility to resolve that and clear all updates once handleDeadExits is finished. All updates that follow are removal of edges which are safe to handle via the removeEdge() API. Also, deleting dead blocks is done correctly as is, i.e. delete from MemorySSA before updating the CFG and DT. Reviewers: mkazantsev, rtereshin Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58524 llvm-svn: 354613	2019-02-21 19:54:05 +00:00
Alina Sbirlea	73446cd567	[EarlyCSE] Cleanup deadcode. [NFCI] Summary: Cleanup nop assignments. Reviewers: george.burgess.iv, davide Subscribers: sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58308 llvm-svn: 354612	2019-02-21 19:49:57 +00:00
Joey Gouly	fdf651ee8d	[InferAddressSpaces] Fix fallthrough error llvm-svn: 354580	2019-02-21 13:10:37 +00:00
Joey Gouly	92af1360f3	[InferAddressSpaces] Fix crash on select of non-ptr operands Check the operands of a select are pointers, to determine if it is an address expression or not. https://reviews.llvm.org/D58226 llvm-svn: 354576	2019-02-21 12:31:36 +00:00
Max Kazantsev	10489d76f6	[LoopSimplifyCFG] Add missing MSSA edge deletion When we create fictive switch in preheader, we should take care about MSSA and delete edge between old preheader and header. llvm-svn: 354547	2019-02-21 05:51:29 +00:00
Wei Mi	500606f270	[Inliner] Pass nullptr for the ORE param of getInlineCost if RemarkEnabled is false. Right now for inliner and partial inliner, we always pass the address of a valid ORE object to getInlineCost even if RemarkEnabled is false because of no -Rpass is specified. Since ComputeFullInlineCost will be set to true if ORE is non-null in getInlineCost, this introduces the problem that in getInlineCost we cannot return early even if we already know the cost is definitely higher than the threshold. It is a general problem for compile time. This patch fixes that by pass nullptr as the ORE argument if RemarkEnabled is false. Differential Revision: https://reviews.llvm.org/D58399 llvm-svn: 354542	2019-02-21 02:57:52 +00:00
Philip Reames	79d5e16f51	[GVN] Small tweaks to comments, style, and missed vector handling Noticed these while doing a final sweep of the code to make sure I hadn't missed anything in my last couple of patches. The (minor) missed optimization was noticed because of the stylistic fix to avoid an overly specific cast. llvm-svn: 354412	2019-02-20 00:31:28 +00:00
Philip Reames	a259dc3263	[GVN] Fix last crasher w/non-integral pointers Same case as for memset and memcpy, but this time for clobbering stores and loads. We still can't allow coercion to or from non-integrals, regardless of the transform. Now that I'm done the whole little sequence, it seems apparent that we'd entirely missed reasoning about clobbers in the original GVN support for non-integral pointers. My appologies, I thought we'd upstreamed all of this, but it turns out we were still carrying a downstream hack which hid all of these issues. My chanks to Cherry Zhang for helping debug. llvm-svn: 354407	2019-02-20 00:15:54 +00:00
Philip Reames	952d234d00	[GVN] Fix a crash bug w/non-integral pointers and memtransfers Problem is very similiar to the one fixed for memsets in r354399, we try to coerce a value to non-integral type, and then crash while try to do so. Since we shouldn't be doing such coercions to start with, easy fix. From inspection, I see two other cases which look to be similiar and will follow up with most test cases and fixes if confirmed. llvm-svn: 354403	2019-02-19 23:49:38 +00:00
Philip Reames	322eb7660e	[GVN] Fix a non-integral pointer bug w/vector types GVN generally doesn't forward structs or array types, but it will forward vector types to non-vectors and vice versa. As demonstrated in tests, we need to inhibit the same set of transforms for vector of non-integral pointers as for non-integral pointers themselves. llvm-svn: 354401	2019-02-19 23:19:51 +00:00
Philip Reames	92756a80e7	[GVN] Fix a crash bug around non-integral pointers If we encountered a location where we tried to forward the value of a memset to a load of a non-integral pointer, we crashed. Such a forward is not legal in general, but we can forward null pointers. Test for both cases are included. llvm-svn: 354399	2019-02-19 23:07:15 +00:00
Sanjay Patel	c1e0184317	[InstCombine] reduce even more unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354393	2019-02-19 22:14:21 +00:00
Sanjay Patel	dcb93c0dda	[InstCombine] rearrange saturated add folds; NFC This is no-functional-change-intended, but that was also true when it was part of rL354276, and I managed to lose 2 predicates for the fold with constant...causing much bot distress. So this time I'm adding a couple of negative tests to avoid that. llvm-svn: 354384	2019-02-19 21:46:13 +00:00
Max Kazantsev	ebd95ea86e	[NFC] API for signaling that the current loop is being deleted We are planning to be able to delete the current loop in LoopSimplifyCFG in the future. Add API to notify the loop pass manager that it happened. llvm-svn: 354314	2019-02-19 11:14:05 +00:00
Max Kazantsev	30095d9795	[NFC] Store loop header in a local to keep it available after the loop is deleted llvm-svn: 354313	2019-02-19 11:13:58 +00:00
Sanjay Patel	8a35d339c9	Revert "[InstCombine] reduce even more unsigned saturated add with 'not' op" This reverts commit `079b610c29`. Bots are failing after this change on a stage 2 compile of clang. llvm-svn: 354277	2019-02-18 16:04:22 +00:00
Sanjay Patel	079b610c29	[InstCombine] reduce even more unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354276	2019-02-18 15:21:39 +00:00
Max Kazantsev	4561475e09	[NFC] Teach getInnermostLoopFor walk up the loop trees This should be NFC in current use case of this method, but it will help to use it for solving more compex tasks in follow-up patches. llvm-svn: 354227	2019-02-17 18:21:51 +00:00
Sanjay Patel	b341ee7071	[InstCombine] reduce more unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: not op %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %notx, %y %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: not op ugt %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %y, %notx %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/niom (The matching here is still incomplete.) llvm-svn: 354224	2019-02-17 16:48:50 +00:00
Sanjay Patel	bee2073542	[InstCombine] reduce unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 (The matching here is incomplete. Trying to take minimal steps to make sure we don't induce infinite looping from existing canonicalizations of the 'select'.) llvm-svn: 354221	2019-02-17 15:58:48 +00:00
Max Kazantsev	d72c1a0c5c	[NFC] Fix name and clarifying comment for factored-out function llvm-svn: 354220	2019-02-17 15:22:48 +00:00
Max Kazantsev	0f943269a0	[NFC] Factor out a function for future reuse llvm-svn: 354218	2019-02-17 15:04:09 +00:00
Alina Sbirlea	383ccfb360	[EarlyCSE & MSSA] Cap the clobbering calls in EarlyCSE. Summary: Unlimitted number of calls to getClobberingAccess can lead to high compile times in pathological cases. Limitting getClobberingAccess to a fairly high number. Can be adjusted based on users/need. Note: this is the only user of MemorySSA currently enabled by default. The same handling exists in LICM (disabled atm). As MemorySSA gains more users, this logic of capping will need to move inside MemorySSA. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D58248 llvm-svn: 354182	2019-02-15 22:47:54 +00:00
Philip Reames	8220ecbce1	[InstCombine] Address a couple stylistic issues pointed out by reviewer [NFC] Better addressing comments from https://reviews.llvm.org/D58290. llvm-svn: 354171	2019-02-15 21:31:39 +00:00
Philip Reames	cae6c767e8	[InstCombine] Convert atomicrmws to xchg or store where legal Implement two more transforms of atomicrmw: 1) We can convert an atomicrmw which produces a known value in memory into an xchg instead. 2) We can convert an atomicrmw xchg w/o users into a store for some orderings. Differential Revision: https://reviews.llvm.org/D58290 llvm-svn: 354170	2019-02-15 21:23:51 +00:00
Vedant Kumar	5f5cac3ae2	[CodeExtractor] Do not lift lifetime.end markers for region inputs If a lifetime.end marker occurs along one path through the extraction region, but not another, then it's still incorrect to lift the marker, because there is some path through the extracted function which would ordinarily not reach the marker. If the call to the extracted function is in a loop, unrolling can cause inputs to the function to become optimized out as undef after the first iteration. To prevent incorrect stack slot merging in the calling function, it should be sufficient to lift lifetime.start markers for region inputs. I've tested this theory out by doing a stage2 check-all with randomized splitting enabled. This is a follow-up to r353973, and there's additional context for this change in https://reviews.llvm.org/D57834. rdar://47896986 Differential Revision: https://reviews.llvm.org/D58253 llvm-svn: 354159	2019-02-15 18:46:58 +00:00
Vedant Kumar	47a0c9b69c	[HotColdSplit] Schedule splitting late to fix perf regression With or without PGO data applied, splitting early in the pipeline (either before the inliner or shortly after it) regresses performance across SPEC variants. The cause appears to be that splitting hides context for subsequent optimizations. Schedule splitting late again, in effect reversing r352080, which scheduled the splitting pass early for code size benefits (documented in https://reviews.llvm.org/D57082). Differential Revision: https://reviews.llvm.org/D58258 llvm-svn: 354158	2019-02-15 18:46:44 +00:00
Sanjay Patel	8a2b543a13	[InstCombine] fix crash while trying to narrow a binop of shuffles (PR40734) https://bugs.llvm.org/show_bug.cgi?id=40734 llvm-svn: 354144	2019-02-15 16:31:55 +00:00
Clement Courbet	f7e84a2ccc	[MergeICmps] Make base ordering really deterministic. Summary: The idea is that we now manipulate bases through a `unsigned BaseID` based on order of appearance in the comparison chain rather than through the `Value*`. Fixes 40714. Reviewers: gchatelet Subscribers: mgrang, jfb, jdoerfert, llvm-commits, hans Tags: #llvm Differential Revision: https://reviews.llvm.org/D58274 llvm-svn: 354131	2019-02-15 14:17:17 +00:00
Clement Courbet	cc004df7eb	[MergeICmps][NFC] Improve doc. llvm-svn: 354128	2019-02-15 12:58:06 +00:00
Max Kazantsev	c065b025a6	[NFCI] Factor out block removal from stack of nested loops llvm-svn: 354124	2019-02-15 12:18:10 +00:00
Simon Pilgrim	623c38d6cd	Fix "field 'DFS' will be initialized after field 'DTU'" warning. NFCI. llvm-svn: 354123	2019-02-15 12:13:16 +00:00
Max Kazantsev	136f09bea1	[NFC] Promote DFS to field for further use llvm-svn: 354118	2019-02-15 11:39:35 +00:00
Max Kazantsev	73db5c137a	[NFC] Tweak SplitBlockAndInsertIfThen to use existing ThenBlock llvm-svn: 354107	2019-02-15 08:18:00 +00:00
Teresa Johnson	d0b1f30b32	[ThinLTO] Detect partially split modules during the thin link Summary: The changes to disable LTO unit splitting by default (r350949) and detect inconsistently split LTO units (r350948) are causing some crashes when the inconsistency is detected in multiple threads simultaneously. Fix that by having the code always look for the inconsistently split LTO units during the thin link, by checking for the presence of type tests recorded in the summaries. Modify test added in r350948 to remove single threading required to fix a bot failure due to this issue (and some debugging options added in the process of diagnosing it). Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57561 llvm-svn: 354062	2019-02-14 21:22:50 +00:00
Philip Reames	db57ef6238	[InstCombine] Add todos for possible atomicrmw transforms llvm-svn: 354059	2019-02-14 20:48:36 +00:00
Philip Reames	485474208e	Canonicalize all integer "idempotent" atomicrmw ops For "idempotent" atomicrmw instructions which we can't simply turn into load, canonicalize the operation and constant. This reduces the matching needed elsewhere in the optimizer, but doesn't directly impact codegen. For any architecture where OR/Zero is not a good default choice, you can extend the AtomicExpand lowerIdempotentRMWIntoFencedLoad mechanism. I reviewed X86 to make sure this works well, haven't audited other backends. Differential Revision: https://reviews.llvm.org/D58244 llvm-svn: 354058	2019-02-14 20:41:17 +00:00
Philip Reames	97067d3c73	Teach instcombine about remaining "idempotent" atomicrmw types Expand on Quentin's r353471 patch which converts some atomicrmws into loads. Handle remaining operation types, and fix a slight bug. Atomic loads are required to have alignment. Since this was within the InstCombine fixed point, somewhere else in InstCombine was adding alignment before the verifier saw it, but still, we should fix. Terminology wise, I'm using the "idempotent" naming that is used for the same operations in AtomicExpand and X86ISelLoweringInfo. Once this lands, I'll add similar tests for AtomicExpand, and move the pattern match function to a common location. In the review, there was seemingly consensus that "idempotent" was slightly incorrect for this context. Once we setle on a better name, I'll update all uses at once. Differential Revision: https://reviews.llvm.org/D58242 llvm-svn: 354046	2019-02-14 18:39:14 +00:00
Teresa Johnson	c374a800e7	Refine ArgPromotion metadata handling Summary: In r353537 we now copy all metadata to the new function, with the old being removed when the old function is eliminated. In some cases the old function is dropped to a declaration (seems to only occur with the old PM). Go ahead and clear all metadata from the old function to handle that case, since verification will complain otherwise. This is consistent with what was being done for debug metadata before r353537. Reviewers: davidxl, uabelho Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58215 llvm-svn: 354032	2019-02-14 14:14:24 +00:00

1 2 3 4 5 ...

21568 Commits