llvm-project

Commit Graph

Author	SHA1	Message	Date
Volkan Keles	4ecdb44a64	BlockExtractor: Don’t delete functions directly Blocks may have function calls, so don’t erase functions directly to avoid erasing a function that has a user. llvm-svn: 327340	2018-03-12 22:28:18 +00:00
Saleem Abdulrasool	8b342680bf	ObjCARC: teach the cloner about funclets In the case that the CallInst that is being moved has an associated operand bundle which is a funclet, the move will construct an invalid instruction. The new site will have a different token and needs to be reassociated with the new instruction. Unfortunately, there is no way to alter the bundle after the construction of the instruction. Replace the call instruction cloning with a custom helper to clone the instruction and reassociate the funclet token. llvm-svn: 327336	2018-03-12 21:46:09 +00:00
Vedant Kumar	3a408538f0	Remove the LoopInstSimplify pass (-loop-instsimplify) LoopInstSimplify is unused and untested. Reading through the commit history the pass also seems to have a high maintenance burden. It would be best to retire the pass for now. It should be easy to recover if we need something similar in the future. Differential Revision: https://reviews.llvm.org/D44053 llvm-svn: 327329	2018-03-12 20:49:42 +00:00
Michael Zolotukhin	a3d8ef0f08	Improve caching scheme in ProvenanceAnalysis. Summary: ProvenanceAnalysis::related(A, B) currently memoizes its results, and on big tests the cache grows too large, and we're spending most of the time growing/looking through DenseMap. This patch reduces the size of the cache by normalizing keys first: we do that by calling GetUnderlyingObjCPtr on the input values. The results of GetUnderlyingObjCPtr are also memoized in a separate cache. The patch doesn't bring noticable changes to compile time on CTMark, however significantly helps one of our internal tests. Reviewers: gottesmm Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D44270 llvm-svn: 327328	2018-03-12 20:36:25 +00:00
Craig Topper	ee99aa4dd0	[InstCombine] Replace calls to getNumUses with hasNUses or hasNUsesOrMore getNumUses is a linear time operation. It traverses the user linked list to the end and counts as it goes. Since we are only interested in small constant counts, we should use hasNUses or hasNUsesMore more that terminate the traversal as soon as it can provide the answer. There are still two other locations in InstCombine, but changing those would force a rebase of D44266 which if accepted would remove them. Differential Revision: https://reviews.llvm.org/D44398 llvm-svn: 327315	2018-03-12 18:46:05 +00:00
Craig Topper	3b4ad9c12d	[CallSiteSplitting] Use !Instruction::use_empty instead of checking for a non-zero return from getNumUses getNumUses is a linear operation. It walks a linked list to get a count. So in this case its better to just ask if there are any users rather than how many. llvm-svn: 327314	2018-03-12 18:40:59 +00:00
Eugene Leviant	19e238746b	[ThinLTO] Recommit of import global variables This wasreverted in r326638 due to link problems and fixed afterwards llvm-svn: 327254	2018-03-12 10:30:50 +00:00
Justin Lebar	24b6640b1b	Back out "Re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions." This reverts r326908, originally landed as D44102. Reverted for causing performance regressions on x86. (These regressions are not yet understood.) llvm-svn: 327252	2018-03-12 09:26:09 +00:00
Florian Hahn	a7dcfa746e	[PartialInlining] Use isInlineViable to detect constructs preventing inlining. Use isInlineViable to prevent inlining of functions with non-inlinable constructs, in case cost analysis is skipped. Reviewers: efriedma, sfertile, davide, davidxl Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D42846 llvm-svn: 327207	2018-03-10 14:53:44 +00:00
Ulrich Weigand	019dd2316d	Revert "[Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf" This reverts commit r327175 as problems in debug info generation were shown. llvm-svn: 327176	2018-03-09 22:00:10 +00:00
Ulrich Weigand	fa4e63c0d6	[Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf When hoisting common code from the "then" and "else" branches of a condition to before the "if", there is no need to require that debug intrinsics match before moving them (and merging them). Instead, we can simply always keep all debug intrinsics from both sides of the "if". This fixes PR36410, which describes a problem where as a result of the attempt to merge debug locations for two debug intrinsics we end up with an invalid intrinsic, where the scope indicated in the !dbg location no longer matches the scope of the variable tracked by the intrinsic. In addition, this has the benefit that we no longer throw away information that is actually still valid, helping to generate better debug data. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D44312 llvm-svn: 327175	2018-03-09 21:37:07 +00:00
Renato Golin	038ede2a16	[NFC] Consolidate six getPointerOperand() utility functions into one place There are six separate instances of getPointerOperand() utility. LoopVectorize.cpp has one of them, and I don't want to create a 7th one while I'm trying to move LoopVectorizationLegality into a separate file (eventual objective is to move it to Analysis tree). See http://lists.llvm.org/pipermail/llvm-dev/2018-February/120999.html for llvm-dev discussions Closes D43323. Patch by Hideki Saito <hideki.saito@intel.com>. llvm-svn: 327173	2018-03-09 21:05:58 +00:00
Peter Collingbourne	2974856ad4	Use branch funnels for virtual calls when retpoline mitigation is enabled. The retpoline mitigation for variant 2 of CVE-2017-5715 inhibits the branch predictor, and as a result it can lead to a measurable loss of performance. We can reduce the performance impact of retpolined virtual calls by replacing them with a special construct known as a branch funnel, which is an instruction sequence that implements virtual calls to a set of known targets using a binary tree of direct branches. This allows the processor to speculately execute valid implementations of the virtual function without allowing for speculative execution of of calls to arbitrary addresses. This patch extends the whole-program devirtualization pass to replace certain virtual calls with calls to branch funnels, which are represented using a new llvm.icall.jumptable intrinsic. It also extends the LowerTypeTests pass to recognize the new intrinsic, generate code for the branch funnels (x86_64 only for now) and lay out virtual tables as required for each branch funnel. The implementation supports full LTO as well as ThinLTO, and extends the ThinLTO summary format used for whole-program devirtualization to support branch funnels. For more details see RFC: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120672.html Differential Revision: https://reviews.llvm.org/D42453 llvm-svn: 327163	2018-03-09 19:11:44 +00:00
Chad Rosier	95d9ccb2a0	[JumpThreading] Don't restrict cast-traversal to i1 In r263618, JumpThreading learned to look trough simple cast instructions, but only if the source of those cast instructions was a phi/cmp i1 (in an effort to limit compile time effects). I think this condition is too restrictive. For switches with limited value range, InstCombine will readily introduce an extra trunc instruction to a smaller integer type (e.g. from i8 to i2), leaving us in the somewhat perverse situation that jump-threading would work before running instcombine, but not after. Since instcombine produces this pattern, I think we need to consider it canonical and support it in JumpThreading. In general, for limiting recursion, I think the existing restriction to phi and cmp nodes should be sufficient to avoid looking through unprofitable chains of instructions. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42262 llvm-svn: 327150	2018-03-09 16:43:46 +00:00
Renato Golin	6b62039bb0	[LV] Fix vectorizer's isUniform() abuse triggers assert in SCEV Fixes PR36311. See more detailed analysis in https://bugs.llvm.org/show_bug.cgi?id=36311. isUniform() information is recomputed after LV started transforming the underlying IR and that triggered an assert in SCEV. From vectorizer's architectural perspective, such information, while still useful in vector code gen, should not be recomputed after the start of transforming the LLVM IR. Instead, we should collect and cache such information during the analysis phase of LV and use the cached info during code gen. From the symptom perspective, this assert as it stands right now is not very useful. Legality already rejected loops that would trigger the assert. As such, commenting out the assert is NFC from vectorizer's functionality perspective. On top of that, just above the assertion, we check for unit-strided load/store or gather scatter. Addresses can't be uniform below that check. From vectorization theory point of view, we don't have to reject all cases of stores to uniform addresses. Eventually, we should support safe/profitable cases. This patch resolves the issue by removing the useless assertion that is invoking LAA's isUniform() that requires up-to-date DomTree ---- once vector code gen starts modifying CFG, we don't have an up-to-date DomTree. Patch by Hideki Saito <hideki.saito@intel.com>. llvm-svn: 327109	2018-03-09 10:31:31 +00:00
Eric Christopher	3caa0fd050	Revert "[ThinLTO] Keep available_externally symbols live" This reverts commit r327041 and the followup attempts at fixing the testcase as they're still failing. llvm-svn: 327094	2018-03-09 01:25:18 +00:00
Adrian Prantl	5b477be72a	LowerDbgDeclare: ignore dbg.declares for allocas with volatile access There is no point in lowering a dbg.declare describing an alloca that has volatile loads or stores as users, since the alloca cannot be elided. Lowering the dbg.declare will result in larger debug info that may also have worse coverage than just describing the alloca. rdar://problem/34496278 llvm-svn: 327092	2018-03-09 00:45:04 +00:00
Philip Reames	fbffd126b8	[NFC] Factor out a helper function for checking if a block has a potential early implicit exit. llvm-svn: 327065	2018-03-08 21:25:30 +00:00
Kuba Mracek	8842da8e07	[asan] Fix a false positive ODR violation due to LTO ConstantMerge pass [llvm part, take 3] This fixes a false positive ODR violation that is reported by ASan when using LTO. In cases, where two constant globals have the same value, LTO will merge them, which breaks ASan's ODR detection. Differential Revision: https://reviews.llvm.org/D43959 llvm-svn: 327061	2018-03-08 21:02:18 +00:00
Kuba Mracek	f0bcbfef5c	Revert r327053. llvm-svn: 327055	2018-03-08 20:13:39 +00:00
Kuba Mracek	584bd10803	[asan] Fix a false positive ODR violation due to LTO ConstantMerge pass [llvm part, take 2] This fixes a false positive ODR violation that is reported by ASan when using LTO. In cases, where two constant globals have the same value, LTO will merge them, which breaks ASan's ODR detection. Differential Revision: https://reviews.llvm.org/D43959 llvm-svn: 327053	2018-03-08 20:05:45 +00:00
Vlad Tsyrklevich	7b66ef1036	[ThinLTO] Keep available_externally symbols live Summary: This change fixes PR36483. The bug was originally introduced by a change that marked non-prevailing symbols dead. This broke LowerTypeTests handling of available_externally functions, which are non-prevailing. LowerTypeTests uses liveness information to avoid emitting thunks for unused functions. Marking available_externally functions dead is incorrect, the functions are used though the function definitions are not. This change keeps them live, and lets the EliminateAvailableExternally/GlobalDCE passes remove them later instead. I've also enabled EliminateAvailableExternally for all optimization levels, I believe it being disabled for O1 was an oversight. Reviewers: pcc, tejohnson Reviewed By: tejohnson Subscribers: grimar, mehdi_amini, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D43690 llvm-svn: 327041	2018-03-08 18:48:03 +00:00
Kuba Mracek	e834b22874	Revert r327029 llvm-svn: 327033	2018-03-08 17:32:00 +00:00
Kuba Mracek	0e06d37dba	[asan] Fix a false positive ODR violation due to LTO ConstantMerge pass [llvm part] This fixes a false positive ODR violation that is reported by ASan when using LTO. In cases, where two constant globals have the same value, LTO will merge them, which breaks ASan's ODR detection. Differential Revision: https://reviews.llvm.org/D43959 llvm-svn: 327029	2018-03-08 17:24:06 +00:00
Farhana Aleen	89196642f7	[AMDGPU] Increased vector length for global/constant loads. Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache; loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords. Author: FarhanaAleen Reviewed By: rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D44179 llvm-svn: 326910	2018-03-07 17:09:18 +00:00
Justin Lebar	eccfbf1bcd	Re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Backed out for failing an assert in clang bootstrap builds. Re-landing with a fix for handling non-power-of-two inputs (e.g. udiv i24). Original Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 326908	2018-03-07 16:56:49 +00:00
Farhana Aleen	347d12b4ce	Revert "[AMDGPU] Widened vector length for global/constant address space." This reverts commit ce988cc100dc65e7c6c727aff31ceb99231cab03. llvm-svn: 326907	2018-03-07 16:55:27 +00:00
Farhana Aleen	0d03d0588d	[AMDGPU] Widened vector length for global/constant address space. llvm-svn: 326904	2018-03-07 16:29:05 +00:00
Justin Lebar	eeeb0eb049	Revert rL326898: "Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions." Breaks bootstrap builds: clang built with this patch asserts while building MCDwarf.cpp: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. llvm-svn: 326900	2018-03-07 16:05:43 +00:00
Justin Lebar	cb9e89c39b	Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Reviewers: spatel, sanjoy Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 326898	2018-03-07 15:11:13 +00:00
Sven van Haastregt	19f531d31e	[LoadStoreVectorizer] Differentiate between <1 x T> and T The LoadStoreVectorizer thought that <1 x T> and T were the same types when merging stores, leading to a crash later. Patch by Erik Hogeman. Differential Revision: https://reviews.llvm.org/D44014 llvm-svn: 326884	2018-03-07 10:29:28 +00:00
Evgeny Stupachenko	204ade4102	Add early exit on reassociation of 0 expression. Summary: Before the patch a try to reassociate ((v * 16) * 0) * 1 fall into infinite loop Reviewers: pankajchawla Differential Revision: http://reviews.llvm.org/D41467 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 326861	2018-03-07 02:17:08 +00:00
Eugene Zelenko	e2fc88a2fe	[Transforms] Add missing header for InstructionCombining.cpp, in order to export LLVMInitializeInstCombine as extern "C". Fixes PR35947. Patch by Brenton Bostick. Differential revision: https://reviews.llvm.org/D44140 llvm-svn: 326843	2018-03-06 23:06:13 +00:00
Sebastian Pop	bf6e1c26cf	DA: remove uses of GEP, only ask SCEV It's been quite some time the Dependence Analysis (DA) is broken, as it uses the GEP representation to "identify" multi-dimensional arrays. It even wrongly detects multi-dimensional arrays in single nested loops: from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6 ;; for (long int i = 0; i < 50; i++) { ;; A[i][3i - 6] = i; ;; B++ = A[i][i]; DA used to detect two subscripts, which makes no sense in the LLVM IR or in C/C++ semantics, as there are no guarantees as in Fortran of subscripts not overlapping into a next array dimension: maximum nesting levels = 1 SrcPtrSCEV = %A DstPtrSCEV = %A using GEPs subscript 0 src = {0,+,1}<nuw><nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} subscript 1 src = {-6,+,3}<nsw><%for.body> dst = {0,+,1}<nuw><nsw><%for.body> class = 1 loops = {1} Separable = {} Coupled = {1} With the current patch, DA will correctly work on only one dimension: maximum nesting levels = 1 SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body> DstSCEV = {%A,+,404}<%for.body> subscript 0 src = {(-2424 + %A)<nsw>,+,1212}<%for.body> dst = {%A,+,404}<%for.body> class = 1 loops = {1} Separable = {0} Coupled = {} This change removes all uses of GEP from DA, and we now only rely on the SCEV representation. The patch does not turn on -da-delinearize by default, and so the DA analysis will be more conservative in the case of multi-dimensional memory accesses in nested loops. I disabled some interchange tests, as the DA is not able to disambiguate the dependence anymore. To make DA stronger, we may need to compute a bound on the number of iterations based on the access functions and array dimensions. The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to avoid checking for snippets of LLVM IR: this form of checking is very hard to maintain. Instead, we now check for output of the pass that are more meaningful than dozens of lines of LLVM IR. Some tests now require -debug messages and thus only enabled with asserts. Patch written by Sebastian Pop and Aditya Kumar. Differential Revision: https://reviews.llvm.org/D35430 llvm-svn: 326837	2018-03-06 21:55:59 +00:00
Sanjay Patel	1f2f5d18d3	[InstCombine] simplify min/max canonicalization; NFCI llvm-svn: 326828	2018-03-06 19:01:18 +00:00
Sanjay Patel	7ed0bc26ac	[ValueTracking] move helpers for SelectPatterns from InstCombine to ValueTracking Most of the folds based on SelectPatternResult belong in InstSimplify rather than InstCombine, so the helper code should be available to other passes/analysis. llvm-svn: 326812	2018-03-06 16:57:55 +00:00
Florian Hahn	517dc51c48	[CallSiteSplitting] Do not crash when BB's terminator changes. Change doCallSiteSplitting to iterate until we reach the terminator instruction. tryToSplitCallSite can replace BB's terminator in case BB is a successor of itself. Then IE will be invalidated and we also have to check the current terminator. Reviewers: junbuml, davidxl, davide, fhahn Reviewed By: fhahn, junbuml Differential Revision: https://reviews.llvm.org/D43824 llvm-svn: 326793	2018-03-06 14:00:58 +00:00
Florian Hahn	f0a25f7253	[CloneFunction] Support BB == PredBB in DuplicateInstructionsInSplit. In case PredBB == BB and StopAt == BB's terminator, StopAt != &*BI will fail, because BB's terminator instruction gets replaced. By using BB.getTerminator() we get the current terminator which we can use to compare. Reviewers: sanjoy, anna, reames Reviewed By: anna Differential Revision: https://reviews.llvm.org/D43822 llvm-svn: 326779	2018-03-06 13:12:32 +00:00
Xin Tong	8fd561f572	[MergeICmp] Simplify how BCECmpBlock instructions are blacklisted llvm-svn: 326761	2018-03-06 02:24:02 +00:00
Xin Tong	98af9efca5	[MergeICmp] Fix printing. NFC llvm-svn: 326760	2018-03-06 02:04:57 +00:00
Daniel Neilson	82daad31fe	[RewriteStatepoints] Fix stale parse points Summary: RewriteStatepointsForGC collects parse points for further processing. During the collection if a callsite is found in an unreachable block (DominatorTree::isReachableFromEntry()) then all unreachable blocks are removed by removeUnreachableBlocks(). Some of the removed blocks could have been reachable according to DominatorTree::isReachableFromEntry(). In this case the collected parse points became stale and resulted in a crash when accessed. The fix is to unconditionally canonicalize the IR to removeUnreachableBlocks and then collect the parse points. The added test crashes with the old version and passes with this patch. Patch by Yevgeny Rouban! Reviewed by: Anna Differential Revision: https://reviews.llvm.org/D43929 llvm-svn: 326748	2018-03-05 22:27:30 +00:00
Daniel Neilson	bdda115e19	[InstCombine] Don't blow up in foldICmpWithCastAndCast on vector icmp instructions. Summary: Presently, InstCombiner::foldICmpWithCastAndCast() implicitly assumes that it is only invoked with icmp instructions of integer type. If that assumption is broken, and it is called with an icmp of vector type, then it fails (asserts/crashes). This patch addresses the deficiency. It allows it to simplify icmp (ptrtoint x), (ptrtoint/c) of vector type into a compare of the inputs, much as is done when the type is integer. Reviewers: apilipenko, fedor.sergeev, mkazantsev, anna Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44063 llvm-svn: 326730	2018-03-05 18:05:51 +00:00
Craig Topper	8452faceae	[InstCombine] Add constant vector support to getMinimumFPType for visitFPTrunc. This patch teaches getMinimumFPType to support shrinking a vector of ConstantFPs. This should improve our ability to combine vector fptrunc with fp binops. Differential Revision: https://reviews.llvm.org/D43774 llvm-svn: 326729	2018-03-05 18:04:12 +00:00
Florian Hahn	0b7c6422fb	[IPSCCP] Add getCompare which returns either true, false, undef or null. getCompare returns true, false or undef constants if the comparison can be evaluated, or nullptr if it cannot. This is in line with what ConstantExpr::getCompare returns. It also allows us to use ConstantExpr::getCompare for comparing constants. Reviewers: davide, mssimpso, dberlin, anna Reviewed By: davide Differential Revision: https://reviews.llvm.org/D43761 llvm-svn: 326720	2018-03-05 17:33:50 +00:00
Sanjay Patel	53ffabdfcb	[CVP] fix formatting; NFC llvm-svn: 326711	2018-03-05 16:08:34 +00:00
Xin Tong	8345c0e3a5	[MergeICmp] We can discard initial blocks that do other work Summary: We can discard initial blocks that do other work We do not need to limit ourselves to just the first block in the chain. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44029 llvm-svn: 326698	2018-03-05 13:54:47 +00:00
Clement Courbet	34be1b0288	[MergeICmps][NFC] Improve logging. llvm-svn: 326683	2018-03-05 08:21:47 +00:00
Fedor Indutny	364b9c2adb	[CallSiteSplitting] fix use after-free Iterating through predecessors of `TailBB` while removing their terminators leads to use after-free, because the predecessor list is changing on each removal. llvm-svn: 326668	2018-03-03 22:34:38 +00:00
Fedor Indutny	f9e09c1dd0	[CallSiteSplitting] properly split musttail calls Summary: `musttail` calls can't be naively splitted. The split blocks must include not only the call instruction itself, but also (optional) `bitcast` and `return` instructions that follow it. Clone `bitcast` and `ret`, place them into the split blocks, and remove the tail block when done. Reviewers: junbuml, mcrosier, davidxl, davide, fhahn Reviewed By: fhahn Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D43729 llvm-svn: 326666	2018-03-03 21:40:14 +00:00
Sanjay Patel	1a8d5c3d1f	[InstCombine] (~X) - (~Y) --> Y - X llvm-svn: 326660	2018-03-03 17:53:25 +00:00
Chandler Carruth	a4619d9944	[ThinLTO] Revert r325320: Import global variables This caused some links to fail with ThinLTO due to missing symbols as well as causing some binaries to have failures at runtime. We're working with the author to get a test case, but want to get the tree green again. Further, it appears to introduce a data race. While the test usage of threads was disabled in r325361 & r325362, that isn't an acceptable fix. I've reverted both of these as well. This code needs to be thread safe. Test cases for this are already on the original commit thread. llvm-svn: 326638	2018-03-02 23:40:08 +00:00
Vedant Kumar	7fc591f8bb	[AggressiveInstCombine] Use use_empty() instead of !getNumUses(), NFC use_empty() runs in O(1), whereas getNumUses() runs in O(# uses). llvm-svn: 326635	2018-03-02 23:22:49 +00:00
Sanjay Patel	e29375d04c	[InstCombine] rearrange visitFMul; NFCI Put the simplest non-FMF folds first, so it's easier to see what's left to fix/group/add with the FMF folds. llvm-svn: 326632	2018-03-02 23:06:45 +00:00
Vedant Kumar	f69baf64eb	[Utils] Salvage debug info in block simplification In stage2 -O3 builds of llc, this results in small but measurable increases in the number of variables with locations, and in the number of unique source variables overall. (According to llvm-dwarfdump --statistics, there are 123 additional variables with locations, which is just a 0.006% improvement). The size of the .debug_loc section of the llc dsym increases by 0.004%. llvm-svn: 326629	2018-03-02 22:46:48 +00:00
Vedant Kumar	334fa57456	[Utils] Salvage debug info in recursive inst deletion In stage2 -O3 builds of llc, this results in a 0.3% increase in the number of variables with locations, and a 0.2% increase in the number of unique source variables overall. The size of the .debug_loc section of the llc dsym increases by 0.5%. llvm-svn: 326621	2018-03-02 21:36:35 +00:00
Craig Topper	c7461e1aad	[InstCombine] Rewrite the binary op shrinking in visitFPTrunc to avoid creating overly small ConstantFPs that we'll just need to extend again. Instead of returning the smaller FP constant we now return the minimal Type the constant can fit into. We also return the Type of the input to any fp extends. The legality checks are then done on just the size of these Types. If we find something profitable we then emit FPTruncs in front of the smaller binop and assume those FPTruncs will be constant folded or combined with any ConstantFPs or fpextends. Differential Revision: https://reviews.llvm.org/D44038 llvm-svn: 326617	2018-03-02 21:25:18 +00:00
Sanjay Patel	2fd0acf05a	[InstCombine] partly fix FMF for fmul+log2 fold The code was checking that all of the instructions in the sequence are 'fast', but that's not necessary. The final multiply is all that we need to check (tests adjusted). The fmul doesn't need to be fully 'fast' either, but that can be another patch. llvm-svn: 326608	2018-03-02 20:32:46 +00:00
Yaxun Liu	3c42f1c3c9	LoopUnroll: respect pragma unroll when AllowRemainder is disabled Currently when AllowRemainder is disabled, pragma unroll count is not respected even though there is no remainder. This bug causes a loop fully unrolled in many cases even though the user specifies a unroll count. Especially it affects OpenCL/CUDA since in many cases a loop contains convergent instructions and currently AllowRemainder is disabled for such loops. Differential Revision: https://reviews.llvm.org/D43826 llvm-svn: 326585	2018-03-02 16:22:32 +00:00
Florian Hahn	515acd64fd	[LV][CFG] Add irreducible CFG detection for outer loops This patch adds support for detecting outer loops with irreducible control flow in LV. Current detection uses SCCs and only works for innermost loops. This patch adds a utility function that works on any CFG, given its RPO traversal and its LoopInfoBase. This function is a generalization of isIrreducibleCFG from lib/CodeGen/ShrinkWrap.cpp. The code in lib/CodeGen/ShrinkWrap.cpp is also updated to use the new generic utility function. Patch by Diego Caballero <diego.caballero@intel.com> Differential Revision: https://reviews.llvm.org/D40874 llvm-svn: 326568	2018-03-02 12:24:25 +00:00
Fedor Indutny	1571b1271e	[ArgumentPromotion] don't break musttail invariant PR36543 Summary: Do not break musttail invariant by promoting arguments of musttail callee or caller. Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv, fhahn, rnk Reviewed By: rnk Subscribers: rnk, llvm-commits Differential Revision: https://reviews.llvm.org/D43926 llvm-svn: 326521	2018-03-02 00:59:27 +00:00
Sanjay Patel	d0cdb2f861	[InstCombine] allow fmul fold with less than 'fast' This is a retry of r326502 with updates to the reassociate test file that I missed the first time. @test15_reassoc in the supposed -reassociate test file (except that it tests 2 other passes too...) shows that there's no clear responsiblity for reassociation transforms. Instcombine now gets that case, but only because the constant values are identical. Otherwise, it would still miss that pattern. Reassociate doesn't get that case because it hasn't been updated to use less than 'fast' FMF. llvm-svn: 326513	2018-03-02 00:14:51 +00:00
Sanjay Patel	eb5d046890	revert r326502: [InstCombine] allow fmul fold with less than 'fast' I forgot that I added tests for 'reassoc' to -reassociate, but suprisingly that file calls -instcombine too, so it is affected. I'll update that file and try again. llvm-svn: 326510	2018-03-01 23:39:24 +00:00
Sanjay Patel	7373ae5c9a	[InstCombine] allow fmul fold with less than 'fast' llvm-svn: 326502	2018-03-01 22:53:47 +00:00
Craig Topper	2915bc0046	[SimplifyLibCalls] Update an obviously copy and pasted header comment to match this file. NFC llvm-svn: 326475	2018-03-01 20:05:09 +00:00
Sanjay Patel	f3b1af7aa4	[InstCombine] simplify code for (XY) X => (XX) Y ; NFCI llvm-svn: 326444	2018-03-01 15:50:26 +00:00
Benjamin Kramer	d1cf7ff5ab	[SCCP] Fix unused variable warning in release builds. llvm-svn: 326429	2018-03-01 11:31:44 +00:00
Reid Kleckner	3762a089d7	[IPSCCP] do not break musttail invariant (PR36485) Do not replace results of `musttail` calls with a constant if the call itself can't be removed. Do not zap returns of `musttail` callees, if the call site can't be removed and replaced with a constant. Do not zap returns of `musttail`-calling blocks, this breaks invariant too. Patch by Fedor Indutny Differential Revision: https://reviews.llvm.org/D43695 llvm-svn: 326404	2018-03-01 01:19:18 +00:00
Reid Kleckner	cb9611ca67	[DAE] don't remove args of musttail target/caller `musttail` requires identical signatures of caller and callee. Removing arguments breaks `musttail` semantics. PR36441 Patch by Fedor Indutny Differential Revision: https://reviews.llvm.org/D43708 llvm-svn: 326394	2018-03-01 00:09:35 +00:00
Sanjay Patel	eaf5a120ed	[InstCombine] simplify code for X * -1.0 --> -X; NFC I've added random FMF to one of the tests to show those are propagated. llvm-svn: 326377	2018-02-28 22:30:04 +00:00
Jonas Devlieghere	9ca064552a	[GlobalOpt] don't change CC of musttail calle(e\|r) When the function has musttail call - its cc is fixed to be equal to the cc of the musttail callee. In such case (and in the case of the musttail callee), GlobalOpt should not change the cc to fastcc as it will break the invariant. This fixes PR36546 Patch by: Fedor Indutny (indutny) Differential revision: https://reviews.llvm.org/D43859 llvm-svn: 326376	2018-02-28 22:28:44 +00:00
Craig Topper	b95298b041	[InstCombine] Split the FP constant code out of lookThroughFPExtensions and use nullptr as a sentinel Currently this code's control flow very much assumes that there are no meaningful checks after determining that it's a ConstantFP. So whenever it wants to stop it just does "return V". But V is also the variable name it uses when it wants to return a new value. So 'return V' appears multiple times with different meanings. This patch just moves all the code into a helper function and returns nullptr when it wants to stop. I've split this from D43774 while I try to figure out how to best handle the vector case there. But this change by itself at least seemed like a readability improvement. Differential Revision: https://reviews.llvm.org/D43833 llvm-svn: 326361	2018-02-28 20:14:34 +00:00
Vedant Kumar	9a041a7522	[InstrProfiling] Emit the runtime hook when no counters are lowered The API verification tool tapi has difficulty processing frameworks which enable code coverage, but which have no code. The profile lowering pass does not emit the runtime hook in this case because no counters are lowered. While the hook is not needed for program correctness (the profile runtime doesn't have to be linked in), it's needed to allow tapi to validate the exported symbol set of instrumented binaries. It was not possible to add a workaround in tapi for empty binaries due to an architectural issue: tapi generates its expected symbol set before it inspects a binary. Changing that model has a higher cost than simply forcing llvm to always emit the runtime hook. rdar://36076904 Differential Revision: https://reviews.llvm.org/D43794 llvm-svn: 326350	2018-02-28 19:00:08 +00:00
Sanjay Patel	b3f4f62698	[InstCombine] move invariant call out of loop; NFC We really shouldn't need a 2-loop here at all, but that's another cleanup. llvm-svn: 326330	2018-02-28 16:50:51 +00:00
Sanjay Patel	8fdd87f929	[InstCombine] move constant check into foldBinOpIntoSelectOrPhi; NFCI Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly vague. The constant check is redundant in some cases, but it allows removing duplication for most of the calls. llvm-svn: 326329	2018-02-28 16:36:24 +00:00
Xin Tong	256869d8bc	Fix typo. NFC llvm-svn: 326319	2018-02-28 12:09:53 +00:00
Xin Tong	8ba674e43b	[MergeICmp] Fix a bug in MergeICmp that can lead to a block being processed more than once. Summary: Fix a bug in MergeICmp that can lead to a BCECmp block being processed more than once and eventually lead to a broken LLVM module. The problem is that if the non-constant value is not produced by the last block, the producer will be processed once when the its parent block is processed and second time when the last block is processed. We end up having 2 same BCECmpBlock in the merge queue. And eventually lead to a broken LLVM module. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43825 llvm-svn: 326318	2018-02-28 12:08:00 +00:00
David Green	7c35de124a	[Dominators] Remove verifyDomTree and add some verifying for Post Dom Trees Removes verifyDomTree, using assert(verify()) everywhere instead, and changes verify a little to always run IsSameAsFreshTree first in order to print good output when we find errors. Also adds verifyAnalysis for PostDomTrees, which will allow checking of PostDomTrees it the same way we check DomTrees and MachineDomTrees. Differential Revision: https://reviews.llvm.org/D41298 llvm-svn: 326315	2018-02-28 11:00:08 +00:00
Florian Hahn	1807c516c7	[NewGVN] Update phi-of-ops def block when updating existing ValuePHI. In case we update a ValuePHI node created earlier, we could update it based on a different OpPHI which could be in a different block. We need to update the TempToBlock mapping reflecting the new block, otherwise we would end up placing the new phi node in a wrong block. This problem is exposed by the test case in https://bugs.llvm.org/show_bug.cgi?id=36504. This patch fixes a slightly simpler problem than in the bug report. In the bug's re-producer, the additional problem is that we are re-using a ValuePHI node with to few incoming values for the new OpPHI. If this patch makes sense, I will follow it up with a patch that creates a new PHI node if the existing PHI node has a different number of incoming values. Reviewers: davide, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43770 llvm-svn: 326181	2018-02-27 09:34:51 +00:00
Sanjay Patel	31a90468e1	[InstCombine] allow fdiv folds with less than fully 'fast' ops Note: gcc appears to allow this fold with -freciprocal-math alone, but clang/llvm require more than that with this patch. The wording in the definitions seems fuzzy enough that it could go either way, but we'll err on the conservative side of FMF interpretation. This patch also changes the newly created fmul to have FMF propagated by the last fdiv rather than intersecting the FMF of the fdivs. This matches the behavior of other folds near here. The new fmul is only used to produce an intermediate op for the final fdiv result, so it shouldn't be any stricter than that result. The previous behavior could result in dropping FMF via other folds in instcombine or CSE. Differential Revision: https://reviews.llvm.org/D43398 llvm-svn: 326098	2018-02-26 16:02:45 +00:00
Renato Golin	9d1b2acaaa	[LV] Move isLegalMasked* functions from Legality to CostModel All SIMD architectures can emulate masked load/store/gather/scatter through element-wise condition check, scalar load/store, and insert/extract. Therefore, bailing out of vectorization as legality failure, when they return false, is incorrect. We should proceed to cost model and determine profitability. This patch is to address the vectorizer's architectural limitation described above. As such, I tried to keep the cost model and vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning should be done separately. Please see http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for RFC and the discussions. Closes D43208. Patch by: Hideki Saito <hideki.saito@intel.com> llvm-svn: 326079	2018-02-26 11:06:36 +00:00
Florian Hahn	a1822cbabc	[LoopInterchange] Loops with empty dependency matrix are safe. The dependency matrix is only empty if no conflicting load/store instructions have been found. In that case, it is safe to interchange. For the LLVM test-suite, after this change around 1900 loops are interchanged, whereas it is 15 before this change. On cortex-a57, this gives an improvement of -0.57% on the geomean execution time of SPEC2006, SPEC2000 and the test-suite. There are a few small perf regressions, but I think we can improve on those by making the cost model better. Reviewers: karthikthecool, mcrosier Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D43236 llvm-svn: 326077	2018-02-26 10:45:25 +00:00
Adam Nemet	e4e1de60aa	Revert "StructurizeCFG: Test for branch divergence correctly" This reverts commit r325881. Breaks many bots llvm-svn: 326037	2018-02-24 17:29:09 +00:00
Sanjay Patel	2db2769499	[InstCombine] simplify code for fabs(X) * fabs(X) -> X * X; NFC llvm-svn: 325968	2018-02-23 22:38:10 +00:00
Sanjay Patel	db53d1847b	[InstSimplify] sqrt(X) * sqrt(X) --> X This was misplaced in InstCombine. We can loosen the FMF as a follow-up step. llvm-svn: 325965	2018-02-23 22:20:13 +00:00
Sanjay Patel	d32104e1b2	[InstCombine] allow fmul-sqrt folds with less than full -ffast-math Also, add a Builder method for intrinsics to reduce code duplication for clients. llvm-svn: 325960	2018-02-23 21:16:12 +00:00
Matt Davis	523c656e25	[Debug] Add dbg.value intrinsics for PHIs created during LCSSA. Summary: This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA. I noticed a case where a value carried across a loop was reported as <optimized out>. Specifically this case: ``` int bar(int x, int y) { return x + y; } int foo(int size) { int val = 0; for (int i = 0; i < size; ++i) { val = bar(val, i); // Both val and i are correct } return val; // <optimized out> } ``` In the above case, after all of the interesting computation completes our value is reported as "optimized out." This change will add a dbg.value to correct this. This patch also moves the dbg.value insertion routine from LoopRotation.cpp into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA). Reviewers: mzolotukhin, aprantl, vsk, davide Reviewed By: aprantl, vsk Subscribers: dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D42551 llvm-svn: 325926	2018-02-23 17:38:27 +00:00
Sanjay Patel	6b9c7a9c83	[InstCombine] refactor fmul with negated op folds; NFCI The existing code was inefficiently looking for 'nsz' variants. That's unnecessary because we canonicalize those to the expected form with -0.0. We may also want to adjust or remove the fold that sinks negation. We don't do that for fdiv (or integer ops?). That should be uniform? It may also lead to missed optimization as in PR21914: https://bugs.llvm.org/show_bug.cgi?id=21914 ...or we just have to fix other passes to avoid that problem. llvm-svn: 325924	2018-02-23 17:14:28 +00:00
Sanjay Patel	4a9116e897	[InstCombine] use FMF-copying functions to reduce code; NFCI llvm-svn: 325923	2018-02-23 17:07:29 +00:00
Nicolai Haehnle	43c1115cd4	StructurizeCFG: Test for branch divergence correctly Summary: This fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Reviewers: arsenm, rampitec, jlebar Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D40546 llvm-svn: 325881	2018-02-23 10:45:46 +00:00
Bjorn Steinbrink	983d6c3f18	Mark MergedLoadStoreMotion as not preserving MemDep results Summary: MemDep caches results that signify that a dependence is non-local, and there is currently no way to invalidate such cache entries. Unfortunately, when MLSM sinks a store that can result in a non-local dependence becoming a local one, and then MemDep gives wrong answers. The easiest way out here is to just say that MLSM does indeed not preserve MemDep results. Reviewers: davide, Gerolf Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43177 llvm-svn: 325880	2018-02-23 10:41:57 +00:00
Eric Christopher	675dcf02a8	Update comment for whether or not we can optimize an alias - we're checking the alias and not the aliasee. If the alias can be interposed then we shouldn't do anything. llvm-svn: 325837	2018-02-22 23:12:11 +00:00
Peter Collingbourne	32f5405bff	Fix DataFlowSanitizer instrumentation pass to take parameter position changes into account for custom functions. When DataFlowSanitizer transforms a call to a custom function, the new call has extra parameters. The attributes on parameters must be updated to take the new position of each parameter into account. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D43132 llvm-svn: 325820	2018-02-22 19:09:07 +00:00
Daniel Neilson	20c9207be3	[AlignmentFromAssumptions] Set source and dest alignments of memory intrinsiscs separately Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the AlignmentFromAssumptions pass to cease using the old getAlignment()/setAlignment API of MemoryIntrinsic in favour of getting/setting source & dest specific alignments through the new API. This allows us to simplify some of the code in this pass and also be more aggressive about setting the source and destination alignments separately. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: hfinkel, bollu, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D43081 llvm-svn: 325816	2018-02-22 18:55:59 +00:00
Luke Cheeseman	6c1e6bbe0c	[FunctionAttrs][ArgumentPromotion][GlobalOpt] Disable some optimisations passes for naked functions - Fix for bug 36078. - Prevent the functionattrs, function-attrs, globalopt and argpromotion passes from changing naked functions. - These passes can perform some alterations to the functions that should not be applied. An example is removing parameters that are seemingly not used because they are only referenced in the inline assembly. Another example is marking the function as fastcc. llvm-svn: 325788	2018-02-22 14:42:08 +00:00
Mircea Trofin	56950974d4	[SampleProf] NFC. Expose reusable functionality in SampleProfile. Summary: Exposing getOffset and findFunctionSamples as members of SampleProfile. They are intimately tied to design choices of the sample profile format - using offsets instead of line numbers, and traversing inlined functions stack, respectively. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43605 llvm-svn: 325747	2018-02-22 06:42:57 +00:00
Vedant Kumar	1ceabcf080	[Utils] Avoid a hash table lookup in salvageDI, NFC According to the current coverage report salvageDebugInfo() is called 5.12 million times during testing and almost always returns early. The early return depends on LocalAsMetadata::getIfExists returning null, which involves a DenseMap lookup in an LLVMContextImpl. We can probably speed this up by simply checking the IsUsedByMD bit in Value. llvm-svn: 325738	2018-02-22 01:29:41 +00:00
Sanjay Patel	5a6f904520	[InstCombine] add and use Create*FMF functions; NFC llvm-svn: 325730	2018-02-21 22:18:55 +00:00
Evgeniy Stepanov	43271b1803	[hwasan] Fix inline instrumentation. This patch changes hwasan inline instrumentation: Fixes address untagging for shadow address calculation (use 0xFF instead of 0x00 for the top byte). Emits brk instruction instead of hlt for the kernel and user space. Use 0x900 instead of 0x100 for brk immediate (0x100 - 0x800 are unavailable in the kernel). Fixes and adds appropriate tests. Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D43135 llvm-svn: 325711	2018-02-21 19:52:23 +00:00
Vedant Kumar	56492f9177	[BDCE] Salvage debug info from dying insts This results in 15 additional unique source variables in a stage2 build of FileCheck (at '-Os -g'), with a negligible increase in the size of the .debug_loc section. llvm-svn: 325660	2018-02-21 01:55:33 +00:00
Sanjay Patel	6f716a7c5e	[InstCombine] C / -X --> -C / X We already do this in DAGCombiner, but it should also be good to eliminate the fsub use in IR. This is similar to rL325648. llvm-svn: 325649	2018-02-21 00:01:45 +00:00

1 2 3 4 5 ...

19641 Commits