llvm-project

Commit Graph

Author	SHA1	Message	Date
Johannes Doerfert	f09f4b2676	[OpenMPOpt] Initialize value to avoid use of uninitialized memory This should fix the issue reported here: https://reviews.llvm.org/D76058#1937554	2020-03-23 19:17:19 -05:00
Matt Arsenault	b20a1d840f	GVNSink: Allow handling addrspacecast	2020-03-23 16:50:58 -04:00
Stefanos Baziotis	a650d555fc	[Attributor][NFC] Refactorings and typos in doc Reviewed By: sstefan1, uenoku Differential Revision: https://reviews.llvm.org/D76175	2020-03-23 22:44:10 +02:00
Matt Arsenault	43d98a0ecf	Allow replacing intrinsic operands with variables Since intrinsics can now specify when an argument is required to be constant, it is now OK to replace arguments with variables if they aren't. This means intrinsics must now be accurately marked with immarg.	2020-03-23 15:51:57 -04:00
Sanjay Patel	a1fe6beb1e	[InstCombine] remove one-use check for ctpop -> cttz Two one-use checks were added with rGfdcb27105537, but only the first one is necessary to limit an increase in instruction count. The second transform only creates one instruction, so it is always a reasonable canonicalization/optimization.	2020-03-23 13:59:57 -04:00
Johannes Doerfert	9d38f98dc3	[OpenMPOpt] Validate declaration types against the expected types Validation of the found runtime library functions declarations types (return and argument types) with the expected types. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76058	2020-03-23 11:43:36 -05:00
Benjamin Kramer	ff2f5097ed	[Attributor] Fold single-use variable into assert Fixes unused variable warning in Release builds.	2020-03-23 17:41:52 +01:00
Johannes Doerfert	c57689bef2	[Attributor][NFC] Copy llvm::function_ref, don't use references On IRC this was called a "code smell" so we get rid of it.	2020-03-23 10:45:24 -05:00
Johannes Doerfert	68fed27067	[Attributor] Handle calls in AAValueConstantRange properly We did handle calls that were operands of certain instructions but not standalone calls we visit via indirection, e.g., selects.	2020-03-23 10:45:24 -05:00
Johannes Doerfert	54ec9b54f6	[Attributor] Unify handling of must-tail calls We special cased must-tail calls all over the place because they cannot be modified as other calls can be. However, we already centralized the modification API so we can centralize the handling as well. This simplifies the code and allows us to remove must-tail calls completely.	2020-03-23 10:45:24 -05:00
Johannes Doerfert	0995001ce5	[Attributor][NFC] Predetermine the module before verification It could happen that we delete the first function in the SCC in the future so we should be careful accessing `Functions` after the manifest stage.	2020-03-23 10:45:23 -05:00
Johannes Doerfert	f3bf4b05c2	[Attributor][NFC] clang-format Attributor.{h,cpp}	2020-03-23 10:45:23 -05:00
Simon Pilgrim	fdcb271055	[InstCombine] Limit CTPOP -> CTTZ simplifications to one use Tweak D76568 so we only combine if it will remove the bit-twiddling. Suggested by @spatel	2020-03-23 14:33:41 +00:00
Guillaume Chatelet	32851f8d63	[Alignment][NFC] Deprecate VectorUtils::getAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, rogfer01, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76542	2020-03-23 13:54:15 +01:00
Simon Pilgrim	72d1419bfb	[InstCombine] Add CTPOP -> CTTZ simplifications (PR43513) As detailed on PR43513, we can simplify: ctpop(x | -x) -> bitwidth - cttz(x, false) Alive2: http://volta.cs.utah.edu:8080/z/caw49X ctpop(~x & (x - 1)) -> cttz(x, false) Alive2: http://volta.cs.utah.edu:8080/z/5zfVrx I've tweaked the initial test cases I added at rG2d712fb75584 to increase commutativity testing. Differential Revision: https://reviews.llvm.org/D76568	2020-03-23 11:04:33 +00:00
Nikita Popov	dc81923659	[InstCombine] Remove ExpensiveCombines option D75801 removed the last and only user of this option, so we can drop it now. The original idea behind this was to only run expensive transforms under -O3, but apart from the one known bits transform, this has never really taken off. I believe nowadays the recommendation is to put expensive transforms in AggressiveInstCombine instead, though that isn't terribly popular either :) Differential Revision: https://reviews.llvm.org/D76540	2020-03-22 16:56:28 +01:00
Matt Arsenault	830cfda19f	Utils: Mostly convert memcpy expansion to use Align The TTI hooks aren't converted. I also think the intrinsics should have mandatory alignment and never return MaybeAlign.	2020-03-22 11:21:44 -04:00
Nikita Popov	a63eaa5449	[SLP] Avoid repeated visitation in getVectorElementSize(); NFC We need to insert into the Visited set at the same time we insert into the worklist. Otherwise we may end up pushing the same instruction to the worklist multiple times, and only adding it to the visited set later.	2020-03-22 14:34:29 +01:00
Simon Pilgrim	f00a4b531a	[InstCombine][X86] simplifyX86immShift - remove ConstantAggregateZero handling. NFC. The llvm::computeKnownBits path now handles this.	2020-03-21 11:30:44 +00:00
Nikita Popov	2b52e4e629	[InstCombine] Remove known bits constant folding If ExpensiveCombines is enabled (which is the case with -O3 on the legacy PM and always on the new PM), InstCombine tries to compute the known bits of all instructions in the hope that all bits end up being known, which is fairly expensive. How effective is it? If we add some statistics on how often the constant folding succeeds and how many KnownBits calculations are performed and run test-suite we get: "instcombine.NumConstPropKnownBits": 642, "instcombine.NumConstPropKnownBitsComputed": 18744965, In other words, we get one fold for every 30000 KnownBits calculations. However, the truth is actually much worse: Currently, known bits are computed before performing other folds, so there is a high chance that cases that get folded by known bits would also have been handled by other folds. What happens if we compute known bits after all other folds (hacky implementation: https://gist.github.com/nikic/751f25b3b9d9e0860db5dde934f70f46)? "instcombine.NumConstPropKnownBits": 0, "instcombine.NumConstPropKnownBitsComputed": 18105547, So it turns out despite doing 18 million known bits calculations, the known bits fold does not do anything useful on test-suite. I was originally planning to move this into AggressiveInstCombine so it only runs once in the pipeline, but seeing this, I think we're better off removing it entirely. As this is the only use of the "expensive combines" mechanism, it may be removed afterwards, but I'll leave that to a separate patch. Differential Revision: https://reviews.llvm.org/D75801	2020-03-20 20:54:06 +01:00
Nikita Popov	3205d1a860	[InstCombine] Handle known shl nsw sign bit in SimplifyDemanded Ideally SimplifyDemanded should compute the same known bits as computeKnownBits(). This patch addresses one discrepancy, where ValueTracking is more powerful: If we have a shl nsw shift, we know that the sign bit of the input and output must be the same. If this results in a conflict, the result is poison. This is implemented in `2c4ca6832f/lib/Analysis/ValueTracking.cpp (L1175-L1179)` and `2c4ca6832f/lib/Analysis/ValueTracking.cpp (L904-L908)`. This implements the same basic logic in SimplifyDemanded. It's slightly stronger, because I return undef instead of zero for the poison case (which is not an option inside ValueTracking). As mentioned in https://reviews.llvm.org/D75801#inline-698484, we could detect poison in more cases, this just establishes parity with the existing logic. Differential Revision: https://reviews.llvm.org/D76489	2020-03-20 18:16:05 +01:00
Simon Pilgrim	34659de5fd	[InstCombine][X86] simplifyX86immShift - convert variable in-range vector shift by scalar amounts to generic shifts (PR40391) The sll/srl/sra scalar vector shifts can be replaced with generic shifts if the shift amount is known to be in range. This also required public DemandedElts variants of llvm::computeKnownBits to be exposed (PR36319).	2020-03-20 15:48:06 +00:00
Nikita Popov	0372768776	[InstCombine] Simplify calls with "returned" attribute If a call argument has the "returned" attribute, we can simplify the call to the value of that argument. This was already partially handled by InstSimplify/InstCombine for the case where the argument is an integer constant, and the result is thus known via known bits. The non-constant (or non-int) argument cases weren't handled though. This previously landed as an InstSimplify transform, but was reverted due to assertion failures when compiling the Linux kernel. The reason is that simplifying a call to another call breaks assumptions in call graph updating during inlining. As the code is not easy to fix, and there is no particularly strong motivation for having this in InstSimplify, the transform is only performed in InstCombine instead. Differential Revision: https://reviews.llvm.org/D75815	2020-03-20 10:23:39 +01:00
Nikita Popov	5c10967157	[InstCombine] Don't replace musttail result based on known bits This is the same change as D75824, but for two cases where InstCombine performs the same optimization: Replacing an instruction whose bits are fully known with a constant. This is not (generally) legal for musttail calls. Differential Revision: https://reviews.llvm.org/D76457	2020-03-20 10:17:09 +01:00
Florian Hahn	be86bc76f0	[Matrix] Generalize ColumnMatrixTy to MatrixTy (NFC). This patch sets the stage for supporting both row and column major layouts for matrixes. It renames ColumnMatrixTy to MatrixTy, adds booleans indicating the underlying layout to both MatrixTy and ShapeInfo and generalizes the methods of MatrixTy to support both row and column major layouts. Reviewers: Gerolf, anemet, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D76324	2020-03-20 08:32:13 +00:00
Florian Hahn	3a8372ed02	[DSE] Support traversing MemoryPhis. For MemoryPhis, we have to avoid that the MemoryPhi may be executed before the access we are currently looking at. To do this we do a post-order numbering of the basic blocks in the function and bail out once we reach a MemoryPhi with a larger (or equal) post-order block number than the current MemoryAccess. This changes the order in which we visit stores for elimination. This patch also adds support for exploring multiple paths. We keep a worklist (ToCheck) of memory accesses that might be eliminated by our starting MemoryDef or MemoryPhis for further exploration. For MemoryPhis, we add the incoming values to the worklist, for MemoryDefs we add the defining access. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72148	2020-03-20 07:51:42 +00:00
Jun Ma	032251e34d	[Coroutines] Fix PR45130 For now, when final suspend can be simplified by simplifySuspendPoint, handleFinalSuspend is executed as well to remove last case in switch instruction. This patch fixes it. Differential Revision: https://reviews.llvm.org/D76345	2020-03-20 11:27:08 +08:00
Benjamin Kramer	1db8b341a6	[Matrix] Fold single-use variable into assert Avoids -Wunused-variable warnings in Release builds.	2020-03-19 21:42:22 +01:00
Florian Hahn	796fb2e474	[Matrix] Move multiply-add code generation into separate function (NFC). This logic can be shared with the tiled code generation. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75565	2020-03-19 20:26:19 +00:00
Kazu Hirata	e23d786526	[JumpThreading] Fix infinite loop (PR44611) Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44611 by preventing an infinite loop in the jump threading pass when -jump-threading-across-loop-headers is on. Specifically, without this patch, jump threading through two basic blocks would trigger on the same area of the CFG over and over, resulting in an infinite loop. Consider testcase PR44611-across-header-hang.ll in this patch. The first opportunity to thread through two basic blocks is: from bb_body2 through bb_header and bb_body1 to bb_body2. The pass duplicates bb_header and bb_body1 as, say, bb_header.thread1 and bb_body1.thread1. Since bb_header contains a successor edge back to itself, bb_header.thread1 also contains a successor edge to bb_header, immediately giving rise to the next jump threading opportunity: from bb_header.thread1 through bb_header and bb_body1 to bb_body2. After that, we repeatedly thread an incoming edge into bb_header through bb_header and bb_body1 to bb_body2. In other words, we keep peeling one iteration from bb_header's self loop. The patch fixes the problem by preventing the pass from duplicating a basic block containing a self loop. Reviewers: wmi, junparser, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76390	2020-03-19 12:49:36 -07:00
Florian Hahn	0cc2d23751	[Matrix] Hoist load/store generation logic, add helpers for tiled access. This patch slightly generalizes the code to emit loads and stores of a matrix and adds helpers to load/store a tile of a larger matrix. This will be used in a follow-up patch introducing initial tiling. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75564	2020-03-19 19:28:21 +00:00
Simon Pilgrim	a11e5b32df	[InstCombine][X86] simplifyX86immShift - handle variable out-of-range vector shift by immediate amounts (PR40391) If we know the SSE shift amount is out of range then we can simplify to zero value (logical) or a 'signsplat' bitwidth-1 shift (arithmetic). This allows us to remove the equivalent ConstantInt constant folding path from simplifyX86immShift.	2020-03-19 18:27:31 +00:00
Simon Pilgrim	433897da4a	[InstCombine][X86] simplifyX86immShift - convert variable in-range vector shift by immediate amounts to generic shifts (PR40391) The slli/srli/srai 'immediate' vector shifts (although it's not immediate anymore to match gcc) can be replaced with generic shifts if the shift amount is known to be in range.	2020-03-19 15:44:24 +00:00
Florian Hahn	4a58996dd2	[SCCP] Use constant ranges for PHI nodes. For PHIs with multiple incoming values, we can improve precision by using constant ranges for integers. We can over-approximate phis by merging the incoming values. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71933	2020-03-19 12:45:33 +00:00
Florian Hahn	8a36594a7e	[SCCP] Use constant ranges for binary operators. If one of the operands of a binary operator is a constant range, we can use ConstantRange::binaryOp to approximate the result. We still handle single element constant ranges as we did previously, with ConstantExpr::get(), because ConstantRange::binaryOp still gives worse results in a few cases for single element ranges. Also note that we bail out early if any of the operands is still unknown. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71936	2020-03-19 09:35:48 +00:00
Huihui Zhang	2ea5495759	[InstCombine][SVE] Fix InstCombiner::visitAllocaInst for scalable vector. Summary: DataLayout::getTypeAllocSize() returns TypeSize. For cases where the scalable property doesn't matter (check for zero-sized alloca), we should explicitly call getKnownMinSize() to avoid implicit type conversion to uint64_t, which is invalid for scalable vector type. Reviewers: sdesmalen, efriedma, spatel, apazos Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76386	2020-03-18 20:57:14 -07:00
Florian Hahn	fd2c15e602	[VPlan] Do not print mapping for Value2VPValue. The latest improvements to VPValue printing make this mapping clear when printing the operand. Printing the mapping separately is not required any longer. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76375	2020-03-18 21:44:07 +00:00
Florian Hahn	00c1cd1934	[VPlan] Record underlying value for VPValues created by addVPValue (NFC). Now that printing VPValues uses the underlying IR value name, if available, recording the underlying value here improves printing. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76374	2020-03-18 21:30:58 +00:00
Eli Friedman	e24e95fe90	Remove CompositeType class. The existence of the class is more confusing than helpful, I think; the commonality is mostly just "GEP is legal", which can be queried using APIs on GetElementPtrInst. Differential Revision: https://reviews.llvm.org/D75660	2020-03-18 13:53:17 -07:00
Florian Hahn	e6a74803d4	[VPlan] Use underlying value for printing, if available. When an underlying value is available, we can use its name for printing, as discussed in D73078. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76200	2020-03-18 17:46:57 +00:00
Simon Pilgrim	f4e495a18e	[InstCombine][X86] simplifyX86varShift - convert variable in-range per-element shift amounts to generic shifts (PR40391) AVX2/AVX512 per-element shifts can be replaced with generic shifts if the shift amounts are guaranteed to be in-range (upper bits are known zero).	2020-03-18 11:26:54 +00:00
Florian Hahn	5672ae8d86	[SCCP] Use constant ranges for select, if cond is overdefined. For selects with an unknown condition, we can approximate the result by merging the state of both options. This automatically takes care of the case where one operand is undef. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71935	2020-03-18 09:26:02 +00:00
Michael Liao	f2f8bdc2b1	Fix `-Wunused-variable` warning. NFC.	2020-03-17 20:15:50 -04:00
Florian Hahn	a72ae99cf9	[SCCP] Split up callsite handling, only propagate result on change (NFC) Functions include their arguments in the use-list. Changed function values mean that the result of the function changed. We only need to update the call sites with the new function result and do not have to propagate the call arguments. To do so, this patch splits up the visitCallSite into handleCallResult and handleCallArguments and updates markUsersAsChanged to only update call results for functions. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D75846	2020-03-17 20:05:35 +00:00
Sanjay Patel	be9e3d9416	[InstCombine] reduce demand-limited bool math to logic, part 2 Follow-on suggested in: D75961	2020-03-17 15:18:18 -04:00
Tyker	e8ac825f5b	[AssumeBundles] Detection of Empty bundles Summary: Prevent InstCombine from removing llvm.assume for which the argument is true when they have operand bundles with useful information. Reviewers: jdoerfert, nikic, lebedev.ri Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76147	2020-03-17 15:50:15 +01:00
Florian Hahn	1d6f919df2	[SCCP] Explicitly mark values as overdefined (NFC). This was part of D60582 but can be committed separately.	2020-03-17 12:13:30 +00:00
Roman Lebedev	398b497cd0	[NFC] LoopRotate: do issue debug message when not rotating due to instr count It is somewhat problematic to notice this issue otherwise.	2020-03-17 09:26:09 +03:00
Serguei Katkov	80c351cdb6	[InstCombine] Transform to undef incorrect atomic unordered mem intrinsics According to LangRef: If len is not a positive integer multiple of element_size, then the behaviour of the intrinsic is undefined. Add InstCombine rule to transform intrinsic to undef operation. This is a follow-up for D76116. Reviewers: reames Reviewed By: reames Subscribers: hiraditya, jfb, dantrushin, llvm-commits Differential Revision: https://reviews.llvm.org/D76215	2020-03-17 10:20:16 +07:00
Matt Arsenault	b0bdb186f5	Utils: Always set alignment when expanding mem intrinsics This was creating natural aligned loads and stores, which may not be the case. The target could request a wider type load with less alignment.	2020-03-16 14:34:29 -04:00
Matt Arsenault	05e7d8d6ce	TTI: Add addrspace parameters to memcpy lowering functions	2020-03-16 14:34:29 -04:00
Florian Hahn	4878aa36d4	[ValueLattice] Add new state for undef constants. This patch adds a new undef lattice state, which is used to represent UndefValue constants or instructions producing undef. The main difference to the unknown state is that merging undef values with constants (or single element constant ranges) produces the constant/constant range, assuming all uses of the merge result will be replaced by the found constant. In contrast, merging non-single element ranges with undef needs to go to overdefined. Using unknown for UndefValues currently causes mis-compiles in CVP/LVI (PR44949) and will become problematic once we use ValueLatticeElement for SCCP. Reviewers: efriedma, reames, davide, nikic Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D75120	2020-03-14 17:19:59 +00:00
Whitney Tsang	aca7167535	[NFC][LoopUnrollAndJam] clang-format. I am currently working on this file.	2020-03-14 00:04:10 +00:00
Akira Hatanaka	c6f1713c46	[ObjC][ARC] Don't remove autoreleaseRV/retainRV pairs if the call isn't a tail call This reapplies the patch in https://reviews.llvm.org/rG1f5b471b8bf4, which was reverted because it was causing crashes. https://bugs.chromium.org/p/chromium/issues/detail?id=1061289#c2 Check that HasSafePathToCall is true before checking the call is a tail call. Original commit message: Previously ARC optimizer removed the autoreleaseRV/retainRV pair in the following code, which caused the object returned by @something to be placed in the autorelease pool because the call to @something isn't a tail call: ``` %call = call i8* @something(...) %2 = call i8* @objc_retainAutoreleasedReturnValue(i8* %call) %3 = call i8* @objc_autoreleaseReturnValue(i8* %2) ret i8* %3 ``` Fix the bug by checking whether @something is a tail call. rdar://problem/59275894	2020-03-13 13:52:14 -07:00
Alexey Zhikhartsev	f71abec661	[LoopInterchange] Fix interchanging contents of preheader BBs Summary: Previously LCSSA was getting broken by placing instructions into the (newly) inner header instead of the preheader. Fixes PR43474 Reviewers: fhahn Reviewed By: fhahn Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75943	2020-03-13 15:59:37 -04:00
Reid Kleckner	478b06e687	Revert "[ObjC][ARC] Check the basic block size before calling DominatorTree::dominate" This reverts commit `5c3117b0a9` This should not be necessary after `7593a480db`, and Florian Hahn has confirmed that the problem no longer reproduces with this patch. I happened to notice this code because the FIXME talks about OrderedBasicBlock. Reviewed By: fhahn, dexonsmith Differential Revision: https://reviews.llvm.org/D76075	2020-03-13 11:57:55 -07:00
Huihui Zhang	fc1f205745	[SLPVectorizer][SVE] Bail out early for scalable vector. Summary: SLPVectorizer tries to vectorize a list of scalar instructions of the same type; instructions already vectorized are rejected through isValidElementType(). Without this patch, tryToVectorizeList() will first try to determine the vectorization factor of a list of Instructions before checking whether each instruction has an unsupported type or not. For instructions already vectorized for SVE, it will crash at getVectorElementSize(), where it tries to return a fixed size. This patch makes sure invalid element types are rejected before trying to get the vectorization factor. This ensures we are not trying to vectorize instructions already vectorized. Reviewers: sdesmalen, efriedma, spatel, RKSimon, ABataev, apazos, rengolin Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76017	2020-03-13 11:23:31 -07:00
Sanjay Patel	94f5d73182	[SimplifyCFG] fix formatting; NFC	2020-03-13 14:12:28 -04:00
Sanjay Patel	51e53af11c	[SimplifyCFG] fix debug print formatting; NFC	2020-03-13 14:12:28 -04:00
Florian Hahn	0c5b6e2ea5	Recommit "[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI)" This patch should fix the cause of the stage2 failures and PR45185. This reverts the revert commit `c52f839e72`.	2020-03-13 17:03:22 +00:00
Tyker	69375fd0a3	[AssumeBundles] Preserve Information in the inliner Summary: During inlining, create and insert an llvm.assume with attributes to preserve them. To prevent any changes for now, generation of llvm.assume is under a flag that is disabled by default. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75825	2020-03-13 17:35:47 +01:00
omarahmed1111	b285b333dc	[Attributor] Detect possibly unbounded cycles in functions This patch adds the mayContainUnboundedCycle helper function, which checks whether a function has any cycle that we cannot show to be bounded. Loops with a maximum trip count are considered bounded; any other cycle is not. It also contains some fixed tests, and some added tests that contain bounded and unbounded loops and non-loop cycles. Reviewed By: jdoerfert, uenoku, baziotis Differential Revision: https://reviews.llvm.org/D74691	2020-03-13 11:17:33 -05:00
Pankaj Gode	bf990530ae	[Attributor] Improve noalias preservation using reachability Resolution for below fixme: (ii) Check whether the value is captured in the scope using AANoCapture. FIXME: This is conservative though, it is better to look at CFG and check only uses possibly executed before this callsite. Propagates caller argument's noalias attribute to callee. Reviewed by: jdoerfert, uenoku Reviewers: jdoerfert, sstefan1, uenoku Subscribers: uenoku, sstefan1, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D71617	2020-03-13 21:09:08 +05:30
Sanjay Patel	cbeffa3f6c	[SimplifyCFG] convert if-else chain to switch; NFC Fix formatting of related function names while changing the code.	2020-03-13 10:28:41 -04:00
Nico Weber	86eb2c3991	Revert "[ObjC][ARC] Don't remove autoreleaseRV/retainRV pairs if the call isn't" This reverts commit `1f5b471b8b`. Causes asserts when building code with arc. See https://bugs.chromium.org/p/chromium/issues/detail?id=1061289#c2 for a full repro. Will post a creduced repro once creduce is done running.	2020-03-13 10:16:02 -04:00
Johannes Doerfert	a198adb490	[Attributor] IPO across definition boundary of a function marked alwaysinline Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D75590	2020-03-13 01:06:12 -05:00
rathod-sahaab	263c4a3c75	Fix compiler warning when compiling without asserts This patch aims to prevent warning-as-error failures in release build. As suggested in this comment https://reviews.llvm.org/D69930#1910922 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D75970	2020-03-13 00:26:49 -05:00
Huihui Zhang	118abf2017	[SVE] Update API ConstantVector::getSplat() to use ElementCount. Summary: Support ConstantInt::get() and Constant::getAllOnesValue() for scalable vector type, this requires ConstantVector::getSplat() to take in 'ElementCount', instead of 'unsigned' number of element count. This change is needed for D73753. Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74386	2020-03-12 13:22:41 -07:00
Florian Hahn	c52f839e72	Revert "[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI)" This commit is likely causing clang-with-lto-ubuntu to fail http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/16052 Also causes PR45185. This reverts commit `f1ac5d2263`.	2020-03-12 18:49:11 +00:00
Hideto Ueno	d9bf79f4e9	[Attributor][FIX] Add a missing dependence track in noalias deduction	2020-03-12 15:27:35 +00:00
Florian Hahn	f1ac5d2263	[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI) This patch switches SCCP to use ValueLatticeElement for lattice values, instead of the local LatticeVal, as first step to enable integer range support. This patch does not make use of constant ranges for additional operations and the only difference for now is that integer constants are represented by single element ranges. To preserve the existing behavior, the following helpers are used * isConstant(LV): returns true when LV is either a constant or a constant range with a single element. This should return true in the same cases where LV.isConstant() returned true previously. * getConstant(LV): returns a constant if LV is either a constant or a constant range with a single element. This should return a constant in the same cases as LV.getConstant() previously. * getConstantInt(LV): same as getConstant, but additionally casted to ConstantInt. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D60582	2020-03-12 12:03:06 +00:00
Max Kazantsev	3dc6e53c97	[LoopPeel] Turn incorrect assert into a check Summary: This patch replaces an incorrect assert with a check. Previously it asserted that if SCEV cannot prove `isKnownPredicate(A != B)`, then it should be able to prove `isKnownPredicate(A == B)`. Both of these facts may be unprovable. It is shown in the provided test: Could not prove: `{-294,+,-2}<%bb1> != 0` Asserting: `{-294,+,-2}<%bb1> == 0` Obviously, this SCEV is not equal to zero, but 0 is in its range so we also cannot prove that it is not zero. Instead of assert, we should be checking the required conditions explicitly. Reviewers: lebedev.ri, fhahn, sanjoy, fedor.sergeev Reviewed By: lebedev.ri Subscribers: hiraditya, zzheng, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76050	2020-03-12 17:23:07 +07:00
Sanjay Patel	fae900921b	[InstCombine] reduce demand-limited bool math to logic The cmp math test is inspired by memcmp() patterns seen in D75840. I know there's at least 1 related fold we can do here if both values are sext'd, but I'm not seeing a way to generalize further. We have some other bool math patterns that we want to reduce, but that might require fixing the bogus transforms noted in D72396. Alive proof translations of the regression tests: https://rise4fun.com/Alive/zGWi Name: demand add 1 %xz = zext i1 %x to i32 %ys = sext i1 %y to i32 %sub = add i32 %xz, %ys %r = lshr i32 %sub, 31 => %notx = xor i1 %x, 1 %and = and i1 %y, %notx %r = zext i1 %and to i32 Name: demand add 2 %xz = zext i1 %x to i5 %ys = sext i1 %y to i5 %sub = add i5 %xz, %ys %r = and i5 %sub, 16 => %notx = xor i1 %x, 1 %and = and i1 %y, %notx %r = select i1 %and, i5 -16, i5 0 Name: demand add 3 %xz = zext i1 %x to i8 %ys = sext i1 %y to i8 %a = add i8 %ys, %xz %r = ashr i8 %a, 7 => %notx = xor i1 %x, 1 %and = and i1 %y, %notx %r = sext i1 %and to i8 Name: cmp math %gt = icmp ugt i32 %x, %y %lt = icmp ult i32 %x, %y %xz = zext i1 %gt to i32 %yz = zext i1 %lt to i32 %s = sub i32 %xz, %yz %r = lshr i32 %s, 31 => %r = zext i1 %lt to i32 Differential Revision: https://reviews.llvm.org/D75961	2020-03-11 15:45:58 -04:00
Florian Hahn	bc6c8c4bbb	[Matrix] Add remark propagation along the inlined-at chain. This patch adds support for propagating matrix expressions along the inlined-at chain and emitting remarks at the traversed function scopes. To motivate this new behavior, consider the example below. Without the remark 'up-leveling', we would only get remarks in load.h and store.h, but we cannot generate a remark describing the full expression in toplevel.cpp, which is the place where the user has the best chance of spotting/fixing potential problems. With this patch, we generate a remark for the load in load.h, one for the store in store.h and one for the complete expression in toplevel.cpp. For a bigger example, please see remarks-inlining.ll. load.h: template <typename Ty, unsigned R, unsigned C> Matrix<Ty, R, C> load(Ty Ptr) { Matrix<Ty, R, C> Result; Result.value = reinterpret_cast <typename Matrix<Ty, R, C>::matrix_t >(Ptr); return Result; } store.h: template <typename Ty, unsigned R, unsigned C> void store(Matrix<Ty, R, C> M1, Ty Ptr) { reinterpret_cast<typename decltype(M1)::matrix_t >(Ptr) = M1.value; } toplevel.cpp void test(double A, double B, double *C) { store(add(load<double, 3, 5>(A), load<double, 3, 5>(B)), C); } For a given function, we traverse the inlined-at chain for each matrix instruction (= instructions with shape information). We collect the matrix instructions in each DISubprogram we visit. This produces a mapping of DISubprogram -> (List of matrix instructions visible in the subprogram). We then generate remarks using the list of instructions for each subprogram in the inlined-at chain. Note that the list of instructions for a subprogram includes the instructions from its own subprograms recursively. For example using the example above, for the subprogram 'test' this includes inline functions 'load' and 'store'. This allows surfacing the remarks at a level useful to users. Please note that the current approach may create a lot of extra remarks. Additional heuristics to cut-off the traversal can be implemented in the future. For example, it might make sense to stop 'up-leveling' once all matrix instructions are at the same debug location. Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D73600	2020-03-11 17:40:08 +00:00
Anna Welker	a6d3bec83f	[TTI][ARM][MVE] Refine gather/scatter cost model Refines the gather/scatter cost model, but also changes the TTI function getIntrinsicInstrCost to accept an additional parameter which is needed for the gather/scatter cost evaluation. This did require trivial changes in some non-ARM backends to adopt the new parameter. Extending gathers and truncating scatters are now priced cheaper. Differential Revision: https://reviews.llvm.org/D75525	2020-03-11 10:23:41 +00:00
Fangrui Song	a0c0389ffb	[SimplifyLibcalls] Don't replace locked IO (fgetc/fgets/fputc/fputs/fread/fwrite) with unlocked IO (*_unlocked) This essentially reverts some of the SimplifyLibcalls part changes of D45736 [SimplifyLibcalls] Replace locked IO with unlocked IO. C11 7.21.5.2 The fflush function > If stream is a null pointer, the fflush function performs this flushing action on all streams for which the behavior is defined above. i.e. fopen'ed FILE is inherently captured. POSIX.1-2017 getc_unlocked, getchar_unlocked, putc_unlocked, putchar_unlocked - stdio with explicit client locking > These functions can safely be used in a multi-threaded program if and only if they are called while the invoking thread owns the (FILE *) object, as is the case after a successful call to the flockfile() or ftrylockfile() functions. After a thread fopen'ed a FILE, when it is calling foobar() which is now replaced by foobar_unlocked(), if another thread is concurrently calling fflush(0), the behavior is undefined. C11 7.22.4.4 The exit function > Next, all open streams with unwritten buffered data are flushed, all open streams are closed, and all files created by the tmpfile function are removed. The replacement is only feasible if the program is single threaded, or exit or fflush(0) is never called. See also http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180528/556615.html for how the replacement makes libc interceptors difficult to implement. dalias: in a worst case, it's unbounded data corruption because of concurrent access to pointers without synchronization. f->wpos or rpos could get outside of the buffer, thread A could do f->wpos += j after knowing j is in bounds, while thread B also changes it concurrently. This can produce exploitable conditions depending on libc internals. Revert the SimplifyLibcalls part change because the cons obviously outweigh the pros.
Even when the replacement is feasible, the benefit is indemonstrable, more so in an application instead of an artificial glibc benchmark. Theoretically the replacement could be beneficial when calling getc_unlocked/putc_unlocked in a loop, but then it is better using a blocked IO operation and the user is likely aware of that. The function attribute inference is still useful and thus kept. Reviewed By: xbolva00 Differential Revision: https://reviews.llvm.org/D75933	2020-03-10 11:11:58 -07:00
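The POSIX requirement quoted in the commit above (the `*_unlocked` functions are safe only while the calling thread owns the stream) can be illustrated by taking the lock explicitly around a loop. This is a minimal sketch assuming a POSIX system; `writeWithExplicitLock` is a hypothetical helper, not something from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdio.h>  // flockfile, funlockfile, putc_unlocked (POSIX)

// Writes `text` to `stream` one character at a time. The stream lock is taken
// once around the whole loop, so every putc_unlocked call runs while this
// thread owns the FILE object, which is the condition POSIX states.
// Returns the number of characters successfully written.
std::size_t writeWithExplicitLock(std::FILE *stream, const char *text) {
  std::size_t written = 0;
  flockfile(stream);
  for (const char *p = text; *p != '\0'; ++p) {
    if (putc_unlocked(*p, stream) == EOF)
      break;
    ++written;
  }
  funlockfile(stream);
  return written;
}
```

This is the pattern a user opts into knowingly. The compiler cannot insert the lock on the user's behalf, which is why silently rewriting `fputc` to `fputc_unlocked` is unsafe once another thread may call `fflush(0)` or `exit`.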
Benjamin Kramer	247a177cf7	Give helpers internal linkage. NFC.	2020-03-10 18:27:42 +01:00
Tyker	a4cde9ad7b	Fixed [AssumeBundles] Move to IR so it can be used by Analysis This is a recommit of `57c964aaa7` after fixing modules build.	2020-03-10 18:02:39 +01:00
Simon Moll	d871ef4e6a	[instcombine] remove fsub to fneg hacks; only emit fneg Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp negation. This also extends the scalarization cost in instcombine for unary operators to result in the same IR rewrites for fneg as for the idiom. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D75467	2020-03-10 16:57:02 +01:00
Florian Hahn	c8c14d979a	[InstCombine] Support vectors in SimplifyAddWithRemainder. SimplifyAddWithRemainder currently also matches for vector types, but tries to create an integer constant, which causes a crash. By using Constant::getIntegerValue() we can support both the scalar and vector cases. The 2 added test cases crash without the fix. Reviewers: spatel, lebedev.ri Reviewed By: spatel, lebedev.ri Differential Revision: https://reviews.llvm.org/D75906	2020-03-10 14:29:40 +00:00
Jonas Paulsson	c2dafe12dc	[SimplifyCFG] Skip merging return blocks if it would break a CallBr. SimplifyCFG should not merge empty return blocks and leave a CallBr behind with a duplicated destination since the verifier will then trigger an assert. This patch checks for this case and avoids the transformation. CodeGenPrepare has a similar check which also has a FIXME comment about why this is needed. It seems perhaps better if these two passes would eventually instead update the CallBr instruction instead of just checking and avoiding. This fixes https://bugs.llvm.org/show_bug.cgi?id=45062. Review: Craig Topper Differential Revision: https://reviews.llvm.org/D75620	2020-03-10 14:59:13 +01:00
Sanjay Patel	467eec0910	[InstCombine] fold gep-of-select-of-constants (PR45084) As shown in: https://bugs.llvm.org/show_bug.cgi?id=45084 ...we failed to combine a gep with constant indexes with a pointer operand that is a select of constants. Differential Revision: https://reviews.llvm.org/D75807	2020-03-10 09:25:13 -04:00
Florian Hahn	2d6ecf4648	[SLP] Support vectorizing functions provided by vector libs. It seems like the SLPVectorizer is currently not aware of vector versions of functions provided by libraries like Accelerate [1]. This patch updates SLPVectorizer to use the same infrastructure the LoopVectorizer uses to detect vectorizable library functions. For calls, it computes the cost of an intrinsic call (existing behavior) and the cost of a vector function library call, if available. Like LoopVectorizer, it assumes the cost of the vector function is simply the cost of a call to a vector function. [1] https://developer.apple.com/documentation/accelerate Reviewers: ABataev, RKSimon, spatel Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D75878	2020-03-10 13:10:50 +00:00
ahatanak	1f5b471b8b	[ObjC][ARC] Don't remove autoreleaseRV/retainRV pairs if the call isn't a tail call Previously, the ARC optimizer removed the autoreleaseRV/retainRV pair in the following code, which caused the object returned by @something to be placed in the autorelease pool because the call to @something isn't a tail call: ``` %call = call i8* @something(...) %2 = call i8* @objc_retainAutoreleasedReturnValue(i8* %call) %3 = call i8* @objc_autoreleaseReturnValue(i8* %2) ret i8* %3 ``` Fix the bug by checking whether the call to @something is a tail call. rdar://problem/59275894	2020-03-09 13:21:38 -07:00
Nikita Popov	c3ca6876ed	[InstCombine] Don't simplify calls without uses When simplifying a call without uses, replaceInstUsesWith() is going to do nothing, but we'll skip all following folds. We can only run into this problem with calls that both simplify and are not trivially dead if unused, which currently seems to happen only with calls to undef, as the test diff shows. When extending SimplifyCall() to handle "returned" attributes, this becomes a much bigger problem, so I'm fixing this first. Differential Revision: https://reviews.llvm.org/D75814	2020-03-09 18:47:46 +01:00
Jonas Devlieghere	882f589e20	Revert "[AssumeBundles] Move to IR so it can be used by Analysis" This breaks the modules build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/ http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/ This reverts commit `57c964aaa7`.	2020-03-09 09:02:47 -07:00
evgeny	6d2032e259	[WPD] Provide a way to prevent functions from being devirtualized Differential revision: https://reviews.llvm.org/D75617	2020-03-09 14:05:15 +03:00
Hideto Ueno	bdcbdb4848	[Attributor] Deduction based on path exploration This patch introduces the propagation of known information based on path exploration. For example, ``` int u(int c, int *p){ if(c) { return *p; } else { return *p + 1; } } ``` The argument `p` is dereferenced whatever c's value is. For an instruction `CtxI`, we accumulate branch instructions in the must-be-executed-context of `CtxI` and then we take the conjunction of the successors' known state. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D65593	2020-03-09 14:29:26 +09:00
Sanjay Patel	a69158c12a	[VectorCombine] fold extract-extract-op with different extraction indexes opcode (extelt V0, Ext0), (ext V1, Ext1) --> extelt (opcode (splat V0, Ext0), V1), Ext1 The first part of this patch generalizes the cost calculation to accept different extraction indexes. The second part creates a shuffle+extract before feeding into the existing code to create a vector op+extract. The patch conservatively uses "TargetTransformInfo::SK_PermuteSingleSrc" rather than "TargetTransformInfo::SK_Broadcast" (splat specifically from element 0) because we do not have a more general "SK_Splat" currently. That does not affect any of the current regression tests, but we might be able to find some cost model target specialization where that comes into play. I suspect that we can expose some missing x86 horizontal op codegen with this transform, so I'm speculatively adding a debug flag to disable the binop variant of this transform to allow easier testing. The test changes show that we're sensitive to cost model diffs (as we should be), so that means that patches like D74976 should have better coverage. Differential Revision: https://reviews.llvm.org/D75689	2020-03-08 09:57:55 -04:00
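The rewrite described in the commit above, `opcode (extelt V0, Ext0), (extelt V1, Ext1) --> extelt (opcode (splat V0, Ext0), V1), Ext1`, can be modeled on fixed-size arrays. This is a minimal sketch under assumptions: `Vec4` is a stand-in for a 4-element IR vector, the opcode is fixed to integer add, and all names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<int, 4>;  // stand-in for a 4-element IR vector

// Scalar form: add (extelt V0, Ext0), (extelt V1, Ext1).
int scalarForm(const Vec4 &v0, std::size_t ext0, const Vec4 &v1,
               std::size_t ext1) {
  return v0[ext0] + v1[ext1];
}

// Vector form: extelt (add (splat V0, Ext0), V1), Ext1.
int vectorForm(const Vec4 &v0, std::size_t ext0, const Vec4 &v1,
               std::size_t ext1) {
  Vec4 splat;
  splat.fill(v0[ext0]);  // broadcast lane Ext0 of V0 to every lane
  Vec4 sum{};
  for (std::size_t i = 0; i < sum.size(); ++i)
    sum[i] = splat[i] + v1[i];  // one vector add
  return sum[ext1];             // a single extract at Ext1
}

// Compares both forms for every pair of extraction indexes.
bool formsAgree(const Vec4 &v0, const Vec4 &v1) {
  for (std::size_t i = 0; i < v0.size(); ++i)
    for (std::size_t j = 0; j < v1.size(); ++j)
      if (scalarForm(v0, i, v1, j) != vectorForm(v0, i, v1, j))
        return false;
  return true;
}
```

The vector form trades two extracts and a scalar op for a shuffle, a vector op, and one extract, so whether it is profitable depends on the target cost model, as the commit message notes.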
Tyker	57c964aaa7	[AssumeBundles] Move to IR so it can be used by Analysis Summary: Assume bundles need to be usable by Analysis, and Transforms/Utils isn't, so this commit moves the utilities that deal with assume bundles to IR. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75618	2020-03-08 12:21:50 +01:00
Tyker	84056394e9	[AssumeBundles] Add API to query a bundle from a use Summary: Finding what information is known about a value from a use is generally useful and can be done quickly. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75616	2020-03-08 12:04:23 +01:00
Nikita Popov	51a466a61f	[InstCombine] Fix known bits handling in SimplifyDemandedUseBits Fixes a regression from D75801. SimplifyDemandedUseBits() is also supposed to compute the known bits (of the demanded subset) of the instruction. For unknown instructions it does so by directly calling computeKnownBits(). For known instructions it will compute known bits itself. However, for instructions where only some cases are handled directly (e.g. a constant shift amount) the known bits invocation for the unhandled case is sometimes missing. This patch adds the missing calls and thus removes the main discrepancy with ExpensiveCombines mode. Differential Revision: https://reviews.llvm.org/D75804	2020-03-07 18:16:41 +01:00
Stefanos Baziotis	01c48d7d11	[Attributor] Fold terminators before changing instructions to unreachable It is possible that an instruction to be changed to unreachable is in the same block with a terminator that can be constant-folded. In this case, as of now, the instruction will be changed to unreachable before the terminator is folded. But, then the whole BB becomes invalidated and so when we go ahead to fold the terminator, we trap. Change the order of these two. Differential Revision: https://reviews.llvm.org/D75780	2020-03-07 12:38:44 +02:00
Andrew Monshizadeh	c5a06019d2	Extend TimeTrace to LLVM's new pass manager With the addition of the LLD time tracing it made sense to include coverage for LLVM's various passes. Doing so ensures that ThinLTO is also covered with a time trace. Before: {F11333974} After: {F11333928} Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D74516	2020-03-06 14:45:19 -08:00
Anna Thomas	59029b9eef	[RS4GC] Handle uses of extractelement for conversion from vector to scalar base As mentioned in the comments, extractelement is special since we actually want a scalar base for that element we extracted from the vector (i.e. not a vector base). This same logic should apply to uses of the extractelement such as phis and selects which have the same BDV as the extractelement. However, for these uses we conservatively mark the BDV state as conflict, since setting the EE's new base BDV does not always dominate these uses. Added testcase showcases the problem where the BDV identification chokes on the incorrect cast from vector to scalar for the phi use of extractelement. Tests-Run: make check, internal fuzzer testing Reviewers: reames, skatkov, dantrushin Reviewed-By: dantrushin Differential Revision: https://reviews.llvm.org/D75704	2020-03-06 16:28:49 -05:00
Roman Lebedev	1badf7c33a	[InstCombine] Forgo one-use check in `(X - (X & Y)) --> (X & ~Y)` if Y is a constant Summary: This is potentially more friendly for further optimizations and analyses, e.g.: https://godbolt.org/z/G24anE This resolves a phase-ordering bug that was introduced in D75145 for https://godbolt.org/z/2gBwF2 https://godbolt.org/z/XvgSua Reviewers: spatel, nikic, dmgreen, xbolva00 Reviewed By: nikic, xbolva00 Subscribers: hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75757	2020-03-06 21:39:07 +03:00
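The identity behind the fold in the commit above, `(X - (X & Y)) --> (X & ~Y)`, holds because `X & Y` only contains bits that are already set in `X`, so the subtraction never borrows and simply clears those bits. A minimal sketch that checks it exhaustively at 8 bits (the width and the function names are assumptions made for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: X - (X & Y).
uint8_t subOfAnd(uint8_t x, uint8_t y) {
  return static_cast<uint8_t>(x - (x & y));
}

// Right-hand side of the fold: X & ~Y.
uint8_t andOfNot(uint8_t x, uint8_t y) {
  return static_cast<uint8_t>(x & ~y);
}

// Exhaustively compares both sides for every pair of 8-bit values.
bool identityHoldsForAllI8() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      if (subOfAnd(static_cast<uint8_t>(x), static_cast<uint8_t>(y)) !=
          andOfNot(static_cast<uint8_t>(x), static_cast<uint8_t>(y)))
        return false;
  return true;
}
```

When `Y` is a constant, `~Y` folds to another constant, so the right-hand side is a single `and` regardless of how many uses `X & Y` has.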
Jay Foad	11d1573bb6	[APFloat] Make use of new overloaded comparison operators. NFC. Reviewers: ekatz, spatel, jfb, tlively, craig.topper, RKSimon, nikic, scanon Subscribers: arsenm, jvesely, nhaehnle, hiraditya, dexonsmith, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75744	2020-03-06 16:42:53 +00:00
Fangrui Song	952ee0df9e	ThinLTOBitcodeWriter: drop dso_local when a GlobalVariable is converted to a declaration If we infer the dso_local flag for -fpic, dso_local should be dropped when we convert a GlobalVariable to a declaration. dso_local causes the generation of direct access (e.g. R_X86_64_PC32). Such relocations referencing STB_GLOBAL STV_DEFAULT objects are not allowed in a -shared link. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D74749	2020-03-05 18:09:33 -08:00
Zhongduo Lin	eae228a292	[IndVarSimplify] Extend previous special case for load use instruction to any narrow type loop variant to avoid extra trunc instruction Summary: The widenIVUse avoids generating trunc by evaluating the use as AddRec; this will not work when: 1) SCEV traces back to an instruction inside the loop that SCEV can not expand, e.g. add %indvar, (load %addr) 2) SCEV finds a loop variant, e.g. add %indvar, %loopvariant While SCEV fails to avoid trunc, we can still try to use an instruction combining approach to prove trunc is not required. This can be further extended with other instruction combining checks, but for now we handle the following case (sub can be "add" and "mul", "nsw + sext" can be "nuw + zext") ``` Src: %c = sub nsw %b, %indvar %d = sext %c to i64 Dst: %indvar.ext1 = sext %indvar to i64 %m = sext %b to i64 %d = sub nsw i64 %m, %indvar.ext1 ``` Therefore, as long as the result of add/sub/mul is extended to wide type with right extension and overflow wrap combination, no trunc is required regardless of how %b is generated. This pattern is common when calculating address in 64 bit architecture. Note that this patch reuses almost all the code from D49151 by @az: https://reviews.llvm.org/D49151 It extends it by providing proof of why trunc is unnecessary in the more general case; it should also resolve some of the concerns from the following discussion with @reames. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180910/585945.html Reviewers: sanjoy, efriedma, sebpop, reames, az, javed.absar, amehsan Reviewed By: az, amehsan Subscribers: hiraditya, llvm-commits, amehsan, reames, az Tags: #llvm Differential Revision: https://reviews.llvm.org/D73059	2020-03-05 16:27:59 -05:00
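The Src/Dst pair in the commit above relies on one fact: if the narrow `sub` carries `nsw` (no signed overflow), sign-extending its result gives the same value as subtracting the sign-extended operands. The sketch below checks this exhaustively, assuming an i8 narrow type and an i16 wide type instead of the narrow/i64 pair in the message so that every input can be enumerated. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// True when b - iv fits in i8, i.e. the narrow `sub nsw` does not overflow.
bool narrowSubFits(int8_t b, int8_t iv) {
  int diff = int{b} - int{iv};
  return diff >= INT8_MIN && diff <= INT8_MAX;
}

// Src: %c = sub nsw i8 %b, %indvar ; %d = sext i8 %c to i16.
// Only meaningful when narrowSubFits(b, iv) holds.
int16_t narrowThenExtend(int8_t b, int8_t iv) {
  int8_t c = static_cast<int8_t>(int{b} - int{iv});
  return static_cast<int16_t>(c);
}

// Dst: %d = sub nsw i16 (sext %b), (sext %indvar).
int16_t extendThenSub(int8_t b, int8_t iv) {
  return static_cast<int16_t>(int16_t{b} - int16_t{iv});
}

// Compares Src and Dst on every input pair where the nsw precondition holds.
bool widenedSubAgrees() {
  for (int b = INT8_MIN; b <= INT8_MAX; ++b)
    for (int iv = INT8_MIN; iv <= INT8_MAX; ++iv) {
      int8_t nb = static_cast<int8_t>(b);
      int8_t niv = static_cast<int8_t>(iv);
      if (narrowSubFits(nb, niv) &&
          narrowThenExtend(nb, niv) != extendThenSub(nb, niv))
        return false;
    }
  return true;
}
```

Inputs that violate the precondition are skipped because `nsw` makes the narrow result poison there, so the transform is free to produce any value for them.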
Hiroshi Yamauchi	76b9901fb1	[PGO][PGSO] Use IsColdXNthPercentile for sample PGO. Summary: This performs better for sample PGO. NFC as PGSOColdCodeOnlyForSamplePGO is still true. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75550	2020-03-05 09:54:54 -08:00
Florian Hahn	40e7bfc424	[VPlan] Use consecutive numbers to print VPValues instead of addresses. Currently when printing VPValues we use the object address, which makes it hard to distinguish VPValues as they usually are large numbers with varying distance between them. This patch adds a simple slot tracker, similar to the ModuleSlotTracker used for IR values. In order to dump a VPValue or anything containing a VPValue, a slot tracker for the enclosing VPlan needs to be created. The existing VPlanPrinter can take care of that for the existing code. We assign consecutive numbers to each VPValue we encounter in a reverse post order traversal of the VPlan. Reviewers: rengolin, hsaito, fhahn, Ayal, dorit, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D73078	2020-03-05 14:55:15 +00:00
Simon Pilgrim	01a91a6de7	Fix static analyzer uninitialized variable warning. NFCI.	2020-03-05 14:22:24 +00:00
Jun Ma	b10deb9487	[Coroutines] Optimized coroutine elision based on reachability Differential Revision: https://reviews.llvm.org/D75440	2020-03-05 14:43:50 +08:00
Sameer Sahasrabuddhe	42febbab91	StructurizeCFG: simplify phi nodes when possible After structurization, some phi nodes can have a single incoming edge and can be simplified away. This change runs a simplify query on all phis that are either modified or added by the structurizer. This also moves some phis closer to their use as a side benefit. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D75500	2020-03-05 10:33:15 +05:30
Guozhi Wei	ee9a3eba76	[CodeGenPrepare] Handle ExtractValueInst in dupRetToEnableTailCallOpts As the test case shows if there is an ExtractValueInst in the Ret block, function dupRetToEnableTailCallOpts can't duplicate it into the block containing call. So later no tail call is generated in CodeGen. This patch adds the ExtractValueInst handling code in function dupRetToEnableTailCallOpts and FoldReturnIntoUncondBranch, and later tail call can be generated for this case. Differential Revision: https://reviews.llvm.org/D74242	2020-03-04 11:10:32 -08:00
David Green	38e532278e	[LSR] Add masked load and store handling This teaches Loop Strength Reduction the details about masked load and store address operands, so that it can have a better time optimising them as it would for normal loads and stores. Differential Revision: https://reviews.llvm.org/D75371	2020-03-04 18:36:10 +00:00
Nikita Popov	293d813020	[InstCombine] Don't explicitly invoke const folding in shift combine InstCombine uses an IRBuilder that automatically performs target-dependent constant folding, so explicitly invoking it here is not necessary.	2020-03-04 18:33:00 +01:00
Nikita Popov	9b5de84e27	[InstCombine] Use IRBuilder to create bitcast This makes sure that the constant expression bitcast goes through target-dependent constant folding, and thus avoids an additional iteration of InstCombine.	2020-03-04 18:28:38 +01:00
Nikita Popov	0e890cd4d4	[ConstantFolding] Always return something from ConstantFoldConstant Spin-off from D75407. As described there, ConstantFoldConstant() currently returns null for non-ConstantExpr/ConstantVector inputs, but otherwise always returns non-null, independently of whether any folding has happened or not. This is confusing and makes consumer code more complicated. I would expect either that ConstantFoldConstant() returns non-null only if it actually folded something, or that it always returns non-null. I'm going with the latter possibility here, which appears to be more useful considering existing usage. Differential Revision: https://reviews.llvm.org/D75543	2020-03-04 18:24:47 +01:00
Sanjay Patel	71a316883d	[PassManager] adjust VectorCombine placement The initial placement of vector-combine in the opt pipeline revealed phase ordering bugs: https://bugs.llvm.org/show_bug.cgi?id=45015 https://bugs.llvm.org/show_bug.cgi?id=42022 This patch contains a few independent changes: 1. Move the pass up in the pipeline, so it happens just after loop-vectorization. This is only to keep vectorization passes together in the pipeline at the moment. I don't have evidence of interaction between these yet. 2. Add an -early-cse pass after -vector-combine to clean up redundant ops. This was partly proposed as far back as rL219644 (which is why it's effectively being moved in the old PM code). This is important because the subsequent -instcombine doesn't work as well without EarlyCSE. With the CSE, -instcombine is able to squash shuffles together in 1 of the tests (because those are simple "select" shuffles). 3. Remove the -vector-combine pass that was running after SLP. We may want to do that eventually, but I don't have a test case to support it yet. Differential Revision: https://reviews.llvm.org/D75145	2020-03-04 11:10:49 -05:00
Matt Arsenault	f9047ede58	LICM: Reorder condition checks Check the fast math flag before the more expensive loop check.	2020-03-03 17:15:57 -05:00
Brian Gesiak	aa85b437a9	[Coroutines] Use dbg.declare for frame variables Summary: https://gist.github.com/modocache/ed7c62f6e570766c0f39b35dad675c2f is an example of a small C++ program that uses C++20 coroutines that is difficult to debug, due to the loss of debug info for variables that "spill" across coroutine suspension boundaries. This patch addresses that issue by inserting 'llvm.dbg.declare' intrinsics that point the debugger to the variables' location at an offset to the coroutine frame. With this patch, I confirmed that running the 'frame variable' commands in https://gist.github.com/modocache/ed7c62f6e570766c0f39b35dad675c2f at the specified breakpoints results in the correct values being printed for coroutine frame variables 'i' and 'j' when using an lldb built from trunk, as well as with gdb 8.3 (lldb 9.0.1, however, could not print the values). The added test case also verifies this improved behavior. The existing coro-debug.ll test case is also modified to reflect the locations at which Clang actually places calls to 'dbg.declare', and additional checks are added to ensure this patch works as intended in that example as well. Reviewers: vsk, jmorse, GorNishanov, lewissbaker, wenlei Subscribers: EricWF, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75338	2020-03-03 17:13:46 -05:00
Tyker	c5ec8890c9	[NFC] Try fix ubsan buildbot after `876d133789`	2020-03-03 17:53:02 +01:00
Tyker	876d133789	[AssumeBundles] Add API to fill a map from operand bundles of an llvm.assume. Summary: This patch adds a new way to query operand bundles of an llvm.assume that is much better suited to some users like the Attributor that need to do many queries on the operand bundles of llvm.assume. Some modifications of the IR like replaceAllUsesWith can cause information in the map to be outdated, so this API is more suited to analysis passes and passes that don't make modification that could invalidate the map. Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75020	2020-03-03 14:22:52 +01:00
Florian Hahn	05afa55521	[VPlan] Add getPlan() to VPBlockBase. This patch adds a getPlan accessor to VPBlockBase, which finds the entry block of the plan containing the block and returns the plan set for this block. VPBlockBase contains a VPlan pointer, but it should only be set for the entry block of a plan. This allows moving blocks without updating the pointer for each moved block and in the future we might introduce a parent relationship between plans and blocks, similar to the one in LLVM IR. Reviewers: rengolin, hsaito, fhahn, Ayal, dorit, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D74445	2020-03-03 13:20:13 +00:00
David Green	ec7e4a9a80	[LoopVectorizer] Add reduction tests for inloop reductions. NFC Also adds a force-reduction-intrinsics option for testing, for forcing the generation of reduction intrinsics even when the backend is not requesting them.	2020-03-03 10:54:00 +00:00
Alok Kumar Sharma	6f029dadf6	[DebugInfo] Avoid generating duplicate llvm.dbg.value Summary: This is to avoid generating a duplicate llvm.dbg.value intrinsic if it already exists after the Instruction. Before inserting an llvm.dbg.value instruction, LLVM checks if the same instruction is already present before the instruction to avoid duplicates. Currently it fails to check if it already exists after the instruction. flang generates IR like this. %4 = load i32, i32* %i1_311, align 4, !dbg !42 call void @llvm.dbg.value(metadata i32 %4, metadata !35, metadata !DIExpression()), !dbg !33 When this IR is processed in llvm, it ends up inserting duplicates. %4 = load i32, i32* %i1_311, align 4, !dbg !42 call void @llvm.dbg.value(metadata i32 %4, metadata !35, metadata !DIExpression()), !dbg !33 call void @llvm.dbg.value(metadata i32 %4, metadata !35, metadata !DIExpression()), !dbg !33 We have now updated LdStHasDebugValue to include the cases when instruction is already followed by same dbg.value instruction we intend to insert. Now, Definition and usage of function LdStHasDebugValue are deleted. RemoveRedundantDbgInstrs is called for the cleanup of duplicate dbg.value's Testing: Added unit test for validation check-llvm check-debuginfo (the debug info integration tests) Reviewers: aprantl, probinson, dblaikie, jmorse, jini.susan.george SouraVX, awpandey, dstenb, vsk Reviewed By: aprantl, jmorse, dstenb, vsk Differential Revision: https://reviews.llvm.org/D74030	2020-03-03 09:56:45 +05:30
Juneyoung Lee	9f1f244d3c	[LICM] Allow freeze to hoist/sink out of a loop Summary: This patch allows LICM to hoist/sink freeze instructions out of a loop. Reviewers: reames, fhahn, efriedma Reviewed By: reames Subscribers: jfb, lebedev.ri, hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75400	2020-03-03 12:29:39 +09:00
Sumanth Gundapaneni	9897daa6bf	Update LSR's logic that identifies a post-increment SCEV value. One of the checks has been removed as it seems invalid. The LoopStep size is almost always 32-bit. Differential Revision: https://reviews.llvm.org/D75079	2020-03-02 16:34:18 -06:00
Teresa Johnson	80bf137fa1	Revert "Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"" This reverts commit `80d0a137a5`, and the follow on fix in `873c0d0786`. It is causing test failures after a multi-stage clang bootstrap. See discussion on D73242 and D75201.	2020-03-02 14:02:13 -08:00
Teresa Johnson	873c0d0786	[ThinLTO/LowerTypeTests] Handle unpromoted local type ids Summary: Fixes an issue that cropped up after the changes in D73242 to delay the lowering of type tests. LTT couldn't handle any type tests with non-string type id (which happens for local vtables, which we try to promote during the compile step but cannot always when there are no exported symbols). We can simply treat the same as having an Unknown resolution, which delays their lowering, still allowing such type tests to be used in subsequent optimization (e.g. planned usage during ICP). The final lowering which simply removes these handles them fine. Beefed up an existing ThinLTO test for such unpromoted type ids so that the internal vtable isn't removed before lower type tests, which hides the problem. Reviewers: evgeny777, pcc Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, aganea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75201	2020-03-02 09:31:44 -08:00
Arkady Shlykov	3dcaf296ae	[Loop Peeling] Add possibility to enable peeling on loop nests. Summary: Current peeling implementation bails out in case of loop nests. The patch introduces a field in TargetTransformInfo structure that certain targets can use to relax the constraints if it's profitable (disabled by default). Also additional option is added to enable peeling manually for experimenting and testing purposes. Reviewers: fhahn, lebedev.ri, xbolva00 Reviewed By: xbolva00 Subscribers: RKSimon, xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D70304	2020-03-02 08:37:11 -08:00
David Green	d0d38df091	[LoopVectorizer] Change types of lists from pointers to references. NFC getReductionVars, getInductionVars and getFirstOrderRecurrences were all being returned from LoopVectorizationLegality as pointers to lists. This just changes them to be references, cleaning up the interface slightly. Differential Revision: https://reviews.llvm.org/D75448	2020-03-02 15:04:41 +00:00
Reid Kleckner	1adbe86d87	[WinEH] Fix inttoptr+phi optimization in presence of catchswitch getFirstInsertionPt's return value must be checked for validity before casting it to Instruction*. Don't attempt to insert casts after a phi in a catchswitch block. Fixes PR45033, introduced in D37832. Reviewed By: davidxl, hfinkel Differential Revision: https://reviews.llvm.org/D75381	2020-03-01 07:49:28 -08:00
Juneyoung Lee	5cbb265694	[GVN] Fold equivalent freeze instructions Summary: This patch defines two freeze instructions to have the same value number if they are equivalent. This is allowed because GVN replaces all uses of a duplicated instruction with another. If it partially rewrites use, it is not allowed. e.g) ``` a = freeze(x) b = freeze(x) use(a) use(a) use(b) => use(a) use(b) // This is not allowed! use(b) ``` Reviewers: fhahn, reames, spatel, efriedma Reviewed By: fhahn Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75398	2020-03-01 07:32:05 +09:00
Simon Pilgrim	7e9747b50b	[X86][F16C] Remove cvtph2ps intrinsics and use generic half2float conversion (PR37554) This removes everything but int_x86_avx512_mask_vcvtph2ps_512 which provides the SAE variant, but even this can use the fpext generic if the rounding control is the default. Differential Revision: https://reviews.llvm.org/D75162	2020-02-29 18:57:35 +00:00
Vedant Kumar	dd1ea9de2e	Reland: [Coverage] Revise format to reduce binary size Try again with an up-to-date version of D69471 (`99317124` was a stale revision). --- Revise the coverage mapping format to reduce binary size by: 1. Naming function records and marking them `linkonce_odr`, and 2. Compressing filenames. This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB) and speeds up end-to-end single-threaded report generation by 10%. For reference the compressed name data in llc is 81MB (__llvm_prf_names). Rationale for changes to the format: - With the current format, most coverage function records are discarded. E.g., more than 97% of the records in llc are duplicate placeholders for functions visible-but-not-used in TUs. Placeholders are used to show under-covered functions, but duplicate placeholders waste space. - We reached general consensus about giving (1) a try at the 2017 code coverage BoF [1]. The thinking was that using `linkonce_odr` to merge duplicates is simpler than alternatives like teaching build systems about a coverage-aware database/module/etc on the side. - Revising the format is expensive due to the backwards compatibility requirement, so we might as well compress filenames while we're at it. This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB). See CoverageMappingFormat.rst for the details on what exactly has changed. Fixes PR34533 [2], hopefully. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html [2] https://bugs.llvm.org/show_bug.cgi?id=34533 Differential Revision: https://reviews.llvm.org/D69471	2020-02-28 18:12:04 -08:00
Vedant Kumar	3388871714	Revert "[Coverage] Revise format to reduce binary size" This reverts commit `99317124e1`. This is still busted on Windows: http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/40873 The llvm-cov tests report 'error: Could not load coverage information'.	2020-02-28 18:03:15 -08:00
Vedant Kumar	99317124e1	[Coverage] Revise format to reduce binary size Revise the coverage mapping format to reduce binary size by: 1. Naming function records and marking them `linkonce_odr`, and 2. Compressing filenames. This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB) and speeds up end-to-end single-threaded report generation by 10%. For reference the compressed name data in llc is 81MB (__llvm_prf_names). Rationale for changes to the format: - With the current format, most coverage function records are discarded. E.g., more than 97% of the records in llc are duplicate placeholders for functions visible-but-not-used in TUs. Placeholders are used to show under-covered functions, but duplicate placeholders waste space. - We reached general consensus about giving (1) a try at the 2017 code coverage BoF [1]. The thinking was that using `linkonce_odr` to merge duplicates is simpler than alternatives like teaching build systems about a coverage-aware database/module/etc on the side. - Revising the format is expensive due to the backwards compatibility requirement, so we might as well compress filenames while we're at it. This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB). See CoverageMappingFormat.rst for the details on what exactly has changed. Fixes PR34533 [2], hopefully. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html [2] https://bugs.llvm.org/show_bug.cgi?id=34533 Differential Revision: https://reviews.llvm.org/D69471	2020-02-28 17:33:25 -08:00
Matt Morehouse	30bb737a75	[DFSan] Add __dfsan_cmp_callback. Summary: When -dfsan-event-callbacks is specified, insert a call to __dfsan_cmp_callback on every CMP instruction. Reviewers: vitalybuka, pcc, kcc Reviewed By: kcc Subscribers: hiraditya, #sanitizers, eugenis, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D75389	2020-02-28 15:49:44 -08:00
Matt Morehouse	f668baa459	[DFSan] Add __dfsan_mem_transfer_callback. Summary: When -dfsan-event-callbacks is specified, insert a call to __dfsan_mem_transfer_callback on every memcpy and memmove. Reviewers: vitalybuka, kcc, pcc Reviewed By: kcc Subscribers: eugenis, hiraditya, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D75386	2020-02-28 15:48:25 -08:00
Matt Morehouse	52f889abec	[DFSan] Add __dfsan_load_callback. Summary: When -dfsan-event-callbacks is specified, insert a call to __dfsan_load_callback() on every load. Reviewers: vitalybuka, pcc, kcc Reviewed By: vitalybuka, kcc Subscribers: hiraditya, #sanitizers, llvm-commits, eugenis, kcc Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D75363	2020-02-28 14:26:09 -08:00
Austin Kerbow	4fa63fd452	[VectorCombine] Fix assert on compare extract index Extract index could be a different integral type. Differential Revision: https://reviews.llvm.org/D75327	2020-02-28 10:37:08 -08:00
Valery N Dmitriev	d723ec4f04	[SLP][NFC] Assert that tree entry operands completed when scheduler looks for dependencies. This change adds an assertion to prevent tricky bug related to recursive approach of building vectorization tree. For loop below takes number of operands directly from tree entry rather than from scalars. If the entry at this moment turns out incomplete (i.e. not all operands set) then not all the dependencies will be seen by the scheduler. This can lead to failed scheduling (and thus failed vectorization) for perfectly vectorizable tree. Here is code example which is likely to fire the assertion: for (i : VL0->getNumOperands()) { ... TE->setOperand(i, Operands); buildTree_rec(Operands, Depth + 1,...); } Correct way is two steps process: first set all operands to a tree entry and then recursively process each operand. Differential Revision: https://reviews.llvm.org/D75296	2020-02-28 10:34:48 -08:00
Hiroshi Yamauchi	f16d2bec40	Devirtualize a call on alloca without waiting for post inline cleanup and next DevirtSCCRepeatedPass iteration. This aims to fix a missed inlining case. If there's a virtual call in the callee on an alloca (stack allocated object) in the caller, and the callee is inlined into the caller, the post-inline cleanup would devirtualize the virtual call, but if the next iteration of DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is based on a heuristic to determine whether to reiterate, we may miss inlining the devirtualized call. This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp. This is a second commit after a revert https://reviews.llvm.org/rG4569b3a86f8a4b1b8ad28fe2321f936f9d7ffd43 and a fix https://reviews.llvm.org/rG41e06ae7ba91. Differential Revision: https://reviews.llvm.org/D69591	2020-02-28 09:43:32 -08:00
Hiroshi Yamauchi	41e06ae7ba	[CallPromotionUtils] Add missing promotion legality check to tryPromoteCall. Summary: This fixes the crash that led to the revert of D69591. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75307	2020-02-28 09:35:09 -08:00
Valery N Dmitriev	02e5e47e17	[SLP][NFC] Delete some unreachable code. This patch deletes some dead code out of SLP vectorizer. Couple of changes taken out of D57059 to slightly lighten it plus one more similar case fixed. Differential Revision: https://reviews.llvm.org/D75276	2020-02-28 09:22:51 -08:00
Teresa Johnson	f9ca75f19b	[Inliner] Inlining should honor nobuiltin attributes Summary: Final patch in series to fix inlining between functions with different nobuiltin attributes/options, which was specifically an issue in LTO. See discussion on D61634 for background. The prior patch in this series (D67923) enabled per-Function TLI construction that identified the nobuiltin attributes. Here I have allowed inlining to proceed if the callee's nobuiltins are a subset of the caller's nobuiltins, but not in the reverse case, which should be conservatively correct. This is controlled by a new option, -inline-caller-superset-nobuiltin, which is enabled by default. Reviewers: hfinkel, gchatelet, chandlerc, davidxl Subscribers: arsenm, jvesely, nhaehnle, mehdi_amini, eraman, hiraditya, haicheng, dexonsmith, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74162	2020-02-28 07:34:14 -08:00
Pierre-vh	2809abbd98	[Transform][MemCpyOpt] Add missing DebugLoc to %tmpbitcast Fix for https://bugs.llvm.org/show_bug.cgi?id=37967 Differential Revision: https://reviews.llvm.org/D75173	2020-02-28 15:20:51 +00:00
Juneyoung Lee	cc28a75467	Let EarlyCSE fold equivalent freeze instructions Summary: This patch makes EarlyCSE fold equivalent freeze instructions. Another optimization that I think will be useful is to remove freeze if its operand is used as a branch condition or at llvm.assume: ``` %c = ... br i1 %c, label %A, .. A: %d = freeze %c ; %d can be optimized to %c because %c cannot be poison or undef (or 'br %c' would be UB otherwise) ``` If it makes sense for EarlyCSE to support this as well, I will make a patch for this. Reviewers: spatel, reames, lebedev.ri Reviewed By: lebedev.ri Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75334	2020-02-28 20:35:20 +09:00
Hans Wennborg	d48c981697	SROA: Don't drop atomic load/store alignments (PR45010) SROA will drop the explicit alignment on allocas when the ABI guarantees enough alignment. Because the alignment on new load/store instructions is set based on the alloca's alignment, that means SROA would end up dropping the alignment from atomic loads and stores, which is not allowed (see bug). For those, make sure to always carry over the alignment from the previous instruction. Differential revision: https://reviews.llvm.org/D75266	2020-02-28 10:38:40 +01:00
Jun Ma	43c8307c6c	[Coroutines] CoroElide enhancement Fix regression of CoroElide pass when the current function is a coroutine. Differential Revision: https://reviews.llvm.org/D71663	2020-02-28 10:41:59 +08:00
Juneyoung Lee	2b5a897651	Revert "[SimpleLoopUnswitch] Fix introduction of UB when hoisted condition may be undef or poison" .. due to performance regression. This patch is reverted until infrastructure for CSE/LICM support for freeze is added. This reverts commit `181628b`	2020-02-28 11:10:46 +09:00
Eli Friedman	b299926453	[IndVars] Fix sort comparator. std::sort will compare an element to itself in some cases. We should not crash if this happens. Differential Revision: https://reviews.llvm.org/D75000	2020-02-27 17:25:18 -08:00
Matt Morehouse	470db54cbd	[DFSan] Add flag to insert event callbacks. Summary: For now just insert the callback for stores, similar to how MSan tracks origins. In the future we may want to add callbacks for loads, memcpy, function calls, CMPs, etc. Reviewers: pcc, vitalybuka, kcc, eugenis Reviewed By: vitalybuka, kcc, eugenis Subscribers: eugenis, hiraditya, #sanitizers, llvm-commits, kcc Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D75312	2020-02-27 17:14:19 -08:00
Matt Morehouse	2a29617b9d	[DFSan] Remove unused IRBuilder. NFC Reviewers: pcc, vitalybuka, kcc Reviewed By: kcc Subscribers: hiraditya, llvm-commits, kcc Tags: #llvm Differential Revision: https://reviews.llvm.org/D75190	2020-02-27 16:27:20 -08:00
Artur Pilipenko	02e3d5c3a2	Fix DSE miscompile when store is clobbered across loop iterations DSE would mistakenly remove store (2): a = calloc(n+1) for (int i = 0; i < n; i++) { store 1, a[i+1] // (1) store 0, a[i] // (2) } The fix is to do PHI translation while looking for clobbering instructions between the store and the calloc. Reviewed By: efriedma, bjope Differential Revision: https://reviews.llvm.org/D68006	2020-02-27 14:43:01 -08:00
Nikita Popov	4ef272ec9c	[InstCombine] DCE instructions earlier When InstCombine initially populates the worklist, it already performs constant folding and DCE. However, as the instructions are initially visited in program order, this DCE can pick up only the last instruction of a dead chain, the rest would only get picked up in the main InstCombine run. To avoid this, we instead perform the DCE in separate pass over the collected instructions in reverse order, which will allow us to pick up full dead instruction chains. We already need to do this reverse iteration anyway to populate the worklist, so this shouldn't add extra cost. This by itself only fixes a small part of the problem though: The same basic issue also applies during the main InstCombine loop. We generally always want DCE to occur as early as possible, because it will allow one-use folds to happen. Address this by also performing DCE while adding deferred instructions to the main worklist. This drops the number of tests that perform more than 2 InstCombine iterations from ~80 to ~40. There's some spurious test changes due to operand order / icmp toggling. Differential Revision: https://reviews.llvm.org/D75008	2020-02-27 18:45:59 +01:00
Simon Moll	ddd11273d9	Remove BinaryOperator::CreateFNeg Use UnaryOperator::CreateFNeg instead. Summary: With the introduction of the native fneg instruction, the fsub -0.0, %x idiom is obsolete. This patch makes LLVM emit fneg instead of the idiom in all places. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D75130	2020-02-27 09:06:03 -08:00
Pierre-vh	f64e457cb7	[Transforms][Debugify] Ignore PHI nodes when checking for DebugLocs Fix for: https://bugs.llvm.org/show_bug.cgi?id=37964 Differential Revision: https://reviews.llvm.org/D75242	2020-02-27 16:14:11 +00:00

23691 Commits