llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	1f65903dc1	[InstCombine] move add after smin/smax Follow-up to rL355221. This isn't specifically called for within PR14613, but we'll get there eventually if it's not already requested in some other bug report. https://rise4fun.com/Alive/5b0 Name: smax Pre: WillNotOverflowSignedSub(C1,C0) %a = add nsw i8 %x, C0 %cond = icmp sgt i8 %a, C1 %r = select i1 %cond, i8 %a, i8 C1 => %c2 = icmp sgt i8 %x, C1-C0 %u2 = select i1 %c2, i8 %x, i8 C1-C0 %r = add nsw i8 %u2, C0 Name: smin Pre: WillNotOverflowSignedSub(C1,C0) %a = add nsw i32 %x, C0 %cond = icmp slt i32 %a, C1 %r = select i1 %cond, i32 %a, i32 C1 => %c2 = icmp slt i32 %x, C1-C0 %u2 = select i1 %c2, i32 %x, i32 C1-C0 %r = add nsw i32 %u2, C0 llvm-svn: 355272	2019-03-02 16:45:10 +00:00
Philip Reames	cf0a978e1f	[InstCombine] Extend saturating idempotent atomicrmw transform to FP I'm assuming that the NaN propagation logic for InstructionSimplify's handling of fadd and fsub is correct, and applying the same to atomicrmw. Differential Revision: https://reviews.llvm.org/D58836 llvm-svn: 355222	2019-03-01 19:50:36 +00:00
Sanjay Patel	6e1e7e1c3e	[InstCombine] move add after umin/umax In the motivating cases from PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 ...moving the add enables us to narrow the min/max which eliminates zext/trunc which enables significantly better vectorization. But that bug is still not completely fixed. https://rise4fun.com/Alive/5KQ Name: umax Pre: C1 u>= C0 %a = add nuw i8 %x, C0 %cond = icmp ugt i8 %a, C1 %r = select i1 %cond, i8 %a, i8 C1 => %c2 = icmp ugt i8 %x, C1-C0 %u2 = select i1 %c2, i8 %x, i8 C1-C0 %r = add nuw i8 %u2, C0 Name: umin Pre: C1 u>= C0 %a = add nuw i32 %x, C0 %cond = icmp ult i32 %a, C1 %r = select i1 %cond, i32 %a, i32 C1 => %c2 = icmp ult i32 %x, C1-C0 %u2 = select i1 %c2, i32 %x, i32 C1-C0 %r = add nuw i32 %u2, C0 llvm-svn: 355221	2019-03-01 19:42:40 +00:00
Philip Reames	2226e9a745	[LICM] Infer proper alignment from loads during scalar promotion This patch fixes an issue where we would compute an unnecessarily small alignment during scalar promotion when no store is guaranteed to execute, but we've proven load speculation safety. Since speculating a load requires proving the existing alignment is valid at the new location (see Loads.cpp), we can use the alignment fact from the load. For non-atomics, this is a performance problem. For atomics, this is a correctness issue, though an incredibly rare one to see in practice. For atomics, we might not be able to lower an improperly aligned load or store (i.e. i32 align 1). If such an instruction makes it all the way to codegen, we may fail to codegen the operation, or we may simply generate a slow call to a library function. The part that makes this super hard to see in practice is that the memory location actually is well aligned, and instcombine knows that. So, to see a failure, you have to have a) hit the bug in LICM, b) somehow hit a depth limit in InstCombine/ValueTracking to avoid fixing the alignment, and c) then have generated an instruction which fails codegen rather than simply emitting a slow libcall. All around, pretty hard to hit. Differential Revision: https://reviews.llvm.org/D58809 llvm-svn: 355217	2019-03-01 18:45:05 +00:00
Philip Reames	77982868c5	[InstCombine] Extend "idempotent" atomicrmw optimizations to floating point An idempotent atomicrmw is one that does not change memory in the process of execution. We have already added handling for the various integer operations; this patch extends the same handling to floating point operations which were recently added to IR. Note: At the moment, we canonicalize idempotent fsub to fadd when ordering requirements prevent us from using a load. As discussed in the review, I will be replacing this with canonicalizing both floating point ops to integer ops in the near future. Differential Revision: https://reviews.llvm.org/D58251 llvm-svn: 355210	2019-03-01 18:00:07 +00:00
Jonas Hahnfeld	e071cd86df	Hide two unused debugging methods, NFCI. GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used in Release builds. Hide them behind 'ifndef NDEBUG'. llvm-svn: 355205	2019-03-01 17:15:21 +00:00
Manman Ren	576124a319	Try to fix NetBSD buildbot breakage introduced in D57463. By including the header file in the source. llvm-svn: 355202	2019-03-01 15:25:24 +00:00
Fangrui Song	f4b25f700a	[ConstantHoisting] Call cleanup() in ConstantHoistingPass::runImpl to avoid dangling elements in ConstIntInfoVec for new PM Summary: ConstIntInfoVec contains elements extracted from the previous function. In new PM, releaseMemory() is not called and the dangling elements can cause segfault in findConstantInsertionPoint. Rename releaseMemory() to cleanup() to deliver the idea that it is mandatory and call cleanup() in ConstantHoistingPass::runImpl to fix this. Reviewers: ormris, zzheng, dmgreen, wmi Reviewed By: ormris, wmi Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58589 llvm-svn: 355174	2019-03-01 05:27:01 +00:00
Reid Kleckner	701593f1db	[sancov] Instrument reachable blocks that end in unreachable Summary: These sorts of blocks often contain calls to noreturn functions, like longjmp, throw, or trap. If they don't end the program, they are "interesting" from the perspective of sanitizer coverage, so we should instrument them. This was discussed in https://reviews.llvm.org/D57982. Reviewers: kcc, vitalybuka Subscribers: llvm-commits, craig.topper, efriedma, morehouse, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D58740 llvm-svn: 355152	2019-02-28 22:54:30 +00:00
Manman Ren	1829512dd3	Add a module pass for order file instrumentation The basic idea of the pass is to use a circular buffer to log the execution ordering of the functions. We only log the function when it is first executed. We use a 8-byte hash to log the function symbol name. In this pass, we add three global variables: (1) an order file buffer: a circular buffer at its own llvm section. (2) a bitmap for each module: one byte for each function to say if the function is already executed. (3) a global index to the order file buffer. At the function prologue, if the function has not been executed (by checking the bitmap), log the function hash, then atomically increase the index. Differential Revision: https://reviews.llvm.org/D57463 llvm-svn: 355133	2019-02-28 20:13:38 +00:00
Rong Xu	a6ff69f6dd	[PGO] Context sensitive PGO (part 2) Part 2 of CSPGO changes (mostly related to ProfileSummary). Note that I use a default parameter in setProfileSummary() and getSummary(). This is to break the dependency in clang. I will make the parameter explicit after changing clang in a separated patch. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355131	2019-02-28 19:55:07 +00:00
Sanjay Patel	4a47f5f550	[InstCombine] fold adds of constants separated by sext/zext This is part of a transform that may be done in the backend: D13757 ...but it should always be beneficial to fold this sooner in IR for all targets. https://rise4fun.com/Alive/vaiW Name: sext add nsw %add = add nsw i8 %i, C0 %ext = sext i8 %add to i32 %r = add i32 %ext, C1 => %s = sext i8 %i to i32 %r = add i32 %s, sext(C0)+C1 Name: zext add nuw %add = add nuw i8 %i, C0 %ext = zext i8 %add to i16 %r = add i16 %ext, C1 => %s = zext i8 %i to i16 %r = add i16 %s, zext(C0)+C1 llvm-svn: 355118	2019-02-28 19:05:26 +00:00
Chijun Sima	586187639a	Make MergeBlockIntoPredecessor conformant to the precondition of calling DTU.applyUpdates Summary: It is mentioned in the documentation of DTU that "It is illegal to submit any update that has already been submitted, i.e., you are supposed not to insert an existent edge or delete a nonexistent edge." It is dangerous to violate this rule because DomTree and PostDomTree occasionally crash in this scenario. This patch fixes `MergeBlockIntoPredecessor`, making it conformant to this precondition. Reviewers: kuhar, brzycki, chandlerc Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58444 llvm-svn: 355105	2019-02-28 16:47:18 +00:00
Bjorn Pettersson	d30f308a9f	Add support for computing "zext of value" in KnownBits. NFCI Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly been telling that the operation is equivalent to zero extending the value we're tracking. That has not been true, instead the user has been forced to explicitly set the extended bits as known zero afterwards. This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control if the extended bits should be considered as known zero or as unknown. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58650 llvm-svn: 355099	2019-02-28 15:45:29 +00:00
Eric Christopher	07944353fc	Temporarily revert "ArgumentPromotion should copy all metadata to new Function" and the dependent patch "Refine ArgPromotion metadata handling" as they're causing segfaults in argument promotion. This reverts commits r354032 and r353537. llvm-svn: 355060	2019-02-28 01:11:12 +00:00
Reid Kleckner	4fb3502bc9	[InstrProf] Use separate comdat group for data and counters Summary: I hadn't realized that instrumentation runs before inlining, so we can't use the function as the comdat group. Doing so can create relocations against discarded sections when references to discarded __profc_ variables are inlined into functions outside the function's comdat group. In the future, perhaps we should consider standardizing the comdat group names that ELF and COFF use. It will save object file size, since __profv_$sym won't appear in the symbol table again. Reviewers: xur, vsk Subscribers: eraman, hiraditya, cfe-commits, #sanitizers, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D58737 llvm-svn: 355044	2019-02-27 23:38:44 +00:00
Alina Sbirlea	fcfa7c5f92	[MemorySSA] Make insertDef insert corresponding phi nodes. Summary: The original assumption for the insertDef method was that it would not materialize Defs out of nowhere, hence it will not insert phis needed after inserting a Def. However, when cloning an instruction (use case used in LICM), we do materialize Defs "out of nowhere". If the block receiving a Def has at least one other Def, then no processing is needed. If the block just received its first Def, we must check where Phi placement is needed. The only new usage of insertDef is in LICM, hence the trigger for the bug. But the original goal of the method also fails to apply for the move() method. If we move a Def from the entry point of a diamond to either the left or right blocks, then the merge block must add a phi. While this use case does not currently occur, or may be viewed as an incorrect transformation, MSSA must behave correctly given the scenario. Resolves PR40749 and PR40754. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58652 llvm-svn: 355040	2019-02-27 22:20:22 +00:00
Rong Xu	6cdf3d8086	Recommit r354930 "[PGO] Context sensitive PGO (part 1)" Fixed UBSan failures. llvm-svn: 355005	2019-02-27 17:24:33 +00:00
Vlad Tsyrklevich	c01643087e	Revert "[PGO] Context sensitive PGO (part 1)" This reverts commit r354930, it was causing UBSan failures. llvm-svn: 354953	2019-02-27 03:45:28 +00:00
Vedant Kumar	73522d1678	[HotColdSplit] Disable splitting for sanitized functions Splitting can make sanitizer errors harder to understand, as the trapping instruction may not be in the function where the bug was detected. rdar://48142697 llvm-svn: 354931	2019-02-26 22:55:46 +00:00
Rong Xu	35d2d51369	[PGO] Context sensitive PGO (part 1) Current PGO profile counts are not context sensitive. The branch probabilities for the inlined functions are kept the same for all call-sites, and they might be very different from the actual branch probabilities. These suboptimal profiles can greatly affect some downstream optimizations, in particular for the machine basic block placement optimization. In this patch, we propose to have a post-inline PGO instrumentation/use pass, which we called Context Sensitive PGO (CSPGO). For the users who want the best possible performance, they can perform a second round of PGO instrument/use on top of the regular PGO. They will have two sets of profile counts. The first pass profile will be mainly for inline, indirect-call promotion, and CGSCC simplification pass optimizations. The second pass profile is for post-inline optimizations and code-gen optimizations. A typical usage: // Regular PGO instrumentation and generate pass1 profile. > clang -O2 -fprofile-generate source.c -o gen > ./gen > llvm-profdata merge default.profraw -o pass1.profdata // CSPGO instrumentation. > clang -O2 -fprofile-use=pass1.profdata -fcs-profile-generate -o gen2 > ./gen2 // Merge two sets of profiles > llvm-profdata merge default.profraw pass1.profdata -o profile.profdata // Use the combined profile. Pass manager will invoke two PGO use passes. > clang -O2 -fprofile-use=profile.profdata -o use This change touches many components in the compiler. The reviewed patch (D54175) will be committed in phases. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 354930	2019-02-26 22:37:46 +00:00
Alina Sbirlea	9026404125	[MemorySSA & SimpleLoopUnswitch] Update MemorySSA in ReplaceUsesOfWith. SimpleLoopUnswitch must update MemorySSA when removing instructions. Resolves PR39197. llvm-svn: 354919	2019-02-26 19:44:52 +00:00
Sanjay Patel	e8bf0f79bd	[InstCombine] canonicalize more unsigned saturated add with 'not' Yet another pattern variation suggested by: https://bugs.llvm.org/show_bug.cgi?id=14613 There are 8 more potential commuted patterns here on top of the 8 that were already handled (rL354221, rL354276, rL354393). We have the obvious commute of the 'add' + commute of the cmp predicate/operands (ugt/ult) + commute of the select operands: Name: base %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ult i32 %x, %y %r = select i1 %c, i32 -1, i32 %a => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: ugt %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ugt i32 %y, %x %r = select i1 %c, i32 -1, i32 %a => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: commute select %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ult i32 %y, %x %r = select i1 %c, i32 %a, i32 -1 => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: ugt + commute select %notx = xor i32 %x, -1 %a = add i32 %notx, %y %c = icmp ugt i32 %x, %y %r = select i1 %c, i32 %a, i32 -1 => %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/den llvm-svn: 354887	2019-02-26 15:18:49 +00:00
Simon Pilgrim	a066f1f9e6	[Vectorizer] Add vectorization support for fixed smul/umul intrinsics This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix it's the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790	2019-02-25 15:42:02 +00:00
Sanjay Patel	9907d3c8b4	[InstCombine] canonicalize add/sub with bool add A, sext(B) --> sub A, zext(B) We have to choose 1 of these forms, so I'm opting for the zext because that's easier for value tracking. The backend should be prepared for this change after: D57401 rL353433 This is also a preliminary step towards reducing the amount of bit hackery that we do in IR to optimize icmp/select. That should be waiting to happen at a later optimization stage. The seeming regression in the fuzzer test was discussed in: D58359 We were only managing that fold in instcombine by luck, and other passes should be able to deal with that better anyway. llvm-svn: 354748	2019-02-24 16:57:45 +00:00
Matt Arsenault	65b4ab9921	BreakCriticalEdges: Update PostDominatorTree llvm-svn: 354673	2019-02-22 15:01:41 +00:00
Roman Tereshin	99a6672bba	[LowerSwitch][AMDGPU] Do not handle impossible values This patch adds LazyValueInfo to LowerSwitch to compute the range of the value being switched over and reduce the size of the tree LowerSwitch builds to lower a switch. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D58096 llvm-svn: 354670	2019-02-22 14:33:46 +00:00
Chijun Sima	70e97163e0	[DTU] Refine the interface and logic of applyUpdates Summary: This patch separates two semantics of `applyUpdates`: 1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update. 2. User provides a sequence of hints. Updates mentioned in this sequence might never have happened and may even be duplicated. Logic changes: Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example, ``` DTU(Lazy) and Edge A->B exists. 1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued 2. Remove A->B 3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended) ``` But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue. Interface changes: The second semantic of `applyUpdates` is separated to `applyUpdatesPermissive`. These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`. Reviewers: kuhar, brzycki, dmgreen, grosser Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58170 llvm-svn: 354669	2019-02-22 13:48:38 +00:00
Alina Sbirlea	90d2e3a16d	[MemorySSA & LoopPassManager] Resolve PR40038. The correct edge being deleted is not to the unswitched exit block, but to the original block before it was split. That's the key in the map, not the value. The insert is correct. The new edge is to the .split block. The splitting turns OriginalBB into: OriginalBB -> OriginalBB.split. Assuming the original CFG edge: ParentBB->OriginalBB, we must now delete ParentBB->OriginalBB, not ParentBB->OriginalBB.split. llvm-svn: 354656	2019-02-22 07:18:37 +00:00
Chijun Sima	f131d6110e	[DTU] Deprecate insertEdge/deleteEdge Summary: This patch converts all existing `insertEdge/deleteEdge` to `applyUpdates` and marks `insertEdge/deleteEdge` as deprecated. Reviewers: kuhar, brzycki Reviewed By: kuhar, brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58443 llvm-svn: 354652	2019-02-22 05:41:43 +00:00
Alina Sbirlea	97468e9282	[MemorySSA & LoopPassManager] Update MemorySSA in formDedicatedExitBlocks. MemorySSA is now updated when forming dedicated exit blocks. Resolves PR40037. llvm-svn: 354623	2019-02-21 21:13:34 +00:00
Alina Sbirlea	d2d3244363	[LoopSimplifyCFG] Update MemorySSA after r353911. Summary: MemorySSA is not properly updated in LoopSimplifyCFG after recent changes. Use SplitBlock utility to resolve that and clear all updates once handleDeadExits is finished. All updates that follow are removal of edges which are safe to handle via the removeEdge() API. Also, deleting dead blocks is done correctly as is, i.e. delete from MemorySSA before updating the CFG and DT. Reviewers: mkazantsev, rtereshin Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58524 llvm-svn: 354613	2019-02-21 19:54:05 +00:00
Alina Sbirlea	73446cd567	[EarlyCSE] Cleanup deadcode. [NFCI] Summary: Cleanup nop assignments. Reviewers: george.burgess.iv, davide Subscribers: sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58308 llvm-svn: 354612	2019-02-21 19:49:57 +00:00
Joey Gouly	fdf651ee8d	[InferAddressSpaces] Fix fallthrough error llvm-svn: 354580	2019-02-21 13:10:37 +00:00
Joey Gouly	92af1360f3	[InferAddressSpaces] Fix crash on select of non-ptr operands Check the operands of a select are pointers, to determine if it is an address expression or not. https://reviews.llvm.org/D58226 llvm-svn: 354576	2019-02-21 12:31:36 +00:00
Max Kazantsev	10489d76f6	[LoopSimplifyCFG] Add missing MSSA edge deletion When we create a fictive switch in the preheader, we should take care of MSSA and delete the edge between the old preheader and the header. llvm-svn: 354547	2019-02-21 05:51:29 +00:00
Wei Mi	500606f270	[Inliner] Pass nullptr for the ORE param of getInlineCost if RemarkEnabled is false. Right now for inliner and partial inliner, we always pass the address of a valid ORE object to getInlineCost even if RemarkEnabled is false because no -Rpass is specified. Since ComputeFullInlineCost will be set to true if ORE is non-null in getInlineCost, this introduces the problem that in getInlineCost we cannot return early even if we already know the cost is definitely higher than the threshold. It is a general problem for compile time. This patch fixes that by passing nullptr as the ORE argument if RemarkEnabled is false. Differential Revision: https://reviews.llvm.org/D58399 llvm-svn: 354542	2019-02-21 02:57:52 +00:00
Philip Reames	79d5e16f51	[GVN] Small tweaks to comments, style, and missed vector handling Noticed these while doing a final sweep of the code to make sure I hadn't missed anything in my last couple of patches. The (minor) missed optimization was noticed because of the stylistic fix to avoid an overly specific cast. llvm-svn: 354412	2019-02-20 00:31:28 +00:00
Philip Reames	a259dc3263	[GVN] Fix last crasher w/non-integral pointers Same case as for memset and memcpy, but this time for clobbering stores and loads. We still can't allow coercion to or from non-integrals, regardless of the transform. Now that I'm done with the whole little sequence, it seems apparent that we'd entirely missed reasoning about clobbers in the original GVN support for non-integral pointers. My apologies, I thought we'd upstreamed all of this, but it turns out we were still carrying a downstream hack which hid all of these issues. My thanks to Cherry Zhang for helping debug. llvm-svn: 354407	2019-02-20 00:15:54 +00:00
Philip Reames	952d234d00	[GVN] Fix a crash bug w/non-integral pointers and memtransfers Problem is very similar to the one fixed for memsets in r354399, we try to coerce a value to non-integral type, and then crash while trying to do so. Since we shouldn't be doing such coercions to start with, easy fix. From inspection, I see two other cases which look to be similar and will follow up with more test cases and fixes if confirmed. llvm-svn: 354403	2019-02-19 23:49:38 +00:00
Philip Reames	322eb7660e	[GVN] Fix a non-integral pointer bug w/vector types GVN generally doesn't forward structs or array types, but it will forward vector types to non-vectors and vice versa. As demonstrated in tests, we need to inhibit the same set of transforms for vector of non-integral pointers as for non-integral pointers themselves. llvm-svn: 354401	2019-02-19 23:19:51 +00:00
Philip Reames	92756a80e7	[GVN] Fix a crash bug around non-integral pointers If we encountered a location where we tried to forward the value of a memset to a load of a non-integral pointer, we crashed. Such a forward is not legal in general, but we can forward null pointers. Tests for both cases are included. llvm-svn: 354399	2019-02-19 23:07:15 +00:00
Sanjay Patel	c1e0184317	[InstCombine] reduce even more unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354393	2019-02-19 22:14:21 +00:00
Sanjay Patel	dcb93c0dda	[InstCombine] rearrange saturated add folds; NFC This is no-functional-change-intended, but that was also true when it was part of rL354276, and I managed to lose 2 predicates for the fold with constant...causing much bot distress. So this time I'm adding a couple of negative tests to avoid that. llvm-svn: 354384	2019-02-19 21:46:13 +00:00
Max Kazantsev	ebd95ea86e	[NFC] API for signaling that the current loop is being deleted We are planning to be able to delete the current loop in LoopSimplifyCFG in the future. Add API to notify the loop pass manager that it happened. llvm-svn: 354314	2019-02-19 11:14:05 +00:00
Max Kazantsev	30095d9795	[NFC] Store loop header in a local to keep it available after the loop is deleted llvm-svn: 354313	2019-02-19 11:13:58 +00:00
Sanjay Patel	8a35d339c9	Revert "[InstCombine] reduce even more unsigned saturated add with 'not' op" This reverts commit `079b610c29`. Bots are failing after this change on a stage 2 compile of clang. llvm-svn: 354277	2019-02-18 16:04:22 +00:00
Sanjay Patel	079b610c29	[InstCombine] reduce even more unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354276	2019-02-18 15:21:39 +00:00
Max Kazantsev	4561475e09	[NFC] Teach getInnermostLoopFor to walk up the loop trees This should be NFC in current use case of this method, but it will help to use it for solving more complex tasks in follow-up patches. llvm-svn: 354227	2019-02-17 18:21:51 +00:00
Sanjay Patel	b341ee7071	[InstCombine] reduce more unsigned saturated add with 'not' op We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: not op %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %notx, %y %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: not op ugt %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %y, %notx %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/niom (The matching here is still incomplete.) llvm-svn: 354224	2019-02-17 16:48:50 +00:00
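
The two "not op" folds in the last commit above (b341ee7071) can be sanity-checked by brute force. The following is only a minimal sketch, not part of the commit and not a replacement for the linked Alive proof: it assumes an 8-bit width instead of i32 so that every input pair can be enumerated, and all function names (`not_op_before`, `not_op_ugt_before`, `after`, `uaddsat_reference`, `count_mismatches`) are hypothetical helpers introduced for illustration.

```python
BITS = 8                 # assumption: i8 instead of i32, so all inputs can be enumerated
MASK = (1 << BITS) - 1   # all-ones value, i.e. the constant -1 viewed as unsigned


def not_op_before(x, y):
    """Source pattern of 'Name: not op'."""
    notx = x ^ MASK              # %notx = xor %x, -1
    a = (x + y) & MASK           # %a = add %x, %y (wrapping add)
    c = notx < y                 # %c = icmp ult %notx, %y
    return MASK if c else a      # %r = select %c, -1, %a


def not_op_ugt_before(x, y):
    """Source pattern of 'Name: not op ugt' (commuted compare)."""
    notx = x ^ MASK
    a = (x + y) & MASK
    c = y > notx                 # %c = icmp ugt %y, %notx
    return MASK if c else a


def after(x, y):
    """Target pattern shared by both folds: compare the sum against %y."""
    a = (x + y) & MASK
    c2 = a < y                   # %c2 = icmp ult %a, %y
    return MASK if c2 else a     # %r = select %c2, -1, %a


def uaddsat_reference(x, y):
    """Plain unsigned saturating add, used only as an independent reference."""
    return min(x + y, MASK)


def count_mismatches():
    """Count input pairs where any of the four forms disagree."""
    bad = 0
    for x in range(MASK + 1):
        for y in range(MASK + 1):
            results = {
                not_op_before(x, y),
                not_op_ugt_before(x, y),
                after(x, y),
                uaddsat_reference(x, y),
            }
            if len(results) != 1:
                bad += 1
    return bad


if __name__ == "__main__":
    print("mismatches:", count_mismatches())
```

The check works because `~x u< y` holds exactly when `x + y` wraps, and a wrapped sum is always unsigned-less-than either operand, so the rewritten compare no longer needs the `not`.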


21388 Commits