llvm-project

Commit Graph

Author	SHA1	Message	Date
Sjoerd Meijer	cdfc678572	[SCCPSolver] Fix use-after-free in markArgInFuncSpecialization In SCCPSolver::markArgInFuncSpecialization, the ValueState map may be reallocated after the initial ValueLatticeElement reference is grabbed, but before its use in copy initialization. This causes a use-after-free. To fix this, this commit changes the behavior to create the new ValueLatticeElement before assigning the old one to it. Patch by: https://github.com/duck-37/ Differential Revision: https://reviews.llvm.org/D111112	2021-10-05 12:56:32 +01:00
Bjorn Pettersson	7f84fa4ad4	[TargetLibraryInfo] Refactor size_t checks in isValidProtoForLibFunc. NFC In TargetLibraryInfoImpl::isValidProtoForLibFunc we no longer need the IsSizeTTy lambda function and the SizeTTy object. Instead we just follow the regular structure of checking for integer types given an exepected number of bits.	2021-10-04 15:46:39 +02:00
Jay Foad	a9bceb2b05	[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807	2021-10-04 08:57:44 +01:00
Dávid Bolvanský	5f2f611880	Fixed more warnings in LLVM produced by -Wbitwise-instead-of-logical	2021-10-03 13:58:10 +02:00
Arthur Eubanks	a7b4ce9cfd	[NFC][AttributeList] Replace index_begin/end with an iterator We expose the fact that we rely on unsigned wrapping to iterate through all indexes. This can be confusing. Rather, keeping it as an implementation detail through an iterator is less confusing and is less code. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D110885	2021-10-01 10:17:41 -07:00
Kazu Hirata	4f0225f6d2	[Transforms] Migrate from getNumArgOperands to arg_size (NFC) Note that getNumArgOperands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.	2021-10-01 09:57:40 -07:00
Krasimir Georgiev	685f1bfd0a	Revert "[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns" It appears to cause stage2 clang build failures, e.g., https://lab.llvm.org/buildbot/#/builders/74/builds/7145. This reverts commit `1fb37334bd`.	2021-10-01 11:39:43 +02:00
David Sherwood	1fb37334bd	[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns This patch adds further support for vectorisation of loops that involve selecting an integer value based on a previous comparison. Consider the following C++ loop: int r = a; for (int i = 0; i < n; i++) { if (src[i] > 3) { r = b; } src[i] += 2; } We should be able to vectorise this loop because all we are doing is selecting between two states - 'a' and 'b' - both of which are loop invariant. This just involves building a vector of values that contain either 'a' or 'b', where the final reduced value will be 'b' if any lane contains 'b'. The IR generated by clang typically looks like this: %phi = phi i32 [ %a, %entry ], [ %phi.update, %for.body ] ... %pred = icmp ugt i32 %val, i32 3 %phi.update = select i1 %pred, i32 %b, i32 %phi We already detect min/max patterns, which also involve a select + cmp. However, with the min/max patterns we are selecting loaded values (and hence loop variant) in the loop. In addition we only support certain cmp predicates. This patch adds a new pattern matching function (isSelectCmpPattern) and new RecurKind enums - SelectICmp & SelectFCmp. We only support selecting values that are integer and loop invariant, however we can support any kind of compare - integer or float. Tests have been added here: Transforms/LoopVectorize/AArch64/sve-select-cmp.ll Transforms/LoopVectorize/select-cmp-predicated.ll Transforms/LoopVectorize/select-cmp.ll Differential Revision: https://reviews.llvm.org/D108136	2021-10-01 08:41:03 +01:00
Kazu Hirata	f631173d80	[llvm] Migrate from arg_operands to args (NFC) Note that arg_operands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.	2021-09-30 08:51:21 -07:00
Anna Thomas	452714f8f8	[BPI] Keep BPI available in loop passes through LoopStandardAnalysisResults This is analogous to D86156 (which preserves "lossy" BFI in loop passes). Lossy means that the analysis preserved may not be up to date with regards to new blocks that are added in loop passes, but BPI will not contain stale pointers to basic blocks that are deleted by the loop passes. This is achieved through BasicBlockCallbackVH in BPI, which calls eraseBlock that updates the data structures in BPI whenever a basic block is deleted. This patch does not have any changes in the upstream pipeline, since none of the loop passes in the pipeline use BPI currently. However, since BPI wasn't previously preserved in loop passes, the loop predication pass was invoking BPI on the entire function every time it ran in an LPM. This caused massive compile time in our downstream LPM invocation which contained loop predication. See updated test with an invocation of a loop-pipeline containing loop predication and -debug-pass turned ON. Reviewed-By: asbirlea, modimo Differential Revision: https://reviews.llvm.org/D110438	2021-09-30 10:27:05 -04:00
Djordje Todorovic	f8dfc35256	NFC: [Debugify] Fix a typo when checking variables in the original mode	2021-09-29 04:35:10 -07:00
Adrian Prantl	1b998a5f0c	Add salvageDebugInfo support for truncating/extending ptr/int conversions. This patch enables debug info salvaging for truncating/extending ptr int conversions. The testcase uncovered a bug in adce, which is addressed separately. rdar://80227769 Differential Revision: https://reviews.llvm.org/D110461	2021-09-28 10:24:50 -07:00
Congzhe Cao	c42772752a	[CodeMoverUtils] Enhance isSafeToMoveBefore() when control flow equivalence is satisfied With improved analysis in determining CFG equivalence that does not require strict dominance and post-dominance conditions, we now relax isSafeToMoveBefore() such that an instruction I can be moved before InsertPoint even if they do not strictly dominate each other, as long as they follow the same control flow path. For example, we can move Instruction 0 before Instruction 1, and vice versa. ``` if (cond1) // Instruction 0: %add = add i32 1, 2 if (cond1) // Instruction 1: %add2 = add i32 2, 1 ``` Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D110456	2021-09-27 18:37:36 -04:00
Jun Ma	3a998c06a8	Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values.""" This reverts commit `8ba2adcf9e`.	2021-09-27 20:39:05 +08:00
Congzhe Cao	751be2a064	[CodeMoverUtils] Enhance isSafeToMoveBefore() when moving BBs When moving an entire basic block BB before InsertPoint, currently we check for all instructions whether the operands dominates InsertPoint, however, this can be improved such that even an operand does not dominate InsertPoint, as long as it appears as a previous instruction in the same BB, it is safe to move. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D110378	2021-09-24 05:48:15 -04:00
Fangrui Song	1a6e1ee42a	Resolve {GlobalValue,GloalIndirectSymol}::getBaseObject confusion While both GlobalAlias and GlobalIFunc are GlobalIndirectSymbol, their `getIndirectSymbol()` usage is quite different (GlobalIFunc's resolver is an entity different from GlobalIFunc itself). As discussed on https://lists.llvm.org/pipermail/llvm-dev/2020-September/144904.html ("[IR] Modelling of GlobalIFunc"), the name `getBaseObject` is confusing when used with GlobalIFunc. To resolve the confusion: * Move GloalIndirectSymol::getBaseObject to GlobalAlias:: (GlobalIFunc should use `getResolver` instead) * Change GlobalValue::getBaseObject not to inspect GlobalIFunc. Note: the function has 7 references. * Add GlobalIFunc::getResolverFunction to peel off potential ConstantExpr indirection (`strlen` in `test/LTO/Resolution/X86/ifunc.ll`) Note: GlobalIFunc::getResolver (like GlobalAlias::getAliasee which does not peel off ConstantExpr indirection) is kept to be used by ValueEnumerator. Reviewed By: ibookstein Differential Revision: https://reviews.llvm.org/D109792	2021-09-23 09:23:35 -07:00
Simon Pilgrim	5f2c53bdf4	Pass some DataLayout arguments by const-ref Avoid unnecessary copies, reported by MSVC static analyzer.	2021-09-23 15:50:31 +01:00
Bjorn Pettersson	85a586501b	[BasicBlockUtils] Fixup of an assumed typo in MergeBlockIntoPredecessor The NFC commit `e5692a564a` changed the logic for DomTreeUpdates to use the range [succ_begin, succ_begin) when looking for SuccsOfPredBB rather than using [succ_begin, succ_end). As the commit was NFC this is identified as a typo (it has been discussed briefly in phabricator). The typo was found when inspecting the code, so I've got no idea if changing back to the old range has any significant impact (such as solving any PR:s or causing some new problems). But at least this restores the code to the originally indented behavior.	2021-09-23 13:03:26 +02:00
Arthur Eubanks	e7249e4acf	[SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest When determining whether to fold branches to a common destination by merging two blocks, SimplifyCFG will count the number of instructions to be moved into the first basic block. However, there's no reason to count free instructions like bitcasts and other similar instructions. This resolves missed branch foldings with -fstrict-vtable-pointers in llvm-test-suite's lambda benchmark. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D108837	2021-09-22 09:52:37 -07:00
Max Kazantsev	073b254cff	[SimplifyCFG] Redirect switch cases that lead to UB into an unreachable block When following a case of a switch instruction is guaranteed to lead to UB, we can safely break these edges and redirect those cases into a newly created unreachable block. As result, CFG will become simpler and we can remove some of Phi inputs to make further analyzes easier. Patch by Dmitry Bakunevich! Differential Revision: https://reviews.llvm.org/D109428 Reviewed By: lebedev.ri	2021-09-21 10:45:19 +07:00
Nikita Popov	0fc624f029	[IR] Return AAMDNodes from Instruction::getMetadata() (NFC) getMetadata() currently uses a weird API where it populates a structure passed to it, and optionally merges into it. Instead, we can return the AAMDNodes and provide a separate merge() API. This makes usages more compact. Differential Revision: https://reviews.llvm.org/D109852	2021-09-16 21:06:57 +02:00
Arthur Eubanks	d49cb5b303	[SimplifyCFG] Add bonus when seeing vector ops to branch fold to common dest This makes some tests in vector-reductions-logical.ll more stable when applying D108837. The cost of branching is higher when vector ops are involved due to potential SLP transformations. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D108935	2021-09-16 10:50:36 -07:00
Kazu Hirata	24c8eaec94	[Transforms] Use make_early_inc_range (NFC)	2021-09-15 19:55:24 -07:00
Owen Anderson	68079ef0eb	Teach SimplifyCFG to fold switches into lookup tables in more cases. In particular, it couldn't handle cases where lookup table constant expressions involved bitcasts. This does not seem to come up frequently in C++, but comes up reasonably often in Rust via `#[derive(Debug)]`. Originally reported by pcwalton. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D109565	2021-09-15 22:07:08 +00:00
Anna Thomas	b6cb03e6b9	Revert use of getUniqueUndroppableUser in AssumeBundleBuilder Fix build bot failure in rG4ac4e521 caused due to assumeBundleBuilder using new API (getUniqueUndroppableUser). We now continue using the existing API for AssumeBundleBuilder (getSingleUndroppableUser). Sorry for the noise here. Tests-Run: failing testcase passes.	2021-09-15 17:45:09 -04:00
Anna Thomas	4ac4e52189	[InstCombine] Improve TryToSinkInstruction with multiple uses This patch allows sinking an instruction which can have multiple uses in a single user. We were previously over-restrictive by looking for exactly one use, rather than one user. Also, the API for retrieving undroppable user has been updated accordingly since in both usecases (Attributor and InstCombine), we seem to care about the user, rather than the use. Reviewed-By: nikic Differential Revision: https://reviews.llvm.org/D109700	2021-09-15 20:39:38 +00:00
Markus Lavin	1ac209ed76	[NPM] Added -print-pipeline-passes print params for a few passes. Added '-print-pipeline-passes' printing of parameters for those passes declared with _WITH_PARAMS macro in PassRegistry.def. Note that it only prints the parameters declared inside _WITH_PARAMS as in a few cases there appear to be additional parameters not parsable. The following passes are now covered (i.e. all of those with *_WITH_PARAMS in PassRegistry.def). LoopExtractorPass - loop-extract HWAddressSanitizerPass - hwsan EarlyCSEPass - early-cse EntryExitInstrumenterPass - ee-instrument LowerMatrixIntrinsicsPass - lower-matrix-intrinsics LoopUnrollPass - loop-unroll AddressSanitizerPass - asan MemorySanitizerPass - msan SimplifyCFGPass - simplifycfg LoopVectorizePass - loop-vectorize MergedLoadStoreMotionPass - mldst-motion GVN - gvn StackLifetimePrinterPass - print<stack-lifetime> SimpleLoopUnswitchPass - simple-loop-unswitch Differential Revision: https://reviews.llvm.org/D109310	2021-09-15 08:34:04 +02:00
Kazu Hirata	abca4c012f	[Utils] Use make_early_inc_range (NFC)	2021-09-13 08:57:23 -07:00
Johannes Doerfert	c09fbbdcfb	Reapply "[GlobalOpt][FIX] Do not embed initializers into AS!=0 globals"" This reapplies commit `7dbba3376f`, or, put differently, this reverts commit `d9a8d20827`. The test now requires the amdgpu and nvptx backend explicitly as it won't work without properly.	2021-09-10 15:22:56 -05:00
Johannes Doerfert	d9a8d20827	Revert "[GlobalOpt][FIX] Do not embed initializers into AS!=0 globals" This reverts commit `7dbba3376f`. There seems to be a problem with the tests, investigating now: https://lab.llvm.org/buildbot/#/builders/61/builds/14574	2021-09-10 12:23:08 -05:00
Johannes Doerfert	7dbba3376f	[GlobalOpt][FIX] Do not embed initializers into AS!=0 globals Not all address spaces support initializers for globals and we can therefore not set them without checking if they are allowed. This patch adds a hook into TTI to check if an AS allows non-undef initializers. We disable it for all but address space 0 by default, NVPTX and AMDGPU targets allow all but address space 3. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D109337	2021-09-10 12:08:50 -05:00
Chris Lattner	735f46715d	[APInt] Normalize naming on keep constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Kazu Hirata	92c9ff6d5f	[IR, Transforms] Use arg_empty (NFC)	2021-09-09 08:50:10 -07:00
Roman Lebedev	909cba9699	[SimplifyCFG] performBranchToCommonDestFolding(): require block-closed SSA form for bonus instructions (PR51125) I can't seem to wrap my head around the proper fix here, we should be fine without this requirement, iff we can form this form, but the naive attempt (https://reviews.llvm.org/D106317) has failed. So just to unblock the release, put up a restriction. Fixes https://bugs.llvm.org/show_bug.cgi?id=51125	2021-09-09 12:28:09 +03:00
Jun Ma	8ba2adcf9e	Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."" Differential Revision: https://reviews.llvm.org/D106056	2021-09-09 16:53:33 +08:00
Andrew Litteken	144cd22bae	[CodeExtractor] Creating exit stubs based off original order branch instructions. Previously the CodeExtractor created exit stubs, and the subsequent return value of the outlined function based on the order of out-of-region blocks after splitting any phi nodes, and collecting the blocks to be outlined. This could cause differences in order if there was a difference of exit block phi nodes between the two regions. This patch moves the collection of the output target blocks to be before this occurs, so that the assignment of target block to output value will be the same, regardless of the contents of the output block. Reviewers: paquette, roelofs Differential Revision: https://reviews.llvm.org/D108657	2021-09-08 15:15:15 -07:00
Nikita Popov	6dfdc6bfd2	[SROA] Support opaque pointers Make the following changes in order to support opaque pointers in SROA: * Generate i8 GEPs for opaque pointers. * Explicitly enforce that promotable allocas only have stores of the alloca type -- previously this was implicitly enforced. * Replace a check for pointer element type with load/store type. Differential Revision: https://reviews.llvm.org/D109259	2021-09-08 22:25:44 +02:00
Akira Hatanaka	dea6f71af0	[ObjC][ARC] Use the addresses of the ARC runtime functions instead of integer 0/1 for the operand of bundle "clang.arc.attachedcall" https://reviews.llvm.org/D102996 changes the operand of bundle "clang.arc.attachedcall". This patch makes changes to llvm that are needed to handle the new IR. This should make it easier to understand what the IR is doing and also simplify some of the passes as they no longer have to translate the integer values to the runtime functions. Differential Revision: https://reviews.llvm.org/D103000	2021-09-08 11:58:03 -07:00
Max Kazantsev	29d054bf12	[SimplifyCFG] Preserve knowledge about guarding condition by adding assume This improvement adds "assume" after removal of branch basing on UB in successor block. Consider the following example: ``` pred: x = ... cond = x > 10 br cond, bb, other.succ bb: phi [nullptr, pred], ... // other possible preds load(phi) // UB if we came from pred other.succ: // here we know that x <= 10, but this knowledge is lost // after the branch is turned to unconditional unless we // preserve it with assume. ``` If we remove the branch basing on knowledge about UB in a successor block, then the fact that x <= 10 is other.succ might be lost if this condition is not inferrable from any dominating condition. To preserve this knowledge, we can add assume intrinsic with (possibly inverted) branch condition. Patch by Dmitry Bakunevich! Differential Revision: https://reviews.llvm.org/D109054 Reviewed By: lebedev.ri	2021-09-08 14:05:17 +07:00
Andy Kaylor	34528c32d2	Copy Elementtype Attribute to IR at Link step Copying IR during linking causes a type mismatch due to the field being missing in IRMover/Valuemapper. Adds the full range of typed attributes including elementtype attribute in the copy functions. Patch by Chenyang Liu Differential Revision: https://reviews.llvm.org/D108796	2021-09-07 11:41:43 -07:00
Kazu Hirata	5648f7170e	[Analysis, Target, Transforms] Construct SmallVector with iterator ranges (NFC)	2021-09-07 09:19:33 -07:00
Dávid Bolvanský	9c476172b9	[InstCombine] stpcpy(d,s) -> strcpy(d,s) if the result is not used	2021-09-05 12:12:07 +02:00
Philip Reames	fa82a3d016	[runtimeunroll] Support epilogue unrolling with a parent loop This patch adds support for unrolling inner loops using epilogue unrolling. The basic issue is that the original latch exit block of the inner loop could be outside the outer loop. When we clone the inner loop and split the latch exit, the cloned blocks need to be in the outer loop. Differential Revision: https://reviews.llvm.org/D108476	2021-09-02 16:29:20 -07:00
Philip Reames	45c672e20d	[runtimeunroll] Under EXPENSIVE_CHECKS, validate loop info Requested in review comment on D108476	2021-09-02 16:28:46 -07:00
Nikita Popov	c86e1ce73b	[SCEVExpander] Simplify pointer overflow check This is a followup to D104662 to generate slightly nicer code for pointer overflow checks. Bypass expandAddToGEP and instead explicitly generate i8 GEPs. This saves some bitcasts and negates the value in a more obvious way. In particular, this prevents SCEV from looking through the umul.with.overflow, same as in the integer case. The wrapping-pointer-ni.ll test deserves a comment: Previously, this generated a typed GEP which used the umulo argument rather than the multiplication result. This results in more compact IR in that case, but effectively does the multiplication twice, the second one is just hidden in the GEP. Reusing the umulo result seems pretty reasonable to me. Differential Revision: https://reviews.llvm.org/D109093	2021-09-02 20:15:59 +02:00
Philip Reames	c3b3aa277a	Fix a missing MemorySSA update in breakLoopBackedge This is a case I'd missed in 6a8237. The odd bit here is that missing the edge removal update seems to produce MemorySSA which verifies, but is still corrupt in a way which bothers following passes. I wasn't able to reduce a single pass test case, which is why the reported test case is taken as is. Differential Revision: https://reviews.llvm.org/D109068	2021-09-01 16:59:01 -07:00
Philip Reames	e735f2bf37	[SCEVExpander] Prefer pointer expansion for overflow checks We'd special cased this logic to use pointer types for non-integral pointers, but there's no reason we can't do that for all pointer types. Doing it this was has a few advantages: a) The code itself becomes more straight forward, and easier to test. b) We avoid introducing ptrtoint into programs which didn't have them in the source. c) The resulting codegen is easier to analyze and simplify (mostly due to lack of ptrtoint). Note that there are some test diffs, but a) running them through instcombine helps a ton, and b) there's enough missing obvious transforms on both before and after IR that it's clear this isn't performance sensitive. This is mostly motivated by cleaning up mentions of non-integrals to have a clearer idea of what we actually need to support. Differential Revision: https://reviews.llvm.org/D104662	2021-09-01 13:11:25 -07:00
Arthur Eubanks	52e6d70c40	[NFC] Use newly introduced *AtIndex methods Introduced in D108788. These are clearer.	2021-09-01 11:18:41 -07:00
Philip Reames	b604fcb7bc	[runtime] Move prolog/epilog block to a post-simplify strategy The runtime unroller will try to produce a non-loop if the unroll count is 2 and thus the prolog/epilog loop would only run at most one iteration. The old implementation did this by avoiding loop construction entirely. This patches instead constructs the trivial loop and then explicitly breaks the backedge and simplifies. This does result in some additional code churn when triggered, but a) results in better quality code and b) removes a codepath which didn't work properly for multiple exit epilogs. One oddity that I want to draw to reviewer attention is that this somehow changes revisit order. The new order looks equivalent to me, but I don't understand how creating and erasing an extra loop here creates this effect. Differential Revision: https://reviews.llvm.org/D108521	2021-08-31 09:29:36 -07:00
Doug Beck	ed6cff667e	Fix typo s/beloinging/belonging Differential Revision: https://reviews.llvm.org/D107099	2021-08-31 12:01:50 +05:30
Roman Lebedev	795d142d23	[NFCI][IndVars] rewriteLoopExitValues(): don't expand SCEV's until needed Previously, we'd expand ALL the SCEV's eagerly, because we needed to check with `isValidRewrite()`, and discard bad rewrite candidates, but now that we do not do that, we also don't need to always expand. In particular, this avoids expanding potentially-huge SCEV's that we would discard anyways because they are high-cost and we aren't rewriting aggressively.	2021-08-30 12:28:24 +03:00
Roman Lebedev	7b0d59da9a	[IndVars] Drop check for the validity of rewrite `isValidRewrite()` checks that the both the original SCEV, and the rewrite SCEV have the same base pointer. I //believe//, after all the recent SCEV improvements, this invariant is already enforced by SCEV itself. I originally tried changing it into an assert in D108043, but that showed that it triggers on e.g. https://reviews.llvm.org/D108043#2946621, where SCEV manages to forward the store to load, test added. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D108655	2021-08-30 12:06:58 +03:00
Nikita Popov	9f7873784d	[SCEVExpander] Reuse removePointerBase() for canonical addrecs ExposePointerBase() in SCEVExpander implements basically the same functionality as removePointerBase() in SCEV, so reuse it. The SCEVExpander code assumes that the pointer operand on adds is the last one -- I'm not sure that always holds. As such this might not be strictly NFC.	2021-08-29 21:12:35 +02:00
Nikita Popov	0886fd5b3a	[SCEVExpander] Remove unnecessary mul/udiv check (NFC) Pointer-typed SCEV expressions can no longer be mul or udiv, so we do not need to specially handle them here.	2021-08-29 20:47:00 +02:00
Nikita Popov	3f162e8e6d	[SCEVExpander] Assert single pointer op in add (NFC) There can only be one pointer operand in an add expression, and we have sorted operands to guarantee that it is the first. As such, the pointer check for other operands is dead code.	2021-08-29 20:30:56 +02:00
Philip Reames	6a82376012	Special case common branch patterns in breakLoopBackedge (try 2) Changes since aec08e: * Adjust placement of a closing brace so that the general case actually runs. Turns out we had no coverage of the switch case. I added one in `eae90fd`. * Drop .llvm.loop.* metadata from the new branch as there is no longer a loop to annotate. Original commit message: This special cases an unconditional latch and a conditional branch latch exit to improve codegen and test readability. I am hoping to reuse this function in the runtime unroll code, but without this change, the test diffs are far too complex to assess.	2021-08-27 10:27:16 -07:00
Andrew Litteken	9d2c859ebb	[CodeExtractor] Making the arguments outlined easier to access from the outside The Code Extractor does not provide an easy mechanism for determining the inputs and outputs after extraction has occurred, this patch gives the ability to pass in empty SetVectors to be filled with the inputs and outputs if they need to be analyzed. Added Tests: - InputOutputMonitoring in unittests/Transforms/Utils/CodeExtractorTests.cpp Reviewers: paquette Differential Revision: https://reviews.llvm.org/D106991	2021-08-26 09:47:53 -07:00
Vyacheslav Zakharin	2e192ab1f4	[CodeExtractor] Preserve topological order for the return blocks. Differential Revision: https://reviews.llvm.org/D108673	2021-08-25 08:09:01 -07:00
Florian Hahn	90d09eb300	[LoopPeel] Allow peeling with multiple unreachable-terminated exit blocks. Support for peeling with multiple exit blocks was added in D63921/77bb3a486fa6. So far it has only been enabled for loops where all non-latch exits are 'de-optimizing' exits (D63923). But peeling of multi-exit loops can be highly beneficial in other cases too, like if all non-latch exiting blocks are unreachable. The motivating case are loops with runtime checks, like the C++ example below. The main issue preventing vectorization is that the invariant accesses to load the bounds of B is conditionally executed in the loop and cannot be hoisted out. If we peel off the first iteration, they become dereferenceable in the loop, because they must execute before the loop is executed, as all non-latch exits are terminated with unreachable. This subsequently allows hoisting the loads and runtime checks out of the loop, allowing vectorization of the loop. int sum(std::vector<int> A, std::vector<int> B, int N) { int cost = 0; for (int i = 0; i < N; ++i) cost += A->at(i) + B->at(i); return cost; } This gives a ~20-30% increase of score for Geekbench5/HDR on AArch64. Note that this requires a follow-up improvement to the peeling cost model to actually peel iterations off loops as above. I will share that shortly. Also, peeling of multi-exits might be beneficial for exit blocks with other terminators, but I would like to keep the scope limited to known high-reward cases for now. I removed the option to disable peeling for multi-deopt exits because the code is more general now. Alternatively, the option could also be generalized, but I am not sure if there's much value in the option? Reviewed By: reames Differential Revision: https://reviews.llvm.org/D108108	2021-08-25 13:26:40 +01:00
Philip Reames	1e07f19bfc	Revert "Special case common branch patterns in breakLoopBackedge" This reverts commit `aec08e8600`. Several problems have been reported with malformed loopinfo after this change, see discussion on https://reviews.llvm.org/rGaec08e86004b.	2021-08-24 08:53:42 -07:00
Philip Reames	d8d84c9df8	[runtimeunroll] Use early return to reduce nesting [nfc]	2021-08-22 11:34:50 -07:00
Philip Reames	aec08e8600	Special case common branch patterns in breakLoopBackedge This special cases an unconditional latch and a conditional branch latch exit to improve codegen and test readability. I am hoping to reuse this function in the runtime unroll code, but without this change, the test diffs are far too complex to assess.	2021-08-22 10:42:23 -07:00
Alexander Potapenko	b0391dfc73	[clang][Codegen] Introduce the disable_sanitizer_instrumentation attribute The purpose of __attribute__((disable_sanitizer_instrumentation)) is to prevent all kinds of sanitizer instrumentation applied to a certain function, Objective-C method, or global variable. The no_sanitize(...) attribute drops instrumentation checks, but may still insert code preventing false positive reports. In some cases though (e.g. when building Linux kernel with -fsanitize=kernel-memory or -fsanitize=thread) the users may want to avoid any kind of instrumentation. Differential Revision: https://reviews.llvm.org/D108029	2021-08-20 14:01:06 +02:00
Roman Lebedev	5d4f37e895	[NFCI][SimplifyCFG] Rewrite `createUnreachableSwitchDefault()` The only thing that function should do as per it's semantic, is to ensure that the switch's default is a block consisting only of an `unreachable` terminator. So let's just create such a block and update switch's default to point to it. There should be no need for all this weird dance around predecessors/successors.	2021-08-20 13:28:08 +03:00
Akira Hatanaka	898dc4590c	Refactor inlineRetainOrClaimRVCalls. NFC This is in preparation for committing https://reviews.llvm.org/D103000.	2021-08-19 14:55:45 -07:00
Arthur Eubanks	44a3241f10	[NFC] Replace some attribute methods that use confusing indexes	2021-08-19 14:10:26 -07:00
Philip Reames	17b9cb1817	[runtimeunroll] Support multiple exits to latch exit w/prolog loop This patch extends the runtime unrolling infrastructure to support unrolling a loop with multiple exiting blocks branching to the same exit block used by the latch. It intentionally does not include a cost model change to enable this functionality unless appropriate force flags are used. This is the prolog companion to D107381. Since this was LGTMed, a problem with DT updating was reported against that patch. I roled in the analogous fix here as it seemed obvious, and not worth re-review. As an aside, our prolog form leaves a lot of potential value on the floor when there is an invariant load or invariant condition in the loop being runtime unrolled. We should probably consider a "required prolog" heuristic. (Alternatively, maybe we should be peeling these cases more aggressively?) Differential Revision: https://reviews.llvm.org/D108262	2021-08-19 11:43:52 -07:00
Philip Reames	447256f22b	[runtimeunroll] Fix reported DT verification error after `94d0914` In `94d0914`, I added support for unrolling of multiple exit loops which have multiple exits reaching the latch. Per reports on the review post commit, I'd missed updating the domtree for one case. This fix addresses that ommission. There's no new test as this is covered by existing tests with expensive verification turned on.	2021-08-19 11:06:17 -07:00
Arthur Eubanks	33d44b762e	[OpaquePtr][Inline] Use byval type instead of pointee type Reviewed By: #opaque-pointers, dblaikie Differential Revision: https://reviews.llvm.org/D105711	2021-08-19 09:56:08 -07:00
Sanjay Patel	ec54e275f5	Revert "[CVP] processSwitch: Remove default case when switch cover all possible values." This reverts commit `9934a5b2ed`. This patch may cause miscompiles because it missed a constraint as shown in the examples from: https://llvm.org/PR51531	2021-08-19 08:43:51 -04:00
Arthur Eubanks	fde0eb1f9a	[NFC] A couple more removeAttribute() cleanups	2021-08-18 11:15:20 -07:00
Arthur Eubanks	3f4d00bc3b	[NFC] More get/removeAttribute() cleanup	2021-08-17 21:05:41 -07:00
Arthur Eubanks	de0ae9e89e	[NFC] Cleanup more AttributeList::addAttribute()	2021-08-17 21:05:41 -07:00
Arthur Eubanks	ad727ab7d9	[NFC] Migrate some callers away from Function/AttributeLists methods that take an index These methods can be confusing.	2021-08-17 21:05:40 -07:00
Arthur Eubanks	46cf82532c	[NFC] Replace Function handling of attributes with less confusing calls To avoid magic constants and confusing indexes.	2021-08-17 21:05:40 -07:00
Jun Ma	9934a5b2ed	[CVP] processSwitch: Remove default case when switch cover all possible values. Differential Revision: https://reviews.llvm.org/D106056	2021-08-18 10:23:13 +08:00
Philip Reames	94d0914292	[runtimeunroll] Support multiple exits to latch exit w/epilogue loop This patch extends the runtime unrolling infrastructure to support unrolling a loop with multiple exiting blocks branching to the same exit block used by the latch. It intentionally does not include a cost model change to enable this functionality unless appropriate force flags are used. I decided to restrict this to the epilogue case. Given the changes ended up being pretty generic, we may be able to unblock the prolog case too, but I want to do that in a separate change to reduce the amount of code we all have to understand at one time. Differential Revision: https://reviews.llvm.org/D107381	2021-08-17 17:52:04 -07:00
Philip Reames	982da7a20c	[SCEVExpander] Stop hoisting IR when reusing phis his is a fix for PR43678, and is an alternate patch to D105723. The basic issue we're running into is that LSR + SCEVExpander are moving the very instruction whose operand we're in the process of expanding. This breaks the subtle and ill-documented invariant which let LSR work. (Full story can be found here: https://reviews.llvm.org/D105723#2878473) Rather than attempting a fix, this change just removes the optimization entirely. The code is entirely untested, and removing it appears to have no impact I can find. This code was added back in 2014 by `1e12f8563d` with a single test which does not seem to actually test the hoisting logic. From a philosophical standpoint, it also seems very strange to have the expander implementing optimizations which should live in a dedicated transform pass. Differential Revision: https://reviews.llvm.org/D106178	2021-08-17 09:38:32 -07:00
Arthur Eubanks	0d822da2bd	[NFC] Remove/replace some confusing attribute getters on Function	2021-08-16 16:12:37 -07:00
Nikita Popov	735a590471	[MemorySSA] Remove -enable-mssa-loop-dependency option This option has been enabled by default for quite a while now. The practical impact of removing the option is that MSSA use cannot be disabled in default pipelines (both LPM and NPM) and in manual LPM invocations. NPM can still choose to enable/disable MSSA using loop vs loop-mssa. The next step will be to require MSSA for LICM and drop the AST-based implementation entirely. Differential Revision: https://reviews.llvm.org/D108075	2021-08-16 20:59:37 +02:00
Nikita Popov	570c9beb8e	[MemorySSA] Remove unnecessary MSSA dependencies LoopLoadElimination, LoopVersioning and LoopVectorize currently fetch MemorySSA when construction LoopAccessAnalysis. However, LoopAccessAnalysis does not actually use MemorySSA and we can pass nullptr instead. This saves one MemorySSA calculation in the default pipeline, and thus improves compile-time. Differential Revision: https://reviews.llvm.org/D108074	2021-08-16 20:40:55 +02:00
Roman Lebedev	febcedf18c	Revert "[NFCI][IndVars] rewriteLoopExitValues(): nowadays SCEV should not change `GEP` base pointer" https://bugs.llvm.org/show_bug.cgi?id=51490 was filed. This reverts commit `35a8bdc775`.	2021-08-16 14:30:29 +03:00
David Sherwood	9b19b77883	[NFC] Remove unused code in llvm::createSimpleTargetReduction	2021-08-16 09:50:45 +01:00
Roman Lebedev	2eb554a9fe	Revert "Reland [SimplifyCFG] performBranchToCommonDestFolding(): form block-closed SSA form before cloning instructions (PR51125)" This is still wrong, as failing bots suggest. This reverts commit `3d9beefc7d`.	2021-08-16 11:07:42 +03:00
Roman Lebedev	3d9beefc7d	Reland [SimplifyCFG] performBranchToCommonDestFolding(): form block-closed SSA form before cloning instructions (PR51125) ... with test change this time. LLVM IR SSA form is "implicit" in `@pr51125`. While is a valid LLVM IR, and does not require any PHI nodes, that completely breaks the further logic in `CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses()` that updates the live-out uses of the bonus instructions. What i believe we need to do, is to first make the SSA form explicit, by inserting tautological PHI nodes, and rewriting the offending uses. ``` $ /builddirs/llvm-project/build-Clang12/bin/opt -load /repositories/alive2/build-Clang-release/tv/tv.so -load-pass-plugin /repositories/alive2/build-Clang-release/tv/tv.so -tv -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -bonus-inst-threshold=10 -tv -o /dev/null /tmp/test.ll ---------------------------------------- @global_pr51125 = global 4 bytes, align 4 define i32 @pr51125() { %entry: br label %L %L: %ld = load i32, * @global_pr51125, align 4 %iszero = icmp eq i32 %ld, 0 br i1 %iszero, label %exit, label %L2 %L2: store i32 4294967295, * @global_pr51125, align 4 %cmp = icmp eq i32 %ld, 4294967295 br i1 %cmp, label %L, label %exit %exit: %r = phi i32 [ %ld, %L2 ], [ %ld, %L ] ret i32 %r } => @global_pr51125 = global 4 bytes, align 4 define i32 @pr51125() { %entry: %ld.old = load i32, * @global_pr51125, align 4 %iszero.old = icmp eq i32 %ld.old, 0 br i1 %iszero.old, label %exit, label %L2 %L2: %ld2 = phi i32 [ %ld.old, %entry ], [ %ld, %L2 ] store i32 4294967295, * @global_pr51125, align 4 %cmp = icmp ne i32 %ld2, 4294967295 %ld = load i32, * @global_pr51125, align 4 %iszero = icmp eq i32 %ld, 0 %or.cond = select i1 %cmp, i1 1, i1 %iszero br i1 %or.cond, label %exit, label %L2 %exit: %ld1 = phi i32 [ poison, %L2 ], [ %ld.old, %entry ] %r = phi i32 [ %ld2, %L2 ], [ %ld.old, %entry ] ret i32 %r } Transformation seems to be correct! ``` Fixes https://bugs.llvm.org/show_bug.cgi?id=51125 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D106317	2021-08-15 19:16:04 +03:00
Roman Lebedev	60dd0121c9	Revert "[SimplifyCFG] performBranchToCommonDestFolding(): form block-closed SSA form before cloning instructions (PR51125)" Forgot to stage the test change. This reverts commit `78af5cb213`.	2021-08-15 19:15:09 +03:00
Roman Lebedev	78af5cb213	[SimplifyCFG] performBranchToCommonDestFolding(): form block-closed SSA form before cloning instructions (PR51125) LLVM IR SSA form is "implicit" in `@pr51125`. While is a valid LLVM IR, and does not require any PHI nodes, that completely breaks the further logic in `CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses()` that updates the live-out uses of the bonus instructions. What i believe we need to do, is to first make the SSA form explicit, by inserting tautological PHI nodes, and rewriting the offending uses. ``` $ /builddirs/llvm-project/build-Clang12/bin/opt -load /repositories/alive2/build-Clang-release/tv/tv.so -load-pass-plugin /repositories/alive2/build-Clang-release/tv/tv.so -tv -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -bonus-inst-threshold=10 -tv -o /dev/null /tmp/test.ll ---------------------------------------- @global_pr51125 = global 4 bytes, align 4 define i32 @pr51125() { %entry: br label %L %L: %ld = load i32, * @global_pr51125, align 4 %iszero = icmp eq i32 %ld, 0 br i1 %iszero, label %exit, label %L2 %L2: store i32 4294967295, * @global_pr51125, align 4 %cmp = icmp eq i32 %ld, 4294967295 br i1 %cmp, label %L, label %exit %exit: %r = phi i32 [ %ld, %L2 ], [ %ld, %L ] ret i32 %r } => @global_pr51125 = global 4 bytes, align 4 define i32 @pr51125() { %entry: %ld.old = load i32, * @global_pr51125, align 4 %iszero.old = icmp eq i32 %ld.old, 0 br i1 %iszero.old, label %exit, label %L2 %L2: %ld2 = phi i32 [ %ld.old, %entry ], [ %ld, %L2 ] store i32 4294967295, * @global_pr51125, align 4 %cmp = icmp ne i32 %ld2, 4294967295 %ld = load i32, * @global_pr51125, align 4 %iszero = icmp eq i32 %ld, 0 %or.cond = select i1 %cmp, i1 1, i1 %iszero br i1 %or.cond, label %exit, label %L2 %exit: %ld1 = phi i32 [ poison, %L2 ], [ %ld.old, %entry ] %r = phi i32 [ %ld2, %L2 ], [ %ld.old, %entry ] ret i32 %r } Transformation seems to be correct! ``` Fixes https://bugs.llvm.org/show_bug.cgi?id=51125 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D106317	2021-08-15 19:02:34 +03:00
Roman Lebedev	35a8bdc775	[NFCI][IndVars] rewriteLoopExitValues(): nowadays SCEV should not change `GEP` base pointer Currently/previously, while SCEV guaranteed that it produces the same value, the way it was produced may be illegal IR, so we have an ugly check that the replacement is valid. But now that the SCEV strictness wrt the pointer/integer types has been improved, i believe this invariant is already upheld by the SCEV itself, natively. I think we should add an assertion, wait for a week, and then, if all is good, rip out all this checking. Or we could just do the latter directly i guess. This reverts commit rL127839. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D108043	2021-08-15 18:59:32 +03:00
Arthur Eubanks	c19d7f8af0	[CallPromotion] Check for inalloca/byval mismatch Previously we would allow promotion even if the byval/inalloca attributes on the call and the callee didn't match. It's ok if the byval/inalloca types aren't the same. For example, LTO importing may rename types. Fixes PR51397. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D107998	2021-08-13 16:52:04 -07:00
Arthur Eubanks	a9831cce1e	[NFC] Remove public uses of AttributeList::getAttributes() Use methods that better convey the intent.	2021-08-13 11:38:12 -07:00
Arthur Eubanks	80ea2bb574	[NFC] Rename AttributeList::getParam/Ret/FnAttributes() -> get*Attributes() This is more consistent with similar methods.	2021-08-13 11:16:52 -07:00
Roman Lebedev	c46546bd52	Reland "[NFCI][SimplifyCFG] simplifyCondBranch(): assert that branch is non-tautological"" The commit originally unearthed a problem, reported as https://reviews.llvm.org/rGf30a7dff8a5b32919951dcbf92e4a9d56c4679ff#1019890 Now that the problem has been fixed, and the assertion no longer fires, let's see if there are other cases it fires on. This reverts commit `5c8c24d2de`, relanding commit `f30a7dff8a`.	2021-08-13 15:45:03 +03:00
Roman Lebedev	2702fb1148	[SimplifyCFG] Restart if `removeUndefIntroducingPredecessor()` made changes It might changed the condition of a branch into a constant, so we should restart and constant-fold terminator, instead of continuing with the tautological "conditional" branch. This fixes the issue reported at https://reviews.llvm.org/rGf30a7dff8a5b32919951dcbf92e4a9d56c4679ff	2021-08-13 15:45:03 +03:00
Roman Lebedev	5c8c24d2de	Revert "[NFCI][SimplifyCFG] simplifyCondBranch(): assert that branch is non-tautological" The assertion does not hold on a provided reproducer. Reverting until after fixing the problem. This reverts commit `f30a7dff8a`.	2021-08-13 13:16:22 +03:00
Roman Lebedev	f30a7dff8a	[NFCI][SimplifyCFG] simplifyCondBranch(): assert that branch is non-tautological We really shouldn't deal with a conditional branch that can be trivially constant-folded into an unconditional branch. Indeed, barring failure to trigger BB reprocessing, that should be true, so let's assert as much, and hope the assertion never fires. If it does, we have a bug to fix.	2021-08-12 20:03:09 +03:00
Roman Lebedev	628f63d3d5	[SimplifyCFG] If FoldTwoEntryPHINode() changed things, restart Mainly, i want to add an assertion that `SimplifyCFGOpt::simplifyCondBranch()` doesn't get asked to deal with non-unconditional branches, and if i do that, then said assertion fires on existing tests, and this is what prevents it from firing.	2021-08-12 20:03:09 +03:00
Adrian Prantl	d6b6880172	Streamline the API of salvageDebugInfoImpl (NFC) This patch refactors / simplifies salvageDebugInfoImpl(). The goal here is to simplify the implementation of coro::salvageDebugInfo() in a followup patch. 1. Change the return value to I.getOperand(0). Currently users of salvageDebugInfoImpl() assume that the first operand is I.getOperand(0). This patch makes this information explicit. A nice side-effect of this change is that it allows us to salvage expressions such as add i8 1, %a in the future. 2. Factor out the creation of a DIExpression and return an array of DIExpression operations instead. This change allows users that call salvageDebugInfoImpl() in a loop to avoid the costly creation of temporary DIExpressions and to defer the creation of a DIExpression until the end. This patch does not change any functionality. rdar://80227769 Differential Revision: https://reviews.llvm.org/D107383	2021-08-10 15:21:18 -07:00
Carl Ritson	a1783b54e8	[SimpifyCFG] Remove recursion from FoldCondBranchOnPHI. NFCI. Avoid stack overflow errors on systems with small stack sizes by removing recursion in FoldCondBranchOnPHI. This is a simple change as the recursion was only iteratively calling the function again on the same arguments. Ideally this would be compiled to a tail call, but there is no guarantee. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D107803	2021-08-10 19:14:31 +09:00
Michael Liao	b5e470aa2e	[LowerMemIntrinsics] Typo fix.	2021-08-08 22:38:58 -04:00
Momchil Velikov	f171149e0d	[SimpifyCFG] Speculate a store preceded by a local non-escaping load In SimplifyCFG we may simplify the CFG by speculatively executing certain stores, when they are preceded by a store to the same location. This patch allows such speculation also when the stores are similarly preceded by a load. In order for this transformation to be correct we need to ensure that the memory location is writable and the store in the new location does not introduce a data race. Local objects (created by an `alloca` instruction) are always writable, so once we are past a read from a location it is valid to also write to that same location. Seeing just a load does not guarantee absence of a data race (unlike if we see a store) - the load may still be part of a race, just not causing undefined behaviour (cf. https://llvm.org/docs/Atomics.html#optimization-outside-atomic). In the original program, a data race might have been prevented by the condition, but once we move the store outside the condition, we must be sure a data race wasn't possible anyway, no matter what the condition evaluates to. One way to be sure that a local object is never concurrently read/written is check that its address never escapes the function. Hence this transformation is restricted to local, non-escaping objects. Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D107281	2021-08-05 15:54:42 +01:00

1 2 3 4 5 ...

5989 Commits