llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	2e87333bfe	[InstCombine] convert mul by negative-pow2 to negate and shift This is an unusual canonicalization because we create an extra instruction, but it's likely better for analysis and codegen (similar reasoning as D133399). InstCombine::Negator may create this kind of multiply from negate and shift, but this should not conflict because of the narrow negation. I don't know how to create a fully general proof for this kind of transform in Alive2, but here's an example with bitwidths similar to one of the regression tests: https://alive2.llvm.org/ce/z/J3jTjR Differential Revision: https://reviews.llvm.org/D133667	2022-10-02 12:22:25 -04:00
Sanjay Patel	4490cfbaf4	[ValueTracking] peek through fpext in isKnownNeverInfinity() https://alive2.llvm.org/ce/z/BkNoRW	2022-10-02 11:20:23 -04:00
Sanjay Patel	0243b424d7	[InstSimplify] add tests for FP infinity compare with fpext; NFC	2022-10-02 11:20:23 -04:00
Florian Hahn	3fe6ddd999	[ConstraintElimination] Update Changed status in ssub simplification. Update tryToSimplifyOverflowMath to indicate whether the function made any changes to the IR.	2022-10-02 14:25:51 +01:00
Arthur Eubanks	5df4ab55f9	[llvm] Migrate PAEval to new pass manager	2022-10-01 16:41:58 -07:00
Florian Hahn	05b0b48b50	[SimpleLoopUnswitch] Pass -verify-cfg-preserved to test. This ensures PreservedCFGCheckerAnalysis is always added, independent of whether opt was built with assertions enabled or not. This fixes a few buildbot failures for bots that don't have assertions enabled.	2022-10-01 17:19:02 +01:00
Florian Hahn	7c0ff64b0f	[LAA] Change to function analysis for new PM. At the moment, LoopAccessAnalysis is a loop analysis for the new pass manager. The issue with that is that LAI caches SCEV expressions and modifications in a loop may impact SCEV expressions in other loops, but we do not have a convenient way to invalidate LAI for other loops withing a loop pipeline. To avoid this issue, turn it into a function analysis which returns a manager object that keeps track of the individual LAI objects per loop. Fixes #50940. Fixes #51669. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D134606	2022-10-01 15:44:27 +01:00
Teresa Johnson	43417d8159	[MemProf] Update metadata during inlining Update both memprof and callsite metadata to reflect inlined functions. For callsite metadata this is simply a concatenation of each cloned call's call stack with that of the inlined callsite's. For memprof metadata, each profiled memory info block (MIB) is either moved to the cloned allocation call or left on the original allocation call depending on whether its context matches the newly refined call stack context on the cloned call. We also reapply context trimming optimizations based on the refined set of contexts on each of the calls (cloned and original). Depends on D128142. Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D128143	2022-09-30 19:21:15 -07:00
Teresa Johnson	4d243348fb	Revert "[MemProf] Update metadata during inlining" and preceeding commit This reverts commit `0d7f3464ce` and commit `f9403ca41e`. The latter was "Profile matching and IR annotation for memprof profiles." and was left from a bad rebase from a commit already pushed upstream.	2022-09-30 17:01:30 -07:00
Teresa Johnson	0d7f3464ce	[MemProf] Update metadata during inlining Update both memprof and callsite metadata to reflect inlined functions. For callsite metadata this is simply a concatenation of each cloned call's call stack with that of the inlined callsite's. For memprof metadata, each profiled memory info block (MIB) is either moved to the cloned allocation call or left on the original allocation call depending on whether its context matches the newly refined call stack context on the cloned call. We also reapply context trimming optimizations based on the refined set of contexts on each of the calls (cloned and original), via utilities in MemoryProfileInfo. Depends on D128142. Differential Revision: https://reviews.llvm.org/D128143	2022-09-30 16:46:17 -07:00
Matthias Braun	189900eb14	X86: Stop assigning register costs for longer encodings. This stops reporting CostPerUse 1 for `R8`-`R15` and `XMM8`-`XMM31`. This was previously done because instruction encoding require a REX prefix when using them resulting in longer instruction encodings. I found that this regresses the quality of the register allocation as the costs impose an ordering on eviction candidates. I also feel that there is a bit of an impedance mismatch as the actual costs occure when encoding instructions using those registers, but the order of VReg assignments is not primarily ordered by number of Defs+Uses. I did extensive measurements with the llvm-test-suite wiht SPEC2006 + SPEC2017 included, internal services showed similar patterns. Generally there are a log of improvements but also a lot of regression. But on average the allocation quality seems to improve at a small code size regression. Results for measuring static and dynamic instruction counts: Dynamic Counts (scaled by execution frequency) / Optimization Remarks: Spills+FoldedSpills -5.6% Reloads+FoldedReloads -4.2% Copies -0.1% Static / LLVM Statistics: regalloc.NumSpills mean -1.6%, geomean -2.8% regalloc.NumReloads mean -1.7%, geomean -3.1% size..text mean +0.4%, geomean +0.4% Static / LLVM Statistics: mean -2.2%, geomean -3.1%) regalloc.NumSpills mean -2.6%, geomean -3.9%) regalloc.NumReloads mean +0.6%, geomean +0.6%) size..text Static / LLVM Statistics: regalloc.NumSpills mean -3.0% regalloc.NumReloads mean -3.3% size..text mean +0.3%, geomean +0.3% Differential Revision: https://reviews.llvm.org/D133902	2022-09-30 16:01:33 -07:00
Florian Hahn	586784a2e4	[ConstraintElimination] Simplify check lines in test added in `2812a141`. The CHECK lines in the test are too specific and cause mis-matches on some platforms. Reduce them to make them less fragile.	2022-09-30 19:51:05 +01:00
Florian Hahn	2812a1413f	[ConstraintElimination] Add test showing bug in analysis invalidation.	2022-09-30 19:37:58 +01:00
Arthur Eubanks	e23aee7175	[test] Update some legacy PM tests	2022-09-30 11:31:02 -07:00
Arthur Eubanks	865406d21e	[FixIrreducible][opt] Mark -fix-irreducible as a codegen pass So we don't have to specify -enable-new-pm=0.	2022-09-30 10:34:04 -07:00
Arthur Eubanks	a7264e5549	[StructurizeCFG][opt] Mark -structurizecfg as a codegen pass So we don't have to specify -enable-new-pm=0.	2022-09-30 10:27:09 -07:00
Florian Hahn	04c711c78d	[ConstraintElimination] Make sure the variable is available before use. This fixes a crash when trying to access an index for a value where we don't have a known index. Fixes #58009.	2022-09-30 18:09:01 +01:00
Sanjay Patel	88b7c178f2	[SCCP] regenerate test checks; NFC Avoid names with "tmp" because that can go wrong with the auto-generated script names.	2022-09-30 10:26:01 -04:00
Sanjay Patel	bf1fe2497d	[SCCP] add tests for sitofp; NFC Adapted from the existing tests for ashr, sdiv, srem.	2022-09-30 09:28:57 -04:00
Sanjay Patel	3f906f057c	[InstSimplify] look through vector select (shuffle) in min/max fold This is an extension of the existing min/max+select fold (which already has a very large number of variations) to allow a vector shuffle because that's what we have in the motivating example from issue #42100. A couple of Alive2 checks of variants (I don't know how to generalize these in Alive): https://alive2.llvm.org/ce/z/jUFAqT And verify the PR42100 test: https://alive2.llvm.org/ce/z/3EcASf It's possible there is some generalization of the fold or a VectorCombine/SLP answer for the motivating test, but I haven't found a better/smaller solution yet. We can also add even more variants here as follow-up patches. For example, we can have shuffle followed by min/max; we also don't have this canonicalization or the reverse: https://alive2.llvm.org/ce/z/StHD9f Differential Revision: https://reviews.llvm.org/D134879	2022-09-30 08:27:00 -04:00
Simon Pilgrim	5fc7bbfaa3	[SLP][X86] Add test coverage for Issue #58054	2022-09-30 13:26:31 +01:00
Nikita Popov	5c5ac3490c	[InstCombine] Add test for PR57488 (NFC)	2022-09-30 14:25:13 +02:00
Nikita Popov	c720fad16f	[InstCombine] Add test for phi translation during select of phi fold (NFC) The phi translation performed during this fold is important for correctness, but was apparently untested.	2022-09-30 13:07:58 +02:00
Nikita Popov	b6042cdf8b	[InstCombine] Regenerate test checks (NFC)	2022-09-30 13:04:36 +02:00
Simon Pilgrim	5849fcb635	Revert rG1b7089fe67b924bdd5ecef786a34bdba7a88778f "[SLP] Add ScalarizationOverheadBuilder helper to track vector extractions" Revert rGef89409a59f3b79ae143b33b7d8e6ee6285aa42f "Fix 'unused-lambda-capture' gcc warning. NFCI." Revert rG926ccfef032d206dcbcdf74ca1e3a9ebf4d1be45 "[SLP] ScalarizationOverheadBuilder - demand all elements for scalarization if the extraction index is unknown / out of bounds" Revert ScalarizationOverheadBuilder sequence from D134605 - when accumulating extraction costs by Type (instead of specific Value), we are not distinguishing enough when they are coming from the same source or not, and we always just count the cost once. This needs addressing before we can use getScalarizationOverhead properly.	2022-09-30 11:22:48 +01:00
Simon Pilgrim	19782a46f8	[SLP][X86] Add test case for crash reported on D134605	2022-09-30 11:07:54 +01:00
Simon Pilgrim	28a3fc39a6	[SLP][AArch64] Add test case for vectorization regression case reported on D134605	2022-09-30 11:07:54 +01:00
Ben Dunbobbin	7eee2a2d44	[IR] Don't allow DLL storage-class and local linkage Disallow this meaningless combination. Doing so simplifies analysis of LLVM code w.r.t t DLL storage-class, and prevents mistakes with DLL storage class. - Change the assembler to reject DLL storage class on symbols with local linkage. - Change the bitcode reader to clear the DLL Storage class when the linkage is local for auto-upgrading - Update LangRef. There is an existing restriction on non-default visibility and local linkage which this is modelled on. Differential Review: https://reviews.llvm.org/D134784	2022-09-30 00:26:01 +01:00
Florian Hahn	9933a2e9fd	[SCEVExpander] Move LCSSA fixup to ::expand. Move LCSSA fixup from ::expandCodeForImpl to ::expand(). This has the advantage that we directly preserve LCSSA nodes here instead of relying on doing so in rememberInstruction. It also ensures that we don't add the non-LCSSA-safe value to InsertedExpressions. Alternative to D132704. Fixes #57000. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D134739	2022-09-29 20:49:56 +01:00
luxufan	f079ba76cf	[DSE] Eliminate noop store even through has clobbering between LoadI and StoreI For noop store of the form of LoadI and StoreI, An invariant should be kept is that the memory state of the related MemoryLoc before LoadI is the same as before StoreI. For this example: ``` define void @pr49927(i32* %q, i32* %p) { %v = load i32, i32* %p, align 4 store i32 %v, i32* %q, align 4 store i32 %v, i32* %p, align 4 ret void } ``` Here the definition of the store's destination is different with the definition of the load's destination, which it seems that the invariant mentioned above is broken. But the definition of the store's destination would write a value that is LoadI, actually, the invariant is still kept. So we can safely ignore it. Fixes https://github.com/llvm/llvm-project/issues/49271 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D132657	2022-09-29 00:51:56 +00:00
Sanjay Patel	8bfba17b40	[InstSimplify][PhaseOrdering] add tests for vector select of min/max; NFC The phase ordering test is the almost unoptimized IR for the example in issue #42100; it was passed through -mem2reg to reduce obvious excessive load/store and other noise. D134879	2022-09-29 12:06:55 -04:00
luxufan	ffb2a1534d	[DSE][NFC] Update noop-stores.ll using update_test_checks.py Differential Revision: https://reviews.llvm.org/D134630	2022-09-28 23:25:33 +00:00
Nikita Popov	aa25c92f33	[ValueTracking] Fix CannotBeOrderedLessThanZero() for fdiv (PR58046) When checking the RHS of fdiv, we should set the SignBitOnly flag, because a negative zero can become -Inf, which is ordered less than zero. Fixes https://github.com/llvm/llvm-project/issues/58046. Differential Revision: https://reviews.llvm.org/D134876	2022-09-29 17:07:48 +02:00
Philip Reames	02bfe2de7c	[RISCV] Adjust vector immediate store materialization cost This change updates the costs to make constant pool loads match their actual cost, and adds the broadcast special case to avoid too many regressions. We really need more information about the constants being rematerialized, but this is an incremental improvement. Differential Revision: https://reviews.llvm.org/D134746	2022-09-29 07:37:13 -07:00
Nikita Popov	ea32658288	[InstSimplify] Add test for PR58046 (NFC)	2022-09-29 15:22:13 +02:00
Nikita Popov	412141663c	Reapply [FunctionAttrs] Infer precise FMRB The previous version of the patch would incorrect convert an existing argmemonly attribute into an inaccessiblemem_or_argmemonly attribute. ----- This updates checkFunctionMemoryAccess() to infer a precise FunctionModRefBehavior, rather than an approximation split into read/write and argmemonly. Afterwards, we still map this back to imprecise function attributes. This still allows us to infer some cases that we previously did not handle, namely inaccessiblememonly and inaccessiblemem_or_argmemonly. In practice, this means we get better memory attributes in the presence of intrinsics like @llvm.assume. Differential Revision: https://reviews.llvm.org/D134527	2022-09-29 14:02:15 +02:00
Nikita Popov	e7f1331910	[FunctionAttrs] Add test for argmemonly function that already has attr (NFC) Test for the issue reported in https://reviews.llvm.org/D134527#3821010.	2022-09-29 13:56:31 +02:00
Juan Manuel MARTINEZ CAAMAÑO	52545e603b	[DebugInfo][InferAddressSpaces] Propagate DebugLoc when cloning an instruction in InferAddressSpaces Differential Revision: https://reviews.llvm.org/D134428	2022-09-29 08:43:37 +00:00
Juan Manuel MARTINEZ CAAMAÑO	e9716c64ec	[StructurizeCFG] Remove imposible case and replace by assert In addition, replace outdated XFAIL test by a new one. Differential Revision: https://reviews.llvm.org/D134439	2022-09-29 08:27:49 +00:00
Mingming Liu	ac28efa6c1	[SimplifyCFG][TranformUtils]Do not simplify away a trivial basic block if both this block and at least one of its predecessors are loop latches. - Before this patch, loop metadata (if exists) will override the metadata of each predecessor; if the predecessor block already has loop metadata, the orignal loop metadata won't be preserved and could cause missed loop transformations (see 'test2' in llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll). To illustrate how inner-loop metadata might be dropped before this patch: CFG Before entry \| v ---> while.cond -------------> while.end \| \| \| v \| while.body \| \| \| v \| for.body <---- (md1) \| \| \|______\| \| v \| while.cond.exit (md2) \| \| \|_______\| CFG After entry \| v ---> while.cond.rewrite -------------> while.end \| \| \| v \| while.body \| \| \| v \| for.body <---- (md2) \|_______\| \|______\| Basically, when 'while.cond.exit' is folded into 'while.cond', 'md2' overrides 'md1' and 'md1' is dropped from the CFG. Differential Revision: https://reviews.llvm.org/D134152	2022-09-28 10:48:14 -07:00
Matt Arsenault	01adf1f3e5	AtomicExpand: Add some more overaligned atomic tests	2022-09-28 12:51:30 -04:00
Matt Arsenault	a61c3455c0	AtomicExpand: Use llvm.ptrmask instead of ptrtoint This removes the ptrtoint from the load's pointer operand, although we can't entirely eliminate these to get the LSB shift. In a future patch, this will avoid ptrtoint in the case where the atomic is overaligned to the word size.	2022-09-28 12:51:30 -04:00
bipmis	3b49a9fcf6	[AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load. The patch simplifies some of the patterns as below 1. (ZExt(L1) << shift1) \| (ZExt(L2) << shift2) -> ZExt(L3) << shift1 2. (ZExt(L1) << shift1) \| ZExt(L2) -> ZExt(L3) The pattern is indicative of the fact that the loads are being merged to a wider load and the only use of this pattern is with a wider load. In this case for a non-atomic/non-volatile loads reduce the pattern to a combined load which would improve the cost of inlining, unrolling, vectorization etc. Fix the error reported on reverse load merge. Differential Revision: https://reviews.llvm.org/D127392	2022-09-28 17:32:47 +01:00
bipmis	48b8dee773	remove LE,BE labels inserted incorrectly	2022-09-28 17:07:26 +01:00
Sanjay Patel	e239198cdb	[InstCombine] fold select shuffles with shared operand together We don't combine generic shuffles together in IR, but select shuffles are a special-case because a select shuffle of a select shuffle is just another select shuffle; codegen is expected to efficiently lower those (select shuffles are also the canonical form of a vector select with constant condition).	2022-09-28 11:56:27 -04:00
Sanjay Patel	d743aff790	[InstCombine] add tests for shuffle-of-shuffle; NFC	2022-09-28 11:56:27 -04:00
bipmis	1dd7e576d7	Add reverse load tests to test load combine patch	2022-09-28 16:51:23 +01:00
Sameer Sahasrabuddhe	3f078b308b	[AAPointerInfo] OffsetInfo: Unassigned is distinct from Unknown A User like the PHINode may be visited multiple times for the same pointer along different def-use edges. The uninitialized state of OffsetInfo at the first visit needs to be distinct from the Unknown value that may be assigned after processing the PHINode. Without that, a PHINode with all inputs Unknown is never followed to its uses. This results in incorrect optimization because some interfering accessess are missed. Differential Revision: https://reviews.llvm.org/D134704	2022-09-28 20:31:36 +05:30
Benjamin Kramer	0fb2676c24	Revert "[FunctionAttrs] Infer precise FMRB" This reverts commit `97dfa53626`. It can make DSE crash. Reduced test case at https://reviews.llvm.org/P8291	2022-09-28 16:57:43 +02:00
Igor Kirillov	2d60d7ba1a	[LoopVectorize][Fix] Crash when invariant store address is calculated inside loop Fixes #57572 Generally LICM pass is responsible for sinking out code that calculates invariant address inside loop as it only needed to be calculated once. But in rare case it does not happen we will not be vectorizing the loop. Differential Revision: https://reviews.llvm.org/D133687	2022-09-28 10:33:50 +01:00

1 2 3 4 5 ...

23151 Commits