llvm-project

Commit Graph

Author	SHA1	Message	Date
Sergey Dmitriev	0f70f73308	[Attributor] Bitcast constant to the returned value type if it has different type Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79277	2020-05-03 11:46:13 -07:00
Hongtao Yu	911e06f5eb	[ICP] Handling must tail calls in indirect call promotion Per the IR convention, a musttail call must precede a ret with an optional bitcast. This was violated by the indirect call promotion optimization which could result in IR like: ; <label>:2192: br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483 ; <label>:2199: ; preds = %2192 musttail call fastcc void @foo(i8* %2195), !dbg !226012 br label %2202, !dbg !226012 ; <label>:2201: ; preds = %2192 musttail call fastcc void %2197(i8* %2195), !dbg !226012 br label %2202, !dbg !226012 ; <label>:2202: ; preds = %605, %2201, %2199 ret void, !dbg !229485 This is being fixed in this change where the return statement goes together with the promoted indirect call. The code generated is like: ; <label>:2192: br i1 %2198, label %2199, label %2201, !dbg !226012, !prof !229483 ; <label>:2199: ; preds = %2192 musttail call fastcc void @foo(i8* %2195), !dbg !226012 ret void, !dbg !229485 ; <label>:2201: ; preds = %2192 musttail call fastcc void %2197(i8* %2195), !dbg !226012 ret void, !dbg !229485 Differential Revision: https://reviews.llvm.org/D79258	2020-05-03 10:42:22 -07:00
Mircea Trofin	bec4ab95a4	[llvm][NFC] Inliner: factor cost and reporting out of inlining process Summary: This factors cost and reporting out of the inlining workflow, thus making it easier to reuse when driving inlining from the upcoming InliningAdvisor. Depends on: D79215 Reviewers: davidxl, echristo Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79275	2020-05-03 10:38:28 -07:00
Florian Hahn	bbdfcf8f69	[VPlan] Remove unused & undefined print method (NFC).	2020-05-03 18:36:20 +01:00
Johannes Doerfert	8228153f87	[Attributor][NFC] Encode IRPositions in the bits of a single pointer This reduces memory consumption for IRPositions by eliminating the vtable pointer and the `KindOrArgNo` integer. Since each abstract attribute has an associated IRPosition, the 12-16 bytes we save add up quickly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 469545 (260135/s) temporary memory allocations: 77137 (42735/s) peak heap memory consumption: 30.50MB peak RSS (including heaptrack overhead): 119.50MB total memory leaked: 269.07KB ``` After: ``` calls to allocation functions: 468999 (274108/s) temporary memory allocations: 77002 (45004/s) peak heap memory consumption: 28.83MB peak RSS (including heaptrack overhead): 118.05MB total memory leaked: 269.07KB ``` Difference: ``` calls to allocation functions: -546 (5808/s) temporary memory allocations: -135 (1436/s) peak heap memory consumption: -1.67MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` --- CTMark 15 runs Metric: compile_time Program lhs rhs diff test-suite...:: CTMark/sqlite3/sqlite3.test 25.07 24.09 -3.9% test-suite...Mark/mafft/pairlocalalign.test 14.58 14.14 -3.0% test-suite...-typeset/consumer-typeset.test 21.78 21.58 -0.9% test-suite :: CTMark/SPASS/SPASS.test 21.95 22.03 0.4% test-suite :: CTMark/lencod/lencod.test 25.43 25.50 0.3% test-suite...ark/tramp3d-v4/tramp3d-v4.test 23.88 23.83 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 60.24 60.11 -0.2% test-suite :: CTMark/kimwitu++/kc.test 15.69 15.69 -0.0% test-suite...:: CTMark/ClamAV/clamscan.test 25.43 25.42 -0.0% test-suite :: CTMark/Bullet/bullet.test 37.63 37.62 -0.0% Geomean difference -0.8% --- Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D78722	2020-05-03 12:15:19 -05:00
Johannes Doerfert	6bf16ee4c5	[Attributor][NFC] Let AbstractAttribute be an IRPosition Since every AbstractAttribute so far, and for the foreseeable future, corresponds to a single IRPosition we can simplify the class structure. We already did this for IRAttribute but there is no reason to stop there.	2020-05-03 12:13:40 -05:00
Mircea Trofin	667f558c3f	[llvm][NFC] Inliner.cpp shouldInline post-commit feedback Discussion is in https://reviews.llvm.org/D79215	2020-05-03 09:31:31 -07:00
Sanjay Patel	682f0b366b	[InstCombine] use select-of-constants with set/clear bit mask patterns Cond ? (X & ~C) : (X | C) --> (X & ~C) | (Cond ? 0 : C) Cond ? (X | C) : (X & ~C) --> (X & ~C) | (Cond ? C : 0) The select-of-constants form results in better codegen. There's an existing test diff that shows a transform that results in an extra IR instruction, but that's an existing problem. This is motivated by code seen in LLVM itself - see PR37581: https://bugs.llvm.org/show_bug.cgi?id=37581 define i8 @src(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %or = or i8 %x, %C %cond = select i1 %b, i8 %or, i8 %and ret i8 %cond } define i8 @tgt(i8 %x, i8 %C, i1 %b) { %notC = xor i8 %C, -1 %and = and i8 %x, %notC %mul = select i1 %b, i8 %C, i8 0 %or = or i8 %mul, %and ret i8 %or } http://volta.cs.utah.edu:8080/z/Vt2WVm Differential Revision: https://reviews.llvm.org/D78880	2020-05-03 09:44:43 -04:00
Nikita Popov	b7e2358220	Remove getNumUses() comparisons (NFC) getNumUses() scans the full use list. Don't use it if we only want to check whether there are zero or one uses.	2020-05-02 11:05:19 +02:00
Nikita Popov	60e9ee16b4	[MergeFuncs] Don't merge shufflevectors with different masks When the shufflevector mask operand was converted into special instruction data, the FunctionComparator was not updated to account for this. As such, MergeFuncs will happily merge shufflevectors with different masks. This fixes https://bugs.llvm.org/show_bug.cgi?id=45773. Differential Revision: https://reviews.llvm.org/D79261	2020-05-02 10:21:14 +02:00
Mircea Trofin	3dbc612cf2	[llvm][NFC] Rename variable as per https://reviews.llvm.org/D79215 Operator error - performed the rename and didn't save.	2020-05-01 16:30:41 -07:00
Mircea Trofin	e1c4a7cb16	[llvm][NFC] Inliner: simplify inlining decision logic Summary: shouldInline makes a decision based on the InlineCost of a call site, as well as an evaluation on whether the site should be deferred. This means it's possible for the decision to be not to inline, even for an InlineCost that would otherwise allow it. Both uses of shouldInline performed the exact same logic after calling it. In addition, the decision on whether to inline or not was communicated through two values of the Option<InlineCost> return value: None, or an InlineCost evaluating to false. Simplified by: - encapsulating the decision in the return object. The bool it evaluates to communicates unambiguously the decision. The InlineCost is also available. - encapsulated the common post-shouldInline code into shouldInline. Reviewers: davidxl, echristo, eraman Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79215	2020-05-01 16:18:59 -07:00
Christopher Tetreault	beeabe382d	[SVE] Fix invalid usage of VectorType::getNumElements() in InstCombine Summary: Make foldVectorBinop return null if the instruction type is a scalable vector. It is unclear what, if any, of this function works with scalable vectors. Identified by test LLVM.Transforms/InstCombine::nsw.ll Reviewers: efriedma, david-arm, fpetrogalli, spatel Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79196	2020-05-01 10:56:29 -07:00
Sanjay Patel	7fa150203f	[InstCombine] fix miscompile from multi-use cttz/ctlz transform PR45762: https://bugs.llvm.org/show_bug.cgi?id=45762	2020-05-01 13:52:24 -04:00
Florian Hahn	d911c17596	[SCCP] Get a copy of the state of CopyOf once. This fixes potential reference invalidations, when no lattice value is assigned for CopyOf. As the state of CopyOf won't change while in handleCallResult, we can get a copy once and use that. Should fix PR45749.	2020-05-01 14:46:35 +01:00
Benjamin Kramer	7a5a1e9460	[IR] AttributeList::getContext has a single user, remove it.	2020-05-01 14:18:29 +02:00
Florian Hahn	19ab53f1e2	[LoopVersioning] Update setAliasChecks to take ArrayRef argument (NFC). This cleanup was suggested as part of D78458.	2020-04-30 22:17:12 +01:00
Nikita Popov	b74c6d2c9d	[InlineFunction] Disable emission of alignment assumptions by default In D74183 clang started emitting alignment for sret parameters unconditionally. This caused a 1.5% compile-time regression on tramp3d-v4. The reason is that we now generate many instances of IR like %ptrint = ptrtoint %class.GuardLayers* %guards_m to i64 %maskedptr = and i64 %ptrint, 3 %maskcond = icmp eq i64 %maskedptr, 0 tail call void @llvm.assume(i1 %maskcond) to preserve the alignment information during inlining. Based on IR analysis, these assumptions also regress optimization. The attached phase ordering test case illustrates two issues: One is instruction-count-based optimization heuristics, which are affected by the four additional instructions of the assumption. The other is blocking of SROA due to ptrtoint casts (PR45763). We already encountered the same problem in Rust, where we (unlike Clang) generally prefer to emit alignment information absolutely everywhere it is available. We were only able to do this after hardcoding -preserve-alignment-assumptions-during-inlining=false, because we were seeing significant optimization and compile-time regressions otherwise. This patch disables -preserve-alignment-assumptions-during-inlining by default, because we should not be punishing people for adding more alignment annotations. Once the assume bundle work shakes out and we can represent (and use) alignment assumptions using assume bundles, it should be possible to re-enable this with reduced overhead. Differential Revision: https://reviews.llvm.org/D76886	2020-04-30 23:12:54 +02:00
Arthur Eubanks	a90948fd6e	[NFC] Rename ByValOrInalloca to PassPointeeByValue Summary: In preparation for preallocated. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79152	2020-04-30 09:42:13 -07:00
Jann Horn	a22685885d	[AddressSanitizer] Instrument byval call arguments Summary: In the LLVM IR, "call" instructions read memory for each byval operand. For example: ``` $ cat blah.c struct foo { void *a, *b, *c; }; struct bar { struct foo foo; }; void func1(const struct foo); void func2(struct bar *bar) { func1(bar->foo); } $ [...]/bin/clang -S -flto -c blah.c -O2 ; cat blah.s [...] define dso_local void @func2(%struct.bar* %bar) local_unnamed_addr #0 { entry: %foo = getelementptr inbounds %struct.bar, %struct.bar* %bar, i64 0, i32 0 tail call void @func1(%struct.foo* byval(%struct.foo) align 8 %foo) #2 ret void } [...] $ [...]/bin/clang -S -c blah.c -O2 ; cat blah.s [...] func2: # @func2 [...] subq $24, %rsp [...] movq 16(%rdi), %rax movq %rax, 16(%rsp) movups (%rdi), %xmm0 movups %xmm0, (%rsp) callq func1 addq $24, %rsp [...] retq ``` Let ASAN instrument these hidden memory accesses. This is patch 4/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments Reviewers: kcc, glider Reviewed By: glider Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77619	2020-04-30 17:09:13 +02:00
Jann Horn	cfe36e4c6a	[AddressSanitizer] Refactor: Permit >1 interesting operands per instruction Summary: Refactor getInterestingMemoryOperands() so that information about the pointer operand is returned through an array of structures instead of passing each piece of information separately by-value. This is in preparation for returning information about multiple pointer operands from a single instruction. A side effect is that, instead of repeatedly generating the same information through isInterestingMemoryAccess(), it is now simply collected once and then passed around; that's probably more efficient. HWAddressSanitizer has a bunch of copypasted code from AddressSanitizer, so these changes have to be duplicated. This is patch 3/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments [glider: renamed llvm::InterestingMemoryOperand::Type to OpType to fix GCC compilation] Reviewers: kcc, glider Reviewed By: glider Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77618	2020-04-30 17:09:13 +02:00
Jann Horn	223a95fdf0	[AddressSanitizer] Split out memory intrinsic handling Summary: In both AddressSanitizer and HWAddressSanitizer, we first collect instructions whose operands should be instrumented and memory intrinsics, then instrument them. Both during collection and when inserting instrumentation, they are handled separately. Collect them separately and instrument them separately. This is a bit more straightforward, and prepares for collecting operands instead of instructions in a future patch. This is patch 2/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments Reviewers: kcc, glider Reviewed By: glider Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77617	2020-04-30 17:09:13 +02:00
Jann Horn	e29996c9a2	[AddressSanitizer] Refactor ClDebug{Min,Max} handling Summary: A following commit will split the loop over ToInstrument into two. To avoid having to duplicate the condition for suppressing instrumentation sites based on ClDebug{Min,Max}, refactor it out into a new function. While we're at it, we can also avoid the indirection through NumInstrumented for setting FunctionModified. This is patch 1/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments Reviewers: kcc, glider Reviewed By: glider Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77616	2020-04-30 17:09:13 +02:00
Alexander Potapenko	7e7754df32	Revert an accidental commit of four AddressSanitizer refactor CLs I couldn't make arc land the changes properly, for some reason they all got squashed. Reverting them now to land cleanly. Summary: This reverts commit `cfb5f89b62`. Reviewers: kcc, thejh Subscribers:	2020-04-30 16:15:43 +02:00
Jann Horn	cfb5f89b62	[AddressSanitizer] Refactor ClDebug{Min,Max} handling Summary: A following commit will split the loop over ToInstrument into two. To avoid having to duplicate the condition for suppressing instrumentation sites based on ClDebug{Min,Max}, refactor it out into a new function. While we're at it, we can also avoid the indirection through NumInstrumented for setting FunctionModified. This is patch 1/4 of a patch series: https://reviews.llvm.org/D77616 [PATCH 1/4] [AddressSanitizer] Refactor ClDebug{Min,Max} handling https://reviews.llvm.org/D77617 [PATCH 2/4] [AddressSanitizer] Split out memory intrinsic handling https://reviews.llvm.org/D77618 [PATCH 3/4] [AddressSanitizer] Refactor: Permit >1 interesting operands per instruction https://reviews.llvm.org/D77619 [PATCH 4/4] [AddressSanitizer] Instrument byval call arguments Reviewers: kcc, glider Reviewed By: glider Subscribers: jfb, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77616	2020-04-30 15:30:46 +02:00
David Spickett	3929429347	[globalopt] Don't emit DWARF fragments for members of a struct that cover the whole struct This can happen when the rest of the members are zero length. Following the same pattern applied to the SROA pass in: `d7f6f1636d` Fixes: https://bugs.llvm.org/show_bug.cgi?id=45335 Differential Revision: https://reviews.llvm.org/D78720	2020-04-30 11:36:55 +01:00
Evgeniy Brevnov	3acf62f3ad	[BPI][NFC] IRCE should request BPI through analysis manager. Summary: There is no need to create BPI explicitly. It should be requested through AM in a normal way. Reviewers: skatkov Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79080	2020-04-30 16:04:06 +07:00
Evgeniy Brevnov	3e68a66704	[BPI][NFC] Reuse post dominator tree from analysis manager when available Summary: Currently BPI unconditionally creates a post dominator tree each time. While this is not incorrect, we can save compile time by reusing the existing post dominator tree (when it's valid) provided by the analysis manager. Reviewers: skatkov, taewookoh, yrouban Reviewed By: skatkov Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78987	2020-04-30 11:31:03 +07:00
Mircea Trofin	3ab319b295	[llvm][NFC] Use CallBase explicitly instead of Instruction in FunctionComparator Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79098	2020-04-29 15:37:46 -07:00
Mircea Trofin	2c7ff270d2	[llvm][NFC] Inliner: rename call site variables. Summary: Renamed 'CS' to 'CB', and, in one case, to a more specific name to avoid naming collision with outer scope (a maintainability/readability reason, not correctness) Also updated comments. Reviewers: davidxl, dblaikie, jdoerfert Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79101	2020-04-29 15:36:29 -07:00
Anh Tuyen Tran	c7878ad231	[VFDatabase] Scalar functions are vector functions with VF =1 Summary: Return scalar function when VF==1. The new trivial mapping scalar --> scalar when VF==1 to prevent false positive for "isVectorizable" query. Author: masoud.ataei (Masoud Ataei) Reviewers: Whitney (Whitney Tsang), fhahn (Florian Hahn), pjeeva01 (Jeeva P.), fpetrogalli (Francesco Petrogalli), rengolin (Renato Golin) Reviewed By: fpetrogalli (Francesco Petrogalli) Subscribers: hiraditya (Aditya Kumar), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D78054	2020-04-29 17:20:37 +00:00
Mircea Trofin	4632b7292a	[llvm][NFC] Removed addressed fixme; formatting. Removed already-addressed fixme, and updated formatting of a few lines that were triggering Harbormaster.	2020-04-29 09:06:01 -07:00
Hiroshi Yamauchi	1831986826	[PGO][PGSO] Prep for enabling non-cold code size opts under non-partial-profile sample PGO. Summary: - Distinguish between partial-profile and non-partial-profile sample PGO. - Add a flag for partial-profile sample PGO. - Tune the sample PGO cutoff. - No default behavior change (yet). Reviewers: davidxl Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78949	2020-04-29 08:57:47 -07:00
Mircea Trofin	e61247c0a8	[llvm][NFC] Change parameter type to more specific CallBase in IndirectCallPromotion Reviewers: dblaikie, craig.topper, wmi Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79047	2020-04-29 08:42:32 -07:00
Simon Pilgrim	090cae8491	[TTI] Add DemandedElts to getScalarizationOverhead The improvements to the x86 vector insert/extract element costs in D74976 resulted in the estimated costs for vector initialization and scalarization increasing higher than should be expected. This is particularly noticeable on pre-SSE4 targets where the availability of legal INSERT_VECTOR_ELT ops is more limited. This patch does 2 things: 1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of a ISD::BUILD_VECTOR pattern. 2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs. This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing. A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well. Reviewed By: @craig.topper Differential Revision: https://reviews.llvm.org/D78216	2020-04-29 12:00:38 +01:00
Florian Hahn	e89379856a	Recommit "[VPlan] Add & use VPValue operands for VPWidenRecipe (NFC)." The crash that caused the original revert has been fixed in `a3c964a278`. I also added a reduced version of the crash reproducer. This reverts the revert commit `2107af9ccf`.	2020-04-29 11:40:39 +01:00
Florian Hahn	616657b39c	[LAA] Move CheckingPtrGroup/PointerCheck outside class (NFC). This allows forward declarations of PointerCheck, which in turn reduce the number of times LoopAccessAnalysis needs to be included. Ultimately this helps with moving runtime check generation to Transforms/Utils/LoopUtils.h, without having to include it there. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78458	2020-04-28 21:47:31 +01:00
Mircea Trofin	8a7cf11f92	[llvm][NFC] Refactor APIs operating on CallBase Summary: Refactored the parameter and return type where they are too generally typed as Instruction. Reviewers: dblaikie, wmi, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79027	2020-04-28 13:23:47 -07:00
David Blaikie	95e570725a	OpenMPOpt::RuntimeFunctionInfo::UsesMap: Use unique_ptr for values to simplify memory management	2020-04-28 12:26:53 -07:00
David Blaikie	3c89256d71	Attributor::ArgumentReplacementMap: Use unique_ptr to simplify memory management	2020-04-28 12:26:52 -07:00
Roman Lebedev	a0004358a8	[InstCombine] Negator: 'or' with no common bits set is just 'add' In `InstCombiner::visitAdd()`, we have ``` // A+B --> A|B iff A and B have no bits set in common. if (haveNoCommonBitsSet(LHS, RHS, DL, &AC, &I, &DT)) return BinaryOperator::CreateOr(LHS, RHS); ``` so we should handle such `or`'s here, too.	2020-04-28 19:16:32 +03:00
Sam Parker	e9c9329aa4	[TTI] Add TargetCostKind argument to getUserCost There are several different types of cost that TTI tries to provide explicit information for: throughput, latency, code size along with a vague 'intersection of code-size cost and execution cost'. The vectorizer is a keen user of RecipThroughput and there's at least 'getInstructionThroughput' and 'getArithmeticInstrCost' designed to help with this cost. The latency cost has a single use and a single implementation. The intersection cost appears to cover most of the rest of the API. getUserCost is explicitly called from within TTI when the user has been explicit in wanting the code size (also only one use) as well as a few passes which are concerned with a mixture of size and/or a relative cost. In many cases these costs are closely related, such as when multiple instructions are required, but one evident diverging cost in this function is for div/rem. This patch adds an argument so that the cost required is explicit, so that we can make the important distinction when necessary. Differential Revision: https://reviews.llvm.org/D78635	2020-04-28 08:57:45 +01:00
Craig Topper	a58b62b4a2	[IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand(). This method has been commented as deprecated for a while. Remove it and replace all uses with the equivalent getCalledOperand(). I also made a few cleanups in here. For example, this removes uses of getElementType on a pointer when we could just use getFunctionType from the call. Differential Revision: https://reviews.llvm.org/D78882	2020-04-27 22:17:03 -07:00
Mircea Trofin	cb56e9b923	[llvm][NFC] Use CallBase instead of Instruction in ProfileSummaryInfo Summary: getProfileCount requires the parameter be a valid CallBase, and its uses reflect that. Reviewers: dblaikie, craig.topper, wmi Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78940	2020-04-27 20:47:52 -07:00
Arthur Eubanks	3b0450acec	Add IR constructs for preallocated (inalloca replacement) Add llvm.call.preallocated.{setup,arg} intrinsics. Add "preallocated" operand bundle which takes a token produced by llvm.call.preallocated.setup. Add "preallocated" parameter attribute, which is like byval but without the copy. Verifier changes for these IR constructs. See https://github.com/rnk/llvm-project/blob/call-setup-docs/llvm/docs/CallSetup.md Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74651	2020-04-27 16:15:50 -07:00
Sanjay Patel	21acc0612a	[SLP] refactor load-combine logic; NFC We may want to identify sequences that are not reductions, but still qualify as load-combines in the back-end, so make most of the body a helper function.	2020-04-27 16:02:37 -04:00
Sameer Sahasrabuddhe	8488763682	[NFC] UnifyLoopExits: correctly skip expensive checks	2020-04-27 15:10:35 +05:30
Ayal Zaks	a3c964a278	[LV] Fix recording of BranchTakenCount for FoldTail When folding tail, branch taken count is computed during initial VPlan execution and recorded to be used by the compare computing the loop's mask. This recording should directly set the State, instead of reusing Value2VPValue mapping which serves original Values present prior to vectorization. The branch taken count may be a constant Value, which may be used elsewhere in the loop; trying to employ Value2VPValue for both leads to the issue reported in https://reviews.llvm.org/D76992#inline-721028 Differential Revision: https://reviews.llvm.org/D78847	2020-04-26 20:13:10 +03:00
Florian Hahn	2f3e86b318	[DSE,MSSA] Continue checking more remaining candidates with dbgcnt. After changing the candidate iteration strategy, we should continue with the next candidate, rather than breaking out of the loop.	2020-04-26 16:59:32 +01:00
Florian Hahn	7d57d22baa	[SCCP] Support ranges for loads and stores. Integer ranges can be used for loaded/stored values. Note that widening can be disabled for loads/stores, as we only rely on instructions that cause continued increases to ranges to be widened (like binary operators). Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78433	2020-04-26 13:16:47 +01:00
Simon Pilgrim	a3982491db	[Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h. Differential Revision: https://reviews.llvm.org/D78815	2020-04-26 12:58:20 +01:00
Nikita Popov	164845cd92	[GVN] Reduce expression size (NFC) Reduce size of GVN::Expression by reordering fields to reduce padding.	2020-04-26 09:43:35 +02:00
Sergei Trofimovich	09684b08d3	llvm: IPO: handle IRMover error handling, bug #45636 Summary: Missing error mangling is noticed in https://bugs.llvm.org/show_bug.cgi?id=45636 where inconsistent profiling input caused llvm/lld to crash as: ``` Program aborted due to an unhandled Error: linking module flags 'ProfileSummary': IDs have conflicting values in 'Mutex_posix.o' and 'nsBrowserApp.o' ``` The change does not change the fact that LLVM crashes but changes error output to say what was incorrect: ``` LLVM ERROR: Function Import: link error: linking module flags 'ProfileSummary': IDs have conflicting values in 'Mutex_posix.o' and 'nsBrowserApp.o' ``` Actual crash has yet to be fixed. Reviewers: lattner Reviewed By: lattner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78676	2020-04-25 19:16:01 +01:00
Sergey Dmitriev	67aed1469b	[Attributor] Do not set 'returned' attribute for arguments that cannot be bitcasted to function result Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78828	2020-04-25 09:49:40 -07:00
Sanjay Patel	4abab5c5ca	[InstCombine] generalize canonicalization of masked equality comparisons (X | MaskC) == C --> (X & ~MaskC) == C ^ MaskC (X | MaskC) != C --> (X & ~MaskC) != C ^ MaskC We have more analysis for 'and' patterns and already lean this way in the existing code, so this should be neutral or better in IR. If this does not do as well in codegen, the problem already exists and we should fix that based on target costs/heuristics. http://volta.cs.utah.edu:8080/z/oP3ecL define void @src(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) { %or = or i8 %x, %OrC %eq = icmp eq i8 %or, %C store i1 %eq, i1* %p0 %ne = icmp ne i8 %or, %C store i1 %ne, i1* %p1 ret void } define void @tgt(i8 %x, i8 %OrC, i8 %C, i1* %p0, i1* %p1) { %NotOrC = xor i8 %OrC, -1 %a = and i8 %x, %NotOrC %NewC = xor i8 %C, %OrC %eq = icmp eq i8 %a, %NewC store i1 %eq, i1* %p0 %ne = icmp ne i8 %a, %NewC store i1 %ne, i1* %p1 ret void }	2020-04-25 11:31:57 -04:00
Florian Hahn	46a04940e8	[DSE] Add stat for remaining stores after DSE. Using the existing NumFastStores statistic can be misleading when comparing the impact of DSE patches. For example, consider the case where a store gets removed from a function before it is inlined into another function. A less powerful DSE might only remove the store from functions it has been inlined into, which will result in more stores being removed, but no difference in the actual number of stores after DSE. The new stat provides the absolute number of stores surviving after DSE. Reviewers: dmgreen, bryant, asbirlea, jfb Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D78830	2020-04-25 16:12:55 +01:00
Tyker	e5f8a77c19	[AssumeBundles] Refactor assume builder Summary: Refactor the assume builder for the next patch. The assume builder now generates only one assume per attribute kind and per value they are on. To do this it takes the highest. This is desirable because currently, for all attributes, the highest value is the most valuable. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78013	2020-04-25 13:43:52 +02:00
Benjamin Kramer	1d42764df7	Give helpers internal linkage. NFC.	2020-04-25 11:50:52 +02:00
Ehud Katz	64249f177e	[CodeExtractor] Fix extraction of a value used only by intrinsics outside of region We should only skip `lifetime` and `dbg` intrinsics when searching for users. Other intrinsics are legit users that can't be ignored. Without this fix, the testcase would result in an invalid IR. `memcpy` will have a reference to the, now, external value (local to the extracted loop function). Fix PR42194 Differential Revision: https://reviews.llvm.org/D78749	2020-04-25 11:44:47 +03:00
Craig Topper	2c24051bac	[CallSite removal] Rename CallSite.h to AbstractCallSite.h. NFC The CallSite and ImmutableCallSite were removed in a previous commit. So rename the file to match the remaining class and the name of the cpp that implements it.	2020-04-24 22:12:25 -07:00
Tyker	97ecd91e20	[NFC] Refactor SimplifyCFG to make propagating information easier. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77742	2020-04-24 22:22:20 +02:00
Michael Liao	495bb8feb9	Fix `-Wparentheses` warnings. NFC.	2020-04-24 15:04:01 -04:00
Tyker	42431da895	[AssumeBundles] Use assume bundles in isKnownNonZero Summary: Use nonnull and dereferenceable from an assume bundle in isKnownNonZero Reviewers: jdoerfert, nikic, lebedev.ri, reames, fhahn, sstefan1 Reviewed By: jdoerfert Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76149	2020-04-24 20:41:51 +02:00
Florian Hahn	e1235831c4	[DSE,MSSA] Improve debug output (NFC). This patch slightly improves the formatting of the debug output, adds a few missing outputs and makes some existing outputs more consistent with the rest.	2020-04-24 17:50:08 +01:00
Florian Hahn	44ce588670	[DSE,MSSA] Skip checking write clobber for DomAccess (NFC). There is no need to check if the starting access is a write clobber and all of its uses have already been checked.	2020-04-24 17:16:22 +01:00
Sanjay Patel	e4175ff525	[InstCombine] intersect FMF when reassociating FP min/max intrinsics As discussed in PR45478: https://bugs.llvm.org/show_bug.cgi?id=45478 ...propagating FMF from the outer (second) call is not correct, so intersect them instead. I suspect we could do better (see TODO comment), but mismatched FMF is probably too rare to care about. Differential Revision: https://reviews.llvm.org/D78631	2020-04-24 12:14:03 -04:00
Simon Pilgrim	27ad103a3a	ARCRuntimeEntryPoints.h - remove unnecessary includes. NFC.	2020-04-24 14:32:45 +01:00
Max Kazantsev	9cd4debd5a	[LoopVectorize] Preserve CFG analyses if CFG wasn't modified One of the transforms the loop vectorizer makes is LCSSA formation. In some cases it is the only transform it makes. We should not drop CFG analyses if only LCSSA was formed and no actual CFG changes were made. We should think of expanding this logic to other passes as well, and maybe make it a part of the PM framework. Reviewed By: Florian Hahn Differential Revision: https://reviews.llvm.org/D78360	2020-04-24 17:22:24 +07:00
Johannes Doerfert	1dfc473177	Revert "[Attributor][NFC] Encode IRPositions in the bits of a single pointer" A dependent patch has been reverted [0]. Until it goes back in this one has to stay out. [0] `ebdb893994` This reverts commit `d254b50b2b`.	2020-04-24 02:53:51 -05:00
Johannes Doerfert	d254b50b2b	[Attributor][NFC] Encode IRPositions in the bits of a single pointer This reduces memory consumption for IRPositions by eliminating the vtable pointer and the `KindOrArgNo` integer. Since each abstract attribute has an associated IRPosition, the 12-16 bytes we save add up quickly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 469545 (260135/s) temporary memory allocations: 77137 (42735/s) peak heap memory consumption: 30.50MB peak RSS (including heaptrack overhead): 119.50MB total memory leaked: 269.07KB ``` After: ``` calls to allocation functions: 468999 (274108/s) temporary memory allocations: 77002 (45004/s) peak heap memory consumption: 28.83MB peak RSS (including heaptrack overhead): 118.05MB total memory leaked: 269.07KB ``` Difference: ``` calls to allocation functions: -546 (5808/s) temporary memory allocations: -135 (1436/s) peak heap memory consumption: -1.67MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` --- CTMark 15 runs Metric: compile_time Program lhs rhs diff test-suite...:: CTMark/sqlite3/sqlite3.test 25.07 24.09 -3.9% test-suite...Mark/mafft/pairlocalalign.test 14.58 14.14 -3.0% test-suite...-typeset/consumer-typeset.test 21.78 21.58 -0.9% test-suite :: CTMark/SPASS/SPASS.test 21.95 22.03 0.4% test-suite :: CTMark/lencod/lencod.test 25.43 25.50 0.3% test-suite...ark/tramp3d-v4/tramp3d-v4.test 23.88 23.83 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 60.24 60.11 -0.2% test-suite :: CTMark/kimwitu++/kc.test 15.69 15.69 -0.0% test-suite...:: CTMark/ClamAV/clamscan.test 25.43 25.42 -0.0% test-suite :: CTMark/Bullet/bullet.test 37.63 37.62 -0.0% Geomean difference -0.8% --- Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D78722	2020-04-24 01:58:47 -05:00
Mircea Trofin	b8960b5d81	[llvm][NFC][CallSite] Remove remaining {Immutable}CallSite uses Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78789	2020-04-23 22:19:39 -07:00
Mehdi Amini	2107af9ccf	Revert "[VPlan] Add & use VPValue operands for VPWidenRecipe (NFC)." This reverts commit `9245c7ac13`. This is triggering a segfault in XLA downstream, we'll follow-up with a reproducer, it is likely influenced by TTI/TLI settings or other options as a simple `opt -loop-vectorize` invocation on the IR before the crash does not reproduce immediately.	2020-04-24 05:07:32 +00:00
Mircea Trofin	2059a6e3ef	[llvm][NFC][CallSite] Remove ImmutableCallSite from a few locations Reviewers: craig.topper, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78783	2020-04-23 21:18:44 -07:00
Craig Topper	cbe77ca9bd	[CallSite removal] Remove unneeded includes of CallSite.h. NFC	2020-04-23 21:01:48 -07:00
Craig Topper	81c5e83f7d	[CallSite removal][Transform] Replace CallSite with CallBase in Utils. NFC Differential Revision: https://reviews.llvm.org/D78780	2020-04-23 20:49:33 -07:00
Roman Lebedev	5a159ed2a8	[InstCombine] Negator: don't negate multi-use `sub` While we can do that, it doesn't increase instruction count, if the old `sub` sticks around then the transform is not only not a unlikely win, but a likely regression, since we likely now extended live range and use count of both of the `sub` operands, as opposed to just the result of `sub`. As Kostya Serebryany notes in post-commit review in https://reviews.llvm.org/D68408#1998112 this indeed can degrade final assembly, increase register pressure, and spilling. This isn't what we want here, so at least for now let's guard it with an use check.	2020-04-23 23:59:15 +03:00
Christopher Tetreault	7ca56c90bd	[SVE] Remove calls to isScalable from Transforms Reviewers: efriedma, chandlerc, reames, aprantl, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77756	2020-04-23 13:50:07 -07:00
Mircea Trofin	ceb7f308b8	[llvm][NFC][CallSite] Removed CallSite from few implementation details Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78724	2020-04-23 10:36:36 -07:00
Mircea Trofin	cea6f4d5f8	[llvm][NFC][CallSite] Remove CallSite from TypeMetadataUtils & related Reviewers: craig.topper, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78666	2020-04-23 08:23:16 -07:00
Sanjay Patel	62da6ecea2	[InstCombine] substitute equivalent constant to reduce logic-of-icmps (X == C) && (Y Pred1 X) --> (X == C) && (Y Pred1 C) (X != C) \|\| (Y Pred1 X) --> (X != C) \|\| (Y Pred1 C) This cooperates/overlaps with D78430, but it is a more general transform that gets us most of the expected simplifications and several other improvements. http://volta.cs.utah.edu:8080/z/5gxjjc PR45618: https://bugs.llvm.org/show_bug.cgi?id=45618 Differential Revision: https://reviews.llvm.org/D78582	2020-04-23 10:19:16 -04:00
Simon Pilgrim	7a8b1096be	[ObjCARC] Remove unused forward declarations. NFC.	2020-04-23 13:52:49 +01:00
Simon Pilgrim	b108a457e1	[VPlan] Remove unused forward declarations. NFC. Move VPlan.h include from VPlanVerifier.h down to VPlanVerifier.cpp	2020-04-23 12:34:20 +01:00
Serguei Katkov	c0d2bbb1d4	[CaptureTracking] Replace hardcoded constant to option. NFC. The motivation is to be able to play with the option and change if it is required. Reviewers: fedor.sergeev, apilipenko, rnk, jdoerfert Reviewed By: fedor.sergeev Subscribers: hiraditya, dantrushin, llvm-commits Differential Revision: https://reviews.llvm.org/D78624	2020-04-23 18:23:35 +07:00
Florian Hahn	9245c7ac13	[VPlan] Add & use VPValue operands for VPWidenRecipe (NFC). This patch adds VPValue version of the instruction operands to VPWidenRecipe and uses them during code-generation. Similar to D76373 this reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Reviewers: rengolin, Ayal, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D76992	2020-04-23 12:16:46 +01:00
Craig Topper	25807452ac	[ArgumentPromotion] Remove unnecessary getScalarType() before casting to PointerType. NFC I don't believe this pass deals with vectors of pointers. I think this getScalarType() was added during a mechanical opaque pointer change of the interface to GetElementPtrInst::getIndexedType.	2020-04-22 22:51:41 -07:00
Vedant Kumar	2fa656cdfd	[Debugify] Do not require named metadata to be present when stripping This allows -mir-strip-debug to be run without -debugify having run before.	2020-04-22 17:03:39 -07:00
Vedant Kumar	2a5675f11d	[MachineDebugify] Insert synthetic DBG_VALUE instructions Summary: Teach MachineDebugify how to insert DBG_VALUE instructions. This can help find bugs causing CodeGen differences when debug info is present. DBG_VALUE instructions are only emitted when -debugify-level is set to locations+variables. There is essentially no attempt made to match up DBG_VALUE register operands with the local variables they ought to correspond to. I'm not sure how to improve the situation. In some cases (MachineMemOperand?) it's possible to find the IR instruction a MachineInstr corresponds to, but in general this seems to call for "undoing" the work done by ISel. Reviewers: dsanders, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78135	2020-04-22 17:03:39 -07:00
Juneyoung Lee	aca335955c	[ValueTracking] Let analyses assume a value cannot be partially poison Summary: This is RFC for fixes in poison-related functions of ValueTracking. These functions assume that a value can be poison bitwisely, but the semantics of bitwise poison is not clear at the moment. Allowing a value to have bitwise poison adds complexity to reasoning about correctness of optimizations. This patch makes the analysis functions simply assume that a value is either fully poison or not, which has been used to understand the correctness of a few previous optimizations. The bitwise poison semantics seems to be only used by these functions as well. In terms of implementation, using value-wise poison concept makes existing functions do more precise analysis, which is what this patch contains. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr Reviewed By: nikic Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78503	2020-04-23 08:08:53 +09:00
Juneyoung Lee	5ceef26350	Revert "RFC: [ValueTracking] Let analyses assume a value cannot be partially poison" This reverts commit `80faa8c3af`.	2020-04-23 08:07:09 +09:00
Juneyoung Lee	80faa8c3af	RFC: [ValueTracking] Let analyses assume a value cannot be partially poison Summary: This is RFC for fixes in poison-related functions of ValueTracking. These functions assume that a value can be poison bitwisely, but the semantics of bitwise poison is not clear at the moment. Allowing a value to have bitwise poison adds complexity to reasoning about correctness of optimizations. This patch makes the analysis functions simply assume that a value is either fully poison or not, which has been used to understand the correctness of a few previous optimizations. The bitwise poison semantics seems to be only used by these functions as well. In terms of implementation, using value-wise poison concept makes existing functions do more precise analysis, which is what this patch contains. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr Reviewed By: nikic Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78503	2020-04-23 07:57:12 +09:00
Florian Hahn	352b612a71	[SCCP] Drop unnecessary early exit for ExtractValueInst. visitExtractValueInst uses mergeInValue, so it already can handle constant ranges. Initially the early exit was using isOverdefined to keep things as NFC during the initial move to ValueLatticeElement. As the function already supports constant ranges, it can just use ValueState[&I].isOverdefined. Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78393	2020-04-22 22:07:59 +01:00
Craig Topper	be04aba6fc	[CallSite removal][ValueTracking] Use CallBase instead of ImmutableCallSite for getIntrinsicForCallSite. NFC Differential Revision: https://reviews.llvm.org/D78613	2020-04-22 12:06:58 -07:00
Christopher Tetreault	2dea3f1298	[SVE] Add new VectorType subclasses Summary: Introduce new types for fixed width and scalable vectors. Does not remove getNumElements yet so as to not break code during transition period. Reviewers: deadalnix, efriedma, sdesmalen, craig.topper, huntergr Reviewed By: sdesmalen Subscribers: jholewinski, arsenm, jvesely, nhaehnle, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, kerbowa, Joonsoo, grosul1, frgossen, lldb-commits, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm, #lldb Differential Revision: https://reviews.llvm.org/D77587	2020-04-22 08:59:01 -07:00
Mircea Trofin	1b6b05a250	[llvm][NFC][CallSite] Remove CallSite from a few trivial locations Summary: Implementation details and internal (to module) APIs. Reviewers: craig.topper, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78610	2020-04-22 08:39:21 -07:00
Dmitry Vyukov	5a2c31116f	[TSAN] Add optional support for distinguishing volatiles Add support to optionally emit different instrumentation for accesses to volatile variables. While the default TSAN runtime likely will never require this feature, other runtimes for different environments that have subtly different memory models or assumptions may require distinguishing volatiles. One such environment are OS kernels, where volatile is still used in various places for various reasons, and often declare volatile to be "safe enough" even in multi-threaded contexts. One such example is the Linux kernel, which implements various synchronization primitives using volatile (READ_ONCE(), WRITE_ONCE()). Here the Kernel Concurrency Sanitizer (KCSAN) [1], is a runtime that uses TSAN instrumentation but otherwise implements a very different approach to race detection from TSAN. While in the Linux kernel it is generally discouraged to use volatiles explicitly, the topic will likely come up again, and we will eventually need to distinguish volatile accesses [2]. The other use-case is ignoring data races on specially marked variables in the kernel, for example bit-flags (here we may hide 'volatile' behind a different name such as 'no_data_race'). [1] https://github.com/google/ktsan/wiki/KCSAN [2] https://lkml.kernel.org/r/CANpmjNOfXNE-Zh3MNP=-gmnhvKbsfUfTtWkyg_=VqTxS4nnptQ@mail.gmail.com Author: melver (Marco Elver) Reviewed-in: https://reviews.llvm.org/D78554	2020-04-22 17:27:09 +02:00
Roman Lebedev	67266d879c	[InstCombine] Negator: shufflevector is negatible All these folds are correct as per alive-tv	2020-04-22 15:14:23 +03:00
Craig Topper	05a11974ae	[CallSite removal] Remove unneeded includes of CallSite.h. NFC	2020-04-22 00:07:13 -07:00
Johannes Doerfert	ca59ff5af9	[Attributor] Replace AccessKind2Accesses map with an "array map" The number of different access location kinds we track is relatively small (8 so far). With this patch we replace the DenseMap that mapped from index (0-7) to the access set pointer with an array of access set pointers. This reduces memory consumption. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 472499 (215654/s) temporary memory allocations: 77794 (35506/s) peak heap memory consumption: 35.28MB peak RSS (including heaptrack overhead): 125.46MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 472270 (308673/s) temporary memory allocations: 77578 (50704/s) peak heap memory consumption: 32.70MB peak RSS (including heaptrack overhead): 121.78MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: -229 (346/s) temporary memory allocations: -216 (326/s) peak heap memory consumption: -2.58MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ``` ---	2020-04-22 01:35:27 -05:00
Johannes Doerfert	f20ff4b17d	[Attributor] Run IRPosition::verify only with EXPENSIVE_CHECKS	2020-04-22 01:35:12 -05:00
Sameer Sahasrabuddhe	5a7a6382bc	FixIrreducible: don't crash when moving a child loop Summary: When an irreducible SCC is converted into a new natural loop, existing loops included in that SCC now become children of the new loop. The logic that moves these loops from the parent loop to the new loop invoked undefined behaviour when it modified the container that it was iterating over. Fixed this by first extracting all the loops that are to be removed from the parent. Fixes bug 45623. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D78544	2020-04-22 07:47:30 +05:30
Mircea Trofin	9ee02aef62	[llvm][NFC][CallSite] Remove CallSite from FunctionAttrs Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78584	2020-04-21 16:16:00 -07:00
Johannes Doerfert	46b7ed0e6f	[Attributor] Remove dependence edges eagerly If we have a dependence between an abstract attribute A to an abstract attribute B such that changes in A should trigger an update of B, we do not need to keep the dependence around once the update was triggered. If the dependence is still required the update will reinsert it into the dependence map, if it is not we avoid triggering B in the future. This replaces the "recompute interval" mechanism we used before to prune stale dependences. Number of required iterations is generally down, compile time for the module pass (not really the CGSCC pass) is down quite a bit. There is one test change which looks like an artifact in the undefined behavior AA that needs to be looked at.	2020-04-21 15:22:10 -05:00
Johannes Doerfert	ea439bbcbb	[Attributor][NFC] Track the number of created AAs in the statistics	2020-04-21 15:22:10 -05:00
Johannes Doerfert	c5794f77eb	[Attributor][PM] Introduce `-attributor-enable={none,cgscc,module,all}` The old command line option `-attributor-disable` was too coarse grained as we want to measure the effects of the module or cgscc pass without the other as well. Since `none` is the default there is no real functional change. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D78571	2020-04-21 15:22:10 -05:00
Michael Liao	163bd9d858	Fix `-Wpedantic` warnings. NFC.	2020-04-21 16:09:17 -04:00
Michael Liao	21529355e1	Fix `-Wparentheses` warnings. NFC.	2020-04-21 15:02:59 -04:00
Roman Lebedev	352fef3f11	[InstCombine] Negator - sink sinkable negations Summary: As we have discussed previously (e.g. in D63992 / D64090 / [[ https://bugs.llvm.org/show_bug.cgi?id=42457 \| PR42457 ]]), `sub` instruction can almost be considered non-canonical. While we do convert `sub %x, C` -> `add %x, -C`, we sparsely do that for non-constants. But we should. Here, i propose to interpret `sub %x, %y` as `add (sub 0, %y), %x` IFF the negation can be sinked into the `%y` This has some potential to cause endless combine loops (either around PHI's, or if there are some opposite transforms). For former there's `-instcombine-negator-max-depth` option to mitigate it, should this expose any such issues For latter, if there are still any such opposing folds, we'd need to remove the colliding fold. In any case, reproducers welcomed! Reviewers: spatel, nikic, efriedma, xbolva00 Reviewed By: spatel Subscribers: xbolva00, mgorny, hiraditya, reames, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68408	2020-04-21 22:00:23 +03:00
Benjamin Kramer	9a08c30705	Bit-pack some pairs. No functionality change intended.	2020-04-21 20:40:20 +02:00
Fangrui Song	cca545ce46	[CallSite] Fix build breakage after D78538	2020-04-21 11:33:40 -07:00
Mircea Trofin	d702325af6	[llvm][NFC][CallSite] Remove CallSite from DeadArgumentElimination Summary: Also capitalized some induction variables, to match coding style. Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78538	2020-04-21 10:48:38 -07:00
Simon Pilgrim	d9af50efbc	[Transforms] getOrEnforceKnownAlignment - fix MSVC result of 32-bit shift implicitly converted to 64 bits warning. NFCI We don't overflow here so we can use a U64 shift directly.	2020-04-21 18:32:12 +01:00
Johannes Doerfert	177c065e50	[Attributor] Use a pointer value type for the OpcodeInstMap This reduces memory consumption and the need to copy complex data structures repeatedly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 490390 (320725/s) temporary memory allocations: 84601 (55330/s) peak heap memory consumption: 41.70MB peak RSS (including heaptrack overhead): 131.18MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 489359 (301144/s) temporary memory allocations: 82983 (51066/s) peak heap memory consumption: 36.76MB peak RSS (including heaptrack overhead): 126.48MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: -1031 (-10739/s) temporary memory allocations: -1618 (-16854/s) peak heap memory consumption: -4.94MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ---	2020-04-21 11:20:09 -05:00
Johannes Doerfert	99662c22cd	[Attributor] Use a pointer value type for the QueryMap This reduces memory consumption and the need to copy complex data structures repeatedly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 596180 (374484/s) temporary memory allocations: 84979 (53378/s) peak heap memory consumption: 52.14MB peak RSS (including heaptrack overhead): 139.79MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 489200 (303285/s) temporary memory allocations: 83406 (51708/s) peak heap memory consumption: 41.70MB peak RSS (including heaptrack overhead): 131.76MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: -106980 (-5094285/s) temporary memory allocations: -1573 (-74904/s) peak heap memory consumption: -10.44MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ---	2020-04-21 11:20:04 -05:00
Johannes Doerfert	1f570e019d	[Attributor] Use a pointer value type for the access kind -> accesses map This reduces memory consumption and the need to copy complex data structures repeatedly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 616219 (381559/s) temporary memory allocations: 83294 (51575/s) peak heap memory consumption: 72.15MB peak RSS (including heaptrack overhead): 160.04MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 595004 (357145/s) temporary memory allocations: 83840 (50324/s) peak heap memory consumption: 52.14MB peak RSS (including heaptrack overhead): 138.32MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: -21215 (-415980/s) temporary memory allocations: 546 (10705/s) peak heap memory consumption: -20.01MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ---	2020-04-21 11:20:02 -05:00
Johannes Doerfert	40f3baeb20	[Attributor] Pass the Attributor to the AbstractAttribute constructors AbstractAttribute::initialize is used to initialize the deduction and the object we do not always call it. To make sure we have the option to initialize the object even if initialize is not called we pass the Attributor to AbstractAttribute constructors now.	2020-04-21 11:20:02 -05:00
Johannes Doerfert	91a6c88349	[Attributor] Use a pointer value type for the AAMap This reduces memory consumption and the need to copy complex data structures repeatedly. No functional change is intended. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 613353 (376521/s) temporary memory allocations: 83636 (51341/s) peak heap memory consumption: 75.64MB peak RSS (including heaptrack overhead): 162.97MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 616575 (349929/s) temporary memory allocations: 83650 (47474/s) peak heap memory consumption: 72.15MB peak RSS (including heaptrack overhead): 159.81MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: 3222 (24225/s) temporary memory allocations: 14 (105/s) peak heap memory consumption: -3.49MB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ---	2020-04-21 11:19:58 -05:00
Sanjay Patel	978166f209	[InstCombine] improve types/names for logic-of-icmp helper function; NFC	2020-04-21 10:16:45 -04:00
Florian Hahn	647c9e72e4	[VPlan] Make various tryTo* helpers private and mark as const (NFC). The individual tryTo* helpers do not need to be public. Also, the builder contained two consecutive public: sections, which is not necessary. Moved the remaining public methods after the constructor. Also make some of the tryTo* helpers const. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed by: gilr Differential Revision: https://reviews.llvm.org/D78288	2020-04-21 14:49:02 +01:00
Sanjay Patel	ba72389269	[InstCombine] improve types/names for logic-of-icmp helper functions; NFC	2020-04-21 09:18:22 -04:00
Craig Topper	6235951ec0	[CallSite removal][Instrumentation] Use CallBase instead of CallSite in AddressSanitizer/DataFlowSanitizer/MemorySanitizer. NFC Differential Revision: https://reviews.llvm.org/D78524	2020-04-20 22:39:14 -07:00
Max Kazantsev	a116f0fa86	[LICM][NFC] Reorder checks to speed up things slightly Side effect check is made faster than potentially heavy other checks.	2020-04-21 11:34:44 +07:00
Craig Topper	68b2e507e4	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 21:31:44 -07:00
Johannes Doerfert	dc3b5b00fe	[OpenMPOpt] Make the combination of `ident_t` deterministic Before we kept the first applicable `ident_t` during deduplication of runtime calls. The problem is that "first" is dependent on the iteration order of a DenseMap. Since the proper solution, which is to combine the information from all `ident_t`, should be deterministic on its own, we will not try to make the iteration order deterministic. Instead, we will create a fresh `ident_t` if there is not a unique existing `ident_t*` to pick.	2020-04-20 23:27:08 -05:00
Johannes Doerfert	8855fec37e	[OpenMPOpt] Use a pointer value type in map The value type was a set before which can easily lead to excessive memory usage and copying. We use a pointer to a vector instead now.	2020-04-20 23:27:08 -05:00
Johannes Doerfert	ee17263adc	[OpenMPOpt] Make the SCC a vector to ensure deterministic results	2020-04-20 23:27:08 -05:00
Mircea Trofin	c2d86e1f30	[llvm][NFC][CallSite] Remove CallSite from ArgumentPromotion Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78528	2020-04-20 19:33:42 -07:00
Johannes Doerfert	87aa362985	[Attributor] Use the BumpPtrAllocator in InformationCache as well We now also use the BumpPtrAllocator from the Attributor in the InformationCache. The lifetime of objects in either is pretty much the same and it should result in consistently good performance regardless of the allocator. Doing so requires to call more constructors manually but so far that does not seem to be problematic or messy. --- Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc): Before: ``` calls to allocation functions: 615359 (368257/s) temporary memory allocations: 83315 (49859/s) peak heap memory consumption: 75.64MB peak RSS (including heaptrack overhead): 163.43MB total memory leaked: 269.04KB ``` After: ``` calls to allocation functions: 613042 (359555/s) temporary memory allocations: 83322 (48869/s) peak heap memory consumption: 75.64MB peak RSS (including heaptrack overhead): 162.92MB total memory leaked: 269.04KB ``` Difference: ``` calls to allocation functions: -2317 (-68147/s) temporary memory allocations: 7 (205/s) peak heap memory consumption: 2.23KB peak RSS (including heaptrack overhead): 0B total memory leaked: 0B ---	2020-04-20 21:12:41 -05:00
Mircea Trofin	15cd1e36e4	[llvm][NFC][CallSite] Remove CallSite from CoroEarly Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78523	2020-04-20 18:15:25 -07:00
Sriraman Tallam	365b60fc93	New pass to make internal linkage symbol names unique. With clang option -funique-internal-linkage-symbols, symbols with internal linkage get names with the module hash appended. Differential Revision: https://reviews.llvm.org/D78243	2020-04-20 15:05:22 -07:00
Craig Topper	fcc9d70260	Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign." This is breaking the clang build. This reverts commit `897409fb56`.	2020-04-20 13:25:06 -07:00
Craig Topper	897409fb56	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 13:08:05 -07:00
Nikita Popov	54d01cbc15	[IPT] Don't use OrderedInstructions (NFC) Use Instruction::comesBefore() instead of OrderedInstructions inside InstructionPrecedenceTracking. This also removes the dominator tree dependency. Differential Revision: https://reviews.llvm.org/D78461	2020-04-20 18:25:31 +02:00
Bjorn Pettersson	a8a31fdd80	[Scalarizer] Fix a non-deterministic scatter order problem Summary: The indexing operator in Scatterer may result in building new instructions. When using multiple such operators in a function argument list the order in which we build instructions depends on argument evaluation order (which is undefined in C++). This patch avoids such problems by expanding the components using the [] operator prior to the function call. Problem was seen when comparing output, while building LLVM with different compilers (clang vs gcc). Reviewers: foad, cameron.mcinally, uabelho Reviewed By: foad Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78455	2020-04-20 16:05:33 +02:00
Florian Hahn	fa284e136e	[VPlan] Clean up tryToCreate(Widen)Recipe. (NFC) This patch includes some clean-ups to tryToCreateRecipe, suggested in D77973. It includes: * Renaming tryToCreateRecipe to tryToCreateWidenRecipe. * Move VPBB insertion logic to caller of tryToCreateWidenRecipe. * Hoists instruction checks to tryToCreateWidenRecipe, making it clearer which instructions are handled by which recipe, simplifying the checks by using early exits. * Split up handling of induction PHIs and truncates using inductions. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78287	2020-04-20 10:06:35 +01:00
Florian Hahn	4331b3812a	[PredicateInfo] Use new Instruction::comesBefore instead of OI (NFC). The recently added Instruction::comesBefore can be used instead of OrderedInstructions. Reviewers: rnk, nikic, efriedma Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D78452	2020-04-20 09:22:21 +01:00
Sam Parker	e3056ae9a0	[NFC][TTI] Explicit use of VectorType The API for shuffles and reductions uses generic Type parameters, instead of VectorType, and so assertions and casts are used a lot. This patch makes those types explicit, which means that the clients can't be lazy, but results in less ambiguity, and that can only be a good thing. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45562 Differential Revision: https://reviews.llvm.org/D78357	2020-04-20 09:16:52 +01:00
Craig Topper	53ee8fbc23	[CallSite removal][SCCP] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78470	2020-04-20 00:16:09 -07:00
Craig Topper	4cf6d4ab48	[CallSite removal][CalledValuePropagation] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78467	2020-04-19 22:05:40 -07:00
Florian Hahn	a7aaadc135	[TTI] Clean up includes (NFC). Remove some unnecessary includes, replace some with forward declarations. This also exposed a few places that were missing some includes.	2020-04-19 20:11:59 +01:00
Florian Hahn	32af48cdcf	[IVDescriptors] Clean up includes. Some includes are not required and forward declarations can be used instead. This also exposed a few places that were not directly including required files.	2020-04-19 20:07:47 +01:00
Florian Hahn	7a87e8f90b	[LoopUtils] Clean up includes, use forward decls if appropriate (NFC). Most of the includes in LoopUtils.h are not required in the header and they can be replaced by forward declarations. Unfortunately includes of TargetTransformInfo.h and IVDescriptors.h pull in a bunch of additional things, but there is no easy way to get rid of them at the moment I think.	2020-04-19 19:44:29 +01:00
Sanjay Patel	bef6e67e95	[VectorCombine] transform bitcasted shuffle to wider elements bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC' This is the widen shuffle elements enhancement to D76727. It builds on the analysis and simplifications in D77881 and rG6a7e958a423e. The phase ordering tests show that we can simplify inverse shuffles across a binop in both directions (widen/narrow or narrow/widen) now. There's another potential transform visible in some of the remaining TODOs - move a bitcasted operand of a shuffle after the shuffle. Differential Revision: https://reviews.llvm.org/D78371	2020-04-19 08:24:38 -04:00
Benjamin Kramer	ff54d1c897	Remove remaining callers of CreateShuffleVector with unsigned indices and mark it as deprecated No functionality change intended.	2020-04-19 11:48:28 +02:00
Florian Hahn	6ba0695c60	[ValueLattice] Add struct for merge options. This makes it easier to extend the merge options in the future and also reduces the risk of accidentally setting a wrong option. Reviewers: efriedma, nikic, reames, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78368	2020-04-19 09:03:16 +01:00
Ayal Zaks	8e0c5f7200	[LV] Mark first-order recurrences as allowed exits First-order recurrences require special treatment when they are live-out; such treatment is provided by fixFirstOrderRecurrence(), so they should be included in AllowedExit set. (Should probably have been included originally in D16197.) Fixes PR45526: AllowedExit set is used by prepareToFoldTailByMasking() to check whether the treatment for live-outs also holds when folding the tail, which is not (yet) the case for first-order recurrences. Differential Revision: https://reviews.llvm.org/D78210	2020-04-18 23:54:21 +03:00
Craig Topper	7fde990694	Recommit "[Local] Simplify the alignment limits in getOrEnforceKnownAlignment. NFCI" With a tweak to avoid a linker error for passing MaxAlignmentExponent by reference to std::min.	2020-04-18 13:51:57 -07:00
Nikita Popov	a42fd18d0f	[PredicateInfo] Factor out PredicateInfoBuilder (NFC) When running IPSCCP on a module with many small functions, memory usage is dominated by PredicateInfo, which is a huge structure (partially due to some unfortunate nested SmallVector use). However, most of it is actually only temporary state needed to build predicate info, and does not need to be retained after initial construction. This patch factors out the predicate building logic and state into a separate PredicateInfoBuilder, with the extra bonus that it does not need to live in the header anymore. Differential Revision: https://reviews.llvm.org/D78326	2020-04-18 22:34:38 +02:00
Craig Topper	44d63b7528	Revert "[Local] Simplify the alignment limits in getOrEnforceKnownAlignment. NFCI" This reverts commit `e00cfe254d`. Seems to be causing a linker error on the build bots.	2020-04-18 13:23:29 -07:00
Craig Topper	e00cfe254d	[Local] Simplify the alignment limits in getOrEnforceKnownAlignment. NFCI We previously clamped the trailing zero count to 31 bits. And then clamped the final alignment to MaximumAlignment which is 1 << 29. This patch simplifies this to just clamp the trailing zero to 29 using MaxAlignmentExponent. I was looking into changing this function to use Align/MaybeAlign and noticed this. Differential Revision: https://reviews.llvm.org/D78418	2020-04-18 12:52:47 -07:00
Florian Hahn	46853b95ca	[SCCP] Drop unused early exit from visitStoreInst (NFC). There are no lattice values associated with store instructions directly. They will never get marked as overdefined.	2020-04-18 19:44:54 +01:00
Florian Hahn	034e8d58a8	[SCCP] Drop unused early exit from visitReturnInst (NFC). There are no lattice values associated with return instructions directly. They will never get marked as overdefined.	2020-04-18 13:52:41 +01:00
Florian Hahn	4ee45ab60f	[LV] Invalidate cost model decisions along with interleave groups. Cost-modeling decisions are tied to the computed interleave groups (widening decisions, scalar and uniform values). When invalidating the interleave groups, those decisions also need to be invalidated. Otherwise there is a mismatch during VPlan construction. VPWidenMemoryRecipes created initially are left around w/o converting them into VPInterleave recipes. Such a conversion indeed should not take place, and these gather/scatter recipes may in fact be right. The crux is leaving around obsolete CM_Interleave (and dependent) markings of instructions along with their costs, instead of recalculating decisions, costs, and recipes. Alternatively to forcing a complete recompute later on, we could try to selectively invalidate the decisions connected to the interleave groups. But we would likely need to run the uniform/scalar value detection parts again anyways and the extra complexity is probably not worth it. Fixes PR45572. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78298	2020-04-18 10:23:49 +01:00
Mircea Trofin	41ad8b7388	[llvm][NFC][CallSite] Remove CallSite from Evaluator. Reviewers: craig.topper, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78395	2020-04-17 19:11:17 -07:00
Anna Thomas	fd5e069d23	Fix buildbot failure due to obsolete CallSite usage Fix buildbot failures due to `ef49b1d97e` (which was a revert of a previous change).	2020-04-17 17:46:19 -04:00
Anna Thomas	ef49b1d97e	Revert "[InlineFunction] Update metadata on loads that are return values" This reverts commit `1d0f757904` because of https://bugs.llvm.org/show_bug.cgi?id=45590. Needs investigation.	2020-04-17 17:23:00 -04:00
Craig Topper	5f6d93c7d3	[CallSite removal][Attributor] Replaces use of CallSite with CallBase. NFC Differential Revision: https://reviews.llvm.org/D78343	2020-04-17 10:44:31 -07:00
Craig Topper	0feaba683e	[CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC There are also some adjustments to use MaybeAlign in here due to CallBase::getParamAlignment() being deprecated. It would be a little cleaner if getOrEnforceKnownAlignment was migrated to Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78345	2020-04-17 10:32:45 -07:00
Craig Topper	8c94d616e1	Revert "[CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC" There were extra changes that weren't supposed to be in there This reverts commit `b91f78db37`.	2020-04-17 10:11:22 -07:00
Craig Topper	b91f78db37	[CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC There are also some adjustments to use MaybeAlign in here due to CallBase::getParamAlignment() being deprecated. It would be cleaner if getOrEnforceKnownAlignment was migrated to Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78345	2020-04-17 10:07:20 -07:00
Florian Hahn	c245d3e033	[ValueLattice] Steal bits from Tag to track range extensions (NFC). Users of ValueLatticeElement currently have to ensure constant ranges are not extended indefinitely. For example, in SCCP, mergeIn goes to overdefined if a constantrange value is repeatedly merged with larger constantranges. This is a simple form of widening. In some cases, this leads to an unnecessary loss of information and things can be improved by allowing a small number of extensions in the hope that a fixed point is reached after a small number of steps. To make better decisions about widening, it is helpful to keep track of the number of range extensions. That state is tied directly to a concrete ValueLatticeElement and some unused bits in the class can be used. The current patch preserves the existing behavior by default: CheckWiden defaults to false and if CheckWiden is true, a single change to the range is allowed. Follow-up patches will slightly increase the threshold for widening. Reviewers: efriedma, davide, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78145	2020-04-17 15:38:23 +01:00
Benjamin Kramer	c5e7c2691d	Remove accidental include. Thank you clangd.	2020-04-17 16:36:30 +02:00
Benjamin Kramer	b639091c02	Change users of CreateShuffleVector to pass the masks as int instead of Constants No functionality change intended.	2020-04-17 16:34:29 +02:00
Benjamin Kramer	166467e822	[VectorUtils] Create shufflevector masks as int vectors instead of Constants No functionality change intended.	2020-04-17 15:28:00 +02:00
Max Kazantsev	72c13446ce	[NFC] Add missing 'const' notion to LCSSA-related functions These functions don't really do any changes to loop info or dominator tree. We should state this explicitly using 'const'.	2020-04-17 17:49:34 +07:00
Simon Pilgrim	fa7f328a15	[cmake] LLVMVectorize - add include/llvm/Transforms/Vectorize header path MSVC projects were missing the llvm/Transforms/Vectorize/* headers	2020-04-17 11:06:26 +01:00
Craig Topper	5034df8600	[SampleProfile] Use CallBase in function arguments and data structures to reduce the number of explicit casts. NFCI Removing CallSite left us with a bunch of explicit casts from Instruction to CallBase. This moves the casts earlier so that function arguments and data structure types are CallBase so we don't have to cast when we use them. Differential Revision: https://reviews.llvm.org/D78246	2020-04-16 22:10:34 -07:00
Craig Topper	798b262c3c	[CallSite removal][IPO] Change implementation of AbstractCallSite to store a CallBase* instead of CallSite. NFCI. CallSite will likely be removed soon, but AbstractCallSite serves a different purpose and won't be going away. This patch switches it to internally store a CallBase* instead of a CallSite. The only interface changes are the removal of the getCallSite method and getCallBackUses now takes a CallBase&. These methods had only a few callers that were easy enough to update without needing a compatibility shim. In the future once the other CallSites are gone, the CallSite.h header should be renamed to AbstractCallSite.h Differential Revision: https://reviews.llvm.org/D78322	2020-04-16 16:24:45 -07:00
Bob Haarman	cc5c58889e	[WPD] Avoid noalias assumptions in unique return value optimization Summary: Changes the type of the @__typeid_.*_unique_member imports we generate for unique return value optimization from i8 to [0 x i8]. This prevents assuming that these imports do not alias, such as when two unique return values occur in the same vtable. Fixes PR45393. Reviewers: tejohnson, pcc Reviewed By: pcc Subscribers: aganea, hiraditya, rnk, george.burgess.iv, dblaikie, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77421	2020-04-16 14:49:51 -07:00
Roman Lebedev	b1fbf438f6	[OpenMPOpt] deduplicateRuntimeCalls(): avoid traditional map lookup pitfall Summary: This roughly halves time spent in that pass, while unsurprisingly significantly reducing total memory usage. This makes sense because most functions won't use any openmp functions.. old ``` 0.2329 ( 0.5%) 0.0409 ( 0.9%) 0.2738 ( 0.5%) 0.2736 ( 0.5%) OpenMP specific optimizations ``` ``` total runtime: 63.32s. bytes allocated in total (ignoring deallocations): 8.34GB (131.70MB/s) calls to allocation functions: 14526259 (229410/s) temporary memory allocations: 3335760 (52680/s) peak heap memory consumption: 324.36MB peak RSS (including heaptrack overhead): 5.39GB total memory leaked: 289.93MB ``` new ``` 0.1457 ( 0.3%) 0.0276 ( 0.6%) 0.1732 ( 0.3%) 0.1731 ( 0.3%) OpenMP specific optimizations ``` ``` total runtime: 55.01s. bytes allocated in total (ignoring deallocations): 6.70GB (121.89MB/s) calls to allocation functions: 14268205 (259398/s) temporary memory allocations: 3225355 (58637/s) peak heap memory consumption: 324.09MB peak RSS (including heaptrack overhead): 5.39GB total memory leaked: 289.87MB ``` diff ``` total runtime: -8.31s. bytes allocated in total (ignoring deallocations): -1.63GB (196.58MB/s) calls to allocation functions: -258054 (31034/s) temporary memory allocations: -110405 (13277/s) peak heap memory consumption: -262.36KB peak RSS (including heaptrack overhead): 0B total memory leaked: -61.45KB ``` Reviewers: jdoerfert, hfinkel Reviewed By: jdoerfert Subscribers: yaxunl, hiraditya, guansong, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78299	2020-04-16 19:54:02 +03:00
Bjorn Pettersson	fdf9bad573	[Float2Int] Stop passing around a reference to the class member Roots. NFC The Float2IntPass got a class member called Roots, but Roots was also passed around to member functions as a reference. This patch simply removes those references.	2020-04-16 15:24:13 +02:00
Johannes Doerfert	c4d3188adb	[Attributor][NFC] Reduce indentation for call site attribute seeding Also added a TODO to remind us that indirect calls could be optimized as well.	2020-04-16 02:32:31 -05:00
Johannes Doerfert	0741dec27b	[Attributor][FIX] Handle droppable uses when replacing values Since we use the fact that some uses are droppable in the Attributor we need to handle them explicitly when we replace uses. As an example, an assumed dead value can have live droppable users. In those we cannot replace the value simply by an undef. Instead, we either drop the uses (via `dropDroppableUses`) or keep them as they are. In this patch we do both, depending on the situation. For values that are dead but not necessarily removed we keep droppable uses around because they contain information we might be able to use later. For values that are removed we drop droppable uses explicitly to avoid replacement with undef.	2020-04-16 00:56:08 -05:00
Johannes Doerfert	ea7f17ee38	[InstCombine] Simplify calls with casted `returned` attribute The handling of the `returned` attribute in D75815 did miss the case where the argument is (bit)casted to a different type. This is explicitly allowed by the language reference and exposed by the Attributor. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D77977	2020-04-16 00:56:00 -05:00
Johannes Doerfert	253d6be0f6	[Attributor][FIX] Properly check for accesses to globals The check if globals were accessed was not always working because two bits are set for NO_GLOBAL_MEM. The new check works also if only one kind of globals (internal/external) is accessed.	2020-04-16 00:55:34 -05:00
Johannes Doerfert	ad9c284cc3	[Attributor][NFC] Run the verifier only on functions and under EXPENSIVE_CHECKS Running the verifier is expensive so we want to avoid it even in runs that enable assertions. As we move closer to enabling the Attributor this code will be executed by some buildbots but not cause overhead for most people.	2020-04-16 00:55:33 -05:00
Craig Topper	8e1408695c	[CallSite removal][TargetLibraryInfo] Replace ImmutableCallSite with CallBase in one of the getLibFunc signatures. NFC Differential Revision: https://reviews.llvm.org/D78083	2020-04-15 22:43:41 -07:00
Mircea Trofin	4213bc761a	[llvm][NFC][CallSite] Removed CallSite from some implementation details. Reviewers: craig.topper, dblaikie Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78256	2020-04-15 22:27:05 -07:00
Johannes Doerfert	898bbc252a	[Attributor] Lazily collect function information Before, we eagerly analyzed all the functions to collect information about them, e.g. what instructions may read/write memory. This had multiple drawbacks: - In CGSCC-mode we can end up looking at a callee which is not in the SCC but for which we need an initialized cache. - We end up looking at functions that we deem dead and never need to analyze in the first place. - We have an implicit dependence which is easy to break. This patch moves the function analysis into the information cache and makes it lazy. There is no real functional change expected except due to the first reason above.	2020-04-15 22:26:38 -05:00
Johannes Doerfert	8c4057e3a3	[Attributor] Replace call graph call sites after function replacement The CallGraphUpdater allows to directly alter call site information and we should do so. This might appease the windows buildbot that crashes during the SCC traversal.	2020-04-15 22:24:09 -05:00
Johannes Doerfert	df675890b7	[CallGraphUpdater][NFC] Minor updates to D77855 I uploaded the old version accidentally instead of the one with these minor adjustments requested by the reviewers. Differential Revision: https://reviews.llvm.org/D77855	2020-04-15 21:26:35 -05:00
Alina Sbirlea	edccc35e8f	[Reassociate] Preserve AAManager and BasicAA analyses. Now Reassociate Pass invalidates the analysis results of AAManager and BasicAA, but it saves GlobalsAA, although it seems that it should preserve them, since it affects only Unary and Binary operators. Author: kpolushin (Kirill) Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D77137	2020-04-15 16:58:03 -07:00
Johannes Doerfert	937025757c	[CallGraphUpdater] Remove nodes from their SCC (old PM) Summary: We can and should remove deleted nodes from their respective SCCs. We did not do this before and this was a potential problem even though I couldn't locally trigger an issue. Since the `DeleteNode` would assert if the node was not in the SCC, we know we only remove nodes from their SCC and only once (when run on all the Attributor tests). Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku Subscribers: hiraditya, bollu, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77855	2020-04-15 18:38:50 -05:00
Johannes Doerfert	1b34b84ddd	[CallGraphUpdater] Update the ExternalCallingNode for node replacements Summary: While it is uncommon that the ExternalCallingNode needs to be updated, it can happen. It is uncommon because most functions listed as callees have external linkage, modifying them is usually not allowed. That said, there are also internal functions that have, or better had, their "address taken" at construction time. We conservatively assume various uses cause the address "to be taken". Furthermore, the user might have become dead at some point. As a consequence, transformations, e.g., the Attributor, might be able to replace a function that is listed as callee of the ExternalCallingNode. Since there is no function corresponding to the ExternalCallingNode, we did just remove the node from the callee list if we replaced it (so far). Now it would be preferable to replace it if needed and remove it otherwise. However, removing the node has implications on the CGSCC iteration. Locally, that caused some other nodes to be never visited but it is for sure possible other (bad) side effects can occur. As it seems conservatively safe to keep the new node in the callee list we will do that for now. Reviewers: lebedev.ri, hfinkel, fhahn, probinson, wristow, loladiro, sstefan1, uenoku Subscribers: hiraditya, bollu, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77854	2020-04-15 18:38:50 -05:00
Johannes Doerfert	7ec8d79385	[CallGraphUpdater] Properly remove strongly connected components (oldPM) Summary: The old code did eliminate references from and to functions that were about to be deleted only just before we deleted them. This can cause references from other functions that are supposed to be deleted to still exist, depending on the order. If the functions form a strongly connected component the problem manifests regardless of the order in which we try to actually delete the functions. This patch introduces a two step deletion. First we remove all references and then we delete the function. Note that this only affects the old call graph. There should not be any functional changes if no old style call graph was given. To test this we delete two strongly connected functions instead of one in an existing test. Reviewers: hfinkel Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77975	2020-04-15 18:38:49 -05:00
Craig Topper	240725666a	[CallSite removal][CallSiteSplitting] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78240	2020-04-15 15:38:02 -07:00
Craig Topper	fbb804983d	[CallSite removal][CloneFunction] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78236	2020-04-15 15:38:02 -07:00
Philip Reames	80c46c53bd	[PoisonChecking] Further clarify file scope comment, and update to match naming now used in code	2020-04-15 14:48:53 -07:00
Philip Reames	463513e959	[NFC] Adjust style and clarify comments in PoisonChecking	2020-04-15 14:48:53 -07:00
Philip Reames	75ca7127bc	[NFC] Use new canCreatePoison to make code intent more clear in PoisonChecking	2020-04-15 14:48:53 -07:00
Craig Topper	592d8e7d75	[CallSite removal][SimpleLoopUnswitch] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78227	2020-04-15 13:25:02 -07:00
Craig Topper	7b6ff8bf1f	[CallSite removal][SampleProfile] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78219	2020-04-15 12:47:17 -07:00
Davide Italiano	5f87415efc	[LICM] Try to merge debug locations when sinking. The current strategy LICM uses when sinking for debuginfo is that of picking the debug location of one of the uses. This causes stepping to be wrong sometimes, see, e.g. PR45523. This patch introduces a generalization of getMergedLocation() that operates on a vector of locations instead of two and tries to merge them all together, and uses the new API in LICM. <rdar://problem/61750950>	2020-04-15 12:29:34 -07:00
Craig Topper	a0d92248ea	[CallSite removal][PruneEH] Use CallBase instead of CallSite. NFC Reviewers: mtrofin, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78182	2020-04-15 10:11:41 -07:00
Sanjay Patel	01bcc3e937	[InstCombine] prevent infinite loop with sub/abs of constant expression PR45539: https://bugs.llvm.org/show_bug.cgi?id=45539	2020-04-15 09:19:16 -04:00
Benjamin Kramer	cc035d475f	Upgrade users of 'new ShuffleVectorInst' to pass indices as an int array No functionality change intended.	2020-04-15 14:29:43 +02:00
Florian Hahn	3f7f06888b	[VPlan] Branches are not widened by VPWidenRecipe, assert (NFC).	2020-04-15 12:03:45 +01:00
Benjamin Kramer	6f64daca8f	Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints No functionality change intended.	2020-04-15 12:51:38 +02:00
Florian Hahn	5b4b3e0b6e	[VPlan] Move widening check for non-memory/non-calls to function (NFC). After introducing VPWidenSelectRecipe, the duplicated logic can be shared. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D77973	2020-04-15 11:48:37 +01:00
Florian Hahn	cf9ee49b4d	[DSE] Lift post-dominance for objs not accessible in caller. We can eliminate MemoryDefs of objects not accessible after the function returns (e.g. alloca), if there are no reads between the MemoryDef and any function exits. We can stop traversing paths that completely overwrite the memory location of the MemoryDef. This patch was split off D73763. Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv Reviewed By: asbirlea, george.burgess.iv Differential Revision: https://reviews.llvm.org/D77736	2020-04-15 11:37:14 +01:00
Sameer Sahasrabuddhe	7bb9f500e2	fix warning: specialization of template in different namespace This is related to commit `8c11bc0cd0` which introduces the FixIrreducible pass. The warning seems hard to reproduce locally. The latest attempt ought to work.	2020-04-15 15:57:53 +05:30
Sameer Sahasrabuddhe	8c11bc0cd0	Introduce fix-irreducible pass An irreducible SCC is one which has multiple "header" blocks, i.e., blocks with control-flow edges incident from outside the SCC. This pass converts an irreducible SCC into a natural loop by introducing a single new header block and redirecting all the edges on the original headers to this new block. This is a useful workaround for a limitation in the structurizer, which produces incorrect control flow in the presence of irreducible regions. The AMDGPU backend provides an option to enable this pass before the structurizer, which may eventually be enabled by default. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D77198 This restores commit `2ada8e2525`. Originally reverted with commit `44e09b59b8`.	2020-04-15 15:05:51 +05:30
Florian Hahn	79d185c792	[VPlan] Move Load/Store checks out of tryToWiden (NFC). Handling LoadInst and StoreInst in tryToWiden seems a bit counter-intuitive, as there is only an assertion for them and in no case VPWidenRecipes are created for them. I think it makes sense to move the assertion to handleReplication, where the non-widened loads and stores are handled. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D77972	2020-04-15 10:18:42 +01:00
Gil Rapaport	b747d72c19	[LV] Fix PR45525: Incorrect assert in blend recipe Fix an assert introduced in 41ed5d856c1: a phi with a single predecessor and a mask is a valid case which is already supported by the code. Differential Revision: https://reviews.llvm.org/D78115	2020-04-15 10:39:07 +03:00
Sameer Sahasrabuddhe	44e09b59b8	Revert "Introduce fix-irreducible pass" This reverts commit `2ada8e2525`. Buildbots produced compilation errors which I was not able to quickly reproduce locally. Need more time to investigate.	2020-04-15 12:19:50 +05:30
Sameer Sahasrabuddhe	2ada8e2525	Introduce fix-irreducible pass An irreducible SCC is one which has multiple "header" blocks, i.e., blocks with control-flow edges incident from outside the SCC. This pass converts an irreducible SCC into a natural loop by introducing a single new header block and redirecting all the edges on the original headers to this new block. This is a useful workaround for a limitation in the structurizer, which produces incorrect control flow in the presence of irreducible regions. The AMDGPU backend provides an option to enable this pass before the structurizer, which may eventually be enabled by default. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D77198	2020-04-15 11:29:19 +05:30
Teresa Johnson	33ffb62e23	Allow disabling of vectorization using internal options Summary: Currently, the internal options -vectorize-loops, -vectorize-slp, and -interleave-loops do not have much practical effect. This is because they are used to initialize the corresponding flags in the pass managers, and those flags are then unconditionally overwritten when compiling via clang or via LTO from the linkers. The only exception was -vectorize-loops via opt because of some special hackery there. While vectorization could still be disabled when compiling via clang, using -fno-[slp-]vectorize, this meant that there was no way to disable it when compiling in LTO mode via the linkers. This only affected ThinLTO, since for regular LTO vectorization is done during the compile step for scalability reasons. For ThinLTO it is invoked in the LTO backends. See also the discussion on PR45434. This patch makes it so the internal options can actually be used to disable these optimizations. Ultimately, the best long term solution is to mark the loops with metadata (similar to the approach used to fix -fno-unroll-loops in D77058), but this enables a shorter term workaround, and actually makes these internal options useful. I constant propagated the initial values of these internal flags into the pass manager flags (for some reasons vectorize-loops and interleave-loops were initialized to true, while vectorize-slp was initialized to false). As mentioned above, they are overwritten unconditionally so this doesn't have any real impact, and these initial values aren't particularly meaningful. I then changed the passes to check the internl values and return without performing the associated optimization when false (I changed the default of -vectorize-slp to true so the options behave similarly). I was able to remove the hackery in opt used to get -vectorize-loops=false to work, as well as a special option there used to disable SLP vectorization. 
Finally, I changed thinlto-slp-vectorize-pm.c to: a) Only test SLP (moved the loop vectorization checking to a new test). b) Use code that is slp vectorized when it is enabled, and check that instead of whether the pass is enabled. c) Test the new behavior of -vectorize-slp. d) Test both pass managers. The loop vectorization (and associated interleaving) testing I moved to a new thinlto-loop-vectorize-pm.c test, with several changes: a) Changed the flags on the interleaving testing so that it will actually interleave, and check that. b) Test the new behavior of -vectorize-loops and -interleave-loops. c) Test both pass managers. Reviewers: fhahn, wmi Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, davezarzycki, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77989	2020-04-14 18:09:10 -07:00
Mircea Trofin	447e2c3067	[llvm][NFC][CallSite] Remove Implementation uses of CallSite Reviewers: dblaikie, davidxl, craig.topper Subscribers: arsenm, dschuff, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78142	2020-04-14 14:49:47 -07:00
Christopher Tetreault	8226d599ff	[SVE] Remove calls to getBitWidth from Transforms Reviewers: efriedma, sdesmalen, spatel, eugenis, chandlerc Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77896	2020-04-14 14:31:42 -07:00
Huihui Zhang	5c1d1a62e3	[InstCombine][SVE] Fix visitGetElementPtrInst for scalable type. Summary: This patch fixes the following issues in InstCombiner::visitGetElementPtrInst 1. Skip for scalable type if transformation requires fixed size number of vector element. 2. Skip for scalable type if transformation relies on compile-time known type alloc size. 3. Use VectorType::getElementCount when scalable property is used to construct new VectorType. 4. Use TypeSize::getKnownMinSize when minimal size of a scalable type is valid to determine GEP 'inbounds'. 5. Explicitly call TypeSize::getFixedSize to avoid implicit type conversion to uint64_t. Reviewers: sdesmalen, efriedma, spatel, ctetreau Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78081	2020-04-14 12:38:32 -07:00
Sanjay Patel	6a7e958a42	[InstCombine] try to reduce more shuffles with bitcasted operand This is the widen mask element sibling to D76844. shuf (bitcast X), undef, Mask --> bitcast X' http://volta.cs.utah.edu:8080/z/4dt3V8	2020-04-14 15:03:59 -04:00
Benjamin Kramer	7bf166665e	[FunctionAttrs] Don't copy all the nodes where a reference is fine.	2020-04-14 17:18:23 +02:00
Max Kazantsev	f8a42bca28	[ADCE] Fix incorrect reporting of CFG changes This patch fixes 2 related bugs in ADCE: - `performDeadCodeElimination` does not report changes if it did ONLY CFG changes (affects both old and new pass managers); - When control flow removal is enabled, new pass manager does not drop CFG analyses. Both can lead to incorrect loop info after ADCE that does only CFG changes. Differential Revision: https://reviews.llvm.org/D78103 Reviewed By: Denis Antrushin	2020-04-14 20:26:13 +07:00
Aaron Puchert	e833e58300	[ValueLattice] Remove unused DataLayout parameter of mergeIn, NFC Reviewed By: fhahn, echristo Differential Revision: https://reviews.llvm.org/D78061	2020-04-14 13:32:53 +02:00
Georgii Rymar	1647ff6e27	[ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers. It can be used to avoid passing the begin and end of a range. This makes the code shorter and it is consistent with other wrappers we already have. Differential revision: https://reviews.llvm.org/D78016	2020-04-14 14:11:02 +03:00
Florian Hahn	38609fa9e4	Recommit "[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef." This includes a fix reported with simplifications in the presence of NaN. This reverts the revert commit `06408451bf`.	2020-04-14 11:48:52 +01:00
Tyker	3bdfa966ec	[AssumeBundles] preserve knowledge in DCE Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77403	2020-04-14 12:48:15 +02:00
Tyker	086de7673e	[AssumeBundles] preserve knowledge in DSE Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77404	2020-04-14 12:48:15 +02:00
Tyker	de4dc275f5	[AssumeBundles] preserve information in NewGVN Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: Prazek, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77406	2020-04-14 12:48:14 +02:00
Tyker	c35194b800	[AssumeBundles] preserve information in LICM Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77407	2020-04-14 12:48:14 +02:00
Tyker	1d2b76a8fc	[AssumeBundles] adapt GVN to assume bundles Summary: prevent GVN from removing assume bundles make GVN preserve information from removed instructions Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77405	2020-04-14 12:48:14 +02:00
Pratyai Mazumder	0c61e91100	[SanitizerCoverage] The section name for inline-bool-flag was too long for darwin builds, so shortening it. Summary: Following up on the comments on D77638. Not undoing rGd6525eff5ebfa0ef1d6cd75cb9b40b1881e7a707 here at the moment, since I don't know how to test mac builds. Please let me know if I should include that here too. Reviewers: vitalybuka Reviewed By: vitalybuka Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77889	2020-04-14 02:06:33 -07:00
Mircea Trofin	4aae4e3f48	[llvm][NFC] CallSite removal from inliner-related files Summary: This removes CallSite from inliner files. Some dependencies were thus affected. Reviewers: dblaikie, davidxl, craig.topper Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, aheejin, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77991	2020-04-13 21:28:58 -07:00
Mehdi Amini	384ca190ae	Revert "Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object" This reverts commit `10df1563d6`. Some buildbots are broken.	2020-04-14 00:27:08 +00:00
Mehdi Amini	10df1563d6	Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object ModuleSummaryAnalysis is the only file in libAnalysis that brings a dependency on the CodeGen layer from libAnalysis, moving it breaks this dependency. Differential Revision: https://reviews.llvm.org/D77994	2020-04-13 23:12:11 +00:00
Benjamin Kramer	f1542efd97	[CHR] Clean up some code and reduce copying. NFCI.	2020-04-13 23:11:20 +02:00
Christopher Tetreault	3297e9b7c3	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: rriddle, sdesmalen, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77259	2020-04-13 12:29:43 -07:00
Benjamin Kramer	ec228d722c	[InstCombine] Use SmallBitVector for conveniently checking if all bits are set	2020-04-13 20:37:15 +02:00
Vedant Kumar	4831f4b7bd	[InstCombine] Fix debug variance issue in tryToMoveFreeBeforeNullTest Fix an issue where the presence of debug info could disable an optimization in tryToMoveFreeBeforeNullTest.	2020-04-13 10:55:17 -07:00
Vedant Kumar	122a6bfb07	[Debugify] Strip added metadata in the -debugify-each pipeline Summary: Share logic to strip debugify metadata between the IR and MIR level debugify passes. This makes it simpler to hunt for bugs by diffing IR with vs. without -debugify-each turned on. As a drive-by, fix an issue causing CallGraphNodes to become invalid when a dead llvm.dbg.value prototype is deleted. Reviewers: dsanders, aprantl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77915	2020-04-13 10:55:17 -07:00
Gil Rapaport	41ed5d856c	[LV] Clean up vectorizeInterleaveGroup (NFCI) Pass the interleave group itself from the calling recipe, instead of passing the group's insertion position and having the function query CM for its interleave group and check that the given instruction is that group's insertion point. Differential Revision: https://reviews.llvm.org/D78002	2020-04-13 13:15:06 +03:00
Tyker	813f438baa	[AssumeBundles] adapt Assumption cache to assume bundles Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge. Reviewers: jdoerfert, hfinkel Reviewed By: jdoerfert Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77402	2020-04-13 12:04:51 +02:00
Benjamin Kramer	06408451bf	Revert "[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef." This reverts commit `1a02aaeaa4`. Crashes on the following test case: $ cat crash.ll source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" @0 = private unnamed_addr constant [24 x i8] c"\00\00\C0\7F\00\00\C0\7F\09\85\08?\ED\C94\FE~\EB/\F3\90\CF\BA\C1" @1 = private unnamed_addr constant [24 x i8] c"\00\00\C0\7F\A3\A0\0FA\00\00\C0\7F\00\00\C0\7F\00\00\00\00\02\9AA\00" define void @IgammaSpecialValues.448() { entry: br label %fusion.26.loop_header.dim.0 fusion.26.loop_header.dim.0: ; preds = %fusion.26.loop_header.dim.0, %entry %fusion.26.invar_address.dim.0.0 = phi i64 [ 0, %entry ], [ %invar.inc17, %fusion.26.loop_header.dim.0 ] %0 = getelementptr inbounds [6 x float], [6 x float]* bitcast ([24 x i8]* @0 to [6 x float]*), i64 0, i64 %fusion.26.invar_address.dim.0.0 %1 = load float, float* %0 %2 = fmul float %1, 0.000000e+00 %3 = getelementptr inbounds [6 x float], [6 x float]* bitcast ([24 x i8]* @1 to [6 x float]*), i64 0, i64 %fusion.26.invar_address.dim.0.0 %4 = load float, float* %3 %5 = fneg float %4 %6 = fadd float %2, %5 %invar.inc17 = add nuw nsw i64 %fusion.26.invar_address.dim.0.0, 1 br label %fusion.26.loop_header.dim.0 } $ opt -ipsccp -S < crash.ll opt: llvm/include/llvm/Analysis/ValueLattice.h:251: bool llvm::ValueLatticeElement::markConstant(llvm::Constant *, bool): Assertion `getConstant() == V && "Marking constant with different value"' failed.	2020-04-13 11:23:26 +02:00
Florian Hahn	18138e0252	[VPlan] Introduce VPWidenSelectRecipe (NFC). Widening a select depends on whether the condition is loop invariant or not. Rather than checking during codegen-time, the information can be recorded at the VPlan construction time. This was suggested as part of D76992, to reduce the reliance on accessing the original underlying IR values. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77869	2020-04-13 08:35:28 +01:00
Eli Friedman	cfb844265a	[GlobalOpt] Explicitly set alignment of bool load/store operations.	2020-04-12 16:03:12 -07:00
Huihui Zhang	4bde7c5986	[NFC] Use VectorType::isScalable to align with ongoing VectorType refactor.	2020-04-12 15:39:13 -07:00
Mircea Trofin	d2f1cd5d97	[llvm][NFC] Refactor uses of CallSite to CallBase - call promotion Summary: Updated CallPromotionUtils and impacted sites. Parameters that are expected to be non-null, and return values that are guaranteed non-null, were replaced with CallBase references rather than pointers. Left FIXME in places where more changes are facilitated by CallBase, but aren't CallSites: Instruction* parameters or return values, for example, where the contract is that they are actually CallBase values. Reviewers: davidxl, dblaikie, wmi Reviewed By: dblaikie Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77930	2020-04-12 08:27:29 -07:00
Florian Hahn	ae1e353a25	[VPlan] Turn classes with all public members into structs (NFC). struct should be used when all members are public: https://llvm.org/docs/CodingStandards.html#use-of-class-and-struct-keywords Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D77865	2020-04-12 11:03:39 +01:00
Sanjay Patel	1318ddbc14	[VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC As proposed in D77881, we'll have the related widening operation, so this name becomes too vague. While here, change the function signature to take an 'int' rather than 'size_t' for the scaling factor, add an assert for overflow of 32-bits, and improve the documentation comments.	2020-04-11 10:05:49 -04:00
Benjamin Kramer	e590bd6b92	[argpromote] Use formatv to simplify code. NFCI.	2020-04-11 14:54:32 +02:00
Florian Hahn	719846c469	[VPlan] Drop redundant private: at beginning of class defs (NFC). Default visibility for classes is private, so the private: at the top of various class definitions is redundant. Reviewers: gilr, rengolin, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D77810	2020-04-11 13:27:10 +01:00
Huihui Zhang	6e7eeb44b3	[GVN] Fix VNCoercion for Scalable Vector. Summary: For VNCoercion, skip scalable vector when analysis relies on fixed size, otherwise call TypeSize::getFixedSize() explicitly. Add unit tests to check functionality of GVN load elimination for scalable type. Reviewers: sdesmalen, efriedma, spatel, fhahn, reames, apazos, ctetreau Reviewed By: efriedma Subscribers: bjope, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76944	2020-04-10 17:49:07 -07:00
Eric Christopher	45dca04395	Exclude bitcast and ext/trunc signbit optimization on ppc_fp128 Revision `a1c05fe` <https://reviews.llvm.org/rGa1c05fe20f3def1f1be9f50d2adefc6b6f1578ad> removed bitcast from the list of problematic transformations, however: %97 = fptrunc ppc_fp128 %2 to double // we need to check ppc_fp128 here to prevent the transformation %98 = bitcast double %97 to i64 // `a1c05fe` checks ppc_fp128 at here %99 = icmp slt i64 %98, 0 %100 = zext i1 %99 to i8 store i8 %100, i8* %7, align 1 so this patch does that. I'm also disabling it in the presence of extend just in case. I verified separately that the hash of -std::infinity and std::infinity don't match now. Differential Revision: https://reviews.llvm.org/D77911	2020-04-10 17:07:55 -07:00
Mircea Trofin	da9bcdaad9	[llvm][NFC] Inliner.cpp: ensure InlineHistory ID is always initialized; Summary: The inline history is associated with a call site. There are two locations we fetch inline history. In one, we fetch it together with the call site. In the other, we initialize it under certain conditions, use it later under same conditions (different if check), and otherwise is uninitialized. Although currently there is no uninitialized use, the code is more challenging to maintain correctly, than if the value were always initialized. Changed to the upfront initialization pattern already present in this file. Reviewers: davidxl, dblaikie Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77877	2020-04-10 15:28:53 -07:00
Matt Morehouse	bef187c750	Implement `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist` for clang Summary: This commit adds two command-line options to clang. These options let the user decide which functions will receive SanitizerCoverage instrumentation. This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing. Patch by Yannis Juglaret of DGA-MI, Rennes, France libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy. Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions. The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. 
The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`. Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists: ``` # (a) src:* fun:* # (b) src:SRC/* fun:* # (c) src:SRC/src/woff2_dec.cc fun:* ``` Running the built fuzzers shows how many instrumentation points the compiler adds: the fuzzer will output //XXX PCs//. Whitelist (a) is the instrument-everything whitelist; it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points. For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s. We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. 
This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction. The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise. Reviewers: kcc, morehouse, vitalybuka Reviewed By: morehouse, vitalybuka Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D63616	2020-04-10 10:44:03 -07:00
Mircea Trofin	f62335b534	[llvm][NFC] Style fixes in Inliner.cpp Summary: Function names: camel case, lower case first letter. Variable names: start with upper letter. For iterators that were 'i', renamed with a descriptive name, as 'I' is 'Instruction&'. Lambda captures simplification. Opportunistic boolean return simplification. Reviewers: davidxl, dblaikie Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77837	2020-04-10 08:04:39 -07:00
Ilya Leoshkevich	3bc439bdff	[MSan] Add instrumentation for SystemZ Summary: This patch establishes memory layout and adds instrumentation. It does not add runtime support and does not enable MSan, which will be done separately. Memory layout is based on PPC64, with the exception that XorMask is not used - low and high memory addresses are chosen in a way that applying AndMask to low and high memory produces non-overlapping results. VarArgHelper is based on AMD64. It might be tempting to share some code between the two implementations, but we need to keep in mind that all the ABI similarities are coincidental, and therefore any such sharing might backfire. copyRegSaveArea() indiscriminately copies the entire register save area shadow, however, fragments thereof not filled by the corresponding visitCallSite() invocation contain irrelevant data. Whether or not this can lead to practical problems is unclear, hence a simple TODO comment. Note that the behavior of the related copyOverflowArea() is correct: it copies only the vararg-related fragment of the overflow area shadow. VarArgHelper test is based on the AArch64 one. s390x ABI requires that arguments are zero-extended to 64 bits. This is particularly important for __msan_maybe_warning_() and __msan_maybe_store_origin_() shadow and origin arguments, since non zeroed upper parts thereof confuse these functions. Therefore, add ZExt attribute to the corresponding parameters. Add ZExt attribute checks to msan-basic.ll. Since with -msan-instrumentation-with-call-threshold=0 instrumentation looks quite different, introduce the new CHECK-CALLS check prefix. Reviewers: eugenis, vitalybuka, uweigand, jonpa Reviewed By: eugenis Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits, stefansf, Andreas-Krebbel Tags: #llvm Differential Revision: https://reviews.llvm.org/D76624	2020-04-10 16:53:49 +02:00
Christopher Tetreault	3bebf02861	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77262	2020-04-10 07:47:19 -07:00
Florian Hahn	1a02aaeaa4	[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef. For non-integer constants/expressions and overdefined, I think we can just use SimplifyBinOp to do common folds. By just passing a context with the DL, SimplifyBinOp should not try to get additional information from looking at definitions. For overdefined values, it should be enough to just pass the original operand. Note: The comment before the `if (isconstant(V1State)...` was wrong originally: isConstant() also matches integer ranges with a single element. It is correct now. Reviewers: efriedma, davide, mssimpso, aartbik Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76459	2020-04-10 11:02:57 +01:00
John McCall	8423a6f363	Rename OptimalLayout to OptimizedStructLayout at Chris's request.	2020-04-10 00:14:20 -04:00
Max Kazantsev	4e87823026	[LoopLoadElim] Fix crash by always checking simplify form Loop simplify form should always be checked because logic of propagateStoredValueToLoadUsers relies on it (in particular, it requires preheader). Reviewed By: Fedor Sergeev, Florian Hahn Differential Revision: https://reviews.llvm.org/D77775	2020-04-10 09:23:28 +07:00
Mircea Trofin	655aa1ae4a	[llvm][NFC] Replace CallSite with CallBase in Inliner Summary: Almost all uses are replaced. Left FIXMEs for the two sites that require refactoring outside of Inliner, to scope this patch. Subscribers: eraman, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77817	2020-04-09 15:01:58 -07:00
Christopher Tetreault	19cc9b9ded	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: efriedma, sdesmalen, rriddle Reviewed By: sdesmalen Subscribers: hiraditya, dantrushin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77261	2020-04-09 14:59:14 -07:00
Christopher Tetreault	00a1032412	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: rriddle, sdesmalen, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77260	2020-04-09 13:35:41 -07:00
Zequan Wu	eccfa35d53	Fix lifetime call in landingpad blocking Simplifycfg pass Fix lifetime call in landingpad blocks simplifycfg from removing the landingpad. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D77188	2020-04-09 13:07:32 -07:00
Gil Rapaport	e2a1867880	[LV] Add VPValue operands to VPBlendRecipe (NFCI) InnerLoopVectorizer's code called during VPlan execution still relies on original IR's def-use relations to decide which vector code to generate, limiting VPlan transformations ability to modify def-use relations and still have ILV generate the vector code. This commit introduces VPValues for VPBlendRecipe to use as the values to blend. The recipe is generated with VPValues wrapping the phi's incoming values of the scalar phi. This reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Differential Revision: https://reviews.llvm.org/D77539	2020-04-09 18:48:33 +03:00
Ayal Zaks	1678489234	[LV] FoldTail w/o Primary Induction Introduce a new VPWidenCanonicalIVRecipe to generate a canonical vector induction for use in fold-tail-with-masking, if a primary induction is absent. The canonical scalar IV having start = 0 and step = VF*UF, created during code-gen to control the vector loop, is widened into a canonical vector IV having start = {<Part*VF, Part*VF+1, ..., Part*VF+VF-1> for 0 <= Part < UF} and step = <VF*UF, VF*UF, ..., VF*UF>. Differential Revision: https://reviews.llvm.org/D77635	2020-04-09 17:45:23 +03:00
Sanjay Patel	812970edda	[InstCombine] replace undef in vector constant for safe shift transform (PR45447) As noted in PR45447, we have a vector-constant-with-undef-element transform bug: https://bugs.llvm.org/show_bug.cgi?id=45447 We replace undefs with a safe constant (0 or -1) based on the (non-)negative predicate constraint. So this is correct: http://volta.cs.utah.edu:8080/z/WZE36H ...but this is not: http://volta.cs.utah.edu:8080/z/boj8gJ Previously, we were relying on getSafeVectorConstantForBinop() in the related fold (D76800). But that's making an assumption about what qualifies as "safe", and that assumption may not always hold. Differential Revision: https://reviews.llvm.org/D77739	2020-04-09 08:00:46 -04:00
Anton Bikineev	9e1ccec8d5	tsan: don't instrument __attribute__((naked)) functions Naked functions are required to not have compiler generated prologues/epilogues, hence no instrumentation is needed for them. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45400 Differential Revision: https://reviews.llvm.org/D77477	2020-04-09 13:47:47 +02:00
Florian Hahn	a7efe06af0	[LV] Assert no DbgInfoIntrinsic calls are passed to widening (NFC). When building a VPlan, BasicBlock::instructionsWithoutDebug() is used to iterate over the instructions in a block. This means that no recipes should be created for debug info intrinsics already and we can turn the early exit into an assertion. Reviewers: Ayal, gilr, rengolin, aprantl Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D77636	2020-04-09 11:37:32 +01:00
Florian Hahn	9997ee23ed	[VPlan] Add & use VPValue operands for VPWidenCallRecipe (NFC). This patch adds VPValue versions for the arguments of the call to VPWidenCallRecipe and uses them during code-generation. Similar to D76373 this reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Reviewers: Ayal, gilr, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77655	2020-04-09 10:23:26 +01:00
Jay Foad	c63aed890e	[KnownBits] Move AND, OR and XOR logic into KnownBits Summary: There are at least three clients for KnownBits calculations: ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the common logic should be moved out of these clients and into KnownBits itself. This patch does this for AND, OR and XOR calculations by implementing and using appropriate operator overloads KnownBits::operator& etc. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74060	2020-04-09 10:10:37 +01:00
Serge Pavlov	c7ff5b38f2	[FPEnv] Use single enum to represent rounding mode Now the compiler defines 5 sets of constants to represent rounding mode. These are: 1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes defined by IEEE-754 and is used in `APFloat` implementation. 2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754 rounding modes and a special value for dynamic rounding mode. It is used in clang frontend. 3. `llvm::fp::RoundingMode`. Defines the same values as `clang::LangOptions::FPRoundingModeKind` but in different order. It is used to specify rounding mode in IR and functions that operate on IR. 4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7). Besides constants for rounding mode it also uses a special value to indicate error. It is convenient to use in intrinsic functions, as it is a platform-independent representation for rounding mode. In this role it is used in some pending patches. 5. Values like `FE_DOWNWARD` and others, which specify rounding mode in library calls `fesetround` and `fegetround`. Often they represent bits of some control register, so they are target-dependent. The same names (not values) and a special name `FE_DYNAMIC` are used in `#pragma STDC FENV_ROUND`. The first 4 sets of constants are target independent and could have the same numerical representation. It would simplify conversion between the representations. Also now `clang::LangOptions::FPRoundingModeKind` and `llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding direction `roundTiesToAway`, although it is supported natively on some targets. This change defines all the rounding mode types via one `llvm::RoundingMode`, which also contains rounding mode for IEEE rounding direction `roundTiesToAway`. Differential Revision: https://reviews.llvm.org/D77379	2020-04-09 13:26:47 +07:00
Pratyai Mazumder	e8d1c6529b	[SanitizerCoverage] sancov/inline-bool-flag instrumentation. Summary: New SanitizerCoverage feature `inline-bool-flag` which inserts an atomic store of `1` to a boolean (which is an 8bit integer in practice) flag on every instrumented edge. Implementation-wise it's very similar to `inline-8bit-counters` features. So, much of wiring and test just follows the same pattern. Reviewers: kcc, vitalybuka Reviewed By: vitalybuka Subscribers: llvm-commits, hiraditya, jfb, cfe-commits, #sanitizers Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D77244	2020-04-08 22:43:52 -07:00
Vitaly Buka	8b1a6c0a57	[NFC][SanitizerCoverage] Simplify alignment calculation This reverts commit e42f2a0cd8b8007c816d0e63f5000c444e29105e.	2020-04-08 22:43:52 -07:00
Johannes Doerfert	cb0ecc5c33	[CallGraphUpdater] Remove dead constants before replacing a function Dead constants might be left when a function is replaced, we can gracefully handle this case and avoid complexity for the users who would see an assertion otherwise.	2020-04-08 22:52:46 -05:00
Craig Topper	f3d3cec648	[InstCombine] Avoid a call to deprecated version of CreateCall. Passing a Value * to CreateCall has to call getPointerElementType to find the type of the pointer. In this case we can rely on the fact that Intrinsic::getDeclaration returns a Function * and use that version of CreateCall.	2020-04-08 17:41:16 -07:00
Johannes Doerfert	0985554b70	[Attributor][NFC] Split AbstractAttributes out of Attributor.cpp Attributor.cpp became quite big and we need to start providing structure. The Attributor code is now in Attributor.cpp and the classes derived from AbstractAttribute are in AttributorAttributes.cpp. Minor changes were required but no intended functional changes. We also minimized includes as part of this. Reviewed By: baziotis Differential Revision: https://reviews.llvm.org/D76873	2020-04-08 19:02:14 -05:00
Christopher Tetreault	155740cc33	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, rriddle, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77263	2020-04-08 15:15:41 -07:00
Florian Hahn	bbbec71609	[DSE.MSSA] Only use callCapturesBefore for calls. callCapturesBefore always returns ModRef if UseInst isn't a call. As we only call it if we already know Mod is set, this only destroys the Must bit for non-calls.	2020-04-08 15:12:33 +01:00
Florian Hahn	a6353fdf3b	[DSE,MSSA] Hoist getMemoryAccess call (NFC).	2020-04-08 15:10:05 +01:00
Sanjay Patel	a1c05fe20f	[InstCombine] exclude bitcast of ppc_fp128 in icmp signbit fold Based on the post-commit comments for rG0f56bbc, there might be a problem with this transform: (bitcast (fpext/fptrunc X)) to iX) < 0 --> (bitcast X to iY) < 0 ...and the ppc_fp128 data type, so conservatively bypass if we are bitcasting a ppc_fp128. We might be able to account for endian or other differences to enable this for PowerPC again if that is useful. Differential Revision: https://reviews.llvm.org/D77642	2020-04-08 08:56:19 -04:00
Max Kazantsev	7adb9e06fd	[LoopLoadElim] Add test showing that LoopLoadElim doesn't work correctly with new PM	2020-04-08 17:32:03 +07:00
Kazu Hirata	91eb442fde	[JumpThreading] NFC: Simplify ComputeValueKnownInPredecessorsImpl Summary: ComputeValueKnownInPredecessorsImpl is the main folding mechanism in JumpThreading.cpp. To avoid potential infinite recursion while chasing use-def chains, it uses: DenseSet<std::pair<Value *, BasicBlock *>> &RecursionSet to keep track of Value-BB pairs that we've processed. Now, when ComputeValueKnownInPredecessorsImpl recursively calls itself, it always passes BB as is, so the second element is always BB. This patch simplifies the function by dropping "BasicBlock *" from RecursionSet. Reviewers: wmi, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77699	2020-04-07 18:37:36 -07:00
Eli Friedman	565b56a72c	[NFC] Clean up uses of LoadInst constructor.	2020-04-07 16:28:53 -07:00
Daniel Sanders	1adeeabb79	Add MIR-level debugify with only locations support for now Summary: Re-used the IR-level debugify for the most part. The MIR-level code then adds locations to the MachineInstrs afterwards based on the LLVM-IR debug info. It's worth mentioning that the resulting locations make little sense as the range of line numbers used in a Function at the MIR level exceeds that of the equivalent IR level function. As such, MachineInstrs can appear to originate from outside the subprogram scope (and from other subprogram scopes). However, it doesn't seem worth worrying about as the source is imaginary anyway. There's a few high level goals this pass works towards: * We should be able to debugify our .ll/.mir in the lit tests without changing the checks and still pass them. I.e. Debug info should not change codegen. Combining this with a strip-debug pass should enable this. The main issue I ran into without the strip-debug pass was instructions with MMO's and checks on both the instruction and the MMO as the debug-location is between them. I currently have a simple hack in the MIRPrinter to resolve that but the more general solution is a proper strip-debug pass. * We should be able to test that GlobalISel does not lose debug info. I recently found that the legalizer can be unexpectedly lossy in seemingly simple cases (e.g. expanding one instr into many). I have a verifier (will be posted separately) that can be integrated with passes that use the observer interface and will catch location loss (it does not verify correctness, just that there's zero lossage). It is a little conservative as the line-0 locations that arise from conflicts do not track the conflicting locations but it can still catch a fair bit. Depends on D77439, D77438 Reviewers: aprantl, bogner, vsk Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77446	2020-04-07 16:25:13 -07:00
Fangrui Song	d2ef8c1f2c	[ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker() dso_local leads to direct access even if the definition is not within this compilation unit (it is still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link. If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no direct access will be generated. The current behavior is benign, because -fpic does not assume dso_local (clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal). If we do that for -fno-semantic-interposition (D73865), there will be an R_X86_64_PC32 linker error without this patch. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D74751	2020-04-07 15:46:01 -07:00
Florian Hahn	6aabb109be	[SCCP] Use ranges for predicate info conditions. This patch updates the code that deals with conditions from predicate info to make use of constant ranges. For ssa_copy instructions inserted by PredicateInfo, we have 2 ranges: 1. The range of the original value. 2. The range imposed by the linked condition. 1. is known, 2. can be determined using makeAllowedICmpRegion. The intersection of those ranges is the range for the copy. With this patch, we get a nice increase in the number of instructions eliminated by both SCCP and IPSCCP for some benchmarks: For MultiSource, SPEC2000 & SPEC2006: Tests: 237 Same hash: 170 (filtered out) Remaining: 67 Metric: sccp.NumInstRemoved Program base patch diff test-suite...Source/Benchmarks/sim/sim.test 10.00 71.00 610.0% test-suite...CFP2000/177.mesa/177.mesa.test 361.00 1626.00 350.4% test-suite...encode/alacconvert-encode.test 141.00 602.00 327.0% test-suite...decode/alacconvert-decode.test 141.00 602.00 327.0% test-suite...CI_Purple/SMG2000/smg2000.test 1639.00 4093.00 149.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 75.00 163.00 117.3% test-suite...T2006/401.bzip2/401.bzip2.test 358.00 513.00 43.3% test-suite...rks/FreeBench/pifft/pifft.test 11.00 15.00 36.4% test-suite...langs-C/unix-tbl/unix-tbl.test 4.00 5.00 25.0% test-suite...lications/sqlite3/sqlite3.test 541.00 667.00 23.3% test-suite.../CINT2000/254.gap/254.gap.test 243.00 299.00 23.0% test-suite...ks/Prolangs-C/agrep/agrep.test 25.00 29.00 16.0% test-suite...marks/7zip/7zip-benchmark.test 1135.00 1304.00 14.9% test-suite...lications/ClamAV/clamscan.test 1105.00 1268.00 14.8% test-suite...urce/Applications/lua/lua.test 398.00 436.00 9.5% Metric: sccp.IPNumInstRemoved Program base patch diff test-suite...C/CFP2000/179.art/179.art.test 1.00 3.00 200.0% test-suite...006/447.dealII/447.dealII.test 429.00 1056.00 146.2% test-suite...nch/fourinarow/fourinarow.test 3.00 7.00 133.3% test-suite...CI_Purple/SMG2000/smg2000.test 818.00 1748.00 113.7% 
test-suite...ks/McCat/04-bisect/bisect.test 3.00 5.00 66.7% test-suite...CFP2000/177.mesa/177.mesa.test 165.00 255.00 54.5% test-suite...ediabench/gsm/toast/toast.test 18.00 27.00 50.0% test-suite...telecomm-gsm/telecomm-gsm.test 18.00 27.00 50.0% test-suite...ks/Prolangs-C/agrep/agrep.test 24.00 35.00 45.8% test-suite...TimberWolfMC/timberwolfmc.test 43.00 62.00 44.2% test-suite...encode/alacconvert-encode.test 46.00 66.00 43.5% test-suite...decode/alacconvert-decode.test 46.00 66.00 43.5% test-suite...langs-C/unix-tbl/unix-tbl.test 12.00 17.00 41.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 31.00 41.00 32.3% test-suite.../CINT2000/254.gap/254.gap.test 117.00 154.00 31.6% Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76611	2020-04-07 11:09:18 +01:00
Jun Ma	46bff786bc	[Coroutines] Remove alignment check in shouldBeMustTail Differential Revision: https://reviews.llvm.org/D77362	2020-04-07 09:07:34 +08:00
Eli Friedman	3f13ee8a00	[NFC] Modernize misc. uses of Align/MaybeAlign APIs. Use the current getAlign() APIs where it makes sense, and use Align instead of MaybeAlign when we know the value is non-zero.	2020-04-06 17:53:04 -07:00
Eli Friedman	68b03aee1a	Remove SequentialType from the type hierarchy. Now that we have scalable vectors, there's a distinction that isn't getting captured in the original SequentialType: some vectors don't have a known element count, so counting the number of elements doesn't make sense. In some cases, there's a better way to express the commonality using other methods. If we're dealing with GEPs, there's GEP methods; if we're dealing with a ConstantDataSequential, we can query its element type directly. In the relatively few remaining cases, I just decided to write out the type checks. We're talking about relatively few places, and I think the abstraction doesn't really carry its weight. (See thread "[RFC] Refactor class hierarchy of VectorType in the IR" on llvmdev.) Differential Revision: https://reviews.llvm.org/D75661	2020-04-06 17:03:49 -07:00
Vedant Kumar	5f185a8999	[AddressSanitizer] Fix for wrong argument values appearing in backtraces Summary: In some cases, ASan may insert instrumentation before function arguments have been stored into their allocas. This causes two issues: 1) The argument value must be spilled until it can be stored into the reserved alloca, wasting a stack slot. 2) Until the store occurs in a later basic block, the debug location will point to the wrong frame offset, and backtraces will show an uninitialized value. The proposed solution is to move instructions which initialize allocas for arguments up into the entry block, before the position where ASan starts inserting its instrumentation. For the motivating test case, before the patch we see: ``` \| 0033: movq %rdi, 0x68(%rbx) \| \| DW_TAG_formal_parameter \| \| ... \| \| DW_AT_name ("a") \| \| 00d1: movq 0x68(%rbx), %rsi \| \| DW_AT_location (RBX+0x90) \| \| 00d5: movq %rsi, 0x90(%rbx) \| \| ^ not correct ... \| ``` and after the patch we see: ``` \| 002f: movq %rdi, 0x70(%rbx) \| \| DW_TAG_formal_parameter \| \| \| \| DW_AT_name ("a") \| \| \| \| DW_AT_location (RBX+0x70) \| ``` rdar://61122691 Reviewers: aprantl, eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77182	2020-04-06 15:59:25 -07:00
Daniel Sanders	15f7bc7857	Add option to limit Debugify to locations (omitting variables) Summary: It can be helpful to test behaviour w.r.t locations without having DEBUG_VALUE around. In particular, because DEBUG_VALUE has the potential to change CodeGen behaviour (e.g. hasOneUse() vs hasOneNonDbgUse()) while locations generally don't. Reviewers: aprantl, bogner Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77438	2020-04-06 15:04:55 -07:00
Kirill Naumov	3f995ce8b5	[CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo The patch introduces the system to distinctively store the information needed for the Control Flow Graph as well as the instrumentary needed for the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo. The patch is a part of sequence of three patches, related to graphs Heat Coloring. Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu Differential Revision: https://reviews.llvm.org/D76820	2020-04-06 17:42:54 +00:00
Florian Hahn	7aba6a0333	[LV] Fix value that could be read uninitialized. This should fix http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/18569	2020-04-06 17:54:50 +01:00
Florian Hahn	90be3c24a7	[VPlan] Introduce new VPWidenCallRecipe (NFC). This patch moves calls to their own recipe, to simplify the transition to VPUser for operands of VPWidenRecipe, as discussed in D76992. Subsequently additional information can be added to the recipe rather than computing it during the execute step. Reviewers: rengolin, Ayal, gilr, hsaito Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D77467	2020-04-06 16:07:37 +01:00
Guillaume Chatelet	808286342a	[Alignment][NFC] Assume AlignmentFromAssumptions::getNewAlignment is always set. Summary: In D77454 we explain that `LoadInst` and `StoreInst` always have their alignment defined. This allows us to work backward here and to infer that `getNewAlignment` does not need to return `0` in case of failure. Returning `1` also works since it needs to be greater than the Load/Store alignment, which is at least `1`. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77538	2020-04-06 14:54:57 +00:00
Florian Hahn	6babae74c7	[Matrix] Update load/storeMatrix to take indices as Value* (NFC). This allows the functions to be used with loop-dependent indices.	2020-04-06 14:48:48 +01:00
Guillaume Chatelet	ff858d7781	[Alignment][NFC] Add DebugStr and operator* Summary: This is a roll forward of D77394 minus AlignmentFromAssumptions (which needs to be addressed separately) Differences from D77394: - DebugStr() now prints the alignment value or `None` and no more `Align(x)` or `MaybeAlign(x)` - This is to keep Warning message consistent (CodeGen/SystemZ/alloca-04.ll) - Removed a few unneeded headers from Alignment (since it's included everywhere it's better to keep the dependencies to a minimum) Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77537	2020-04-06 12:09:45 +00:00
Florian Hahn	39f2d9aa81	[Matrix] Add option to use row-major matrix layout as default. This patch adds a -matrix-default-layout option which can be used to set the default matrix layout to row-major or column-major (default). The initial patch updates codegen for loads, stores, binary operators and matrix multiply. Reviewers: anemet, Gerolf, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D76325	2020-04-06 10:00:56 +01:00
Florian Hahn	d1fed7081d	[Matrix] Add initial tiling for load/multiply/store chains. This patch adds initial fusion for load/multiply/store chains of matrix operations. The patch contains roughly two parts: 1. Code generation for a fused load/multiply/store chain (LowerMatrixMultiplyFused). First, we ensure that both loads of the multiply operands do not alias the store. If they do, we create new non-aliasing copies of the operands. Note that this may introduce new basic blocks. Finally we process TileSize x TileSize blocks. That is: load tiles from the input operands, multiply and store them. 2. Identify fusion candidates & matrix instructions. As a first step, collect all instructions with shape info and fusion candidates (currently @llvm.matrix.multiply calls). Next, try to fuse candidates and collect instructions eliminated by fusion. Finally iterate over all matrix instructions, skip the ones eliminated by fusion and lower the rest as usual. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75566	2020-04-06 09:28:15 +01:00
Guillaume Chatelet	6000478f39	Revert "[Alignment][NFC] Add DebugStr and operator*" This reverts commit `1e34ab98fc`.	2020-04-06 07:55:25 +00:00
Guillaume Chatelet	1e34ab98fc	[Alignment][NFC] Add DebugStr and operator* Summary: Also updates files to use them. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77394	2020-04-06 07:12:46 +00:00
Tarindu Jayatilaka	b43b59fcc0	Expose `attributor-disable` to the new and old pass managers The new and old pass managers (PassManagerBuilder.cpp and PassBuilder.cpp) are exposed to an `extern` declaration of `attributor-disable` option which will guard the addition of the attributor passes to the pass pipelines. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76871	2020-04-05 22:29:34 -05:00
Anna Thomas	1d0f757904	[InlineFunction] Update metadata on loads that are return values This patch builds upon D76140 by updating metadata on pointer typed loads in inlined functions, when the load is the return value, and the callsite contains return attributes which can be updated as metadata on the load. Added test cases show this for nonnull, dereferenceable, dereferenceable_or_null Reviewed-By: jdoerfert Differential Revision: https://reviews.llvm.org/D76792	2020-04-05 14:50:10 -04:00
Sanjay Patel	538a8f0227	[InstCombine] convert bitcast-shuffle to vector trunc As discussed in D76983, that patch can turn a chain of insert/extract with scalar trunc ops into bitcast+extract and existing instcombine vector transforms end up creating a shuffle out of that (see the PhaseOrdering test for an example). Currently, that process requires at least this sequence: -instcombine -early-cse -instcombine. Before D76983, the sequence of insert/extract would reach the SLP vectorizer and become a vector trunc there. Based on a small sampling of public targets/types, converting the shuffle to a trunc is better for codegen in most cases (and a regression of that form is the reason this was noticed). The trunc is clearly better for IR-level analysis as well. This means that we can induce "spontaneous vectorization" without invoking any explicit vectorizer passes (at least a vector cast op may be created out of scalar casts), but that seems to be the right choice given that we started with a chain of insert/extract, and the backend would expand back to that chain if a target does not support the op. Differential Revision: https://reviews.llvm.org/D77299	2020-04-05 09:48:02 -04:00
Sanjay Patel	4036a0af24	[InstCombine] enhance freelyNegateValue() by handling 'not' This patch extends D77230. If we have a 'not' instruction inside a negated expression, we can ignore extra uses of that op because the negation has a one-to-one replacement: negate becomes increment. Alive2 examples of the test cases: http://volta.cs.utah.edu:8080/z/T5-u9P http://volta.cs.utah.edu:8080/z/eT89L6 Differential Revision: https://reviews.llvm.org/D77459	2020-04-05 09:16:19 -04:00
Stefanos Baziotis	f3dd3a66d3	[Attributor] AAUndefinedBehavior: Use AAValueSimplify in memory accessing instructions. Query AAValueSimplify on pointers in memory accessing instructions to take advantage of the constant propagation (or any other value simplification) of such values.	2020-04-05 02:46:26 +03:00
Florian Hahn	a2b18c5a08	[LV] Simplify tryToWiden as recipes are not re-used (NFC). After `49d00824bb`, VPWidenRecipe only stores a single instruction. tryToWiden can simply return the widen recipe, like other helpers in VPRecipeBuilder.	2020-04-04 18:30:50 +01:00
Nikita Popov	4ede730096	[InstCombine] Don't limit uses in eraseInstFromFunction() eraseInstFromFunction() adds the operands of the erased instructions, as those might now be dead as well. However, this is limited to instructions with less than 8 operands. This check doesn't make a lot of sense to me. As the instruction gets removed afterwards, I don't see a potential for anything overly pathological happening here (as we can only add those operands to the worklist once). The impact on CTMark is in the noise. We also have the same code in instruction sinking and don't limit the operand count there. Differential Revision: https://reviews.llvm.org/D77325	2020-04-04 18:37:30 +02:00
Luofan Chen	eec6d87626	[Attributor] Deduce attributes for non-exact functions This patch is based on D63312 and D63319. For now we create shallow wrappers for all functions that are IPO amendable. See also [this github issue](https://github.com/llvm/llvm-project/issues/172). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76404	2020-04-04 11:34:58 -05:00
Nikita Popov	6896d559f3	[VNCoercion] Use IRBuilderBase; NFC And remove include from header.	2020-04-04 12:44:50 +02:00
Nikita Popov	ebd5a1b049	[Reassociate] Use IRBuilderBase; NFC And remove now unnecessary IRBuilder.h include in header.	2020-04-04 12:34:16 +02:00
Nikita Popov	1055e9e3c8	[IVDescriptors] Remove IRBuilder.h include; NFC IVDescriptors.h itself does not reference IRBuilder at all. Move the include into transformation passes that do.	2020-04-04 12:07:57 +02:00
Sanjay Patel	ce97ce3a5d	[VectorCombine] try to form a better extractelement Extracting to the same index that we are going to insert back into allows forming select ("blend") shuffles and enables further transforms. Admittedly, this is a quick-fix for a more general problem that I'm hoping to solve by adding transforms for patterns that start with an insertelement. But this might resolve some regressions known to be caused by the extract-extract transform (although I have not gotten more details on those yet). In the motivating case from PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 The combination of subsequent instcombine and codegen transforms gets us this improvement: vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3] vhaddps %xmm1, %xmm1, %xmm4 vmovshdup %xmm1, %xmm3 ## xmm3 = xmm1[1,1,3,3] vaddps %xmm0, %xmm2, %xmm0 vaddps %xmm1, %xmm3, %xmm1 vshufps $200, %xmm4, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm4[0,3] vinsertps $177, %xmm1, %xmm0, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm1[2] --> vmovshdup %xmm0, %xmm2 ## xmm2 = xmm0[1,1,3,3] vhaddps %xmm1, %xmm1, %xmm1 vaddps %xmm0, %xmm2, %xmm0 vshufps $200, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm1[0,3] Differential Revision: https://reviews.llvm.org/D76623	2020-04-03 13:55:13 -04:00
Roman Lebedev	7d572ef2dd	Revert "[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668)" As discussed in post-commit review in https://reviews.llvm.org/D73501 if the goal of this is to help vectorizer, then we should actually be teaching vectorizer to do this, because right now this rewrite is still budget-limited, which isn't what we'd want. Additionally, while the rest of the patch series was universally profitable, this particular patch is reportedly (https://reviews.llvm.org/D73501#1905171) exposing cost-modeling issues on ARM. So let's just back this particular patch out. Once there's an undo transform, this could be considered for reintegration. This reverts commit `44edc6fd2c`.	2020-04-03 20:15:04 +03:00
Matt Arsenault	57a55313c3	InstCombine: Reduce minnum/maxnum if inputs are casted	2020-04-03 11:57:25 -04:00
Guillaume Chatelet	1a584a8d50	[Alignment][NFC] Remove unused private functions Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77297	2020-04-03 09:16:20 +00:00
OCHyams	9b56cc9361	[DebugInfo] Salvage debug info when sinking loop invariant instructions Reviewed By: vsk, aprantl, djtodoro Differential Revision: https://reviews.llvm.org/D77318	2020-04-03 09:19:26 +01:00
Hongtao Yu	88da019977	Fix a bug in the inliner that causes subsequent double inlining Summary: A recent change in the instruction simplifier enables a call to a function that just returns one of its parameter to be simplified as simply loading the parameter. This exposes a bug in the inliner where double inlining may be involved which in turn may cause compiler ICE when an already-inlined callsite is reused for further inlining. To put it simply, in the following-like C program, when the function call second(t) is inlined, its code t = third(t) will be reduced to just loading the return value of the callsite first(). This causes the inliner internal data structure to register the first() callsite for the call edge representing the third() call, therefore incurs a double inlining when both call edges are considered an inline candidate. I'm making a fix to break the inliner from reusing a callsite for new call edges. ``` void top() { int t = first(); second(t); } void second(int t) { t = third(t); fourth(t); } void third(int t) { return t; } ``` The actual failing case is much trickier than the example here and is only reproducible with the legacy inliner. The way the legacy inliner works is to process each SCC in a bottom-up order. That means in reality function first may be already inlined into top, or function third is either inlined to second or is folded into nothing. To repro the failure seen from building a large application, we need to figure out a way to confuse the inliner so that the bottom-up inlining is not fulfilled. I'm doing this by making the second call indirect so that the alias analyzer fails to figure out the right call graph edge from top to second and top can be processed before second during the bottom-up. 
We also need to tweak the test code so that when the inlining of top happens, the function body of second is not that optimized, by delaying the pass of function attribute deducer (i.e., which tells function third has no side effect and just returns its parameter). Since the CGSCC pass is iterative, additional calls are added to top to postpone the inlining of second to the second round right after the first function attribute deducing pass is done. I haven't been able to repro the failure with the new pass manager since the processing order of inlined callsites is a bit different, but in theory the issue could happen there too. Note that this fix could introduce a side effect that blocks the simplification of inlined code, specifically for a call site that can be folded to another call site. I hope this can probably be complemented by subsequent inlining or folding, as shown in the attached unit test. The ideal fix should be to separate the use of VMap. However, in reality this failing pattern shouldn't happen often. And even if it happens, there should be a good chance that the non-folded call site will be refolded by iterative inlining or subsequent simplification. Reviewers: wenlei, davidxl, tejohnson Reviewed By: wenlei, davidxl Subscribers: eraman, nikic, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76248	2020-04-02 21:08:05 -07:00
Jun Ma	9c6f32a0ff	[Coroutines] Simplify implementation using removePredecessor Differential Revision: https://reviews.llvm.org/D77035	2020-04-03 09:20:07 +08:00
Anna Thomas	bf7a16a768	[InlineFunction] Update valid return attributes at callsite within callee body Consider a callee function that has a call (C) within it which feeds into the return. When we inline that callee into a callsite that has return attributes, we can backward propagate valid attributes to the call (C) within that inlined callee body. This is safe to do so only if we can guarantee transfer of execution to successor in the window of instructions between return value (i.e. the call C) and the return instruction. Also, this is valid only for attributes which are a property of a callsite and not those that are not dependent on the ABI, or a property of the call itself. Reviewed-By: reames, jdoerfert Differential Revision: https://reviews.llvm.org/D76140	2020-04-02 14:13:12 -04:00
Sanjay Patel	f4448063cc	[InstCombine] try to reduce shuffle with bitcasted operand shuf (bitcast X), undef, Mask --> bitcast X' The 'inverse shuffles' test (shuf_bitcast_operand) is a pattern in the motivating examples from PR35454: https://bugs.llvm.org/show_bug.cgi?id=35454 (see also D76727) We can deal with this class of patterns in generic instcombine because we are not creating any new shuffles, just a bitcast. Alive2 proof: http://volta.cs.utah.edu:8080/z/mwDUZf Differential Revision: https://reviews.llvm.org/D76844	2020-04-02 13:44:50 -04:00
Sanjay Patel	b6050ca181	[VectorCombine] transform bitcasted shuffle to narrower elements bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC' We do not attempt this in InstCombine because we do not want to change types and create new shuffle ops that are potentially not lowered as well as the original code. Here, we can check the cost model to see if it is worthwhile. I've aggressively enabled this transform even if the types are the same size and/or equal cost because moving the bitcast allows InstCombine to make further simplifications. In the motivating cases from PR35454: https://bugs.llvm.org/show_bug.cgi?id=35454 ...this is enough to let instcombine and the backend eliminate the redundant shuffles, but we probably want to extend VectorCombine to handle the inverse pattern (shuffle-of-bitcast) to get that simplification directly in IR. Differential Revision: https://reviews.llvm.org/D76727	2020-04-02 13:30:22 -04:00
Sanjay Patel	12fcbcecff	[InstCombine] add tests for cmyk benchmark; NFC These are versions of a function that regressed with: rGf2fbdf76d8d0 That particular problem occurs with an instcombine-simplifycfg-instcombine sequence, but we can show that it exists within instcombine only with other variations of the pattern.	2020-04-02 13:00:46 -04:00
Benjamin Kramer	de8831934a	[LoopDataPrefetch] Remove unused include that's a layering violation	2020-04-02 17:46:10 +02:00
Benjamin Kramer	dffc503187	Revert "[SimplifyLibCalls] Erase replaced instructions" This reverts commit `2a77544ad5`. This introduces a use-after-free in Transforms/InstCombine/sincospi.ll. Found by asan.	2020-04-02 17:30:47 +02:00
Sanjay Patel	1008435f3d	Revert "[InstCombine] do not exclude min/max from icmp with casted operand fold" This reverts commit `f2fbdf76d8`. As noted in the post-commit thread: https://reviews.llvm.org/rGf2fbdf76d8d0 ...this can obscure a min/max pattern where the components have extra uses. We can show that the problem is independent of this change with a slightly modified source example, so this revert just delays/reduces the need to fix the real problem. We need to improve our analysis of negation or -- more generally -- subtraction using patches like D77230 or D68408.	2020-04-02 09:15:23 -04:00
Tyker	c00cb76274	[NFC] Split Knowledge retention and place it more appropriately Summary: Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils allows Queries and Transform/Utils to use Analysis. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77171	2020-04-02 15:01:41 +02:00
Jonas Paulsson	36d4421f50	[LoopDataPrefetch + SystemZ] Let target decide on prefetching for each loop. This patch adds - New arguments to getMinPrefetchStride() to let the target decide on a per-loop basis if software prefetching should be done even with a stride within the limit of the hw prefetcher. - New TTI hook enableWritePrefetching() to let a target do write prefetching by default (defaults to false). - In LoopDataPrefetch: - A search through the whole loop to gather information before emitting any prefetches. This way the target can get information via new arguments to getMinPrefetchStride() and emit prefetches more selectively. Collected information includes: Does the loop have a call, how many memory accesses, how many of them are strided, how many prefetches will cover them. This is NFC to before as long as the target does not change its definition of getMinPrefetchStride(). - If a previous access to the same exact address was 'read', and the current one is 'write', make it a 'write' prefetch. - If two accesses that are covered by the same prefetch do not dominate each other, put the prefetch in a block that dominates both of them. - If a ConstantMaxTripCount is less than ItersAhead, then skip the loop. - A SystemZ implementation of getMinPrefetchStride(). Review: Ulrich Weigand, Michael Kruse Differential Revision: https://reviews.llvm.org/D70228	2020-04-02 14:57:46 +02:00
Florian Hahn	a63b5c9e53	[CallSiteSplitting] Simplify isPredicateOnPHI & continue checking PHIs. As pointed out by @thakis, currently CallSiteSplitting bails out after checking the first PHI node. We should check all PHI nodes, until we find one where call site splitting is beneficial. This patch also slightly simplifies the code using BasicBlock::phis(). Reviewers: davidxl, junbuml, thakis Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D77089	2020-04-02 10:11:27 +01:00
Johannes Doerfert	bcd8009369	[Attributor] Use the proper context instruction in genericValueTraversal There was a TODO in genericValueTraversal to provide the context instruction and due to the lack of it users that wanted one just used something available. Unfortunately, using a fixed instruction is wrong in the presence of PHIs so we need to update the context instruction properly. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D76870	2020-04-01 22:20:47 -05:00
Johannes Doerfert	ac96c8fd85	[Attributor][FIX] Do not compute ranges for arguments of declarations This cannot be triggered right now, as far as I know, but it doesn't make sense to deduce a constant range on arguments of declarations. Exposed during testing of AAValueSimplify extensions.	2020-04-01 22:05:30 -05:00
Johannes Doerfert	54d6a608bf	[Attributor][NFC] Predetermine the module It could happen that we delete the first function in the SCC in the future so we should be careful accessing `Functions` after the manifest stage.	2020-04-01 21:56:17 -05:00
Johannes Doerfert	9e19693994	[Attributor] Derive better alignment for accessed pointers Use DL & ABI information for better alignment deduction, e.g., if a type is accessed and the ABI specifies an alignment requirement for such an access we can use it. This is based on a patch by @lebedev.ri and inspired by getBaseAlign in Loads.cpp. Depends on D76673. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D76674	2020-04-01 21:49:57 -05:00
Johannes Doerfert	b1c788d051	[Attributor][FIX] Prevent alignment breakage wrt. must-tail calls If we have a must-tail call the callee and caller need to have matching ABIs. Part of that is alignment which we might modify when we deduce alignment of arguments of either. Since we would need to keep them in sync, which is not as simple, we simply avoid deducing alignment for arguments of the must-tail caller or callee. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D76673	2020-04-01 21:40:07 -05:00
Johannes Doerfert	41f2a57d0b	[Attributor][NFC] Use a BumpPtrAllocator to allocate `AbstractAttribute`s We create a lot of AbstractAttributes and they live as long as the Attributor does. It seems reasonable to allocate them via a BumpPtrAllocator owned by the Attributor. Reviewed By: baziotis Differential Revision: https://reviews.llvm.org/D76589	2020-04-01 20:53:28 -05:00
Sanjay Patel	3d90048791	[InstCombine] enhance freelyNegateValue() by handling xor Negation is equivalent to bitwise-not + 1, so try to convert more subtracts into adds using this relationship: 0 - (A ^ C) => ((A ^ C) ^ -1) + 1 => A ^ ~C + 1 I doubt this will recover the regression noted in rGf2fbdf76d8d0, but seems like we're going to need to improve here and/or revive D68408? Alive2 proofs: http://volta.cs.utah.edu:8080/z/Re5tMU http://volta.cs.utah.edu:8080/z/An-uns Differential Revision: https://reviews.llvm.org/D77230	2020-04-01 15:05:13 -04:00
Jonathan Roelofs	1148f004fa	Fix PR45371: SeparateConstOffsetFromGEP clean up bookkeeping find() was altering the UserChain, even in cases where it subsequently discovered that the resulting constant was a 0. This confuses rebuildWithoutConstOffset() when it attempts to walk the chain later, since it is expected that the chain itself be a path down the use-def edges of an expression.	2020-04-01 12:38:15 -06:00
Nikita Popov	50a3e8738a	Revert "[InstCombine] Erase old instruction when replacing extractelements" This reverts commit `d40368fdb5`. llvm-clang-x86_64-expensive-checks-debian failure looks related.	2020-04-01 20:10:11 +02:00
Nikita Popov	2a77544ad5	[SimplifyLibCalls] Erase replaced instructions After RAUWing an instruction, also erase it. This makes sure we don't perform extra InstCombine iterations to clean up the garbage.	2020-04-01 20:00:10 +02:00
Uday Bondhugula	6ee11c3b0f	[NewGVN] Make NewGVN aware of aligned_alloc Make the New GVN pass aware of aligned_alloc. Depends on D76975. Differential Revision: https://reviews.llvm.org/D76976	2020-04-01 23:26:51 +05:30
Uday Bondhugula	4cf70af94f	[GVN] Make GVN aware of aligned_alloc Make the GVN pass aware of aligned_alloc. Depends on D76974. Differential Revision: https://reviews.llvm.org/D76975	2020-04-01 23:26:50 +05:30
Uday Bondhugula	c4499e3333	[Attributor] Make attributor aware of aligned_alloc for heap to stack conversion Make the attributor pass aware of aligned_alloc for converting heap allocations to stack ones. Depends on D76971. Differential Revision: https://reviews.llvm.org/D76974	2020-04-01 23:26:50 +05:30
Nikita Popov	d40368fdb5	[InstCombine] Erase old instruction when replacing extractelements As we are not returning the result of replaceInstUsesWith(), so we need to clean up ourselves. NFC apart from worklist order.	2020-04-01 19:55:28 +02:00
Nikita Popov	4b35c816ef	[InstCombine] Use replaceOperand() in div transforms To make sure the old operand is DCEd. NFC apart from worklist order.	2020-04-01 19:55:00 +02:00
Benjamin Kramer	66b9f5f7f0	[GVNSink] Simplify code. NFC.	2020-04-01 13:13:00 +02:00
Cullen Rhodes	84aa6cf1a9	[Transforms][SROA] Promote allocas with mem2reg for scalable types Summary: Aggregate types containing scalable vectors aren't supported and as far as I can tell this pass is mostly concerned with optimisations on aggregate types, so the majority of this pass isn't very useful for scalable vectors. This patch modifies SROA such that mem2reg is run on allocas with scalable types that are promotable, but nothing else such as slicing is done. The use of TypeSize in this pass has also been updated to be explicitly fixed size. When invoking the following methods in DataLayout: * getTypeSizeInBits * getTypeStoreSize * getTypeStoreSizeInBits * getTypeAllocSize we now call getFixedSize on the resultant TypeSize. This is quite an extensive change with around 50 calls to these functions, and also the first change of this kind (being explicit about fixed vs scalable size) as far as I'm aware, so feedback welcome. A test is included containing IR with scalable vectors that this pass is able to optimise. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76720	2020-04-01 10:34:11 +00:00
Eli Friedman	ba4764c2cc	Fix leak in GVNSink introduced in D72467.	2020-03-31 16:21:27 -07:00
Evgenii Stepanov	f9471b0010	Fix MSan false positive due to select folding. Summary: Select folding in JumpThreading can create a conditional branch on a code path that did not have one in the original program. This is not a valid transformation in sanitize_memory functions. Note that JumpThreading does select folding in 3 different places. Two of them seem safe - they apply to a select instruction in a BB that ends with an unconditional branch to another BB, which (in turn) ends with a conditional branch or a switch with the same condition. Fixes PR45220. Reviewers: glider, dvyukov, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76332	2020-03-31 15:25:42 -07:00
Anna Thomas	58a05675da	Revert "[InlineFunction] Handle return attributes on call within inlined body" This reverts commit `28518d9ae3`. There is a failure in MsgPackReader.cpp when built with clang. It complains that "signext and zeroext" are incompatible. Investigating offline whether it is in fact UB in the MsgPackReader code.	2020-03-31 16:16:34 -04:00
Nikita Popov	b7fe795e5b	[InstCombine] Use replaceOperand() in some select transforms To make sure the old operand is DCEd. NFC apart from worklist order.	2020-03-31 22:10:55 +02:00
Eli Friedman	1ee6ec2bf3	Remove "mask" operand from shufflevector. Instead, represent the mask as out-of-line data in the instruction. This should be more efficient in the places that currently use getShuffleVector(), and paves the way for further changes to add new shuffles for scalable vectors. This doesn't change the syntax in textual IR. And I don't currently plan to change the bitcode encoding in this patch, although we'll probably need to do something once we extend shufflevector for scalable types. I expect that once this is finished, we can then replace the raw "mask" with something more appropriate for scalable vectors. Not sure exactly what this looks like at the moment, but there are a few different ways we could handle it. Maybe we could try to describe specific shuffles. Or maybe we could define it in terms of a function to convert a fixed-length array into an appropriate scalable vector, using a "step", or something like that. Differential Revision: https://reviews.llvm.org/D72467	2020-03-31 13:08:59 -07:00
Nikita Popov	c538c57d6d	[InstCombine] Use replaceOperand() in descaling To make sure the old operand gets DCEd. NFC apart from worklist order.	2020-03-31 22:05:53 +02:00
Nikita Popov	19df7fa892	[InstCombine] Erase old alloca in cast of alloca transform As we don't return the replaceInstUsesWith() result, we are responsible for erasing the instruction. NFC apart from worklist order.	2020-03-31 21:57:39 +02:00
Nikita Popov	87357808b8	[InstCombine] Use replaceOperand() in non zero phi transform To make sure the old operand gets DCEd. NFC apart from worklist order changes.	2020-03-31 21:54:21 +02:00
Nikita Popov	f3d4166368	[InstCombine] Report change in non zero phi transform We need to inform InstCombine (and transitively the pass manager) that we changed an instruction.	2020-03-31 21:52:40 +02:00
Anna Thomas	28518d9ae3	[InlineFunction] Handle return attributes on call within inlined body Consider a callee function that has a call (C) within it which feeds into the return. When we inline that callee into a callsite that has return attributes, we can backward propagate those attributes to the call (C) within that inlined callee body. This is safe to do so only if we can guarantee transfer of execution to successor in the window of instructions between return value (i.e. the call C) and the return instruction. See added test cases. Reviewed-By: reames, jdoerfert Differential Revision: https://reviews.llvm.org/D76140	2020-03-31 14:35:40 -04:00
Uday Bondhugula	dc817b2dea	[InstCombine] Deduce attributes for aligned_alloc in InstCombine Make InstCombine aware of the aligned_alloc library function. Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Depends on D76970. Differential Revision: https://reviews.llvm.org/D76971	2020-03-31 23:17:28 +05:30
Florian Hahn	b0cd7b2799	[SCCP] Limit use of range info for binops to integers for now. This fixes a crash when building the test suite.	2020-03-31 17:08:09 +01:00
Tyker	4aeb7e1ef4	[AssumeBundles] Preserve information in EarlyCSE Summary: this patch preserve information from various places in EarlyCSE into assume bundles. Reviewers: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76769	2020-03-31 17:47:04 +02:00
Florian Hahn	b37543750c	[ValueLattice] Distinguish between constant ranges with/without undef. This patch updates ValueLattice to distinguish between ranges that are guaranteed to not include undef and ranges that may include undef. A constant range guaranteed to not contain undef can be used to simplify instructions to arbitrary values. A constant range that may contain undef can only be used to simplify to a constant. If the value can be undef, it might take a value outside the range. For example, consider the snippet below: define i32 @f(i32 %a, i1 %c) { br i1 %c, label %true, label %false true: %a.255 = and i32 %a, 255 br label %exit false: br label %exit exit: %p = phi i32 [ %a.255, %true ], [ undef, %false ] %f.1 = icmp eq i32 %p, 300 call void @use(i1 %f.1) %res = and i32 %p, 255 ret i32 %res } In the exit block, %p would be a constant range [0, 256) including undef as %p could be undef. We can use the range information to replace %f.1 with false because we remove the compare, effectively forcing the use of the constant to be != 300. We cannot replace %res with %p however, because if %a would be undef %cond may be true but the second use might not be < 256. Currently LazyValueInfo uses the new behavior just when simplifying AND instructions and does not distinguish between constant ranges with and without undef otherwise. I think we should address the remaining issues in LVI incrementally. Reviewers: efriedma, reames, aqjune, jdoerfert, sstefan1 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76931	2020-03-31 12:50:20 +01:00
Daan Sprenkels	464b9aeafe	[InstCombine] Transform extelt-trunc -> bitcast-extelt Canonicalize the case when a scalar extracted from a vector is truncated. Transform such cases to bitcast-then-extractelement. This will enable erasing the truncate operation. This commit fixes PR45314. reviewers: spatel Differential revision: https://reviews.llvm.org/D76983	2020-03-31 11:53:41 +02:00
Sebastian Neubauer	5d3a69feca	[AMDGPU] New llvm.amdgcn.ballot intrinsic Add a new llvm.amdgcn.ballot intrinsic modeled on the ballot function in GLSL and other shader languages. It returns a bitfield containing the result of its boolean argument in all active lanes, and zero in all inactive lanes. This is intended to replace the existing llvm.amdgcn.icmp and llvm.amdgcn.fcmp intrinsics after a suitable transition period. Use the new intrinsic in the atomic optimizer pass. Differential Revision: https://reviews.llvm.org/D65088	2020-03-31 10:35:39 +02:00
Florian Hahn	0c9c58ada0	[SCCP] Use constant ranges for casts. For casts with constant range operands, we can use ConstantRange::castOp. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71938	2020-03-31 09:22:04 +01:00
Wei Mi	ebad678857	[SampleFDO] Port MD5 name table support to extbinary format. Compbinary format uses MD5 to represent strings in name table. That gives smaller profile without the need of compression/decompression when writing/reading the profile. The patch adds the support in extbinary format. It is off by default but user can choose to enable it. Note the feature of using MD5 in name table can bring very small chance of name conflict leading to profile mismatch. Besides, profile using the feature won't have the profile remapping support. Differential Revision: https://reviews.llvm.org/D76255	2020-03-30 22:07:08 -07:00
Sanjay Patel	f2fbdf76d8	[InstCombine] do not exclude min/max from icmp with casted operand fold InstCombine has a mess of logic that tries to preserve min/max patterns, but AFAICT, this one is not necessary because we can always narrow the corresponding select in this sequence to match the narrow compare. The biggest danger for this patch is inducing infinite looping or assert from exceeding max iterations. If any bots hit that in the vicinity of this commit, this is the likely patch to blame.	2020-03-30 16:10:51 -04:00
Thomas Raoux	3ea0774b13	[ConstantFold][NFC] Compile time optimization for large vectors Optimize the common case of a splat vector constant. For large vectors, going through all elements is expensive. For splat/broadcast cases we can skip going through all elements. Differential Revision: https://reviews.llvm.org/D76664	2020-03-30 11:27:09 -07:00
Sameer Sahasrabuddhe	3cbbded68c	Introduce unify-loop-exits pass. For each natural loop with multiple exit blocks, this pass creates a new block N such that all exiting blocks now branch to N, and then control flow is redistributed to all the original exit blocks. The bulk of the transformation is a new function introduced in BasicBlockUtils that can redirect control flow from a set of incoming blocks to a set of outgoing blocks via a common "hub". This is a useful workaround for a limitation in the structurizer which incorrectly orders blocks when processing a nest of loops. This pass bypasses that issue by ensuring that each natural loop is recognized as a separate region. Since the structurizer is a region pass, it no longer sees a nest of loops in a single region, and instead processes each "level" in the nesting as a separate region. The AMDGPU backend provides a new option to enable this pass before the structurizer, which may eventually be enabled by default. Reviewers: madhur13490, arsenm, nhaehnle Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D75865	2020-03-30 13:23:56 -04:00
Vedant Kumar	dcc410b5cf	[LoopVectorize] Fix crash on "getNoopOrZeroExtend cannot truncate!" (PR45259) In InnerLoopVectorizer::getOrCreateTripCount, when the backedge taken count is a SCEV add expression, its type is defined by the type of the last operand of the add expression. In the test case from PR45259, this last operand happens to be a pointer, which (according to llvm::Type) does not have a primitive size in bits. In this case, LoopVectorize fails to truncate the SCEV and crashes as a result. Using ScalarEvolution::getTypeSizeInBits makes the truncation work as expected. https://bugs.llvm.org/show_bug.cgi?id=45259 Differential Revision: https://reviews.llvm.org/D76669	2020-03-30 10:14:14 -07:00
Chris Jackson	f6b2c003f3	[DebugInfo] Ensure that a demanded bits optimisation in InstCombine does not result in an incorrect debuginfo variable value - Add an additional salvage and a test. Reviewers: aprantl, djtodoro Differential Revision: https://reviews.llvm.org/D76854 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=44371	2020-03-30 15:39:22 +01:00
Chris Jackson	135709aa90	[DebugInfo] Ensure dead store elimination can mark an operand value as undefined - Correct a debug info salvage and add a test Reviewers: aprantl, vsk Differential Revision: https://reviews.llvm.org/D76930 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45080	2020-03-30 14:58:14 +01:00
Florian Hahn	9e81249d76	[Matrix] Rename emitChainedMatrixMultiply to emitMatrixMultiply (NFC). The Chained in the name potentially leads to confusion. Also updated the comment to drop the unnecessary mention of tile-sized.	2020-03-30 11:17:25 +01:00
Jun Ma	31a1d85c53	[Coroutines 2/2] Improve symmetric control transfer feature Differential Revision: https://reviews.llvm.org/D76913	2020-03-30 09:53:09 +08:00
Jun Ma	a94fa2c049	[Coroutines 1/2] Improve symmetric control transfer feature Differential Revision: https://reviews.llvm.org/D76911	2020-03-30 09:53:09 +08:00
Nikita Popov	8253a86b65	[InstCombine] Erase old mul when creating umulo As we don't return the result of replaceInstUsesWith(), we are responsible for erasing the instruction. There is a small subtlety here in that we need to do this after the other uses of Builder, which uses the original multiply as the insertion point. NFC apart from worklist order changes.	2020-03-29 20:46:08 +02:00
Nikita Popov	53d209076a	[InstCombine] Use replaceOperand() in demanded elements simplification To make sure that dead operands get DCEd. This fixes the largest source of leftover dead operands we see in tests. NFC apart from worklist changes.	2020-03-29 20:43:19 +02:00
Nikita Popov	0c87140065	[InstCombine] Use replaceOperand() in assoc cast simplification To make sure the old operands are DCEd. NFC apart from worklist order.	2020-03-29 20:28:37 +02:00
Nikita Popov	a9ddcd6411	[InstCombine] Erase old add when optimizing add overflow We don't return the replaceInstUsesWith() result, so we're responsible for cleaning up. NFC apart from worklist order changes.	2020-03-29 20:20:14 +02:00
Uday Bondhugula	c0955edfd6	Introduce support for lib function aligned_alloc in TLI / memory builtins Aligned_alloc is a standard lib function and has been in glibc since 2.16 and in the C11 standard. It has semantics similar to malloc/calloc for several analyses/transforms. This patch introduces aligned_alloc in target library info and memory builtins. Subsequent ones will make other passes aware and fix https://bugs.llvm.org/show_bug.cgi?id=44062 This change will also be useful to LLVM generators that need to allocate buffers of vector elements larger than 16 bytes (for eg. 256-bit ones), element boundary alignment for which is not typically provided by glibc malloc. Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Differential Revision: https://reviews.llvm.org/D76970	2020-03-29 23:36:24 +05:30
Sanjay Patel	fc3cc8a4b0	[VectorCombine] skip debug intrinsics first for efficiency	2020-03-29 13:58:04 -04:00
Nikita Popov	26fa33755f	[InstCombine] Simplify select of cmpxchg transform Rather than converting to a dummy select with equal true and false ops, just directly return the resulting value. As a side-effect, this fixes missing DCE of the previously replaced operand.	2020-03-29 18:57:32 +02:00
Nikita Popov	28f67bd5c5	[InstCombine] Fix worklist management in varargs transform Add a replaceUse() helper to mirror replaceOperand() for the rare cases where we're working directly on uses. NFC apart from worklist order changes.	2020-03-29 18:04:12 +02:00
Nikita Popov	6f07a9e80a	[InstCombine] Erase original add when creating saddo Usually when we replaceInstUsesWith() we also return the original instruction, and InstCombine will take care of erasing it. Here we don't do that, so we need to manually erase it. NFC apart from worklist order changes.	2020-03-29 18:01:32 +02:00
Nikita Popov	1e363023b8	[InstCombine] Use replaceOperand() in a few more places To make sure the old operands get DCEd. NFC apart from worklist order changes.	2020-03-29 18:01:00 +02:00
Florian Hahn	49d00824bb	[VPlan] Use one VPWidenRecipe per original IR instruction. (NFC). This patch changes VPWidenRecipe to only store a single original IR instruction. This is the first required step towards modeling its operands as VPValues and also towards breaking it up into a VPInstruction. Discussed as part of D74695. Reviewers: Ayal, gilr, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D76988	2020-03-29 13:47:28 +01:00
Richard Diamond	4bf015c035	[AlignmentFromAssumptions] Fix a SCEV assertion resulting from address space differences. Summary: On targets with different pointer sizes, -alignment-from-assumptions could attempt to create SCEV expressions which use different effective SCEV types. The provided test illustrates the issue. In `getNewAlignment`, AASCEV would be the (only) alloca, which would have an effective SCEV type of i32. But PtrSCEV, the GEP in this case, due to being in the flat/default address space, will have an effective SCEV of i64. This patch resolves the issue by truncating PtrSCEV to AASCEV's effective type. Reviewers: hfinkel, jdoerfert Reviewed By: jdoerfert Subscribers: jvesely, nhaehnle, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75471	2020-03-29 01:26:31 -05:00
Nikita Popov	2215dcf1d7	[InstCombine] Remove unreachable blocks before DCE Dropping unreachable code may reduce use counts on other instructions, so it's better to do this earlier rather than later. NFC-ish, may only impact worklist order.	2020-03-28 21:19:16 +01:00
Nikita Popov	97cc1275c7	[InstCombine] Merge two functions; NFC Merge AddReachableCodeToWorklist() into prepareICWorklistFromFunction(). It's one logical step, and this makes it easier to move code.	2020-03-28 21:19:16 +01:00
Nikita Popov	30d712103f	[InstCombine] Use replaceOperand() API in GEP transforms To make sure that replaced operands get DCEd. This drops one iteration from gepphigep.ll, which is still not optimal. This was the last test case performing more than 3 iterations. NFC-ish, only worklist order should change.	2020-03-28 19:07:25 +01:00
Nikita Popov	b1f78baeaa	[InstCombine] Reduce code duplication in GEP of PHI transform; NFC The `NewGEP->setOperand(DI, NewPN)` call was duplicated, and the insertion of NewGEP is the same in both if/else, so we can extract it.	2020-03-28 19:07:25 +01:00
Nikita Popov	672e8bfbfc	[InstCombine] Fix worklist management in foldXorOfICmps() Because this code does not use the IC-aware replaceInstUsesWith() helper, we need to manually push users to the worklist. This is NFC-ish, in that it may only change worklist order.	2020-03-28 18:25:21 +01:00
Enna1	03bc311a16	[CorrelatedValuePropagation] Remove redundant if statement in processSelect() This statement if (ReplaceWith == S) ReplaceWith = UndefValue::get(S->getType()); was introduced in https://reviews.llvm.org/rG35609d97ae89b8e13f40f4e6b9b056954f8baa83 to fix a case where unreachable code can cause select instruction simplification to fail. In https://reviews.llvm.org/rGd10480657527ffb44ea213460fb3676a6b1300aa, we began to perform a depth-first walk of basic blocks. This means we will not visit unreachable blocks, so we do not need this special check any more. Differential Revision: https://reviews.llvm.org/D76753	2020-03-28 18:01:17 +01:00
Florian Hahn	81f173ed0e	[SCCP] Remove LatticeVal alias now that transition is done (NFC). The LatticeVal alias was introduced to reduce the diff size for the transition to ValueLatticeElement, which is done now. This patch removes the unnecessary alias and updates some very verbose type uses with auto.	2020-03-28 15:40:24 +00:00
Florian Hahn	a44bf59c93	[SCCP] Remove unused toLatticeValue helper (NFC). LatticeVal is an alias for ValueLatticeElement and the function is not used any longer.	2020-03-28 15:40:24 +00:00
Uday Bondhugula	06066c4003	[NFC] Attributor comment updates / cast cleanup Minor update/fixes to comments for the Attributor pass, and dyn_cast -> cast. Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Differential Revision: https://reviews.llvm.org/D76972	2020-03-28 13:36:43 +05:30
Sanjay Patel	0f56bbc1a5	[InstCombine] reduce FP-casted and bitcasted signbit check PR45305: https://bugs.llvm.org/show_bug.cgi?id=45305 Alive2 proofs: http://volta.cs.utah.edu:8080/z/bVyrko http://volta.cs.utah.edu:8080/z/Vxpz9q	2020-03-27 17:33:59 -04:00
Sjoerd Meijer	401a324c51	[LV] Refactor widenIntOrFpInduction. NFC. This untangles the logic in widenIntOrFpInduction in order to make more explicit and visible how exactly the induction variable is lowered. Differential Revision: https://reviews.llvm.org/D76686	2020-03-27 12:58:50 +00:00
Jonathan Roelofs	7a89a5d81b	[InstCombine] Fix Incorrect fold of ashr+xor -> lshr w/ vectors Fixes https://bugs.llvm.org/show_bug.cgi?id=43665	2020-03-26 12:09:36 -06:00
Fangrui Song	4c52d51e78	[InstCombine] Fix a code-sinking bug after D73832/f1a9efabcb9b - UserParent = PN->getIncomingBlock(I->use_begin()); + UserParent = PN->getIncomingBlock(SingleUse); The first use of I may be droppable (llvm.assume). When compiling llvm/lib/IR/AutoUpgrade.cpp with a bootstrapped clang with ThinLTO with minimized bitcode files, I see such a case in the function _ZN4llvm20UpgradeIntrinsicCallEPNS_8CallInstEPNS_8FunctionE clang -c -fthinlto-index=AutoUpgrade.o.thinlto.bc AutoUpgrade.bc -O3 Unfortunately it is really difficult to get a minimized reproducer.	2020-03-25 22:50:53 -07:00
John McCall	9514c048d8	Use optimal layout and preserve alloca alignment in coroutine frames. Previously, we would ignore alloca alignment when building the frame and just use the natural alignment of the allocated type. If an alloca is over-aligned for its IR type, this could lead to a frame entry with inadequate alignment for the downstream uses of the alloca. Since highly-aligned fields also tend to produce poor layouts under a naive layout algorithm, I've also switched coroutine frames to use the new optimal struct layout algorithm. In order to communicate the frame size and alignment to later passes, I needed to set align+dereferenceable attributes on the frame-pointer parameter of the resume function. This is clearly the right thing to do, but the align attribute currently seems to result in assumptions being added during inlining that the optimizer cannot easily remove.	2020-03-26 00:51:09 -04:00
Tyker	f1a9efabcb	Ignore/Drop droppable uses for code-sinking in InstCombine Summary: This patch allows code-sinking in InstCombine to be performed when an instruction has uses in llvm.assume. Uses are considered droppable when it is preferable to modify the User such that the use disappears rather than to prevent a transformation because of the use. For now, uses are considered droppable if they are in an llvm.assume. Reviewers: jdoerfert, nikic, spatel, lebedev.ri, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73832	2020-03-25 20:42:52 +01:00
Alina Sbirlea	3abcbf9903	[CFG/BasicBlock] Rename succ_const to const_succ. [NFC] Summary: Rename `succ_const_iterator` to `const_succ_iterator` and `succ_const_range` to `const_succ_range` for consistency with the predecessor iterators, and the corresponding iterators in MachineBasicBlock. Reviewers: nicholas, dblaikie, nlewycky Subscribers: hiraditya, bmahjour, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75952	2020-03-25 12:40:55 -07:00
Gil Rapaport	078c863305	[LV] Replace stored value with a VPValue (NFCI) InnerLoopVectorizer's code called during VPlan execution still relies on original IR's def-use relations to decide which vector code to generate, limiting VPlan transformations ability to modify def-use relations and still have ILV generate the vector code. This commit introduces a VPValue for VPWidenMemoryInstructionRecipe to use as the stored value. The recipe is generated with a VPValue wrapping the stored value of the scalar store. This reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Differential Revision: https://reviews.llvm.org/D76373	2020-03-25 19:36:55 +02:00
Tyker	d72c586aeb	[NFC] Rename function to match Coding Convention and fix typo in KnowledgeRetention	2020-03-25 18:31:13 +01:00
Johannes Doerfert	5699d08b79	[Attributor] Use knowledge retained in llvm.assume (operand bundles) This patch integrates operand bundle llvm.assumes [0] with the Attributor. Most IRAttributes will now look at uses of the associated value and if there are llvm.assume operand bundle uses with the right tag we will check if they are in the must-be-executed-context (around the context instruction). Droppable users, which is currently only llvm::assume, are handled special in some places now as well. [0] http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D74888	2020-03-24 15:33:40 -05:00
Juneyoung Lee	49f75132bc	[DivRemPairs] Freeze operands if they can be undef values Summary: DivRemPairs is unsound with respect to undef values. ``` // bb1: // %rem = srem %x, %y // bb2: // %div = sdiv %x, %y // --> // bb1: // %div = sdiv %x, %y // %mul = mul %div, %y // %rem = sub %x, %mul ``` If X can be undef, X should be frozen first. For example, let's assume that Y = 1 & X = undef: ``` %div = sdiv undef, 1 // %div = undef %rem = srem undef, 1 // %rem = 0 => %div = sdiv undef, 1 // %div = undef %mul = mul %div, 1 // %mul = undef %rem = sub %x, %mul // %rem = undef - undef = undef ``` http://volta.cs.utah.edu:8080/z/m7Xrx5 Same for Y. If X = 1 and Y = (undef | 1), %rem in src is either 1 or 0, but %rem in tgt can be one of many integer values. This resolves https://bugs.llvm.org/show_bug.cgi?id=42619 . This miscompilation disappears if undef value is removed, but it may take a while. DivRemPair happens pretty late during the optimization pipeline, so this optimization seemed a better candidate to fix using freeze, without major regression, than other broken optimizations. Reviewers: spatel, lebedev.ri, george.burgess.iv Reviewed By: spatel Subscribers: wuzish, regehr, nlopes, nemanjai, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76483	2020-03-25 03:46:14 +09:00
Jun Ma	a44de12ab2	[Coroutines] Also check lifetime intrinsic for local variable when build coroutine frame Currently we move all allocas into the frame when building the coroutine frame in the CoroSplit pass. However, this can be relaxed. Since the CoroSplit pass runs after the Inline pass, we can use lifetime intrinsics to do such analysis: if the scope of a lifetime intrinsic does not cross any suspend point, rather than moving the allocas to the frame, we can just move them to the entry bb of the corresponding function. This reduces the frame size. More importantly, this also avoids a data race in a multithreaded environment. Consider a function inlined into a coroutine: it starts a thread which accesses local variables, while after inlining the movement of the allocas to the frame also accesses them, causing a data race. Differential Revision: https://reviews.llvm.org/D75664	2020-03-24 13:41:55 +08:00
Vedant Kumar	b7cd291c15	[GlobalOpt] Treat null-check of loaded value as use of global (PR35760) PR35760 shows an example program which, when compiled with `clang -O0` or gcc at any optimization level, prints '0'. However, llvm transforms the program in a way that causes it to print '1'. Fix the issue by having `AllUsesOfValueWillTrapIfNull` return false when analyzing a load from a global which is used by an `icmp`. This special case was untested [0] so this is just deleting dead code. An alternative fix might be to change the GlobalStatus analysis for the global to report "Stored" instead of "StoredOnce". However, "StoredOnce" is appropriate when only one value other than the initializer is stored to the global. [0] http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/Transforms/IPO/GlobalOpt.cpp.html#L662 Differential Revision: https://reviews.llvm.org/D76645	2020-03-23 22:36:09 -07:00
Johannes Doerfert	f09f4b2676	[OpenMPOpt] Initialize value to avoid use of uninitialized memory This should fix the issue reported here: https://reviews.llvm.org/D76058#1937554	2020-03-23 19:17:19 -05:00
Matt Arsenault	b20a1d840f	GVNSink: Allow handling addrspacecast	2020-03-23 16:50:58 -04:00
Stefanos Baziotis	a650d555fc	[Attributor][NFC] Refactorings and typos in doc Reviewed By: sstefan1, uenoku Differential Revision: https://reviews.llvm.org/D76175	2020-03-23 22:44:10 +02:00
Matt Arsenault	43d98a0ecf	Allow replacing intrinsic operands with variables Since intrinsics can now specify when an argument is required to be constant, it is now OK to replace arguments with variables if they aren't. This means intrinsics must now be accurately marked with immarg.	2020-03-23 15:51:57 -04:00
Sanjay Patel	a1fe6beb1e	[InstCombine] remove one-use check for ctpop -> cttz Two one-use checks were added with rGfdcb27105537, but only the first one is necessary to limit an increase in instruction count. The second transform only creates one instruction, so it is always a reasonable canonicalization/optimization.	2020-03-23 13:59:57 -04:00
Johannes Doerfert	9d38f98dc3	[OpenMPOpt] Validate declaration types against the expected types Validation of the found runtime library functions declarations types (return and argument types) with the expected types. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D76058	2020-03-23 11:43:36 -05:00
Benjamin Kramer	ff2f5097ed	[Attributor] Fold single-use variable into assert Fixes unused variable warning in Release builds.	2020-03-23 17:41:52 +01:00
Johannes Doerfert	c57689bef2	[Attributor][NFC] Copy llvm::function_ref, don't use references On IRC this was called a "code smell" so we get rid of it.	2020-03-23 10:45:24 -05:00
Johannes Doerfert	68fed27067	[Attributor] Handle calls in AAValueConstantRange properly We did handle calls that were operands of certain instructions but not standalone calls we visit via indirection, e.g., selects.	2020-03-23 10:45:24 -05:00
Johannes Doerfert	54ec9b54f6	[Attributor] Unify handling of must-tail calls We special cased must-tail calls all over the place because they cannot be modified as other calls can be. However, we already centralized the modification API so we can centralize the handling as well. This simplifies the code and allows to remove must-tail calls completely.	2020-03-23 10:45:24 -05:00
Johannes Doerfert	0995001ce5	[Attributor][NFC] Predetermine the module before verification It could happen that we delete the first function in the SCC in the future so we should be careful accessing `Functions` after the manifest stage.	2020-03-23 10:45:23 -05:00
Johannes Doerfert	f3bf4b05c2	[Attributor][NFC] clang-format Attributor.{h,cpp}	2020-03-23 10:45:23 -05:00
Simon Pilgrim	fdcb271055	[InstCombine] Limit CTPOP -> CTTZ simplifications to one use Tweak D76568 so we only combine if it will remove the bit-twiddling. Suggested by @spatel	2020-03-23 14:33:41 +00:00
Guillaume Chatelet	32851f8d63	[Alignment][NFC] Deprecate VectorUtils::getAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, rogfer01, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76542	2020-03-23 13:54:15 +01:00
Simon Pilgrim	72d1419bfb	[InstCombine] Add CTPOP -> CTTZ simplifications (PR43513) As detailed on PR43513, we can simplify: ctpop(x \| -x) -> bitwidth - cttz(x, false) Alive2: http://volta.cs.utah.edu:8080/z/caw49X ctpop(~x & (x - 1)) -> cttz(x, false) Alive2: http://volta.cs.utah.edu:8080/z/5zfVrx I've tweaked the initial test cases I added at rG2d712fb75584 to increase commutativity testing. Differential Revision: https://reviews.llvm.org/D76568	2020-03-23 11:04:33 +00:00
Nikita Popov	dc81923659	[InstCombine] Remove ExpensiveCombines option D75801 removed the last and only user of this option, so we can drop it now. The original idea behind this was to only run expensive transforms under -O3, but apart from the one known bits transform, this has never really taken off. I believe nowadays the recommendation is to put expensive transforms in AggressiveInstCombine instead, though that isn't terribly popular either :) Differential Revision: https://reviews.llvm.org/D76540	2020-03-22 16:56:28 +01:00
Matt Arsenault	830cfda19f	Utils: Mostly convert memcpy expansion to use Align The TTI hooks aren't converted. I also think the intrinsics should have mandatory alignment and never return MaybeAlign.	2020-03-22 11:21:44 -04:00
Nikita Popov	a63eaa5449	[SLP] Avoid repeated visitation in getVectorElementSize(); NFC We need to insert into the Visited set at the same time we insert into the worklist. Otherwise we may end up pushing the same instruction to the worklist multiple times, and only adding it to the visited set later.	2020-03-22 14:34:29 +01:00
Simon Pilgrim	f00a4b531a	[InstCombine][X86] simplifyX86immShift - remove ConstantAggregateZero handling. NFC. The llvm::computeKnownBits path now handles this.	2020-03-21 11:30:44 +00:00
Nikita Popov	2b52e4e629	[InstCombine] Remove known bits constant folding If ExpensiveCombines is enabled (which is the case with -O3 on the legacy PM and always on the new PM), InstCombine tries to compute the known bits of all instructions in the hope that all bits end up being known, which is fairly expensive. How effective is it? If we add some statistics on how often the constant folding succeeds and how many KnownBits calculations are performed and run test-suite we get: "instcombine.NumConstPropKnownBits": 642, "instcombine.NumConstPropKnownBitsComputed": 18744965, In other words, we get one fold for every 30000 KnownBits calculations. However, the truth is actually much worse: Currently, known bits are computed before performing other folds, so there is a high chance that cases that get folded by known bits would also have been handled by other folds. What happens if we compute known bits after all other folds (hacky implementation: https://gist.github.com/nikic/751f25b3b9d9e0860db5dde934f70f46)? "instcombine.NumConstPropKnownBits": 0, "instcombine.NumConstPropKnownBitsComputed": 18105547, So it turns out despite doing 18 million known bits calculations, the known bits fold does not do anything useful on test-suite. I was originally planning to move this into AggressiveInstCombine so it only runs once in the pipeline, but seeing this, I think we're better off removing it entirely. As this is the only use of the "expensive combines" mechanism, it may be removed afterwards, but I'll leave that to a separate patch. Differential Revision: https://reviews.llvm.org/D75801	2020-03-20 20:54:06 +01:00
Nikita Popov	3205d1a860	[InstCombine] Handle known shl nsw sign bit in SimplifyDemanded Ideally SimplifyDemanded should compute the same known bits as computeKnownBits(). This patch addresses one discrepancy, where ValueTracking is more powerful: If we have a shl nsw shift, we know that the sign bit of the input and output must be the same. If this results in a conflict, the result is poison. This is implemented in `2c4ca6832f/lib/Analysis/ValueTracking.cpp (L1175-L1179)` and `2c4ca6832f/lib/Analysis/ValueTracking.cpp (L904-L908)`. This implements the same basic logic in SimplifyDemanded. It's slightly stronger, because I return undef instead of zero for the poison case (which is not an option inside ValueTracking). As mentioned in https://reviews.llvm.org/D75801#inline-698484, we could detect poison in more cases, this just establishes parity with the existing logic. Differential Revision: https://reviews.llvm.org/D76489	2020-03-20 18:16:05 +01:00
Simon Pilgrim	34659de5fd	[InstCombine][X86] simplifyX86immShift - convert variable in-range vector shift by scalar amounts to generic shifts (PR40391) The sll/srl/sra scalar vector shifts can be replaced with generic shifts if the shift amount is known to be in range. This also required public DemandedElts variants of llvm::computeKnownBits to be exposed (PR36319).	2020-03-20 15:48:06 +00:00
Nikita Popov	0372768776	[InstCombine] Simplify calls with "returned" attribute If a call argument has the "returned" attribute, we can simplify the call to the value of that argument. This was already partially handled by InstSimplify/InstCombine for the case where the argument is an integer constant, and the result is thus known via known bits. The non-constant (or non-int) argument cases weren't handled though. This previously landed as an InstSimplify transform, but was reverted due to assertion failures when compiling the Linux kernel. The reason is that simplifying a call to another call breaks assumptions in call graph updating during inlining. As the code is not easy to fix, and there is no particularly strong motivation for having this in InstSimplify, the transform is only performed in InstCombine instead. Differential Revision: https://reviews.llvm.org/D75815	2020-03-20 10:23:39 +01:00
Nikita Popov	5c10967157	[InstCombine] Don't replace musttail result based on known bits This is the same change as D75824, but for two cases where InstCombine performs the same optimization: Replacing an instruction whose bits are fully known with a constant. This is not (generally) legal for musttail calls. Differential Revision: https://reviews.llvm.org/D76457	2020-03-20 10:17:09 +01:00
Florian Hahn	be86bc76f0	[Matrix] Generalize ColumnMatrixTy to MatrixTy (NFC). This patch sets the stage for supporting both row and column major layouts for matrixes. It renames ColumnMatrixTy to MatrixTy, adds booleans indicating the underlying layout to both MatrixTy and ShapeInfo and generalizes the methods of MatrixTy to support both row and column major layouts. Reviewers: Gerolf, anemet, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D76324	2020-03-20 08:32:13 +00:00
Florian Hahn	3a8372ed02	[DSE] Support traversing MemoryPhis. For MemoryPhis, we have to avoid the case where the MemoryPhi may be executed before the access we are currently looking at. To do this we do a post-order numbering of the basic blocks in the function and bail out once we reach a MemoryPhi with a larger (or equal) post-order block number than the current MemoryAccess. This changes the order in which we visit stores for elimination. This patch also adds support for exploring multiple paths. We keep a worklist (ToCheck) of memory accesses that might be eliminated by our starting MemoryDef or MemoryPhis for further exploration. For MemoryPhis, we add the incoming values to the worklist; for MemoryDefs, we add the defining access. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72148	2020-03-20 07:51:42 +00:00
Jun Ma	032251e34d	[Coroutines] Fix PR45130 For now, when final suspend can be simplified by simplifySuspendPoint, handleFinalSuspend is executed as well to remove last case in switch instruction. This patch fixes it. Differential Revision: https://reviews.llvm.org/D76345	2020-03-20 11:27:08 +08:00
Benjamin Kramer	1db8b341a6	[Matrix] Fold single-use variable into assert Avoids -Wunused-variable warnings in Release builds.	2020-03-19 21:42:22 +01:00
Florian Hahn	796fb2e474	[Matrix] Move multiply-add code generation into separate function (NFC). This logic can be shared with the tiled code generation. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75565	2020-03-19 20:26:19 +00:00
Kazu Hirata	e23d786526	[JumpThreading] Fix infinite loop (PR44611) Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44611 by preventing an infinite loop in the jump threading pass when -jump-threading-across-loop-headers is on. Specifically, without this patch, jump threading through two basic blocks would trigger on the same area of the CFG over and over, resulting in an infinite loop. Consider testcase PR44611-across-header-hang.ll in this patch. The first opportunity to thread through two basic blocks is: from bb_body2 through bb_header and bb_body1 to bb_body2. The pass duplicates bb_header and bb_body1 as, say, bb_header.thread1 and bb_body1.thread1. Since bb_header contains a successor edge back to itself, bb_header.thread1 also contains a successor edge to bb_header, immediately giving rise to the next jump threading opportunity: from bb_header.thread1 through bb_header and bb_body1 to bb_body2. After that, we repeatedly thread an incoming edge into bb_header through bb_header and bb_body1 to bb_body2. In other words, we keep peeling one iteration from bb_header's self loop. The patch fixes the problem by preventing the pass from duplicating a basic block containing a self loop. Reviewers: wmi, junparser, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76390	2020-03-19 12:49:36 -07:00
Florian Hahn	0cc2d23751	[Matrix] Hoist load/store generation logic, add helpers for tiled access. This patch slightly generalizes the code to emit loads and stores of a matrix and adds helpers to load/store a tile of a larger matrix. This will be used in a follow-up patch introducing initial tiling. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75564	2020-03-19 19:28:21 +00:00
Simon Pilgrim	a11e5b32df	[InstCombine][X86] simplifyX86immShift - handle variable out-of-range vector shift by immediate amounts (PR40391) If we know the SSE shift amount is out of range then we can simplify to zero value (logical) or a 'signsplat' bitwidth-1 shift (arithmetic). This allows us to remove the equivalent ConstantInt constant folding path from simplifyX86immShift.	2020-03-19 18:27:31 +00:00
Simon Pilgrim	433897da4a	[InstCombine][X86] simplifyX86immShift - convert variable in-range vector shift by immediate amounts to generic shifts (PR40391) The slli/srli/srai 'immediate' vector shifts (although the amount is no longer an immediate, to match gcc) can be replaced with generic shifts if the shift amount is known to be in range.	2020-03-19 15:44:24 +00:00
Florian Hahn	4a58996dd2	[SCCP] Use constant ranges for PHI nodes. For PHIs with multiple incoming values, we can improve precision by using constant ranges for integers. We can over-approximate phis by merging the incoming values. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71933	2020-03-19 12:45:33 +00:00
Florian Hahn	8a36594a7e	[SCCP] Use constant ranges for binary operators. If one of the operands of a binary operator is a constant range, we can use ConstantRange::binaryOp to approximate the result. We still handle single element constant ranges as we did previously, with ConstantExpr::get(), because ConstantRange::binaryOp still gives worse results in a few cases for single element ranges. Also note that we bail out early if any of the operands is still unknown. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71936	2020-03-19 09:35:48 +00:00
Huihui Zhang	2ea5495759	[InstCombine][SVE] Fix InstCombiner::visitAllocaInst for scalable vector. Summary: DataLayout::getTypeAllocSize() returns a TypeSize. For cases where the scalable property doesn't matter (checking for a zero-sized alloca), we should explicitly call getKnownMinSize() to avoid an implicit type conversion to uint64_t, which is invalid for scalable vector types. Reviewers: sdesmalen, efriedma, spatel, apazos Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76386	2020-03-18 20:57:14 -07:00
Florian Hahn	fd2c15e602	[VPlan] Do not print mapping for Value2VPValue. The latest improvements to VPValue printing make this mapping clear when printing the operand. Printing the mapping separately is not required any longer. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76375	2020-03-18 21:44:07 +00:00
Florian Hahn	00c1cd1934	[VPlan] Record underlying value for VPValues created by addVPValue (NFC). Now that printing VPValues uses the underlying IR value name, if available, recording the underlying value here improves printing. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76374	2020-03-18 21:30:58 +00:00
Eli Friedman	e24e95fe90	Remove CompositeType class. The existence of the class is more confusing than helpful, I think; the commonality is mostly just "GEP is legal", which can be queried using APIs on GetElementPtrInst. Differential Revision: https://reviews.llvm.org/D75660	2020-03-18 13:53:17 -07:00
Florian Hahn	e6a74803d4	[VPlan] Use underlying value for printing, if available. When an underlying value is available, we can use its name for printing, as discussed in D73078. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76200	2020-03-18 17:46:57 +00:00
Simon Pilgrim	f4e495a18e	[InstCombine][X86] simplifyX86varShift - convert variable in-range per-element shift amounts to generic shifts (PR40391) AVX2/AVX512 per-element shifts can be replaced with generic shifts if the shift amounts are guaranteed to be in-range (upper bits are known zero).	2020-03-18 11:26:54 +00:00
Florian Hahn	5672ae8d86	[SCCP] Use constant ranges for select, if cond is overdefined. For selects with an unknown condition, we can approximate the result by merging the state of both options. This automatically takes care of the case where one operand is undef. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71935	2020-03-18 09:26:02 +00:00
Michael Liao	f2f8bdc2b1	Fix `-Wunused-variable` warning. NFC.	2020-03-17 20:15:50 -04:00
Florian Hahn	a72ae99cf9	[SCCP] Split up callsite handling, only propagate result on change (NFC) Functions include their arguments in the use-list. Changed function values mean that the result of the function changed. We only need to update the call sites with the new function result and do not have to propagate the call arguments. To do so, this patch splits up the visitCallSite into handleCallResult and handleCallArguments and updates markUsersAsChanged to only update call results for functions. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D75846	2020-03-17 20:05:35 +00:00
Sanjay Patel	be9e3d9416	[InstCombine] reduce demand-limited bool math to logic, part 2 Follow-on suggested in: D75961	2020-03-17 15:18:18 -04:00
Tyker	e8ac825f5b	[AssumeBundles] Detection of Empty bundles Summary: Prevent InstCombine from removing llvm.assume calls whose argument is true when they have operand bundles with useful information. Reviewers: jdoerfert, nikic, lebedev.ri Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76147	2020-03-17 15:50:15 +01:00
Florian Hahn	1d6f919df2	[SCCP] Explicitly mark values as overdefined (NFC). This was part of D60582 but can be committed separately.	2020-03-17 12:13:30 +00:00
Roman Lebedev	398b497cd0	[NFC] LoopRotate: do issue debug message when not rotating due to instr count It is somewhat problematic to notice this issue otherwise.	2020-03-17 09:26:09 +03:00
Serguei Katkov	80c351cdb6	[InstCombine] Transform to undef incorrect atomic unordered mem intrinsics According to LangRef: If len is not a positive integer multiple of element_size, then the behaviour of the intrinsic is undefined. Add InstCombine rule to transform intrinsic to undef operation. This is a follow-up for D76116. Reviewers: reames Reviewed By: reames Subscribers: hiraditya, jfb, dantrushin, llvm-commits Differential Revision: https://reviews.llvm.org/D76215	2020-03-17 10:20:16 +07:00
Matt Arsenault	b0bdb186f5	Utils: Always set alignment when expanding mem intrinsics This was creating natural aligned loads and stores, which may not be the case. The target could request a wider type load with less alignment.	2020-03-16 14:34:29 -04:00
Matt Arsenault	05e7d8d6ce	TTI: Add addrspace parameters to memcpy lowering functions	2020-03-16 14:34:29 -04:00
Florian Hahn	4878aa36d4	[ValueLattice] Add new state for undef constants. This patch adds a new undef lattice state, which is used to represent UndefValue constants or instructions producing undef. The main difference from the unknown state is that merging undef values with constants (or single element constant ranges) produces the constant/constant range, assuming all uses of the merge result will be replaced by the found constant. In contrast, merging non-single element ranges with undef needs to go to overdefined. Using unknown for UndefValues currently causes mis-compiles in CVP/LVI (PR44949) and will become problematic once we use ValueLatticeElement for SCCP. Reviewers: efriedma, reames, davide, nikic Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D75120	2020-03-14 17:19:59 +00:00
Whitney Tsang	aca7167535	[NFC][LoopUnrollAndJam] clang-format. I am currently working on this file.	2020-03-14 00:04:10 +00:00
Akira Hatanaka	c6f1713c46	[ObjC][ARC] Don't remove autoreleaseRV/retainRV pairs if the call isn't a tail call This reapplies the patch in https://reviews.llvm.org/rG1f5b471b8bf4, which was reverted because it was causing crashes. https://bugs.chromium.org/p/chromium/issues/detail?id=1061289#c2 Check that HasSafePathToCall is true before checking the call is a tail call. Original commit message: Previously the ARC optimizer removed the autoreleaseRV/retainRV pair in the following code, which caused the object returned by @something to be placed in the autorelease pool because the call to @something isn't a tail call: ``` %call = call i8* @something(...) %2 = call i8* @objc_retainAutoreleasedReturnValue(i8* %call) %3 = call i8* @objc_autoreleaseReturnValue(i8* %2) ret i8* %3 ``` Fix the bug by checking whether @something is a tail call. rdar://problem/59275894	2020-03-13 13:52:14 -07:00
Alexey Zhikhartsev	f71abec661	[LoopInterchange] Fix interchanging contents of preheader BBs Summary: Previously LCSSA was getting broken by placing instructions into the (newly) inner header instead of the preheader. Fixes PR43474 Reviewers: fhahn Reviewed By: fhahn Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75943	2020-03-13 15:59:37 -04:00
Reid Kleckner	478b06e687	Revert "[ObjC][ARC] Check the basic block size before calling DominatorTree::dominate" This reverts commit `5c3117b0a9` This should not be necessary after `7593a480db`, and Florian Hahn has confirmed that the problem no longer reproduces with this patch. I happened to notice this code because the FIXME talks about OrderedBasicBlock. Reviewed By: fhahn, dexonsmith Differential Revision: https://reviews.llvm.org/D76075	2020-03-13 11:57:55 -07:00
Huihui Zhang	fc1f205745	[SLPVectorizer][SVE] Bail out early for scalable vector. Summary: SLPVectorizer tries to vectorize a list of scalar instructions of the same type; instructions already vectorized are rejected through isValidElementType(). Without this patch, tryToVectorizeList() will first try to determine the vectorization factor of a list of instructions before checking whether each instruction has an unsupported type. For instructions already vectorized for SVE, it will crash at getVectorElementSize(), where it tries to return a fixed size. This patch makes sure invalid element types are rejected before trying to get the vectorization factor. This ensures we are not trying to vectorize instructions that are already vectorized. Reviewers: sdesmalen, efriedma, spatel, RKSimon, ABataev, apazos, rengolin Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76017	2020-03-13 11:23:31 -07:00
Sanjay Patel	94f5d73182	[SimplifyCFG] fix formatting; NFC	2020-03-13 14:12:28 -04:00
Sanjay Patel	51e53af11c	[SimplifyCFG] fix debug print formatting; NFC	2020-03-13 14:12:28 -04:00
Florian Hahn	0c5b6e2ea5	Recommit "[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI)" This patch should fix the cause of the stage2 failures and PR45185. This reverts the revert commit `c52f839e72`.	2020-03-13 17:03:22 +00:00
Tyker	69375fd0a3	[AssumeBundles] Preserve Information in the inliner Summary: During inlining, create and insert an llvm.assume with attributes to preserve them. To prevent any changes for now, generation of llvm.assume is under a flag that is disabled by default. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75825	2020-03-13 17:35:47 +01:00
omarahmed1111	b285b333dc	[Attributor] Detect possibly unbounded cycles in functions This patch adds the mayContainUnboundedCycle helper function, which checks whether a function has any cycle that we do not know to be bounded. Loops with a maximum trip count are considered bounded; any other cycle is not. It also contains some fixed tests, and some added tests that contain bounded and unbounded loops and non-loop cycles. Reviewed By: jdoerfert, uenoku, baziotis Differential Revision: https://reviews.llvm.org/D74691	2020-03-13 11:17:33 -05:00
Pankaj Gode	bf990530ae	[Attributor] Improve noalias preservation using reachability Resolution for below fixme: (ii) Check whether the value is captured in the scope using AANoCapture. FIXME: This is conservative though, it is better to look at CFG and check only uses possibly executed before this callsite. Propagates caller argument's noalias attribute to callee. Reviewed by: jdoerfert, uenoku Reviewers: jdoerfert, sstefan1, uenoku Subscribers: uenoku, sstefan1, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D71617	2020-03-13 21:09:08 +05:30
Sanjay Patel	cbeffa3f6c	[SimplifyCFG] convert if-else chain to switch; NFC Fix formatting of related function names while changing the code.	2020-03-13 10:28:41 -04:00
Nico Weber	86eb2c3991	Revert "[ObjC][ARC] Don't remove autoreleaseRV/retainRV pairs if the call isn't" This reverts commit `1f5b471b8b`. Causes asserts when building code with arc. See https://bugs.chromium.org/p/chromium/issues/detail?id=1061289#c2 for a full repro. Will post a creduced repro once creduce is done running.	2020-03-13 10:16:02 -04:00
Johannes Doerfert	a198adb490	[Attributor] IPO across definition boundary of a function marked alwaysinline Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D75590	2020-03-13 01:06:12 -05:00
rathod-sahaab	263c4a3c75	Fix compiler warning when compiling without asserts This patch aims to prevent warning-as-error failures in release build. As suggested in this comment https://reviews.llvm.org/D69930#1910922 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D75970	2020-03-13 00:26:49 -05:00
Huihui Zhang	118abf2017	[SVE] Update API ConstantVector::getSplat() to use ElementCount. Summary: Support ConstantInt::get() and Constant::getAllOnesValue() for scalable vector type, this requires ConstantVector::getSplat() to take in 'ElementCount', instead of 'unsigned' number of element count. This change is needed for D73753. Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74386	2020-03-12 13:22:41 -07:00
Florian Hahn	c52f839e72	Revert "[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI)" This commit is likely causing clang-with-lto-ubuntu to fail http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/16052 Also causes PR45185. This reverts commit `f1ac5d2263`.	2020-03-12 18:49:11 +00:00
Hideto Ueno	d9bf79f4e9	[Attributor][FIX] Add a missing dependence track in noalias deduction	2020-03-12 15:27:35 +00:00
Florian Hahn	f1ac5d2263	[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI) This patch switches SCCP to use ValueLatticeElement for lattice values, instead of the local LatticeVal, as first step to enable integer range support. This patch does not make use of constant ranges for additional operations and the only difference for now is that integer constants are represented by single element ranges. To preserve the existing behavior, the following helpers are used * isConstant(LV): returns true when LV is either a constant or a constant range with a single element. This should return true in the same cases where LV.isConstant() returned true previously. * getConstant(LV): returns a constant if LV is either a constant or a constant range with a single element. This should return a constant in the same cases as LV.getConstant() previously. * getConstantInt(LV): same as getConstant, but additionally casted to ConstantInt. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D60582	2020-03-12 12:03:06 +00:00
Max Kazantsev	3dc6e53c97	[LoopPeel] Turn incorrect assert into a check Summary: This patch replaces an incorrect assert with a check. Previously it asserted that if SCEV cannot prove `isKnownPredicate(A != B)`, then it should be able to prove `isKnownPredicate(A == B)`. Both of these facts may be unprovable. This is shown in the provided test: Could not prove: `{-294,+,-2}<%bb1> != 0` Asserting: `{-294,+,-2}<%bb1> == 0` Obviously, this SCEV is not equal to zero, but 0 is in its range, so we also cannot prove that it is not zero. Instead of asserting, we should check the required conditions explicitly. Reviewers: lebedev.ri, fhahn, sanjoy, fedor.sergeev Reviewed By: lebedev.ri Subscribers: hiraditya, zzheng, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76050	2020-03-12 17:23:07 +07:00
Sanjay Patel	fae900921b	[InstCombine] reduce demand-limited bool math to logic The cmp math test is inspired by memcmp() patterns seen in D75840. I know there's at least 1 related fold we can do here if both values are sext'd, but I'm not seeing a way to generalize further. We have some other bool math patterns that we want to reduce, but that might require fixing the bogus transforms noted in D72396. Alive proof translations of the regression tests: https://rise4fun.com/Alive/zGWi Name: demand add 1 %xz = zext i1 %x to i32 %ys = sext i1 %y to i32 %sub = add i32 %xz, %ys %r = lshr i32 %sub, 31 => %notx = xor i1 %x, 1 %and = and i1 %y, %notx %r = zext i1 %and to i32 Name: demand add 2 %xz = zext i1 %x to i5 %ys = sext i1 %y to i5 %sub = add i5 %xz, %ys %r = and i5 %sub, 16 => %notx = xor i1 %x, 1 %and = and i1 %y, %notx %r = select i1 %and, i5 -16, i5 0 Name: demand add 3 %xz = zext i1 %x to i8 %ys = sext i1 %y to i8 %a = add i8 %ys, %xz %r = ashr i8 %a, 7 => %notx = xor i1 %x, 1 %and = and i1 %y, %notx %r = sext i1 %and to i8 Name: cmp math %gt = icmp ugt i32 %x, %y %lt = icmp ult i32 %x, %y %xz = zext i1 %gt to i32 %yz = zext i1 %lt to i32 %s = sub i32 %xz, %yz %r = lshr i32 %s, 31 => %r = zext i1 %lt to i32 Differential Revision: https://reviews.llvm.org/D75961	2020-03-11 15:45:58 -04:00
Florian Hahn	bc6c8c4bbb	[Matrix] Add remark propagation along the inlined-at chain. This patch adds support for propagating matrix expressions along the inlined-at chain and emitting remarks at the traversed function scopes. To motivate this new behavior, consider the example below. Without the remark 'up-leveling', we would only get remarks in load.h and store.h, but we cannot generate a remark describing the full expression in toplevel.cpp, which is the place where the user has the best chance of spotting/fixing potential problems. With this patch, we generate a remark for the load in load.h, one for the store in store.h and one for the complete expression in toplevel.cpp. For a bigger example, please see remarks-inlining.ll. load.h: template <typename Ty, unsigned R, unsigned C> Matrix<Ty, R, C> load(Ty *Ptr) { Matrix<Ty, R, C> Result; Result.value = *reinterpret_cast <typename Matrix<Ty, R, C>::matrix_t *>(Ptr); return Result; } store.h: template <typename Ty, unsigned R, unsigned C> void store(Matrix<Ty, R, C> M1, Ty *Ptr) { *reinterpret_cast<typename decltype(M1)::matrix_t *>(Ptr) = M1.value; } toplevel.cpp void test(double *A, double *B, double *C) { store(add(load<double, 3, 5>(A), load<double, 3, 5>(B)), C); } For a given function, we traverse the inlined-at chain for each matrix instruction (= instructions with shape information). We collect the matrix instructions in each DISubprogram we visit. This produces a mapping of DISubprogram -> (List of matrix instructions visible in the subprogram). We then generate remarks using the list of instructions for each subprogram in the inlined-at chain. Note that the list of instructions for a subprogram includes the instructions from its own subprograms recursively. For example using the example above, for the subprogram 'test' this includes inline functions 'load' and 'store'. This allows surfacing the remarks at a level useful to users. Please note that the current approach may create a lot of extra remarks. 
Additional heuristics to cut-off the traversal can be implemented in the future. For example, it might make sense to stop 'up-leveling' once all matrix instructions are at the same debug location. Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D73600	2020-03-11 17:40:08 +00:00
Anna Welker	a6d3bec83f	[TTI][ARM][MVE] Refine gather/scatter cost model Refines the gather/scatter cost model, but also changes the TTI function getIntrinsicInstrCost to accept an additional parameter which is needed for the gather/scatter cost evaluation. This did require trivial changes in some non-ARM backends to adopt the new parameter. Extending gathers and truncating scatters are now priced cheaper. Differential Revision: https://reviews.llvm.org/D75525	2020-03-11 10:23:41 +00:00
Fangrui Song	a0c0389ffb	[SimplifyLibcalls] Don't replace locked IO (fgetc/fgets/fputc/fputs/fread/fwrite) with unlocked IO (_unlocked) This essentially reverts some of the SimplifyLibcalls part changes of D45736 [SimplifyLibcalls] Replace locked IO with unlocked IO. C11 7.21.5.2 The fflush function > If stream is a null pointer, the fflush function performs this flushing action on all streams for which the behavior is defined above. i.e. fopen'ed FILE is inherently captured. POSIX.1-2017 getc_unlocked, getchar_unlocked, putc_unlocked, putchar_unlocked - stdio with explicit client locking > These functions can safely be used in a multi-threaded program if and only if they are called while the invoking thread owns the ( FILE ) object, as is the case after a successful call to the flockfile() or ftrylockfile() functions. After a thread fopen'ed a FILE, when it is calling foobar() which is now replaced by foobar_unlocked(), if another thread is concurrently calling fflush(0), the behavior is undefined. C11 7.22.4.4 The exit function > Next, all open streams with unwritten buffered data are flushed, all open streams are closed, and all files created by the tmpfile function are removed. The replacement is only feasible if the program is single threaded, or exit or fflush(0) is never called. See also http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180528/556615.html for how the replacement makes libc interceptors difficult to implement. dalias: in a worst case, it's unbounded data corruption because of concurrent access to pointers without synchronization. f->wpos or rpos could get outside of the buffer, thread A could do f->wpos += j after knowing j is in bounds, while thread B also changes it concurrently. This can produce exploitable conditions depending on libc internals. Revert the SimplifyLibcalls part change because the cons obviously overweigh the pros. 
Even when the replacement is feasible, the benefit is indemonstrable, more so in an application instead of an artificial glibc benchmark. Theoretically the replacement could be beneficial when calling getc_unlocked/putc_unlocked in a loop, but then it is better using a blocked IO operation and the user is likely aware of that. The function attribute inference is still useful and thus kept. Reviewed By: xbolva00 Differential Revision: https://reviews.llvm.org/D75933	2020-03-10 11:11:58 -07:00
Benjamin Kramer	247a177cf7	Give helpers internal linkage. NFC.	2020-03-10 18:27:42 +01:00
Tyker	a4cde9ad7b	Fixed [AssumeBundles] Move to IR so it can be used by Analysis This is a recommit of `57c964aaa7` after fixing modules build.	2020-03-10 18:02:39 +01:00
Simon Moll	d871ef4e6a	[instcombine] remove fsub to fneg hacks; only emit fneg Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp negation. This also extends the scalarization cost in instcombine for unary operators to result in the same IR rewrites for fneg as for the idiom. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D75467	2020-03-10 16:57:02 +01:00
Florian Hahn	c8c14d979a	[InstCombine] Support vectors in SimplifyAddWithRemainder. SimplifyAddWithRemainder currently also matches for vector types, but tries to create an integer constant, which causes a crash. By using Constant::getIntegerValue() we can support both the scalar and vector cases. The 2 added test cases crash without the fix. Reviewers: spatel, lebedev.ri Reviewed By: spatel, lebedev.ri Differential Revision: https://reviews.llvm.org/D75906	2020-03-10 14:29:40 +00:00
Jonas Paulsson	c2dafe12dc	[SimplifyCFG] Skip merging return blocks if it would break a CallBr. SimplifyCFG should not merge empty return blocks and leave a CallBr behind with a duplicated destination since the verifier will then trigger an assert. This patch checks for this case and avoids the transformation. CodeGenPrepare has a similar check which also has a FIXME comment about why this is needed. It seems perhaps better if these two passes would eventually instead update the CallBr instruction instead of just checking and avoiding. This fixes https://bugs.llvm.org/show_bug.cgi?id=45062. Review: Craig Topper Differential Revision: https://reviews.llvm.org/D75620	2020-03-10 14:59:13 +01:00
Sanjay Patel	467eec0910	[InstCombine] fold gep-of-select-of-constants (PR45084) As shown in: https://bugs.llvm.org/show_bug.cgi?id=45084 ...we failed to combine a gep with constant indexes with a pointer operand that is a select of constants. Differential Revision: https://reviews.llvm.org/D75807	2020-03-10 09:25:13 -04:00
Florian Hahn	2d6ecf4648	[SLP] Support vectorizing functions provided by vector libs. It seems like the SLPVectorizer is currently not aware of vector versions of functions provided by libraries like Accelerate [1]. This patch updates SLPVectorizer to use the same infrastructure the LoopVectorizer uses to detect vectorizable library functions. For calls, it computes the cost of an intrinsic call (existing behavior) and the cost of a vector function library call, if available. Like LoopVectorizer, it assumes the cost of the vector function is simply the cost of a call to a vector function. [1] https://developer.apple.com/documentation/accelerate Reviewers: ABataev, RKSimon, spatel Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D75878	2020-03-10 13:10:50 +00:00
ahatanak	1f5b471b8b	[ObjC][ARC] Don't remove autoreleaseRV/retainRV pairs if the call isn't a tail call Previously, the ARC optimizer removed the autoreleaseRV/retainRV pair in the following code, which caused the object returned by @something to be placed in the autorelease pool because the call to @something isn't a tail call: ``` %call = call i8* @something(...) %2 = call i8* @objc_retainAutoreleasedReturnValue(i8* %call) %3 = call i8* @objc_autoreleaseReturnValue(i8* %2) ret i8* %3 ``` Fix the bug by checking whether the call to @something is a tail call. rdar://problem/59275894	2020-03-09 13:21:38 -07:00
Nikita Popov	c3ca6876ed	[InstCombine] Don't simplify calls without uses When simplifying a call without uses, replaceInstUsesWith() is going to do nothing, but we'll skip all following folds. We can only run into this problem with calls that both simplify and are not trivially dead if unused, which currently seems to happen only with calls to undef, as the test diff shows. When extending SimplifyCall() to handle "returned" attributes, this becomes a much bigger problem, so I'm fixing this first. Differential Revision: https://reviews.llvm.org/D75814	2020-03-09 18:47:46 +01:00
Jonas Devlieghere	882f589e20	Revert "[AssumeBundles] Move to IR so it can be used by Analysis" This breaks the modules build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/ http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/ This reverts commit `57c964aaa7`.	2020-03-09 09:02:47 -07:00
evgeny	6d2032e259	[WPD] Provide a way to prevent functions from being devirtualized Differential revision: https://reviews.llvm.org/D75617	2020-03-09 14:05:15 +03:00
Hideto Ueno	bdcbdb4848	[Attributor] Deduction based on path exploration This patch introduces the propagation of known information based on path exploration. For example, ``` int u(int c, int *p){ if(c) { return *p; } else { return *p + 1; } } ``` An argument `p` is dereferenced whatever c's value is. For an instruction `CtxI`, we accumulate branch instructions in the must-be-executed-context of `CtxI` and then, we take the conjunction of the successors' known state. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D65593	2020-03-09 14:29:26 +09:00
Sanjay Patel	a69158c12a	[VectorCombine] fold extract-extract-op with different extraction indexes opcode (extelt V0, Ext0), (ext V1, Ext1) --> extelt (opcode (splat V0, Ext0), V1), Ext1 The first part of this patch generalizes the cost calculation to accept different extraction indexes. The second part creates a shuffle+extract before feeding into the existing code to create a vector op+extract. The patch conservatively uses "TargetTransformInfo::SK_PermuteSingleSrc" rather than "TargetTransformInfo::SK_Broadcast" (splat specifically from element 0) because we do not have a more general "SK_Splat" currently. That does not affect any of the current regression tests, but we might be able to find some cost model target specialization where that comes into play. I suspect that we can expose some missing x86 horizontal op codegen with this transform, so I'm speculatively adding a debug flag to disable the binop variant of this transform to allow easier testing. The test changes show that we're sensitive to cost model diffs (as we should be), so that means that patches like D74976 should have better coverage. Differential Revision: https://reviews.llvm.org/D75689	2020-03-08 09:57:55 -04:00
Tyker	57c964aaa7	[AssumeBundles] Move to IR so it can be used by Analysis Summary: Assume bundles need to be usable by Analysis, and Transforms/Utils isn't, so this commit moves the utilities that deal with assume bundles to IR. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75618	2020-03-08 12:21:50 +01:00
Tyker	84056394e9	[AssumeBundles] Add API to query a bundle from a use Summary: Finding what information is known about a value from a use is generally useful and can be done quickly. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75616	2020-03-08 12:04:23 +01:00
Nikita Popov	51a466a61f	[InstCombine] Fix known bits handling in SimplifyDemandedUseBits Fixes a regression from D75801. SimplifyDemandedUseBits() is also supposed to compute the known bits (of the demanded subset) of the instruction. For unknown instructions it does so by directly calling computeKnownBits(). For known instructions it will compute known bits itself. However, for instructions where only some cases are handled directly (e.g. a constant shift amount) the known bits invocation for the unhandled case is sometimes missing. This patch adds the missing calls and thus removes the main discrepancy with ExpensiveCombines mode. Differential Revision: https://reviews.llvm.org/D75804	2020-03-07 18:16:41 +01:00
Stefanos Baziotis	01c48d7d11	[Attributor] Fold terminators before changing instructions to unreachable It is possible that an instruction to be changed to unreachable is in the same block with a terminator that can be constant-folded. In this case, as of now, the instruction will be changed to unreachable before the terminator is folded. But, then the whole BB becomes invalidated and so when we go ahead to fold the terminator, we trap. Change the order of these two. Differential Revision: https://reviews.llvm.org/D75780	2020-03-07 12:38:44 +02:00
Andrew Monshizadeh	c5a06019d2	Extend TimeTrace to LLVM's new pass manager With the addition of the LLD time tracing it made sense to include coverage for LLVM's various passes. Doing so ensures that ThinLTO is also covered with a time trace. Before: {F11333974} After: {F11333928} Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D74516	2020-03-06 14:45:19 -08:00
Anna Thomas	59029b9eef	[RS4GC] Handle uses of extractelement for conversion from vector to scalar base As mentioned in the comments, extractelement is special since we actually want a scalar base for that element we extracted from the vector (i.e. not a vector base). This same logic should apply to uses of the extractelement such as phis and selects which have the same BDV as the extractelement. However, for these uses we conservatively mark the BDV state as conflict, since setting the EE's new base BDV does not always dominate these uses. Added testcase showcases the problem where the BDV identification chokes on the incorrect cast from vector to scalar for the phi use of extractelement. Tests-Run: make check, internal fuzzer testing Reviewers: reames, skatkov, dantrushin Reviewed-By: dantrushin Differential Revision: https://reviews.llvm.org/D75704	2020-03-06 16:28:49 -05:00
Roman Lebedev	1badf7c33a	[InstCombine] Forgo one-use check in `(X - (X & Y)) --> (X & ~Y)` if Y is a constant Summary: This is potentially more friendly for further optimizations and analyses, e.g.: https://godbolt.org/z/G24anE This resolves a phase-ordering bug that was introduced in D75145 for https://godbolt.org/z/2gBwF2 https://godbolt.org/z/XvgSua Reviewers: spatel, nikic, dmgreen, xbolva00 Reviewed By: nikic, xbolva00 Subscribers: hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75757	2020-03-06 21:39:07 +03:00
Jay Foad	11d1573bb6	[APFloat] Make use of new overloaded comparison operators. NFC. Reviewers: ekatz, spatel, jfb, tlively, craig.topper, RKSimon, nikic, scanon Subscribers: arsenm, jvesely, nhaehnle, hiraditya, dexonsmith, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75744	2020-03-06 16:42:53 +00:00
Fangrui Song	952ee0df9e	ThinLTOBitcodeWriter: drop dso_local when a GlobalVariable is converted to a declaration If we infer the dso_local flag for -fpic, dso_local should be dropped when we convert a GlobalVariable to a declaration. dso_local causes the generation of direct access (e.g. R_X86_64_PC32). Such relocations referencing STB_GLOBAL STV_DEFAULT objects are not allowed in a -shared link. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D74749	2020-03-05 18:09:33 -08:00
Zhongduo Lin	eae228a292	[IndVarSimplify] Extend previous special case for load use instruction to any narrow type loop variant to avoid extra trunc instruction Summary: The widenIVUse avoids generating trunc by evaluating the use as AddRec, this will not work when: 1) SCEV traces back to an instruction inside the loop that SCEV can not expand, eg. add %indvar, (load %addr) 2) SCEV finds a loop variant, eg. add %indvar, %loopvariant While SCEV fails to avoid trunc, we can still try to use instruction combining approach to prove trunc is not required. This can be further extended with other instruction combining checks, but for now we handle the following case (sub can be "add" and "mul", "nsw + sext" can be "nuw + zext") ``` Src: %c = sub nsw %b, %indvar %d = sext %c to i64 Dst: %indvar.ext1 = sext %indvar to i64 %m = sext %b to i64 %d = sub nsw i64 %m, %indvar.ext1 ``` Therefore, as long as the result of add/sub/mul is extended to wide type with right extension and overflow wrap combination, no trunc is required regardless of how %b is generated. This pattern is common when calculating address in 64 bit architecture. Note that this patch reuses almost all the code from D49151 by @az: https://reviews.llvm.org/D49151 It extends it by providing proof of why trunc is unnecessary in more general case, it should also resolve some of the concerns from the following discussion with @reames. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180910/585945.html Reviewers: sanjoy, efriedma, sebpop, reames, az, javed.absar, amehsan Reviewed By: az, amehsan Subscribers: hiraditya, llvm-commits, amehsan, reames, az Tags: #llvm Differential Revision: https://reviews.llvm.org/D73059	2020-03-05 16:27:59 -05:00
Hiroshi Yamauchi	76b9901fb1	[PGO][PGSO] Use IsColdXNthPercentile for sample PGO. Summary: This performs better for sample PGO. NFC as PGSOColdCodeOnlyForSamplePGO is still true. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75550	2020-03-05 09:54:54 -08:00
Florian Hahn	40e7bfc424	[VPlan] Use consecutive numbers to print VPValues instead of addresses. Currently when printing VPValues we use the object address, which makes it hard to distinguish VPValues as they usually are large numbers with varying distance between them. This patch adds a simple slot tracker, similar to the ModuleSlotTracker used for IR values. In order to dump a VPValue or anything containing a VPValue, a slot tracker for the enclosing VPlan needs to be created. The existing VPlanPrinter can take care of that for the existing code. We assign consecutive numbers to each VPValue we encounter in a reverse post order traversal of the VPlan. Reviewers: rengolin, hsaito, fhahn, Ayal, dorit, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D73078	2020-03-05 14:55:15 +00:00
Simon Pilgrim	01a91a6de7	Fix static analyzer uninitialized variable warning. NFCI.	2020-03-05 14:22:24 +00:00
Jun Ma	b10deb9487	[Coroutines] Optimized coroutine elision based on reachability Differential Revision: https://reviews.llvm.org/D75440	2020-03-05 14:43:50 +08:00
Sameer Sahasrabuddhe	42febbab91	StructurizeCFG: simplify phi nodes when possible After structurization, some phi nodes can have a single incoming edge and can be simplified away. This change runs a simplify query on all phis that are either modified or added by the structurizer. This also moves some phis closer to their use as a side benefit. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D75500	2020-03-05 10:33:15 +05:30
Guozhi Wei	ee9a3eba76	[CodeGenPrepare] Handle ExtractValueInst in dupRetToEnableTailCallOpts As the test case shows if there is an ExtractValueInst in the Ret block, function dupRetToEnableTailCallOpts can't duplicate it into the block containing call. So later no tail call is generated in CodeGen. This patch adds the ExtractValueInst handling code in function dupRetToEnableTailCallOpts and FoldReturnIntoUncondBranch, and later tail call can be generated for this case. Differential Revision: https://reviews.llvm.org/D74242	2020-03-04 11:10:32 -08:00
David Green	38e532278e	[LSR] Add masked load and store handling This teaches Loop Strength Reduction the details about masked load and store address operands, so that it can have a better time optimising them as it would for normal loads and stores. Differential Revision: https://reviews.llvm.org/D75371	2020-03-04 18:36:10 +00:00
Nikita Popov	293d813020	[InstCombine] Don't explicitly invoke const folding in shift combine InstCombine uses an IRBuilder that automatically performs target-dependent constant folding, so explicitly invoking it here is not necessary.	2020-03-04 18:33:00 +01:00
Nikita Popov	9b5de84e27	[InstCombine] Use IRBuilder to create bitcast This makes sure that the constant expression bitcast goes through target-dependent constant folding, and thus avoids an additional iteration of InstCombine.	2020-03-04 18:28:38 +01:00
Nikita Popov	0e890cd4d4	[ConstantFolding] Always return something from ConstantFoldConstant Spin-off from D75407. As described there, ConstantFoldConstant() currently returns null for non-ConstantExpr/ConstantVector inputs, but otherwise always returns non-null, independently of whether any folding has happened or not. This is confusing and makes consumer code more complicated. I would expect either that ConstantFoldConstant() returns only if it actually folded something, or that it always returns non-null. I'm going to the latter possibility here, which appears to be more useful considering existing usage. Differential Revision: https://reviews.llvm.org/D75543	2020-03-04 18:24:47 +01:00
Sanjay Patel	71a316883d	[PassManager] adjust VectorCombine placement The initial placement of vector-combine in the opt pipeline revealed phase ordering bugs: https://bugs.llvm.org/show_bug.cgi?id=45015 https://bugs.llvm.org/show_bug.cgi?id=42022 This patch contains a few independent changes: 1. Move the pass up in the pipeline, so it happens just after loop-vectorization. This is only to keep vectorization passes together in the pipeline at the moment. I don't have evidence of interaction between these yet. 2. Add an -early-cse pass after -vector-combine to clean up redundant ops. This was partly proposed as far back as rL219644 (which is why it's effectively being moved in the old PM code). This is important because the subsequent -instcombine doesn't work as well without EarlyCSE. With the CSE, -instcombine is able to squash shuffles together in 1 of the tests (because those are simple "select" shuffles). 3. Remove the -vector-combine pass that was running after SLP. We may want to do that eventually, but I don't have a test case to support it yet. Differential Revision: https://reviews.llvm.org/D75145	2020-03-04 11:10:49 -05:00
Matt Arsenault	f9047ede58	LICM: Reorder condition checks Check the fast math flag before the more expensive loop check.	2020-03-03 17:15:57 -05:00
Brian Gesiak	aa85b437a9	[Coroutines] Use dbg.declare for frame variables Summary: https://gist.github.com/modocache/ed7c62f6e570766c0f39b35dad675c2f is an example of a small C++ program that uses C++20 coroutines that is difficult to debug, due to the loss of debug info for variables that "spill" across coroutine suspension boundaries. This patch addresses that issue by inserting 'llvm.dbg.declare' intrinsics that point the debugger to the variables' location at an offset to the coroutine frame. With this patch, I confirmed that running the 'frame variable' commands in https://gist.github.com/modocache/ed7c62f6e570766c0f39b35dad675c2f at the specified breakpoints results in the correct values being printed for coroutine frame variables 'i' and 'j' when using an lldb built from trunk, as well as with gdb 8.3 (lldb 9.0.1, however, could not print the values). The added test case also verifies this improved behavior. The existing coro-debug.ll test case is also modified to reflect the locations at which Clang actually places calls to 'dbg.declare', and additional checks are added to ensure this patch works as intended in that example as well. Reviewers: vsk, jmorse, GorNishanov, lewissbaker, wenlei Subscribers: EricWF, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75338	2020-03-03 17:13:46 -05:00
Tyker	c5ec8890c9	[NFC] Try fix ubsan buildbot after `876d133789`	2020-03-03 17:53:02 +01:00
Tyker	876d133789	[AssumeBundles] Add API to fill a map from operand bundles of an llvm.assume. Summary: This patch adds a new way to query operand bundles of an llvm.assume that is much better suited to some users like the Attributor that need to do many queries on the operand bundles of llvm.assume. Some modifications of the IR like replaceAllUsesWith can cause information in the map to be outdated, so this API is more suited to analysis passes and passes that don't make modification that could invalidate the map. Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75020	2020-03-03 14:22:52 +01:00
Florian Hahn	05afa55521	[VPlan] Add getPlan() to VPBlockBase. This patch adds a getPlan accessor to VPBlockBase, which finds the entry block of the plan containing the block and returns the plan set for this block. VPBlockBase contains a VPlan pointer, but it should only be set for the entry block of a plan. This allows moving blocks without updating the pointer for each moved block and in the future we might introduce a parent relationship between plans and blocks, similar to the one in LLVM IR. Reviewers: rengolin, hsaito, fhahn, Ayal, dorit, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D74445	2020-03-03 13:20:13 +00:00
David Green	ec7e4a9a80	[LoopVectorizer] Add reduction tests for inloop reductions. NFC Also adds a force-reduction-intrinsics option for testing, for forcing the generation of reduction intrinsics even when the backend is not requesting them.	2020-03-03 10:54:00 +00:00
Alok Kumar Sharma	6f029dadf6	[DebugInfo] Avoid generating duplicate llvm.dbg.value Summary: This is to avoid generating a duplicate llvm.dbg.value intrinsic if it already exists after the Instruction. Before inserting an llvm.dbg.value instruction, LLVM checks if the same instruction is already present before the instruction to avoid duplicates. Currently it fails to check if it already exists after the instruction. flang generates IR like this. %4 = load i32, i32* %i1_311, align 4, !dbg !42 call void @llvm.dbg.value(metadata i32 %4, metadata !35, metadata !DIExpression()), !dbg !33 When this IR is processed in llvm, it ends up inserting duplicates. %4 = load i32, i32* %i1_311, align 4, !dbg !42 call void @llvm.dbg.value(metadata i32 %4, metadata !35, metadata !DIExpression()), !dbg !33 call void @llvm.dbg.value(metadata i32 %4, metadata !35, metadata !DIExpression()), !dbg !33 We have now updated LdStHasDebugValue to include the cases when instruction is already followed by same dbg.value instruction we intend to insert. Now, Definition and usage of function LdStHasDebugValue are deleted. RemoveRedundantDbgInstrs is called for the cleanup of duplicate dbg.value's Testing: Added unit test for validation check-llvm check-debuginfo (the debug info integration tests) Reviewers: aprantl, probinson, dblaikie, jmorse, jini.susan.george SouraVX, awpandey, dstenb, vsk Reviewed By: aprantl, jmorse, dstenb, vsk Differential Revision: https://reviews.llvm.org/D74030	2020-03-03 09:56:45 +05:30
Juneyoung Lee	9f1f244d3c	[LICM] Allow freeze to hoist/sink out of a loop Summary: This patch allows LICM to hoist/sink freeze instructions out of a loop. Reviewers: reames, fhahn, efriedma Reviewed By: reames Subscribers: jfb, lebedev.ri, hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75400	2020-03-03 12:29:39 +09:00
Sumanth Gundapaneni	9897daa6bf	Update LSR's logic that identifies a post-increment SCEV value. One of the checks has been removed as it seems invalid. The LoopStep size is almost always 32-bit. Differential Revision: https://reviews.llvm.org/D75079	2020-03-02 16:34:18 -06:00
Teresa Johnson	80bf137fa1	Revert "Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"" This reverts commit `80d0a137a5`, and the follow on fix in `873c0d0786`. It is causing test failures after a multi-stage clang bootstrap. See discussion on D73242 and D75201.	2020-03-02 14:02:13 -08:00
Teresa Johnson	873c0d0786	[ThinLTO/LowerTypeTests] Handle unpromoted local type ids Summary: Fixes an issue that cropped up after the changes in D73242 to delay the lowering of type tests. LTT couldn't handle any type tests with non-string type id (which happens for local vtables, which we try to promote during the compile step but cannot always when there are no exported symbols). We can simply treat the same as having an Unknown resolution, which delays their lowering, still allowing such type tests to be used in subsequent optimization (e.g. planned usage during ICP). The final lowering which simply removes these handles them fine. Beefed up an existing ThinLTO test for such unpromoted type ids so that the internal vtable isn't removed before lower type tests, which hides the problem. Reviewers: evgeny777, pcc Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, aganea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75201	2020-03-02 09:31:44 -08:00
Arkady Shlykov	3dcaf296ae	[Loop Peeling] Add possibility to enable peeling on loop nests. Summary: Current peeling implementation bails out in case of loop nests. The patch introduces a field in TargetTransformInfo structure that certain targets can use to relax the constraints if it's profitable (disabled by default). Also additional option is added to enable peeling manually for experimenting and testing purposes. Reviewers: fhahn, lebedev.ri, xbolva00 Reviewed By: xbolva00 Subscribers: RKSimon, xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D70304	2020-03-02 08:37:11 -08:00
David Green	d0d38df091	[LoopVectorizer] Change types of lists from pointers to references. NFC getReductionVars, getInductionVars and getFirstOrderRecurrences were all being returned from LoopVectorizationLegality as pointers to lists. This just changes them to be references, cleaning up the interface slightly. Differential Revision: https://reviews.llvm.org/D75448	2020-03-02 15:04:41 +00:00
Reid Kleckner	1adbe86d87	[WinEH] Fix inttoptr+phi optimization in presence of catchswitch getFirstInsertionPt's return value must be checked for validity before casting it to Instruction*. Don't attempt to insert casts after a phi in a catchswitch block. Fixes PR45033, introduced in D37832. Reviewed By: davidxl, hfinkel Differential Revision: https://reviews.llvm.org/D75381	2020-03-01 07:49:28 -08:00
Juneyoung Lee	5cbb265694	[GVN] Fold equivalent freeze instructions Summary: This patch defines two freeze instructions to have the same value number if they are equivalent. This is allowed because GVN replaces all uses of a duplicated instruction with another. Rewriting only some of the uses is not allowed. e.g) ``` a = freeze(x) b = freeze(x) use(a) use(a) use(b) => use(a) use(b) // This is not allowed! use(b) ``` Reviewers: fhahn, reames, spatel, efriedma Reviewed By: fhahn Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75398	2020-03-01 07:32:05 +09:00
Simon Pilgrim	7e9747b50b	[X86][F16C] Remove cvtph2ps intrinsics and use generic half2float conversion (PR37554) This removes everything but int_x86_avx512_mask_vcvtph2ps_512 which provides the SAE variant, but even this can use the fpext generic if the rounding control is the default. Differential Revision: https://reviews.llvm.org/D75162	2020-02-29 18:57:35 +00:00
Vedant Kumar	dd1ea9de2e	Reland: [Coverage] Revise format to reduce binary size Try again with an up-to-date version of D69471 (`99317124` was a stale revision). --- Revise the coverage mapping format to reduce binary size by: 1. Naming function records and marking them `linkonce_odr`, and 2. Compressing filenames. This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB) and speeds up end-to-end single-threaded report generation by 10%. For reference the compressed name data in llc is 81MB (__llvm_prf_names). Rationale for changes to the format: - With the current format, most coverage function records are discarded. E.g., more than 97% of the records in llc are duplicate placeholders for functions visible-but-not-used in TUs. Placeholders are used to show under-covered functions, but duplicate placeholders waste space. - We reached general consensus about giving (1) a try at the 2017 code coverage BoF [1]. The thinking was that using `linkonce_odr` to merge duplicates is simpler than alternatives like teaching build systems about a coverage-aware database/module/etc on the side. - Revising the format is expensive due to the backwards compatibility requirement, so we might as well compress filenames while we're at it. This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB). See CoverageMappingFormat.rst for the details on what exactly has changed. Fixes PR34533 [2], hopefully. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html [2] https://bugs.llvm.org/show_bug.cgi?id=34533 Differential Revision: https://reviews.llvm.org/D69471	2020-02-28 18:12:04 -08:00
Vedant Kumar	3388871714	Revert "[Coverage] Revise format to reduce binary size" This reverts commit `99317124e1`. This is still busted on Windows: http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/40873 The llvm-cov tests report 'error: Could not load coverage information'.	2020-02-28 18:03:15 -08:00
Vedant Kumar	99317124e1	[Coverage] Revise format to reduce binary size Revise the coverage mapping format to reduce binary size by: 1. Naming function records and marking them `linkonce_odr`, and 2. Compressing filenames. This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB) and speeds up end-to-end single-threaded report generation by 10%. For reference the compressed name data in llc is 81MB (__llvm_prf_names). Rationale for changes to the format: - With the current format, most coverage function records are discarded. E.g., more than 97% of the records in llc are duplicate placeholders for functions visible-but-not-used in TUs. Placeholders are used to show under-covered functions, but duplicate placeholders waste space. - We reached general consensus about giving (1) a try at the 2017 code coverage BoF [1]. The thinking was that using `linkonce_odr` to merge duplicates is simpler than alternatives like teaching build systems about a coverage-aware database/module/etc on the side. - Revising the format is expensive due to the backwards compatibility requirement, so we might as well compress filenames while we're at it. This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB). See CoverageMappingFormat.rst for the details on what exactly has changed. Fixes PR34533 [2], hopefully. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html [2] https://bugs.llvm.org/show_bug.cgi?id=34533 Differential Revision: https://reviews.llvm.org/D69471	2020-02-28 17:33:25 -08:00
Matt Morehouse	30bb737a75	[DFSan] Add __dfsan_cmp_callback. Summary: When -dfsan-event-callbacks is specified, insert a call to __dfsan_cmp_callback on every CMP instruction. Reviewers: vitalybuka, pcc, kcc Reviewed By: kcc Subscribers: hiraditya, #sanitizers, eugenis, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D75389	2020-02-28 15:49:44 -08:00
Matt Morehouse	f668baa459	[DFSan] Add __dfsan_mem_transfer_callback. Summary: When -dfsan-event-callbacks is specified, insert a call to __dfsan_mem_transfer_callback on every memcpy and memmove. Reviewers: vitalybuka, kcc, pcc Reviewed By: kcc Subscribers: eugenis, hiraditya, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D75386	2020-02-28 15:48:25 -08:00
Matt Morehouse	52f889abec	[DFSan] Add __dfsan_load_callback. Summary: When -dfsan-event-callbacks is specified, insert a call to __dfsan_load_callback() on every load. Reviewers: vitalybuka, pcc, kcc Reviewed By: vitalybuka, kcc Subscribers: hiraditya, #sanitizers, llvm-commits, eugenis, kcc Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D75363	2020-02-28 14:26:09 -08:00
Austin Kerbow	4fa63fd452	[VectorCombine] Fix assert on compare extract index Extract index could be a different integral type. Differential Revision: https://reviews.llvm.org/D75327	2020-02-28 10:37:08 -08:00
Valery N Dmitriev	d723ec4f04	[SLP][NFC] Assert that tree entry operands completed when scheduler looks for dependencies. This change adds an assertion to prevent a tricky bug related to the recursive approach of building the vectorization tree. The for loop below takes the number of operands directly from the tree entry rather than from the scalars. If the entry at this moment turns out to be incomplete (i.e. not all operands set), then not all the dependencies will be seen by the scheduler. This can lead to failed scheduling (and thus failed vectorization) for a perfectly vectorizable tree. Here is a code example which is likely to fire the assertion: for (i : VL0->getNumOperands()) { ... TE->setOperand(i, Operands); buildTree_rec(Operands, Depth + 1,...); } The correct way is a two-step process: first set all operands of a tree entry and then recursively process each operand. Differential Revision: https://reviews.llvm.org/D75296	2020-02-28 10:34:48 -08:00
Hiroshi Yamauchi	f16d2bec40	Devirtualize a call on alloca without waiting for post inline cleanup and next DevirtSCCRepeatedPass iteration. This aims to fix a missed inlining case. If there's a virtual call in the callee on an alloca (stack allocated object) in the caller, and the callee is inlined into the caller, the post-inline cleanup would devirtualize the virtual call, but if the next iteration of DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is based on a heuristic to determine whether to reiterate, we may miss inlining the devirtualized call. This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp. This is a second commit after a revert https://reviews.llvm.org/rG4569b3a86f8a4b1b8ad28fe2321f936f9d7ffd43 and a fix https://reviews.llvm.org/rG41e06ae7ba91. Differential Revision: https://reviews.llvm.org/D69591	2020-02-28 09:43:32 -08:00
Hiroshi Yamauchi	41e06ae7ba	[CallPromotionUtils] Add missing promotion legality check to tryPromoteCall. Summary: This fixes the crash that led to the revert of D69591. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75307	2020-02-28 09:35:09 -08:00
Valery N Dmitriev	02e5e47e17	[SLP][NFC] Delete some unreachable code. This patch deletes some dead code out of SLP vectorizer. Couple of changes taken out of D57059 to slightly lighten it plus one more similar case fixed. Differential Revision: https://reviews.llvm.org/D75276	2020-02-28 09:22:51 -08:00
Teresa Johnson	f9ca75f19b	[Inliner] Inlining should honor nobuiltin attributes Summary: Final patch in series to fix inlining between functions with different nobuiltin attributes/options, which was specifically an issue in LTO. See discussion on D61634 for background. The prior patch in this series (D67923) enabled per-Function TLI construction that identified the nobuiltin attributes. Here I have allowed inlining to proceed if the callee's nobuiltins are a subset of the caller's nobuiltins, but not in the reverse case, which should be conservatively correct. This is controlled by a new option, -inline-caller-superset-nobuiltin, which is enabled by default. Reviewers: hfinkel, gchatelet, chandlerc, davidxl Subscribers: arsenm, jvesely, nhaehnle, mehdi_amini, eraman, hiraditya, haicheng, dexonsmith, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74162	2020-02-28 07:34:14 -08:00
Pierre-vh	2809abbd98	[Transform][MemCpyOpt] Add missing DebugLoc to %tmpbitcast Fix for https://bugs.llvm.org/show_bug.cgi?id=37967 Differential Revision: https://reviews.llvm.org/D75173	2020-02-28 15:20:51 +00:00
Juneyoung Lee	cc28a75467	Let EarlyCSE fold equivalent freeze instructions Summary: This patch makes EarlyCSE fold equivalent freeze instructions. Another optimization that I think will be useful is to remove freeze if its operand is used as a branch condition or at llvm.assume: ``` %c = ... br i1 %c, label %A, .. A: %d = freeze %c ; %d can be optimized to %c because %c cannot be poison or undef (or 'br %c' would be UB otherwise) ``` If it makes sense for EarlyCSE to support this as well, I will make a patch for this. Reviewers: spatel, reames, lebedev.ri Reviewed By: lebedev.ri Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75334	2020-02-28 20:35:20 +09:00
Hans Wennborg	d48c981697	SROA: Don't drop atomic load/store alignments (PR45010) SROA will drop the explicit alignment on allocas when the ABI guarantees enough alignment. Because the alignment on new load/store instructions are set based on the alloca's alignment, that means SROA would end up dropping the alignment from atomic loads and stores, which is not allowed (see bug). For those, make sure to always carry over the alignment from the previous instruction. Differential revision: https://reviews.llvm.org/D75266	2020-02-28 10:38:40 +01:00
Jun Ma	43c8307c6c	[Coroutines] CoroElide enhancement Fix regression of CoroElide pass when current function is coroutine. Differential Revision: https://reviews.llvm.org/D71663	2020-02-28 10:41:59 +08:00
Juneyoung Lee	2b5a897651	Revert "[SimpleLoopUnswitch] Fix introduction of UB when hoisted condition may be undef or poison" .. due to performance regression. This patch is reverted until infrastructure for CSE/LICM support for freeze is added. This reverts commit `181628b`	2020-02-28 11:10:46 +09:00
Eli Friedman	b299926453	[IndVars] Fix sort comparator. std::sort will compare an element to itself in some cases. We should not crash if this happens. Differential Revision: https://reviews.llvm.org/D75000	2020-02-27 17:25:18 -08:00
Matt Morehouse	470db54cbd	[DFSan] Add flag to insert event callbacks. Summary: For now just insert the callback for stores, similar to how MSan tracks origins. In the future we may want to add callbacks for loads, memcpy, function calls, CMPs, etc. Reviewers: pcc, vitalybuka, kcc, eugenis Reviewed By: vitalybuka, kcc, eugenis Subscribers: eugenis, hiraditya, #sanitizers, llvm-commits, kcc Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D75312	2020-02-27 17:14:19 -08:00
Matt Morehouse	2a29617b9d	[DFSan] Remove unused IRBuilder. NFC Reviewers: pcc, vitalybuka, kcc Reviewed By: kcc Subscribers: hiraditya, llvm-commits, kcc Tags: #llvm Differential Revision: https://reviews.llvm.org/D75190	2020-02-27 16:27:20 -08:00
Artur Pilipenko	02e3d5c3a2	Fix DSE miscompile when store is clobbered across loop iterations DSE would mistakenly remove store (2): a = calloc(n+1) for (int i = 0; i < n; i++) { store 1, a[i+1] // (1) store 0, a[i] // (2) } The fix is to do PHI translation while looking for clobbering instructions between the store and the calloc. Reviewed By: efriedma, bjope Differential Revision: https://reviews.llvm.org/D68006	2020-02-27 14:43:01 -08:00
Nikita Popov	4ef272ec9c	[InstCombine] DCE instructions earlier When InstCombine initially populates the worklist, it already performs constant folding and DCE. However, as the instructions are initially visited in program order, this DCE can pick up only the last instruction of a dead chain, the rest would only get picked up in the main InstCombine run. To avoid this, we instead perform the DCE in a separate pass over the collected instructions in reverse order, which will allow us to pick up full dead instruction chains. We already need to do this reverse iteration anyway to populate the worklist, so this shouldn't add extra cost. This by itself only fixes a small part of the problem though: The same basic issue also applies during the main InstCombine loop. We generally always want DCE to occur as early as possible, because it will allow one-use folds to happen. Address this by also performing DCE while adding deferred instructions to the main worklist. This drops the number of tests that perform more than 2 InstCombine iterations from ~80 to ~40. There are some spurious test changes due to operand order / icmp toggling. Differential Revision: https://reviews.llvm.org/D75008	2020-02-27 18:45:59 +01:00
Simon Moll	ddd11273d9	Remove BinaryOperator::CreateFNeg Use UnaryOperator::CreateFNeg instead. Summary: With the introduction of the native fneg instruction, the fsub -0.0, %x idiom is obsolete. This patch makes LLVM emit fneg instead of the idiom in all places. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D75130	2020-02-27 09:06:03 -08:00
Pierre-vh	f64e457cb7	[Transforms][Debugify] Ignore PHI nodes when checking for DebugLocs Fix for: https://bugs.llvm.org/show_bug.cgi?id=37964 Differential Revision: https://reviews.llvm.org/D75242	2020-02-27 16:14:11 +00:00
Kirill Bobyrev	4569b3a86f	Revert "Devirtualize a call on alloca without waiting for post inline cleanup and next" This reverts commit `59fb9cde7a`. The patch caused internal miscompilations.	2020-02-27 15:58:39 +01:00
Jay Foad	f41e82c82c	[InstCombine] Fix confusing variable name.	2020-02-27 11:27:49 +00:00
Nemanja Ivanovic	c46b85aaf4	[LoopVectorize] Fix cost for calls to functions that have vector versions A recent commit (https://reviews.llvm.org/rG66c120f02560ef528a60924104ead66f330190f1) changed the cost for calls to functions that have a vector version for some vectorization factor. However, no check is performed for whether the vectorization factor matches the current one being cost modeled. This leads to attempts to widen call instructions to a vectorization factor for which such a function does not exist, which in turn leads to an assertion failure. This patch adds the check for vectorization factor (i.e. not just that the called function has a vector version for some VF, but that it has a vector version for this VF). Differential revision: https://reviews.llvm.org/D74944	2020-02-26 21:39:11 -06:00
Sanjay Patel	25c6544f32	[VectorCombine] add a debug flag to skip all transforms As suggested in D75145 - I'm not sure why, but several passes have this kind of disable/enable flag implemented at the pass manager level. But that means we have to duplicate the flag for both pass managers and add code to check the flag every time the pass appears in the pipeline. We want a debug option to see if this pass is misbehaving regardless of the pass managers, so just add a disablement check at the single point before any transforms run. Differential Revision: https://reviews.llvm.org/D75204	2020-02-26 15:15:42 -05:00
Nikita Popov	00f54050f7	[SimpleLoopUnswitch] Remove unnecessary include; NFC	2020-02-26 20:40:43 +01:00
Nikita Popov	9d9633fb70	[CVP] Simplify cmp of local phi node CVP currently does not simplify cmps with instructions in the same block, because LVI getPredicateAt() currently does not provide much useful information for that case (D69686 would change that, but is stuck.) However, if the instruction is a Phi node, then LVI can compute the result of the predicate by threading it into the predecessor blocks, which allows it to simplify some conditions that nothing else can handle. Relevant code: `6d6a4590c5/llvm/lib/Analysis/LazyValueInfo.cpp (L1904-L1927)` Differential Revision: https://reviews.llvm.org/D72169	2020-02-26 20:36:41 +01:00
Nikita Popov	7da3b5e45c	[InstCombine] Simplify DCE code; NFC As pointed out on D75008, MadeIRChange is already set by eraseInstFromFunction(), which also already does a debug print.	2020-02-26 20:33:00 +01:00
Nikita Popov	56f7de5baa	[InstCombine] Remove trivially empty ranges from end InstCombine removes pairs of start+end intrinsics that don't have anything in between them. Currently this is done by starting at the start intrinsic and scanning forwards. This patch changes it to start at the end intrinsic and scan backwards. The motivation here is as follows: When we process the start intrinsic, we have not yet looked at the following instructions, which may still get folded/removed. If they do, we will only be able to remove the start/end pair on the next iteration. When we process the end intrinsic, all the instructions before it have already been visited, and we don't run into this problem. Differential Revision: https://reviews.llvm.org/D75011	2020-02-26 20:04:11 +01:00
Hiroshi Yamauchi	59fb9cde7a	Devirtualize a call on alloca without waiting for post inline cleanup and next DevirtSCCRepeatedPass iteration. This aims to fix a missed inlining case. If there's a virtual call in the callee on an alloca (stack allocated object) in the caller, and the callee is inlined into the caller, the post-inline cleanup would devirtualize the virtual call, but if the next iteration of DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is based on a heuristic to determine whether to reiterate, we may miss inlining the devirtualized call. This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp.	2020-02-26 09:51:24 -08:00
Reid Kleckner	465dca79b3	Avoid SmallString.h include in MD5.h, NFC Saves 200 includes, which is mostly immaterial.	2020-02-26 09:10:24 -08:00
Hans Wennborg	546918cbb4	Revert "[compiler-rt] Add a critical section when flushing gcov counters" See discussion on PR44792. This reverts commit `02ce9d8ef5`. It also reverts the follow-up commits `8f46269f0` "[profile] Don't dump counters when forking and don't reset when calling exec** functions" `62c7d8402` "[profile] gcov_mutex must be static"	2020-02-26 13:27:44 +01:00
Juneyoung Lee	1cb7ec870d	[SimpleLoopUnswitch] Canonicalize variable names	2020-02-26 15:33:02 +09:00
Juneyoung Lee	181628b52d	[SimpleLoopUnswitch] Fix introduction of UB when hoisted condition may be undef or poison Summary: Loop unswitch hoists branches on loop-invariant conditions. However, if this condition is poison/undef and the branch wasn't originally reachable, loop unswitch introduces UB (since the optimized code will branch on poison/undef and the original one didn't). We fix this problem by freezing the condition to ensure we don't introduce UB. We will now transform the following: while (...) { if (C) { A } else { B } } Into: C' = freeze(C) if (C') { while (...) { A } } else { while (...) { B } } This patch fixes the root cause of the following bug reports (which use the old loop unswitch, but can be reproduced with minor changes in the code and -enable-nontrivial-unswitch): - https://llvm.org/bugs/show_bug.cgi?id=27506 - https://llvm.org/bugs/show_bug.cgi?id=31652 Reviewers: reames, majnemer, chenli, sanjoy, hfinkel Reviewed By: reames Subscribers: hiraditya, jvesely, nhaehnle, filcab, regehr, trentxintong, nlopes, llvm-commits, mzolotukhin Tags: #llvm Differential Revision: https://reviews.llvm.org/D29015	2020-02-26 13:47:33 +09:00
Johannes Doerfert	396b725394	[OpenMP][Opt] Combine `struct ident_t` during deduplication If we deduplicate OpenMP runtime calls we have multiple `ident_t` that represent information like source location. So far, we simply kept the one used by the replacement call. However, as exposed by PR44893, that can cause problems if we have stack allocated `ident_t` objects. While we need to revisit the use of these as well, it is clear that we eventually want to merge source location information in some way. With this patch we add the infrastructure to do so but without doing the actual merge. Instead we pick a global `ident_t` from the replaced calls, if possible, or create a new one with an unknown location instead. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D74925	2020-02-25 14:07:14 -08:00
Akira Hatanaka	430512ed7d	[ObjC][ARC] Don't move a retain call living outside a loop into the loop body We started seeing cases where ARC optimizer would move retain calls into loop bodies, causing imbalance in the number of retain and release calls, after changes were made to delete inert ARC calls since the inert calls that used to block code motion are gone. Fix the bug by setting the CFG hazard flag when visiting a loop header. rdar://problem/56908836	2020-02-25 13:00:10 -08:00
Roman Lebedev	400ceda425	[SCEV][IndVars] Always provide insertion point to the SCEVExpander::isHighCostExpansion() Summary: This addresses the `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` regression from D73728 Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73777	2020-02-25 23:05:59 +03:00
Roman Lebedev	44edc6fd2c	[SCEV] rewriteLoopExitValues(): even if have hard uses, still rewrite if cheap (PR44668) Summary: Replacing uses of IV outside of the loop is likely generally useful, but `rewriteLoopExitValues()` is cautious, and if it isn't told to always perform the replacement, and there are hard uses of IV in loop, it doesn't replace. In PR44668 (https://bugs.llvm.org/show_bug.cgi?id=44668), that prevents `-indvars` from replacing uses of induction variable after the loop, which might be one of the optimization failures preventing that code from being vectorized. Instead, now that the cost model is fixed, I believe we should be a little bit more optimistic, and also perform replacement if we believe it is within our budget. Fixes PR44668 (https://bugs.llvm.org/show_bug.cgi?id=44668). Reviewers: reames, mkazantsev, asbirlea, fhahn, skatkov Reviewed By: mkazantsev Subscribers: nikic, hiraditya, zzheng, javed.absar, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73501	2020-02-25 23:05:59 +03:00
Roman Lebedev	b99c91a087	[NFC][SCEV] Piping to pass new SCEVCheapExpansionBudget option into SCEVExpander::isHighCostExpansionHelper() Summary: In future patches `SCEVExpander::isHighCostExpansionHelper()` will respect the budget allocated by performing TTI cost modelling. This is a fully NFC patch to make things reviewable. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, zzheng, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73705	2020-02-25 23:05:57 +03:00
Roman Lebedev	0789f28048	[NFC][SCEV] Piping to pass TTI into SCEVExpander::isHighCostExpansionHelper() Summary: Future patches will make use of TTI to perform cost-model-driven `SCEVExpander::isHighCostExpansionHelper()`. This is a fully NFC patch to make things reviewable. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, zzheng, javed.absar, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73704	2020-02-25 23:05:56 +03:00
Philip Reames	14845b2c45	Revert "[LICM] Support hosting of dynamic allocas out of loops" This reverts commit `8d22100f66`. There was a functional regression reported (https://bugs.llvm.org/show_bug.cgi?id=44996). I'm not actually sure the patch is wrong, but I don't have time to investigate currently, and this line of work isn't something I'm likely to get back to quickly.	2020-02-25 09:05:31 -08:00
Roman Lebedev	2855c8fed9	[InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): fix miscompile (PR44802) Much like with reassociateShiftAmtsOfTwoSameDirectionShifts(), as input, we have the following pattern: icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0 We want to rewrite that as: icmp eq/ne (and (x shift (Q+K)), y), 0 iff (Q+K) u< bitwidth(x) While we know that originally (Q+K) would not overflow (because 2 * (N-1) u<= iN -1), we may have looked past extensions of shift amounts, so it may now overflow in a smaller bitwidth. To ensure that does not happen, we need to ensure that the total maximal shift amount is still representable in that smaller bitwidth. If the overflow did happen, the (Q+K) u< bitwidth(x) check would be bogus. https://bugs.llvm.org/show_bug.cgi?id=44802	2020-02-25 18:23:58 +03:00
Roman Lebedev	781d077afb	[InstCombine] reassociateShiftAmtsOfTwoSameDirectionShifts(): fix miscompile (PR44802) As input, we have the following pattern: Sh0 (Sh1 X, Q), K We want to rewrite that as: Sh x, (Q+K) iff (Q+K) u< bitwidth(x) While we know that originally (Q+K) would not overflow (because 2 * (N-1) u<= iN -1), we may have looked past extensions of shift amounts, so it may now overflow in a smaller bitwidth. To ensure that does not happen, we need to ensure that the total maximal shift amount is still representable in that smaller bitwidth. If the overflow did happen, the (Q+K) u< bitwidth(x) check would be bogus. https://bugs.llvm.org/show_bug.cgi?id=44802	2020-02-25 18:23:51 +03:00
Sanjay Patel	10ea01d80d	[VectorCombine] make cost calc consistent for binops and cmps Code duplication (subsequently removed by refactoring) allowed a logic discrepancy to creep in here. We were being conservative about creating a vector binop -- but not a vector cmp -- in the case where a vector op has the same estimated cost as the scalar op. We want to be more aggressive here because that can allow other combines based on reduced instruction count/uses. We can reverse the transform in DAGCombiner (potentially with a more accurate cost model) if this causes regressions. AFAIK, this does not conflict with InstCombine. We have a scalarize transform there, but it relies on finding a constant operand or a matching insertelement, so that means it eliminates an extractelement from the sequence (so we won't have 2 extracts by the time we get here if InstCombine succeeds). Differential Revision: https://reviews.llvm.org/D75062	2020-02-25 08:41:59 -05:00
Florian Hahn	b8d638d337	[DSE,MSSA] Do not attempt to remove un-removable memdefs. We have to skip MemoryDefs that cannot be removed. This fixes a crash in the newly added test case and fixes a wrong case in memset-and-memcpy.ll.	2020-02-25 13:31:46 +00:00
Hideto Ueno	2c0edbf19c	[Attributor] Use AssumptionCache in AANonNullFloating::initialize	2020-02-25 13:00:03 +09:00
Ayke van Laethem	2a7a989c3e	[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder. Differential Revision: https://reviews.llvm.org/D51642	2020-02-24 20:15:51 +01:00
Calixte Denizet	8f46269f0c	[profile] Don't dump counters when forking and don't reset when calling exec functions Summary: There is no need to write out gcdas when forking because we can just reset the counters in the parent process. Let's say a counter is N before the fork, then fork and this counter is set to 0 in the child process. In the parent process, the counter is incremented by P and in the child process it's incremented by C. When dump is run at exit, the parent process will dump N+P for the given counter and the child process will dump 0+C, so when the gcdas are merged the resulting counter will be N+P+C. About exec functions, since the current process is replaced by another one there is no need to reset the counters but just write out the gcdas since the counters are definitely lost. To avoid having lists in a bad state, we just lock them during the fork and the flush (if called explicitly) and lock them when an element is added. Reviewers: marco-c Reviewed By: marco-c Subscribers: hiraditya, cfe-commits, #sanitizers, llvm-commits, sylvestre.ledru Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D74953	2020-02-24 10:38:33 +01:00
Florian Hahn	7769030b93	Recommit "[PatternMatch] Match XOR variant of unsigned-add overflow check." This version fixes a buildbot failure caused by picking the wrong insert point for XORs. We cannot pick the XOR binary operator as insert point, as it is not guaranteed that both input operands for the overflow intrinsic are defined before it. This reverts the revert commit `c7fc0e5da6`.	2020-02-23 18:33:18 +00:00
Florian Hahn	af69d5e10e	[DSE] Track overlapping stores. Add a map from BasicBlocks to overlap intervals. For partial writes, we can keep track of those in IOLs. We only add candidates that are valid for eliminations. Reviewers: dmgreen, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D73757	2020-02-23 15:44:40 +00:00
Tyker	837d8129e9	[NFC] Remove some GCC warning from `c9e93c84f6`	2020-02-22 14:11:31 +01:00
Johannes Doerfert	9708279c72	[Attributor][FIX] Undo `16188f9` until SCC iterator bug is fixed The buildbot http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win shows some strange SCC iterator bug since `16188f9` which we need to investigate. This patch should remove the part of `16188f9` that could have exposed the problem.	2020-02-21 14:20:42 -08:00
Whitney Tsang	0a70edd696	[CloneFunction] Update loop headers after cloning all blocks in loop. Summary: Blocks in a loop can be in any order as long as the loop header is the first block in Blocks. With some order of Blocks, cloneLoopWithPreheader would trigger the assertion in addBasicBlockToLoop. Example: define void @test(i64 %N) { preheader.i: br label %header.i header.i: %i = phi i64 [ 0, %preheader.i ], [ %inc.i, %latch.i ] br label %header.j header.j: %j = phi i64 [ 0, %header.i ], [ %inc.j, %latch.j ] br label %header.k header.k: %k = phi i64 [ 0, %header.j ], [ %inc.k, %latch.k ] call void @baz(i64 %i, i64 %j, i64 %k) br label %latch.k latch.k: %inc.k = add nsw i64 %k, 1 %cmp.k = icmp slt i64 %inc.k, %N br i1 %cmp.k, label %header.k, label %latch.j latch.j: %inc.j = add nsw i64 %j, 1 %cmp.j = icmp slt i64 %inc.j, %N br i1 %cmp.j, label %header.j, label %latch.i latch.i: %inc.i = add nsw i64 %i, 1 %cmp.i = icmp slt i64 %inc.i, %N br i1 %cmp.i, label %header.i, label %exit.i exit.i: ret void } declare void @baz(i64, i64, i64) If the blocks of loop-i are in the order: header.i, latch.k, header.k, header.j, latch.j, latch.i, then cloneLoopWithPreheader would trigger the assertion in addBasicBlockToLoop assert(contains(SameHeader) && getHeader() == SameHeader->getHeader() && "Incorrect LI specified for this loop!"); As latch.k is in both loop-j and loop-k, it would be set as the header of both loops after adding latch.k. If we update loop headers during cloning blocks, then after adding header.k, the header of loop-k would be updated with header.k, while the header of loop-j stays as latch.k. When adding header.j, SameHeader is loop-k, SameHeader->getHeader() is header.k, but getHeader() is latch.k, which triggers the assertion. Reviewer: jdoerfert, Meinersbur, fhahn, kbarton, hfinkel, bmahjour, etiotto Reviewed By: Meinersbur Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D74382	2020-02-21 22:18:24 +00:00
Sanjay Patel	e9c79a7aef	[VectorCombine] refactor to reduce duplicated code; NFC This should be the last step in the current cleanup. Follow-ups should resolve the TODO about cost calc and enable the more general case where we extract different elements.	2020-02-21 15:56:00 -05:00
Sanjay Patel	34e3485560	[VectorCombine] refactor cost calcs to reduce duplication; NFC More cleanup is possible now, but we probably need to resolve the TODO about the existing difference between compares and binops.	2020-02-21 15:12:00 -05:00
Krzysztof Parzyszek	d2b7c09e79	[Hexagon] Simplify intrinsic (vandvrt (vandqrt q b) m) -> q if possible When each byte in b&m is non-zero, this conversion Q->V->Q is a no-op.	2020-02-21 13:56:04 -06:00
Nikita Popov	b178555318	[InstCombine] Improve simplify demanded bits worklist management This fixes a small mistake from D72944: The worklist add should happen before assigning the new operand, not after. In case an actual replacement happens, the old operand needs to be added for DCE. If no actual replacement happens, then old/new are the same, so it doesn't matter. This drops one iteration from the annotated test case.	2020-02-21 18:51:41 +01:00
Nikita Popov	656dff9af4	[InstCombine] Use replaceOperand() in more places Followup to D73919 with another batch of replacements of setOperand() -> replaceOperand(), to make sure the old operand gets DCEd right away. Differential Revision: https://reviews.llvm.org/D74932	2020-02-21 18:41:16 +01:00
Florian Hahn	98f5268a72	[VectorUtils] Move ToVectorTy to VectorUtils.h (NFC). ToVectorTy is defined and used in multiple places. Hoist it to VectorUtils.h to avoid duplication and improve re-usability. Reviewers: rengolin, hsaito, Ayal, gilr, fpetrogalli Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D74959	2020-02-21 17:31:24 +00:00
Nikita Popov	a8db806d52	[SimplifyLibCalls][IRBuilder] Accept any IRBuilder in SimplifyLibCalls This changes the SimplifyLibCalls utility to accept an IRBuilderBase, which allows us to pass through the IRBuilder used by InstCombine. This will ensure that new instructions get added to the worklist. The annotated test-case drops from 4 to 2 InstCombine iterations thanks to this. To achieve this, I'm adding an IRBuilderBase::OperandBundlesGuard, which is basically the same as the existing InsertPointGuard and FastMathFlagsGuard, but for operand bundles. Also add a setDefaultOperandBundles() method so these can be set outside the constructor. Differential Revision: https://reviews.llvm.org/D74792	2020-02-21 18:26:05 +01:00
Sanjay Patel	fc4455891c	[VectorCombine] refactor matching code to reduce duplication; NFC cmp/binop were already diverging even though they are largely the same logic.	2020-02-21 12:06:51 -05:00
Florian Hahn	134bab7cd5	[DSE,MSSA] Add debug counter. Can be used like -debug-counter=dse-memoryssa-skip=10,dse-memoryssa-counter-count=20 Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72147	2020-02-21 17:04:37 +00:00
Bill Wendling	2fe457690d	Filter callbr insts from critical edge splitting Similarly to how splitting predecessors with an indirectbr isn't handled in the generic way, we also shouldn't split callbrs, for similar reasons.	2020-02-20 16:24:42 -08:00
Florian Hahn	99809f98d7	[SCCP] Do not mark unknown loads as overdefined. For tracked globals that are unknown after solving, we expect all non-store uses to be replaced. This is a follow-up to `f8045b250d`, which removed forcedconstant. We should not mark unknown loads as overdefined, as they either load from an unknown pointer or an undef global. Restore the original logic for loads.	2020-02-20 22:48:58 +01:00
dfukalov	dbfc682e2b	SpeculativeExecution: fixed ignoring free execution Summary: After updating cost model in AMDGPU target (`47a5c36b37`) the pass started to ignore some BBs since they got all instructions estimated as free. Reviewers: arsenm, chandlerc, nhaehnle Reviewed By: nhaehnle Subscribers: jvesely, wdng, nhaehnle, tpr, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74825	2020-02-20 14:45:02 +03:00
Johannes Doerfert	d95cb56649	[Attributor] Make sure abstract attributes are properly initialized	2020-02-20 02:46:40 -06:00
Johannes Doerfert	6185fb13d6	[Attributor][NFC] Refactor interface	2020-02-20 02:46:40 -06:00
Johannes Doerfert	8e76fec0ae	[Attributor][NFC] Improve the debug output & add a TODO	2020-02-19 23:46:08 -06:00
Johannes Doerfert	f8ad735729	[Attributor] Use existing `returned` information better We can look through calls with `returned` argument attributes when we collect subsuming positions. This allows us to get existing attributes from more places.	2020-02-19 23:46:07 -06:00
Johannes Doerfert	a801ee869d	[Attributor][FIX] Avoid setting wrong load/store alignments	2020-02-19 23:46:07 -06:00
Johannes Doerfert	e1eed6c5b9	[Attributor] Generalize `getAssumedConstantInt` interface We are often interested in an assumed constant and sometimes it has to be an integer constant. Before we only looked for the latter, now we can ask for either.	2020-02-19 22:33:51 -06:00
Johannes Doerfert	16188f9d70	[Attributor][FIX] Do not create new call edges we cannot handle If we propagate function pointers across function boundaries we can create new call edges. These need to be represented in the CG if we run as a CGSCC pass. In the new pass manager that is currently not handled by the CallGraphUpdater so we need to prevent the situation for now.	2020-02-19 22:33:51 -06:00
Johannes Doerfert	1e99fc9d58	[Attributor] Add initial AAIsDead for arguments We usually ask for the liveness of an argument anyway, so we ended up creating the attribute lazily. However, that is not always the case, and even if it is, we should go the eager route here. Various tests show how this can improve the outcome. One test exposed a problem with type mismatches between argument and call site argument, a fix is included. For liveness various more tests were added as well.	2020-02-19 21:39:45 -06:00
Johannes Doerfert	c6ac717aa7	[Attributor] Allow multiple uses of a casted function pointer If a function pointer is casted into a different type the resulting expression can be a constant. If so, it can be used multiple times which cannot be handled by the AbstractCallSite constructor alone. Instead, we follow the cast expression uses now explicitly during the call site traversal.	2020-02-19 20:43:38 -06:00
Michael Kruse	e4d20ec8ad	[IndVarSimplify] Fix assert/release build difference. In builds with assertions enabled (!NDEBUG), IndVarSimplify does an additional query to ScalarEvolution which may change future SCEV queries since it fills the internal cache differently. The result is actually only used with the -verify-indvars command line option. We fix the issue by only calling SE->getBackedgeTakenCount(L) if -verify-indvars is enabled such that only -verify-indvars shows the behavior, but not debug builds themselves. Also add a remark to the description of -verify-indvars about this behavior. Fixes llvm.org/PR44815 Differential Revision: https://reviews.llvm.org/D74810	2020-02-19 14:36:22 -06:00
Florian Hahn	c7fc0e5da6	Revert "[PatternMatch] Match XOR variant of unsigned-add overflow check." This reverts commit `e01a3d49c2`. and commit `a6a585b803`. This causes a failure on GreenDragon: http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/9597	2020-02-19 19:37:08 +01:00
Florian Hahn	e01a3d49c2	[PatternMatch] Match XOR variant of unsigned-add overflow check. Instcombine folds (a + b <u a) to (a ^ -1 <u b) and that does not match the expected pattern in CodeGenPrepare via UAddWithOverflow. This causes a regression over Clang 7 on both X86 and AArch64: https://gcc.godbolt.org/z/juhXYV This patch extends UAddWithOverflow to also catch the XOR case, if the XOR is only used in the ICMP. This covers just a single case, but I'd like to make sure I am not missing anything before tackling the other cases. Reviewers: nikic, RKSimon, lebedev.ri, spatel Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D74228	2020-02-19 15:25:18 +01:00
Brian Gesiak	5a187d8ed1	[Coroutines][4/6] New pass manager: coro-cleanup Summary: Depends on https://reviews.llvm.org/D71900. The fourth in a series of patches that ports the LLVM coroutines passes to the new pass manager infrastructure. This patch implements 'coro-cleanup'. No existing regression tests check the behavior of coro-cleanup on its own, so this patch adds one. (A test named 'coro-cleanup.ll' exists, but it relies on the entire coroutines pipeline being run. It's updated to test the new pass manager in the 5th patch of this series.) Reviewers: GorNishanov, lewissbaker, chandlerc, junparser, deadalnix, wenlei Reviewed By: wenlei Subscribers: wenlei, EricWF, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71901	2020-02-19 00:30:27 -05:00
Brian Gesiak	2365238b9d	Re-land new pass manager coro-split and coro-elide This re-applies patches https://reviews.llvm.org/D71899 and https://reviews.llvm.org/D71900, which were reverted in https://reviews.llvm.org/rG11053a1cc61 and https://reviews.llvm.org/rGe999aa38d16. The underlying problem that caused two buildbots to fail with these patches is explained in https://reviews.llvm.org/rG26f356350bd -- older compilers disagree with the order in which the left- and right-hand side of an assignment in LazyCallGraph ought to be evaluated, which caused an assertion in SmallVector::operator[] to fire when the test suite was run.	2020-02-19 00:11:23 -05:00
Reid Kleckner	0c2b09a9b6	[IR] Lazily number instructions for local dominance queries Essentially, fold OrderedBasicBlock into BasicBlock, and make it auto-invalidate the instruction ordering when new instructions are added. Notably, we don't need to invalidate it when removing instructions, which is helpful when a pass mostly deletes dead instructions rather than transforming them. The downside is that Instruction grows from 56 bytes to 64 bytes. The resulting LLVM code is substantially simpler and automatically handles invalidation, which makes me think that this is the right speed and size tradeoff. The important change is in SymbolTableTraitsImpl.h, where the numbering is invalidated. Everything else should be straightforward. We probably want to implement a fancier re-numbering scheme so that local updates don't invalidate the ordering, but I plan for that to be future work, maybe for someone else. Reviewed By: lattner, vsk, fhahn, dexonsmith Differential Revision: https://reviews.llvm.org/D51664	2020-02-18 14:44:24 -08:00
Fangrui Song	13a97305ba	[JumpThreading] Skip unconditional PredBB when threading jumps through two basic blocks Fixes https://bugs.llvm.org/show_bug.cgi?id=44922 (caused by `4698bf145d`) ThreadThroughTwoBasicBlocks assumes PredBBBranch is conditional. The following code can segfault. AddPHINodeEntriesForMappedBlock(PredBBBranch->getSuccessor(1), PredBB, NewBB, ValueMapping); We can also allow unconditional PredBB, but the produced code is not better. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D74747	2020-02-18 11:01:46 -08:00
Tyker	c9e93c84f6	Add Query API for llvm.assume holding attributes Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72885	2020-02-18 19:42:07 +01:00
Huihui Zhang	8ee0e1dc02	[NFC] Silence compiler warning [-Wmissing-braces].	2020-02-18 10:37:12 -08:00
Florian Hahn	e32522ca17	[SLPVectorizer] Do not assume extractelement idx is a ConstantInt. The index of an ExtractElementInst is not guaranteed to be a ConstantInt. It can be any integer value. Check explicitly for ConstantInts. The new test cases illustrate scenarios where we crash without this patch. I've also added another test case to check the matching of extractelement vector ops works. Reviewers: RKSimon, ABataev, dtemirbulatov, vporpo Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D74758	2020-02-18 18:16:06 +01:00
Nikita Popov	ec6c623ff9	[SimplifyLibCalls] Accept IRBuilderBase; NFC	2020-02-18 17:59:07 +01:00
Nikita Popov	28ffe38bba	[LoopUtils] Accept IRBuilderBase; NFC	2020-02-18 17:58:46 +01:00
Nikita Popov	ed6d30b517	[BuildLibCalls] Accept IRBuilderBase; NFC Accept IRBuilderBase instead of IRBuilder<>. Remove dependency on IRBuilder from header.	2020-02-18 17:58:16 +01:00
Nikita Popov	1ab37fad61	[InstCombine] Fix worklist management when simplifying demanded bits When simplifying demanded bits, we currently only report the instruction on which SimplifyDemandedBits was called as changed. However, this is a recursive call, and the actually modified instruction will usually be further up the chain. Additionally, all the intermediate instructions should also be revisited, as additional combines may be possible after the demanded bits simplification. We fix this by explicitly adding them back to the worklist. Differential Revision: https://reviews.llvm.org/D72944	2020-02-18 17:55:40 +01:00
Nikita Popov	c9540fe59b	[InstCombine] Fix multi-use handling in cttz transform The select-of-cttz transform can currently duplicate cttz intrinsics and zext/trunc ops. The cause is that it unnecessarily duplicates the intrinsic and the zext/trunc when setting the "undef_on_zero" flag to false. However, it's always legal to set the flag from true to false, so we can make this replacement even if there are extra users. Differential Revision: https://reviews.llvm.org/D74685	2020-02-18 17:55:00 +01:00
Nikita Popov	9adedd146d	[InstCombine] Relax preconditions for ashr+and+icmp fold (PR44754) Fix for https://bugs.llvm.org/show_bug.cgi?id=44754. We already have a fold that converts icmp (and (ashr X, C3), C2), C1 into icmp (and C2'), C1', but it imposed overly strict requirements on the transform. Relax this by checking that both C2 and C1 don't shift out bits (in a signed sense) when forming the new constants. Alive proofs (https://rise4fun.com/Alive/PTz0): Name: ashr_legal Pre: ((C2 << C3) >> C3) == C2 && ((C1 << C3) >> C3) == C1 %a = ashr i16 %x, C3 %b = and i16 %a, C2 %c = icmp i16 %b, C1 => %d = and i16 %x, C2 << C3 %c = icmp i16 %d, C1 << C3 Name: ashr_shiftout_eq Pre: ((C2 << C3) >> C3) == C2 && ((C1 << C3) >> C3) != C1 %a = ashr i16 %x, C3 %b = and i16 %a, C2 %c = icmp eq i16 %b, C1 => %c = false Note that >> corresponds to ashr here. The case of an equality comparison has some special handling in this transform, because it will fold to a true/false result if the condition on the comparison constant is violated. Differential Revision: https://reviews.llvm.org/D74294	2020-02-18 17:49:46 +01:00
Florian Hahn	9063022573	[InstCombine] Avoid nested Create calls, to guarantee order. The original code allowed creating the != checks in unpredictable order, causing http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/34014 to fail.	2020-02-18 09:44:11 +01:00
Florian Hahn	6c85e92bcf	[InstCombine] Simplify a umul overflow check to a != 0 && b != 0. This patch adds a simplification if an OR weakens the overflow condition for umul.with.overflow by treating any non-zero result as overflow. In that case, we overflow if both umul.with.overflow operands are != 0, as in that case the result can only be 0, iff the multiplication overflows. Code like this is generated by code using __builtin_mul_overflow with negative integer constants, e.g. bool test(unsigned long long v, unsigned long long *res) { return __builtin_mul_overflow(v, -4775807LL, res); } ``` ---------------------------------------- Name: D74141 %res = umul_overflow {i8, i1} %a, %b %mul = extractvalue {i8, i1} %res, 0 %overflow = extractvalue {i8, i1} %res, 1 %cmp = icmp ne %mul, 0 %ret = or i1 %overflow, %cmp ret i1 %ret => %t0 = icmp ne i8 %a, 0 %t1 = icmp ne i8 %b, 0 %ret = and i1 %t0, %t1 ret i1 %ret %res = umul_overflow {i8, i1} %a, %b %mul = extractvalue {i8, i1} %res, 0 %cmp = icmp ne %mul, 0 %overflow = extractvalue {i8, i1} %res, 1 Done: 1 Optimization is correct! ``` Reviewers: nikic, lebedev.ri, spatel, Bigcheese, dexonsmith, aemerson Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D74141	2020-02-18 09:11:55 +01:00
Brian Gesiak	11053a1cc6	Revert new pass manager coro-split and coro-elide This reverts https://reviews.llvm.org/rG7125d66f9969605d886b5286780101a45b5bed67 and https://reviews.llvm.org/rG00fec8004aca6588d8d695a2c3827c3754c380a0 due to buildbot failures: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/34004	2020-02-17 23:55:10 -05:00
Brian Gesiak	00fec8004a	[Coroutines][3/6] New pass manager: coro-elide Summary: Depends on https://reviews.llvm.org/D71899. The third in a series of patches that ports the LLVM coroutines passes to the new pass manager infrastructure. This patch implements 'coro-elide'. The new pass manager infrastructure does not implicitly repeat CGSCC pass pipelines when a function is devirtualized, and so the tests for the new pass manager that rely on that behavior now explicitly specify `repeat<2>`. Reviewers: GorNishanov, lewissbaker, chandlerc, jdoerfert, junparser, deadalnix, wenlei Reviewed By: wenlei Subscribers: wenlei, EricWF, Prazek, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71900	2020-02-17 23:41:57 -05:00
Brian Gesiak	7125d66f99	[Coroutines][2/6] New pass manager: coro-split Summary: This patch has four dependencies: 1. The first in this series of patches that implement coroutine passes in the new pass manager: https://reviews.llvm.org/D71898. 2. A patch that introduces an API for CGSCC passes to add new reference edges to a `LazyCallGraph`, `updateCGAndAnalysisManagerForCGSCCPass`: https://reviews.llvm.org/D72025. 3. A patch that introduces a `CallGraphUpdater` helper class that is capable of mutating internal `LazyCallGraph` state in order to insert new function nodes into a specific SCC: https://reviews.llvm.org/D70927. 4. And finally, a small edge case fix for updating `LazyCallGraph` that patch 3 above happens to run into: https://reviews.llvm.org/D72226. This is the second in a series of patches that ports the LLVM coroutines passes to the new pass manager infrastructure. This patch implements 'coro-split'. Some notes: * Using the new CGSCC pass manager resulted in IR being printed in the reverse order in some tests. To prevent FileCheck checks from failing due to these reversed orders, this patch splits up test files that test multiple different coroutine functions: specifically coro-alloc-with-param.ll, coro-split-eh.ll, and coro-eh-aware-edge-split.ll. * CoroSplit.cpp contained 2 overloads of `splitCoroutine`, one of which dispatched to the other based on the coroutine ABI being used (C++20 switch-based versus Swift returned-continuation-based). I found this confusing, especially with the additional branching based on `CallGraph` vs. `LazyCallGraph`, so I removed the ABI-checking overload of `splitCoroutine`. Reviewers: GorNishanov, lewissbaker, chandlerc, jdoerfert, junparser, deadalnix, wenlei Reviewed By: wenlei Subscribers: wenlei, qcolombet, EricWF, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71899	2020-02-17 23:35:27 -05:00
Vedant Kumar	c74026daf3	[HotColdSplit] Mark entire function cold when entry block is cold rdar://58855712	2020-02-17 15:57:50 -08:00
Nicolai Hähnle	58297e4d8f	LowerMatrixIntrinsics: Avoid use of deprecated CreateCall methods Reviewers: t.p.northover Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74675	2020-02-18 00:24:09 +01:00
Tim Northover	464d4cf7e6	Coroutines: avoid use of deprecated CreateLoad and CreateCall methods Summary: Patch originally by Tim Northover Reviewers: t.p.northover Subscribers: EricWF, hiraditya, modocache, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74674	2020-02-18 00:24:09 +01:00
Brian Gesiak	e9849d5195	[Coroutines][1/6] New pass manager: coro-early Summary: The first in a series of patches that ports the LLVM coroutines passes to the new pass manager infrastructure. This patch implements 'coro-early'. NB: All coroutines passes begin by checking that coroutine intrinsics are declared within the LLVM IR module they're operating on. To do so, they call `coro::declaresIntrinsics`. The next 3 patches in this series, which add new pass manager implementations of the 'coro-split', 'coro-elide', and 'coro-cleanup' passes, use a similar pattern as the one used here: a static function is shared across both old and new passes to detect if relevant coroutine intrinsics are declared. To make this pattern easier to read, this patch adds `const` keywords to the parameters of `coro::declaresIntrinsics`. Reviewers: GorNishanov, lewissbaker, junparser, chandlerc, deadalnix, wenlei Reviewed By: wenlei Subscribers: ychen, wenlei, EricWF, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71898	2020-02-17 13:27:48 -05:00
Nikita Popov	3eaa53e805	Reapply "[IRBuilder] Virtualize IRBuilder" Relative to the original commit, this fixes some warnings, and is based on the deletion of the IRBuilder copy constructor in D74693. The automatic copy constructor would no longer be safe. ----- Related llvm-dev thread: http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html This patch moves the IRBuilder from templating over the constant folder and inserter towards making both of these virtual. There are a couple of motivations for this: 1. It's not possible to share code between use-sites that use different IRBuilder folders/inserters (short of templating the code and moving it into headers). 2. Methods currently defined on IRBuilderBase (which is not templated) do not use the custom inserter, resulting in subtle bugs (e.g. incorrect InstCombine worklist management). It would be possible to move those into the templated IRBuilder, but... 3. The vast majority of the IRBuilder implementation has to live in the header, because it depends on the template arguments. 4. We have many unnecessary dependencies on IRBuilder.h, because it is not easy to forward-declare. (Significant parts of the backend depend on it via TargetLowering.h, for example.) This patch addresses the issue by making the following changes: * IRBuilderDefaultInserter::InsertHelper becomes virtual. IRBuilderBase accepts a reference to it. * IRBuilderFolder is introduced as a virtual base class. It is implemented by ConstantFolder (default), NoFolder and TargetFolder. IRBuilderBase has a reference to this as well. * All the logic is moved from IRBuilder to IRBuilderBase. This means that methods can in the future replace their IRBuilder<> & uses (or other specific IRBuilder types) with IRBuilderBase & and thus be usable with different IRBuilders. * The IRBuilder class is now a thin wrapper around IRBuilderBase. Essentially it only stores the folder and inserter and takes care of constructing the base builder. 
What this patch doesn't do, but should be simple followups after this change: * Fixing use of the inserter for creation methods originally defined on IRBuilderBase. * Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful. * Moving code from the IRBuilder header to the source file. From the user perspective, these changes should be mostly transparent: The only thing that consumers using a custom inserter may need to do is inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper as public. Differential Revision: https://reviews.llvm.org/D73835	2020-02-17 19:04:11 +01:00
Nikita Popov	80397d2d12	[IRBuilder] Delete copy constructor D73835 will make IRBuilder no longer trivially copyable. This patch deletes the copy constructor in advance, to separate out the breakage. Currently, the IRBuilder copy constructor is usually used by accident, not by intention. In rG7c362b25d7a9 I've fixed a number of cases where functions accepted IRBuilder rather than IRBuilder &, thus performing an unnecessary copy. In rG5f7b92b1b4d6 I've fixed cases where an IRBuilder was copied, while an InsertPointGuard should have been used instead. The only non-trivial use of the copy constructor is the getIRBForDbgInsertion() helper, for which I separated construction and setting of the insertion point in this patch. Differential Revision: https://reviews.llvm.org/D74693	2020-02-17 18:14:48 +01:00
Benjamin Kramer	564a9de28e	Hide implementation details. NFC.	2020-02-17 17:55:23 +01:00
Benjamin Kramer	5fc5c7db38	Strength reduce vectors into arrays. NFCI.	2020-02-17 15:37:35 +01:00
Nikita Popov	5f7b92b1b4	[IRBuilder] Prefer InsertPointGuard over full copy; NFC Don't copy the IRBuilder when an InsertPointGuard would also do.	2020-02-16 18:02:29 +01:00
Nikita Popov	7c362b25d7	[IRBuilder] Fix unnecessary IRBuilder copies; NFC Fix a few cases where an IRBuilder is passed to a helper function by value, while a by reference pass was intended.	2020-02-16 17:57:18 +01:00
Nikita Popov	af480e8c63	Revert "[IRBuilder] Virtualize IRBuilder" This reverts commit `0765d3824d`. This reverts commit `1b04866a3d`. Relevant looking crashes observed on: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win	2020-02-16 17:01:10 +01:00
Sanjay Patel	62dd44d76d	[VectorCombine] fix cost calc for extract-cmp getOperationCost() is not the cost we wanted; that's not the throughput value that the rest of the calculation uses. We may want to switch everything in this code to use the getInstructionThroughput() wrapper to avoid these kinds of problems, but I'll look at that as a follow-up because that can create other logical diffs via using optional parameters (we'd need to speculatively create the vector instruction to make a fair(er) comparison).	2020-02-16 10:40:28 -05:00
Nikita Popov	893c630fbe	[InstCombine] Create new log2 intrinsic; NFCI Rather than mixing creation of new instructions and in-place modification here, create a new log2 intrinsic. This should be NFC apart from worklist order changes.	2020-02-16 15:52:09 +01:00
Nikita Popov	1b04866a3d	[IRBuilder] Try to fix warnings Try to fix -Wnon-virtual-dtor warnings that cause build failure on clang-pcc64le-rhel.	2020-02-16 15:32:11 +01:00
Nikita Popov	0765d3824d	[IRBuilder] Virtualize IRBuilder Related llvm-dev thread: http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html This patch moves the IRBuilder from templating over the constant folder and inserter towards making both of these virtual. There are a couple of motivations for this: 1. It's not possible to share code between use-sites that use different IRBuilder folders/inserters (short of templating the code and moving it into headers). 2. Methods currently defined on IRBuilderBase (which is not templated) do not use the custom inserter, resulting in subtle bugs (e.g. incorrect InstCombine worklist management). It would be possible to move those into the templated IRBuilder, but... 3. The vast majority of the IRBuilder implementation has to live in the header, because it depends on the template arguments. 4. We have many unnecessary dependencies on IRBuilder.h, because it is not easy to forward-declare. (Significant parts of the backend depend on it via TargetLowering.h, for example.) This patch addresses the issue by making the following changes: * IRBuilderDefaultInserter::InsertHelper becomes virtual. IRBuilderBase accepts a reference to it. * IRBuilderFolder is introduced as a virtual base class. It is implemented by ConstantFolder (default), NoFolder and TargetFolder. IRBuilderBase has a reference to this as well. * All the logic is moved from IRBuilder to IRBuilderBase. This means that methods can in the future replace their IRBuilder<> & uses (or other specific IRBuilder types) with IRBuilderBase & and thus be usable with different IRBuilders. * The IRBuilder class is now a thin wrapper around IRBuilderBase. Essentially it only stores the folder and inserter and takes care of constructing the base builder. What this patch doesn't do, but should be simple followups after this change: * Fixing use of the inserter for creation methods originally defined on IRBuilderBase. 
* Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful. * Moving code from the IRBuilder header to the source file. From the user perspective, these changes should be mostly transparent: The only thing that consumers using a custom inserter may need to do is inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper as public. Differential Revision: https://reviews.llvm.org/D73835	2020-02-16 13:48:55 +01:00
Johannes Doerfert	1d5da8cd30	[Attributor][FIX] Use pointer not reference as it can be null	2020-02-15 20:38:49 -06:00
Florian Hahn	f8045b250d	Recommit "[SCCP] Remove forcedconstant, go to overdefined instead" This includes a fix for cases where things get marked as overdefined in ResolvedUndefsIn, but we later discover a constant. To avoid crashing, we consistently bail out on overdefined values in the visitors. This is similar to the previous behavior with forcedconstant. This reverts the revert commit `02b72f564c`.	2020-02-15 18:36:44 +01:00
Simon Pilgrim	8a48c4a97c	Fix boolean/bitwise operator precedence warnings. NFCI.	2020-02-15 13:53:18 +00:00
Johannes Doerfert	ef746aa11f	[Attributor] Collect memory accesses with their respective kind and location In addition to a single bit per memory locations, e.g., globals and arguments, we now collect more information about the actual accesses, e.g., what instruction caused it, was it a read/write/read+write, and what the underlying base pointer was. Follow up patches will make explicit use of this. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D73527	2020-02-15 02:12:04 -06:00
Fangrui Song	fd5665af2c	[Attributor] Fix -Wunused-variable for -DLLVM_ENABLE_ASSERTIONS=off builds after `b4352e43d8`	2020-02-14 21:47:19 -08:00
Johannes Doerfert	b70297a39a	[Attributor][FIX] Ensure abstract attributes are existing before manifest While the function return updateImpl did only look at call sites the manifest method looked at return values. If we don't do this during the updateImpl we might create new abstract attributes during manifest. This is a problem when it comes to liveness information.	2020-02-14 21:44:46 -06:00
Johannes Doerfert	ad121ea14d	[Attributor] Manifest simplified (return) values properly If we simplify a function return value we have to modify the return instructions.	2020-02-14 21:44:46 -06:00
Johannes Doerfert	b53af0e7f9	[Attributor][FIX] Collapse `undef` to a proper value If we see an undef we cannot assume it's the same as "no value". For now we just collapse it to 0.	2020-02-14 21:44:46 -06:00
Johannes Doerfert	137c99a6a5	[Attributor][FIX] Restrict cross-SCC call deletion If we know a call was not needed we might have ended up deleting it even if it was in a different SCC. This prevents us from doing so.	2020-02-14 21:44:46 -06:00
Johannes Doerfert	32e98a7089	[Attributor][FIX] Carefully strip casts in AANoAlias We can strip casts in AANoAlias but that might cause us to end up with a non-pointer type. We do properly handle that case now.	2020-02-14 21:44:46 -06:00
Johannes Doerfert	b4352e43d8	[Attributor][FIX] Do not RAUW void values This caused an error when passes iterated over cached assumptions in the tracker and assumed them to be `null` or an instruction. I failed to create a test case so far.	2020-02-14 21:44:46 -06:00
Johannes Doerfert	282f5d7ad1	[Attributor] Derive memory location attributes (argmemonly, ...) In addition to memory behavior attributes (readonly/writeonly) we now derive memory location attributes (argmemonly/inaccessiblememonly/...). The former is part of AAMemoryBehavior and the latter part of AAMemoryLocation. While they are similar in nature it got messy when they were put in a single AA. Location attributes for arguments and floating values will follow later. Note that both memory attributes kinds can derive readnone. If there are no accesses AAMemoryBehavior will derive readnone. If there are accesses but only to stack (=local) locations AAMemoryLocation will derive readnone. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D73426	2020-02-14 19:05:51 -06:00
Johannes Doerfert	7cbb107feb	[Attributor][FIX] Validate the type for AAValueConstantRange as needed Due to the genericValueTraversal we might visit values for which we did not create an AAValueConstantRange object, e.g., as they are behind a PHI or select or call with `returned` argument. As a consequence we need to validate the types as we are about to query AAValueConstantRange for operands.	2020-02-14 17:22:40 -06:00
Alina Sbirlea	1326a5a4cf	[LoopRotate] Get and update MSSA only if available in legacy pass manager. Summary: Potential fix for: https://bugs.llvm.org/show_bug.cgi?id=44889 and https://bugs.llvm.org/show_bug.cgi?id=44408 In the legacy pass manager, loop rotate need not compute MemorySSA when not being in the same loop pass manager with other loop passes. There isn't currently a way to differentiate between the two cases, so this attempts to limit the usage in LoopRotate to only update MemorySSA when the analysis is already available. The side-effect of this is that it will split the Loop pipeline. This issue does not apply to the new pass manager, where we have a flag specifying if all loop passes in that loop pass manager preserve MemorySSA. Reviewers: dmgreen, fedor.sergeev, nikic Subscribers: Prazek, hiraditya, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74574	2020-02-14 10:47:26 -08:00
Kadir Cetinkaya	1674f772b4	[VectorCombine] Fix unused variable for assertion disabled builds	2020-02-14 09:30:29 +01:00
Vedant Kumar	8e77b33b3c	[Local] Do not move around dbg.declares during replaceDbgDeclare replaceDbgDeclare is used to update the descriptions of stack variables when they are moved (e.g. by ASan or SafeStack). A side effect of replaceDbgDeclare is that it moves dbg.declares around in the instruction stream (typically by hoisting them into the entry block). This behavior was introduced in llvm/r227544 to fix an assertion failure (llvm.org/PR22386), but no longer appears to be necessary. Hoisting a dbg.declare generally does not create problems. Usually, dbg.declare either describes an argument or an alloca in the entry block, and backends have special handling to emit locations for these. In optimized builds, LowerDbgDeclare places dbg.values in the right spots regardless of where the dbg.declare is. And no one uses replaceDbgDeclare to handle things like VLAs. However, there doesn't seem to be a positive case for moving dbg.declares around anymore, and this reordering can get in the way of understanding other bugs. I propose getting rid of it. Testing: stage2 RelWithDebInfo sanitized build, check-llvm rdar://59397340 Differential Revision: https://reviews.llvm.org/D74517	2020-02-13 14:35:02 -08:00
Sanjay Patel	19b62b79db	[VectorCombine] try to form vector binop to eliminate an extract element binop (extelt X, C), (extelt Y, C) --> extelt (binop X, Y), C This is a transform that has been considered for canonicalization (instcombine) in the past because it reduces instruction count. But as shown in the x86 tests, it's impossible to know if it's profitable without a cost model. There are many potential target constraints to consider. We have implemented similar transforms in the backend (DAGCombiner and target-specific), but I don't think we have this exact fold there either (and if we did it in SDAG, it wouldn't work across blocks). Note: this patch was intended to handle the more general case where the extract indexes do not match, but it got too big, so I scaled it back to this pattern for now. Differential Revision: https://reviews.llvm.org/D74495	2020-02-13 17:23:27 -05:00
Vedant Kumar	02b72f564c	Revert "Recommit "[SCCP] Remove forcedconstant, go to overdefined instead"" This reverts commit `bb310b3f73`. This breaks the stage2 ASan build, see: https://bugs.llvm.org/show_bug.cgi?id=44898 rdar://59431448	2020-02-13 11:55:18 -08:00
stozer	9bda7ab835	Re-revert: Recover debug intrinsics when killing duplicated/empty blocks This reverts commit `61b35e4111`. This commit causes a timeout in chromium builds; likely to have a similar cause to the previous timeout issue caused by this commit (see `6ded69f294` for more details). It is possible that there is no way to fix this bug that will not cause this issue; further investigations as to the efficiency of handling large amounts of debug info will be necessary.	2020-02-13 11:48:19 +00:00
Johannes Doerfert	70cac41a2b	Reapply "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late" Reapply `8a56d64d76` with minor fixes. The problem was that cancellation can cause new edges to the parallel region exit block which is not outlined. The CodeExtractor will encode the information which "exit" was taken as a return value. The fix is to ensure we do not return any value from the outlined function; to prevent control-to-value conversion, we ensure a single exit block for the outlined region. This reverts commit `3aac953afa`.	2020-02-12 22:29:07 -06:00
Johannes Doerfert	3aac953afa	Revert "[OpenMP][IRBuilder] Perform finalization (incl. outlining) late" This reverts commit `8a56d64d76`. Will be recommitted once the clang test problem is addressed.	2020-02-12 18:50:43 -06:00
Johannes Doerfert	8a56d64d76	[OpenMP][IRBuilder] Perform finalization (incl. outlining) late In order to fix PR44560 and to prepare for loop transformations we now finalize a function late, which will also do the outlining late. The logic is as before but the actual outlining step happens now after the function was fully constructed. Once we have loop transformations we can apply them in the finalize step before the outlining. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D74372	2020-02-12 17:55:01 -06:00
Johannes Doerfert	23f41f16d4	[Attributor] Use fine-grained liveness in all helpers We used coarse-grained liveness before, that is, we checked whether the instruction was executed, but we did not use fine-grained liveness, that is, whether the instruction was needed or could be deleted even if the surrounding ones are live. This patch introduces this level of liveness checks together with other liveness queries, e.g., for uses. For more control we enforce that all liveness queries go through the Attributor. Tests have been adjusted to reflect the changes or augmented to prevent deletion of the parts we want to check. Reviewed By: sstefan1 Differential Revision: https://reviews.llvm.org/D73313	2020-02-12 17:36:38 -06:00
Johannes Doerfert	b2c76002ca	[Attributor] Ignore uses if a value is simplified If we have a replacement for a value, via AAValueSimplify, the original value will lose all its uses. Thus, as long as a value is simplified we can skip the uses in checkForAllUses, given that these uses are transitive uses for the simplified version and will therefore affect the simplified version as necessary. Since this allowed us to remove calls without side-effects and a known return value, we need to make sure not to eliminate `musttail` calls. Those we keep around, or later remove the entire `musttail` call chain.	2020-02-12 17:36:38 -06:00
Johannes Doerfert	86509e8c3b	[Attributor] Use assumed information to determine side-effects We relied on wouldInstructionBeTriviallyDead before but that function does not take assumed information, especially for calls, into account. The replacement, AAIsDead::isAssumeSideEffectFree, does. This change makes AAIsDeadCallSiteReturn more complex as we can have a dead call or only dead users. The tests have been modified to include a side effect where there was none in order to keep the coverage. Reviewed By: sstefan1 Differential Revision: https://reviews.llvm.org/D73311	2020-02-12 17:36:38 -06:00
Ehud Katz	d8a2ea9fd5	[LoopExtractor] Fix legacy pass dependencies Fixes a memory leak of allocating `LoopInfoWrapperPass` and `DominatorTreeWrapperPass`.	2020-02-12 22:39:21 +02:00
Vedant Kumar	34d9f93977	[AddressSanitizer] Ensure only AllocaInst is passed to dbg.declare Various parts of the LLVM code generator assume that the address argument of a dbg.declare is not a `ptrtoint`-of-alloca. ASan breaks this assumption, and this results in local variables sometimes being unavailable at -O0. GlobalISel, SelectionDAG, and FastISel all do not appear to expect dbg.declares to have a `ptrtoint` as an operand. This means that they do not place entry block allocas in the usual side table reserved for local variables available in the whole function scope. This isn't always a problem, as LLVM can try to lower the dbg.declare to a DBG_VALUE, but those DBG_VALUEs can get dropped for all the usual reasons DBG_VALUEs get dropped. In the ObjC test case I'm looking at, the cause happens to be that `replaceDbgDeclare` has hoisted dbg.declares into the entry block, causing LiveDebugValues to "kill" the DBG_VALUEs because the lexical dominance check fails. To address this, I propose: 1) Have ASan (always) pass an alloca to dbg.declares (this patch). This is a narrow bugfix for -O0 debugging. 2) Make replaceDbgDeclare not move dbg.declares around. This should be a generic improvement for optimized debug info, as it would prevent the lexical dominance check in LiveDebugValues from killing as many variables. This means reverting llvm/r227544, which fixed an assertion failure (llvm.org/PR22386) but no longer seems to be necessary. I was able to complete a stage2 build with the revert in place. rdar://54688991 Differential Revision: https://reviews.llvm.org/D74369	2020-02-12 11:24:02 -08:00
Jay Foad	32aac25637	[KnownBits] Introduce anyext instead of passing a flag into zext Summary: This was a very odd API, where you had to pass a flag into a zext function to say whether the extended bits really were zero or not. All callers passed in a literal true or false. I think it's much clearer to make the function name reflect the operation being performed on the value we're tracking (rather than on the KnownBits Zero and One fields), so zext means the value is being zero extended and new function anyext means the value is being extended with unknown bits. NFC. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74482	2020-02-12 19:06:53 +00:00
Florian Hahn	bb310b3f73	Recommit "[SCCP] Remove forcedconstant, go to overdefined instead" This version includes a fix for a set of crashes caused by marking values depending on a yet unknown & tracked call as overdefined. In some cases, we would later discover that the call has a constant result and try to mark a user of it as constant, although it was already marked as overdefined. Most instruction handlers bail out early if the instruction is already overdefined. But that is not necessary for CastInsts for example. By skipping values that depend on skipped calls, we resolve the crashes and also improve the precision in some cases (see resolvedundefsin-tracked-fn.ll). Note that we may not skip PHI nodes that may depend on a skipped call, but they can be safely marked as overdefined, as we bail out early if the PHI node is overdefined. This reverts the revert commit `fa74b31a3e`.	2020-02-12 18:02:18 +00:00
Anh Tuyen Tran	a5b6480d05	[NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files Summary: This refactor patch removes some header files which are not needed and also adds some to meet IWYU principles. Reviewers: rnk (Reid Kleckner), Meinersbur (Michael Kruse), dmgreen (Dave Green) Reviewed By: dmgreen (Dave Green), rnk (Reid Kleckner), Meinersbur (Michael Kruse) Subscribers: dmgreen (Dave Green), Whitney (Whitney Tsang), hiraditya (Aditya Kumar), zzheng (Z. Zheng), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D73498	2020-02-12 17:57:56 +00:00
Alina Sbirlea	4f33a68973	Compute ORE, BPI, BFI in Loop passes. Summary: Passes ORE, BPI, BFI are not being preserved by Loop passes, hence it is incorrect to retrieve these passes as cached. This patch makes the loop passes in question compute a new instance. In some of these cases, however, it may be beneficial to change the Loop pass to a Function pass instead, similar to the change for LoopUnrollAndJam. Reviewers: chandlerc, dmgreen, jdoerfert, reames Subscribers: mehdi_amini, hiraditya, zzheng, steven_wu, dexonsmith, Whitney, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72891	2020-02-12 09:15:18 -08:00
Sven van Haastregt	665dcdacc0	Add missing newlines at EOF; NFC	2020-02-12 15:57:25 +00:00
stozer	61b35e4111	Re-reapply: Recover debug intrinsics when killing duplicated/empty blocks This reverts commit `636c93ed11`. The original patch caused build failures on TSan buildbots. Commit `6ded69f294` fixes this issue by reducing the rate at which empty debug intrinsics propagate, reducing the memory footprint and preventing a fatal spike.	2020-02-12 14:36:30 +00:00
Florian Hahn	81dbb6aec6	Recommit "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)." This includes a fix for the sanitizer failures. This reverts the revert commit `42f8b915eb`.	2020-02-12 14:17:50 +00:00
Ayman Musa	35f02aa021	Revert "[AggressiveInstCombine] Add support for ICmp instr that feeds a select instr's condition operand." This reverts commit `cf155150f9`.	2020-02-12 15:04:49 +02:00
Ayman Musa	cf155150f9	[AggressiveInstCombine] Add support for ICmp instr that feeds a select instr's condition operand.	2020-02-12 15:01:27 +02:00
stozer	ffeb64db35	Reapply "[DebugInfo] Prevent explosion of debug intrinsics during jump threading" This reverts commit `6ded69f294`.	2020-02-12 12:39:54 +00:00
Ayman Musa	3bda9059b8	[AggressiveInstCombine] Add support for select instruction. Differential Revision: https://reviews.llvm.org/D72837	2020-02-12 13:59:34 +02:00
stozer	6ded69f294	Revert "[DebugInfo] Prevent explosion of debug intrinsics during jump threading" This reverts commit `fe6f6cd6b8`. Found test failure on several buildbots.	2020-02-12 11:48:00 +00:00
Ayman Musa	49a4d85f6d	[NFC][AggressiveInstCombine] Remove redundant std::max. Differential Revision: https://reviews.llvm.org/D74476	2020-02-12 13:47:40 +02:00
stozer	fe6f6cd6b8	[DebugInfo] Prevent explosion of debug intrinsics during jump threading This patch is a fix following the revert of `72ce759` (https://reviews.llvm.org/rG72ce759928e6dfee6a9efa310b966c19722352ba) and fixes the failure that it caused. The above patch failed on the Thread Sanitizer buildbot with an out of memory error. After an investigation, the cause was identified as an explosion in debug intrinsics while running the Jump Threading pass on ModuleMap.ll. The above patch prevented debug intrinsics from being dropped when their Basic Block was deleted due to being "empty". In this case, one of the functions in ModuleMap.ll had (after many optimization passes) a very large number of debug intrinsics representing a set of repeatedly inlined variables. Previously the vast majority of these were silently dropped during Jump Threading when their blocks were deleted, but as of the above patch they survived for longer, causing a large increase in the number of debug intrinsics. These intrinsics were then repeatedly cloned by the Jump Threading pass as edges were threaded, multiplying the intrinsic count further. The memory consumed by this process spiralled out of control, crashing the buildbot that uses TSan (which has an estimated 5-10x memory overhead compared to non-sanitized builds). This patch adds RemoveRedundantDbgInstrs to the Jump Threading pass, in order to reduce the number of debug intrinsics down to a manageable amount in cases where many intrinsics for the same variable end up bunched together contiguously, as in this case. Differential Revision: https://reviews.llvm.org/D73054	2020-02-12 11:22:54 +00:00
Florian Hahn	fa74b31a3e	Revert "[SCCP] Remove forcedconstant, go to overdefined instead" This causes a crash for the reproducer below enum { a }; enum b { c, d }; e; static _Bool g(struct f *h, enum b i) { i &&j(); return a; } static k(char h, enum b i) { _Bool l = g(e, i); l; } m(h) { k(h, c); g(h, d); } This reverts commit `aadb635e04`.	2020-02-12 09:41:19 +00:00
Matt Arsenault	86f9117d47	AMDGPU: Don't report 2-byte alignment as fast This is apparently worse than 1-byte alignment. This does not attempt to decompose 2-byte aligned wide stores, but will stop trying to produce them. Also fix bug in LoadStoreVectorizer which was decreasing the alignment and vectorizing stack accesses. It was assuming a stack object was an alloca that could have its base alignment changed, which is not true if the pointer is derived from a function argument.	2020-02-11 18:35:00 -05:00
Johannes Doerfert	52aec3221f	[Attributor][NFC] Clarify the documentation a bit more	2020-02-11 15:11:55 -06:00
Johannes Doerfert	8e62968d45	[Attributor] Identify dead uses in PHIs (almost) based on dead edges As an approximation to a dead edge we can check if the terminator is dead. If so, the corresponding operand use in a PHI node is dead even if the PHI node itself is not.	2020-02-11 15:11:55 -06:00
Teresa Johnson	80d0a137a5	Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP" This restores commit `748bb5a0f1`, along with a fix for a Chromium test suite build issue (and a new test for that case). Differential Revision: https://reviews.llvm.org/D73242	2020-02-11 10:48:05 -08:00
Johannes Doerfert	185e9b083e	[Attributor][NFC] Improve documentation	2020-02-11 11:19:34 -06:00
Johannes Doerfert	f95553923f	[Attributor] Return uses do not free pointers If a pointer is returned that does not mean it is freed in the current (function) scope. We can ignore such uses in AANoFree.	2020-02-11 11:02:59 -06:00
Johannes Doerfert	4c62a35860	[Attributor][FIX] Remove duplicate, half-broken functionality The changeXXXAfterManifest functions are better suited to deal with changes so we should prefer them. These functions also recursively delete dead instructions which is why we see test changes.	2020-02-11 11:02:59 -06:00
Johannes Doerfert	77a9e61c9a	[Attributor][NFC] Improve debug message	2020-02-11 11:02:59 -06:00
Nikita Popov	5a8819b216	[InstCombine] Use replaceOperand() in more places This is a followup to D73803, which uses the replaceOperand() helper in more places. This should be NFC apart from changes to worklist order. Differential Revision: https://reviews.llvm.org/D73919	2020-02-11 17:38:23 +01:00
Florian Hahn	aadb635e04	[SCCP] Remove forcedconstant, go to overdefined instead This patch removes forcedconstant to simplify things for the move to ValueLattice, which includes constant ranges, but no forced constants. This patch removes forcedconstant and changes ResolvedUndefsIn to mark instructions with unknown operands as overdefined. This means we do not do simplifications based on undef directly in SCCP any longer, but this seems to hardly come up in practice (see stats below), presumably because InstCombine & others take care of most of the relevant folds already. It is still beneficial to keep ResolvedUndefsIn, as it allows us to delay going to overdefined until we have propagated all known information. I also built MultiSource, SPEC2000 and SPEC2006 and compared sccp.IPNumInstRemoved and sccp.NumInstRemoved. It looks like the impact is quite low: Tests: 244 Same hash: 238 (filtered out) Remaining: 6 Metric: sccp.IPNumInstRemoved Program base patch diff test-suite...arks/VersaBench/dbms/dbms.test 4.00 3.00 -25.0% test-suite...TimberWolfMC/timberwolfmc.test 38.00 34.00 -10.5% test-suite...006/453.povray/453.povray.test 158.00 155.00 -1.9% test-suite.../CINT2000/176.gcc/176.gcc.test 668.00 668.00 0.0% test-suite.../CINT2006/403.gcc/403.gcc.test 1209.00 1209.00 0.0% test-suite...arks/mafft/pairlocalalign.test 76.00 76.00 0.0% Tests: 244 Same hash: 238 (filtered out) Remaining: 6 Metric: sccp.NumInstRemoved Program base patch diff test-suite...arks/mafft/pairlocalalign.test 185.00 175.00 -5.4% test-suite.../CINT2006/403.gcc/403.gcc.test 2059.00 2056.00 -0.1% test-suite.../CINT2000/176.gcc/176.gcc.test 2358.00 2357.00 -0.0% test-suite...006/453.povray/453.povray.test 317.00 317.00 0.0% test-suite...TimberWolfMC/timberwolfmc.test 12.00 12.00 0.0% Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D61314	2020-02-11 15:24:15 +00:00
Kadir Cetinkaya	42f8b915eb	Revert "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)." This reverts commit `d0c4d4fe09`. Revert "[DSE,MSSA] Move more passing test cases from todo to simple.ll." This reverts commit `02266e64bb`. Revert "[DSE,MSSA] Adjust mda-with-dbg-values.ll to MSSA backed DSE." This reverts commit `74f03e4ff0`.	2020-02-11 15:34:48 +01:00
Sanjay Patel	a2a0f9a43a	[VectorCombine] remove unused debug counter; NFC The variable was added to the initial commit via copy/paste of existing code, but it wasn't actually used in the code. We can add it back with the proper usage if/when that is needed.	2020-02-11 08:24:07 -05:00
Sanjay Patel	b8ebc11f03	[EarlyCSE] avoid crashing when detecting min/max/abs patterns (PR41083) As discussed in PR41083: https://bugs.llvm.org/show_bug.cgi?id=41083 ...we can assert/crash in EarlyCSE using the current hashing scheme and instructions with flags. ValueTracking's matchSelectPattern() may rely on overflow (nsw, etc) or other flags when detecting patterns such as min/max/abs composed of compare+select. But the value numbering / hashing mechanism used by EarlyCSE intersects those flags to allow more CSE. Several alternatives to solve this are discussed in the bug report. This patch avoids the issue by doing simple matching of min/max/abs patterns that never requires instruction flags. We give up some CSE power because of that, but that is not expected to result in much actual performance difference because InstCombine will canonicalize these patterns when possible. It even has this comment for abs/nabs: /// Canonicalize all these variants to 1 pattern. /// This makes CSE more likely. (And this patch adds PhaseOrdering tests to verify that the expected transforms are still happening in the standard optimization pipelines.) I left this code to use ValueTracking's "flavor" enum values, so we don't have to change the callers' code. If we decide to go back to using the ValueTracking call (by changing the hashing algorithm instead), it should be obvious how to replace this chunk. Differential Revision: https://reviews.llvm.org/D74285	2020-02-10 17:25:34 -05:00
Hiroshi Yamauchi	bb383ae612	[CallPromotionUtils] Add tryPromoteCall. Summary: It attempts to devirtualize a call on alloca through vtable loads. Reviewers: davidxl Subscribers: mgorny, Prazek, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71308	2020-02-10 13:43:16 -08:00
Sanjay Patel	62ce7e650a	[InstCombine] fix use check when canonicalizing abs/nabs We were checking for extra uses of the negated operand even if we were not going to create it as part of this canonicalization. This was showing up as a regression when we limit EarlyCSE as proposed in D74285.	2020-02-10 14:57:37 -05:00
David Stenberg	982944525c	Revert "[InstCombine][DebugInfo] Fold constants wrapped in metadata" This reverts commit `b54a8ec1bc`. The commit triggered debug invariance (different output with/without -g). The patch seems to have exposed a pre-existing invariance problem in GlobalOpt, which I'll write a bug report for.	2020-02-10 17:58:33 +01:00
Bill Wendling	c55cf4afa9	Revert "Remove redundant "std::move"s in return statements" The build failed with error: call to deleted constructor of 'llvm::Error' errors. This reverts commit `1c2241a793`.	2020-02-10 07:07:40 -08:00
Bill Wendling	1c2241a793	Remove redundant "std::move"s in return statements	2020-02-10 06:39:44 -08:00
Mikael Holmen	a50c0b0df7	Fix compiler warning when compiling without asserts [NFC]	2020-02-10 13:55:52 +01:00
Florian Hahn	d0c4d4fe09	[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk). This patch adds a first version of a MemorySSA based DSE. It is missing a lot of features, which will get added as follow-ups, to help to keep the review manageable. The patch uses the following general approach: given a MemoryDef, walk upwards to find clobbering MemoryDefs that may be killed by the starting def. Then check that there are no uses that may read the location of the original MemoryDef in between both MemoryDefs. A bit more concretely: For all MemoryDefs StartDef: 1. Get the next dominating clobbering MemoryDef (DomAccess) by walking upwards. 2. Check that there are no reads between DomAccess and the StartDef by checking all uses starting at DomAccess and walking until we see StartDef. 3. For each found DomDef, check that: 1. There are no barrier instructions between DomDef and StartDef (like throws or stores with ordering constraints). 2. StartDef is executed whenever DomDef is executed. 3. StartDef completely overwrites DomDef. 4. Erase DomDef from the function and MemorySSA. The patch uses a very simple approach to guarantee that no throwing instructions are between 2 stores: We only allow accesses to stack objects, accesses that are in the same basic block if the block does not contain any throwing instructions or accesses in functions that do not contain any throwing instructions. This will get lifted later. Besides adding support for the missing cases, there is plenty of additional potential for improvements as follow-up work, e.g. the way we visit stores (could be just a traversal of the MemorySSA, rather than collecting them up-front), using the alias information discovered during walking to optimize the MemorySSA. This is loosely based on D40480 by Dave Green. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72700	2020-02-10 11:52:11 +00:00
Florian Hahn	da52b9c118	[DSE] Add tests for MemorySSA based DSE. This copies the DSE tests into a MSSA subdirectory to test the MemorySSA backed DSE implementation, without disturbing the original tests. Differential Revision: https://reviews.llvm.org/D72145	2020-02-10 10:28:43 +00:00
Johannes Doerfert	87ddf1f4fa	[Attributor] Simple casts preserve no-alias property This is a minimal but important advancement over the existing code. A cast with an operand that is only used in the cast retains the no-alias property of the operand.	2020-02-10 01:11:32 -06:00
Johannes Doerfert	8155439331	[Attributor] Allow PHI nodes in AAValueConstantRangeFloating Traversing PHI nodes is natural with the genericValueTraversal but also a bit tricky. The problem is similar to the ones we have seen in AAAlign and AADereferenceable, namely that we continue to increase the range in each iteration. We use a pessimistic approach here to stop the iterations. Nevertheless, optimistic information can now be propagated through a PHI node.	2020-02-10 00:55:10 -06:00
Johannes Doerfert	63adbb9a0e	[Attributor][FIX] Remove FIXME that seems outdated The change is performed as stated by the FIXME and the tests are adjusted. All changes look fine to me and values can be inferred as undef without it being an error.	2020-02-10 00:55:10 -06:00
Johannes Doerfert	7e7e6594b3	[Attributor] Allow SelectInst in AAValueConstantRangeFloating The genericValueTraversal will already handle SelectInst properly and we just needed to allow them in the initialize method.	2020-02-10 00:55:09 -06:00
Johannes Doerfert	ffdbd2a06c	[Attributor] Look through (some) casts in AAValueConstantRangeFloating Casts can be handled natively by the ConstantRange class. We do limit it to extends for now as we assume an integer type in different locations. A TODO and a test case with a FIXME was added to remove that restriction in the future.	2020-02-10 00:38:01 -06:00
Johannes Doerfert	028db8c490	[Attributor][FIX] Call right base method in AAValueConstantRangeFloating We now call the base class method as we should.	2020-02-10 00:38:01 -06:00
Michael Liao	ab3da5dd66	Fix `-Wparentheses` warning. NFC.	2020-02-10 00:45:02 -05:00
Sanjay Patel	a17f03bd93	[VectorCombine] new IR transform pass for partial vector ops We have several bug reports that could be characterized as "reducing scalarization", and this topic was also raised on llvm-dev recently: http://lists.llvm.org/pipermail/llvm-dev/2020-January/138157.html ...so I'm proposing that we deal with these patterns in a new, lightweight IR vector pass that runs before/after other vectorization passes. There are 4 alternate options that I can think of to deal with this kind of problem (and we've seen various attempts at all of these), but they all have flaws: InstCombine - can't happen without TTI, but we don't want target-specific folds there. SDAG - too late to assist other vectorization passes; TLI is not equipped for these kinds of cost queries; limited to a single basic block. CGP - too late to assist other vectorization passes; would need to re-implement basic cleanups like CSE/instcombine. SLP - doesn't fit with existing transforms; limited to a single basic block. This initial patch/transform is based on existing code in AggressiveInstCombine: we walk backwards through the function looking for a pattern match. But we diverge from that cost-independent IR canonicalization pass by using TTI to decide if the vector alternative is profitable. We probably have at least 10 similar bug reports/patterns (binops, constants, inserts, cheap shuffles, etc) that would fit in this pass as follow-up enhancements. It's possible that we could iterate on a worklist to fix-point like InstCombine does, but it's safer to start with a most basic case and evolve from there, so I didn't try to do anything fancy with this initial implementation. Differential Revision: https://reviews.llvm.org/D73480	2020-02-09 10:04:41 -05:00
Ehud Katz	3b70ee27a5	[LoopExtractor] Convert LoopExtractor from LoopPass to ModulePass The LoopExtractor created new functions (by definition), which violates the restrictions of a LoopPass. The correct implementation of this pass should be as a ModulePass. Includes reverting rL82990 implications on the LoopExtractor. Fixes PR3082 and PR8929. Differential Revision: https://reviews.llvm.org/D69069	2020-02-09 12:25:21 +02:00
Johannes Doerfert	b0c77c36d2	[Attributor] Add an Attributor CGSCC pass and run it In addition to the module pass, this patch introduces a CGSCC pass that runs the Attributor on a strongly connected component of the call graph (both old and new PM). The Attributor was always designed to be used on a subset of functions which makes this patch mostly mechanical. The one change is that we give up `norecurse` deduction in the module pass in favor of doing it during the CGSCC pass. This makes the interfaces simpler but can be revisited if needed. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D70767	2020-02-08 21:27:34 -06:00
Johannes Doerfert	e565db49c6	[OpenMP][Opt] Delete terminating and read-only parallel regions Parallel regions known to be read-only, e.g., after we removed all dead write accesses, and terminating (`willreturn`) can be removed. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D69954	2020-02-08 18:52:04 -06:00
Johannes Doerfert	e28936f613	[OpenMP][Opt] Annotate known runtime functions and deduplicate more This adds ~27 more runtime calls to the OpenMPKinds.def file, all with attributes. We deduplicate 16 of those automatically in function = thread scope. And we annotate all of them automatically during the OpenMPOpt discovery step. A test with all omp_XXXX runtime calls to track annotation coverage is included. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D69984	2020-02-08 18:35:39 -06:00
Nikita Popov	a05932931c	[InstCombine] Refactor foldICmpAndShift(); NFCI Separate out handling for shl, lshr and ashr. The combined handling obscured some overly pessimistic requirements for the transform.	2020-02-08 22:27:43 +01:00
Johannes Doerfert	9548b74a83	[OpenMP] Introduce the OpenMPOpt transformation pass The OpenMPOpt pass is a CGSCC pass in which OpenMP specific optimizations can reside. The OpenMPOpt pass uses the OpenMPKinds.def file to identify runtime calls and their uses. This allows targeted transformations and eases their implementation. This initial patch deduplicates `__kmpc_global_thread_num` and `omp_get_thread_num` calls. We can also identify arguments that are equivalent to such a call result and use it instead. Later we can determine "gtid" arguments based on the use in kernel functions etc. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D69930	2020-02-08 14:47:03 -06:00
Johannes Doerfert	72277ecd62	Introduce a CallGraph updater helper class The CallGraphUpdater is a helper that simplifies the process of updating the call graph, both old and new style, while running a CGSCC pass. The uses are contained in different commits, e.g. D70767. More functionality is added as we need it. Reviewed By: modocache, hfinkel Differential Revision: https://reviews.llvm.org/D70927	2020-02-08 14:16:48 -06:00
George Burgess IV	f8c9ceb1ce	[SimplifyLibCalls] Add __strlen_chk. Bionic has had `__strlen_chk` for a while. Optimizing that into a constant is quite profitable, when possible. Differential Revision: https://reviews.llvm.org/D74079	2020-02-08 11:51:00 -08:00
Nikita Popov	a148b9e990	[InstCombine] Fix infinite min/max canonicalization loop (PR44541) While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541, it does so in a more roundabout manner and there might be other loopholes to trigger the same issue. This is a more direct fix, that prevents the transform if the min/max is based on a non-canonical sub X, 0 instruction. Differential Revision: https://reviews.llvm.org/D73849	2020-02-08 20:42:17 +01:00
Nikita Popov	5b2b67be8e	[InstCombine] Remove unnecessary worklist push; NFCI This is no longer needed after `d4627b90a0`, should have dropped it there...	2020-02-08 17:09:28 +01:00
Nikita Popov	d4627b90a0	[InstCombine] Avoid modifying instructions in-place As discussed on D73919, this replaces a few cases where we were modifying multiple operands of instructions in-place with the creation of a new instruction, which we generally prefer nowadays. This tends to be more readable and less prone to worklist management bugs. Test changes are only superficial (instruction naming and order).	2020-02-08 17:05:56 +01:00
Nikita Popov	9d03b7d0d0	[InstCombine] Use swapValues(); NFC Less code, and makes it more obvious that these operands do not need to be added back to the worklist.	2020-02-08 16:57:28 +01:00
Nikita Popov	23db9724d0	[InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835) Fixes https://bugs.llvm.org/show_bug.cgi?id=44835. Skip the transform if it wouldn't actually do anything (apart from removing and reinserting the same instructions). Note that the test case doesn't loop on current master anymore, only on the LLVM 10 release branch. The issue is already mitigated on master due to worklist order fixes, but we should fix the root cause there as well. As a side note, we should probably assert in combineLoadToNewType() that it does not combine to the same type. Not doing this here, because this assertion would also be triggered in another place right now. Differential Revision: https://reviews.llvm.org/D74278	2020-02-08 16:55:22 +01:00
Akira Hatanaka	4dcc029edb	[ObjC][ARC] Keep track of phis that have been discovered to avoid an infinite loop This fixes a bug introduced in `6770fbb314`. rdar://problem/59137105	2020-02-07 20:33:11 -08:00
Akira Hatanaka	6770fbb314	[ObjC][ARC] Delete ARC runtime calls that take inert phi values This improves on the following patch, which removed ARC runtime calls taking inert global variables: https://reviews.llvm.org/D62433 rdar://problem/59137105	2020-02-07 16:31:36 -08:00
Hiroshi Yamauchi	4ed205c816	[PGO][PGSO] Enable profile guided size optimization for non-cold code under instrumentation PGO. Summary: This enables it for large working set size cases only. This does not enable it under sample PGO. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74073	2020-02-06 10:29:01 -08:00
Denis Antrushin	99a6e405ed	[IRCE] Use SCEVExpander to modify loop bound IRCE pass checks that it can calculate loop bounds by checking SCEV availability at loop entry. However it is possible that loop bound SCEV is loop invariant, but instruction used to compute it resides within loop. In such case adjusting loop bound in preheader using IRBuilder leads to malformed SSA. Use SCEVExpander instead to generate proper instructions. Reviewed-by: mkazantsev Differential Revision: https://reviews.llvm.org/D73496	2020-02-06 12:44:43 +03:00
Teresa Johnson	25aa2eef99	Revert "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP" This reverts commit `748bb5a0f1`. Due to Chromium CFI+ThinLTO test crashes reported on patch.	2020-02-05 19:27:32 -08:00
Juneyoung Lee	5687acf431	[MemCpyOpt] Simplify find*Alignment	2020-02-06 06:42:07 +09:00
Juneyoung Lee	ad9ae6ee2b	MemCpyOpt cannot use ABI alignment even if it was not given Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44388 which incorrectly assigns an ABI alignment to memset when there was no explicit alignment given. Reviewers: gchatelet, lenary, nikic Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74083	2020-02-06 06:21:55 +09:00
Hiroshi Yamauchi	b70f23f599	[PGO][PGSO] Tune flags for profile guided size optimization. Summary: Tune the profile threshold flag value for instrumentation PGO based on internal benchmarks. Also, add flags to allow profile guided size optimizations for non-cold code to be enabled separately for instrumentation and sample PGSO. Neither changes the default behavior (yet) as it's disabled for non-cold code. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72937	2020-02-05 09:37:32 -08:00
Kazu Hirata	4698bf145d	Resubmit^2: [JumpThreading] Thread jumps through two basic blocks This reverts commit `41784bed01`. Since the original revision `ead815924e`, this revision fixes three issues: - This revision fixes the Windows build. My original patch improperly copied EH pads on Windows. This patch disregards jump threading opportunities having to do with EH pads. - This revision fixes jump threading to a wrong destination. Specifically, my original patch treated any Constant other than 0 as 1 while evaluating the branch condition. This bug led to treating constant expressions like: icmp ugt i8* null, inttoptr (i64 4 to i8) to "true". This patch fixes the bug by calling isOneValue. - This revision fixes the cost calculation of two basic blocks being threaded through. Note that getJumpThreadDuplicationCost returns "(unsigned)~0" for those basic blocks that cannot be duplicated. If we sum two return values from getJumpThreadDuplicationCost, we could have an unsigned overflow like: (unsigned)~0 + 5 = 4 and mistakenly determine that it's safe and profitable to proceed with the jump threading opportunity. The patch fixes the bug by checking each return value before summing them up. [JumpThreading] Thread jumps through two basic blocks Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like: bb3: %var = phi i32 [ null, %bb1 ], [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have: bb3: %var = phi i32* [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb3.dup: %var = phi i32* [ null, %bb1 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... 
bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247	2020-02-05 09:23:37 -08:00
Alina Sbirlea	67904db23c	[IRCE] Make IRCE a Function pass. Summary: Make InductiveRangeCheckElimination a FunctionPass. Reviewers: reames, mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73592	2020-02-05 09:22:41 -08:00
Teresa Johnson	748bb5a0f1	[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP Summary: Currently type test assume sequences inserted for devirtualization are removed during WPD. This patch delays their removal until later in the optimization pipeline. This is an enabler for upcoming enhancements to indirect call promotion, for example streamlined promotion guard sequences that compare against vtable address instead of the target function, when there is a small number of possible vtables (either determined via WPD or by in-progress type profiling). We need the type tests to correlate the callsites with the address point offset needed in the compare sequence, and optionally to associated type summary info computed during WPD. This depends on work in D71913 to enable invocation of LowerTypeTests to drop type test assume sequences, which will now be invoked following ICP in the ThinLTO post-LTO link pipelines, and also after the existing export phase LowerTypeTests invocation in regular LTO (which is already after ICP). We cannot simply move the existing import phase LowerTypeTests pass later in the ThinLTO post link pipelines, as the comment in PassBuilder.cpp notes (it must run early because when performing CFI other passes may disturb the sequences it looks for). This necessitated adding a new type test resolution "Unknown" that we can use on the type test assume sequences previously removed by WPD, that we now want LTT to ignore. Depends on D71913. Reviewers: pcc, evgeny777 Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73242	2020-02-05 08:59:48 -08:00
Alina Sbirlea	1c03cc5a39	[NFCI] Update according to style. clang-tidy + clang-format	2020-02-04 17:11:36 -08:00
Sanjay Patel	dc42ff6697	[InstCombine] add FIXME comment to shuffle transform; NFC Existing tests: rG5d04e008f708 rG2a191cf8500f ...should verify that the underlying analysis doesn't improve too much without updating this user code.	2020-02-04 13:02:06 -05:00
Matt Arsenault	a3c814d234	Separately track input and output denormal mode AMDGPU and x86 at least both have separate controls for whether denormal results are flushed on output, and for whether denormals are implicitly treated as 0 as an input. The current DAGCombiner use only really cares about the input treatment of denormals.	2020-02-04 12:59:21 -05:00
Sanjay Patel	0cf0be993c	[InstCombine] fix operands of shouldChangeType() for casted phi transform This is a bug noted in the recent D72733 and seen in the similar transform just above the changed source code. I added tests with illegal types and zexts to show the bug - we could transform legal phi ops to illegal, etc. I did not add tests with trunc because we won't see any diffs on those patterns. That is because InstCombiner::SliceUpIllegalIntegerPHI() appears to do those transforms independently of datalayout. It can also create more casts than are present in existing code. There are some existing regression tests that do not include a datalayout that would be altered by this fix. I assumed that the lack of a datalayout in those regression files is an oversight, so I added the minimal layout (make i32 legal) necessary to preserve behavior on those tests. Differential Revision: https://reviews.llvm.org/D73907	2020-02-04 07:45:48 -05:00
Thomas Raoux	e53bbf1213	[GVN] Add GVNOption to control load-pre more fine-grained. Adds the global (cl::opt) GVNOption enable-load-in-loop-pre in order to control whether the optimization will be performed if the load is part of a loop. Patch by Hendrik Greving! Differential Revision: https://reviews.llvm.org/D73804	2020-02-03 23:00:58 -08:00
Tyker	15f54d348b	[NFC] Factor out function to detect if an attribute has an argument.	2020-02-03 22:27:24 +01:00
Alina Sbirlea	388de9dfcd	[LoopUtils] Make duplicate method a utility. [NFCI] Summary: Method appendLoopsToWorklist is duplicated in LoopUnroll and in the LoopPassManager as an internal method. Make it a utility. Reviewers: dmgreen, chandlerc, fedor.sergeev, yamauchi Subscribers: mehdi_amini, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73569	2020-02-03 10:24:18 -08:00
Nikita Popov	575a975afd	[SimplifyLibCalls] Remove unused IRBuilder argument; NFC isLocallyOpenedFile() does not use IRBuilder.	2020-02-03 19:12:57 +01:00
Nikita Popov	878cb38a5c	[InstCombine] Add replaceOperand() helper Adds a replaceOperand() helper, which is like Instruction.setOperand() but adds the old operand to the worklist. This reduces the amount of missing or incorrect worklist management. This only applies the helper to a relatively small subset of setOperand() calls in InstCombine, namely those of the pattern `I.setOperand(); return &I;`, where it is most obviously applicable. Differential Revision: https://reviews.llvm.org/D73803	2020-02-03 19:00:17 +01:00
Nikita Popov	e6c9ab4fb7	[InstCombine] Rename worklist methods; NFC This renames Worklist.AddDeferred() to Worklist.add() and Worklist.Add() to Worklist.push(). The intention here is that Worklist.add() should be the go-to method for explicit worklist management, while the raw Worklist.push() is mostly for InstCombine internals. I will then migrate uses of Worklist.push() to Worklist.add() in followup changes. As suggested by spatel on D73411 I'm also changing the remaining method names to lowercase first character, in line with current coding standards. Differential Revision: https://reviews.llvm.org/D73745	2020-02-03 18:56:51 +01:00
Nikita Popov	a59954051e	[InstCombine] Fix unused variable warning; NFC	2020-02-03 18:47:38 +01:00
Teresa Johnson	bed4d9c897	[ThinLTO] More efficient export computation (NFC) Summary: A recent change to enable more importing of global variables with references exposed some efficiency issues with export computation. See D73724 for more information and detailed analysis. The first was specific to variable importing. The code was marking every copy of a referenced value (from possibly thousands of files in the case of linkonce_odr) as exported, and we only need to mark the copy in the module containing the variable def being imported as exported. The reason is that this is tracking what values are newly exported as a result of importing. Anything that was defined in another module and simply used in the exporting module is already exported, and would have been identified by the caller (e.g. the LTO API implementations). The second issue is that the code was re-adding previously exported values (along with all references). It is easy to identify when a variable was already imported into the same module (via the import list insert call return value), and we already did this for function importing. However, what we weren't doing for either function or variable importing was avoiding a re-insertion when it was previously exported into a different importing module. The reason we couldn't do this is there was no way of telling from the export list whether it was previously inserted there because its definition was exported (in which case we already marked all its references as exported) from when it was inserted there because it was referenced by another exported value (in which case we haven't yet inserted its own references). To address this we can restructure the way the export list is constructed. This patch only adds the actual imported definitions (variable or function) to the export list for its module during the import computation. 
After import computation is complete, where we were already post-processing the export list we go ahead and add all references made by those exported values to the export list. These changes speed up the thin link not only with constant variable importing enabled, but also without (due to the efficiency improvement in function importing). Some thin link user time measurements for one large application, average of 5 runs: With constant variable importing enabled: - without this patch: 479.5s - with this patch: 74.6s Without constant variable importing enabled: - without this patch: 80.6s - with this patch: 70.3s Note I have not re-enabled constant variable importing here, as I would like to do additional compile time measurements with these fixes first. Reviewers: evgeny777 Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73851	2020-02-03 09:15:33 -08:00
Sanjay Patel	e78fb556c5	[InstCombine] reassociate splatted vector ops bo (splat X), (bo Y, OtherOp) --> bo (splat (bo X, Y)), OtherOp This patch depends on the splat analysis enhancement in D73549. See the test with comment: ; Negative test - mismatched splat elements ...as the motivation for that first patch. The motivating case for reassociating splatted ops is shown in PR42174: https://bugs.llvm.org/show_bug.cgi?id=42174 In that example, a slight change in order-of-associative math results in a big difference in IR and codegen. This patch gets all of the unnecessary shuffles out of the way, but doesn't address the potential scalarization (see D50992 or D73480 for that). Differential Revision: https://reviews.llvm.org/D73703	2020-02-03 09:08:36 -05:00
Sam Parker	2663a25fad	[JumpThreading] Half the duplicate threshold at Oz Duplicating instructions can lead to code size increases but using a threshold of 3 is good for reducing code size. Differential Revision: https://reviews.llvm.org/D72916	2020-02-03 08:40:20 +00:00
Johannes Doerfert	26d02b0f28	[Attributor] AANoRecurse check all call sites for `norecurse` If all call sites are in `norecurse` functions we can derive `norecurse` as the ReversePostOrderFunctionAttrsPass does. This should make ReversePostOrderFunctionAttrsLegacyPass obsolete once the Attributor is enabled. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D72017	2020-02-02 23:57:17 -06:00
Johannes Doerfert	368f7ee7a5	[Attributor] Propagate known information from `checkForAllCallSites` If we know that all call sites have been processed we can derive an early fixpoint. The use in this patch is likely not to trigger right now but a follow up patch will make use of it. Reviewed By: uenoku, baziotis Differential Revision: https://reviews.llvm.org/D72016	2020-02-02 23:57:17 -06:00
Juneyoung Lee	578d2e2cb1	[llvm-extract] Add -keep-const-init commandline option Summary: This adds -keep-const-init option to llvm-extract which preserves initializers of used global constants. For example: ``` $ cat a.ll @g = constant i32 0 define i32 @f() { %v = load i32, i32* @g ret i32 %v } $ llvm-extract --func=f a.ll -S -o - @g = external constant i32 define i32 @f() { .. } $ llvm-extract --func=f a.ll -keep-const-init -S -o - @g = constant i32 0 define i32 @f() { .. } ``` This option is useful in checking whether a function that uses a constant global is optimized correctly. Reviewers: jsji, MaskRay, david2050 Reviewed By: MaskRay Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73833	2020-02-03 14:30:28 +09:00
Johannes Doerfert	342357c568	[Inliner][NoAlias] Use call site attributes too If we had `noalias` on an argument the inliner created alias scope metadata already. However, the call site `noalias` annotation was not considered. Since the Attributor can derive such call site `noalias` annotation we should treat them the same as argument annotations. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D73528	2020-02-02 23:21:29 -06:00
Tyker	a7bbe45a3e	Build assume from call Fix attempt this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html this patch gives the basis of building an assume to preserve all information from an instruction and adds support for building an assume that preserves the information from a call.	2020-02-02 19:43:36 +01:00
Tyker	7cb5d96fbe	Revert "[WIP] Build assume from call" caused buildbot failure This reverts commit `8ebe001553`.	2020-02-02 18:35:19 +01:00
Tyker	8ebe001553	[WIP] Build assume from call Summary: this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html this patch gives the basis of building an assume to preserve all information from an instruction and adds support for building an assume that preserves the information from a call. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72475	2020-02-02 18:15:50 +01:00
Tyker	c2d0336208	Revert "[WIP] Build assume from call" caused build bot failure This reverts commit `780d2c532f`.	2020-02-02 18:09:06 +01:00
Tyker	780d2c532f	[WIP] Build assume from call Summary: this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html this patch gives the basis of building an assume to preserve all information from an instruction and adds support for building an assume that preserves the information from a call. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72475	2020-02-02 17:54:31 +01:00
Tyker	ad8ffc5010	Revert "[WIP] Build assume from call" This reverts commit `355e4bfd78`.	2020-02-02 17:49:23 +01:00
Tyker	355e4bfd78	[WIP] Build assume from call Summary: this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html this patch gives the basis of building an assume to preserve all information from an instruction and adds support for building an assume that preserves the information from a call. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72475	2020-02-02 17:17:46 +01:00
Tyker	0adda3df92	Revert "[WIP] Build assume from call" This reverts commit `2ff5602cb5`.	2020-02-02 15:05:33 +01:00
Tyker	d431c5d9af	Revert "[NFC] Factor out function to detect if an attribute has an argument." This reverts commit `ff1b9add2f`.	2020-02-02 15:03:06 +01:00
Tyker	ff1b9add2f	[NFC] Factor out function to detect if an attribute has an argument. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72884	2020-02-02 14:50:31 +01:00
Tyker	2ff5602cb5	[WIP] Build assume from call Summary: this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html this patch gives the basis of building an assume to preserve all information from an instruction and adds support for building an assume that preserves the information from a call. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72475	2020-02-02 14:50:31 +01:00
Fangrui Song	ba3a1774a9	[Transforms] Simplify with make_early_inc_range	2020-02-02 00:54:32 -08:00
Artur Pilipenko	34547ac959	NFC. Comments cleanup in DSE::memoryIsNotModifiedBetween Separated from https://reviews.llvm.org/D68006 review.	2020-01-31 15:22:33 -08:00
Nikita Popov	ff17da3f75	[InstCombine] Push negation through multiply (PR44234) Fixes https://bugs.llvm.org/show_bug.cgi?id=44234 by adding multiply support to freelyNegateValue(). Only one of the operands needs to be negatible, so this still fits within the framework. Differential Revision: https://reviews.llvm.org/D73410	2020-01-31 20:58:55 +01:00
Hiroshi Yamauchi	ac8da31a0f	[PGO][PGSO] Handle MBFIWrapper Some code gen passes use MBFIWrapper to keep track of the frequency of new blocks. This was not taken into account and could lead to incorrect frequencies as MBFI silently returns zero frequency for unknown/new blocks. Add a variant for MBFIWrapper in the PGSO query interface. Depends on D73494.	2020-01-31 09:36:55 -08:00
Sanjay Patel	bc1148e7bc	[PATCH] D73727: [SLP] drop poison-generating flags for shuffle reduction ops (PR44536) We may calculate reassociable math ops in arbitrary order when creating a shuffle reduction, so there's no guarantee that things like 'nsw' hold on those intermediate values. Drop all poison-generating flags for safety. This change is limited to shuffle reductions because I don't think we have a problem in the general case (where we intersect flags of each scalar op that goes into a vector op), but if there's evidence of other cases being wrong, we can extend this fix to cover those cases. https://bugs.llvm.org/show_bug.cgi?id=44536 Differential Revision: https://reviews.llvm.org/D73727	2020-01-31 09:54:35 -05:00
Nikita Popov	480391035c	[InstCombine] Remove unnecessary worklist add; NFCI Again, this will already be added by IRBuilder.	2020-01-30 23:24:59 +01:00
Nikita Popov	90b5ed996b	[InstCombine] Remove unnecessary worklist add; NFCI The IRBuilder will automatically add instructions to the worklist. Adding it manually is unnecessary, but may mess up worklist order.	2020-01-30 23:06:28 +01:00
Nikita Popov	cad91074a6	[InstCombine] Create new insts in foldICmpEqIntrinsicWithConstant; NFCI In line with current conventions, create new instructions rather than modify two operands in place and performing manual worklist management. This should be NFC apart from possible worklist order changes.	2020-01-30 23:03:16 +01:00
Whitney Tsang	e44f4a8a54	[LoopFusion] Move instructions from FC1.GuardBlock to FC0.GuardBlock and from FC0.ExitBlock to FC1.ExitBlock when proven safe. Summary: Currently LoopFusion gives up when the second loop nest guard block or the first loop nest exit block is not empty. For example: if (0 < N) { for (int i = 0; i < N; ++i) {} x+=1; } y+=1; if (0 < N) { for (int i = 0; i < N; ++i) {} } The above example should be safe to fuse. This PR moves instructions in FC1 guard block (e.g. y+=1;) to FC0 guard block, or instructions in FC0 exit block (e.g. x+=1;) to FC1 exit block, after which LoopFusion is able to fuse the loops. Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel, bmahjour, etiotto Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D73641	2020-01-30 18:02:22 +00:00
David Stenberg	b54a8ec1bc	[InstCombine][DebugInfo] Fold constants wrapped in metadata Summary: When constant folding, constants that are wrapped in metadata were not folded. This could lead to dbg.values being the only user of a constant expression, due to the non-dbg uses having been rewritten, resulting in the constant later on being removed by some other pass. This occurred with the attached test case, in which the non-rewritten GEP in the dbg.value intrinsic was later on removed by globalopt. This patch makes the code look through metadata and fold such constants. I guess that we in the future may want to allow dbg.values using GEPs and other constant expressions to be emittable even if there are no non-dbg uses, but for example SelectionDAG does not support that. Reviewers: jmorse, aprantl, vsk, davide Reviewed By: aprantl, vsk, davide Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73630	2020-01-30 15:50:16 +01:00
Piotr Sobczak	dd7148822b	[InstCombine][AMDGPU] Trim components of s_buffer_load Summary: Add trimming of unused components of s_buffer_load. For s_buffer_load and unformatted buffer_load also trim unused components at the beginning of vector and update offset accordingly. Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71785	2020-01-30 10:48:25 +01:00
Michael Forster	676c29694c	Inline debug variable. Summary: In a release build this variable becomes unused and may break the build with `-Werror,-Wunused-variable`. Reviewers: gribozavr2, jdoerfert, sstefan1 Reviewed By: gribozavr2 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73683	2020-01-30 10:29:24 +01:00
Nikita Popov	8058196677	[InstCombine] Process newly inserted instructions in the correct order InstCombine operates on the basic premise that the operands of the currently processed instruction have already been simplified. It achieves this by pushing instructions to the worklist in reverse program order, so that instructions are popped off in program order. The worklist management in the main combining loop also makes sure to uphold this invariant. However, the same is not true for all the code that is performing manual worklist management. The largest problem (addressed in this patch) are instructions inserted by InstCombine's IRBuilder. These will be pushed onto the worklist in order of insertion (generally matching program order), which means that a) the users of the original instruction will be visited first, as they are pushed later in the main loop and b) the newly inserted instructions will be visited in reverse program order. This causes a number of problems: First, folds operate on instructions that have not had their operands simplified, which may result in optimizations being missed (ran into this in https://reviews.llvm.org/D72048#1800424, which was the original motivation for this patch). Additionally, this increases the amount of folds InstCombine has to perform, both within one iteration, and by increasing the number of total iterations. This patch addresses the issue by adding a Worklist.AddDeferred() method, which is used for instructions inserted by IRBuilder. These will only be added to the real worklist after the combine finished, and in reverse order, so they will end up processed in program order. I should note that the same should also be done to nearly all other uses of Worklist.Add(), but I'm starting with just this occurrence, which has by far the largest test fallout. Most of the test changes are due to https://bugs.llvm.org/show_bug.cgi?id=44521 or other cases where we don't canonicalize something. These are neutral. 
One regression has been addressed in D73575 and D73647. The remaining regression in an shl+sdiv fold can't really be fixed without dropping another transform, but does not seem particularly problematic in the first place. Differential Revision: https://reviews.llvm.org/D73411	2020-01-30 09:40:10 +01:00
Francesco Petrogalli	623cff81fe	[llvm][VectorUtils] Tweak VFShape for scalable vector functions. Summary: This patch makes sure that the field VFShape.VF is greater than zero when demangling the vector function name of scalable vector functions encoded in the "vector-function-abi-variant" attribute. This change is required to be able to provide instances of VFShape that can be used to query the VFDatabase for the vectorization passes, as such passes always require a positive value for the Vectorization Factor (VF) needed by the vectorization process. It is not possible to extract the value of VFShape.VF from the mangled name of scalable vector functions, because it is encoded as `x`. Therefore, the VFABI demangling function has been modified to extract such information from the IR declaration of the vector function, under the assumption that _all_ vectors in the signature of the vector function have the same number of lanes. This assumption is valid because it is also assumed by the Vector Function ABI specifications supported by the demangling function (x86, AArch64, and LLVM internal one). The unit tests that demangle scalable names have been modified by adding the IR module that carries the declaration of the vector function name being demangled. In particular, the demangling function fails in the following cases: 1. When the declaration of the scalable vector function is not present in the module. 2. When the value of VFShape.VF is not greater than 0. Reviewers: jdoerfert, sdesmalen, andwar Reviewed By: jdoerfert Subscribers: mgorny, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73286	2020-01-30 05:53:56 +00:00
Johannes Doerfert	89c2e733e8	[Attributor] Pointer privatization attribute (argument promotion) A pointer is privatizable if it can be replaced by a new, private one. Privatizing a pointer reduces the use count and the interaction between unrelated code parts. This is a first step towards replacing argument promotion. While we can already handle recursion (unlike argument promotion!) we are restricted to stack allocations for now because we do not analyze the uses in the callee. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D68852	2020-01-29 21:31:04 -06:00
Johannes Doerfert	791c9f1145	[Attributor] Fix TODO to avoid recomputation of results The helpers AAReturnedFromReturnedValues and AACallSiteReturnedFromReturned are useful not only to avoid code duplication but also to avoid recomputation of results. If we have N call sites we should not recompute the function return information N times but once. These are mostly straightforward usages with some minor improvements on the helpers and addition of a new one (IRPosition::getAssociatedType) that knows about function return types.	2020-01-29 19:24:34 -06:00
Nikita Popov	e086e23024	[InstCombine] Support non-splat vectors in icmp eq + add/sub fold For the icmp eq (add X, C1), C2 => icmp eq X, C2-C1 icmp eq (sub C1, X), C2 => icmp eq X, C1-C2 folds, this allows C1 to be non-splat and contain undefs. C2 is still splat, due to the structure of the code. This is to address the remaining part of the regression in D73411, where demanded element analysis replaces some elements with undef. Differential Revision: https://reviews.llvm.org/D73647	2020-01-29 20:56:58 +01:00
Elia Geretto	ab2300bc15	[PassManagerBuilder] Remove global extension when a plugin is unloaded This commit fixes PR39321. GlobalExtensions is not guaranteed to be destroyed when optimizer plugins are unloaded. If it is indeed destroyed after a plugin is dlclose-d, the destructor of the corresponding ExtensionFn is not mapped anymore, causing a call to unmapped memory during destruction. This commit guarantees that extensions coming from external plugins are removed from GlobalExtensions when the plugin is unloaded if GlobalExtensions has not been destroyed yet. Differential Revision: https://reviews.llvm.org/D71959	2020-01-29 16:15:45 +00:00
Simon Pilgrim	79748add70	Fix MSVC lambda default capture mode warning. NFCI.	2020-01-29 15:47:04 +00:00
Whitney Tsang	da58e68fdf	[LoopFusion] Move instructions from FC1.Preheader to FC0.Preheader when proven safe. Summary: Currently LoopFusion gives up when the second loop nest preheader is not empty. For example: for (int i = 0; i < 100; ++i) {} x+=1; for (int i = 0; i < 100; ++i) {} The above example should be safe to fuse. This PR moves instructions in the FC1 preheader (e.g. x+=1;) to the FC0 preheader, so that LoopFusion is then able to fuse the loops. Reviewer: kbarton, Meinersbur, jdoerfert, dmgreen, fhahn, hfinkel, bmahjour, etiotto Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D71821	2020-01-29 15:06:11 +00:00
Sanjay Patel	87f6314f8c	[InstCombine] canonicalize splat shuffle after cmp cmp (splat V1, M), SplatC --> splat (cmp V1, SplatC'), M As discussed in PR44588: https://bugs.llvm.org/show_bug.cgi?id=44588 ...we try harder to push shuffles after binops than after compares. This patch handles the special (but presumably most common case) of splat shuffles. If both operands are splats, then we can do the comparison on the non-splat inputs followed by splat of the compare. That should take care of the regression noted in D73411. There's another potential fold requested in PR37463 to scalarize the compare, but that's another patch (and it's not clear if we can do that without the ability to undo it later): https://bugs.llvm.org/show_bug.cgi?id=37463 Differential Revision: https://reviews.llvm.org/D73575	2020-01-29 08:34:29 -05:00
Johannes Doerfert	76843ba37f	[Attributor][Fix] Initialize unused but loaded variable This hopefully un-breaks: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/38333	2020-01-28 23:52:16 -06:00
Johannes Doerfert	ea5fabe60c	[Attributor] Reuse existing logic to avoid duplication There was a TODO in AAValueConstantRangeArgument to reuse AAArgumentFromCallSiteArguments. We now do this by allowing new States to be built from the bestState.	2020-01-28 23:45:59 -06:00
Johannes Doerfert	224085409d	[Attributor][FIX] Treat invalidated attributes as changed If we invalidate an attribute we need to inform all dependent ones even if the fixpoint state is not invalid. Before we only continued invalidation if the fixpoint state was invalid, now we signal a change in case the fixpoint state is valid. The test case was already included in D71620 but the problem was hiding because it only manifested with the old PM (for that input).	2020-01-28 23:40:41 -06:00
Johannes Doerfert	53992c7bf7	[Attributor] Modularize AANoAliasCallSiteArgument to simplify extensions This patch modularizes the way we check for no-alias call site arguments by putting the existing logic into helper functions. The reasoning was not changed but special cases for readonly/readnone were added.	2020-01-28 23:39:29 -06:00
Johannes Doerfert	24ae77eebf	[Attributor] Mark a non-defined `null` pointer as `noalias` If `null` is not defined we cannot access it, hence the pointer is `noalias`. While this is not helpful on its own it simplifies later deductions that can skip over already known `noalias` pointers in certain situations.	2020-01-28 23:09:37 -06:00
Johannes Doerfert	6626d1b7c0	[Attributor][NFC] Remove ugly and unneeded cast	2020-01-28 22:54:31 -06:00
Johannes Doerfert	02bd8180fc	[Attributor][NFC] Improve debug messages	2020-01-28 22:53:19 -06:00
Johannes Doerfert	b6dbd0f71f	[Attributor][NFC] Internalize helper function	2020-01-28 22:50:34 -06:00
Vedant Kumar	8359511c62	[CodeExtractor] Remove stale llvm.assume calls from extracted region During extraction, stale llvm.assume handles may be retained in the original function. The setup is: 1) CodeExtractor unregisters assumptions in the blocks that are to be extracted. 2) Extraction happens. There are now two functions: f1 and f1.extracted. 3) Leftover assumptions in f1 (/not/ removed as they were not in the set of blocks to be extracted) now have affected-value llvm.assume handles in f1.extracted. When assumptions for a value used in f1 are looked up, ValueTracking can assert as some of the handles are in the wrong function. To fix this, simply erase the llvm.assume calls in the extracted function. Alternatives include flushing the assumption cache in the original function, or walking all values used in the original function to prune stale affected-value handles. Both seem more expensive. Testing: check-llvm, LNT run with -mllvm -hot-cold-split enabled rdar://58460728	2020-01-28 17:18:01 -08:00
Eli Friedman	2f6b9edfa8	[AliasAnalysis] Add missing FMRB_* enums. Previously, the enums didn't account for all the possible cases, which could cause misleading results (particularly for a "switch" on FunctionModRefBehavior). Fixes regression in polly from recent patch to add writeonly to memset. While I'm here, also fix a few dubious uses of the FMRB_* enum values. Differential Revision: https://reviews.llvm.org/D73154	2020-01-28 15:47:08 -08:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Whitney Tsang	cd0cff4392	[NFCI][LoopUnrollAndJam] Minor changes. Summary: 1. Add assertions. 2. Verify more analyses. These changes are moved out of https://reviews.llvm.org/D73129 to simplify that review. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto Reviewed By: dmgreen Subscribers: fhahn, hiraditya, zzheng, llvm-commits, prithayan, anhtuyen Tag: LLVM Differential Revision: https://reviews.llvm.org/D73204	2020-01-28 20:24:23 +00:00
Petr Hosek	127d3abf25	[Instrumentation] Set hidden visibility for the bias variable We have to avoid using a GOT relocation to access the bias variable; setting hidden visibility achieves that. Differential Revision: https://reviews.llvm.org/D73529	2020-01-28 12:07:03 -08:00
Sanjay Patel	7a717d82ff	[InstCombine] refactor foldVectorCmp(); NFC We can handle other patterns here as shown in PR44588.	2020-01-28 14:40:48 -05:00
Florian Hahn	5d0ffbeb4d	[Matrix] Mark expressions shared between multiple remarks. This patch adds support for explicitly highlighting sub-expressions shared by multiple leaf nodes. For example consider the following code %shared.load = tail call <8 x double> @llvm.matrix.columnwise.load.v8f64.p0f64(double* %arg1, i32 %stride, i32 2, i32 4), !dbg !10, !noalias !10 %trans = tail call <8 x double> @llvm.matrix.transpose.v8f64(<8 x double> %shared.load, i32 2, i32 4), !dbg !10 tail call void @llvm.matrix.columnwise.store.v8f64.p0f64(<8 x double> %trans, double* %arg3, i32 10, i32 4, i32 2), !dbg !10 %load.2 = tail call <30 x double> @llvm.matrix.columnwise.load.v30f64.p0f64(double* %arg3, i32 %stride, i32 2, i32 15), !dbg !10, !noalias !10 %mult = tail call <60 x double> @llvm.matrix.multiply.v60f64.v8f64.v30f64(<8 x double> %trans, <30 x double> %load.2, i32 4, i32 2, i32 15), !dbg !11 tail call void @llvm.matrix.columnwise.store.v60f64.p0f64(<60 x double> %mult, double* %arg2, i32 10, i32 4, i32 15), !dbg !11 We have two leaf nodes (the 2 stores) and the first store stores %trans which is also used by the matrix multiply %mult. We generate separate remarks for each leaf (stores). To denote that parts are shared, the shared expressions are marked as shared (), with a reference to the other remark that shares it. The operation summary also denotes the shared operations separately. Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D72526	2020-01-28 09:27:55 -08:00
Florian Hahn	d1f849a284	[LV] Hoist code to mark conditional assumes as dead to caller (NFC). This is a follow-up suggested in D73423. It is sufficient to just add the conditional assumes to DeadInstructions once.	2020-01-28 08:50:44 -08:00
Michael Liao	9c54b42338	Fix warning of `-Wcast-qual`. NFC.	2020-01-28 11:36:43 -05:00
Florian Hahn	a911fef3dd	[LV] Do not try to sink dead instructions. Dead instructions do not need to be sunk. Currently we try to record the recipes for them, but there are no recipes emitted for them and there's nothing to sink. They can be removed from SinkAfter while marking them for recording. Fixes PR44634. Reviewers: rengolin, hsaito, fhahn, Ayal, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D73423	2020-01-28 08:28:03 -08:00
Whitney Tsang	78dc64989c	[CodeMoverUtils] Improve IsControlFlowEquivalent. Summary: Currently IsControlFlowEquivalent determines if two blocks are control flow equivalent by checking if A dominates B and B post dominates A. There exist blocks that are control flow equivalent even if they don't satisfy the A dominates B and B post dominates A condition. For example, if (cond) A if (cond) B In the PR, we determine if two blocks are control flow equivalent by also checking if the two sets of conditions A and B depend on are equivalent. Reviewer: jdoerfert, Meinersbur, dmgreen, etiotto, bmahjour, fhahn, hfinkel, kbarton Reviewed By: fhahn Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D71578	2020-01-28 14:18:00 +00:00
Florian Hahn	62e228f8fd	[Matrix] Add info about number of operations to remarks. This patch updates the remark to also include a summary of the number of vector operations generated for each matrix expression. Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D72480	2020-01-27 17:43:39 -08:00
Florian Hahn	949294f396	[Matrix] Add optimization remarks for matrix expression. Generate remarks for matrix operations in a function. To generate remarks for matrix expressions, the following approach is used: 1. Collect leafs of matrix expressions (done in RemarkGenerator::getExpressionLeafs). Leafs are lowered matrix instructions without other matrix users (like stores). 2. For each leaf, create a remark containing a linearized version of the matrix expression. The following improvements will be submitted as follow-ups: * Summarize number of vector instructions generated for each expression. * Account for shared sub-expressions. * Propagate matrix remarks up the inlining chain. The information provided by the matrix remarks helps users to spot cases where matrix expressions got split up, e.g. due to inlining not happening. The remarks allow users to address those issues, ensuring best performance. Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D72453	2020-01-27 16:39:29 -08:00
Sanjay Patel	747242af8d	[InstCombine] allow more narrowing of casted select D47163 created a rule that we should not change the casted type of a select when we have matching types in its compare condition. That was intended to help vector codegen, but it also could create situations where we miss subsequent folds as shown in PR44545: https://bugs.llvm.org/show_bug.cgi?id=44545 By using shouldChangeType(), we can continue to get the vector folds (because we always return false for vector types). But we also solve the motivating bug because it's ok to narrow the scalar select in that example. Our canonicalization rules around select are a mess, but AFAICT, this will not induce any infinite looping from the reverse transform (but we'll need to watch for that possibility if committed). Side note: there's a similar use of shouldChangeType() for phi ops just below this diff, and the source and destination types appear to be reversed. Differential Revision: https://reviews.llvm.org/D72733	2020-01-27 16:35:50 -05:00
Sanjay Patel	242fed9d7f	[InstCombine] convert fsub nsz with fneg operand to -(X + Y) This was noted in D72521 - we need to match fneg specifically to consistently handle that pattern along with (-0.0 - X).	2020-01-27 14:49:15 -05:00
Nikita Popov	bcfa0f592f	[InstCombine] Move negation handling into freelyNegateValue() Followup to D72978. This moves existing negation handling in InstCombine into freelyNegateValue(), which makes it composable. In particular, root negations of div/zext/sext/ashr/lshr/sub can now always be performed through a shl/trunc as well. Differential Revision: https://reviews.llvm.org/D73288	2020-01-27 20:46:23 +01:00
Teresa Johnson	2f63d549f1	Restore "[LTO/WPD] Enable aggressive WPD under LTO option" This restores `59733525d3` (D71913), along with bot fix `19c76989bb`. The bot failure should be fixed by D73418, committed as `af954e441a`. I also added a fix for non-x86 bot failures by requiring x86 in new test lld/test/ELF/lto/devirt_vcall_vis_public.ll.	2020-01-27 07:55:05 -08:00
Whitney Tsang	2b335e9aae	[LoopUnroll] Remove remapInstruction(). Summary: LoopUnroll can reuse the RemapInstruction() in ValueMapper, or remapInstructionsInBlocks() in CloneFunction, depending on the needs. There is no need to have its own version in LoopUnroll. By calling RemapInstruction() without TypeMapper or Materializer and with Flags (RF_NoModuleLevelChanges | RF_IgnoreMissingLocals), it does the same as remapInstruction(). remapInstructionsInBlocks() calls RemapInstruction() exactly as described. Looking at the history, I cannot find any obvious reason to have its own version. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto, foad, aprantl Reviewed By: jdoerfert Subscribers: hiraditya, zzheng, llvm-commits, prithayan, anhtuyen Tag: LLVM Differential Revision: https://reviews.llvm.org/D73277	2020-01-27 15:42:13 +00:00
Guillaume Chatelet	07c9d53266	[Alignment][NFC] Use Align with CreateAlignedLoad Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73449	2020-01-27 10:58:36 +01:00
Guillaume Chatelet	d0a7cc7177	[Alignment][NFC] Use Align with CreateMaskedScatter/Gather Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 This patch shows that CreateMaskedScatter/CreateMaskedGather can only take positive non zero alignment values. Reviewers: courbet Subscribers: hiraditya, llvm-commits, delena Tags: #llvm Differential Revision: https://reviews.llvm.org/D73361	2020-01-27 10:17:14 +01:00
Evgenii Stepanov	1df8549b26	[msan] Instrument x86.pclmulqdq* intrinsics. Summary: These instructions ignore parts of the input vectors which makes the default MSan handling too strict and causes false positive reports. Reviewers: vitalybuka, RKSimon, thakis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73374	2020-01-24 14:31:06 -08:00
Andy Kaylor	b35b7da460	[PGO] Attach appropriate funclet operand bundles to value profiling instrumentation calls Patch by Chris Chrulski When generating value profiling instrumentation, ensure the call gets the correct funclet token, otherwise WinEHPrepare will turn the call (and all subsequent instructions) into unreachable. Differential Revision: https://reviews.llvm.org/D73221	2020-01-24 11:20:53 -08:00
Simon Pilgrim	abd1927d44	Fix some comment typos. NFC.	2020-01-24 18:18:42 +00:00
Alina Sbirlea	0d90d2457c	[LoopStrengthReduce] Teach LoopStrengthReduce to preserve MemorySSA if available.	2020-01-24 10:13:52 -08:00
Andy Kaylor	a33accde95	[PGO] Early detection regarding whether pgo counter promotion is possible Patch by Chris Chrulski This fixes a problem with the current behavior when assertions are enabled. A loop that exits to a catchswitch instruction is skipped for the counter promotion, however this check was being done after the PGOCounterPromoter tried to collect an insertion point for the exit block. A call to getFirstInsertionPt() on a block that begins with a catchswitch instruction triggers an assertion. This change performs a check whether the counter promotion is possible prior to collecting the ExitBlocks and InsertPts. Differential Revision: https://reviews.llvm.org/D73222	2020-01-24 09:55:41 -08:00
Guillaume Chatelet	805c157e8a	[Alignment][NFC] Deprecate Align::None() Summary: This is a follow up on https://reviews.llvm.org/D71473#inline-647262. There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case. Reviewers: xbolva00, courbet, bollu Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73099	2020-01-24 12:53:58 +01:00
Evgeny Leviant	8973fae195	[WPD] Allow load/save bitcoded index when running opt -wholeprogramdevirt Differential revision: https://reviews.llvm.org/D73094	2020-01-24 00:31:39 -08:00
Andy Kaylor	c467faf23c	[WinEH] Ignore lifetime.end PHI nodes in empty cleanuppads This fixes a bug where a PHI node that is only referenced by a lifetime.end intrinsic in an otherwise empty cleanuppad can cause SimplifyCFG to create an SSA violation while removing the empty cleanuppad. Theoretically the same problem can occur with debug intrinsics. Differential Revision: https://reviews.llvm.org/D72540	2020-01-23 18:18:50 -08:00
Teresa Johnson	90e630a95e	Revert "[LTO/WPD] Enable aggressive WPD under LTO option" This reverts commit `59733525d3`. There is a windows sanitizer bot failure in one of the cfi tests that I will need some time to figure out: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio	2020-01-23 17:29:24 -08:00
Johannes Doerfert	7ad17e008b	[Attributor] Avoid REQUIRED dependences in favor of OPTIONAL ones When we use information only to short-cut deduction or improve it, we can use OPTIONAL dependences instead of REQUIRED ones to avoid cascading pessimistic fixpoints. We also need to track dependences only when we use assumed information, e.g., we act on assumed liveness information.	2020-01-23 18:42:46 -06:00
Johannes Doerfert	214ed3f676	[Attributor] Record dependences only when necessary If we use assumed information from AAValueSimplify we need to record an OPTIONAL dependence, otherwise we do not.	2020-01-23 18:42:45 -06:00
Johannes Doerfert	5429c82db2	[Attributor][FIX] Avoid dangling pointers during code deletion It can happen that we have instructions in the ToBeDeletedInsts set which are deleted earlier already. To avoid dangling pointers we use weak tracking handles.	2020-01-23 18:42:45 -06:00
Johannes Doerfert	ff6254dc26	[Attributor][FIX] Handle non-pointers when following uses When we follow uses, e.g., in AAMemoryBehavior or AANoCapture, we need to make sure the value is a pointer before we ask for abstract attributes only valid for pointers. This happens because we follow pointers through calls that do not capture but may return the value.	2020-01-23 18:42:45 -06:00
Johannes Doerfert	9dcf889d15	[Attributor][NFC] Do not (try to) simplify void values We might accidentally ask AAValueSimplify to simplify a void value. That can lead to very interesting, and very wrong, results. We now handle this case gracefully.	2020-01-23 18:42:45 -06:00
Alina Sbirlea	1d09174290	[LoopStrengthReduce] Reuse utility method to clean dead instructions. [NFCI] Create a utility wrapper for the RecursivelyDeleteTriviallyDeadInstructions utility method, which sets to nullptr the instructions that are not trivially dead. Use the new method in LoopStrengthReduce. Alternative: add a bool to the same method; this option adds a marginal amount of overhead to the other callers, and the method needs to be updated to return a bool status when it removes/doesn't remove instructions.	2020-01-23 16:27:32 -08:00
Johannes Doerfert	30179d7ecf	[Attributor][FIX][Alignment] Do not report a change if there was none If alignment was manifested but it is actually only as good as the data-layout provided one we should not report it as a change. For testing purposes we still manifest the information.	2020-01-23 18:13:52 -06:00
Johannes Doerfert	e273ac4d88	[Attributor][NFC] Add an assertion	2020-01-23 18:13:52 -06:00
Johannes Doerfert	d07b5a5525	[Attributor][NFC] Fix spelling	2020-01-23 18:13:52 -06:00
Johannes Doerfert	2baf000ecc	[Attributor] `byval` arguments are always `noalias` `byval` introduces a local copy of the argument. That copy cannot alias anything.	2020-01-23 18:13:52 -06:00
Johannes Doerfert	30ae859c69	[Attributor][FIX] Store alignment only holds for the pointer value We accidentally used the store alignment for the value operand as well, which is incorrect and crashed the SPASS application in the test suite.	2020-01-23 18:13:52 -06:00
Teresa Johnson	59733525d3	[LTO/WPD] Enable aggressive WPD under LTO option Summary: Third part in series to support Safe Whole Program Devirtualization Enablement, see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html This patch adds type test metadata under -fwhole-program-vtables, even for classes without hidden visibility. It then changes WPD to skip devirtualization for a virtual function call when any of the compatible vtables has public vcall visibility. Additionally, internal LLVM options as well as lld and gold-plugin options are added which enable upgrading all public vcall visibility to linkage unit (hidden) visibility during LTO. This enables the more aggressive WPD to kick in based on LTO time knowledge of the visibility guarantees. Support was added to all flavors of LTO WPD (regular, hybrid and index-only), and to both the new and old LTO APIs. Unfortunately it was not simple to split the first and second parts of this part of the change (the unconditional emission of type tests and the upgrading of the vcall visibility) as I needed a way to upgrade the public visibility on legacy WPD llvm assembly tests that don't include linkage unit vcall visibility specifiers, to avoid a lot of test churn. I also added a mechanism to LowerTypeTests that allows dropping type test assume sequences we now aggressively insert when we invoke distributed ThinLTO backends with null indexes, which is used in testing mode, and which doesn't invoke the normal ThinLTO backend pipeline. Depends on D71907 and D71911. Reviewers: pcc, evgeny777, steven_wu, espindola Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71913	2020-01-23 16:09:44 -08:00
Alina Sbirlea	9e66c4ec12	[Utils] Use WeakTrackingVH in vector used as scratch storage. The utility method RecursivelyDeleteTriviallyDeadInstructions receives as input a vector of Instructions, where all inputs are valid instructions. This same vector is used as a scratch storage (per the header comment) to recursively delete instructions. If an instruction is added as an operand of multiple other instructions, it may be added twice, then deleted once, then the second reference in the vector is invalid. Switch to using a Vector<WeakTrackingVH>. This change facilitates a clean-up in LoopStrengthReduction.	2020-01-23 16:04:57 -08:00
Florian Hahn	4ed7355e44	[IPSCCP] Use ParamState for arguments at call sites. We currently use integer ranges to merge concrete function arguments. We use the ParamState range for those, but we only look up concrete values in the regular state. For concrete function arguments that are themselves arguments of the containing function, we can use the param state directly and improve the precision in some cases. Besides improving the results in some cases, this is also a small step towards switching to ValueLatticeElement, by allowing D60582 to be a NFC. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71836	2020-01-23 13:55:42 -08:00
Teresa Johnson	458676db6e	[WPD/VFE] Always emit vcall_visibility metadata for -fwhole-program-vtables Summary: First patch to support Safe Whole Program Devirtualization Enablement, see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html Always emit !vcall_visibility metadata under -fwhole-program-vtables, and not just for -fvirtual-function-elimination. The vcall visibility metadata will (in a subsequent patch) be used to communicate to WPD which vtables are safe to devirtualize, and we will optionally convert the metadata to hidden visibility at link time. Subsequent follow on patches will help enable this by adding vcall_visibility metadata to the ThinLTO summaries, and always emit type test intrinsics under -fwhole-program-vtables (and not just for vtables with hidden visibility). In order to do this safely with VFE, since for VFE all vtable loads must be type checked loads which will no longer be the case, this patch adds a new "Virtual Function Elim" module flag to communicate to GlobalDCE whether to perform VFE using the vcall_visibility metadata. One additional advantage of using the vcall_visibility metadata to drive more WPD at LTO link time is that we can use the same mechanism to enable more aggressive VFE at LTO link time as well. The link time option proposed in the RFC will convert vcall_visibility metadata to hidden (aka linkage unit visibility), which combined with -fvirtual-function-elimination will allow it to be done more aggressively at LTO link time under the same conditions. Reviewers: pcc, ostannard, evgeny777, steven_wu Subscribers: mehdi_amini, Prazek, hiraditya, dexonsmith, davidxl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71907	2020-01-23 11:36:01 -08:00
Alina Sbirlea	6770de9b8d	[LoopIdiomRecognize] Teach LoopIdiomRecognize to preserve MemorySSA.	2020-01-23 11:31:12 -08:00
Alina Sbirlea	a0f627d584	[IndVarSimplify] Fix for MemorySSA preserve.	2020-01-23 11:06:16 -08:00
Justin Bogner	b81a337be7	[LoopUnroll] Avoid UB when converting from WeakVH to `Value *` Calling `operator*` on a WeakVH with a null value yields a null reference, which is UB. Avoid this by implicitly converting the WeakVH to a `Value *` rather than dereferencing and then taking the address for the type conversion. Differential Revision: https://reviews.llvm.org/D73280	2020-01-23 10:36:39 -08:00
Guillaume Chatelet	59f95222d4	[Alignment][NFC] Use Align with CreateAlignedStore Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73274	2020-01-23 17:34:32 +01:00
Kazu Hirata	41784bed01	Revert "Resubmit: [JumpThreading] Thread jumps through two basic blocks" This reverts commit `53b68e676f`. Our internal tests are showing breakage with this patch.	2020-01-23 06:34:03 -08:00
Fedor Sergeev	2f6987ba61	[LoopRotate] add ability to repeat loop rotation until non-deoptimizing exit is found In case of loops with multiple exits where all-but-one exit are deoptimizing it might happen that the first rotation will end up with latch having a deoptimizing exit. This makes the loop unsuitable for trip-count analysis (say, getLoopEstimatedTripCount) as well as for loop transformations that know how to handle multiple deoptimizing exits. It pretty much means that canonical form in multiple-deoptimizing-exits case should be with non-deoptimizing exit at latch. Teach loop-rotation to reach this canonical form by repeating rotation. -loop-rotate-multi option introduced to control this behavior, currently disabled by default. Reviewers: skatkov, asbirlea, reames, fhahn Reviewed By: skatkov Tags: #llvm Differential Revision: https://reviews.llvm.org/D73058	2020-01-23 15:56:24 +03:00
Guillaume Chatelet	279fa8e006	[Alignement][NFC] Deprecate untyped CreateAlignedLoad Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73260	2020-01-23 13:34:32 +01:00
Daniil Suchkov	4a8dbc617d	[SSAUpdater] Don't call ValueIsRAUWd upon single use replacement It is incorrect to call ValueHandleBase::ValueIsRAUWd when only one use is replaced since it simply violates semantics of the callback and leads to bugs like PR44320. Previously this call was used specifically to keep LICM's cache of AliasSetTrackers up to date across passes (as PR36801 showed, even for that purpose it didn't work properly), but since LICM doesn't have that cache anymore, we can safely remove this incorrect call with no repercussions. This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44320 Reviewers: asbirlea, fhahn, efriedma, reames Reviewed-By: asbirlea Differential Revision: https://reviews.llvm.org/D73089	2020-01-23 15:53:53 +07:00
Daniil Suchkov	6fc9e60149	NFC. Remove obsolete SimpleAnalysis infrastructure Apparently cache of AliasSetTrackers held by LICM was the only user of SimpleAnalysis infrastructure. Now, given that we no longer have that cache, this infrastructure is obsolete and, taking into account its nature, we don't want any new solutions to be based on it. Reviewers: asbirlea, fhahn, efriedma, reames Reviewed-By: asbirlea Differential Revision: https://reviews.llvm.org/D73085	2020-01-23 13:58:30 +07:00
Daniil Suchkov	53a28bd891	[LICM] NFC. Remove AST caching infrastructure Since LICM doesn't use AST caching any more (see D73081), this infrastructure is now obsolete and we can remove it. Reviewers: asbirlea, fhahn, efriedma, reames Reviewed-By: asbirlea Differential Revision: https://reviews.llvm.org/D73084	2020-01-23 12:33:50 +07:00
Florian Hahn	f14f2a8568	[LV] Fix predication for branches with matching true and false succs. Currently due to the edge caching, we create wrong predicates for branches with matching true and false successors. We will cache the condition for the edge from the true successor, and then lookup the same edge (src and dst are the same) for the edge to the false successor. If both successors match, the condition should always be true. At the moment, we cannot really create constant VPValues, but we can just create a true condition as X \| !X. Later passes will clean that up. Fixes PR44488. Reviewers: rengolin, hsaito, fhahn, Ayal, dorit, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D73079	2020-01-22 18:34:11 -08:00
Jonas Devlieghere	cf2b498d28	[llvm/Transforms] Fix warning: private field 'MSSA' is not used	2020-01-22 18:07:53 -08:00
Alina Sbirlea	adc4faf532	[IndVarSimplify] Teach IndVarSimplify to preserve MemorySSA.	2020-01-22 16:33:17 -08:00
Alina Sbirlea	b5b6126d97	[IndVarSimplify] Cleanup spaces and reduce variable scope [NFCI] Minor clean-ups + clang-format.	2020-01-22 15:32:20 -08:00
Alina Sbirlea	6baf31b7c1	[LoopIdiomRecognize] Reduce variable scope. [NFCI]	2020-01-22 15:30:08 -08:00
Nikita Popov	0b83c5a78f	[InstCombine] Combine neg of shl of sub (PR44529) Fixes https://bugs.llvm.org/show_bug.cgi?id=44529. We already have a combine to sink a negation through a left-shift, but it currently only works if the shift operand is negatable without creating any instructions. This patch introduces freelyNegateValue() as a more powerful extension of dyn_castNegVal(), which allows negating a value as long as this doesn't end up increasing instruction count. Specifically, this patch adds support for negating A-B to B-A. This mechanism could in the future be extended to handle general negation chains that a) start at a proper 0-X negation and b) only require one operand to be freely negatable. This would end up as a weaker form of D68408 aimed at the most obviously profitable subset that eliminates a negation entirely. Differential Revision: https://reviews.llvm.org/D72978	2020-01-22 23:03:58 +01:00
Nikita Popov	efba7ed05e	[PatternMatch] Make m_c_ICmp swap the predicate (PR42801) This addresses https://bugs.llvm.org/show_bug.cgi?id=42801. The m_c_ICmp() matcher is changed to provide the swapped predicate if the operands are swapped. Existing uses of m_c_ICmp() fall in one of two categories: Working on equality predicates only, where swapping is irrelevant. Or performing a manual swap, in which case this patch removes it. The only exception is the foldICmpWithLowBitMaskedVal() fold, which does not swap the predicate, and instead reasons about whether a swap occurred or not for each predicate. Getting the swapped predicate allows us to merge the logic for pairs of predicates, instead of duplicating it. Differential Revision: https://reviews.llvm.org/D72976	2020-01-22 22:56:26 +01:00
Alina Sbirlea	efb130fc93	[LoopDeletion] Teach LoopDeletion to preserve MemorySSA if available. If MemorySSA analysis is available, LoopDeletion now preserves it.	2020-01-22 11:38:38 -08:00
Sanjay Patel	0ade2abdb0	[InstCombine] fneg(X + C) --> -C - X This is 1 of the potential folds uncovered by extending D72521. We don't seem to do this in the backend either (unless I'm not seeing some target-specific transform). icc and gcc (appears to be target-specific) do this transform. Differential Revision: https://reviews.llvm.org/D73057	2020-01-22 09:48:43 -05:00
Guillaume Chatelet	0957233320	[Alignment][NFC] Use Align with CreateMaskedStore Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73106	2020-01-22 11:04:39 +01:00
Daniil Suchkov	7bdc83f340	[LICM] Don't cache AliasSetTrackers when run under legacy PM Summary: This is the first step towards complete removal of AST caching from LICM. Attempts to keep LICM's AST cache up to date across passes can lead to miscompiles like this one: https://bugs.llvm.org/show_bug.cgi?id=44320. LICM has already switched to using MemorySSA to do sinking and hoisting and only builds an AliasSetTracker on demand for the promoteToScalars step, without caching it from one LICM instance to the next. Given this, we don't have compile-time reasons to keep AST caching any more. The only scenario where the caching would be used currently is when using the LegacyPassManager and setting -enable-mssa-loop-dependency=false. This switch should help us to surface any possible issues that may arise along this way, also it turns subsequent removal of AST caching into NFC. Reviewers: asbirlea, fhahn, efriedma, reames Reviewed By: asbirlea Subscribers: hiraditya, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73081	2020-01-22 13:16:45 +07:00
Andrei Elovikov	e1d6d36852	[SLP] Don't allow Div/Rem as alternate opcodes Summary: We can't control or verify what the RHS of the division will be, so it might happen to be zero, causing UB. Reviewers: Vasilis, RKSimon, ABataev Reviewed By: ABataev Subscribers: vporpo, ABataev, hiraditya, llvm-commits, vdmitrie Tags: #llvm Differential Revision: https://reviews.llvm.org/D72740	2020-01-21 15:21:17 -08:00
Florian Hahn	f42994f228	[Matrix] Hide and describe matrix-propagate-shape option.	2020-01-21 14:28:47 -08:00
Guillaume Chatelet	bc8a1ab26f	[Alignment][NFC] Use Align with CreateMaskedLoad Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73087	2020-01-21 14:13:22 +01:00
Sanjay Patel	7bee94410c	[InstCombine] form copysign from select of FP constants (PR44153) This should be the last step needed to solve the problem in the description of PR44153: https://bugs.llvm.org/show_bug.cgi?id=44153 If we're casting an FP value to int, testing its signbit, and then choosing between a value and its negated value, that's a complicated way of saying "copysign": (bitcast X) < 0 ? -TC : TC --> copysign(TC, X) Differential Revision: https://reviews.llvm.org/D72643	2020-01-20 10:51:14 -05:00
Guillaume Chatelet	46b9563cf6	[Alignment][NFC] Use Align with CreateElementUnorderedAtomicMemCpy Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, nicolasvasilache Subscribers: hiraditya, jfb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73041	2020-01-20 15:39:45 +01:00
Evgeniy Brevnov	af7e158872	[LV] Vectorizer should adjust trip count in profile information Summary: Vectorized loop processes VFxUF number of elements in one iteration thus total number of iterations decreases proportionally. In addition epilog loop may not have more than VFxUF - 1 iterations. This patch updates profile information accordingly. Reviewers: hsaito, Ayal, fhahn, reames, silvas, dcaballe, SjoerdMeijer, mkuper, DaniilSuchkov Reviewed By: Ayal, DaniilSuchkov Subscribers: fedor.sergeev, hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67905	2020-01-20 18:36:28 +07:00
Evgeniy Brevnov	cfe97681cd	[NFC][LoopUtils] Minor change in comment according to review D71990.	2020-01-20 17:10:10 +07:00
Evgeniy Brevnov	10357e1c89	[LoopUtils] Better accuracy for getLoopEstimatedTripCount. Summary: Current implementation of getLoopEstimatedTripCount returns 1 iteration less than it should. The reason is that in bottom tested loop first iteration is executed before first back branch is taken. For example for loop with !{!"branch_weights", i32 1 // taken, i32 1 // exit} metadata getLoopEstimatedTripCount gives 1 while actual number of iterations is 2. Reviewers: Ayal, fhahn Reviewed By: Ayal Subscribers: mgorny, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71990	2020-01-20 16:58:07 +07:00
Sjoerd Meijer	93175a5caa	[IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFCI This moves `rewriteLoopExitValues()` from IndVarSimplify to LoopUtils thus making it a generic loop utility function. This allows to rewrite loop exit values by just calling this function without running the whole IndVarSimplify pass. We use this in D72714 to rematerialise the iteration count in exit blocks, so that we can clean-up loop update expressions inside the hardware-loops later. Differential Revision: https://reviews.llvm.org/D72602	2020-01-20 09:05:00 +00:00
Matt Arsenault	a4451d88ee	Consolidate internal denormal flushing controls Currently there are 4 different mechanisms for controlling denormal flushing behavior, and about as many equivalent frontend controls. - AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features - NVPTX uses the nvptx-f32ftz attribute - ARM directly uses the denormal-fp-math attribute - Other targets indirectly use denormal-fp-math in one DAGCombine - cl-denorms-are-zero has a corresponding denorms-are-zero attribute AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as I can tell the same is true for NVPTX (based on the attribute name). Work on consolidating these into the denormal-fp-math attribute, and a new type specific denormal-fp-math-f32 variant. Only ARM seems to support the two different flush modes, so this is overkill for the other use cases. Ideally we would error on the unsupported positive-zero mode on other targets from somewhere. Move the logic for selecting the flush mode into the compiler driver, instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32 are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as a user flag. -cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and -fno-cuda-flush-denormals-to-zero will be mapped to -fp-denormal-math-f32=ieee or preserve-sign rather than the old attributes. Stop emitting the denorms-are-zero attribute for the OpenCL flag. It has no in-tree users. The meaning would also be target dependent, such as the AMDGPU choice to treat this as only meaning allow flushing of f32 and not f16 or f64. The naming is also potentially confusing, since DAZ in other contexts refers to instructions implicitly treating input denormals as zero, not necessarily flushing output denormals to zero. This also does not attempt to change the behavior for the current attribute. The LangRef now states that the default is ieee behavior, but this is inaccurate for the current implementation. The clang handling is slightly hacky to avoid touching the existing denormal-fp-math uses. Fixing this will be left for a future patch. AMDGPU is still using the subtarget feature to control the denormal mode, but the new attributes are now emitted. A future change will switch this and remove the subtarget features.	2020-01-17 20:09:53 -05:00
Alina Sbirlea	9f6c6ee6b9	[MemDepAnalysis/VNCoercion] Move static method to its only use. [NFCI] Static method MemoryDependenceResults::getLoadLoadClobberFullWidthSize does not have or use any info specific to MemoryDependenceResults. Move it to its only user: VNCoercion.	2020-01-17 15:18:42 -08:00
Petr Hosek	d3db13af7e	[profile] Support counter relocation at runtime This is an alternative to the continuous mode that was implemented in D68351. This mode relies on padding and the ability to mmap a file over the existing mapping which is generally only available on POSIX systems and isn't suitable for other platforms. This change instead introduces the ability to relocate counters at runtime using a level of indirection. On every counter access, we add a bias to the counter address. This bias is stored in a symbol that's provided by the profile runtime and is initially set to zero, meaning no relocation. The runtime can mmap the profile into memory at an arbitrary location, and set bias to the offset between the original and the new counter location, at which point every subsequent counter access will be to the new location, which allows updating profile directly akin to the continuous mode. The advantage of this implementation is that it doesn't require any special OS support. The disadvantage is the extra overhead due to additional instructions required for each counter access (overhead both in terms of binary size and performance) plus duplication of counters (i.e. one copy in the binary itself and another copy that's mmapped). Differential Revision: https://reviews.llvm.org/D69740	2020-01-17 15:02:23 -08:00
Peter Collingbourne	cd40bd0a32	hwasan: Move .note.hwasan.globals note to hwasan.module_ctor comdat. As of D70146 lld GCs comdats as a group and no longer considers notes in comdats to be GC roots, so we need to move the note to a comdat with a GC root section (.init_array) in order to prevent lld from discarding the note. Differential Revision: https://reviews.llvm.org/D72936	2020-01-17 13:40:52 -08:00
Drew Wock	0bcfafc5e7	[SeparateConstOffsetFromGEP] Fix: sext(a) + sext(b) -> sext(a + b) matches add and sub instructions with one another During the SeparateConstOffsetFromGEP pass, signed extensions are distributed to the values that feed into them and then later recombined. The recombination stage is somewhat problematic- it doesn't distinguish add and sub instructions from one another when matching the sext(a) +/- sext(b) -> sext(a +/- b) pattern in some instances. An example- the IR contains: %unextendedA %unextendedB %subuAuB = unextendedA - unextendedB %extA = extend A %extB = extend B %addeAeB = extA + extB The problematic optimization will transform that into: %unextendedA %unextendedB %subuAuB = unextendedA - unextendedB %extA = extend A %extB = extend B %addeAeB = extend subuAuB ; Obviously not semantically equivalent to the IR input. This patch fixes that. Patch by Drew Wock <drew.wock@sas.com> Differential Revision: https://reviews.llvm.org/D65967	2020-01-17 12:22:52 -05:00
Nikita Popov	522c030aa9	[InstCombine] Fix worklist management in DSE (PR44552) Fixes https://bugs.llvm.org/show_bug.cgi?id=44552. We need to make sure that the store is reprocessed, because performing DSE may expose more DSE opportunities. There is a slight caveat here though: We need to make sure that we add back the store to the worklist first, because that means it will be processed after the operands of the removed store have been processed. This is a general bug in InstCombine worklist management that I hope to address at some point, but for now it means we need to do this manually rather than just returning the instruction as changed. Differential Revision: https://reviews.llvm.org/D72807	2020-01-17 18:10:56 +01:00
Nikita Popov	77befe54f7	[InstCombine] Fix worklist management in return combine There are two related bugs here: First, we don't add the operand we're replacing to the worklist, which means it may not get DCEd (see test change). Second, usually this would just get picked up in the next iteration, but we also do not report the instruction as changed. This means that we do not get that extra instcombine iteration, and more importantly, may break the pass pipeline, as the function is not marked as changed. Differential Revision: https://reviews.llvm.org/D72864	2020-01-17 17:59:23 +01:00
Nikita Popov	2ca092f320	[InstCombine] Support disabling expensive combines in opt Currently, there is no way to disable ExpensiveCombines when doing a standalone opt -instcombine run, as that's the default, and the opt option can currently only be used to force enable, not to force disable. The only way to disable expensive combines is via -O1 or -O2, but that of course also runs the rest of the kitchen sink... This patch allows using opt -instcombine -expensive-combines=0 to run InstCombine without ExpensiveCombines. Differential Revision: https://reviews.llvm.org/D72861	2020-01-17 17:56:20 +01:00
Matt Arsenault	3ef8cdf666	AMDGPU: Do permlane16 vdst_in discard optimization in InstCombine There's more potential value to discarding the source value earlier, since we always know the value of the fi/bc bits.	2020-01-16 17:27:53 -05:00
Kazu Hirata	53b68e676f	Resubmit: [JumpThreading] Thread jumps through two basic blocks This reverts commit `2d258ed931`. This revision fixes the Windows build and adds a testcase for it, namely thread-two-bbs3.ll. My original patch improperly copied EH pads on Windows. This patch disregards jump threading opportunities having to do with EH pads. [JumpThreading] Thread jumps through two basic blocks Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like: bb3: %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have: bb3: %var = phi i32* [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb3.dup: %var = phi i32* [ null, %bb1 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247	2020-01-16 12:33:37 -08:00
Arkady Shlykov	c87982b467	Revert "[Loop Peeling] Add possibility to enable peeling on loop nests." This reverts commit `3f3017e` because there's a failure on peel-loop-nests.ll with LLVM_ENABLE_EXPENSIVE_CHECKS on. Differential Revision: https://reviews.llvm.org/D70304	2020-01-16 10:33:38 -08:00
Fedor Sergeev	3478551bf3	[GVN] introduce GVNOptions to control GVN pass behavior There are a few global (cl::opt) controls that enable optional behavior in GVN. Introduce GVNOptions that provide corresponding per-pass instance controls. That will allow to use GVN multiple times in pipeline each time with different settings. Reviewers: asbirlea, rnk, reames, skatkov, fhahn Reviewed By: fhahn Tags: #llvm Differential Revision: https://reviews.llvm.org/D72732	2020-01-16 20:21:08 +03:00
Mircea Trofin	7acfda633f	[llvm] Make new pass manager's OptimizationLevel a class Summary: The old pass manager separated speed optimization and size optimization levels into two unsigned values. Coalescing both in an enum in the new pass manager may lead to unintentional casts and comparisons. In particular, taking a look at how the loop unroll passes were constructed previously, the Os/Oz are now (==new pass manager) treated just like O3, likely unintentionally. This change disallows raw comparisons between optimization levels, to avoid such unintended effects. As an effect, the O{s\|z} behavior changes for loop unrolling and loop unroll and jam, matching O2 rather than O3. The change also parameterizes the threshold values used for loop unrolling, primarily to aid testing. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72547	2020-01-16 09:00:56 -08:00
Francesco Petrogalli	66c120f025	[VectorUtils] Rework the Vector Function Database (VFDatabase). Summary: This commit is a rework of the patch in https://reviews.llvm.org/D67572. The rework was requested to prevent out-of-tree performance regression when vectorizing out-of-tree IR intrinsics. The vectorization of such intrinsics is enquired via the static function `isTLIScalarize`. For detail see the discussion in https://reviews.llvm.org/D67572. Reviewers: uabelho, fhahn, sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72734	2020-01-16 15:08:26 +00:00
Simon Pilgrim	23a887b0dd	Fix unused variable warning. NFCI.	2020-01-16 13:02:40 +00:00
Florian Hahn	23c113802e	[LV] Allow assume calls in predicated blocks. The assume intrinsic is intentionally marked as may reading/writing memory, to avoid passes moving them around. When flattening the CFG for predicated blocks, we have to drop the assume calls, as they are control-flow dependent. There are some cases where we can do better (when control flow is preserved), but that is follow-up work. Fixes PR43620. Reviewers: hsaito, rengolin, dcaballe, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D68814	2020-01-16 10:11:35 +00:00
Sameer Sahasrabuddhe	ed181efa17	[HIP][AMDGPU] expand printf when compiling HIP to AMDGPU Summary: This change implements the expansion in two parts: - Add a utility function emitAMDGPUPrintfCall() in LLVM. - Invoke the above function from Clang CodeGen, when processing a HIP program for the AMDGPU target. The printf expansion has undefined behaviour if the format string is not a compile-time constant. As a sufficient condition, the HIP ToolChain now emits -Werror=format-nonliteral. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D71365	2020-01-16 15:15:38 +05:30
Vedant Kumar	360abb7ee5	[CodeExtractor] Transfer debug info to extracted function After extracting, fix up debug info in both the old and new functions by 1) Pointing line locations and debug intrinsics to the new subprogram scope, and 2) Deleting intrinsics which point to values outside of the new function. Depends on https://reviews.llvm.org/D72795. Testing: check-llvm, check-clang, a build of LNT in the `-Os -g` config with "-mllvm -hot-cold-split=1" set, and end-to-end debugging of a toy program which undergoes splitting to verify that lldb can find variables, single step, etc. in extracted code. rdar://45507940 Differential Revision: https://reviews.llvm.org/D72801	2020-01-15 15:38:36 -08:00
Fedor Sergeev	8a4d12ae5b	[BasicBlock] add helper getPostdominatingDeoptimizeCall It appears to be rather useful when analyzing Loops with multiple deoptimizing exits, perhaps merged ones. For now it is used in LoopPredication, will be adding more uses in other loop passes. Reviewers: asbirlea, fhahn, skatkov, spatel, reames Reviewed By: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D72754	2020-01-16 01:15:57 +03:00
Mircea Trofin	5466597fee	[NFC] Refactor InlineResult for readability Summary: InlineResult is used both in APIs assessing whether a call site is inlinable (e.g. llvm::isInlineViable) as well as in the function inlining utility (llvm::InlineFunction). It means slightly different things (can/should inlining happen, vs did it happen), and the implicit casting may introduce ambiguity (casting from 'false' in InlineFunction will default a message about high costs, which is incorrect here). The change renames the type to a more generic name, and disables implicit constructors. Reviewers: eraman, davidxl Reviewed By: davidxl Subscribers: kerbowa, arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72744	2020-01-15 13:34:20 -08:00
Zhongduo Lin	34ba96a3d4	[NFC][IndVarSimplify] remove duplicate code in widenWithVariantLoadUseCodegen. Summary: Duplicate code in widenWithVariantLoadUseCodegen is removed and also use assert to check unknown extension type as it should be filtered out by the pre condition check before calling this function. Reviewers: az, sanjoy, sebpop, efriedma, javed.absar, sanjoy.google Reviewed By: efriedma Subscribers: hiraditya, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D72652	2020-01-15 16:27:58 -05:00
Vedant Kumar	a2cc80bc95	DebugInfo: Factor out logic to update locations in MD_loop metadata, NFC Factor out the logic needed to update debug locations contained within MD_loop metadata. This refactor is preparation for a future change that also needs to rewrite MD_loop metadata. rdar://45507940	2020-01-15 13:02:36 -08:00
Arkady Shlykov	3f3017e162	[Loop Peeling] Add possibility to enable peeling on loop nests. Summary: Current peeling implementation bails out in case of loop nests. The patch introduces a field in TargetTransformInfo structure that certain targets can use to relax the constraints if it's profitable (disabled by default). Also additional option is added to enable peeling manually for experimenting and testing purposes. Reviewers: fhahn, lebedev.ri, xbolva00 Reviewed By: xbolva00 Subscribers: xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D70304	2020-01-15 08:25:21 -08:00
Sanjay Patel	3180af4362	[InstCombine] reassociate fsub+fsub into fsub+fadd As discussed in the motivating PR44509: https://bugs.llvm.org/show_bug.cgi?id=44509 ...we can end up with worse code using fast-math than without. This is because the reassociate pass greedily transforms fsub into fneg/fadd and apparently (based on the regression tests seen here) expects instcombine to clean that up if it wasn't profitable. But we were missing this fold: (X - Y) - Z --> X - (Y + Z) There's another, more specific case that I think we should handle as shown in the "fake" fneg test (but missed with a real fneg), but that's another patch. That may be tricky to get right without conflicting with existing transforms for fneg. Differential Revision: https://reviews.llvm.org/D72521	2020-01-15 11:14:13 -05:00
Hideto Ueno	188f9a348d	[Attributor] AAValueConstantRange: Value range analysis using constant range Summary: This patch introduces `AAValueConstantRange`, which answers a possible range for integer value in a specific program point. One of the motivations is propagating existing `range` metadata. (I think we need to change the situation that `range` metadata cannot be put to Argument). The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty). Currently, AAValueConstantRange is created in `getAssumedConstant` method when `AAValueSimplify` returns `nullptr`(worst state). Supported - BinaryOperator(add, sub, ...) - CmpInst(icmp eq, ...) - !range metadata `AAValueConstantRange` is not intended to extend to polyhedral range value analysis. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: phosek, davezarzycki, baziotis, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71620	2020-01-15 16:34:23 +09:00
Nikita Popov	04e586151e	[InstCombine] Fix worklist management when removing guard intrinsic When multiple guard intrinsics are merged into one, currently the result of eraseInstFromFunction() is returned -- however, this should only be done if the current instruction is being removed. In this case we're removing a different instruction and should instead report that the current one has been modified by returning it. For this test case, this reduces the number of instcombine iterations from 5 to 2 (the minimum possible). Differential Revision: https://reviews.llvm.org/D72558	2020-01-14 21:47:48 +01:00
Nikita Popov	410331869d	[NewPM] Port MergeFunctions pass This ports the MergeFunctions pass to the NewPM. This was rather straightforward, as no analyses are used. Additionally MergeFunctions needs to be conditionally enabled in the PassBuilder, but I left that part out of this patch. Differential Revision: https://reviews.llvm.org/D72537	2020-01-14 20:55:41 +01:00
Nikita Popov	65c0805be5	[InstCombine] Fix infinite loop due to bitcast <-> phi transforms Fix for https://bugs.llvm.org/show_bug.cgi?id=44245. The optimizeBitCastFromPhi() and FoldPHIArgOpIntoPHI() end up fighting against each other, because optimizeBitCastFromPhi() assumes that bitcasts of loads will get folded. This doesn't happen here, because a dangling phi node prevents the one-use fold in https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L620-L628 from triggering. This patch fixes the issue by explicitly performing the load combine as part of the bitcast of phi transform. Other attempts to force the load to be combined first were ultimately too unreliable. Differential Revision: https://reviews.llvm.org/D71164	2020-01-14 20:45:13 +01:00
Nikita Popov	b4dd928ffb	[InstCombine] Make combineLoadToNewType a method; NFC So it can be reused as part of other combines. In particular for D71164.	2020-01-14 20:40:03 +01:00
Nikita Popov	652cd7c100	[InstCombine] Fix user iterator invalidation in bitcast of phi transform This fixes the issue encountered in D71164. Instead of using a range-based for, manually iterate over the users and advance the iterator beforehand, so we do not skip any users due to iterator invalidation. Differential Revision: https://reviews.llvm.org/D72657	2020-01-14 20:38:10 +01:00
Teresa Johnson	2cefb93951	[ThinLTO/WPD] Remove an overly-aggressive assert Summary: An assert added to the index-based WPD was trying to verify that we only have multiple vtables for a given guid when they are all non-external linkage. This is too conservative because we may have multiple external vtables with the same guid when they are in comdat. Remove the assert, as we don't have comdat information in the index, the linker should issue an error in this case. See discussion on D71040 for more information. Reviewers: evgeny777, aganea Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72648	2020-01-14 10:57:14 -08:00
Juneyoung Lee	3e32b7e127	[InstCombine] Let combineLoadToNewType preserve ABI alignment of the load (PR44543) Summary: If alignment on `LoadInst` isn't specified, load is assumed to be ABI-aligned. And said alignment may be different for different types. So if we change load type, but don't pay extra attention to the alignment (i.e. keep it unspecified), we may either overpromise (if the default alignment of the new type is higher), or underpromise (if the default alignment of the new type is smaller). Thus, if no alignment is specified, we need to manually preserve the implied ABI alignment. This addresses https://bugs.llvm.org/show_bug.cgi?id=44543 by making combineLoadToNewType preserve ABI alignment of the load. Reviewers: spatel, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72710	2020-01-15 03:20:53 +09:00
Dmitri Gribenko	2948ec5ca9	Removed PointerUnion3 and PointerUnion4 aliases in favor of the variadic template	2020-01-14 18:56:29 +01:00
Florian Hahn	192cce10f6	Revert "Recommit "[GlobalOpt] Pass DTU to removeUnreachableBlocks instead of recomputing."" This reverts commit `a03d7b0f24`. As discussed in D68298, this causes a compile-time regression, in case the DTs requested are not used elsewhere in GlobalOpt. We should only get the DTs if they are available here, but this seems not possible with the legacy pass manager from a module pass.	2020-01-14 14:50:07 +00:00
Benjamin Kramer	df186507e1	Make helper functions static or move them into anonymous namespaces. NFC.	2020-01-14 14:06:37 +01:00
Hiroshi Yamauchi	7b9f8e17d1	[PGO][CHR] Guard against 0-to-0 branch weight and avoid division by zero crash. Summary: This fixes a crash in internal builds under SamplePGO. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72653	2020-01-13 14:38:58 -08:00
Teresa Johnson	31441a3e00	[ThinLTO/WPD] Fix index-based WPD for alias vtables Summary: A recent fix in D69452 fixed index based WPD in the presence of available_externally vtables. It added a cast of the vtable def summary to a GlobalVarSummary. However, in some cases one def may be an alias, in which case we need to get the base object before casting, otherwise we will crash. Reviewers: evgeny777, steven_wu, aganea Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71040	2020-01-13 13:38:26 -08:00
Simon Pilgrim	2740b2d5d5	Fix uninitialized value clang static analyzer warning. NFC.	2020-01-11 16:02:22 +00:00
Nuno Lopes	87407fc03c	DSE: fix bug where we would only check libcalls for name rather than whole decl	2020-01-11 11:57:29 +00:00
Nikita Popov	0e322c8a1f	[InstCombine] Preserve nuw on sub of geps (PR44419) Fix https://bugs.llvm.org/show_bug.cgi?id=44419 by preserving the nuw on sub of geps. We only do this if the offset has a multiplication as the final operation, as we can't be sure the operations is nuw in the other cases without more thorough analysis. Differential Revision: https://reviews.llvm.org/D72048	2020-01-11 11:01:12 +01:00
Andrew Paverd	bdd88b7ed3	Add support for __declspec(guard(nocf)) Summary: Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a new `"guard_nocf"` function attribute to indicate that checks should not be added on indirect calls within that function. Add support for `__declspec(guard(nocf))` following the same syntax as MSVC. Reviewers: rnk, dmajor, pcc, hans, aaron.ballman Reviewed By: aaron.ballman Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72167	2020-01-10 16:04:12 +00:00
Simon Pilgrim	2e66405d8d	Don't use dyn_cast_or_null if we know the pointer is nonnull. Fix clang static analyzer null dereference warning by using dyn_cast instead.	2020-01-10 10:32:36 +00:00
Benjamin Kramer	498856fca5	[LV] Silence unused variable warning in Release builds. NFC.	2020-01-10 11:21:27 +01:00
Gil Rapaport	8647a72c4a	[LV] VPValues for memory operation pointers (NFCI) Memory instruction widening recipes use the pointer operand of their load/store ingredient for generating the needed GEPs, making it difficult to feed these recipes with pointers based on other ingredients or none at all. This patch modifies these recipes to use a VPValue for the pointer instead, in order to reduce ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. The recipes are constructed with VPValues bound to these ingredients, maintaining current behavior. Differential revision: https://reviews.llvm.org/D70865	2020-01-10 09:24:59 +02:00
Whitney Tsang	d27a15fed7	[NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function pass. Summary: This patch changes LoopUnrollAndJamPass to a function pass, and keeps the loop traversal order the same as defined in FunctionToLoopPassAdaptor LoopPassManager.h. The next patch will change the loop traversal to outer to inner order, so more loops can be transformed. Discussion in llvm-dev mailing list: https://groups.google.com/forum/#!topic/llvm-dev/LF4rUjkVI2g Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto Reviewed By: dmgreen Subscribers: hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D72230	2020-01-09 16:18:36 +00:00
@raghesh (Raghesh Aloor)	6c04ef472a	[InstCombine] Z / (1.0 / Y) => (Y * Z) This is a special case of Z / (X / Y) => (Y * Z) / X, with X = 1.0. The m_OneUse check is avoided because even in the case of the multiple uses for 1.0/Y, the number of instructions remain the same and a division is replaced by a multiplication. Differential Revision: https://reviews.llvm.org/D72319	2020-01-09 10:52:39 -05:00
Florian Hahn	ccf24225e3	[Matrix] Update shape propagation to iterate until done. This patch updates the shape propagation to iterate until no new shape information is discovered. As initial seed for the forward propagation, we use the matrix intrinsic instructions. Both propagateShapeForward and propagateShapeBackward return new work lists, with the instructions to be used for the next iteration. When propagating forward, we record all instructions we added new shape information for. When propagating backward, we record all users of instructions we added new shape information for. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70901	2020-01-09 10:52:52 +00:00
Florian Hahn	7adf6644f5	[Matrix] Propagate and use shape information for loads. This patch extends to shape propagation to also include load instructions and implements shape aware lowering for vector loads. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70900	2020-01-09 10:21:20 +00:00
Evgeniy Brevnov	f0abe820ee	[LoopUtils][NFC] Minor refactoring in getLoopEstimatedTripCount.	2020-01-09 16:49:15 +07:00
Florian Hahn	459ad8e97e	[Matrix] Implement back-propagation of shape information. This patch extends the shape propagation for matrix operations to also propagate the shape of instructions to their operands. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70899	2020-01-09 09:48:07 +00:00
Sjoerd Meijer	8f1887456a	[LV] Still vectorise when tail-folding can't find a primary induction variable This addresses a vectorisation regression for tail-folded loops that are counting down, e.g. loops as simple as this: void foo(char *A, char *B, char *C, uint32_t N) { while (N > 0) { *C++ = *A++ + *B++; N--; } } These are loops that can be vectorised, but when tail-folding is requested, it can't find a primary induction variable which we do need for predicating the loop. As a result, the loop isn't vectorised at all, which it is able to do when tail-folding is not attempted. So, this adds a check for the primary induction variable where we decide how to lower the scalar epilogue. I.e., when there isn't a primary induction variable, a scalar epilogue loop is allowed (i.e. don't request tail-folding) so that vectorisation could still be triggered. Having this check for the primary induction variable makes sense anyway, and in addition, in a follow-up of this I will look into discovering earlier the primary induction variable for counting down loops, so that this can also be tail-folded. Differential revision: https://reviews.llvm.org/D72324	2020-01-09 09:14:00 +00:00
Johannes Doerfert	a4088c75cc	[Attributor][FIX] Carefully change invokes to calls (after manifest) Before we manually inserted unreachable early but that could lead to broken PHI nodes. Now we use the existing late modification functionality.	2020-01-08 19:32:38 -06:00
Johannes Doerfert	1e46eb74be	[Attributor][FIX] Avoid dangling value pointers during code modification When we replace instructions with unreachable we delete instructions. We now avoid dangling pointers to those deleted instructions in the `ToBeChangedToUnreachableInsts` set. Other modification collections might need to be updated in the future as well.	2020-01-08 19:32:37 -06:00
Kazu Hirata	2d258ed931	Revert "[JumpThreading] Thread jumps through two basic blocks" It looks like my patch breaks the sanitizer-windows build: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/56324 This reverts commit `ead815924e`.	2020-01-08 13:58:39 -08:00
Kazu Hirata	ead815924e	[JumpThreading] Thread jumps through two basic blocks Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like: bb3: %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have: bb3: %var = phi i32* [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb3.dup: %var = phi i32* [ null, %bb1 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247	2020-01-08 06:57:36 -08:00
Kadir Cetinkaya	b212eb7159	Revert "[InstCombine] fold zext of masked bit set/clear" This reverts commit `a041c4ec6f`. This looks like a non-trivial change and there has been no code reviews (at least there were no phabricator revisions attached to the commit description). It is also causing a regression in one of our downstream integration tests, we haven't been able to come up with a minimal reproducer yet.	2020-01-08 11:21:21 +01:00
Philip Reames	312a532dc0	[GVN/FP] Consolidate logic for reasoning about equality vs equivalence for floats Factor out common logic into some reasonably commented helper functions. In the process, ensure that the in-block vs cross-block cases are handled the same. They previously weren't. Differential Revision: https://reviews.llvm.org/D67126	2020-01-07 16:05:04 -08:00
Sanjay Patel	f8962571f7	[InstCombine] try to pull 'not' of select into compare operands not (select ?, (cmp TPred, ?, ?), (cmp FPred, ?, ?)) --> select ?, (cmp TPred', ?, ?), (cmp FPred', ?, ?) If both sides of the select are cmps, we can remove an instruction. The case where only one side is a cmp is deferred to a possible follow-on patch. We have a more general 'isFreeToInvert' analysis, but I'm not seeing a way to use that more widely without inducing infinite looping (opposing transforms). Here, we flip the compare predicates directly, so we should not have any danger by creating extra intermediate 'not' ops. Alive proofs: https://rise4fun.com/Alive/jKa Name: both select values are compares - invert predicates %tcmp = icmp sle i32 %x, %y %fcmp = icmp ugt i32 %z, %w %sel = select i1 %cond, i1 %tcmp, i1 %fcmp %not = xor i1 %sel, true => %tcmp_not = icmp sgt i32 %x, %y %fcmp_not = icmp ule i32 %z, %w %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not Name: false val is compare - invert/not %fcmp = icmp ugt i32 %z, %w %sel = select i1 %cond, i1 %tcmp, i1 %fcmp %not = xor i1 %sel, true => %tcmp_not = xor i1 %tcmp, -1 %fcmp_not = icmp ule i32 %z, %w %not = select i1 %cond, i1 %tcmp_not, i1 %fcmp_not Differential Revision: https://reviews.llvm.org/D72007	2020-01-07 10:44:23 -05:00
Simon Pilgrim	bd1dc6a3eb	Fix "use of uninitialized variable" static analyzer warnings. NFCI.	2020-01-07 10:55:37 +00:00
Fangrui Song	6904cd9486	Add Triple::isX86() Reviewed By: craig.topper, skan Differential Revision: https://reviews.llvm.org/D72247	2020-01-06 15:51:02 -08:00
James Henderson	d68904f957	[NFC] Fix trivial typos in comments Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72143 Patch by Kazuaki Ishizaki.	2020-01-06 10:50:26 +00:00
Brian Gesiak	83a9321f60	[Coroutines] Remove corresponding phi values when applying simplifyTerminatorLeadingToRet Summary: In addMustTailToCoroResumes, we set musttail on those resume instructions that are followed by a ret instruction. This is done by simplifyTerminatorLeadingToRet, which replaces a sequence of branches leading to a ret with a clone of the ret. However, it forgets to remove the corresponding PHI values that come from the basic block of the replaced branch, which may cause the jumpthreading pass to hang (https://bugs.llvm.org/show_bug.cgi?id=43720). This patch fixes this issue. Test Plan: cppcoro library with O3+flto check-llvm Reviewers: modocache, GorNishanov, lewissbaker Reviewed By: modocache Subscribers: mehdi_amini, EricWF, hiraditya, dexonsmith, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71826 Patch by junparser (JunMa)!	2020-01-05 18:26:30 -05:00
Florian Hahn	b8a3c34eee	Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)." This reverts commit `51ef53f3bd`, as it breaks some bots.	2020-01-04 18:44:38 +00:00
Florian Hahn	51ef53f3bd	[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC). SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537	2020-01-04 18:29:35 +00:00
Florian Hahn	99f74a64a2	[SCEV] Remove unused ScalarEvolutionExpander.h includes (NFC).	2020-01-04 18:29:35 +00:00
Roman Lebedev	6d05bc2e3a	[NFCI][InstCombine] Refactor 'sink negation into select if that folds one hand of select to 0' fold I would think it's better than having two practically identical folds next to each other, but then generalization isn't all that pretty due to the fact that we need to produce different `sub` each time. This change is no-functional-changes-intended refactoring.	2020-01-04 17:30:51 +03:00
Roman Lebedev	772ede3d5d	[InstCombine] Sink sub into hands of select if one hand becomes zero. Part 2 (PR44426) This decreases use count of %Op0, makes one hand of select to be 0, and possibly exposes further folding potential. Name: sub %Op0, (select %Cond, %Op0, %FalseVal) -> select %Cond, 0, (sub %Op0, %FalseVal) %Op0 = %TrueVal %o = select i1 %Cond, i8 %Op0, i8 %FalseVal %r = sub i8 %Op0, %o => %n = sub i8 %Op0, %FalseVal %r = select i1 %Cond, i8 0, i8 %n Name: sub %Op0, (select %Cond, %TrueVal, %Op0) -> select %Cond, (sub %Op0, %TrueVal), 0 %Op0 = %FalseVal %o = select i1 %Cond, i8 %TrueVal, i8 %Op0 %r = sub i8 %Op0, %o => %n = sub i8 %Op0, %TrueVal %r = select i1 %Cond, i8 %n, i8 0 https://rise4fun.com/Alive/aHRt https://bugs.llvm.org/show_bug.cgi?id=44426	2020-01-04 17:30:51 +03:00
Roman Lebedev	4d8e47ca18	[InstCombine] Sink sub into hands of select if one hand becomes zero (PR44426) This decreases use count of %Op1, makes one hand of select to be 0, and possibly exposes further folding potential. Name: sub (select %Cond, %Op1, %FalseVal), %Op1 -> select %Cond, 0, (sub %FalseVal, %Op1) %Op1 = %TrueVal %o = select i1 %Cond, i8 %Op1, i8 %FalseVal %r = sub i8 %o, %Op1 => %n = sub i8 %FalseVal, %Op1 %r = select i1 %Cond, i8 0, i8 %n Name: sub (select %Cond, %TrueVal, %Op1), %Op1 -> select %Cond, (sub %TrueVal, %Op1), 0 %Op1 = %FalseVal %o = select i1 %Cond, i8 %TrueVal, i8 %Op1 %r = sub i8 %o, %Op1 => %n = sub i8 %TrueVal, %Op1 %r = select i1 %Cond, i8 %n, i8 0 https://rise4fun.com/Alive/avL https://bugs.llvm.org/show_bug.cgi?id=44426	2020-01-04 17:30:51 +03:00
Alexey Lapshin	831bfcea47	[Transforms][GlobalSRA] huge array causes long compilation time and huge memory usage. Summary: For artificial cases (huge array, few usages), Global SRA optimization creates a lot of redundant data. It creates an instance of GlobalVariable for each array element. For huge array, that means huge compilation time and huge memory usage. Following example compiles for 10 minutes and requires 40GB of memory. namespace { char LargeBuffer[64 * 1024 * 1024]; } int main ( void ) { LargeBuffer[0] = 0; printf("\n "); return LargeBuffer[0] == 0; } The fix is to avoid Global SRA for large arrays. Reviewers: craig.topper, rnk, efriedma, fhahn Reviewed By: rnk Subscribers: xbolva00, lebedev.ri, lkail, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71993	2020-01-04 16:42:38 +03:00
Roman Lebedev	7973aa05f6	[NFC][InstCombine] '(Op1 & С) - Op1' -> '-(Op1 & ~C)' fold (PR44427) This decreases use count of Op1, potentially allows us to further hoist said 'neg' later on, and results in marginally better X86 codegen. Name: (Op1 & С) - Op1 -> -(Op1 & ~C) %o = and i64 %Op1, C1 %r = sub i64 %o, %Op1 => %n = and i64 %Op1, ~C1 %r = sub i64 0, %n https://rise4fun.com/Alive/rwgA https://godbolt.org/z/R_RMfM https://bugs.llvm.org/show_bug.cgi?id=44427	2020-01-03 21:25:48 +03:00
Roman Lebedev	cc0216bedb	[NFC][InstCombine] '(X & (- Y)) - X' -> '- (X & (Y - 1))' fold (PR44448) Name: (X & (- Y)) - X -> - (X & (Y - 1)) (PR44448) %negy = sub i8 0, %y %unbiasedx = and i8 %negy, %x %r = sub i8 %unbiasedx, %x => %ymask = add i8 %y, -1 %xmasked = and i8 %ymask, %x %r = sub i8 0, %xmasked https://rise4fun.com/Alive/OIpla This decreases use count of %x, may allow us to later hoist said negation even further, and results in marginally nicer X86 codegen. See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 20:27:29 +03:00
Johannes Doerfert	d2d2fb19f7	[Attributor][FIX] Allow dead users of rewritten function If we replace a function with a new one because we rewrite the signature, dead users may still refer to the old version. With this patch we reuse the code that deals with dead functions, which the old versions are, to avoid problems.	2020-01-03 10:43:40 -06:00
Johannes Doerfert	6b9ee2d6cd	[Attributor][NFC] Unify the way we delete dead functions	2020-01-03 10:43:40 -06:00
Johannes Doerfert	c90681b681	[Attributor][FIX] Don't crash on ptr2int/int2ptr instructions An integer isn't allowed in getAlignmentForValue so we need to stop at a ptr2int instruction during exploration.	2020-01-03 10:43:40 -06:00
Johannes Doerfert	412a0101a9	[Attributor][FIX] Do not derive nonnull and dereferenceable w/o access An inbounds GEP results in poison if the value is not "inbounds", not in UB. We accidentally derived nonnull and dereferenceable from these inbounds GEPs even in the absence of accesses that would make the poison to UB.	2020-01-03 10:43:40 -06:00
Johannes Doerfert	a4b3588ba2	[Attributor][FIX] Return CHANGED once a pessimistic fixpoint is reached.	2020-01-03 10:43:40 -06:00
Ankit	369a919514	Fix for a dangling point bug in DeadStoreElimination pass The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction. While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration. In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container. Reviewers: fhahn, bcahoon, efriedma Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D65326	2020-01-03 14:28:44 +00:00
Sanjay Patel	1640582743	[InstCombine] replace undef elements in vector constant when doing icmp folds (PR44383) As shown in PR44383: https://bugs.llvm.org/show_bug.cgi?id=44383 ...we can't safely propagate a vector constant through this icmp fold if that vector constant contains undefined elements. We know that each defined element of the constant is safe though, so find the first of those and replicate it into the formerly undef lanes. Differential Revision: https://reviews.llvm.org/D72101	2020-01-03 09:16:57 -05:00
Hideto Ueno	5fc02dc0a7	Revert "[Attributor] AAValueConstantRange: Value range analysis using constant range" This reverts commit `e996303431`.	2020-01-03 11:03:56 +09:00
Sanjay Patel	88fc5fdef6	[InstCombine] remove uses before deleting instructions (PR43723) This is a less ambitious alternative to previous attempts to fix this bug with: rG56b2aee1875a rGef02831f0a4e rG56b2aee1875a ...because those all failed bot testing with use-after-free or other problems. The original crashing/assert problem is still showing up on various fuzzers, so I've added a new minimal test based on another one of those failures. Instead of trying to manage and coordinate the logic in isAllocSiteRemovable() with the deletion loops, just loosen the existing code that handles casts and GEP by replacing with undef to allow other opcodes. That means that no instructions with uses should assert on deletion, and there are hopefully no non-obvious sanitizer bugs induced.	2020-01-02 09:47:36 -05:00
Brian Gesiak	9ce0ff2eef	[Coroutines] const-ify internal helpers (NFC) Several helpers internal to llvm/Transforms/Coroutines do not use 'const' for parameters that are not modified. Add const where possible.	2020-01-01 21:57:49 -05:00
Brian Gesiak	2fcf7691df	[Coroutines] Rename "legacy" passes (NFC) A series of patches beginning with https://reviews.llvm.org/D71898 propose to add an implementation of the coroutine passes to the new pass manager. As part of these changes, the coroutine passes that implement the legacy pass manager interface are renamed, to `<PassName>Legacy`. This mirrors similar changes that have been made to many other passes in LLVM as they've been transitioned to support both old and new pass managers. This commit splits out the renaming portion of that patch and commits it in advance as an NFC (no functional change intended) commit. It renames: * `CoroEarly` => `CoroEarlyLegacy` * `CoroSplit` => `CoroSplitLegacy` * `CoroElide` => `CoroElideLegacy` * `CoroCleanup` => `CoroCleanupLegacy`	2020-01-01 21:41:16 -05:00
Nikita Popov	8dd9a13619	[InstCombine] Preserve inbounds when merging with zero-index GEP (PR44423) This addresses https://bugs.llvm.org/show_bug.cgi?id=44423. If one of the GEPs is inbounds and the other is zero-index, we can also preserve inbounds. Differential Revision: https://reviews.llvm.org/D72060	2020-01-01 23:04:28 +01:00
Nikita Popov	6ba5f8c4ac	[InstCombine] Fix incorrect inbounds on GEP of GEP (PR44425) This fixes https://bugs.llvm.org/show_bug.cgi?id=44425. We need to drop inbounds if one of the GEPs is not inbounds. This was already done when creating a new GEP, but not when modifying in place. Differential Revision: https://reviews.llvm.org/D72059	2020-01-01 22:10:55 +01:00
Hideto Ueno	e996303431	[Attributor] AAValueConstantRange: Value range analysis using constant range This patch introduces `AAValueConstantRange`, which answers a possible range for integer value in a specific program point. One of the motivations is propagating existing `range` metadata. (I think we need to change the situation that `range` metadata cannot be put to Argument). The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty). Currently, AAValueConstantRange is created when AAValueSimplify cannot simplify the value. Supported - BinaryOperator(add, sub, ...) - CmpInst(icmp eq, ...) - !range metadata `AAValueConstantRange` is not intended to extend to polyhedral range value analysis. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D71620	2020-01-01 15:35:56 +09:00
Craig Topper	374e0299cf	[X86][InstCombine] Add constant folding and simplification support for pdep and pext The instructions use a mask to either pack disjoint bits together (pext) or spread bits to disjoint locations (pdep). If the mask is all 0s then no bits are extracted or deposited. If the mask is all ones, then the source value is written to the result since no compression or expansion happens. Otherwise if both the source and mask are constant we can walk the bits in the source/mask and calculate the result. There are other crazier things we could do like computeKnownBits or turning pext into shift/and if only a single contiguous range of bits is extracted. Fixes PR44389 Differential Revision: https://reviews.llvm.org/D71952	2019-12-31 15:06:47 -08:00
Sanjay Patel	a041c4ec6f	[InstCombine] fold zext of masked bit set/clear This does not solve PR17101, but it is one of the underlying diffs noted here: https://bugs.llvm.org/show_bug.cgi?id=17101#c8 We could ease the one-use checks for the 'clear' (no 'not' op) half of the transform, but I do not know if that asymmetry would make things better or worse. Proofs: https://rise4fun.com/Alive/uVB Name: masked bit set %sh1 = shl i32 1, %y %and = and i32 %sh1, %x %cmp = icmp ne i32 %and, 0 %r = zext i1 %cmp to i32 => %s = lshr i32 %x, %y %r = and i32 %s, 1 Name: masked bit clear %sh1 = shl i32 1, %y %and = and i32 %sh1, %x %cmp = icmp eq i32 %and, 0 %r = zext i1 %cmp to i32 => %xn = xor i32 %x, -1 %s = lshr i32 %xn, %y %r = and i32 %s, 1	2019-12-31 12:35:10 -05:00
Nikita Popov	7adb5c2aca	Revert "[InstCombine] Fix infinite loop due to bitcast <-> phi transforms" This reverts commit `27a0795943`. Seems to break test-suite.	2019-12-31 17:42:57 +01:00
Nikita Popov	27a0795943	[InstCombine] Fix infinite loop due to bitcast <-> phi transforms Fix for https://bugs.llvm.org/show_bug.cgi?id=44245. The optimizeBitCastFromPhi() and FoldPHIArgOpIntoPHI() end up fighting against each other, because optimizeBitCastFromPhi() assumes that bitcasts of loads will get folded. This doesn't happen here, because a dangling phi node prevents the one-use fold in https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L620-L628 from triggering. This patch fixes the issue by manually removing the old phis. Differential Revision: https://reviews.llvm.org/D71164	2019-12-31 16:17:14 +01:00
Connor Abbott	fb114694e9	[InstCombine] Don't rewrite phi-of-bitcast when the phi has other users Judging by the existing comments, this was the intention, but the transform never actually checked if the existing phi's would be removed. See https://bugs.llvm.org/show_bug.cgi?id=44242 for an example where this causes much worse code generation on AMDGPU. Differential Revision: https://reviews.llvm.org/D71209	2019-12-31 12:15:02 +01:00
Ilya Biryukov	4f82af81a0	[Attributor] Suppress unused warnings when assertions are disabled. NFC	2019-12-31 10:21:52 +01:00
Johannes Doerfert	751336340d	[Attributor] Function signature rewrite infrastructure As part of the Attributor manifest we want to change the signature of functions. This patch introduces a fairly generic interface to do so. As a first, very simple, use case, we remove unused arguments. A second use case, pointer privatization, will be committed with this patch as well. A lot of the code and ideas are taken from argument promotion and we run all argument promotion tests through this framework as well. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D68765	2019-12-31 02:31:33 -06:00
Johannes Doerfert	dada8132af	[Attributor] Propagate known align from arguments to call sites arguments Since the information is known we can simply use it at the call site. This is especially useful for callbacks but also helps regular calls. The test changes are mechanical.	2019-12-31 01:33:22 -06:00
Johannes Doerfert	b1b441d22d	[Attributor] Use abstract call sites to determine associated arguments This is the second step after D67871 to make use of abstract call sites. In this patch the argument we associate with a abstract call site argument can be the one in the callback callee instead of the one in the callback broker. Caveat: We cannot allow no-alias arguments for problematic callbacks: As described in [1], adding no-alias (or restrict) to arguments could break synchronization as the synchronization effect, e.g., a barrier, does not "alias" with the pointer anymore. This disables no-alias annotation for potentially problematic arguments until we implement the fix described in [1]. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D68008 [1] Compiler Optimizations for OpenMP, J. Doerfert and H. Finkel, International Workshop on OpenMP 2018, http://compilers.cs.uni-saarland.de/people/doerfert/par_opt18.pdf	2019-12-31 01:33:22 -06:00
Johannes Doerfert	2888019871	[Attributor] Annotate the memory behavior of call site arguments Especially for callbacks, annotating the call site arguments is important. Doing so exposed a too strong dependence of AAMemoryBehavior on AANoCapture since we handle the case of potentially captured pointers explicitly. The changes to the tests are all mechanical.	2019-12-31 01:33:21 -06:00
Sanjay Patel	987eb8e26c	[InstCombine] propagate sign argument through nested copysigns This is another optimization suggested in PR44153: https://bugs.llvm.org/show_bug.cgi?id=44153	2019-12-30 11:06:02 -05:00
Evgeniy Brevnov	948e745270	[LV][NFC] Keep dominator tree up to date during vectorization.	2019-12-30 18:38:41 +07:00
Evgeniy Brevnov	1b6286b945	[LV][NFC] Some refactoring and renaming to facilitate next change.	2019-12-30 18:38:41 +07:00
Hideto Ueno	34fe8d0451	[Attributor] Use `changeUseAfterManifest` in AAValueSimplify manifest Summary: This patch makes `AAValueSimplify` use `changeUsesAfterManifest` in `manifest`. This will invoke simple folding after the manifest. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71972	2019-12-30 17:08:48 +09:00
Hideto Ueno	ef4febd85b	[Attributor] AAUndefinedBehavior: Check for branches on undef value. A branch is considered UB if it depends on an undefined / uninitialized value. At this point this handles simple UB branches in the form: `br i1 undef, ...` We query `AAValueSimplify` to get a value for the branch condition, so the branch can be more complicated than just: `br i1 undef, ...`. Patch By: Stefanos Baziotis (@baziotis) Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D71799	2019-12-29 17:43:00 +09:00
Gil Rapaport	d62bf16131	[LV] Use getMask() when printing recipe [NFCI] Use dedicated API for getting the mask instead of duplicating it. Differential Revision: https://reviews.llvm.org/D71964	2019-12-29 08:50:40 +02:00
Florian Hahn	dc2c9b0fcf	[Matrix] Propagate and use shape info for binary operators. This patch extends the current shape propagation and shape aware lowering to also support binary operators. Those operators are uniform with respect to their shape (shape of the input operands is the same as the shape of their result). Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70898	2019-12-27 15:50:47 +00:00
Fangrui Song	7a7334663c	Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821 Intrinsic has incorrect argument type! i32 (i32) @llvm.setjmp wipes tear	2019-12-27 00:00:14 -08:00
Hideto Ueno	cb5eb13eaf	[Attributor] Add helper to change an instruction to `unreachable` inst Summary: Calling `changeToUnreachable` in `manifest` from different places might cause really unpredictable problems. As other deleting functions are doing, we need to change these instructions after all `manifest`. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71910	2019-12-27 02:39:37 +09:00
Whitney Tsang	d1f41b2ca9	[NFC][LoopFusion] Fix printing of the guard branch. Reviewer: kbarton, jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D71878	2019-12-26 02:45:29 +00:00
Hideto Ueno	1d5d074aef	[Attributor] Reach optimistic fixpoint in AAValueSimplify when the value is constant or undef Summary: As discussed in D71799, we have found that it is more useful to reach an optimistic fixpoint in AAValueSimplify when the value is constant or undef. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: baziotis, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71852	2019-12-25 14:18:34 +09:00
Johannes Doerfert	5732f56bbd	[Attributor] UB Attribute now handles all instructions that access memory through a pointer Summary: Follow-up on: https://reviews.llvm.org/D71435 We basically use `checkForAllInstructions` to loop through all the instructions in a function that access memory through a pointer: load, store, atomicrmw, atomiccmpxchg Note that we can now use the `getPointerOperand()` that gets us the pointer operand for an instruction that belongs to the aforementioned set. Question: This function returns `nullptr` if the instruction is `volatile`. Why? Guess: Because if it is volatile, we don't want to do any transformation to it. Another subtle point is that I had to add AtomicRMW, AtomicCmpXchg to `initializeInformationCache()`. Following `checkAllInstructions()` path, that seemed the most reasonable place to add it and correct the fact that these instructions were ignored (they were not in `OpcodeInstMap` etc.). Is that ok? Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert, sstefan1 Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71787	2019-12-24 19:25:08 -06:00
Johannes Doerfert	58f324a468	[Attributor] Function level undefined behavior attribute _Eventually_, this attribute will be assigned to a function if it contains undefined behavior. As a first small step, I tried to make it loop through the load instructions in a function (eventually, the plan is to check if a load instruction causes undefined behavior, e.g. because it dereferences a null pointer. Also eventually, this won't happen in initialize() but in updateImpl()). Patch By: Stefanos Baziotis (@baziotis) Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D71435	2019-12-24 19:23:08 -06:00
Florian Hahn	8d6f59b78a	[Matrix] Use fmuladd for matrix.multiply if allowed. If the matrix.multiply calls have the contract fast math flag, we can use fmuladd. This also adds a command line option to force fmuladd generation. We can retire this option once there is a clang-level option. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70951	2019-12-23 14:49:14 +01:00
Florian Hahn	109e4e3851	[Matrix] Add forward shape propagation and first shape aware lowerings. This patch adds infrastructure for forward shape propagation to LowerMatrixIntrinsics. It also updates the pass to make use of the shape information to break up larger vector operations and to eliminate unnecessary conversion operations between columnwise matrixes and flattened vectors: if shape information is available for an instruction, lower the operation to a set of instructions operating on columns. For example, a store of a matrix is broken down into separate stores for each column. For users that do not have shape information (e.g. because they do not yet support shape information aware lowering), we pack the result columns into a flat vector and update those users. It also adds shape aware lowering for the first non-intrinsic instruction: vector stores. Example: For %c = call <4 x double> @llvm.matrix.transpose(<4 x double> %a, i32 2, i32 2) store <4 x double> %c, <4 x double>* %Ptr We generate the code below without shape propagation. Note %9 which combines the columns of the transposed matrix into a flat vector. %split = shufflevector <4 x double> %a, <4 x double> undef, <2 x i32> <i32 0, i32 1> %split1 = shufflevector <4 x double> %a, <4 x double> undef, <2 x i32> <i32 2, i32 3> %1 = extractelement <2 x double> %split, i64 0 %2 = insertelement <2 x double> undef, double %1, i64 0 %3 = extractelement <2 x double> %split1, i64 0 %4 = insertelement <2 x double> %2, double %3, i64 1 %5 = extractelement <2 x double> %split, i64 1 %6 = insertelement <2 x double> undef, double %5, i64 0 %7 = extractelement <2 x double> %split1, i64 1 %8 = insertelement <2 x double> %6, double %7, i64 1 %9 = shufflevector <2 x double> %4, <2 x double> %8, <4 x i32> <i32 0, i32 1, i32 2, i32 3> store <4 x double> %9, <4 x double>* %Ptr With this patch, we propagate the 2x2 shape information from the transpose to the store and we generate the code below. Note that we store the columns directly and do not need an extra shuffle. %9 = bitcast <4 x double>* %Ptr to double* %10 = bitcast double* %9 to <2 x double>* store <2 x double> %4, <2 x double>* %10, align 8 %11 = getelementptr double, double* %9, i32 2 %12 = bitcast double* %11 to <2 x double>* store <2 x double> %8, <2 x double>* %12, align 8 Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70897	2019-12-23 13:51:56 +01:00
Dinar Temirbulatov	a755ccefe6	[SLP] Replace NeedToGather variable with enum.	2019-12-23 08:21:53 +01:00
Mark de Wever	098d3347e7	[Transforms] Fixes -Wrange-loop-analysis warnings This avoids new warnings now that D68912 adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71810	2019-12-22 19:20:17 +01:00
Sanjay Patel	9cdcd81d3f	[InstCombine] enhance fold for copysign with known sign arg This is another optimization suggested in PR44153: https://bugs.llvm.org/show_bug.cgi?id=44153	2019-12-22 10:07:01 -05:00
Sanjay Patel	79c7fa31f3	[InstCombine] check alloc size in bitcast of geps fold (PR44321) We missed a constraint in D44833 when folding a bitcast into a GEP with vector/array types. If the alloc sizes specified by the datalayout don't match, this could miscompile as shown in: https://bugs.llvm.org/show_bug.cgi?id=44321 Differential Revision: https://reviews.llvm.org/D71771	2019-12-21 10:31:21 -05:00
Sanjay Patel	19f9f374d9	[SimplifyLibCalls] require fast-math-flags for pow(X, -0.5) transforms As discussed in PR44330: https://bugs.llvm.org/show_bug.cgi?id=44330 ...the transform from pow(X, -0.5) libcall/intrinsic to reciprocal square root can result in small deviations from the expected result due to differences in the pow() implementation and/or the extra rounding step from the division. This patch proposes to allow that difference with either the 'approximate functions' or 'reassociate' FMF: http://llvm.org/docs/LangRef.html#fast-math-flags In practice, this likely means that the code is compiled with all of 'fast' (-ffast-math), but I have preserved the existing specializations for -0.0/-INF that enable generating safe code if those special values are allowed simultaneously with allowing approximation/reassociation. The question about whether a similar restriction is needed for the non-reciprocal case -- pow(X, 0.5) -- is deferred. That transform is allowed without FMF currently, and this patch does not change that behavior. Differential Revision: https://reviews.llvm.org/D71706	2019-12-21 10:00:53 -05:00
Jakub Kuderski	c431c407eb	[InstCombine] Improve infinite loop detection Summary: This patch limits the default number of iterations performed by InstCombine. It also exposes a new option that specifies how many iterations count as being stuck in an infinite loop. Based on experiments performed on real-world C++ programs, InstCombine seems to perform at most ~8-20 iterations, so treating 1000 iterations as an infinite loop seems like a safe choice. See D71145 for details. The two limits can be specified via command line options. Reviewers: spatel, lebedev.ri, nikic, xbolva00, grosser Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71673	2019-12-20 16:15:04 -05:00
Ayal Zaks	e498be5738	[LV] Strip wrap flags from vectorized reductions A sequence of additions or multiplications that is known not to wrap may wrap if its order is changed (i.e., reassociated). Therefore when vectorizing integer sum or product reductions, their no-wrap flags need to be removed. Fixes PR43828 Patch by Denis Antrushin Differential Revision: https://reviews.llvm.org/D69563	2019-12-20 14:48:53 +02:00
Vedant Kumar	caaacb8399	HotColdSplitting: Do not outline within noreturn functions A function marked `noreturn` may contain unreachable terminators: these should not be considered cold, as the function may be a trampoline. rdar://58068594	2019-12-19 14:06:24 -08:00
Bjorn Pettersson	89e3bb4502	[ConstantHoisting] Ignore unreachable bb:s when collecting candidates Summary: Ignore looking at blocks that are unreachable from entry when collecting candidates for hoisting. Normally the consthoist pass is executed in the llc pipeline, just after unreachableblockelim. So it is abnormal to have code that is unreachable from the entry block. But when running the pass as part of opt, for example as part of fuzz testing, we might trigger various kinds of asserts when collecting candidates if we include unreachable blocks in that analysis. It seems like a waste of time to hoist constants in unreachable blocks, so the solution is to simply ignore such blocks when collecting the hoisting candidates. The two added test cases used to end up in two different asserts, and the intention with the checks is just to verify that we no longer fail. Fixes: PR43903 Reviewers: spatel Reviewed By: spatel Subscribers: hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71678	2019-12-19 15:07:55 +01:00
David Green	a59cc5e128	[InstCombine] Canonicalize select immediates In certain situations after inlining and simplification we end up with code that is _almost_ a min/max pattern, but contains constants that have been demand-bit optimised to the wrong values, ending up with code like: %1 = icmp slt i32 %shr, -128 %2 = select i1 %1, i32 128, i32 %shr %.inv = icmp sgt i32 %shr, 127 %spec.select.i = select i1 %.inv, i32 127, i32 %2 %conv7 = trunc i32 %spec.select.i to i8 This should be turned into a min/max pattern, but the -128 in the first select was instead transformed into 128, as only the bottom byte was ever demanded. To fix this, I've put in further canonicalisation for the immediates of selects, preferring to use the same value as the icmp if available. Differential Revision: https://reviews.llvm.org/D71516	2019-12-19 12:36:46 +00:00
Piotr Sobczak	40b5a0f7c8	Revert "[InstCombine][AMDGPU] Trim more components of *buffer_load" Revert D70315, as it breaks gfx8 for some reason. This reverts commit `65f94b3380`.	2019-12-18 22:04:44 +01:00
Kit Barton	3db1cf7a1e	[LoopFusion] Use the LoopInfo::isRotatedForm method (NFC). Loop fusion previously had a method to check whether a loop was in rotated form. This method has been moved into the LoopInfo class. This patch removes the old isRotated method from loop fusion, in favour of the new one in LoopInfo.	2019-12-18 15:04:25 -05:00
Jakub Kuderski	3d29c41ad5	[InstCombine] Insert instructions before adding them to worklist Summary: This patch adds instructions to the InstCombine worklist after they are properly inserted. This way we don't get `<badref>`s printed when logging added instructions. It also adds a check in `Worklist::Add` that ensures that all added instructions have parents. Simple test case that illustrates the difference when run with `--debug-only=instcombine`: ``` define i32 @test35(i32 %a, i32 %b) { %1 = or i32 %a, 1135 %2 = or i32 %1, %b ret i32 %2 } ``` Before this patch: ``` INSTCOMBINE ITERATION #1 on test35 IC: ADDING: 3 instrs to worklist IC: Visiting: %1 = or i32 %a, 1135 IC: Visiting: %2 = or i32 %1, %b IC: ADD: %2 = or i32 %a, %b IC: Old = %3 = or i32 %1, %b New = <badref> = or i32 %2, 1135 IC: ADD: <badref> = or i32 %2, 1135 ... ``` With this patch: ``` INSTCOMBINE ITERATION #1 on test35 IC: ADDING: 3 instrs to worklist IC: Visiting: %1 = or i32 %a, 1135 IC: Visiting: %2 = or i32 %1, %b IC: ADD: %2 = or i32 %a, %b IC: Old = %3 = or i32 %1, %b New = <badref> = or i32 %2, 1135 IC: ADD: %3 = or i32 %2, 1135 ... ``` Reviewers: fhahn, davide, spatel, foad, grosser, nikic Reviewed By: nikic Subscribers: nikic, lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71093	2019-12-18 14:55:41 -05:00
Jakub Kuderski	406b6019cd	[InstCombine] Allow to limit the max number of iterations Summary: This patch teaches InstCombine to accept a new parameter: maximum number of iterations over functions. InstCombine tries to simplify instructions by iterating over the whole function until the function stops changing. As a consequence, the last iteration before reaching a fixpoint visits all instructions in the worklist and never performs any rewrites. Bounding the number of iterations can have 2 benefits: * In case the users of the pass can make a good guess about the number of required iterations, we can save the time normally spent on the last iteration that doesn't change anything. * When the user wants to use InstCombine as a cleanup pass, it may be enough to run just a few iterations and stop even before reaching a fixpoint. This can also be useful for implementing a lightweight pass pipeline (think `-O1`). This patch does not change the behavior of opt or Clang -- limiting the number of iterations is entirely opt-in. Reviewers: fhahn, davide, spatel, foad, nlopes, grosser, lebedev.ri, nikic, xbolva00 Reviewed By: spatel Subscribers: craig.topper, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71145	2019-12-18 13:48:54 -05:00
stozer	89d19d60ad	Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISel This reverts commit `1f3dd83cc1`, reapplying commit `bb1b0bc4e5`. The original commit failed on some builds seemingly due to the use of a bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.	2019-12-18 16:26:42 +00:00
Whitney Tsang	9883d7edc6	[LoopUtils] Updated deleteDeadLoop() to handle loop nest. Reviewer: kariddi, sanjoy, reames, Meinersbur, bmahjour, etiotto, kbarton Reviewed By: Meinersbur Subscribers: mgorny, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D70939	2019-12-18 15:59:45 +00:00
stozer	1f3dd83cc1	Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel" Reverted due to build failure on windows bots. This reverts commit `bb1b0bc4e5`.	2019-12-18 11:46:10 +00:00
stozer	bb1b0bc4e5	[DebugInfo] Correctly handle salvaged casts and split fragments at ISel Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. This patch enables the salvaging of casts by using the DW_OP_LLVM_convert operator for SExt and Trunc instructions. There is another issue which is exposed by this fix, in which fragment DIExpressions (which are preserved more readily by this patch) for values that must be split across registers in ISel trigger an assertion, as the 'split' fragments extend beyond the bounds of the fragment DIExpression causing an error. This patch also fixes this issue by checking the fragment status of DIExpressions which are to be split, and dropping fragments that are invalid.	2019-12-18 11:09:18 +00:00
Anna Welker	7cd1cfdd6b	[NFC][TTI] Add Alignment for isLegalMasked[Gather/Scatter] Add an extra parameter so alignment can be taken under consideration in gather/scatter legalization. Differential Revision: https://reviews.llvm.org/D71610	2019-12-18 09:14:39 +00:00
Whitney Tsang	36bdc3dc35	[LoopFusion] Move instructions from FC0.Latch to FC1.Latch. Summary: This PR moves instructions from FC0.Latch bottom up to the beginning of FC1.Latch as long as they are proven safe. To illustrate why this is beneficial, let's consider the following example: Before Fusion: header1: br header2 header2: br header2, latch1 latch1: br header1, preheader3 preheader3: br header3 header3: br header4 header4: br header4, latch3 latch3: br header3, exit3 After Fusion (before this PR): header1: br header2 header2: br header2, latch1 latch1: br header3 header3: br header4 header4: br header4, latch3 latch3: br header1, exit3 Note that preheader3 is removed during fusion before this PR. Notice that we cannot fuse loop2 with loop4 as there exists block latch1 in between. This PR moves instructions from latch1 to the beginning of latch3, and removes block latch1. LoopFusion is now able to fuse loop nests recursively. After Fusion (after this PR): header1: br header2 header2: br header3 header3: br header4 header4: br header2, latch3 latch3: br header1, exit3 Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel, bmahjour, etiotto Reviewed By: kbarton, Meinersbur Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D71165	2019-12-17 22:10:23 +00:00
Stefan Stipanovic	fff8ec9813	[Attributor] H2S fix. Summary: Fixing issues that were noticed in D71521 Reviewers: jdoerfert, lebedev.ri, uenoku Subscribers: Differential Revision: https://reviews.llvm.org/D71564	2019-12-17 20:41:09 +01:00
Piotr Sobczak	65f94b3380	[InstCombine][AMDGPU] Trim more components of buffer_load Summary: Add trimming of unused components of s_buffer_load. Extend trimming of buffer_load to also include unused components at the beginning of vectors and update offset. Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70315	2019-12-17 17:50:07 +01:00
Guillaume Chatelet	531c1161b9	Resubmit "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove" Summary: This is a resubmit of D71473. This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out-of-tree clients. Functions will be deprecated one by one as in-tree code is cleaned up. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: aaron.ballman, courbet Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71547	2019-12-17 10:07:46 +01:00
Whitney Tsang	ec4749e3b8	Revert "[LoopUtils] Updated deleteDeadLoop() to handle loop nest." This reverts commit `cd09fee3d6`. This reverts commit `c066ff11d8`.	2019-12-17 03:51:41 +00:00
Johannes Doerfert	0bc3336ac1	[Attributor][NFC] Clang format the Attributor The Attributor is always kept formatted so diffs are cleaner. Sometimes we get out of sync for various reasons, so we need to format the file once in a while.	2019-12-16 21:03:18 -06:00
Whitney Tsang	c066ff11d8	[LoopUtils] Updated deleteDeadLoop() to handle loop nest. Reviewer: kariddi, sanjoy, reames, Meinersbur, bmahjour, etiotto, kbarton Reviewed By: Meinersbur Subscribers: mgorny, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D70939	2019-12-17 01:06:14 +00:00
Kit Barton	ff07fc66d9	[LoopFusion] Restrict loop fusion to rotated loops. Summary: This patch restricts loop fusion to only consider rotated loops as valid candidates. This simplifies the analysis and transformation and aligns with other loop optimizations. Reviewers: jdoerfert, Meinersbur, dmgreen, etiotto, Whitney, fhahn, hfinkel Reviewed By: Meinersbur Subscribers: ormris, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71025	2019-12-16 15:17:29 -05:00
Craig Topper	02f644c59a	[InstCombine] Teach removeBitcastsFromLoadStoreOnMinMax not to change the size of a store. We can change the type as long as we don't change the size. Fixes PR44306 Differential Revision: https://reviews.llvm.org/D71532	2019-12-16 12:12:54 -08:00
Guillaume Chatelet	4658da10e4	Revert "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove" This reverts commit `181ab91efc`.	2019-12-16 15:19:49 +01:00
Guillaume Chatelet	181ab91efc	[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove Summary: This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out-of-tree clients. Functions will be deprecated one by one as in-tree code is cleaned up. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71473	2019-12-16 13:35:55 +01:00
Bjorn Pettersson	e5f07080b8	[BasicBlockUtils] Fix dbg.value elimination problem in MergeBlockIntoPredecessor Summary: In commit `d60f34c20a` (llvm-svn 317128, PR35113) MergeBlockIntoPredecessor was changed into discarding some dbg.value intrinsics referring to PHI values, post-splice due to loop rotation. That elimination of dbg.value intrinsics did not consider which dbg.value to keep depending on the context (e.g. if the variable is changing its value several times inside the basic block). In the past that hasn't been such a big problem since CodeGenPrepare::placeDbgValues has moved the dbg.value to be next to the PHI node anyway. But after commit `00e238896c` CodeGenPrepare isn't doing that any longer, so we need to be more careful when avoiding duplicate dbg.value intrinsics in MergeBlockIntoPredecessor. This patch replaces the code that tried to avoid duplicate dbg.values by using the RemoveRedundantDbgInstrs helper. Reviewers: aprantl, jmorse, vsk Reviewed By: aprantl, vsk Subscribers: jholewinski, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71480	2019-12-16 11:41:21 +01:00
Bjorn Pettersson	1c49553c19	[BasicBlockUtils] Add utility to remove redundant dbg.value instrs Summary: Add a RemoveRedundantDbgInstrs to BasicBlockUtils with the goal to remove redundant dbg intrinsics from a basic block. This can be useful after various transforms, as it might be simpler to do a filtering of dbg intrinsics after the transform than during the transform. One primary use case would be to replace a too aggressive removal done by MergeBlockIntoPredecessor, seen at loop rotate (not done in this patch). The elimination algorithm currently focuses on dbg.value intrinsics and is doing two iterations over the BB. First we iterate backward starting at the last instruction in the BB. Whenever a consecutive sequence of dbg.value instructions is found we keep the last dbg.value for each variable found (variable fragments are identified using the {DILocalVariable, FragmentInfo, inlinedAt} triple as given by the DebugVariable helper class). Next we iterate forward starting at the first instruction in the BB. Whenever we find a dbg.value describing a DebugVariable (identified by {DILocalVariable, inlinedAt}) we save the {DIValue, DIExpression} that describes that variable's value. But if the variable already was mapped to the same {DIValue, DIExpression} pair we instead drop the second dbg.value. To ease the process of making lit tests for this utility a new pass is introduced called RedundantDbgInstElimination. It can be executed by opt using -redundant-dbg-inst-elim. Reviewers: aprantl, jmorse, vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71478	2019-12-16 11:41:21 +01:00
Johannes Doerfert	139c9ef45a	[Attributor] Annotate call sites of declarations with a callback Even if a declaration is called, if there is a callback we might need the information during CG-SCC traversal (D70767).	2019-12-13 23:51:59 -06:00
Johannes Doerfert	3d347e2835	[Attributor][NFC] Simplify debug printing for abstract attributes This also fixes a typo in the debug printing of AANoAlias.	2019-12-13 23:51:59 -06:00
Johannes Doerfert	5d34602da4	[Attributor] Only replace instruction operands This was part of D70767. When we replace the value of (call/invoke) instructions we do not want to disturb the old call graph so we will only replace instruction uses until we get rid of the old PM. Accepted as part of D70767.	2019-12-13 22:16:38 -06:00
Francesco Petrogalli	19f73f0d1b	Revert "[VectorUtils] Introduce the Vector Function Database (VFDatabase)." This reverts commit `0be81968a2`. The VFDatabase needs some rework to be able to handle vectorization and subsequent scalarization of intrinsics in out-of-tree versions of the compiler. For more details, see the discussion in https://reviews.llvm.org/D67572.	2019-12-13 19:42:04 +00:00
Fangrui Song	193da743db	[profile] Fix a crash when -fprofile-remapping-file= triggers an error Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D71485	2019-12-13 11:38:20 -08:00
Hiroshi Yamauchi	ed50e6060b	[PGO][PGSO] Enable size optimizations in code gen / target passes for cold code. Summary: Split off of D67120. Reviewers: davidxl Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71288	2019-12-13 11:01:19 -08:00
Nicola Zaghen	97572775d2	Reland [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit test was incorrect, so this remained undiscovered. This fixes the buildbot failures. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!	2019-12-13 14:30:21 +00:00
Evgenii Stepanov	dabd2622a8	hwasan: add tag_offset DWARF attribute to optimized debug info Summary: Support alloca-referencing dbg.value in hwasan instrumentation. Update AsmPrinter to emit DW_AT_LLVM_tag_offset when location is in loclist format. Reviewers: pcc Subscribers: srhines, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70753	2019-12-12 16:18:54 -08:00
Johannes Doerfert	6abd01e462	[Attributor][FIX] Do treat byval arguments special When we reason about the pointer argument that is byval we actually reason about a local copy of the value passed at the call site. This was not the case before and we wrongly introduced attributes based on the surrounding function. AAMemoryBehaviorArgument, AAMemoryBehaviorCallSiteArgument and AANoCaptureCallSiteArgument are made aware of byval now. The code to skip "subsuming positions" for reasoning follows a common pattern and we should refactor it. A TODO was added. Discovered by @efriedma as part of D69748.	2019-12-12 16:04:21 -06:00
Florian Hahn	526244b187	[Matrix] Add first set of matrix intrinsics and initial lowering pass. This is the first patch adding an initial set of matrix intrinsics and a corresponding lowering pass. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html The first patch introduces four new intrinsics (transpose, multiply, columnwise load and store) and a LowerMatrixIntrinsics pass, that lowers those intrinsics to vector operations. Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix embedded in a <16 x float> vector) and the intrinsics take the dimension information as parameters. Those parameters need to be ConstantInt. For the memory layout, we initially assume column-major, but in the RFC we also described how to extend the intrinsics to support row-major as well. For the initial lowering, we split the input of the intrinsics into a set of column vectors, transform those column vectors and concatenate the result columns to a flat result vector. This allows us to lower the intrinsics without any shape propagation, as mentioned in the RFC. In follow-up patches, we plan to submit the following improvements: * Shape propagation to eliminate the embedding/splitting for each intrinsic. * Fused & tiled lowering of multiply and other operations. * Optimization remarks highlighting matrix expressions and costs. * Generate loops for operations on large matrixes. * More general block processing for operation on large vectors, exploiting shape information. We would like to add dedicated transpose, columnwise load and store intrinsics, even though they are not strictly necessary. For example, we could emit a large shufflevector instruction instead of the transpose. But we expect that to (1) become unwieldy for larger matrixes (even for 16x16 matrixes, the resulting shufflevector masks would be huge), (2) risk instcombine making small changes, causing us to fail to detect the transpose, preventing better lowerings. For the load/store, we are additionally planning on exploiting the intrinsics for better alias analysis. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70456	2019-12-12 15:42:18 +00:00
Hideto Ueno	4ecf25545c	[Attributor][NFC] Fix comments and unnecessary comma	2019-12-12 13:42:40 +00:00
Hideto Ueno	827bade262	[Attributor] [NFC] Use `checkForAllUses` helper in `AAHeapToStackImpl::updateImpl` Summary: Remove `Worklist` iteration and make use of `checkForAllUses`. There is no test change. Reviewers: sstefan1, jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71352	2019-12-12 13:27:53 +00:00
Hideto Ueno	63599bd072	[Attributor][NFC] Refactoring `AANoFreeArgument::updateImpl` Summary: Refactoring `AANoFreeArgument::updateImpl`. There is no test change. Reviewers: sstefan1, jdoerfert Reviewed By: sstefan1 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71349	2019-12-12 13:27:53 +00:00
Nicola Zaghen	f798eb21ec	Temporarily Revert "[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same." This reverts commit `5f6208778f`. This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.	2019-12-12 10:29:54 +00:00
Nicola Zaghen	5f6208778f	[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit test was incorrect, so this remained undiscovered. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!	2019-12-12 10:07:01 +00:00
Wenlei He	d275a06487	[AutoFDO] Statistic for context sensitive profile guided inlining Summary: AutoFDO compilation has two places that do inlining - the sample profile loader that does inlining with context sensitive profile, and the regular inliner as CGSCC pass. Ideally we want most inlining to come from sample profile loader as that is driven by context sensitive profile and also retains context sensitivity after inlining. However the reality is most of the inlining actually happens during regular inliner. To track the number of inline instances from sample profile loader and help move more inlining to sample profile loader, I'm adding statistics and optimization remarks for sample profile loader's inlining. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70584	2019-12-11 21:37:21 -08:00
Reid Kleckner	5d986953c8	[IR] Split out target specific intrinsic enums into separate headers This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320	2019-12-11 18:02:14 -08:00
Reid Kleckner	85ba5f637a	Rename TTI::getIntImmCost for instructions and intrinsics Soon Intrinsic::ID will be a plain integer, so this overload will not be possible. Rename both overloads to ensure that downstream targets observe this as a build failure instead of a runtime failure. Split off from D71320 Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D71381	2019-12-11 18:00:20 -08:00
Nikita Popov	8db5143b1a	[InstCombine] Optimize overflow check based on uadd.with.overflow result Fix for https://bugs.llvm.org/show_bug.cgi?id=40846. This adds a combine for cases where a (a + b) < a style overflow check is performed, but with a + b being the result of uadd.with.overflow, so the overflow result is also already available and we can just use it. Subsequently GVN/CSE will deduplicate the extracts. We can run into this situation if you have both a uadd.with.overflow and a manual add + overflow check in the same function (on the same operands), in which case GVN will rewrite the add to the with.overflow result and leave you with this pattern. The implementation is a bit ugly because I'm handling the various canonicalization edge cases. This does not yet handle the negated version of this pattern. Differential Revision: https://reviews.llvm.org/D58644	2019-12-11 20:52:04 +01:00
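The equivalence behind the combine above can be sketched in plain C++. This is only an illustration, not the InstCombine code: the helper names are hypothetical, and `__builtin_add_overflow` is the GCC/Clang builtin, which Clang typically lowers to `llvm.uadd.with.overflow` for unsigned operands.

```cpp
#include <cassert>
#include <cstdint>

// Manual check: an unsigned add wrapped around iff the sum is smaller
// than one of its operands. Unsigned wraparound is well defined in C++.
bool overflowsManual(uint32_t a, uint32_t b) {
  uint32_t sum = a + b;
  return sum < a;
}

// Builtin check: returns the overflow bit and stores the wrapped sum.
bool overflowsBuiltin(uint32_t a, uint32_t b, uint32_t *sum) {
  return __builtin_add_overflow(a, b, sum);
}

// Hypothetical helper: both checks agree, and the wrapped sums match,
// which is why the manual comparison can reuse the overflow bit.
bool checksAgree(uint32_t a, uint32_t b) {
  uint32_t viaBuiltin = 0;
  bool ov = overflowsBuiltin(a, b, &viaBuiltin);
  return ov == overflowsManual(a, b) && viaBuiltin == a + b;
}
```

When both forms appear on the same operands, the comparison carries no information beyond the overflow bit, so it can be replaced by that bit.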
Nikita Popov	b361d3bbcd	[MergeFuncs] Remove incorrect attribute copying Fix for https://bugs.llvm.org/show_bug.cgi?id=44236. This code was originally introduced in rG36512330041201e10f5429361bbd79b1afac1ea1. However, the attribute copying was done in the wrong place (in general call replacement, not thunk generation) and a proper fix was implemented in D12581. Previously this code was just unnecessary but harmless (because FunctionComparator ensured that the attributes of the two functions are exactly the same), but since byval was changed to accept a type this copying is actively wrong and may result in malformed IR. Differential Revision: https://reviews.llvm.org/D71173	2019-12-11 20:09:54 +01:00
Guillaume Chatelet	0a0d54b357	[Alignment][NFC] Introduce Align in IRBuilder Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71343	2019-12-11 14:41:23 +01:00
Guillaume Chatelet	3491109587	Rollback assumeAligned in MemorySanitizer Summary: Rollback of parts of D71213. After digging more into the code I think we should leave 0 when creating the instructions (CreateMemcpy, CreateMaskedStore, CreateMaskedLoad). It's probably fine for MemorySanitizer because Alignment is resolved but I'm having a hard time convincing myself it has no impact at all (although tests are passing). Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71332	2019-12-11 14:25:21 +01:00
Guillaume Chatelet	8a7c52bc22	[Alignment][NFC] Introduce Align in SROA Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71277	2019-12-11 09:34:38 +01:00
Vlad Tsyrklevich	636c93ed11	Revert "Reapply: [DebugInfo] Recover debug intrinsics when killing duplicated/empty..." This reverts commit `f2ba93971c`, it was causing build timeouts on sanitizer-x86_64-linux-autoconf such as http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/44917	2019-12-10 16:03:17 -08:00
Francesco Petrogalli	0be81968a2	[VectorUtils] Introduce the Vector Function Database (VFDatabase). This patch introduced the VFDatabase, the framework proposed in http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html. In this patch the VFDatabase is used to bridge the TargetLibraryInfo (TLI) calls that were previously used to query for the availability of vector counterparts of scalar functions. The VFISAKind field `ISA` of VFShape has been moved into VFInfo, under the assumption that different vector ISAs may provide the same vector signature. At the moment, the vectorizer accepts any of the available ISAs as long as the signature provided by the VFDatabase matches the one expected in the vectorization process. For example, when targeting AVX or AVX2, which both have 256-bit registers, the IR signature of the two vector functions associated to the two ISAs is the same. The `getVectorizedFunction` method at the moment returns the first available match. We will need to add more heuristics to the search system to decide which of the available versions (TLI, AVX, AVX2, ...) the system should prefer, when multiple versions with the same VFShape are present. Some of the code in this patch is based on the work done by Sumedh Arani in https://reviews.llvm.org/D66025. Notice that in the proposal the VFDatabase was called SVFS. The name VFDatabase is more in line with LLVM recommendations for naming classes and variables. Differential Revision: https://reviews.llvm.org/D67572	2019-12-10 16:36:44 +00:00
Sanjay Patel	396d18aeb6	[InstCombine] replace shuffle's insertelement operand if inserted scalar is not demanded This pattern is noted as a regression from: D70246 ...where we removed an over-aggressive shuffle simplification. SimplifyDemandedVectorElts fails to catch this case when the insert has multiple uses, so I'm proposing to pattern match the minimal sequence directly. This fold does not conflict with any of our current shuffle undef/poison semantics. Differential Revision: https://reviews.llvm.org/D71220	2019-12-10 10:10:05 -05:00
Guillaume Chatelet	1b2842bf90	[Alignment][NFC] CreateMemSet use MaybeAlign Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71213	2019-12-10 15:17:44 +01:00
stozer	f2ba93971c	Reapply: [DebugInfo] Recover debug intrinsics when killing duplicated/empty... basic blocks Originally applied in `72ce759928`. Fixed a build failure caused by incorrect use of cast instead of dyn_cast. This reverts commit `8b0780f795`.	2019-12-10 13:33:32 +00:00
Djordje Todorovic	9b9e995819	[DebugInfo][EarlyCSE] Use the salvageDebugInfoOrMarkUndef(); NFC Use the newest API. Differential Revision: https://reviews.llvm.org/D71061	2019-12-09 13:57:35 +01:00
David Green	be7a107070	[ARM] Teach the Arm cost model that a Shift can be folded into other instructions This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instructions that the shift can be folded into includes ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966	2019-12-09 10:24:33 +00:00
Florian Hahn	c491949694	[LV] Pick correct BB as insert point when fixing PHI for FORs. Currently we fail to pick the right insertion point when PreviousLastPart of a first-order-recurrence is a PHI node not in the LoopVectorBody. This can happen when PreviousLastPart is produced in a predicated block. In that case, we should pick the insertion point in the BB the PHI is in. Fixes PR44020. Reviewers: hsaito, fhahn, Ayal, dorit Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D71071	2019-12-07 19:32:00 +00:00
Florian Hahn	c25de56905	[SimplifyCFG] Account for N being null. Fixes a crash, e.g. http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15119/	2019-12-07 17:23:42 +00:00
Rodrigo Caetano Rocha	d714aa0dfd	[SimplifyCFG] Handle AssumptionCache being null. AssumptionCache can be null in SimplifyCFGOptions. However, FoldCondBranchOnPHI() was not properly handling that when passing a null AssumptionCache to simplifyCFG. Patch by Rodrigo Caetano Rocha <rcor.cs@gmail.com> Reviewers: fhahn, lebedev.ri, spatel Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D69963	2019-12-07 16:54:49 +00:00
Florian Hahn	e60b36cf92	[VPlan] Rename VPlanHCFGTransforms to VPlanTransforms (NFC). The file is intended to gather various VPlan transformations, not only CFG related transforms. Actually, the only transformation there is not CFG related. Reviewers: Ayal, gilr, hsaito, rengolin Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D70732	2019-12-07 08:56:35 +00:00
Teresa Johnson	c8e36862f5	[WPD] Remove unused parameter (NFC) Remove unused parameter.	2019-12-06 13:14:21 -08:00
Wenlei He	7b61ae68ec	[AutoFDO] Inline replay for cold/small callees from sample profile loader Summary: Sample profile loader of AutoFDO tries to replay previous inlining using context sensitive profile. The replay only repeats inlining if the call site block is hot. As a result it punts inlining of small functions, some of which can be beneficial for size, and will still be inlined by CGSCC inliner later. The oscillation between sample profile loader's inlining and regular CGSCC inlining causes unnecessary loss of context-sensitive profile. It doesn't have much impact for inline decision itself, but it negatively affects post-inline profile quality as CGSCC inliner has to scale counts which is not as accurate as the original context sensitive profile, and bad post-inline profile can misguide code layout. This change added regular Inline Cost calculation for sample profile loader, so we can inline small functions upfront under switch -sample-profile-inline-size. In addition -sample-profile-cold-inline-threshold is added so we can tune the separate size threshold - currently the default is chosen to be the same as regular inliner's cold call-site threshold. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70750	2019-12-06 11:44:45 -08:00
Sanjay Patel	43e2a901e1	Revert "[InstCombine] reduce code duplication; NFC" This reverts commit `db57396584`. At least 1 of these supposedly NFC commits wasn't - sanitizer bot is angry.	2019-12-06 14:24:14 -05:00
Sanjay Patel	b6d6f5470f	Revert "[InstCombine] improve readability; NFC" This reverts commit `7250ef3613`. At least 1 of these supposedly NFC commits wasn't - sanitizer bot is angry.	2019-12-06 14:20:44 -05:00
Sanjay Patel	142a75a9b1	Revert "[InstCombine] reduce indentation; NFC" This reverts commit `8bf8ef7116`. At least 1 of these supposedly NFC commits wasn't - sanitizer bot is angry.	2019-12-06 14:19:02 -05:00
Sanjay Patel	8bf8ef7116	[InstCombine] reduce indentation; NFC	2019-12-06 13:26:45 -05:00
Sanjay Patel	7250ef3613	[InstCombine] improve readability; NFC CreateIntCast returns the input if its type matches, so no need to duplicate that check.	2019-12-06 13:26:45 -05:00
Sanjay Patel	db57396584	[InstCombine] reduce code duplication; NFC	2019-12-06 13:26:45 -05:00
Sanjay Patel	6bb62a9d97	[InstCombine] improve readability; NFC	2019-12-06 13:26:44 -05:00
Gil Rapaport	39ccc099c9	[LV] Record GEP widening decisions in recipe (NFCI) InnerLoopVectorizer's code called during VPlan execution still relies on original IR's def-use relations to decide which vector code to generate, limiting VPlan transformations' ability to modify def-use relations and still have ILV generate the vector code. This commit moves GEP operand queries controlling how GEPs are widened to a dedicated recipe and extracts GEP widening code to its own ILV method taking those recorded decisions as arguments. This reduces ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. Differential revision: https://reviews.llvm.org/D69067	2019-12-06 13:41:19 +02:00
Daniil Suchkov	c4d8c6319f	[LCSSA] Don't use VH callbacks to invalidate SCEV when creating LCSSA phis In general ValueHandleBase::ValueIsRAUWd shouldn't be called when not all uses of the value were actually replaced, though, currently formLCSSAForInstructions calls it when it inserts LCSSA-phis. Calls of ValueHandleBase::ValueIsRAUWd were added to LCSSA specifically to update/invalidate SCEV. In the best case these calls duplicate some of the work already done by SE->forgetValue, though in case when SCEV of the value is SCEVUnknown, SCEV replaces the underlying value of SCEVUnknown with the new value (i.e. acts like LCSSA-phi actually fully replaces the value it is created for), which leads to SCEV being corrupted because LCSSA-phi rarely dominates all uses of its inputs. Fixes bug https://bugs.llvm.org/show_bug.cgi?id=44058. Reviewers: fhahn, efriedma, reames, sanjoy.google Reviewed By: fhahn Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70593	2019-12-06 13:21:49 +07:00
Teresa Johnson	54a3c2a81e	[ThinLTO] Add option to disable readonly/writeonly attribute propagation Summary: Add an option to allow the attribute propagation on the index to be disabled, to allow a workaround for issues (such as that fixed by D70977). Also move the setting of the WithAttributePropagation flag on the index into propagateAttributes(), and remove some old stale code that predated this flag and cleared the maybe read/write only bits when we need to disable the propagation (previously only when importing disabled, now also when the new option disables it). Reviewers: evgeny777, steven_wu Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70984	2019-12-05 16:33:54 -08:00
Wenlei He	532196d811	[AutoFDO] Top-down Inlining for specialization with context-sensitive profile Summary: AutoFDO's sample profile loader processes functions in arbitrary source code order, so if I change the order of two functions in source code, the inline decision can change. This also prevented the use of context-sensitive profile to do specialization while inlining. This commit enforces SCC top-down order for sample profile loader. With this change, we can now do specialization, as illustrated by the added test case: Say if we have A->B->C and D->B->C call path, we want to inline C into B when root inliner is B, but not when root inliner is A or D, this is not possible without enforcing top-down order. E.g. Once C is inlined into B, A and D can only choose to inline (B->C) as a whole or nothing, but what we want is only inline B into A and D, not its recursive callee C. If we process functions in top-down order, this is no longer a problem, which is what this commit is doing. This change is guarded with a new switch "-sample-profile-top-down-load" for tuning, and it depends on D70653. Eventually, top-down can be the default order for sample profile loader. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits, tejohnson Tags: #llvm Differential Revision: https://reviews.llvm.org/D70655	2019-12-05 16:07:01 -08:00
Wenlei He	e503fd85d3	[AutoFDO] Properly merge context-sensitive profile of inlinee back to outlined function Summary: When sample profile loader decides not to inline a previously inlined call-site, we adjust the profile of outlined function simply by scaling up its profile counts by call-site count. This means the context-sensitive profile of that inlined instance will be thrown away. This commit tries to keep context-sensitive profile for such cases: - Instead of scaling outlined function's profile, we now properly merge the FunctionSamples of inlined instance into outlined function, including all recursively inlined profile. - Instead of adjusting the profile for negative inline decision at the end of the sample profile loader pass, we do the profile merge right after processing each function. This change paired with top-down ordering of annotation/inline-replay (a separate diff) will make sure we recursively merge profile back before the profile is used for annotation and inline replay. A new switch -sample-profile-merge-inlinee is added to enable the new profile merge for tuning. It should be the default behavior eventually. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70653	2019-12-05 15:57:55 -08:00
Florian Hahn	19071173fc	Revert "[DSE] Fix for a dangling point bug in DeadStoreElimination." The commit causes a failure: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20911 This reverts commit `1847fd9d85`.	2019-12-05 19:29:21 +00:00
Evgenii Stepanov	6f89cbc429	LowerDbgDeclare: look through bitcasts. Summary: Emit a value debug intrinsic (with OP_deref) when an alloca address is passed to a function call after going through a bitcast. This generates an FP or SP-relative location for the local variable in the following case: int x; use((void *)&x); Reviewers: aprantl, vsk, pcc Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70752	2019-12-05 11:19:07 -08:00
Bob Haarman	055779a9ac	Revert "[InstCombine] keep assumption before sinking calls" Summary: This reverts commit `c3b06d0c39`. Reason for revert: Caused miscompiles when inserting assume for undef. Also adds a test to prevent similar breakage in future. Fixes PR44154. Reviewers: rnk, jdoerfert, efriedma, xbolva00 Reviewed By: rnk Subscribers: thakis, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70933	2019-12-05 10:39:34 -08:00
Roman Lebedev	796fa662f1	[InstCombine] Invert `add A, sext(B) --> sub A, zext(B)` canonicalization (to `sub A, zext B -> add A, sext B`) Summary: D68408 proposes to greatly improve our negation sinking abilities. But in current canonicalization, we produce `sub A, zext(B)`, which we will consider non-canonical and try to sink that negation, undoing the existing canonicalization. So unless we explicitly stop producing previous canonicalization, we will have two conflicting folds, and will end up endlessly looping. This inverts canonicalization, and adds back the obvious fold that we'd miss: * `sub [nsw] Op0, sext/zext (bool Y) -> add [nsw] Op0, zext/sext (bool Y)` https://rise4fun.com/Alive/xx4 * `sext(bool) + C -> bool ? C - 1 : C` https://rise4fun.com/Alive/fBl It is obvious that `@ossfuzz_9880()` / `@lshr_out_of_range()`/`@ashr_out_of_range()` (oss-fuzz 4871) are no longer folded as much, though those aren't really worrying. Reviewers: spatel, efriedma, t.p.northover, hfinkel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71064	2019-12-05 21:21:30 +03:00
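The two boolean folds named in the message above can be checked with ordinary integer arithmetic. The following is a minimal sketch with hypothetical helper names: it models `zext`/`sext` of an `i1` by hand and is not the InstCombine implementation.

```cpp
#include <cassert>
#include <cstdint>

// zext of an i1: false -> 0, true -> 1.
int32_t zextBool(bool b) { return b ? 1 : 0; }

// sext of an i1: false -> 0, true -> -1 (all bits set).
int32_t sextBool(bool b) { return b ? -1 : 0; }

// Old canonical form: sub A, zext(B).
int32_t subZext(int32_t a, bool b) { return a - zextBool(b); }

// New canonical form: add A, sext(B).
int32_t addSext(int32_t a, bool b) { return a + sextBool(b); }

// sext(bool) + C ...
int32_t sextPlusC(bool b, int32_t c) { return sextBool(b) + c; }

// ... folds to bool ? C - 1 : C.
int32_t selectForm(bool b, int32_t c) { return b ? c - 1 : c; }
```

Both pairs compute the same value for every input that does not overflow, so picking one direction as canonical is a matter of which form later folds handle better.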
Ankit	1847fd9d85	[DSE] Fix for a dangling point bug in DeadStoreElimination. The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction. While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration. In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container. Patch by Ankit <quic_aankit@quicinc.com> Reviewers: fhahn, bcahoon, efriedma Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D65326	2019-12-05 17:53:58 +00:00
Sanjay Patel	3c6b5d3674	[InstCombine] narrow select with FP casts Select doesn't change values, so truncate of extended operand cancels out.	2019-12-05 11:12:44 -05:00
Sanjay Patel	51e420c27e	[InstCombine] add FMF guard to builder in fptrunc transform; NFC This makes no difference currently because we don't apply FMF to FP casts, but that may change. This could also be a place to add a fold for select with fptrunc, so it will make that patch easier/smaller.	2019-12-05 10:55:07 -05:00
Roman Lebedev	09311459e3	[InstCombine] Extend `0 - (X sdiv C) -> (X sdiv -C)` fold to non-splat vectors Split off from https://reviews.llvm.org/D68408	2019-12-05 15:48:29 +03:00
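As a scalar illustration of the fold above, the sketch below compares `0 - (X / C)` with `X / -C`. The helper names are hypothetical, and the guard is an assumption drawn from the arithmetic rather than from the patch: `C` must be nonzero, `-C` must be representable, and `C == 1` is excluded because `X / -1` is undefined for the minimum value.

```cpp
#include <cassert>
#include <cstdint>

// Assumed side condition for the rewrite to be safe on every X.
bool divisorAllowsFold(int32_t c) {
  return c != 0 && c != 1 && c != INT32_MIN;
}

// Original form: negate the quotient.
int32_t negOfDiv(int32_t x, int32_t c) { return 0 - (x / c); }

// Folded form: divide by the negated constant instead.
int32_t divByNeg(int32_t x, int32_t c) { return x / -c; }
```

For a non-splat vector the same reasoning applies lane by lane, with each lane's constant checked separately.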
Teresa Johnson	e420c0c78e	[ThinLTO] Fix importing of writeonly variables in distributed ThinLTO Summary: D69561/dde5893 enabled importing of readonly variables with references, however, it introduced a bug relating to importing/internalization of writeonly variables with references. A fix for this was added in D70006/7f92d66. But this didn't work in distributed ThinLTO mode. The reason is that the fix (importing the writeonly var with a zeroinitializer) was only applied when there were references on the writeonly var summary. In distributed ThinLTO mode, where we only have a small slice of the index, we will not have the references on the importing side if we are not importing those referenced values. Rather than changing this handshaking (which will require a lot of other changes, since that's how we know what to import in the distributed backend clang invocation), we can simply always give the writeonly variable a zero initializer. Reviewers: evgeny777, steven_wu Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70977	2019-12-04 14:59:27 -08:00
Tozer	8b0780f795	Revert "[DebugInfo] Recover debug intrinsics when killing duplicated/empty basic blocks" This reverts commit `72ce759928`. Reverted due to build failure.	2019-12-04 18:47:08 +00:00
Vedant Kumar	f208b70fbc	Revert "[Coverage] Revise format to reduce binary size" This reverts commit `e18531595b`. On Windows, there is an error: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/54963/steps/stage%201%20check/logs/stdio error: C:\b\slave\sanitizer-windows\build\stage1\projects\compiler-rt\test\profile\Profile-x86_64\Output\instrprof-merging.cpp.tmp.v1.o: Failed to load coverage: Malformed coverage data	2019-12-04 10:35:14 -08:00
Vedant Kumar	e18531595b	[Coverage] Revise format to reduce binary size Revise the coverage mapping format to reduce binary size by: 1. Naming function records and marking them `linkonce_odr`, and 2. Compressing filenames. This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB) and speeds up end-to-end single-threaded report generation by 10%. For reference the compressed name data in llc is 81MB (__llvm_prf_names). Rationale for changes to the format: - With the current format, most coverage function records are discarded. E.g., more than 97% of the records in llc are duplicate placeholders for functions visible-but-not-used in TUs. Placeholders are used to show under-covered functions, but duplicate placeholders waste space. - We reached general consensus about giving (1) a try at the 2017 code coverage BoF [1]. The thinking was that using `linkonce_odr` to merge duplicates is simpler than alternatives like teaching build systems about a coverage-aware database/module/etc on the side. - Revising the format is expensive due to the backwards compatibility requirement, so we might as well compress filenames while we're at it. This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB). See CoverageMappingFormat.rst for the details on what exactly has changed. Fixes PR34533 [2], hopefully. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html [2] https://bugs.llvm.org/show_bug.cgi?id=34533 Differential Revision: https://reviews.llvm.org/D69471	2019-12-04 10:10:55 -08:00
Florian Hahn	e8a5c17211	[LoopInterchange] Improve inner exit loop safety checks. The PHI node checks for inner loop exits are too permissive currently. As indicated by an existing comment, we should only allow LCSSA PHI nodes that are part of reductions or are only used outside of the loop nest. We ensure this by checking the users of the LCSSA PHIs. Specifically, it is not safe to use an exiting value from the inner loop in the latch of the outer loop. It also moves the inner loop exit check before the outer loop exit check. Fixes PR43473. Reviewers: efriedma, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D68144	2019-12-04 17:46:01 +00:00
Francesco Petrogalli	a249551bb2	[llvm][Transform] Remove unused variable. [NFCI] The variable prevents compiling when using -Werror=unused-variable.	2019-12-04 17:40:30 +00:00
Hiroshi Yamauchi	62d429972e	[PGO][PGSO] Distinguish queries from unit tests and explicitly enable for the existing IR passes only. NFC. Summary: This is one more prep step necessary before the code gen pass instrumentation code could go in. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70988	2019-12-04 09:35:50 -08:00
stozer	72ce759928	[DebugInfo] Recover debug intrinsics when killing duplicated/empty basic blocks When basic blocks are killed, either due to being empty or to being an if.then or if.else block whose complement contains identical instructions, some of the debug intrinsics in that block are lost. This patch sinks those intrinsics into the single successor block, setting them Undef if necessary to prevent debug info from falling out-of-date. Differential Revision: https://reviews.llvm.org/D70318	2019-12-04 16:01:49 +00:00
Florian Hahn	4a9cde5a79	[SimpleLoopUnswitch] Invalidate the topmost loop with ExitBB as exiting. SCEV caches the exiting blocks when computing exit counts. In SimpleLoopUnswitch, we split the exit block of the loop to unswitch. Currently we only invalidate the loop containing that exit block, but if that block is the exiting block for a parent loop, we have stale cache entries. We have to invalidate the top-most loop that contains the exit block as exiting block. We might also be able to skip invalidating the loop containing the exit block, if the exit block is not an exiting block of that loop. There are also 2 more places in SimpleLoopUnswitch, that use a similar problematic approach to get the loop to invalidate. If the patch makes sense, I will also update those places to a similar approach (they deal with multiple exit blocks, so we cannot directly re-use getTopMostExitingLoop). Fixes PR43972. Reviewers: skatkov, reames, asbirlea, chandlerc Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D70786	2019-12-04 11:32:09 +00:00
Ehud Katz	2b6b8cb10c	[APFloat] Prevent construction of APFloat with Semantics and FP value Constructor invocations such as `APFloat(APFloat::IEEEdouble(), 0.0)` may seem like they accept a FP (floating point) value, but the overload they reach is actually the `integerPart` one, not a `float` or `double` overload (which only exists when `fltSemantics` isn't passed). This may lead to possible loss of data, by the conversion from `float` or `double` to `integerPart`. To prevent future mistakes, a new constructor overload that accepts any FP value is added and marked with `delete`, to prevent its usage. Fixes PR34095. Differential Revision: https://reviews.llvm.org/D70425	2019-12-04 12:02:04 +02:00
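The deleted-overload technique described above can be sketched with a toy class. `ToyFloat` and `Semantics` are hypothetical stand-ins for `APFloat` and `fltSemantics`; the real constructor signatures in LLVM may differ from this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

struct Semantics {}; // hypothetical stand-in for llvm::fltSemantics

class ToyFloat {
public:
  using integerPart = uint64_t;

  // The overload that a floating-point argument used to reach silently
  // through an implicit double -> integer conversion.
  ToyFloat(const Semantics &, integerPart Value) : Bits(Value) {}

  // Any floating-point second argument is an exact match for this
  // template, so overload resolution picks it and the call fails to
  // compile instead of truncating the value.
  template <typename T, typename = std::enable_if_t<
                            std::is_floating_point<T>::value>>
  ToyFloat(const Semantics &, T) = delete;

  integerPart bits() const { return Bits; }

private:
  integerPart Bits;
};

// Integer arguments still construct; floating-point ones are rejected.
static_assert(
    std::is_constructible<ToyFloat, const Semantics &, uint64_t>::value,
    "integer overload must stay usable");
static_assert(
    !std::is_constructible<ToyFloat, const Semantics &, double>::value,
    "floating-point argument must be rejected");
```

With this in place, `ToyFloat(Semantics{}, 0.0)` is a compile error, while `ToyFloat(Semantics{}, 0)` still selects the integer overload.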
Craig Topper	5ebbabc1af	[InstCombine] Revert `aafde063aa` and `6749dc3446` related to bitcast handling of x86_mmx This reverts these two commits [InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double. [InstCombine] Don't transform bitcasts between x86_mmx and v1i64 into insertelement/extractelement We're seeing at least one internal test failure related to a bitcast that was previously before an inline assembly block containing emms being placed after it. This leads to the mmx state ending up not empty after the emms. IR has no way to make any specific guarantees about this. Reverting these patches to get back to previous behavior which at least worked for this test.	2019-12-03 14:02:22 -08:00
Ayal Zaks	6ed9cef25f	[LV] Scalar with predication must not be uniform Fix PR40816: avoid considering scalar-with-predication instructions as also uniform-after-vectorization. Instructions identified as "scalar with predication" will be "vectorized" using a replicating region. If such instructions are also optimized as "uniform after vectorization", namely when only the first of VF lanes is used, such a replicating region becomes erroneous - only the first instance of the region can and should be formed. Fix such cases by not considering such instructions as "uniform after vectorization". Differential Revision: https://reviews.llvm.org/D70298	2019-12-03 19:50:24 +02:00
Anton Afanasyev	a315519c17	[SLP] Enhance SLPVectorizer to vectorize different combinations of aggregates Summary: Make SLPVectorizer recognize homogeneous aggregates like `{<2 x float>, <2 x float>}`, `{{float, float}, {float, float}}`, `[2 x {float, float}]` and so on. It's a follow-up of https://reviews.llvm.org/D70068. Merged `findBuildVector()` and `findBuildAggregate()` to one `findBuildAggregate()` function making it recursive to recognize multidimensional aggregates. Aggregates are required to be homogeneous. Reviewers: RKSimon, ABataev, dtemirbulatov, spatel, vporpo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70587	2019-12-03 19:29:27 +03:00
Florian Hahn	e9c68422de	[VPlan] Add dump function to VPlan class. This adds a dump() function to VPlan, which uses the existing operator<<. This method provides a convenient way to dump a VPlan while debugging, e.g. from lldb. Reviewers: hsaito, Ayal, gilr, rengolin Reviewed By: hsaito Differential Revision: https://reviews.llvm.org/D70920	2019-12-03 11:59:10 +00:00
Johannes Altmanninger	09667bc192	[asan] Remove debug locations from alloca prologue instrumentation Summary: This fixes https://llvm.org/PR26673 "Wrong debugging information with -fsanitize=address" where asan instrumentation causes the prologue end to be computed incorrectly: findPrologueEndLoc, looks for the first instruction with a debug location to determine the prologue end. Since the asan instrumentation instructions had debug locations, that prologue end was at some instruction, where the stack frame is still being set up. There seems to be no good reason for extra debug locations for the asan instrumentations that set up the frame; they don't have a natural source location. In the debugger they are simply located at the start of the function. For certain other instrumentations like -fsanitize-coverage=trace-pc-guard the same problem persists - that might be more work to fix, since it looks like they rely on locations of the tracee functions. This partly reverts `aaf4bb2394` "[asan] Set debug location in ASan function prologue" whose motivation was to give debug location info to the coverage callback. Its test only ensures that the call to @__sanitizer_cov_trace_pc_guard is given the correct source location; as the debug location is still set in ModuleSanitizerCoverage::InjectCoverageAtBlock, the test does not break. So -fsanitize-coverage is hopefully unaffected - I don't think it should rely on the debug locations of asan-generated allocas. Related revision: `3c6c14d14b` "ASAN: Provide reliable debug info for local variables at -O0." Below is how the X86 assembly version of the added test case changes. We get rid of some .loc lines and put prologue_end where the user code starts. 
```diff
--- 2.master.s 2019-12-02 12:32:38.982959053 +0100
+++ 2.patch.s 2019-12-02 12:32:41.106246674 +0100
@@ -45,8 +45,6 @@
         .cfi_offset %rbx, -24
         xorl %eax, %eax
         movl %eax, %ecx
-.Ltmp2:
-        .loc 1 3 0 prologue_end # 2.c:3:0
         cmpl $0, __asan_option_detect_stack_use_after_return
         movl %edi, 92(%rbx) # 4-byte Spill
         movq %rsi, 80(%rbx) # 8-byte Spill
@@ -57,9 +55,7 @@
         callq __asan_stack_malloc_0
         movq %rax, 72(%rbx) # 8-byte Spill
 .LBB1_2:
-        .loc 1 0 0 is_stmt 0 # 2.c:0:0
         movq 72(%rbx), %rax # 8-byte Reload
-        .loc 1 3 0 # 2.c:3:0
         cmpq $0, %rax
         movq %rax, %rcx
         movq %rax, 64(%rbx) # 8-byte Spill
@@ -72,9 +68,7 @@
         movq %rax, %rsp
         movq %rax, 56(%rbx) # 8-byte Spill
 .LBB1_4:
-        .loc 1 0 0 # 2.c:0:0
         movq 56(%rbx), %rax # 8-byte Reload
-        .loc 1 3 0 # 2.c:3:0
         movq %rax, 120(%rbx)
         movq %rax, %rcx
         addq $32, %rcx
@@ -99,7 +93,6 @@
         movb %r8b, 31(%rbx) # 1-byte Spill
         je .LBB1_7
 # %bb.5:
-        .loc 1 0 0 # 2.c:0:0
         movq 40(%rbx), %rax # 8-byte Reload
         andq $7, %rax
         addq $3, %rax
@@ -118,7 +111,8 @@
         movl %ecx, (%rax)
         movq 80(%rbx), %rdx # 8-byte Reload
         movq %rdx, 128(%rbx)
-        .loc 1 4 3 is_stmt 1 # 2.c:4:3
+.Ltmp2:
+        .loc 1 4 3 prologue_end # 2.c:4:3
         movq %rax, %rdi
         callq f
         movq 48(%rbx), %rax # 8-byte Reload
```
Reviewers: eugenis, aprantl Reviewed By: eugenis Subscribers: ormris, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70894	2019-12-03 11:24:17 +01:00
Bill Wendling	87f146767e	Place the "cold" code piece into the same section as the original function Summary: This cropped up in the Linux kernel where cold code was placed in an incompatible section. Reviewers: compnerd, vsk, tejohnson Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70925	2019-12-02 15:24:59 -08:00
Hiroshi Yamauchi	8cdfdfeee6	[PGO][PGSO] Add an optional query type parameter to shouldOptimizeForSize. Summary: In case of a need to distinguish different query sites for gradual commit or debugging of PGSO. NFC. Reviewers: davidxl Subscribers: hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70510	2019-12-02 13:54:13 -08:00
Florian Hahn	fe459ce65a	[VPlan] Move graph traits (NFC). By defining the graph traits right after the VPBlockBase definitions, we can make use of them earlier in the file. Reviewers: hsaito, Ayal, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D70733	2019-12-02 18:23:11 +00:00
Sanjay Patel	af4e59949c	[InstCombine] fix undef propagation for vector urem transform (PR44186) As described here: https://bugs.llvm.org/show_bug.cgi?id=44186 The match() code safely allows undef values, but we can't safely propagate a vector constant that contains an undef to the new compare instruction.	2019-12-02 12:17:38 -05:00
Simon Tatham	01aefae4a1	[ARM,MVE] Add an InstCombine rule permitting VPNOT. Summary: If a user writing C code using the ACLE MVE intrinsics generates a predicate and then complements it, then the resulting IR will use the `pred_v2i` IR intrinsic to turn some `<n x i1>` vector into a 16-bit integer; complement that integer; and convert back. This will generate machine code that moves the predicate out of the `P0` register, complements it in an integer GPR, and moves it back in again. This InstCombine rule replaces `i2v(~v2i(x))` with a direct complement of the original predicate vector, which we can already instruction- select as the VPNOT instruction which complements P0 in place. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70484	2019-12-02 16:20:30 +00:00
Roman Lebedev	0f22e783a0	[InstCombine] Revert rL341831: relax one-use check in foldICmpAddConstant() (PR44100) rL341831 moved one-use check higher up, restricting a few folds that produced a single instruction from two instructions to the case where the inner instruction would go away. Original commit message: > InstCombine: move hasOneUse check to the top of foldICmpAddConstant > > There were two combines not covered by the check before now, > neither of which actually differed from normal in the benefit analysis. > > The most recent seems to be because it was just added at the top of the > function (naturally). The older is from way back in 2008 (r46687) > when we just didn't put those checks in so routinely, and has been > diligently maintained since. From the commit message alone, there doesn't seem to be a deeper motivation, or a deeper problem that it was trying to solve, other than 'fixing the wrong one-use check'. As i have briefly discussed in IRC with Tim, the original motivation can no longer be recovered, too much time has passed. However i believe that the original fold was doing the right thing, we should be performing such a transformation even if the inner `add` will not go away - that will still unchain the comparison from `add`, it will no longer need to wait for `add` to compute. Doing so doesn't seem to break any particular idioms, at least as far as i can see. References https://bugs.llvm.org/show_bug.cgi?id=44100	2019-12-02 18:06:15 +03:00
Sanjay Patel	af0babc90a	[InstCombine] fold copysign with constant sign argument to (fneg+)fabs If the sign of the sign argument is known (this could be extended to use ValueTracking), then we can use fneg+fabs to clear/set the sign bit of the magnitude argument. http://llvm.org/docs/LangRef.html#llvm-copysign-intrinsic This transform is already done in DAGCombiner, but we can do it sooner in IR as suggested in PR44153: https://bugs.llvm.org/show_bug.cgi?id=44153 We have effectively no analysis for copysign in IR, so we are taking the unusual step of increasing the number of IR instructions for the negative constant case. Differential Revision: https://reviews.llvm.org/D70792	2019-12-02 09:23:12 -05:00
Bjorn Pettersson	a9d6b0e544	[InstCombine] Fix big-endian miscompile of (bitcast (zext/trunc (bitcast))) Summary: optimizeVectorResize is rewriting patterns like: %1 = bitcast vector %src to integer %2 = trunc/zext %1 %dst = bitcast %2 to vector Since bitcasting between integer and vector types gives different integer values depending on endianness, we need to take endianness into account. As it happens the old implementation only produced the correct result for little endian targets. Fixes: https://bugs.llvm.org/show_bug.cgi?id=44178 Reviewers: spatel, lattner, lebedev.ri Reviewed By: spatel, lebedev.ri Subscribers: lebedev.ri, hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70844	2019-12-02 11:05:25 +01:00
David Green	59b56e5c57	[InstCombine] Expand usub_sat patterns to handle constants The constants come through as add %x, -C, not a sub as would be expected. They need some extra matchers to canonicalise them towards usub_sat. Differential Revision: https://reviews.llvm.org/D69514	2019-11-30 16:58:01 +00:00
David Green	3a1bef5616	[InstCombine] Adjust usub_sat fold one use checks This adjusts the one use checks in the usub_sat fold code to not increase instruction count, but otherwise do the fold. Reviewed as a part of D69514.	2019-11-30 16:58:00 +00:00
Hideto Ueno	6c742fdbf4	[Attributor] Deduce dereferenceable based on accessed bytes map Summary: This patch introduces the deduction based on load/store instructions whose pointer operand is a non-inbounds GEP instruction. For example if we have, ``` void f(int *u){ u[0] = 0; u[1] = 1; u[2] = 2; } ``` then u must be dereferenceable(12). This patch is inspired by D64258 Reviewers: jdoerfert, spatel, hfinkel, RKSimon, sstefan1, xbolva00, dtemirbulatov Reviewed By: jdoerfert Subscribers: jfb, lebedev.ri, xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70714	2019-11-29 06:55:58 +00:00
Hideto Ueno	dfedae5001	[Attributor] Remove dereferenceable_or_null when nonnull is present Summary: This patch prevents the simultaneous presence of the `dereferenceable` and `dereferenceable_or_null` attributes Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70789	2019-11-29 06:45:07 +00:00
Dávid Bolvanský	40963b2bf0	Revert "[Attributor] Move pass after InstCombine to further eliminate null pointer checks" This reverts commit `7ca7d62c6e`. Committed accidentally.	2019-11-27 22:45:47 +01:00
Dávid Bolvanský	7ca7d62c6e	[Attributor] Move pass after InstCombine to further eliminate null pointer checks Summary: PR44149 Reviewers: jdoerfert Subscribers: mehdi_amini, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70737	2019-11-27 22:36:51 +01:00
Hideto Ueno	0f4383faa7	[Attributor] Handle special case when offset equals zero in nonnull deduction	2019-11-27 14:45:16 +00:00
Eric Christopher	fd39b1bb20	Revert "Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there."" This reapplies: `8ff85ed905` Original commit message: As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there. This change doesn't include any change to move from selection dag to fast isel and that will come with other numbers that should help inform that decision. There also haven't been any real debuggability studies with this pipeline yet, this is just the initial start done so that people could see it and we could start tweaking after. Test updates: Outside of the newpm tests most of the updates are coming from either optimization passes not run anymore (and without a compelling argument at the moment) that were largely used for canonicalization in clang. Original post: http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html Tags: #llvm Differential Revision: https://reviews.llvm.org/D65410 This reverts commit `c9ddb02659`.	2019-11-26 20:28:52 -08:00
Dávid Bolvanský	0e32fbd223	[InstCombine] Fixed std::min on some bots. NFCI	2019-11-26 11:06:31 +01:00
Dávid Bolvanský	bb7b8540f0	[InstCombine] Optimize some memccpy calls to memcpy/null Summary: return memccpy(d, "helloworld", 'r', 20) => return memcpy(d, "helloworld", 8 /* pos of 'r' in string */), d + 8 Reviewers: efriedma, jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68089	2019-11-26 10:54:47 +01:00
Hideto Ueno	78a750276f	[Attributor] Track a GEP Instruction in align deduction Summary: This patch enables us to track GEP instruction in align deduction. If a pointer `B` is defined as `A+Offset` and known to have alignment `C`, there exists some integer Q such that ``` A + Offset = C * Q = B ``` So we can say that the maximum power of two which is a divisor of gcd(Offset, C) is an alignment. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70392	2019-11-26 07:55:28 +00:00
Muhammad Omair Javaid	c9ddb02659	Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there." This reverts commit `8ff85ed905`. This commit introduced 9 new failures on lldb buildbot host at http://lab.llvm.org:8014/builders/lldb-aarch64-ubuntu Following tests were failing: lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq1/TestAmbiguousTailCallSeq1.py lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq2/TestAmbiguousTailCallSeq2.py lldb-api :: functionalities/tail_call_frames/disambiguate_call_site/TestDisambiguateCallSite.py lldb-api :: functionalities/tail_call_frames/disambiguate_paths_to_common_sink/TestDisambiguatePathsToCommonSink.py lldb-api :: functionalities/tail_call_frames/disambiguate_tail_call_seq/TestDisambiguateTailCallSeq.py lldb-api :: functionalities/tail_call_frames/inlining_and_tail_calls/TestInliningAndTailCalls.py lldb-api :: functionalities/tail_call_frames/sbapi_support/TestTailCallFrameSBAPI.py lldb-api :: functionalities/tail_call_frames/thread_step_out_message/TestArtificialFrameStepOutMessage.py lldb-api :: functionalities/tail_call_frames/thread_step_out_or_return/TestSteppingOutWithArtificialFrames.py lldb-api :: functionalities/tail_call_frames/unambiguous_sequence/TestUnambiguousTailCalls.py Tags: #llvm Differential Revision: https://reviews.llvm.org/D65410	2019-11-26 09:32:13 +05:00
Eric Christopher	8ff85ed905	As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there. This change doesn't include any change to move from selection dag to fast isel and that will come with other numbers that should help inform that decision. There also haven't been any real debuggability studies with this pipeline yet, this is just the initial start done so that people could see it and we could start tweaking after. Test updates: Outside of the newpm tests most of the updates are coming from either optimization passes not run anymore (and without a compelling argument at the moment) that were largely used for canonicalization in clang. Original post: http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html Tags: #llvm Differential Revision: https://reviews.llvm.org/D65410	2019-11-25 17:16:46 -08:00
Sanjay Patel	35827164c4	[InstCombine] remove shuffle mask canonicalization that creates undef elements This is NFC-intended because SimplifyDemandedVectorElts() does the same transform later. As discussed in D70641, we may want to change that behavior, so we need to isolate where it happens.	2019-11-25 13:33:56 -05:00
Whitney Tsang	aaf7f05a96	[NFC][LoopFusion] Use isControlFlowEquivalent() from CodeMoverUtils. Reviewer: kbarton, jdoerfert, Meinersbur, bmahjour, etiotto Reviewed By: Meinersbur Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D70619	2019-11-25 17:54:42 +00:00
Sanjay Patel	e85d2e4981	[InstCombine] prevent infinite loop from conflicting shuffle mask transforms The pattern in question is currently not possible because we aggressively (wrongly) transform mask elements to undef values if they choose from an undef operand. That, however, would change if we tighten our semantics for shuffles as discussed in D70641. Adding this check gives us the flexibility to make that change with minimal overhead for current definitions.	2019-11-25 12:00:41 -05:00
Sanjay Patel	fc31b58eff	[InstCombine] simplify code for shuffle mask canonicalization; NFC We never use the local 'Mask' before returning, so that was dead code.	2019-11-25 11:11:12 -05:00
Sanjay Patel	847aabf11f	[InstCombine] remove dead code from shuffle mask canonicalization; NFC	2019-11-25 10:54:18 -05:00
Sanjay Patel	20684092ab	[InstCombine] simplify loop for shuffle mask canonicalization; NFC	2019-11-25 10:41:50 -05:00
OCHyams	2de23c8364	[DebugInfo@O2][Utils] Undef instead of delete dbg.values in helper func Summary: Related bug: https://bugs.llvm.org/show_bug.cgi?id=40648 Static helper function rewriteDebugUsers in Local.cpp deletes dbg.value intrinsics when it cannot move or rewrite them, or salvage the deleted instruction's value. It should instead undef them in this case. This patch fixes that and I've added a test which covers the failing test case in bz40648. I've updated the unit test Local.ReplaceAllDbgUsesWith to check for this behaviour (and fixed a typo in the test which would cause the old test to always pass). Reviewers: aprantl, vsk, djtodoro, probinson Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70604	2019-11-25 10:55:14 +00:00
Florian Hahn	9d24933f79	Recommit `f0c2a5a` "[LV] Generalize conditions for sinking instrs for first order recurrences." This version contains 2 fixes for reported issues: 1. Make sure we do not try to sink terminator instructions. 2. Make sure we bail out, if we try to sink an instruction that needs to stay in place for another recurrence. Original message: If the recurrence PHI node has a single user, we can sink any instruction without side effects, given that all users are dominated by the instruction computing the incoming value of the next iteration ('Previous'). We can sink instructions that may cause traps, because that only causes the trap to occur later, but not on any new paths. With the relaxed check, we also have to make sure that we do not have a direct cycle (meaning PHI user == 'Previous), which indicates a reduction relation, which potentially gets missed by ReductionDescriptor. As follow-ups, we can also sink stores, iff they do not alias with other instructions we move them across and we could also support sinking chains of instructions and multiple users of the PHI. Fixes PR43398. Reviewers: hsaito, dcaballe, Ayal, rengolin Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D69228	2019-11-24 21:21:55 +00:00
Florian Hahn	9a432161c6	[LoopInterchange] Adjust assertions when updating successors. Currently the assertion in updateSuccessor is overly strict in some cases and overly relaxed in other cases. For branches to the inner and outer loop preheader it is too strict, because they can either be unconditional branches or conditional branches with duplicate targets. Both cases are fine and we can allow updating multiple successors. On the other hand, we have to at least update one successor. This patch adds such an assertion.	2019-11-24 19:37:16 +00:00
Sanjay Patel	f575f12c64	[InstCombine] remove identity shuffle simplification for mask with undefs And simultaneously enhance SimplifyDemandedVectorElts() to recognize that pattern. That preserves some of the old optimizations in IR. Given a shuffle that includes undef elements in an otherwise identity mask like: define <4 x float> @shuffle(<4 x float> %arg) { %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 undef, i32 1, i32 2, i32 3> ret <4 x float> %shuf } We were simplifying that to the input operand. But as discussed in PR43958: https://bugs.llvm.org/show_bug.cgi?id=43958 ...that means that per-vector-element poison that would be stopped by the shuffle can now leak to the result. Also note that we still have (and there are tests for) the same transform with no undef elements in the mask (a fully-defined identity mask). I don't think there's any controversy about that case - it's a valid transform under any interpretation of shufflevector/undef/poison. Looking at a few of the diffs into codegen, I don't see any difference in final asm. So depending on your perspective, that's good (no real loss of optimization power) or bad (poison exists in the DAG, so we only partially fixed the bug). Differential Revision: https://reviews.llvm.org/D70246	2019-11-24 10:06:26 -05:00
Davide Italiano	c32f0ff92f	[InstCombine] Fix call guard difference with dbg Patch by Chris Ye! Differential Revision: https://reviews.llvm.org/D68004	2019-11-22 13:35:53 -08:00
Tsang Whitney W.H	ae8a8c2db6	[CodeMoverUtils] Added an API to check if an instruction can be safely moved before another instruction. Summary: Added an API to check if an instruction can be safely moved before another instruction. In future PRs, we would like to add support for moving instructions between blocks that are not control flow equivalent, and add other APIs to enhance usability, e.g. moving basic blocks, moving list of instructions... Loop Fusion will be its first user. When there is intervening code in between two loops, fusion is currently unable to fuse them. Loop Fusion can use this utility to check if the intervening code can be safely moved before or after the two loops, move it, and then successfully fuse the loops. Reviewer: kbarton, jdoerfert, Meinersbur, bmahjour, etiotto Reviewed By: bmahjour Subscribers: mgorny, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D70049	2019-11-22 21:29:08 +00:00
Anton Afanasyev	80cd6b6e04	[SLP] Enhance SLPVectorizer to vectorize vector aggregate Summary: Vector aggregate is homogeneous aggregate of vectors like `{ <2 x float>, <2 x float> }`. This patch allows `findBuildAggregate()` to consider vector aggregates as well as scalar ones. For instance, `{ <2 x float>, <2 x float> }` maps to `<4 x float>`. Fixes vector part of llvm.org/PR42022 Reviewers: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70068	2019-11-22 20:01:59 +03:00
Kazu Hirata	a195556628	[JumpThreading] NFC: Don't cache F.hasProfileData() Summary: With this patch, we no longer cache F.hasProfileData(). We simply call the function again. I'm doing this because: - JumpThreadingPass also has a member variable named HasProfileData, which is very confusing, - the function is very lightweight, and - this patch makes JumpThreading::runOnFunction more consistent with JumpThreadingPass::run. Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70602	2019-11-22 08:51:14 -08:00
Kazu Hirata	1a58be2ac5	[JumpThreading] Use profile data even with the new pass manager Summary: Without this patch, the jump threading pass ignores profiling data whenever we invoke the pass with the new pass manager. Specifically, JumpThreadingPass::run calls runImpl with class variable HasProfileData always set to false. In turn, runImpl sets HasProfileData to false again: HasProfileData = HasProfileData_; In the end, we don't use profiling data at all with the new pass manager. This patch fixes the problem by passing F.hasProfileData() to runImpl. The bug appears to have been introduced at: https://reviews.llvm.org/D41461 which removed local variable HasProfileData in JumpThreadingPass::run even though there was one more use left in the same function. As a result, the remaining use ended up referring to the class variable instead. Note that F.hasProfileData is an extremely lightweight function, so I don't see the need to cache its result. Once this patch is approved, I'm planning to stop caching the result of F.hasProfileData in runOnFunction. Reviewers: wmi, eli.friedman Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70509	2019-11-22 08:21:48 -08:00
Pankaj Gode	04945c92ce	[WIP][Attributor] AAReachability Attribute Summary: Working towards Johannes's suggestion for fixme, in Attributor's Noalias attribute deduction. (ii) Check whether the value is captured in the scope using AANoCapture. FIXME: This is conservative though, it is better to look at CFG and // check only uses possibly executed before this call site. A Reachability abstract attribute answers the question "does execution at point A potentially reach point B". If this question is answered with false for all other uses of the value that might be captured, we know it is not yet captured and can continue with the noalias deduction. Currently, information AAReachability provides is completely pessimistic. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: uenoku, sstefan1, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D70233	2019-11-22 18:40:47 +05:30
Pankaj Gode	b9a26a80c8	Test commit.	2019-11-22 14:46:43 +05:30
Alina Sbirlea	fa09dddd70	[LoopInstSimplify] Move MemorySSA verification under flag. The verification inside loop passes should be done under the VerifyMemorySSA flag (enabled by EXPENSIVE_CHECKS or explicitly with opt), in order to not add to compile time during regular builds.	2019-11-21 17:01:24 -08:00
Philip Reames	dfb7a9091a	[LoopPred] Robustly handle partially unswitched loops We may end up with a case where we have a widenable branch above the loop, but not all widenable branches within the loop have been removed. Since a widenable branch inhibits SCEV's ability to reason about exit counts (by design), we have a tradeoff between effectiveness of this optimization and allowing future widening of the branches within the loop. LoopPred is thought to be one of the most important optimizations for range check elimination, so let's pay the cost.	2019-11-21 15:44:36 -08:00
Philip Reames	8293f74345	Further cleanup manipulation of widenable branches [NFC] This is a follow on to `aaea24802b`. In post commit discussion, Artur and I realized we could cleanup the code using Uses; this patch does so.	2019-11-21 15:07:30 -08:00
Vedant Kumar	844d97f650	Clang-trunk Generates Wrong Debug values with -O1 Bit-Tracking Dead Code Elimination (bdce) does not mark a dbg.value as undef after deleting the instruction it depends on, which shows an invalid state of the variable in the debugger. This patch fixes this by marking any dbg.value that depends on a dead instruction as undef. This fixes https://bugs.llvm.org/show_bug.cgi?id=41925 Patch by kamlesh kumar! Differential Revision: https://reviews.llvm.org/D70040	2019-11-21 13:53:10 -08:00
Kazu Hirata	4f5d931c58	[JumpThreading] Refactor ThreadEdge Summary: This patch moves various checks from ThreadEdge to new function TryThreadEdge. The rationale behind this is that I'd like to use ThreadEdge without its checks in my upcoming patch. This patch preserves lightweight checks as assertions in ThreadEdge. ThreadEdge does not repeat the cost check, however. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70338	2019-11-21 12:38:22 -08:00
Tom Stellard	ab411801b8	[cmake] Explicitly mark libraries defined in lib/ as "Component Libraries" Summary: Most libraries are defined in the lib/ directory but there are also a few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining "Component Libraries" as libraries defined in lib/ that may be included in libLLVM.so. Explicitly marking the libraries in lib/ as component libraries allows us to remove some fragile checks that attempt to differentiate between lib/ libraries and tools/ libraries: 1. In tools/llvm-shlib, because llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of all libraries defined in the whole project, there was custom code needed to filter out libraries defined in tools/, none of which should be included in libLLVM.so. This code assumed that any library defined as static was from lib/ and everything else should be excluded. With this change, llvm_map_components_to_libnames(LIB_NAMES, "all") only returns libraries that have been added to the LLVM_COMPONENT_LIBS global cmake property, so this custom filtering logic can be removed. Doing this also fixes the build with BUILD_SHARED_LIBS=ON and LLVM_BUILD_LLVM_DYLIB=ON. 2. There was some code in llvm_add_library that assumed that libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or ARG_LINK_COMPONENTS set. This is only true because libraries defined in lib/ use LLVMBuild.txt and don't set these values. This code has been fixed now to check if the library has been explicitly marked as a component library, which should now make it easier to remove LLVMBuild at some point in the future. 
I have tested this patch on Windows, MacOS and Linux with release builds and the following combinations of CMake options: - "" (No options) - -DLLVM_BUILD_LLVM_DYLIB=ON - -DLLVM_LINK_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON Reviewers: beanz, smeenai, compnerd, phosek Reviewed By: beanz Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70179	2019-11-21 10:48:08 -08:00
Philip Reames	aaea24802b	Broaden the definition of a "widenable branch" As a reminder, a "widenable branch" is the pattern "br i1 (and i1 X, WC()), label %taken, label %untaken" where "WC" is the widenable condition intrinsic. The semantics of such a branch (derived from the semantics of WC) is that a new condition can be added into the condition arbitrarily without violating legality. Broaden the definition in two ways: (1) Allow swapped operands to the br (and X, WC()) form. (2) Allow a widenable branch w/trivial condition (i.e. true), which takes the form br i1 WC(). The former is just general robustness (e.g. for X = non-instruction this is what instcombine produces). The latter is specifically important as partial unswitching of a widenable range check produces exactly this form above the loop. Differential Revision: https://reviews.llvm.org/D70502	2019-11-21 10:46:16 -08:00
Sanjay Patel	4ae0a13256	[InstCombine] add assert in SimplifyDemandedVectorElts and improve readability; NFC	2019-11-21 11:16:36 -05:00
Sjoerd Meijer	901cd3b3f6	[LV] PreferPredicateOverEpilog respecting option Follow-up of cb47b8783: don't query TTI->preferPredicateOverEpilogue when option -prefer-predicate-over-epilog is set to false, i.e. when we prefer not to predicate the loop. Differential Revision: https://reviews.llvm.org/D70382	2019-11-21 14:06:10 +00:00
Ilya Biryukov	aa981c1802	Reland 9f3fdb0d7fab: [Driver] Use VFS to check if sanitizer blacklists exist With updates to various LLVM tools that use SpecialCaseList. It was tempting to use RealFileSystem as the default, but that makes it too easy to accidentally forget to pass the VFS in clang code.	2019-11-21 11:56:09 +01:00
Ilya Biryukov	9f3fdb0d7f	Revert "[Driver] Use VFS to check if sanitizer blacklists exist" This reverts commit `ba6f906854`. Commit caused compilation errors on llvm tests. Will fix and re-land.	2019-11-21 11:31:14 +01:00
Ilya Biryukov	ba6f906854	[Driver] Use VFS to check if sanitizer blacklists exist Summary: This is a follow-up to `590f279c45`, which moved some of the callers to use VFS. It turned out more code in Driver calls into real filesystem APIs and also needs an update. Reviewers: gribozavr2, kadircet Reviewed By: kadircet Subscribers: ormris, mgorny, hiraditya, llvm-commits, jkorous, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70440	2019-11-21 11:00:30 +01:00
David Stenberg	3889ff82bf	[DebugInfo] Refactor DIExpression [SZ]Ext creation into function [NFC] Summary: Also, replace the SmallVector with a normal C array. Reviewers: vsk Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70498	2019-11-21 10:44:04 +01:00
James Y Knight	e47d6da8a5	D'oh. Fix assert after `a84922916e`. (Which was attempting to fix unused variable warning in NDEBUG mode after `8ba56f322a`)	2019-11-20 22:22:51 -05:00
James Y Knight	a84922916e	Fix unused variable warning in NDEBUG mode after `8ba56f322a`	2019-11-20 22:05:05 -05:00
Alina Sbirlea	5c5cf899ef	[MemorySSA] Moving at the end often means before terminator. Moving accesses in MemorySSA at InsertionPlace::End, when an instruction is moved into a block, almost always means insert at the end of the block, but before the block terminator. This matters when the block terminator is a MemoryAccess itself (an invoke), and the insertion must be done before the terminator for the update to be correct. Insert an additional position: InsertionPlace::BeforeTerminator and update current usages where this applies. Resolves PR44027.	2019-11-20 17:11:00 -08:00
Alina Sbirlea	da4baa2a6c	[MemorySSA] Update analysis when the terminator is a memory instruction. Update MemorySSA when moving the terminator instruction, as that may be a memory touching instruction. Resolves PR44029.	2019-11-20 16:36:52 -08:00
Eric Christopher	714aabacfb	Temporarily Revert "[SLP] allow forming 2-way reduction patterns" and update testcases. After speaking with Sanjay - seeing a number of miscompiles and working on tracking down a testcase. None of the follow on patches seem to have helped so far. This reverts commit `8a0aa5310b`.	2019-11-20 16:00:53 -08:00
Eric Christopher	8a0aa5310b	Temporarily Revert "Temporarily Revert "[SLP] allow forming 2-way reduction patterns"" as there were testcase changes after that need to also be reverted. This reverts commit `cd8748a15f`.	2019-11-20 15:39:47 -08:00
Eric Christopher	cd8748a15f	Temporarily Revert "[SLP] allow forming 2-way reduction patterns" After speaking with Sanjay - seeing a number of miscompiles and working on tracking down a testcase. None of the follow on patches seem to have helped so far. This reverts commit `7ff57705ba`.	2019-11-20 15:19:31 -08:00
Philip Reames	8ba56f322a	Move widenable branch formation into makeGuardControlFlowExplicit helper This is mostly NFC, but I removed the setting of the guard's calling convention onto the WC call. Why? Because it was untested, and was producing an ill-defined output as the declaration's convention wasn't being changed, leaving a mismatch which is UB.	2019-11-20 12:54:05 -08:00
Philip Reames	28a91473e3	[GuardWidening] Remove WidenFrequentBranches transform This code has never been enabled. While it is tested, it's complicating some refactoring. If we decide to re-implement this, doing it in SimplifyCFG would probably make more sense anyways.	2019-11-19 15:15:52 -08:00
Philip Reames	70c68a6b0e	[NFC] Factor out utilities for manipulating widenable branches With the widenable condition construct, we have the ability to reason about branches which can be 'widened' (i.e. made to fail more often). We've got a couple of transforms which leverage this. This patch just cleans up the API a bit. This is prep work for generalizing our definition of a widenable branch slightly. At the moment "br i1 (and A, wc()), ..." is considered widenable, but oddly, neither "br i1 (and wc(), B), ..." nor "br i1 wc(), ..." is. That clearly needs to be addressed, so first, let's centralize the code in one place.	2019-11-19 14:43:13 -08:00
Philip Reames	f3eb5dee57	[LoopPred] Generalize profitability check to handle unswitch output Unswitch (and other loop transforms) like to generate loop exit blocks with unconditional successors, and phi nodes (LCSSA, or simple multiple exiting blocks sharing an exit). Generalize the "likely very rare exit" check slightly to handle this form.	2019-11-19 14:06:36 -08:00
Duncan P. N. Exon Smith	3279724905	llvm/ObjCARC: Eliminate inlined AutoreleaseRV calls Pair up inlined AutoreleaseRV calls with their matching RetainRV or ClaimRV. - RetainRV cancels out AutoreleaseRV. Delete both instructions. - ClaimRV is a peephole for RetainRV+Release. Delete AutoreleaseRV and replace ClaimRV with Release. This avoids problems where more aggressive inlining triggers memory regressions. This patch is happy to skip over non-callable instructions and non-ARC intrinsics looking for the pair. It is likely sound to also skip over opaque function calls, but that's harder to reason about, and it's not relevant to the goal here: if there's an opaque function call splitting up a pair, it's very unlikely that a handshake would have happened dynamically without inlining. Note that this patch also subsumes the previous logic that looked backwards from ReleaseRV. https://reviews.llvm.org/D70370 rdar://problem/46509586	2019-11-19 12:02:01 -08:00
Sanjay Patel	0a8e7ca402	[SLP] fix miscompile on min/max reductions with extra uses (PR43948) (2nd try) The 1st attempt was reverted because it revealed an existing bug where we could produce invalid IR (use of value before definition). That should be fixed with: rG39de82ecc9c2 The bug manifests as replacing a reduction operand with an undef value. The problem appears to be limited to cases where a min/max reduction has extra uses of the compare operand to the select. In the general case, we are tracking "ExternallyUsedValues" and an "IgnoreList" of the reduction operations, but those may not apply to the final compare+select in a min/max reduction. For that, we use replaceAllUsesWith (RAUW) to ensure that the new vectorized reduction values are transferred to all subsequent users. Differential Revision: https://reviews.llvm.org/D70148	2019-11-19 14:57:35 -05:00
Sanjay Patel	39de82ecc9	[SLP] fix insertion point for min/max reduction As discussed in D70148 (and caused a revert of the original commit): if we insert at the select, then we can produce invalid IR because the replacement for the compare may have uses before the select.	2019-11-19 10:50:10 -05:00
evgeny	4ef9315c4b	[ThinLTO] Make ValueInfo::operator bool() explicit Differential revision: https://reviews.llvm.org/D70383	2019-11-19 12:46:09 +03:00
Evgeniy Brevnov	4a64d710ae	[NFC] Test commit. Please ignore. As a test commit I fixed a misspelling in one of the comments in the SLP vectorizer.	2019-11-19 15:41:57 +07:00
Eric Christopher	6f1cc4151a	Temporarily revert "[SLP] fix miscompile on min/max reductions with extra uses (PR43948)" as it causes an ICE on valid. A testcase was followed up on the original thread. This reverts commit `a3e61946c5`.	2019-11-18 14:41:37 -08:00
Teresa Johnson	cc1b0bc24d	[ThinLTO] Avoid extra index lookup during promotion Summary: Pass down the already accessed ValueInfo to shouldPromoteLocalToGlobal, to avoid an unnecessary extra index lookup. Add some assertion checking to confirm we have a non-empty VI when expected. Also some misc cleanup, merging the two versions of doImportAsDefinition, since one was only called by the other, and unnecessarily passed in a member variable. Reviewers: steven_wu, pcc, evgeny777 Reviewed By: evgeny777 Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70337	2019-11-18 12:55:53 -08:00
Teresa Johnson	3be6dbca3b	[ThinLTO] Promotion handling cleanup (NFC) Summary: Clean up the code that does GV promotion in the ThinLTO backends. Specifically, we don't need to check whether we are importing since that is already checked and handled correctly in shouldPromoteLocalToGlobal. Simply call shouldPromoteLocalToGlobal, and if it returns true we are guaranteed that we are promoting, whether or not we are importing (or in the exporting module). This also makes the handling in getName() consistent with that in getLinkage(), which checks the DoPromote parameter regardless of whether we are importing or exporting. Reviewers: steven_wu, pcc, evgeny777 Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70327	2019-11-18 11:59:36 -08:00
Philip Reames	ad5a84c883	[LoopPred/WC] Use a dominating widenable condition to remove analyzeable loop exits This implements a version of the predicateLoopExits transform from IndVarSimplify extended to exploit widenable conditions - and thus be much wider in scope of legality. The code structure ends up being almost entirely different, so I chose to duplicate this into the LoopPredication pass instead of trying to reuse the code in the IndVars. The core notions of the transform are as follows: If we have a widenable condition which controls entry into the loop, we're allowed to widen it arbitrarily. Given that, it's simply a profitability question as to what conditions to fold into the widenable branch. To avoid pass ordering issues, we want to avoid widening cases that would otherwise be dischargeable. Or... widen in a form which can still be discharged. Thus, we phrase the transform as selecting one analyzeable exit from the set of analyzeable exits to keep. This avoids creating pass ordering complexities. Since none of the above proves that we actually exit through our analyzeable exits - we might exit through something else entirely - we limit ourselves to cases where a) the latch is analyzeable and b) the latch is predicted taken, and c) the exit being removed is statically cold. Differential Revision: https://reviews.llvm.org/D69830	2019-11-18 11:23:29 -08:00
Simon Tatham	f4f77aa53e	[ARM,MVE] Add InstCombine rules for pred_i2v / pred_v2i. If you're writing C code using the ACLE MVE intrinsics that passes the result of a vcmp as input to a predicated intrinsic, e.g. mve_pred16_t pred = vcmpeqq(v1, v2); v_out = vaddq_m(v_inactive, v3, v4, pred); then clang's codegen for the compare intrinsic will create calls to `@llvm.arm.mve.pred.v2i` to convert the output of `icmp` into an `mve_pred16_t` integer representation, and then the next intrinsic will call `@llvm.arm.mve.pred.i2v` to convert it straight back again. This will be visible in the generated code as a `vmrs`/`vmsr` pair that move the predicate value pointlessly out of `p0` and back into it again. To prevent that, I've added InstCombine rules to remove round trips of the form `v2i(i2v(x))` and `i2v(v2i(x))`. Also I've taught InstCombine about the known and demanded bits of those intrinsics. As a result, you now get just the generated code you wanted: vpt.u16 eq, q1, q2 vaddt.u16 q0, q3, q4 Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70313	2019-11-18 10:39:30 +00:00
Duncan P. N. Exon Smith	783cb86b61	llvm/ObjCARC: Split OptimizeIndividualCallImpl out of OptimizeIndividualCalls, NFC Split out a helper function for the individual call optimizations and skip useless calls to it (where the instruction is not an ARC intrinsic). Besides reducing indentation (and possibly speeding up compile time in some small way), an upcoming patch will add additional calls and expand out the `switch`.	2019-11-17 21:54:27 -08:00
Duncan P. N. Exon Smith	a937a588dd	llvm/ObjCARC: Use continue to reduce some nesting, NFC	2019-11-17 18:22:35 -08:00
Sanjay Patel	5d67d81f48	[InstCombine] prevent crashing/assert on shift constant expression (PR44028) The binary operator cast implies an instruction, but the matcher for shift does not: https://bugs.llvm.org/show_bug.cgi?id=44028	2019-11-17 17:31:09 -05:00
Stefan Stipanovic	a516fbac52	[Attributor] Use nofree argument attribute for heap-to-stack conversion Reviewers: jdoerfert, uenoku Subscribers: Differential Revision: https://reviews.llvm.org/D70140	2019-11-17 21:35:04 +01:00
Sanjay Patel	ebf9bf2cbc	[SimplifyCFG] propagate fast-math-flags (FMF) from phi to select Similar to/extension of D70208 (rGee0882bdf866), but this one may finally allow closing motivating bugs. This is another step towards having FMF apply only to FP values rather than those + fcmp. See PR38086 for one of the original discussions/motivations: https://bugs.llvm.org/show_bug.cgi?id=38086 And the test here is derived from PR39535: https://bugs.llvm.org/show_bug.cgi?id=39535 Currently, we lose FMF when converting any phi to select in SimplifyCFG. There are a small number of similar changes needed to correct within SimplifyCFG, so it should be quick to patch this pass up. FMF was extended to select and phi with: D61917 D67564	2019-11-17 11:23:44 -05:00
David Green	08390c52a2	[InstCombine] Canonicalize ssub.with.overflow with clamp to ssub.sat Working on top of D69252, this adds canonicalisation patterns for ssub.with.overflow to ssub.sats. Differential Revision: https://reviews.llvm.org/D69753	2019-11-17 10:45:11 +00:00
David Green	03fce6b12e	[InstCombine] Canonicalize sadd.with.overflow with clamp to sadd.sat This adds to D69245, adding extra signed patterns for folding from a sadd_with_overflow to a sadd_sat. These are more complex than the unsigned patterns, as the overflow can occur in either direction. For the add case, the positive overflow can only occur if both of the values are positive (same for both the values being negative). So there is an extra select on whether to use the positive or negative overflow limit. Differential Revision: https://reviews.llvm.org/D69252	2019-11-17 10:42:39 +00:00
Reid Kleckner	631be5c0d4	Remove Support/Options.h, it is unused It was added in 2014 in `732e0aa9fb` with one use in Scalarizer.cpp. That one use was then removed when porting to the new pass manager in 2018 in `b6f76002d9`. While the RFC and the desire to get off of static initializers for cl::opt all still stand, this code is now dead, and I think we should delete this code until someone is ready to do the migration. There were many clients of CommandLine.h that were including it transitively through LLVMContext.h, so I cleaned that up in `4c1a1d3cf9`. Reviewers: beanz Differential Revision: https://reviews.llvm.org/D70280	2019-11-15 13:32:52 -08:00
Sanjay Patel	ee0882bdf8	[SimplifyCFG] propagate fast-math-flags (FMF) from phi to select This is another step towards having FMF apply only to FP values rather than those + fcmp. See PR38086 for one of the original discussions/motivations: https://bugs.llvm.org/show_bug.cgi?id=38086 And the test here is derived from PR39535: https://bugs.llvm.org/show_bug.cgi?id=39535 Currently, we lose FMF when converting any phi to select in SimplifyCFG. There are a small number of similar changes needed to correct within SimplifyCFG, so it should be quick to patch this pass up. FMF was extended to select and phi with: D61917 D67564 Differential Revision: https://reviews.llvm.org/D70208	2019-11-15 16:14:35 -05:00
Richard Smith	7889d8e7eb	Revert "[LoadStoreVectorize] Use '||' instead of '|' between sides with function calls. NFCI." This broke two tests. Presumably the non-short-circuiting '|' was intentional here. This reverts commit `f7efea0ded`.	2019-11-15 12:49:35 -08:00
Alexandre Ganea	478ad94c8e	[GCOV] Skip artificial functions from being emitted This is a patch to support D66328, which was reverted until this lands. Enable a compiler-rt test that used to fail previously with D66328. Differential Revision: https://reviews.llvm.org/D67283	2019-11-15 14:23:11 -05:00
Francesco Petrogalli	d6de5f12d4	[SVFS] Inject TLI Mappings in VFABI attribute. This patch introduces a function pass to inject the scalar-to-vector mappings stored in the TargetLibraryInfo (TLI) into the Vector Function ABI (VFABI) variants attribute. The test is testing the injection for three vector libraries supported by the TLI (Accelerate, SVML, MASSV). The pass does not change any of the analysis associated with the function. Differential Revision: https://reviews.llvm.org/D70107	2019-11-15 18:42:56 +00:00
Fangrui Song	8bcd01f48a	[ThinLTO] Fix -Wunused-function in NDEBUG builds after llvmorg-10-init-9933-g3d708bf5c26	2019-11-15 10:00:23 -08:00
Dávid Bolvanský	f7efea0ded	[LoadStoreVectorize] Use '||' instead of '|' between sides with function calls. NFCI. Fixes warning from PVS Studio	2019-11-15 18:51:13 +01:00
evgeny	3d708bf5c2	Recommit "[ThinLTO] Add correctness check for RO/WO variable import" ValueInfo has user-defined 'operator bool' which allows incorrect implicit conversion to GlobalValue::GUID (which is unsigned long). This causes bugs which are hard to track and should be removed in future.	2019-11-15 16:13:19 +03:00
Mikael Holmen	1587c7e86f	[Scalarizer] Treat values from unreachable blocks as undef Summary: When scalarizing PHI nodes we might try to examine/rewrite InsertElement nodes in predecessors. If those predecessors are unreachable from entry, then the IR in those blocks could have unexpected properties resulting in infinite loops in Scatterer::operator[]. By simply treating values originating from instructions in unreachable blocks as undef we do not need to analyse them further. This fixes PR41723. Reviewers: bjope Reviewed By: bjope Subscribers: bjope, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70171	2019-11-15 11:13:37 +01:00
Francis Visoiu Mistrih	a4c76be506	[InstCombine] Don't use getFirstNonPHI in FoldIntegerTypedPHI getFirstNonPHI iterates over all the instructions in a block until it finds a non-PHI. Then, the loop starts from the beginning of the block and goes through all the instructions until it reaches the instruction found by getFirstNonPHI. Instead of doing that, just stop when a non-PHI is found. This reduces the compile-time of a test case discussed in https://reviews.llvm.org/D47023 by 13x. Not entirely sure how to come up with a test case for this since it's a compile time issue that would significantly slow down running the tests. Differential Revision: https://reviews.llvm.org/D70016	2019-11-14 17:52:01 -08:00
Reid Kleckner	4c1a1d3cf9	Add missing includes needed to prune LLVMContext.h include, NFC These are a pre-requisite to removing #include "llvm/Support/Options.h" from LLVMContext.h: https://reviews.llvm.org/D70280	2019-11-14 15:23:15 -08:00
Alexey Bataev	bfa32573bf	Revert "Temporarily Revert:" This reverts commit e511c4b0dff1692c267addf17dce3cebe8f97faa: Temporarily Revert: "[SLP] Generalization of stores vectorization." "[SLP] Fix -Wunused-variable. NFC" "[SLP] Vectorize jumbled stores." after fixing the problem with compile time.	2019-11-14 16:38:20 -05:00
Sanjay Patel	385572ccfe	[InstCombine] remove duplicate code for simplifying a shuffle; NFCI The transform is already handled by InstSimplify or earlier in InstCombine, so trying to do it again is not necessary.	2019-11-14 13:12:25 -05:00
Benjamin Kramer	360f661733	Revert "[ThinLTO] Add correctness check for RO/WO variable import" This reverts commit `a2292cc537`. Breaks clang selfhost w/ThinLTO.	2019-11-14 16:07:13 +01:00
Simon Pilgrim	edfc94e296	GCOVProfiling - fix uninitialized variable warnings + make getFuncChecksum() const. NFCI.	2019-11-14 14:21:18 +00:00
Simon Pilgrim	39c0829a55	WholeProgramDevirt - fix uninitialized variable warnings. NFCI.	2019-11-14 14:21:18 +00:00
Simon Pilgrim	8c09e472d5	Fix uninitialized variable warning. NFCI.	2019-11-14 14:21:17 +00:00
Simon Pilgrim	ba229113a9	SROA - fix uninitialized variable warnings. NFCI.	2019-11-14 14:21:17 +00:00
Sjoerd Meijer	cb47b87830	[LV] PreferPredicateOverEpilog respecting predicate loop hint The vectoriser queries TTI->preferPredicateOverEpilogue to determine if tail-folding is preferred for a loop, but it was not respecting loop hint 'predicate' that can disable this, which has now been added. This showed that we were incorrectly initialising loop hint 'vectorize.predicate.enable' with 0 (i.e. FK_Disabled) but this should have been FK_Undefined, which has been fixed. Differential Revision: https://reviews.llvm.org/D70125	2019-11-14 13:10:44 +00:00
Daniil Suchkov	4c9d0da838	Revert "[InstCombine] Fold PHIs with equal incoming pointers" This reverts commit `a2f6ae9abf`. It is reverted due to clang-cmake-armv7-selfhost buildbot failure.	2019-11-14 17:42:01 +07:00
Daniil Suchkov	a2f6ae9abf	[InstCombine] Fold PHIs with equal incoming pointers This is a resubmission of `bbb29738b5` that was reverted due to clang tests failures. It includes the fix and additional IR tests for the missed case. Summary: In case when all incoming values of a PHI are equal pointers, this transformation inserts a definition of such a pointer right after definition of the base pointer and replaces with this value both PHI and all its incoming pointers. Primary goal of this transformation is canonicalization of this pattern in order to enable optimizations that can't handle PHIs. Non-inbounds pointers aren't currently supported. Reviewers: spatel, RKSimon, lebedev.ri, apilipenko Reviewed By: apilipenko Tags: #llvm Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D68128	2019-11-14 17:04:32 +07:00
evgeny	a2292cc537	[ThinLTO] Add correctness check for RO/WO variable import This patch adds an assertion check for exported read/write-only variables to be also in import list for module. If they aren't we may face linker errors, because read/write-only variables are internalized in their source modules. The patch also changes export lists to store ValueInfo instead of GUID for performance considerations. Differential revision: https://reviews.llvm.org/D70128	2019-11-14 12:24:05 +03:00
Dimitry Andric	3db6783d8a	Check result of emitStrLen before passing it to CreateGEP Summary: This fixes PR43081, where the transformation of `strchr(p, 0) -> p + strlen(p)` can cause a segfault, if `-fno-builtin-strlen` is used. In that case, `emitStrLen` returns nullptr, which CreateGEP is not designed to handle. Also add the minimized code from the PR as a test case. Reviewers: xbolva00, spatel, jdoerfert, efriedma Reviewed By: efriedma Subscribers: lebedev.ri, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70143	2019-11-14 08:04:36 +01:00
Reid Kleckner	05da2fe521	Sink all InitializePasses.h includes This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it causes lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211	2019-11-13 16:34:37 -08:00
Hiroshi Yamauchi	3f0969daf9	[PGO][PGSO] Temporarily disable the large working set size behavior. Summary: This temporarily disables the large working set size behavior in profile guided size optimization due to internal benchmark regressions. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70207	2019-11-13 14:00:47 -08:00
Sanjay Patel	a3e61946c5	[SLP] fix miscompile on min/max reductions with extra uses (PR43948) The bug manifests as replacing a reduction operand with an undef value. The problem appears to be limited to cases where a min/max reduction has extra uses of the compare operand to the select. In the general case, we are tracking "ExternallyUsedValues" and an "IgnoreList" of the reduction operations, but those may not apply to the final compare+select in a min/max reduction. For that, we use replaceAllUsesWith (RAUW) to ensure that the new vectorized reduction values are transferred to all subsequent users. Differential Revision: https://reviews.llvm.org/D70148	2019-11-13 15:57:35 -05:00
Sanjay Patel	e9bf7a60a0	[SLP] reduce code duplication for min/max vs. other reductions; NFCI	2019-11-13 11:26:08 -05:00
Sanjay Patel	3d6b53980c	[InstCombine] propagate fast-math-flags (FMF) to select when inverting fcmp+select As noted by the FIXME comment, this is not correct based on our current FMF semantics. We should be propagating FMF from the final value in a sequence (in this case the 'select'). So the behavior even without this patch is wrong, but we did not allow FMF on 'select' until recently. But if we do the correct thing right now in this patch, we'll inevitably introduce regressions because we have not wired up FMF propagation for 'phi' and 'select' in other passes (like SimplifyCFG) or other places in InstCombine. I'm not seeing a better incremental way to make progress. That said, the potential extra damage over the existing wrong behavior from this patch is very limited. AFAIK, the only way to have different FMF on IR in the same function is if we have LTO inlined IR from 2 modules that were compiled using different fast-math settings. As seen in the tests, we may actually see some improvements with this patch because adding the FMF to the 'select' allows matching to min/max intrinsics that were previously missed (in the common case, the 'fcmp' and 'select' should have identical FMF to begin with). Next steps in the transition: Make similar changes in instcombine as needed. Enable phi-to-select FMF propagation in SimplifyCFG. Remove dependencies on fcmp with FMF. Deprecate FMF on fcmp. Differential Revision: https://reviews.llvm.org/D69720	2019-11-13 10:38:42 -05:00
Simon Pilgrim	d1bd5e476b	SLPVectorizer - make comparison operators + isInSchedulingRegion const Fixes cppcheck warnings.	2019-11-13 14:40:19 +00:00
Florian Hahn	f7499011ca	[InstCombine] Avoid moving ops that do restrict undef across shuffles. I think we have to be a bit more careful when it comes to moving ops across shuffles, if the op does restrict undef. For example, without this patch, we would move 'and %v, <0, 0, -1, -1>' over a 'shufflevector %a, undef, <undef, undef, 1, 2>'. As a result, the first 2 lanes of the result are undef after the combine, but they really should be 0, unless I am missing something. For ops that do fold to undef on undef operands, the current behavior should be fine. I've added a conservative check, OpDoesRestrictUndef; maybe there's a better existing utility? Reviewers: spatel, RKSimon, lebedev.ri Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D70093	2019-11-13 13:40:34 +00:00
Daniil Suchkov	cba4a27745	Temporarily revert "[InstCombine] Fold PHIs with equal incoming pointers" Revert due to sanitizer-windows buildbot failure. This reverts commit `bbb29738b5`.	2019-11-13 17:14:11 +07:00
Daniil Suchkov	bbb29738b5	[InstCombine] Fold PHIs with equal incoming pointers In case when all incoming values of a PHI are equal pointers, this transformation inserts a definition of such a pointer right after definition of the base pointer and replaces with this value both PHI and all its incoming pointers. Primary goal of this transformation is canonicalization of this pattern in order to enable optimizations that can't handle PHIs. Non-inbounds pointers aren't currently supported. Reviewers: spatel, RKSimon, lebedev.ri, apilipenko Reviewed By: apilipenko Tags: #llvm Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D68128	2019-11-13 17:00:34 +07:00
Alina Sbirlea	4ae74cc99f	[GVNHoist] Preserve AAResults. Resolves PR38906, PR40898.	2019-11-12 14:10:04 -08:00
Tom Weaver	41c3f76dcd	[DBG][OPT] Attempt to salvage or undef debug info when removing trivially deletable instructions in the Reassociate Expression pass. Reviewed By: aprantl, vsk Differential revision: https://reviews.llvm.org/D69943	2019-11-12 15:17:04 +00:00
Diana Picus	7f1dcc8952	[InstCombine] Skip scalable vectors in combineLoadToOperationType Don't try to canonicalize loads to scalable vector types to loads of integers. This removes one assertion when trying to use a TypeSize as a parameter to DataLayout::isLegalInteger. It does not handle the second part of the function (which looks at bitcasts). This patch also contains an NFC fix for Load Analysis, where a variable initialization that would cause the same assertion is moved closer to its use. This allows us to run the new test for InstCombine without having to teach LocationSize to play nicely with scalable vectors. Differential Revision: https://reviews.llvm.org/D70075	2019-11-12 12:27:09 +01:00
Florian Hahn	1ee93240c0	[LoopInterchange] Only skip PHIs with incoming values from the inner loop. Currently we have limited support for outer loops with multiple basic blocks after the inner loop exit. But the current checks for creating PHIs for loop exit values only assumes the header and latches of the outer loop. It is better to just skip incoming values defined in the original inner loops. Those are handled earlier. Reviewers: efriedma, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D70059	2019-11-12 10:30:51 +00:00
Hideto Ueno	88b04ef832	[Attributor] Use must-be-executed-context in align deduction Summary: This patch introduces align attribute deduction for callsite argument, function argument, function returned and floating value based on must-be-executed-context. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69797	2019-11-12 06:41:19 +00:00
Vasileios Porpodas	6a18a95487	[SLP] Look-ahead operand reordering heuristic. Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for examples). Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk Reviewed By: RKSimon, dtemirbulatov Subscribers: xbolva00, Carrot, hiraditya, phosek, rnk, rcorcs, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60897	2019-11-11 21:06:51 -08:00
Francesco Petrogalli	e9a06e0606	[VFABI] Read/Write functions for the VFABI attribute. The attribute is stored at the `FunctionIndex` attribute set, with the name "vector-function-abi-variant". The get/set methods of the attribute have assertions to verify that: 1. Each name in the attribute is a valid VFABI mangled name. 2. Each name in the attribute corresponds to a function declared in the module. Differential Revision: https://reviews.llvm.org/D69976	2019-11-12 03:40:42 +00:00
aqjune	4187cb138b	Add InstCombine/InstructionSimplify support for Freeze Instruction Summary: - Add llvm::SimplifyFreezeInst - Add InstCombiner::visitFreeze - Add llvm tests Reviewers: majnemer, sanjoy, reames, lebedev.ri, spatel Reviewed By: reames, lebedev.ri Subscribers: reames, lebedev.ri, filcab, regehr, trentxintong, llvm-commits Differential Revision: https://reviews.llvm.org/D29013	2019-11-12 12:13:26 +09:00
Sanjay Patel	29f5d1670c	Revert "[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (3rd try)" This reverts commit `3db8a3ef86`. This caused a different memory-sanitizer failure than earlier attempts, but it's still not right.	2019-11-11 09:56:03 -05:00
Sanjay Patel	3db8a3ef86	[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (3rd try) Re-try because earlier attempts were reverted due to use-after-free. Hopefully, diagnosed correctly this time - we replace/remove the invariant.start first rather than the invariant.end to avoid angering worklist-based iteration. We gather a set of white-listed instructions in isAllocSiteRemovable() and then replace/erase them. But we don't know in general if the instructions in the set have uses amongst themselves, so order of deletion makes a difference. There's already a special-case for the llvm.objectsize intrinsic, so add another for llvm.invariant.start. Should fix: https://bugs.llvm.org/show_bug.cgi?id=43723 Differential Revision: https://reviews.llvm.org/D69977	2019-11-11 09:29:40 -05:00
Tom Weaver	9f48a160dd	Revert "[DBG][OPT] Attempt to salvage or undef debug info when removing trivially deletable instructions in the Reassociate Expression pass." This reverts commit `1984a27db5`.	2019-11-11 14:13:33 +00:00
Tom Weaver	1984a27db5	[DBG][OPT] Attempt to salvage or undef debug info when removing trivially deletable instructions in the Reassociate Expression pass. Reviewed By: aprantl, vsk Differential revision: https://reviews.llvm.org/D69943	2019-11-11 13:47:13 +00:00
Tom Weaver	75af15d81e	[NFC][TEST_COMMIT] Add fullstop to comment.	2019-11-11 13:38:34 +00:00
Jay Foad	9323ef4ecc	[InstCombine] Simplify binary op when only one operand is a select Summary: SimplifySelectsFeedingBinaryOp simplified binary ops when both operands were selects with the same condition. This patch extends it to handle these cases where only one operand is a select: X op (C ? P : Q) -> C ? (X op P) : (X op Q) // if X op P and X op Q both simplify (C ? P : Q) op Y -> C ? (P op Y) : (Q op Y) // if P op Y and Q op Y both simplify For example: X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0 Reviewers: mcberg2017, majnemer, craig.topper, qcolombet, mcrosier Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64713	2019-11-11 10:01:59 +00:00
Craig Topper	aafde063aa	[InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double. The _m64 type is represented in IR as <1 x i64>. The x86-64 ABI on Linux passes <1 x i64> as a double. MMX intrinsics use x86_mmx type in IR. These things result in a lot of bitcasts in mmx code. There's another instcombine that tries to turn bitcast <1 x i64> to double into extractelement and a bitcast. The combine here tries to reverse this extractelement conversion if we see an mmx type.	2019-11-10 16:25:25 -08:00
Sanjay Patel	d115b9fd4a	Revert "[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (2nd try)" This reverts commit `56b2aee187`. Still causes a use-after-free on sanitizer bots.	2019-11-10 18:47:49 -05:00
Sanjay Patel	56b2aee187	[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (2nd try) Re-try rGef02831f0a4e (reverted due to use-after-free), but bail out completely if we encounter an unexpected llvm.invariant.start. We gather a set of white-listed instructions in isAllocSiteRemovable() and then replace/erase them. But we don't know in general if the instructions in the set have uses amongst themselves, so order of deletion makes a difference. There's already a special-case for the llvm.objectsize intrinsic, so add another for llvm.invariant.end. Should fix: https://bugs.llvm.org/show_bug.cgi?id=43723 Differential Revision: https://reviews.llvm.org/D69977	2019-11-10 17:26:36 -05:00
Stefan Stipanovic	c250ebf7bc	getArgOperandNo helper function. Summary: A helper function to get the argument number of an arg operand Use. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66844	2019-11-10 21:45:11 +01:00
Sanjay Patel	b0ac26a632	Revert "[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723)" This reverts commit `ef02831f0a`. Sanitizer bots fail with this change.	2019-11-10 11:18:05 -05:00
Sanjay Patel	ef02831f0a	[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) We gather a set of white-listed instructions in isAllocSiteRemovable() and then replace/erase them. But we don't know in general if the instructions in the set have uses amongst themselves, so order of deletion makes a difference. There's already a special-case for the llvm.objectsize intrinsic, so add another for llvm.invariant.end. Should fix: https://bugs.llvm.org/show_bug.cgi?id=43723 Differential Revision: https://reviews.llvm.org/D69977	2019-11-10 09:18:11 -05:00
Simon Pilgrim	4ff246fef2	Remove unused variable (which allows us to remove vector include). NFC.	2019-11-10 12:16:23 +00:00
Gil Rapaport	7f152543e4	[LV] Apply sink-after & interleave-groups as VPlan transformations (NFCI) This recommits `11ed1c0239` (reverted in `9f08ce0d21` for failing an assert) with a fix: tryToWidenMemory() now first checks if the widening decision is to interleave, thus maintaining previous behavior where tryToInterleaveMemory() was called first, giving priority to interleave decisions over widening/scalarization. This commit adds the test case that exposed this bug as a LIT.	2019-11-09 20:52:25 +02:00
Jay Foad	d162e02cee	Refactor SimplifySelectsFeedingBinaryOp for D64713. NFC.	2019-11-09 09:28:22 +00:00
Teresa Johnson	b11391bb47	ThinLTO : Import always_inline functions irrespective of the threshold Summary: A user can force a function to be inlined by specifying the always_inline attribute. Currently, thinlto implementation is not aware of always_inline functions and does not guarantee import of such functions, which in turn can prevent inlining of such functions. Patch by Bharathi Seshadri <bseshadr@cisco.com> Reviewers: tejohnson Reviewed By: tejohnson Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70014	2019-11-08 17:02:01 -08:00
Gil Rapaport	9f08ce0d21	Revert "[LV] Apply sink-after & interleave-groups as VPlan transformations (NFCI)" This reverts commit `11ed1c0239` - causes an assert failure.	2019-11-08 22:17:11 +02:00
evgeny	7f92d66f37	[ThinLTO] Fix bug when importing writeonly variables Patch enables import of write-only variables with non-trivial initializers to fix linker errors. Initializers of imported variables are converted to 'zeroinitializer' to avoid promotion of referenced objects. Differential revision: https://reviews.llvm.org/D70006	2019-11-08 20:50:34 +03:00
Kazu Hirata	9aff5e1c18	[JumpThreading] Fix a comment typo (NFC) Reviewers: kazu Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70013	2019-11-08 09:29:46 -08:00
Philip Reames	8d22100f66	[LICM] Support hoisting of dynamic allocas out of loops This patch implements a correct, but not terribly useful, transform. In particular, if we have a dynamic alloca in a loop which is guaranteed to execute, and provably not captured, we hoist the alloca out of the loop. The capture tracking is needed so that we can prove that each previous stack region dies before the next one is allocated. The transform decreases the amount of stack allocation needed by a linear factor (e.g. the iteration count of the loop). Now, I really hope no one is actually using dynamic allocas. As such, why this patch? Well, the actual problem I'm hoping to make progress on is allocation hoisting. There's a large draft patch out for review (https://reviews.llvm.org/D60056), and this patch was the smallest chunk of testable functionality I could come up with which takes a step vaguely in that direction. Once this is in, it makes motivating the changes to capture tracking mentioned in TODOs testable. After that, I hope to extend this to trivial malloc free regions (i.e. free dominating all loop exits) and allocation functions for GCed languages. Differential Revision: https://reviews.llvm.org/D69227	2019-11-08 08:19:48 -08:00
Philip Reames	787dba7aae	[LICM] Hoisting of widenable conditions out of loops The change itself is straightforward and obvious, but ... there's an existing test checking for exactly the opposite. Both I and Artur think this is simply conservatism in the initial implementation. If anyone bisects a problem to this, a counterexample will be very interesting. Differential Revision: https://reviews.llvm.org/D69907	2019-11-08 08:19:48 -08:00
Gil Rapaport	11ed1c0239	[LV] Apply sink-after & interleave-groups as VPlan transformations (NFCI) This recommits `100e797adb` (reverted in `009e032634` for failing an assert). While the root cause was independently reverted in `eaff300401`, this commit includes a LIT to make sure IVDescriptor's SinkAfter logic does not try to sink branch instructions.	2019-11-08 15:25:14 +02:00
Daniil Suchkov	7b9f5401a6	[NFC][IndVarS] Adjust a comment (test commit)	2019-11-08 14:51:36 +07:00
Craig Topper	6749dc3446	[InstCombine] Don't transform bitcasts between x86_mmx and v1i64 into insertelement/extractelement x86_mmx is conceptually a vector already. Don't introduce an extra conversion between it and scalar i64. I'm using VectorType::isValidElementType which checks for floating point, integer, and pointers to hopefully make this more readable than just blacklisting x86_mmx. Differential Revision: https://reviews.llvm.org/D69964	2019-11-07 15:14:13 -08:00
Daniel Sanders	25ee861372	[debugify] Move the Debugify pass from tools/opt to lib/Transform/Utils Summary: I need to make use of this pass from a driver program that isn't opt. Therefore this patch moves this pass into the LLVM library so that it is available for use elsewhere. There was one function I kept in tools/opt, exportDebugifyStats(), because it serializes the statistics into a human-readable format and this seemed more in keeping with opt than a library function. Reviewers: vsk, aprantl Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69926	2019-11-07 14:41:54 -08:00
Vedant Kumar	a087b78bc4	Wrong debug info generated at -O2 (-O0 is correct) The InstCombine pass was erasing trivially dead instructions without updating the dependent llvm.dbg.value, so the debugger did not show the programmer the current state of variables. As part of this fix I did the following: iterate through all the users (llvm.dbg) of an instruction which is trivially dead and set each of them to undef before deleting the instruction. Now the user will see "optimized out" when trying to print those variables. This fixes https://bugs.llvm.org/show_bug.cgi?id=43893 This is my first fix to llvm. Patch by kamlesh kumar! Differential Revision: https://reviews.llvm.org/D69809	2019-11-07 11:19:41 -08:00
Sanjay Patel	d9ccb6367a	[InstCombine] canonicalize shift+logic+shift to reduce dependency chain shift (logic (shift X, C0), Y), C1 --> logic (shift X, C0+C1), (shift Y, C1) This is an IR translation of an existing SDAG transform added here: rL370617 So we again have 9 possible patterns with a commuted IR variant of each pattern: https://rise4fun.com/Alive/VlI https://rise4fun.com/Alive/n1m https://rise4fun.com/Alive/1Vn Part of the motivation is to allow easier recognition and subsequent canonicalization of bswap patterns as discussed in PR43146: https://bugs.llvm.org/show_bug.cgi?id=43146 We had to delay this transform because it used to allow the SLP vectorizer to create awful reductions out of simple load-combines. That problem was fixed with: rL375025 (we'll bring back load combining in IR someday...) The backend is also better equipped to deal with these patterns now using hooks like TLI.getShiftAmountThreshold(). The only remaining potential controversy is that the -reassociate pass tends to reverse this kind of pattern (to help GVN?). But since -reassociate doesn't do anything with these specific patterns, there is no conflict currently. Finally, there's a new pass proposal at D67383 for general tree-height-reduction reassociation, and it could use a cost model to decide how to optimally rearrange these kinds of ops for a target. That patch appears to be stalled. Differential Revision: https://reviews.llvm.org/D69842	2019-11-07 12:09:45 -05:00
evgeny	dde589389f	[ThinLTO] Import readonly vars with refs Patch allows importing declarations of functions and variables, referenced by the initializer of some other readonly variable. Differential revision: https://reviews.llvm.org/D69561	2019-11-07 15:13:35 +03:00
Sanjay Patel	7ff57705ba	[SLP] allow forming 2-way reduction patterns We have a vector compare reduction problem seen in PR39665 comment 2: https://bugs.llvm.org/show_bug.cgi?id=39665#c2 Or slightly reduced here: define i1 @cmp2(<2 x double> %a0) { %a = fcmp ogt <2 x double> %a0, <double 1.0, double 1.0> %b = extractelement <2 x i1> %a, i32 0 %c = extractelement <2 x i1> %a, i32 1 %d = and i1 %b, %c ret i1 %d } SLP would not attempt to turn this into a vector reduction because there is an artificial lower limit on that transform. We can not completely remove that limit without inducing regressions though, so this patch just hacks an extra attempt at creating a 2-way reduction to the end of the analysis. As shown in the test file, we are still not getting some of the motivating cases, so follow-on patches will be needed to solve those cases. Differential Revision: https://reviews.llvm.org/D59710	2019-11-07 06:08:42 -05:00
Eric Christopher	009e032634	Temporarily Revert "[LV] Apply sink-after & interleave-groups as VPlan transformations (NFC)" as it's causing assert failures. This reverts commit `100e797adb`.	2019-11-06 21:58:28 -08:00
Wenlei He	ba1dfae054	Keep import function list for inlinee profile update Summary: When adjusting function entry counts after inlining, Function::setEntryCount is called without providing an import function list. The side effect of that is the previously set import function list will be dropped. The import function list is used by ThinLTO to help import hot cross module callee for LTO inlining, so dropping that during ThinLTO pre-link may adversely affect LTO inlining. The fix is to keep the list while updating entry counts for inlining. Reviewers: wmi, davidxl, tejohnson Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69736	2019-11-06 18:36:00 -08:00
Eric Christopher	e511c4b0df	Temporarily Revert: "[SLP] Generalization of stores vectorization." "[SLP] Fix -Wunused-variable. NFC" "[SLP] Vectorize jumbled stores." As they're causing significant (10-30x) compile time regressions on vectorizable code. The primary cause of the compile-time regression is `f228b53716`. This reverts commits: `f228b53716` `5503455ccb` `21d498c9c0`	2019-11-06 16:06:15 -08:00
Philip Reames	8748be7750	[LoopPred] Enable new transformation by default The basic idea of the transform is to convert variant loop exit conditions into invariant exit conditions by changing the iteration on which the exit is taken when we know that the trip count is unobservable. See the original patch which introduced the code for a more complete explanation. The individual parts of this have been reviewed, the result has been fuzzed, and then further analyzed by hand, but despite all of that, I will not be surprised to see breakage here. If you see problems, please don't hesitate to revert - though please do provide a test case. The most likely class of issues are latent SCEV bugs and without a reduced test case, I'll be essentially stuck on reducing them. (Note: A bunch of tests were opted out of the new transform to preserve coverage. That landed in a previous commit to simplify revert cycles if they turn out to be needed.)	2019-11-06 15:41:57 -08:00
Kazu Hirata	f0f73ed8b0	[JumpThreading] Factor out code to clone instructions (NFC) Summary: This patch factors out code to clone instructions -- partly for readability and partly to facilitate an upcoming patch of my own. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69861	2019-11-06 14:16:48 -08:00
Philip Reames	686f449e3d	[WC] Fix a subtle bug in our definition of widenable branch We had a subtle, but nasty bug in our definition of a widenable branch, and thus in the transforms which used that utility. Specifically, we returned true for any branch which included a widenable condition within its condition, regardless of whether that widenable condition also had other uses. The problem is that the result of the WC() call is defined to be one particular value. As such, all users must agree as to what that value is. If we widen a branch without also updating all other users of the WC in the same way, we have broken the required semantics. Most of the textual diff is updating existing transforms not to leave dead uses hanging around. They're largely NFC as the dead instructions would be immediately deleted by other passes. The reason to make these changes is so that the transforms preserve the widenable branch form. In practice, we don't get bitten by this only because it isn't profitable to CSE WC() calls and the lowering pass from guards uses distinct WC calls per branch. Differential Revision: https://reviews.llvm.org/D69916	2019-11-06 14:16:34 -08:00
Philip Reames	9bfa5ab3d1	[LoopPred] Fix two subtle issues found by inspection This patch fixes two issues noticed by inspection when going to enable the loop predication code in IndVarSimplify. Issue 1 - Both the LoopPredication transform, and the already on by default optimizeLoopExits transform, modify the exit count of the exits they modify. (either to 0 or Infinity) Looking at the code more closely, this was not reflected into SCEV and we were instead running later transforms with incorrect SCEVs. Fixing this requires forgetting the loop, weakening a too strong assert, and updating SCEV to not pessimize results when a loop is provably untaken. I haven't been able to find a test case to demonstrate the miscompile. Issue 2 - For modules without a data layout, we can end up with unsized pointer typed exit counts. Just bail out of this case. I think these are the last two issues which need to be addressed before we enable this by default. The code has already survived a decent amount of fuzzing without revealing either of the above. Differential Revision: https://reviews.llvm.org/D69695	2019-11-06 14:04:45 -08:00
Roman Lebedev	4fe94d0331	[LoopUnroll] countToEliminateCompares(): fix handling of [in]equality predicates (PR43840) Summary: I believe this bisects to https://reviews.llvm.org/D44983 (`[LoopUnroll] Only peel if a predicate becomes known in the loop body.`) While that revision did contain tests that showed arguably-subpar peeling for [in]equality predicates that [not] happen in the middle of the loop, it also disabled peeling for the first loop iteration, because latch would be canonicalized to [in]equality comparison.. That was intentional as per https://reviews.llvm.org/D44983#1059583. I'm not 100% sure that i'm using correct checks here, but this fix appears to be going in the right direction.. Let me know if i'm missing some checks here.. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=43840 \| PR43840 ]]. Reviewers: fhahn, mkazantsev, efriedma Reviewed By: fhahn Subscribers: xbolva00, hiraditya, zzheng, llvm-commits, fhahn Tags: #llvm Differential Revision: https://reviews.llvm.org/D69617	2019-11-06 15:08:59 +03:00
Sjoerd Meijer	6c2a4f5ff9	[TTI][LV] preferPredicateOverEpilogue We have two ways to steer creating a predicated vector body over creating a scalar epilogue. To force this, we have 1) a command line option and 2) a pragma available. This adds a third: a target hook to TargetTransformInfo that can be queried whether predication is preferred or not, which allows the vectoriser to make the decision without forcing it. While this change behaves as a non-functional change for now, it shows the required TTI plumbing, usage of this new hook in the vectoriser, and the beginning of an ARM MVE implementation. I will follow up on this with: - a complete MVE implementation, see D69845. - a patch to disable this, i.e. we should respect "vector_predicate(disable)" and its corresponding loophint. Differential Revision: https://reviews.llvm.org/D69040	2019-11-06 10:14:20 +00:00
Alina Sbirlea	4b698645d3	[LoopRotationUtils] Check values are newly inserted into maps. This is a cleanup that came up in D63680. All values added to the ValueMaps should be newly added.	2019-11-05 13:40:10 -08:00
Sergey Dmitriev	82588e05cc	[SLP] - Add couple safety checks to TreeEntry::dump(). NFC Summary: Check for MainOp and AltOp for NULL before dereferencing or issue NULL. Reviewers: Vasilis, dtemirbulatov, RKSimon, ABataev Reviewed By: ABataev Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69812	2019-11-05 09:57:30 -08:00
Kazu Hirata	893afb9ca1	[JumpThreading] Factor out code to merge basic blocks (NFC) Summary: This patch factors out code to merge a basic block with its sole successor -- partly for readability and partly to facilitate an upcoming patch of my own. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69852	2019-11-05 09:46:57 -08:00
Gil Rapaport	100e797adb	[LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) This recommits `2be17087f8` (reverted in `d3ec06d219` for heap-use-after-free) with a fix in IAI's reset() which was not clearing the set of interleave groups after deleting them.	2019-11-05 17:29:13 +02:00
Francis Visoiu Mistrih	47d1029788	[ObjC][ARC] Ignore lifetime markers between *ReturnValue calls When eliminating a pair of `llvm.objc.autoreleaseReturnValue` followed by `llvm.objc.retainAutoreleasedReturnValue` we need to make sure that the instructions in between are safe to ignore. Other than bitcasts and useless GEPs, it's also safe to ignore lifetime markers for both static allocas (lifetime.start/lifetime.end) and dynamic allocas (stacksave/stackrestore). These get added by the inliner as part of the return sequence and can prevent the transformation from happening in practice. Differential Revision: https://reviews.llvm.org/D69833	2019-11-05 06:39:22 -08:00
Kazu Hirata	0016c1f400	[JumpThreading] Factor out common code to update the SSA form (NFC) Summary: This patch factors out common code to update the SSA form in JumpThreading.cpp -- partly for readability and partly to facilitate an upcoming patch of my own. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69811	2019-11-05 06:15:44 -08:00
Simon Pilgrim	77debf51ab	[GVN] Fix uninitialized variable warnings. NFCI.	2019-11-05 14:10:32 +00:00
Simon Pilgrim	1842fe6be3	Add missing GVN =operator. NFCI. Fixes PVS Studio warning that the 'ValueTable' class implements a copy constructor, but lacks the '=' operator.	2019-11-05 13:41:50 +00:00
Roman Lebedev	ccf1a5f4bb	[InstCombine] dropRedundantMaskingOfLeftShiftInput(): truncation (PR42563) Summary: That fold keeps growing and growing :( I think this may be one of the last pieces for it. Since D67677/D67725, the fold knows the general form of the pattern - where some masking is needed: https://rise4fun.com/Alive/F5R https://rise4fun.com/Alive/gslRa But there is one more huge piece missing - if you are extracting some bits, it is not impossible that the origin is wider than the extraction, i.e. there may be a truncation. And we don't deal with that yet. But we can, and the generalization remains fully identical: https://rise4fun.com/Alive/Uar https://rise4fun.com/Alive/5SW After a preparatory cleanup i think the diff looks rather clean. One missing piece is that in some patterns (especially pat. b), `-1` only needs to be `-1` in final type, but that is for later.. https://bugs.llvm.org/show_bug.cgi?id=42563 Reviewers: spatel, nikic Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69125	2019-11-05 12:41:26 +03:00
Philip Reames	6ff439b57f	[SimplifyCFG] Use a (trivially) dominating widenable branch to remove later slow path blocks This transformation is a variation on the GuardWidening transformation we have checked in as its own pass. Instead of focusing on merging (i.e. hoisting and simplifying) two widenable branches, this transform makes the observation that simply removing a second slowpath block (by reusing an existing one) is often a very useful canonicalization. This may lead to later merging, or may not. This is a useful generalization when the intermediate block has loads whose dereferenceability is hard to establish. As noted in the patch, this can be generalized further, and will be. Differential Revision: https://reviews.llvm.org/D69689	2019-11-04 11:03:28 -08:00
Amy Huang	ab76cfdd20	Recommit "[CodeView] Add option to disable inline line tables." This reverts commit `004ed2b0d1`. Original commit hash `6d03890384` Summary: This adds a clang option to disable inline line tables. When it is used, the inliner uses the call site as the location of the inlined function instead of marking it as an inline location with the function location. https://reviews.llvm.org/D67723	2019-11-04 09:15:26 -08:00
Alexey Bataev	b80c41cd3c	[SLP]Fix PR43799: Crash on different sizes of GEP indices. Summary: If the GEP instructions are going to be vectorized, the indices in those GEP instructions must be of the same type. Otherwise, the compiler may crash when trying to build the vector constant. Reviewers: RKSimon, spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69627	2019-11-04 10:36:26 -05:00
Benjamin Kramer	d3ec06d219	Revert "[LV] Apply sink-after & interleave-groups as VPlan transformations (NFC)" This reverts commit `2be17087f8`. Fails ASAN.	2019-11-04 15:04:42 +01:00
David Spickett	91167e22ec	[hwasan] Remove lazy thread-initialisation This was an experiment made possible by a non-standard feature of the Android dynamic loader. It required introducing a flag to tell the compiler which ABI was being targeted. This flag is no longer needed, since the generated code now works for both ABIs. We leave that flag untouched for backwards compatibility. This also means that if we need to distinguish between targeted ABIs again we can do that without disturbing any existing workflows. We leave a comment in the source code and mention in the help text to explain this for any confused person reading the code in the future. Patch by Matthew Malcomson Differential Revision: https://reviews.llvm.org/D69574	2019-11-04 10:58:46 +00:00
Gil Rapaport	2be17087f8	[LV] Apply sink-after & interleave-groups as VPlan transformations (NFC) The sink-after and interleave-group vectorization decisions were so far applied to VPlan during initial VPlan construction, which complicates VPlan construction – also because of their inter-dependence. This patch refactors buildVPlanWithRecipes() to construct a simpler initial VPlan and later apply both these vectorization decisions, in order, as VPlan-to-VPlan transformations. Differential Revision: https://reviews.llvm.org/D68577	2019-11-04 10:37:39 +02:00
Dávid Bolvanský	058b5028de	Reland '[InstructionCombining] Fixed null check after dereferencing warning. NFCI.'	2019-11-03 20:34:54 +01:00
Dávid Bolvanský	5b37c018d5	Revert "[InstructionCombining] Fixed null check after dereferencing warning. NFCI." This reverts commit `8308187fd9`. This exposed a bug.	2019-11-03 20:31:05 +01:00
Dávid Bolvanský	d825ed24d2	Revert "[InstructionCompares] Fixed null check after dereferencing warning. NFCI." This reverts commit `b8685cf304`.	2019-11-03 20:24:01 +01:00
Dávid Bolvanský	b8685cf304	[InstructionCompares] Fixed null check after dereferencing warning. NFCI.	2019-11-03 20:13:45 +01:00
Dávid Bolvanský	8308187fd9	[InstructionCombining] Fixed null check after dereferencing warning. NFCI.	2019-11-03 20:10:46 +01:00
Dávid Bolvanský	8262a5b701	[CHR] Fixed null check after dereferencing warning. NFCI.	2019-11-03 20:06:38 +01:00
Dávid Bolvanský	914128ab12	[LoopUnrollRuntime] Fixed null check after dereferencing warning. NFCI.	2019-11-03 20:05:18 +01:00
Dávid Bolvanský	60cb193a40	[LoopUnrollAndJam] Fixed null check after dereferencing warning. NFCI.	2019-11-03 20:02:54 +01:00
Simon Pilgrim	81ba611e88	Ensure VPlanPrinter::Depth is initialized to fix static analyzer warning. NFCI.	2019-11-03 11:17:05 +00:00
Johannes Doerfert	77a6b358b5	[Attributor][NFCI] Do not track unnecessary dependences If we do not look at assumed information there is no need to track dependences.	2019-11-02 15:26:30 -05:00
Johannes Doerfert	680f638027	[Attributor][NFCI] Distinguish optional and required dependences Dependences between two abstract attributes SRC and TRG come naturally in two flavors: Either (1) "some" information of SRC is required for TRG to derive information, or (2) SRC is just an optional way for TRG to derive information. While it is not strictly necessary to distinguish these types explicitly, it can help us to converge faster, in terms of iterations, and also cut down the number of `AbstractAttribute::update` calls. As far as I can tell, we only use optional dependences for liveness so far but that might change in the future. With this change the Attributor can be informed about the "dependence class" and it will perform appropriate actions when an Attribute is set to an invalid state, thus one that cannot be used by others to derive information from.	2019-11-02 15:26:22 -05:00
Stefan Stipanovic	f35740d6e9	NoFree argument attribute. Summary: Deducing nofree attribute for function arguments. Reviewers: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67886	2019-11-02 19:40:48 +01:00
Stefan Stipanovic	5fb1782918	Revert "NoFree argument attribute." This reverts commit `c12efa2ed0`.	2019-11-02 17:31:02 +01:00
Stefan Stipanovic	c12efa2ed0	NoFree argument attribute. Summary: Deducing nofree attribute for function arguments. Reviewers: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67886	2019-11-02 16:35:38 +01:00
Roman Lebedev	c4b757be02	Revert BCmp Loop Idiom recognition transform (PR43870) As discussed in https://bugs.llvm.org/show_bug.cgi?id=43870, this transform is missing a crucial legality check: the old (non-countable) loop would early-return upon first mismatch, but there is no such guarantee for bcmp/memcmp. We'd need to ensure that [PtrA, PtrA+NBytes) and [PtrB, PtrB+NBytes) are fully dereferenceable memory regions. But that would limit the transform to constant loop trip counts and would further cripple it because dereferenceability analysis is very partial. Furthermore, even if all that is done, every single test would need to be rewritten from scratch. So let's just give up.	2019-11-02 12:48:03 +03:00
Johannes Doerfert	2d77b0cad0	[Attributor] Ignore BlockAddress users in call site traversal BlockAddress users will not "call" the function so they do not qualify as call sites in the first place. When we delete a function with BlockAddress users we need to first remove the body so they are properly discarded.	2019-11-02 01:23:18 -05:00
Johannes Doerfert	07d16424f2	[Attributor][FIX] Do not try to cast if a cast is not required When we replace constant returns at the call site we did issue a cast in the hopes it would be a no-op if the types are equal. Turns out that is not the case and we have to check it ourselves first. Reused an IPConstantProp test for coverage. No functional change to the test wrt. IPConstantProp.	2019-11-02 00:54:00 -05:00
Johannes Doerfert	c7ab19dbb0	[Attributor][FIX] Transform invoke of nounwind to call + br %normal_dest Even if the invoked function may-return, we can replace it with a call and branch if it is nounwind. We had almost everything in place to do this but did not which actually caused a crash when we removed the landingpad from the actually dead unwind block. Exposed by the IPConstantProp tests.	2019-11-02 00:54:00 -05:00
Johannes Doerfert	3cbe3314b4	[Attributor][FIX] Make "known" and "assumed" liveness explicit We did merge "known" and "assumed" liveness information into a single set which caused various kinds of problems, especially because we did not properly record when something was actually known. With this patch we properly track the "known" bit and distinguish dead ends we know from the ones we still need to explore in future updates.	2019-11-02 00:49:29 -05:00
Johannes Doerfert	1b6041a9e8	[Attributor] `willreturn` + `noreturn` = UB We gave up on `noreturn` if `willreturn` was known for a while but we now again try to always derive `noreturn`. This is useful because a function that is `noreturn` + `willreturn` is basically dead as executing it would lead to undefined behavior (UB). This came up in the IPConstantProp cases where a function only contained an unreachable but was not marked `noreturn` which caused missed opportunities down the line.	2019-11-02 00:35:22 -05:00
Johannes Doerfert	e360ee6265	[Attributor][FIX] Make AAValueSimplifyArgument aware of thread-dependent constants As in IPConstantProp, thread-dependent constants need not be propagated over callbacks. Took the comment and test from there, see also D56447.	2019-11-02 00:32:39 -05:00
Johannes Doerfert	ed47a9cde4	[Attributor][FIX] Handle the default case of a switch In D69605 only the "cases" of a switch were handled but if none matched we did not make the default case live. This is fixed now and properly tested (with code from IPConstantProp/user-with-multiple-uses.ll).	2019-11-02 00:30:31 -05:00
Johannes Doerfert	15cd90a2c4	[Attributor][FIX] Make value simplification aware of "complicated" attributes We cannot simply replace arguments that carry attributes like `nest`, `inalloca`, `sret`, and `byval`. Except for the last one, which we can replace if it is not written, we bail for now.	2019-11-02 00:29:17 -05:00
Johannes Doerfert	c36e2ebf9f	[Attributor][NFCI] Avoid unnecessary work except for testing Trying to deduce information for declarations and calls sites of declarations is not useful in practice but only for testing. Add a flag that disables this by default but also enable it in the tests. The misc.ll test will verify the flag "works" as expected.	2019-11-02 00:28:24 -05:00
Johannes Doerfert	0437bfcc83	[Attributor][FIX] NoCapture is not a subsuming property We cannot look at the subsuming positions and take their nocapture bit as shown with the two tests for which we derived nocapture on the call site argument and readonly on the argument of the second before.	2019-11-02 00:26:15 -05:00
Johannes Doerfert	0c7d4d7f3e	[Attributor][NFCI] Remove obsolete code The code in question does not add anything as the class is a subclass of AACallSiteReturnedFromReturnedAndMustBeExecutedContext already.	2019-11-02 00:25:46 -05:00
Teresa Johnson	16ec00eee7	Recommit "[ThinLTO] Handle GUID collision in import global processing" This recommits `cc0b9647b7` which was reverted in `d39d1a2f87`. I added a fix for an issue found when testing via distributed ThinLTO, and added a test case for that failure.	2019-11-01 13:57:01 -07:00
Teresa Johnson	d39d1a2f87	Revert "[LLD][ThinLTO] Handle GUID collision in import global processing" This reverts commit `cc0b9647b7`. The commit is causing a failure in internal testing. Will recommit with a fix later.	2019-11-01 10:02:58 -07:00
Johannes Doerfert	eb4f41dfe5	[Attributor] Really use the executed-context Before we did not follow casts and geps when we looked at the users of a pointer in the pointer's must-be-executed-context. This caused us to fail to determine if it was accessed for sure. With this change we follow such users now. The above extension exposed problems in getKnownNonNullAndDerefBytesForUse which did not always check what the base pointer was. We also did not handle negative offsets as conservatively as we have to without explicit loop handling. Finally, we should not derive a huge number if we access a pointer that was traversed backwards first. The problems exposed by this functional change are already tested in the existing test cases as is the functional change. Differential Revision: https://reviews.llvm.org/D69647	2019-10-31 15:09:45 -05:00
Alexey Bataev	70ad617dd6	[SLP] Vectorize jumbled stores. Summary: Patch adds support for vectorization of the jumbled stores. The value operands are vectorized and then shuffled in the right order before store. Reviewers: RKSimon, spatel, hfinkel, mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43339	2019-10-31 16:02:25 -04:00
Johannes Doerfert	2d6d651e8c	[Attributor] Make AANonNull perform context sensitive queries Summary: In order to get context sensitivity from isKnownNonZero we need to provide a context instruction and a dominator tree. The latter is passed now to which actually allows to remove some initialization code. Tests taken from PR43833. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69595	2019-10-31 14:47:06 -05:00
Craig Topper	6773435624	[IPCP] Bail on extractvalue's with more than 1 index. The replacement code only looks at the first index of the extractvalue. If there are additional indices we'll end up doing a bad replacement. This only happens if the function returns a nested struct. Not sure if clang ever generates such code. The original report came from ispc. Fixes PR43857 Differential Revision: https://reviews.llvm.org/D69656	2019-10-31 10:55:20 -07:00
Sanjay Patel	a2240f57e7	[InstCombine] simplify fcmp+select canonicalization; NFCI We had 2 blocks of code that are nearly identical. Existing regression tests should cover both of the patterns.	2019-10-31 13:13:32 -04:00
David Green	a5f7bc0de7	[InstCombine] Canonicalize uadd.with.overflow to uadd.sat This adds some patterns to transform uadd.with.overflow to uadd.sat (with usub.with.overflow to usub.sat too). The patterns selects from UINTMAX (or 0 for subs) depending on whether the operation overflowed. Signed patterns are a little more involved (they can wrap in two directions), but can be added here in a followup patch too. Differential Revision: https://reviews.llvm.org/D69245	2019-10-31 12:45:38 +00:00
Serguei Katkov	1eb04d289a	[LICM] Invalidate SCEV upon instruction hoisting Since SCEV can cache information about location of an instruction, it should be invalidated when the instruction is moved. There should be similar bug in code sinking part of LICM, it will be fixed in a follow-up change. Patch Author: Daniil Suchkov Reviewers: asbirlea, mkazantsev, reames Reviewed By: asbirlea Subscribers: hiraditya, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D69370	2019-10-31 17:37:53 +07:00
Haojian Wu	e65ddcafee	Revert "[SLP] Vectorize jumbled stores." This reverts commit `21d498c9c0`. This commit causes some crashes on some targets.	2019-10-31 10:21:24 +01:00
Johannes Doerfert	31784248ee	[Attributor][NFCI] Improve the usage of IntegerStates Setting the upper bound directly in the state can be beneficial and simplifies the logic. This also exposed more copy&paste type errors.	2019-10-31 01:05:52 -05:00
Johannes Doerfert	dac2d403a2	[Attributor] Make liveness "edge-based" Summary: Whether control is transferred to a successor is the key question when it comes to liveness. The new implementation puts that question in focus, thereby providing a clean way to assume certain CFG edges are dead or instructions will not transfer control. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69605	2019-10-31 00:35:18 -05:00
Johannes Doerfert	cd4aab4a8a	[Attributor] Liveness for values Summary: This patch introduces liveness (AAIsDead) for all positions, thus for all kinds of values. For now, we say an instruction is dead if it would be removed assuming all users are dead. A call site return is different as we just look at the users. If all call site returns have been eliminated, the return values can return undef instead of their original value, eliminating uses. We try to recursively delete dead instructions now and we introduce a simple check interface for use-traversal. This is the idea tried out in D68626 but implemented in the right way. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68925	2019-10-31 00:16:36 -05:00
Johannes Doerfert	5e442a51bc	[Attributor][NFC] Do not delete dead blocks but "clear" them Deleting blocks will require us to deal with dead edges, e.g., `br i1 false, label %live, label %dead` explicitly. For now we just clear the blocks and move on. This will be revisited once we actually fold branches.	2019-10-31 00:09:50 -05:00
Johannes Doerfert	0be9cf2da9	[Attributor] Add "free"-based heap2stack deduction Summary: If there is a unique free of the allocated that has to be reached from the malloc, we can apply the heap-2-stack transformation even if the pointer escapes. Reviewers: hfinkel, sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68958	2019-10-30 20:57:57 -05:00
Johannes Doerfert	2dad729f0c	[Attributor][NFC] Eagerly mark attributes as fixed. If an attribute did not query any optimistic (=non-fixed) information to justify its state, we know the attribute state will not change anymore. Thus, we can indicate an optimistic fixpoint.	2019-10-30 20:47:47 -05:00
Johannes Doerfert	12173e60ec	[Attributor][NFC] Do not record dependences on fixed attributes Since fixed values cannot change, we do not need to wait for it to happen, we will never notify the dependent attribute anyway.	2019-10-30 20:44:03 -05:00
Johannes Doerfert	b2083c5382	[Attributor][NFC] Simplify the IRPosition interface We pretended IRPosition came either as mutable or immutable objects while they are basically always immutable, with a single (existing) unfortunate exception. This patch cleans up the uses to deal with the immutable version.	2019-10-30 20:43:05 -05:00
Teresa Johnson	c844f8846a	[ThinLTO/WPD] Fix index-based WPD for available_externally vtables Summary: Clang does not add type metadata to available_externally vtables. When choosing a summary to look at for virtual function definitions, make sure we skip summaries for any available_externally vtables as they will not describe any virtual functions, which are only summarized in the presence of type metadata on the vtable def. Simply look for the corresponding strong def's summary. Also add handling for same-named local vtables with the same GUID because of same-named files without enough distinguishing path. In that case we return a conservative result with no devirtualization. Reviewers: pcc, davidxl, evgeny777 Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69452	2019-10-30 17:59:08 -07:00
Amy Huang	004ed2b0d1	Revert "[CodeView] Add option to disable inline line tables." because it breaks compiler-rt tests. This reverts commit `6d03890384`.	2019-10-30 17:31:12 -07:00
Amy Huang	6d03890384	[CodeView] Add option to disable inline line tables. Summary: This adds a clang option to disable inline line tables. When it is used, the inliner uses the call site as the location of the inlined function instead of marking it as an inline location with the function location. See https://bugs.llvm.org/show_bug.cgi?id=42344 Reviewers: rnk Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D67723	2019-10-30 16:52:39 -07:00
tyker	c3b06d0c39	[InstCombine] keep assumption before sinking calls Summary: in the following C code the branch is not removed by clang in O3. ``` int f1(char* p) { int i1 = __builtin_strlen(p); if (!p) return -1; return i1; } ``` The issue is that the call to strlen is sunk to the following block by instcombine. In its new place the call to strlen doesn't dominate the use in the icmp anymore so value tracking can't see that p cannot be null. This patch resolves the issue by inserting an assumption at the place of the call before sinking a call when that call can be used to prove an argument to be nonnull. This resolves this issue at O3. Reviewers: majnemer, xbolva00, fhahn, jdoerfert, spatel, efriedma Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69477	2019-10-31 00:15:19 +01:00
Alexey Bataev	21d498c9c0	[SLP] Vectorize jumbled stores. Summary: Patch adds support for vectorization of the jumbled stores. The value operands are vectorized and then shuffled in the right order before store. Reviewers: RKSimon, spatel, hfinkel, mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43339	2019-10-30 13:33:52 -04:00
Simon Pilgrim	d52f5ed01a	[SLPVectorizer] Use getAPInt() for comparison. NFCI. Technically integers can assert on getZExtValue() if beyond i64 range, and a fuzzer usually finds this.	2019-10-30 16:16:55 +00:00
Karl-Johan Karlsson	760ed8da98	[AddressSanitizer] Only instrument globals of default address space The address sanitizer ignores memory accesses from different address spaces; however, when instrumenting globals the check for different address spaces is missing. This results in an assertion failure. The fault was found in an out-of-tree target. The patch skips all globals of non-default address space. Reviewed By: leonardchan, vitalybuka Differential Revision: https://reviews.llvm.org/D68790	2019-10-30 09:32:19 +01:00

25239 Commits