llvm-project

Commit Graph

Author	SHA1	Message	Date
David Green	67121d7b82	[CGP] Enable CodeGenPrepares phi type convertion.	2020-06-21 16:46:16 +01:00
David Green	730ecb63ec	[CGP] Convert phi types If a collection of interconnected phi nodes is only ever loaded, stored or bitcast then we can convert the whole set to the bitcast type, potentially helping to reduce the number of register moves needed as the phi's are passed across basic block boundaries. This has to be done in CodegenPrepare as it naturally straddles basic blocks. The alorithm just looks from phi nodes, looking at uses and operands for a collection of nodes that all together are bitcast between float and integer types. We record visited phi nodes to not have to process them more than once. The whole subgraph is then replaced with a new type. Loads and Stores are bitcast to the correct type, which should then be folded into the load/store, changing it's type. This comes up in the biquad testcase due to the way MVE needs to keep values in integer registers. I have also seen it come up from aarch64 partner example code, where a complicated set of sroa/inlining produced integer phis, where float would have been a better choice. I also added undef and extract element handling which increased the potency in some cases. This adds it with an option that defaults to off, and disabled for 32bit X86 due to potential issues around canonicalizing NaNs. Differential Revision: https://reviews.llvm.org/D81827	2020-06-21 15:54:17 +01:00
Davide Italiano	1cbaf847ab	[CGP] Reset the debug location when promoting zext(s). When the zext gets promoted, it used to retain the original location, which pessimizes the debugging experience causing an unexpected jump in stepping at -Og. Fixes https://bugs.llvm.org/show_bug.cgi?id=46120 (which also contains a full C repro). Differential Revision: https://reviews.llvm.org/D81437	2020-06-17 11:13:13 -07:00
Davide Italiano	c2dccf9d5e	[CodeGenPrepare] Reset the debug location when promoting trunc(s) The promotion machinery in CGP moves instructions retaining debug locations. When the transformation is local, this is mostly correct, but when instructions are moved cross-BBs, this is not always true and causes jumpiness in line tables. This is the first of a series of commits. sext(s) and zext(s) need to be treated similarly. Differential Revision: https://reviews.llvm.org/D81879	2020-06-15 14:25:43 -07:00
Christopher Tetreault	caa2fddce7	[SVE] Eliminate calls to default-false VectorType::get() from CodeGen Reviewers: efriedma, c-rhodes, david-arm, spatel, craig.topper, aqjune, paquette, arsenm, gchatelet Reviewed By: spatel, gchatelet Subscribers: wdng, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80313	2020-06-08 10:26:10 -07:00
Nikita Popov	cb5724c71e	[CGP] Remove unnecessary MaybeAlign use (NFC) Stores now always have an alignment.	2020-06-05 23:18:26 +02:00
Philip Reames	382b3023cb	[Statepoints][CGP] Minor parameter type cleanup	2020-06-03 16:00:38 -07:00
Eric Christopher	153a24ab0f	Undo initialization of TRI in CGP as this is unconditionally initialized later.	2020-06-02 15:08:54 -07:00
Eric Christopher	971459c3ef	Fix up clang-tidy warnings around null and pointers.	2020-06-02 13:24:20 -07:00
Simon Pilgrim	b9826c1086	[CGP] Ensure address scaled offset is representable as int64_t AddressingModeMatcher::matchScaledValue was calling getSExtValue for a constant before ensuring that we can actually represent the value as int64_t Fixes OSSFuzz#22723 which is a followup to rGc479052a74b2 (PR46004 / OSSFuzz#22357)	2020-05-29 12:25:43 +01:00
Philip Reames	87bea912c2	[Statepoint] Replace uses of isX functions with idiomatic isa<X> Now that all of the statepoint related routines have classes with isa support, let's cleanup. I'm leaving the (dead) utitilities in tree for a few days so that I can do the same cleanup downstream without breakage.	2020-05-27 18:32:28 -07:00
Sanjay Patel	7eed772a27	[PatternMatch] abbreviate vector inst matchers; NFC Readability is not reduced with these opcodes/match lines, so reduce odds of awkward wrapping from 80-col limit.	2020-05-24 09:19:47 -04:00
Simon Pilgrim	c479052a74	[CGP] Ensure address offset is representable as int64_t AddressingModeMatcher::matchAddr was calling getSExtValue for a constant before ensuring that we can actually represent the value as int64_t Fixes PR46004 / OSSFuzz#22357	2020-05-22 17:00:22 +01:00
Mehdi Amini	8697d443ab	Fix warning "defined but not used" for debug function (NFC)	2020-05-17 23:50:18 +00:00
Eli Friedman	4f04db4b54	AllocaInst should store Align instead of MaybeAlign. Along the lines of D77454 and D79968. Unlike loads and stores, the default alignment is getPrefTypeAlign, to match the existing handling in various places, including SelectionDAG and InstCombine. Differential Revision: https://reviews.llvm.org/D80044	2020-05-16 14:53:16 -07:00
Sanjay Patel	5be37cb124	[x86][CGP] try to hoist funnel shift above select-of-splats This is basically the same patch as D63233, but converted to funnel shifts rather than regular shifts. I did not see a way to effectively share code for these 2 cases though. This follows D79718 and D79827 to re-fix PR37426 because that gets canonicalized to funnel shift intrinsics in IR. I did draft an alternative patch as an enhancement to "shouldSinkOperands()", but that was awkward because we have to key the transform from the select, but then look at both its users and its operands.	2020-05-16 10:44:47 -04:00
Sanjay Patel	26e742fd84	[x86][CGP] improve sinking of splatted vector shift amount operand Expands on the enablement of the shouldSinkOperands() TLI hook in: D79718 The last codegen/IR test diff shows what I suspected could happen - we were sinking all splat shift operands into a loop. But that's not what we want in general; we only want to sink the shift amount operand if it is a splat. Differential Revision: https://reviews.llvm.org/D79827	2020-05-14 08:36:03 -04:00
Benjamin Kramer	a8bf2deae4	[CodeGenPrepare] Remove a superflouos variable. NFC. Fixes a -Wunused-variable warning in Release builds.	2020-05-13 18:25:20 +02:00
David Green	fa15255d8a	[ARM] Convert floating point splats to integer Under MVE a vdup will always take a gpr register, not a floating point value. During DAG combine we convert the types to a bitcast to an integer in an attempt to fold the bitcast into other instructions. This is OK, but only works inside the same basic block. To do the same trick across a basic block boundary we need to convert the type in codegenprepare, before the splat is sunk into the loop. This adds a convertSplatType function to codegenprepare to do that, putting bitcasts around the splat to force the type to an integer. There is then some adjustment to the code in shouldSinkOperands to handle the extra bitcasts. Differential Revision: https://reviews.llvm.org/D78728	2020-05-13 15:24:16 +01:00
Sanjay Patel	5f05c2f59a	[CGP] remove duplicate function for finding a splat shuffle; NFC	2020-05-11 16:36:07 -04:00
Wei Mi	aa2ddfc73d	[SampleFDO] For functions without profiles, provide an option to put them in a special text section. For sampleFDO, because the optimized build uses profile generated from previous release, previously we couldn't tell a function without profile was truely cold or just newly created so we had to treat them conservatively and put them in .text section instead of .text.unlikely. The result was when we persuing the best performance by locking .text.hot and .text in memory, we wasted a lot of memory to keep cold functions inside. In https://reviews.llvm.org/D66374, we introduced profile symbol list to discriminate functions being cold versus functions being newly added. This mechanism works quite well for regular use cases in AutoFDO. However, in some case, we can only have a partial profile when optimizing a target. The partial profile may be an aggregated profile collected from many targets. The profile symbol list method used for regular sampleFDO profile is not applicable to partial profile use case because it may be too large and introduce many false positives. To solve the problem for partial profile use case, we provide an option called --profile-unknown-in-special-section. For functions without profile, we will still treat them conservatively in compiler optimizations -- for example, treat them as warm instead of cold in inliner. When we use profile info to add section prefix for functions, we will discriminate functions known to be not cold versus functions without profile (being unknown), and we will put functions being unknown in a special text section called .text.unknown. Runtime system will have the flexibility to decide where to put the special section in order to achieve a balance between performance and memory saving. Differential Revision: https://reviews.llvm.org/D62540	2020-05-08 11:18:09 -07:00
Hiroshi Yamauchi	1b4e3def03	[BFI][CGP] Add limited support for detecting missed BFI updates and fix one in CodeGenPrepare. Summary: This helps detect some missed BFI updates during CodeGenPrepare. This is debug build only and disabled behind a flag. Fix a missed update in CodeGenPrepare::dupRetToEnableTailCallOpts(). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77417	2020-05-07 11:58:00 -07:00
Sam Parker	40574fefe9	[NFC][CostModel] Add TargetCostKind to relevant APIs Make the kind of cost explicit throughout the cost model which, apart from making the cost clear, will allow the generic parts to calculate better costs. It will also allow some backends to approximate and correlate the different costs if they wish. Another benefit is that it will also help simplify the cost model around immediate and intrinsic costs, where we currently have multiple APIs. RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html Differential Revision: https://reviews.llvm.org/D79002	2020-05-05 10:35:54 +01:00
Sam Parker	e9c9329aa4	[TTI] Add TargetCostKind argument to getUserCost There are several different types of cost that TTI tries to provide explicit information for: throughput, latency, code size along with a vague 'intersection of code-size cost and execution cost'. The vectorizer is a keen user of RecipThroughput and there's at least 'getInstructionThroughput' and 'getArithmeticInstrCost' designed to help with this cost. The latency cost has a single use and a single implementation. The intersection cost appears to cover most of the rest of the API. getUserCost is explicitly called from within TTI when the user has been explicit in wanting the code size (also only one use) as well as a few passes which are concerned with a mixture of size and/or a relative cost. In many cases these costs are closely related, such as when multiple instructions are required, but one evident diverging cost in this function is for div/rem. This patch adds an argument so that the cost required is explicit, so that we can make the important distinction when necessary. Differential Revision: https://reviews.llvm.org/D78635	2020-04-28 08:57:45 +01:00
Craig Topper	a58b62b4a2	[IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand(). This method has been commented as deprecated for a while. Remove it and replace all uses with the equivalent getCalledOperand(). I also made a few cleanups in here. For example, to removes use of getElementType on a pointer when we could just use getFunctionType from the call. Differential Revision: https://reviews.llvm.org/D78882	2020-04-27 22:17:03 -07:00
Christopher Tetreault	ccd623eae3	[SVE] Remove calls to isScalable from CodeGen Reviewers: efriedma, sdesmalen, stoklund, sunfish Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77755	2020-04-23 12:58:52 -07:00
Craig Topper	68b2e507e4	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 21:31:44 -07:00
Craig Topper	fcc9d70260	Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign." This is breaking the clang build. This reverts commit `897409fb56`.	2020-04-20 13:25:06 -07:00
Craig Topper	897409fb56	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 13:08:05 -07:00
Christopher Tetreault	c858debebc	Remove asserting getters from base Type Summary: Remove asserting vector getters from Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: dexonsmith, sdesmalen, efriedma Reviewed By: efriedma Subscribers: cfe-commits, hiraditya, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D77278	2020-04-17 14:03:31 -07:00
Craig Topper	944cc5e0ab	[SelectionDAGBuilder][CGP][X86] Move some of SDB's gather/scatter uniform base handling to CGP. I've always found the "findValue" a little odd and inconsistent with other things in SDB. This simplfifies the code in SDB to just handle a splat constant address or a 2 operand GEP in the same BB. This removes the need for "findValue" since the operands to the GEP are guaranteed to be available. The splat constant handling is new, but was needed to avoid regressions due to constant folding combining GEPs created in CGP. CGP is now responsible for canonicalizing gather/scatters into this form. The pattern I'm using for scalarizing, a scalar GEP followed by a GEP with an all zeroes index, seems to be subject to constant folding that the insertelement+shufflevector was not. Differential Revision: https://reviews.llvm.org/D76947	2020-04-16 17:49:22 -07:00
Craig Topper	95192f548d	[CallSite removal][TargetLowering] Use CallBase instead of CallSite in TargetLowering::ParseConstraints interface. Differential Revision: https://reviews.llvm.org/D77929	2020-04-12 11:26:25 -07:00
Christopher Tetreault	889f6606ed	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: stoklund, sdesmalen, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77272	2020-04-10 14:53:43 -07:00
Eli Friedman	1ee6ec2bf3	Remove "mask" operand from shufflevector. Instead, represent the mask as out-of-line data in the instruction. This should be more efficient in the places that currently use getShuffleVector(), and paves the way for further changes to add new shuffles for scalable vectors. This doesn't change the syntax in textual IR. And I don't currently plan to change the bitcode encoding in this patch, although we'll probably need to do something once we extend shufflevector for scalable types. I expect that once this is finished, we can then replace the raw "mask" with something more appropriate for scalable vectors. Not sure exactly what this looks like at the moment, but there are a few different ways we could handle it. Maybe we could try to describe specific shuffles. Or maybe we could define it in terms of a function to convert a fixed-length array into an appropriate scalable vector, using a "step", or something like that. Differential Revision: https://reviews.llvm.org/D72467	2020-03-31 13:08:59 -07:00
Guozhi Wei	6d20937c29	[CodeGenPrepare] Delete intrinsic call to llvm.assume to enable more tailcall The attached test case is simplified from tcmalloc. Both function calls should be optimized as tailcall. But llvm can only optimize the first call. The second call can't be optimized because function dupRetToEnableTailCallOpts failed to duplicate ret into block case2. There 2 problems blocked the duplication: 1 Intrinsic call llvm.assume is not handled by dupRetToEnableTailCallOpts. 2 The control flow is more complex than expected, dupRetToEnableTailCallOpts can only duplicate ret into its predecessor, but here we have an intermediate block between call and ret. The solutions: 1 Since CodeGenPrepare is already at the end of LLVM IR phase, we can simply delete the intrinsic call to llvm.assume. 2 A general solution to the complex control flow is hard, but for this case, after exit2 is duplicated into case1, exit2 is the only successor of exit1 and exit1 is the only predecessor of exit2, so they can be combined through eliminateFallThrough. But this function is called too late, there is no more dupRetToEnableTailCallOpts after it. We can add an earlier call to eliminateFallThrough to solve it. Differential Revision: https://reviews.llvm.org/D76539	2020-03-31 11:55:51 -07:00
Juneyoung Lee	453eac3f77	Minor fixes to a comment in CodeGenPrepare	2020-03-25 16:34:43 +09:00
Juneyoung Lee	07a41544fd	Minor fix to a comment in CodeGenPrepare.cpp	2020-03-17 01:10:26 +09:00
Juneyoung Lee	6ad63606ea	[CodeGenPrepare] Freeze condition when transforming select to br Summary: This is a simple fix for CodeGenPrepare that freezes branch condition when transforming select to branch. If it is not frozen, instsimplify or the later pipeline can potentially exploit undefined behavior. The diff shows optimized form becase D75859 and D76048 already made a few changes to CodeGenPrepare for optimizing freeze(cmp). Reviewers: jdoerfert, spatel, lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76179	2020-03-16 12:46:20 +09:00
Juneyoung Lee	4ffe3ac729	Revert "[CodeGenPrepare] Freeze condition when transforming select to br" This reverts commit `10aa7ea951`.	2020-03-16 12:45:54 +09:00
Juneyoung Lee	10aa7ea951	[CodeGenPrepare] Freeze condition when transforming select to br Summary: This is a simple fix for CodeGenPrepare that freezes branch condition when transforming select to branch. If it is not freezed, instsimplify or the later pipeline can potentially exploit undefined behavior. The diff shows optimized form becase D75859 and D76048 already made a few changes to CodeGenPrepare for optimizing freeze(cmp). Reviewers: jdoerfert, spatel, lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76179	2020-03-15 11:10:46 +09:00
Juneyoung Lee	c39cb1c0dd	[CodeGenPrepare] Expand freeze conversion to support fcmp and icmp with null Summary: This is a simple patch that expands https://reviews.llvm.org/D75859 to pointer comparison and fcmp Checked with Alive2 Reviewers: reames, jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76048	2020-03-13 17:21:33 +09:00
Huihui Zhang	118abf2017	[SVE] Update API ConstantVector::getSplat() to use ElementCount. Summary: Support ConstantInt::get() and Constant::getAllOnesValue() for scalable vector type, this requires ConstantVector::getSplat() to take in 'ElementCount', instead of 'unsigned' number of element count. This change is needed for D73753. Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74386	2020-03-12 13:22:41 -07:00
Juneyoung Lee	8eb2f865c3	[CodeGenPrepare] Fold br(freeze(icmp x, const)) to br(icmp(freeze x, const)) Summary: This patch helps CodeGenPrepare move freeze into the icmp when it is used by branch. It reenables generation of efficient conditional jumps. This is only done when at least one of icmp's operands is constant to prevent the transformation from increasing # of freeze instructions. Performance degradation of MultiSource/Benchmarks/Ptrdist/yacr2/yacr2.test is resolved with this patch. Checked with Alive2 Reviewers: reames, fhahn, nlopes Reviewed By: reames Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75859	2020-03-12 03:16:15 +09:00
Guozhi Wei	ee9a3eba76	[CodeGenPrepare] Handle ExtractValueInst in dupRetToEnableTailCallOpts As the test case shows if there is an ExtractValueInst in the Ret block, function dupRetToEnableTailCallOpts can't duplicate it into the block containing call. So later no tail call is generated in CodeGen. This patch adds the ExtractValueInst handling code in function dupRetToEnableTailCallOpts and FoldReturnIntoUncondBranch, and later tail call can be generated for this case. Differential Revision: https://reviews.llvm.org/D74242	2020-03-04 11:10:32 -08:00
Florian Hahn	7769030b93	Recommit "[PatternMatch] Match XOR variant of unsigned-add overflow check." This version fixes a buildbot failure cause by picking the wrong insert point for XORs. We cannot pick the XOR binary operator as insert point, as it is not guaranteed that both input operands for the overflow intrinsic are defined before it. This reverts the revert commit `c7fc0e5da6`.	2020-02-23 18:33:18 +00:00
Nikita Popov	a8db806d52	[SimplifyLibCalls][IRBuilder] Accept any IRBuilder in SimplifyLibCalls This changes the SimplifyLibCalls utility to accept an IRBuilderBase, which allows us to pass through the IRBuilder used by InstCombine. This will ensure that new instructions get added to the worklist. The annotated test-case drops from 4 to 2 InstCombine iterations thanks to this. To achieve this, I'm adding an IRBuilderBase::OperandBundlesGuard, which is basically the same as the existing InsertPointGuard and FastMathFlagsGuard, but for operand bundles. Also add a setDefaultOperandBundles() method so these can be set outside the constructor. Differential Revision: https://reviews.llvm.org/D74792	2020-02-21 18:26:05 +01:00
Florian Hahn	c7fc0e5da6	Revert "[PatternMatch] Match XOR variant of unsigned-add overflow check." This reverts commit `e01a3d49c2`. and commit `a6a585b803`. This causes a failure on GreenDragon: http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/9597	2020-02-19 19:37:08 +01:00
Florian Hahn	e01a3d49c2	[PatternMatch] Match XOR variant of unsigned-add overflow check. Instcombine folds (a + b <u a) to (a ^ -1 <u b) and that does not match the expected pattern in CodeGenPerpare via UAddWithOverflow. This causes a regression over Clang 7 on both X86 and AArch64: https://gcc.godbolt.org/z/juhXYV This patch extends UAddWithOverflow to also catch the XOR case, if the XOR is only used in the ICMP. This covers just a single case, but I'd like to make sure I am not missing anything before tackling the other cases. Reviewers: nikic, RKSimon, lebedev.ri, spatel Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D74228	2020-02-19 15:25:18 +01:00
Florian Hahn	216afd3301	[TargetLower] Update shouldFormOverflowOp check if math is used. On some targets, like SPARC, forming overflow ops is only profitable if the math result is used: https://godbolt.org/z/DxSmdB This patch adds a new MathUsed parameter to allow the targets to make the decision and defaults to only allowing it if the math result is used. That is the conservative choice. This patch also updates AArch64ISelLowering, X86ISelLowering, ARMISelLowering.h, SystemZISelLowering.h to allow forming overflow ops if the math result is not used. On those targets using the overflow intrinsic for the overflow check only generates better code. Reviewers: nikic, RKSimon, lebedev.ri, spatel Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D74722	2020-02-19 11:28:33 +01:00
Jim Lin	466f8843f5	[NFC] Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h,td}	2020-02-18 10:49:13 +08:00
Clement Courbet	15488ff24b	[CodeGen] Fix the computation of the alignment of split stores. Summary: Right now the alignment of the lower half of a store is computed as align/2, which fails for unaligned stores (align = 1), and is overly pessimitic for, e.g. a 8 byte store aligned to 4 bytes. Fixes PR44851 Fixes PR44877 Reviewers: gchatelet, spatel, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74311	2020-02-12 10:37:30 +01:00
Matt Arsenault	23b76096b7	CodeGenPrepare: Reorder check for cold and shouldOptimizeForSize shouldOptimizeForSize is showing up in a profile, spending around 10% of the pass time in one function. This should probably not be so slow, but the much cheaper attribute check should be done first anyway.	2020-02-04 11:23:13 -08:00
Fangrui Song	44cdae68c3	[CodeGenPrepare] Delete dead !DL check Follow-up for D73754 DL is assigned in CodeGenPrepare::runOnFunction and is guaranteed to be non-null.	2020-02-02 09:49:06 -08:00
Fangrui Song	5a56a25b0b	[CodeGenPrepare] Make TargetPassConfig required The code paths in the absence of TargetMachine, TargetLowering or TargetRegisterInfo are poorly tested. As rL285987 said, requiring TargetPassConfig allows us to delete many (untested) checks littered everywhere. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D73754	2020-02-02 09:28:45 -08:00
Guillaume Chatelet	59f95222d4	[Alignment][NFC] Use Align with CreateAlignedStore Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73274	2020-01-23 17:34:32 +01:00
Hiroshi Yamauchi	ddbc728828	[PGO][PGSO] Update BFI in CodeGenPrepare::optimizeSelectInst. Summary: Without the BFI update, some hot blocks are incorrectly treated as cold code. This fixes a FDO perf regression in the TSVC benchmark from D71288. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73146	2020-01-22 08:36:54 -08:00
Sander de Smalen	4cf16efe49	[AArch64][SVE] Add patterns for unpredicated load/store to frame-indices. This patch also fixes up a number of cases in DAGCombine and SelectionDAGBuilder where the size of a scalable vector is used in a fixed-width context (thus triggering an assertion failure). Reviewers: efriedma, c-rhodes, rovka, cameron.mcinally Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D71215	2020-01-22 14:32:27 +00:00
Sander de Smalen	67d4c9924c	Add support for (expressing) vscale. In LLVM IR, vscale can be represented with an intrinsic. For some targets, this is equivalent to the constexpr: getelementptr <vscale x 1 x i8>, <vscale x 1 x i8>* null, i32 1 This can be used to propagate the value in CodeGenPrepare. In ISel we add a node that can be legalized to one or more instructions to materialize the runtime vector length. This patch also adds SVE CodeGen support for VSCALE, which maps this node to RDVL instructions (for scaled multiples of 16bytes) or CNT[HSD] instructions (scaled multiples of 2, 4, or 8 bytes, respectively). Reviewers: rengolin, cameron.mcinally, hfinkel, sebpop, SjoerdMeijer, efriedma, lattner Reviewed by: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D68203	2020-01-22 10:09:27 +00:00
Valentin Churavy	5c29e8c65f	[CodegenPrepare] Guard against degenerate branches Summary: Guard against a potential crash observed in https://github.com/JuliaLang/julia/issues/32994#issuecomment-524249628 If two branches are collapsed we can encounter a degenerate conditional branch `TBB==FBB`. The subsequent code assumes that they differ, so we exit out early. Reviewers: ributzka, spatel Subscribers: loladiro, dexonsmith, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66657	2019-12-16 04:23:32 -05:00
Reid Kleckner	5d986953c8	[IR] Split out target specific intrinsic enums into separate headers This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320	2019-12-11 18:02:14 -08:00
Hiroshi Yamauchi	d9ae493937	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149	2019-12-09 12:42:59 -08:00
Jeremy Morse	00e238896c	[DebugInfo] Nerf placeDbgValues, with prejudice CodeGenPrepare::placeDebugValues moves variable location intrinsics to be immediately after the Value they refer to. This makes tracking of locations very easy; but it changes the order in which assignments appear to the debugger, from the source programs order to the order in which the optimised program computes values. This then leads to PR43986 and PR38754, where variable locations that were in a conditional block are made unconditional, which is highly misleading. This patch adjusts placeDbgValues to only re-order variable location intrinsics if they use a Value before it is defined, significantly reducing the damage that it does. This is still not 100% safe, but the rest of CodeGenPrepare needs polishing to correctly update debug info when optimisations are performed to fully fix this. This will probably break downstream debuginfo tests -- if the instruction-stream position of variable location changes isn't the focus of the test, an easy fix should be to manually apply placeDbgValues' behaviour to the failing tests, moving dbg.value intrinsics next to SSA variable definitions thus: %foo = inst1 %bar = ... %baz = ... void call @llvm.dbg.value(metadata i32 %foo, ... to %foo = inst1 void call @llvm.dbg.value(metadata i32 %foo, ... %bar = ... %baz = ... This should return your test to exercising whatever it was testing before. Differential Revision: https://reviews.llvm.org/D58453	2019-12-09 12:52:10 +00:00
Hiroshi Yamauchi	2eb30fafa5	Revert "[PGO][PGSO] Instrument the code gen / target passes." This reverts commit `9a0b5e1407`. This seems to break buildbots.	2019-12-06 12:17:32 -08:00
Hiroshi Yamauchi	9a0b5e1407	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71072	2019-12-06 10:43:39 -08:00
Jeremy Morse	c93a9b15ce	[DebugInfo][CGP] Update dbg.values when sinking address computations One of CodeGenPrepare's optimizations is to duplicate address calculations into basic blocks, so that as much information as possible can be folded into memory addressing operands. This is great -- but the dbg.value variable location intrinsics are not updated in the same way. This can lead to dbg.values referring to address computations in other blocks that will never be encoded into the DAG, while duplicate address computations are performed locally that could be used by the dbg.value. Some of these (such as non-constant-offset GEPs) can't be salvaged past. Fix this by, whenever we duplicate an address computation into a block, looking for dbg.value users of the original memory address in the same block, and redirecting those to the local computation. Differential Revision: https://reviews.llvm.org/D58403	2019-12-06 11:27:19 +00:00
Hans Wennborg	cee62e6fcf	Fix a typo.	2019-11-30 13:23:49 +01:00
Reid Kleckner	05da2fe521	Sink all InitializePasses.h includes This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211	2019-11-13 16:34:37 -08:00
Yi-Hong Lyu	6bbfafd037	[CGP] Make ICMP_EQ use CR result of ICMP_S(L\|G)T dominators For example: long long test(long long a, long long b) { if (a << b > 0) return b; if (a << b < 0) return a; return a*b; } Produces: sld. 5, 3, 4 ble 0, .LBB0_2 mr 3, 4 blr .LBB0_2: # %if.end cmpldi 5, 0 li 5, 1 isel 4, 4, 5, 2 mulld 3, 4, 3 blr But the compare (cmpldi 5, 0) is redundant and can be removed (CR0 already contains the result of that comparison). The root cause of this is that LLVM converts signed comparisons into equality comparison based on dominance. Equality comparisons are unsigned by default, so we get either a record-form or cmp (without the l for logical) feeding a cmpl. That is the situation we want to avoid here. Differential Revision: https://reviews.llvm.org/D60506	2019-11-11 17:28:50 +00:00
Guillaume Chatelet	0e62011df8	[Alignment][NFC] Remove dependency on GlobalObject::setAlignment(unsigned) Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, mehdi_amini, jvesely, nhaehnle, hiraditya, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68944 llvm-svn: 374880	2019-10-15 11:24:36 +00:00
Joerg Sonnenberger	9681ea9560	Reapply r374743 with a fix for the ocaml binding Add a pass to lower is.constant and objectsize intrinsics This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374784	2019-10-14 16:15:14 +00:00
Dmitri Gribenko	1a21f98ac3	Revert "Add a pass to lower is.constant and objectsize intrinsics" This reverts commit r374743. It broke the build with Ocaml enabled: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218 llvm-svn: 374768	2019-10-14 12:22:48 +00:00
Joerg Sonnenberger	e4300c392d	Add a pass to lower is.constant and objectsize intrinsics This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374743	2019-10-13 23:00:15 +00:00
Simon Pilgrim	e746380f6a	CodeGenPrepare - silence static analyzer dyn_cast<> null dereference warnings. NFCI. The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 374085	2019-10-08 17:00:01 +00:00
Guillaume Chatelet	ab11b9188d	[Alignment][NFC] Remove AllocaInst::setAlignment(unsigned) Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68141 llvm-svn: 373207	2019-09-30 13:34:44 +00:00
Jesper Antonsson	39b81f1cbc	[CodeGenPrepare] Mend "avoid crashing from replacing a phi twice" fix. Summary: An erroneously negated if-statement by an earlier (March 2019) bugfix left phi replacement/simplification under optimizeMemoryInst() in CodeGenPrepare largely inactivated. The error was found when csmith found that the same assert as in the original bug report could still be triggered in a different way. This patch fixes the bugfix. The original bug was: https://bugs.llvm.org/show_bug.cgi?id=41052 ... and the previous fix was D59358. Reviewers: aprantl, skatkov Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67838 llvm-svn: 373084	2019-09-27 13:01:37 +00:00
Sanjay Patel	2cec4b58f5	Revert [IR] allow fast-math-flags on phi of FP values This reverts r372866 (git commit `dec03223a9`) llvm-svn: 372868	2019-09-25 13:29:09 +00:00
Sanjay Patel	dec03223a9	[IR] allow fast-math-flags on phi of FP values The changes here are based on the corresponding diffs for allowing FMF on 'select': D61917 As discussed there, we want to have fast-math-flags be a property of an FP value because the alternative (having them on things like fcmp) leads to logical inconsistency such as: https://bugs.llvm.org/show_bug.cgi?id=38086 The earlier patch for select made almost no practical difference because most unoptimized conditional code begins life as a phi (based on what I see in clang). Similarly, I don't expect this patch to do much on its own either because SimplifyCFG promptly drops the flags when converting to select on a minimal example like: https://bugs.llvm.org/show_bug.cgi?id=39535 But once we have this plumbing in place, we should be able to wire up the FMF propagation and start solving cases like that. The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a regression in a LoopVectorize test. We are intersecting the FMF of any FPMathOperator there, so if a phi is not properly annotated, new math instructions may not be either. Once we fix the propagation in SimplifyCFG, it may be safe to remove that hack. Differential Revision: https://reviews.llvm.org/D67564 llvm-svn: 372866	2019-09-25 13:14:12 +00:00
David Green	a6e944b173	[CGP] Ensure sinking multiple instructions does not invalidate dominance checks In MVE, as of rL371218, we are attempting to sink chains of instructions such as: %l1 = insertelement <8 x i8> undef, i8 %l0, i32 0 %broadcast.splat26 = shufflevector <8 x i8> %l1, <8 x i8> undef, <8 x i32> zeroinitializer In certain situations though, we can end up breaking the dominance relations of instructions. This happens when we sink the instruction into a loop, but cannot remove the originals. The Use is updated, which might in fact be a Use from the second instruction to the first. This attempts to fix that by reversing the order of instruction that are sunk, and ensuring that we update the uses on new instructions if they have already been sunk, not the old ones. Differential Revision: https://reviews.llvm.org/D67366 llvm-svn: 371743	2019-09-12 16:00:07 +00:00
Tim Northover	98534843fb	CodeGenPrep: add separate hook say when GEPs should be used for sinking. NFCI. Up to now, we've decided whether to sink address calculations using GEPs or normal arithmetic based on the useAA hook, but there are other reasons GEPs might be preferred. So this patch splits the two questions, with a default implementation falling back to useAA. llvm-svn: 371721	2019-09-12 10:21:00 +00:00
Teresa Johnson	9c27b59cec	Change TargetLibraryInfo analysis passes to always require Function Summary: This is the first change to enable the TLI to be built per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. This change should not affect behavior, as the provided function is not yet used to build a specifically per-function TLI, but rather enables that migration. Most of the changes were very mechanical, e.g. passing a Function to the legacy analysis pass's getTLI interface, or in Module level cases, adding a callback. This is similar to the way the per-function TTI analysis works. There was one place where we were looking for builtins but not in the context of a specific function. See FindCXAAtExit in lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround could provide the wrong behavior in some corner cases. Suggestions welcome. Reviewers: chandlerc, hfinkel Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66428 llvm-svn: 371284	2019-09-07 03:09:36 +00:00
Craig Topper	93c2787193	[CGP] Remove ModifiedDT from the makeBitReverse loop I don't think anything in this loop modifies the control flow and we don't restart any iteration after setting the flag. This code was added in http://reviews.llvm.org/D16893 but looking at the test case added there the code that caused the dominator tree to change was merging blocks with their predecessor not the bitreverse optimization. Differential Revision: https://reviews.llvm.org/D66366 llvm-svn: 369283	2019-08-19 18:02:24 +00:00
Sanjay Patel	acceedb15f	[CodeGenPrepare] Fix use-after-free If OptimizeExtractBits() encountered a shift instruction with no operands at all, it would erase the instruction, but still return false. This previously didn’t matter because its caller would always return after processing the instruction, but https://reviews.llvm.org/D63233 changed the function’s caller to fall through if it returned false, which would then cause a use-after-free detectable by ASAN. This change makes OptimizeExtractBits return true if it removes a shift instruction with no users, terminating processing of the instruction. Patch by: @brentdax (Brent Royal-Gordon) Differential Revision: https://reviews.llvm.org/D66330 llvm-svn: 369168	2019-08-16 23:10:34 +00:00
Jonas Devlieghere	0eaee545ee	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013	2019-08-15 15:54:37 +00:00
Kang Zhang	038dd43782	[NFC][CodeGen] Modify the type element of TailCalls to simplify the dupRetToEnableTailCallOpts() Summary: The old code can be simplified to define the element type of TailCalls as `BasicBlock` not `CallInst`. Also I use the for-range loop instead the for loop. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D64905 llvm-svn: 367644	2019-08-02 03:09:07 +00:00
Luis Marques	2e46312ffd	[DAGCombiner] [CodeGenPrepare] More comprehensive GEP splitting Some GEPs were not being split, presumably because that split would just be undone by the DAGCombiner. Not performing those splits can prevent important optimizations, such as preventing the element indices / member offsets from being (partially) folded into load/store instruction immediates. This patch: - Makes the splits also occur in the cases where the base address and the GEP are in the same BB. - Ensures that the DAGCombiner doesn't reassociate them back again. Differential Revision: https://reviews.llvm.org/D60294 llvm-svn: 363544	2019-06-17 10:54:12 +00:00
Sanjay Patel	c8d88ad1a9	[CodeGenPrepare][x86] shift both sides of a vector select when profitable This is based on the example/discussion in PR37428: https://bugs.llvm.org/show_bug.cgi?id=37428 Proper vector shift instructions don't appear until AVX2, so we may generate several extra instructions within a loop trying to compensate for that. It's difficult to recover from that shift expansion later than this, so use the existing TLI hook and splat analysis to enable better codegen. This extends CGP functionality introduced with: rL201655 Differential Revision: https://reviews.llvm.org/D63233 llvm-svn: 363511	2019-06-16 15:29:03 +00:00
Sanjay Patel	7ea378b940	[CodeGenPrepare] propagate debuginfo when copying a shuffle llvm-svn: 363409	2019-06-14 15:05:35 +00:00
Matt Arsenault	8dbeb9256c	TTI: Improve default costs for addrspacecast For some reason multiple places need to do this, and the variant the loop unroller and inliner use was not handling it. Also, introduce a new wrapper to be slightly more precise, since on AMDGPU some addrspacecasts are free, but not no-ops. llvm-svn: 362436	2019-06-03 18:41:34 +00:00
Bjorn Pettersson	b4771425f5	Use the DataLayout::typeSizeEqualsStoreSize helper. NFC Just a minor refactoring to use the new helper method DataLayout::typeSizeEqualsStoreSize(). This is done when checking if getTypeSizeInBits is equal/non-equal to getTypeStoreSizeInBits. llvm-svn: 361613	2019-05-24 09:20:20 +00:00
Simon Pilgrim	38ef296265	[CodeGenPrepare] Ensure we get a non-null result from getTrueOrFalseValue. NFCI. llvm-svn: 360328	2019-05-09 10:51:26 +00:00
QingShan Zhang	0e71a6e755	[CodeGenPrepare] Don't split the store if it is volatile We shouldn't split the store when it is volatile. Differential Revision: https://reviews.llvm.org/D61169 llvm-svn: 360228	2019-05-08 07:32:12 +00:00
Roman Lebedev	1a1b922177	[NFC] BasicBlock: refactor changePhiUses() out of replacePhiUsesWith(), use it Summary: It is a common thing to loop over every `PHINode` in some `BasicBlock` and change old `BasicBlock` incoming block to a new `BasicBlock` incoming block. `replaceSuccessorsPhiUsesWith()` already had code to do that, it just wasn't a function. So outline it into a new function, and use it. Reviewers: chandlerc, craig.topper, spatel, danielcdh Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61013 llvm-svn: 359996	2019-05-05 18:59:39 +00:00
Roman Lebedev	e3b1d82b53	[NFC] PHINode: introduce replaceIncomingBlockWith() function, use it Summary: There is `PHINode::getBasicBlockIndex()`, `PHINode::setIncomingBlock()` and `PHINode::getNumOperands()`, but no function to replace every specified `BasicBlock` predecessor with some other specified `BasicBlock`. Clearly, there are a lot of places that could use that functionality. Reviewers: chandlerc, craig.topper, spatel, danielcdh Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61011 llvm-svn: 359995	2019-05-05 18:59:30 +00:00
Sanjay Patel	5ab41a7a05	[CodeGenPrepare] limit overflow intrinsic matching to a single basic block (2nd try) This is a subset of the original commit from rL359879 which was reverted because it could crash when using the 'RemovedInstructions' structure that enables delayed deletion of dead instructions. The motivating compile-time win does not require that change though. We should get most of that win from this change alone. Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359969	2019-05-04 12:46:32 +00:00
Evgeniy Stepanov	46ec57e576	Revert "[CodeGenPrepare] limit overflow intrinsic matching to a single basic block" This reverts commit r359879, which introduced a compiler crash. llvm-svn: 359908	2019-05-03 17:31:49 +00:00
Sanjay Patel	8ff072e48e	[CodeGenPrepare] limit overflow intrinsic matching to a single basic block Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. Also, we were restarting the iterator loops when doing the overflow intrinsic transforms by marking the dominator tree for update. That was done to prevent iterating over a removed instruction. But we can postpone the deletion using the existing "RemovedInsts" structure, and that means we don't need to update the DT. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359879	2019-05-03 13:09:18 +00:00
Francis Visoiu Mistrih	1646851b87	[CGP] Look through bitcasts when duplicating returns for tail calls The simple case of: ``` int callee(); void caller(void *a) { if (a == NULL) return callee(); return a; } ``` would generate a regular call instead of a tail call because we don't look through the bitcast of the call to `callee` when duplicating the return blocks. Differential Revision: https://reviews.llvm.org/D60837 llvm-svn: 359041	2019-04-23 21:57:46 +00:00
Eric Christopher	b6c190da23	Include what's used in a few cpp files - these were getting transitive includes from MCDwarf.h. llvm-svn: 358254	2019-04-12 06:16:33 +00:00
Evandro Menezes	85bd3978ae	[IR] Refactor attribute methods in Function class (NFC) Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731	2019-04-04 22:40:06 +00:00
Teresa Johnson	b7e213808c	[CGP] Reset DT when optimizing select instructions Summary: A recent fix (r355751) caused a compile time regression because setting the ModifiedDT flag in optimizeSelectInst means that each time a select instruction is optimized the function walk in runOnFunction stops and restarts again (which was needed to build a new DT before we started building it lazily in r356937). Now that the DT is built lazily, a simple fix is to just reset the DT at this point, rather than restarting the whole function walk. In the future other places that set ModifiedDT may want to switch to just resetting the DT directly. But that will require an evaluation to ensure that they don't otherwise need to restart the function walk. Reviewers: spatel Subscribers: jdoerfert, llvm-commits, xur Tags: #llvm Differential Revision: https://reviews.llvm.org/D59889 llvm-svn: 357111	2019-03-27 18:44:25 +00:00
Teresa Johnson	3bd4b5a925	[CGP] Build the DominatorTree lazily Summary: In r355512 CGP was changed to build the DominatorTree only once per function traversal, to avoid repeatedly building it each time it was accessed. This solved one compile time issue but introduced another. In the second case, we now were building the DT unnecessarily many times when we performed many function traversals (i.e. more than once per function when running CGP because of changes made each time). Change to saving the DT in the CodeGenPrepare object, and building it lazily when needed. It is reset whenever we need to rebuild it. The case that exposed the issue there are 617 functions, and we walk them (i.e. execute the "while (MadeChange)" loop in runOnFunction) a total of 12083 times (so previously we were building the DT 12083 times). With this patch we only build the DT 844 times (average of 1.37 times per function). We dropped the total time to compile this file from 538.11s without this patch to 339.63s with it. There is still an issue as CGP is taking much longer than all other passes even with this patch, and before a recent compiler release cut at r355392 the total time to this compile was only 97 sec with a huge reduction in CGP time. I suspect that one of the other recent changes to CGP led to iterating each function many more times on average, but I need to do some more investigation. Reviewers: spatel Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59696 llvm-svn: 356937	2019-03-25 18:38:48 +00:00
Teresa Johnson	4dc851964c	[CGP] Make several static functions member functions (NFC) This is extracted from D59696 as suggested in the review. It is preparation for making the DominatorTree a member variable. llvm-svn: 356857	2019-03-24 15:18:50 +00:00
Sanjay Patel	d47eac59ef	[CodeGenPrepare] limit formation of overflow intrinsics (PR41129) This is probably a bigger limitation than necessary, but since we don't have any evidence yet that this transform led to real-world perf improvements rather than regressions, I'm making a quick, blunt fix. In the motivating x86 example from: https://bugs.llvm.org/show_bug.cgi?id=41129 ...and shown in the regression test, we want to avoid an extra instruction in the dominating block because that could be costly. The x86 LSR test diff is reversing the changes from D57789. There's no evidence that 1 version is any better than the other yet. Differential Revision: https://reviews.llvm.org/D59602 llvm-svn: 356665	2019-03-21 13:57:07 +00:00
Sanjay Patel	a2250e923b	[CGP] fix formatting; NFC llvm-svn: 356572	2019-03-20 16:47:53 +00:00
Sanjay Patel	d1ce455f7b	[CGP] convert chain of 'if' to 'switch'; NFC This should be extended, but CGP does some strange things, so I'm intentionally not changing the potential order of any transforms yet. llvm-svn: 356566	2019-03-20 15:53:06 +00:00
Mikael Holmen	339daae806	[CodeGenPrepare] avoid crashing from replacing a phi twice Summary: This is a fix to bug 41052: https://bugs.llvm.org/show_bug.cgi?id=41052 While trying to optimize a memory instruction in a dead basic block, we end up registering the same phi for replacement twice. This patch avoids registering more than the first replacement candidate for a phi. Patch by: JesperAntonsson Reviewers: skatkov, aprantl Reviewed By: aprantl Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59358 llvm-svn: 356260	2019-03-15 13:51:05 +00:00
Sanjay Patel	2c9275a790	[CGP] add another bailout for degenerate code (PR41064) This is almost the same as: rL355345 ...and should prevent any potential crashing from examples like: https://bugs.llvm.org/show_bug.cgi?id=41064 ...although the bug was masked by: rL355823 ...and I'm not sure how to repro the problem after that change. llvm-svn: 356218	2019-03-14 23:14:31 +00:00
Tim Northover	8935aca9c7	CodeGenPrep: preserve inbounds attribute when sinking GEPs. Targets can potentially emit more efficient code if they know address computations never overflow. For example ILP32 code on AArch64 (which only has 64-bit address computation) can ignore the possibility of overflow with this extra information. llvm-svn: 355926	2019-03-12 15:22:23 +00:00
Eugene Leviant	1e249caaec	[CGP] Fix UB when GEP is bound to trivial PHINode Differential revision: https://reviews.llvm.org/D59140 llvm-svn: 355904	2019-03-12 10:10:29 +00:00
Sam Parker	52760bf435	[CGP] Limit distance between overflow math and cmp Inserting an overflowing arithmetic intrinsic can increase register pressure by producing two values at a point where only one is needed, while the second use maybe several blocks away. This increase in pressure is likely to be more detrimental on performance than rematerialising one of the original instructions. So, check that the arithmetic and compare instructions are no further apart than their immediate successor/predecessor. Differential Revision: https://reviews.llvm.org/D59024 llvm-svn: 355823	2019-03-11 13:19:46 +00:00
Sanjay Patel	7d8260feb6	[CGP] fix comments; NFC llvm-svn: 355791	2019-03-10 18:42:30 +00:00
Rong Xu	ce3be45cac	[CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst r44412 fixed a huge compile time regression but it needed ModifiedDT flag to be maintained correctly in optimizations in optimizeBlock() and optimizeInst(). Function optimizeSelectInst() does not update the flag. This patch propagates the flag in optimizeSelectInst() back to optimizeBlock(). This patch also removes ModifiedDT in CodeGenPrepare class (which is not used). The property of ModifiedDT is now recorded in a ref parameter. Differential Revision: https://reviews.llvm.org/D59139 llvm-svn: 355751	2019-03-08 22:46:18 +00:00
Teresa Johnson	b1daf0aef6	[CGP] Avoid repeatedly building DominatorTree causing long compile-time (NFC) Summary: In r354298 a DominatorTree construction was added via new function combineToUSubWithOverflow, which was subsequently restructured into replaceMathCmpWithIntrinsic in r354689. We are hitting a very long compile time due to this repeated construction, once per math cmp in the function. We shouldn't need to build the DominatorTree more than once per function, except when a transformation invalidates it. There is already a boolean flag that is returned from these methods indicating whether the DT has been modified. We can simply build the DT once per Function walk in CodeGenPrepare::runOnFunction, since any time a change is made we break out of the Function walk and restart it. I modified the code so that both replaceMathCmpWithIntrinsic as well as mergeSExts (which was also building a DT) use the DT constructed by the run method. From -mllvm -time-passes: Before this patch: CodeGen Prepare user time is 328s With this patch: CodeGen Prepare user time is 21s Reviewers: spatel Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58995 llvm-svn: 355512	2019-03-06 14:57:40 +00:00
Sanjay Patel	3b2d0bc7c2	[CodeGenPrepare] avoid crashing on non-canonical/degenerate code The test is reduced from an example in the post-commit thread for: rL354746 http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190304/632396.html While we must avoid dying here, the real question should be: Why is non-canonical and/or degenerate code making it to CGP when using the new pass manager? llvm-svn: 355345	2019-03-04 22:47:13 +00:00
Sanjay Patel	cb04ba032f	[CGP] add special-cases to form unsigned add with overflow (PR40486) There's likely a missed IR canonicalization for at least 1 of these patterns. Otherwise, we wouldn't have needed the pattern-matching enhancement in D57516. Note that -- unlike usubo added with D57789 -- the TLI hook for this transform defaults to 'on'. So if there's any perf fallout from this, targets should look at how they're lowering the uaddo node in SDAG and/or override that hook. The x86 diffs suggest that there's some missing pattern-matching for forming inc/dec. This should fix the remaining known problems in: https://bugs.llvm.org/show_bug.cgi?id=40486 https://bugs.llvm.org/show_bug.cgi?id=31754 llvm-svn: 354746	2019-02-24 15:31:27 +00:00
Sanjay Patel	ffe1cf5e92	[CGP] move overflow intrinsic insertion to common location; NFCI We need to enhance the uaddo matching to handle special-cases as seen in PR40486 and PR31754. That means we won't necessarily have a def-use pattern, so we'll need to check dominance to determine where to place the intrinsic (as we already do for usubo). This preliminary patch is just rearranging the code, so the planned follow-up to improve uaddo will be more clear. llvm-svn: 354689	2019-02-22 20:20:24 +00:00
Sanjay Patel	198cc305e9	[CGP] match a special-case of unsigned subtract overflow This is the 'sub0' (negate) pattern from PR31754: https://bugs.llvm.org/show_bug.cgi?id=31754 llvm-svn: 354519	2019-02-20 21:23:04 +00:00
Sanjay Patel	d8b4efcb6b	[CGP] form usub with overflow from sub+icmp The motivating x86 cases for forming the intrinsic are shown in PR31754 and PR40487: https://bugs.llvm.org/show_bug.cgi?id=31754 https://bugs.llvm.org/show_bug.cgi?id=40487 ..and those are shown in the IR test file and x86 codegen file. Matching the usubo pattern is harder than uaddo because we have 2 independent values rather than a def-use. This adds a TLI hook that should preserve the existing behavior for uaddo formation, but disables usubo formation by default. Only x86 overrides that setting for now although other targets will likely benefit by forming usbuo too. Differential Revision: https://reviews.llvm.org/D57789 llvm-svn: 354298	2019-02-18 23:33:05 +00:00
Craig Topper	784929d045	Implementation of asm-goto support in LLVM This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today. This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction. There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model. Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii Differential Revision: https://reviews.llvm.org/D53765 llvm-svn: 353563	2019-02-08 20:48:56 +00:00
Florian Hahn	3b251963c3	[CGP] Add support for sinking operands to their users, if they are free. This patch improves code generation for some AArch64 ACLE intrinsics. It adds support to CGP to duplicate and sink operands to their user, if they can be folded into a target instruction, like zexts and sub into usubl. It adds a TargetLowering hook shouldSinkOperands, which looks at the operands of instructions to see if sinking is profitable. I decided to add a new target hook, as for the sinking to be profitable, at least on AArch64, we have to look at multiple operands of an instruction, instead of looking at the users of a zext for example. The sinking is done in CGP, because it works around an instruction selection limitation. If instruction selection is not limited to a single basic block, this patch should not be needed any longer. Alternatively this could be done in the LoopSink pass, which tries to undo LICM for instructions in blocks that are not executed frequently. Note that we do not force the operands to sink to have a single user, because we duplicate them before sinking. Therefore this is only desirable if they really can be done for free. Additionally we could consider the impact on live ranges later on. This should fix https://bugs.llvm.org/show_bug.cgi?id=40025. As for performance, we have internal code that uses intrinsics and can be speed up by 10% by this change. Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma, RKSimon, spatel Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D57377 llvm-svn: 353152	2019-02-05 10:27:40 +00:00
Sanjay Patel	c00bdab4c8	[CGP] use IRBuilder to simplify code This is no-functional-change-intended although there could be intermediate variations caused by a difference in the debug info produced by setting that from the builder's insertion point. I'm updating the IR test file associated with this code just to show that the naming differences from using the builder are visible. The motivation for adding a helper function is that we are likely to extend this code to deal with other overflow ops. llvm-svn: 353056	2019-02-04 16:30:46 +00:00
Sanjay Patel	84ceae6048	[CGP] adjust target constraints for forming uaddo There are 2 changes visible here: 1. There's no reason to limit this transform based on number of condition registers. That diff allows PPC to produce slightly better (dot-instructions should be generally good) code. Note: someone that cares about PPC codegen might want to look closer at that output because it seems like we could still improve this. 2. We (probably?) should not bother trying to form uaddo (or other overflow ops) when there's no target support for such an op. This goes beyond checking whether the op is expanded because both PPC and AArch64 show better codegen for standard types regardless of whether the op is legal/custom. llvm-svn: 353001	2019-02-03 17:53:09 +00:00
Sanjay Patel	00fcc74e50	[CGP] refactor optimizeCmpExpression (NFCI) This is not truly NFC because we are bailing out without a TLI now. That should not be a real concern though because there should be a TLI in any real-world scenario. That seems better than passing around a pointer and then checking it for null-ness all over the place. The motivation is to fix what appears to be an unintended restriction on the uaddo transform - hasMultipleConditionRegisters() shouldn't be reason to limit the transform. llvm-svn: 352988	2019-02-03 13:48:03 +00:00
James Y Knight	7976eb5838	[opaque pointer types] Pass function types to CallInst creation. This cleans up all CallInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57170 llvm-svn: 352909	2019-02-01 20:43:25 +00:00
Philip Reames	ede49ddff5	Lower widenable_conditions in CGP This ensures that if we make it to the backend w/o lowering widenable_conditions first, that we generate correct code. Doing it in CGP - instead of isel - let's us fold control flow before hitting block local instruction selection. Differential Revision: https://reviews.llvm.org/D57473 llvm-svn: 352779	2019-01-31 18:45:46 +00:00
David L. Jones	d81f23071c	Revert "Reapply "[CGP] Check for existing inttotpr before creating new one"" This change reverts r351626. The changes in r351626 cause quadratic work in several cases. (See r351626 thread on llvm-commits for details.) llvm-svn: 352722	2019-01-31 03:28:46 +00:00
Erik Pilkington	600e9deacf	Add a 'dynamic' parameter to the objectsize intrinsic This is meant to be used with clang's __builtin_dynamic_object_size. When 'true' is passed to this parameter, the intrinsic has the potential to be folded into instructions that will be evaluated at run time. When 'false', the objectsize intrinsic behaviour is unchanged. rdar://32212419 Differential revision: https://reviews.llvm.org/D56761 llvm-svn: 352664	2019-01-30 20:34:35 +00:00
Jonas Paulsson	5ed4d4638f	[CodeGenPrepare] Handle all debug calls in dupRetToEnableTailCallOpts() This patch makes sure that a debug value that is after the bitcast in dupRetToEnableTailCallOpts() is also skipped. The reduced test case is from SPEC-2006 on SystemZ. Review: Vedant Kumar, Wolfgang Pieb https://reviews.llvm.org/D57050 llvm-svn: 352462	2019-01-29 09:03:35 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Roman Tereshin	a0383d6c1f	Reapply "[CGP] Check for existing inttotpr before creating new one" Original commit: r351582 llvm-svn: 351626	2019-01-19 03:37:25 +00:00
Roman Tereshin	022bf3e8e7	Revert "Reapply "[CGP] Check for existing inttotpr before creating new one"" This reverts commit r351618. Compiler RT + ASAN tests are failing for PowerPC. Not sure how would I reproduce these on macOS, so reverting (again) until I do. llvm-svn: 351619	2019-01-19 01:53:26 +00:00
Roman Tereshin	dd6f9f68bb	Reapply "[CGP] Check for existing inttotpr before creating new one" Original commit: r351582 llvm-svn: 351618	2019-01-19 01:41:03 +00:00
Roman Tereshin	86ac532687	Revert "[CGP] Check for existing inttotpr before creating new one" This reverts commit r351582. Bots are failing. Reverting this to fix and re-commit later. llvm-svn: 351598	2019-01-18 21:38:44 +00:00
Roman Tereshin	85a0467a11	[CGP] Check for existing inttotpr before creating new one Make sure CodeGenPrepare doesn't emit multiple inttoptr instructions of the same integer value while sinking address computations, but rather CSEs them on the fly: excessive inttoptr's confuse SCEV into thinking that related pointers have nothing to do with each other. This problem blocks LoadStoreVectorizer from vectorizing some of the loads / stores in a downstream target. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D56838 llvm-svn: 351582	2019-01-18 20:13:42 +00:00
Eli Friedman	a69084ffa8	[CodeGenPrepare] Fix bad IR created by large offset GEP splitting. Creating the IR builder, then modifying the CFG, leads to an IRBuilder where the BB and insertion point are inconsistent, so new instructions have the wrong parent. Modified an existing test because the test wasn't covering anything useful (the "invoke" was not actually an invoke by the time we hit the code in question). Differential Revision: https://reviews.llvm.org/D55729 llvm-svn: 349693	2018-12-19 22:52:04 +00:00
Wolfgang Pieb	ac874c48ca	[Debuginfo] Prevent CodeGenPrepare from dropping debuginfo references. This fixes PR39845. CodeGenPrepare employs a transactional model when performing optimizations, i.e. it changes the IR to attempt an optimization and rolls back the change when it finds the change inadequate. It is during the rollback that references to locals were dropped from debug value intrinsics. This patch reinstates debuginfo references during rollbacks. Reviewers: aprantl, vsk Differential Revision: https://reviews.llvm.org/D55396 llvm-svn: 348896	2018-12-11 21:13:53 +00:00
Serguei Katkov	2673f1783e	[CGP] Improve compile time for complex addressing mode This is a fix for PR39625 with improvement the compile time by reducing the number of intermediate Phi nodes created. Reviewers: john.brawn, reames Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54932 llvm-svn: 347839	2018-11-29 06:45:18 +00:00
Serge Guelton	12c7a96064	Fix disturbing warning - NFCI llvm-svn: 347186	2018-11-19 10:05:28 +00:00
Vedant Kumar	e7b789b529	[ProfileSummary] Standardize methods and fix comment Every Analysis pass has a get method that returns a reference of the Result of the Analysis, for example, BlockFrequencyInfo &BlockFrequencyInfoWrapperPass::getBFI(). I believe that ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning a pointer. Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock, respectively. Most methods use BB as the argument of variable names while methods usually refer to Basic Blocks as Blocks, instead of BB. For example, Function::getEntryBlock, Loop:getExitBlock, etc. I also fixed one of the comments. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D54669 llvm-svn: 347182	2018-11-19 05:23:16 +00:00
Ali Tamur	d482b01a62	Use a data structure better suited for large sets in SimplificationTracker. Summary: D44571 changed SimplificationTracker to use SmallSetVector to keep phi nodes. As a result, when the number of phi nodes is large, the build time performance suffers badly. When building for power pc, we have a case where there are more than 600.000 nodes, and it takes too long to compile. In this change, I partially revert D44571 to use SmallPtrSet, which does an acceptable job with any number of elements. In the original patch, having a deterministic iteration order was mentioned as a motivation, however I think it only applies to the nodes already matched in MatchPhiSet method, which I did not touch. Reviewers: bjope, skatkov Reviewed By: bjope, skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54007 llvm-svn: 346710	2018-11-12 21:43:43 +00:00
James Y Knight	72f76bf230	Add support for llvm.is.constant intrinsic (PR4898) This adds the llvm-side support for post-inlining evaluation of the __builtin_constant_p GCC intrinsic. Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call to a function where canConstantFoldTo returns true, and one of the arguments is a struct. Updated from patch initially by Janusz Sobczak. Differential Revision: https://reviews.llvm.org/D4276 llvm-svn: 346322	2018-11-07 15:24:12 +00:00
Peter Collingbourne	abd820a92b	CGP: Clear data structures at the end of a loop iteration instead of the beginning. Clearing LargeOffsetGEPMap at the end fixes a bug where if a large offset GEP is in a dead basic block, we fail an assertion when trying to delete the block due to the asserting VH in LargeOffsetGEPMap. Differential Revision: https://reviews.llvm.org/D53464 llvm-svn: 345082	2018-10-23 21:23:18 +00:00
Krzysztof Pszeniczny	2bfe759a8d	Fix a use-after-RAUW bug in large GEP splitting Summary: Large GEP splitting, introduced in rL332015, uses a `DenseMap<AssertingVH<Value>, ...>`. This causes an assertion to fail (in debug builds) or undefined behaviour to occur (in release builds) when a value is RAUWed. This manifested itself in the 7zip benchmark from the llvm test suite built on ARM with `-fstrict-vtable-pointers` enabled while RAUWing invariant group launders and splits in CodeGenPrepare. This patch merges the large offsets of the argument and the result of an invariant.group strip/launder intrinsic before RAUWing. Reviewers: Prazek, javed.absar, haicheng, efriedma Reviewed By: Prazek, efriedma Subscribers: kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D51936 llvm-svn: 344802	2018-10-19 19:02:16 +00:00
Fangrui Song	0cac726a00	llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) Summary: The convenience wrapper in STLExtras is available since rL342102. Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52573 llvm-svn: 343163	2018-09-27 02:13:45 +00:00
David Green	353cb3d4e5	[CodeGen] Enable tail calls for functions with NonNull attributes. Adding NonNull as attributes to returned pointers has the unfortunate side effect of disabling tail calls. This patch ignores the NonNull attribute when we decide whether to tail merge, in the same way that we ignore the NoAlias attribute, as it has no affect on the call sequence. Differential Revision: https://reviews.llvm.org/D52238 llvm-svn: 343091	2018-09-26 10:46:18 +00:00
Vedant Kumar	1b02dad9f2	[CodeGenPrepare] Preserve debug locs in OptimizeExtractBits CodeGenPrepare has a transform that sinks {lshr, trunc} pairs to make it easier for the backend to emit fancy extract-bits instructions (e.g UBFX). Teach it to preserve debug locations and salvage debug values. llvm-svn: 342319	2018-09-15 04:08:52 +00:00
David Green	e27e87cdcb	[CGP] Ensure splitgep gives deterministic output The output of splitLargeGEPOffsets does not appear to be deterministic because of the way that we iterate over a DenseMap. I've changed it to a MapVector for consistent output. The test here isn't particularly great, only showing a consmetic difference in output. The original reproducer is much larger but show a diffierence in instruction ordering, leading to different codegen. Differential Revision: https://reviews.llvm.org/D51851 llvm-svn: 342043	2018-09-12 10:19:10 +00:00
David Blaikie	7d30653259	Revert "[CodeGenPrepare] Scan past debug intrinsics to find select candidates (NFC)" This causes crashes due to the interleaved dbg.value intrinsics being left at the end of basic blocks, causing the actual terminators (br, etc) to be not where they should be (not at the end of the block), leading to later crashes. Further discussion on the original commit thread. This reverts commit r340368. llvm-svn: 340794	2018-08-28 00:55:19 +00:00
Vedant Kumar	a85ca3de66	[CodeGenPrepare] Set debug locs when folding a comparison into a uadd.with.overflow CGP can replace a branch + select with a uadd.with.overflow. Teach it to set debug locations as it does this. llvm-svn: 340432	2018-08-22 18:15:03 +00:00
Vedant Kumar	4760686823	[CodeGenPrepare] Set debug loc when widening a switch condition Set a debug location on the cast instruction used to widen a switch condition. llvm-svn: 340379	2018-08-22 01:23:31 +00:00
Vedant Kumar	1e8a2c963c	[CodeGenPrepare] Set debug locations when splitting selects When splitting a select into a diamond, set debug locations on newly-created branch instructions and phi nodes. llvm-svn: 340371	2018-08-22 00:10:37 +00:00
Vedant Kumar	30406fd789	[CodeGenPrepare] Clean up dbg.value use-before-def as late as possible CodeGenPrepare has a strategy for moving dbg.values so that a value's definition always dominates its debug users. This cleanup was happening too early (before certain CGP transforms were run), resulting in some dbg.value use-before-def errors. Perform this cleanup as late as possible to avoid use-before-def. llvm-svn: 340370	2018-08-21 23:43:08 +00:00
Vedant Kumar	00e7558edd	[CodeGenPrepare] Scan past debug intrinsics to find select candidates (NFC) In optimizeSelectInst, when scanning for candidate selects to rewrite into branches, scan past debug intrinsics. This makes the debug-enabled and non-debug paths through optimizeSelectInst more congruent. NFC because every select is eventually visited either way. llvm-svn: 340368	2018-08-21 23:42:38 +00:00
Vedant Kumar	fbc3873be9	[CodeGenPrepare] Exit earlier when optimizing selects (NFC) When optimizing for size, this allows optimizeSelectInst to skip a linear scan and exit early. llvm-svn: 340367	2018-08-21 23:42:23 +00:00
Guozhi Wei	8c17f9a77d	[CodeGenPrepare] Add BothExtension type to PromotedInsts This patch fixes PR38125. Instruction extension types are recorded in PromotedInsts, it can be used later in function canGetThrough. If an instruction has two users with different extension types, it will be inserted into PromotedInsts two times in function promoteOperandForOther. The second one overwrites the first one, and the final extension type is wrong, later causes problem in canGetThrough. This patch changes the simple bool extension type to 2-bit enum type, add a BothExtension type in addition to zero/sign extension. When an user sees BothExtension for an instruction, it actually knows nothing about how that instruction is extended. Differential Revision: https://reviews.llvm.org/D49512 llvm-svn: 339822	2018-08-15 22:08:26 +00:00
Simon Pilgrim	ee82a79041	[CGP] Fix GEP issue with out of range APInt constant values not fitting in int64_t Test case reduced from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=7173 llvm-svn: 339556	2018-08-13 12:10:09 +00:00
Jonas Devlieghere	42243df3b9	Fix inconsistency with/without debug information (-g) This fixes an inconsistency in code generation when compiling with or without debug information (-g). When debug information is available in an empty block, the original test would fail, resulting in possibly different code. Patch by: Jeroen Dobbelaere Differential revision: https://reviews.llvm.org/D49467 llvm-svn: 339129	2018-08-07 12:14:01 +00:00
Fangrui Song	cb0bab86b3	[CodeGen] Fix inconsistent declaration parameter name llvm-svn: 337200	2018-07-16 18:51:40 +00:00
Vedant Kumar	b3091da3af	Use Type::isIntOrPtrTy where possible, NFC It's a bit neater to write T.isIntOrPtrTy() over `T.isIntegerTy() \|\| T.isPointerTy()`. I used Python's re.sub with this regex to update users: r'([\w.\->()]+)isIntegerTy\s\\|\\|\s\1isPointerTy' llvm-svn: 336462	2018-07-06 20:17:42 +00:00
David Stenberg	23bba56fce	[CodeGen] Make block removal order deterministic in CodeGenPrepare Summary: Replace use of a SmallPtrSet with a SmallSetVector to make the worklist iteration order deterministic. This is done as the order the blocks are removed may affect whether or not PHI nodes in successor blocks are removed. For example, consider the following case where %bb1 and %bb2 are removed: bb1: br i1 undef, label %bb3, label %bb4 bb2: br i1 undef, label %bb4, label %bb3 bb3: pv1 = phi type [ undef, %bb1 ], [ undef, %bb2], [ v0, %other ] br label %bb4 bb4: pv2 = phi type [ undef, %bb1 ], [ undef, %bb2 ], [ pv1, %bb3 ], [ v0, %other ] If %bb2 is removed before %bb1, the incoming values from %bb1 and %bb2 to pv1 will be removed before %bb1 is removed as a predecessor to %bb4. The pv1 node will thus be optimized out (to v0) at the time %bb1 is removed as a predecessor to %bb4, leaving the blocks as following when the incoming value from %bb1 has been removed: bb3: ; pv1 optimized out, incoming value to pv2 is v0 br label %bb4 bb4: pv2 = phi type [ v0, %bb3 ], [ v0, %other ] The pv2 PHI node will be optimized away by removePredecessor() as all incoming values are identical. In case %bb2 is removed after %bb1, pv1 will not be optimized out at the time %bb2 is removed as a predecessor to %bb4, leaving the blocks as following when the incoming value from %bb2 to pv2 has been removed: bb3: pv1 = phi type [ undef, %bb2 ], [ v0, %other ] br label %bb4 bb4: pv2 = phi type [ pv1, %bb3 ], [ v0, %other ] The pv2 PHI node will thus not be removed in this case, ultimately leading to the following output bb3: ; pv1 optimized out, incoming value to pv2 is v0 br label %bb4 bb4: pv2 = phi type [ v0, %bb3 ], [ v0, %other ] I have not looked into changing DeleteDeadBlock() so that the redundant PHI nodes are removed. I have not added a test case, as I was not able to create a particularly small and (not messy) reproducer. This is likely due to SmallPtrSet behaving deterministically when in small mode. Reviewers: void, dexonsmith, spatel, skatkov, fhahn, bkramer, nhaehnle Reviewed By: fhahn Subscribers: mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D48369 llvm-svn: 336109	2018-07-02 14:23:48 +00:00
Piotr Padlewski	5b3db45e8f	Implement strip.invariant.group Summary: This patch introduce new intrinsic - strip.invariant.group that was described in the RFC: Devirtualization v2 Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D47103 Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com> llvm-svn: 336073	2018-07-02 04:49:30 +00:00
Alina Sbirlea	dfd14adeb0	Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred. Summary: Two utils methods have essentially the same functionality. This is an attempt to merge them into one. 1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred 2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor Prior to the patch: 1. MergeBasicBlockIntoOnlyPred Updates either DomTree or DeferredDominance Moves all instructions from Pred to BB, deletes Pred Asserts BB has single predecessor If address was taken, replace the block address with constant 1 (?) 2. MergeBlockIntoPredecessor Updates DomTree, LoopInfo and MemoryDependenceResults Moves all instruction from BB to Pred, deletes BB Returns if doesn't have a single predecessor Returns if BB's address was taken After the patch: Method 2. MergeBlockIntoPredecessor is attempting to become the new default: Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults Moves all instruction from BB to Pred, deletes BB Returns if doesn't have a single predecessor Returns if BB's address was taken Uses of MergeBasicBlockIntoOnlyPred that need to be replaced: 1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp Updated in this patch. No challenges. 2. lib/CodeGen/CodeGenPrepare.cpp Updated in this patch. i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation. ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks Some interesting aspects: - Since Pred is not deleted (BB is), the entry block does not need updating. - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred. - isMergingEmptyBlockProfitable assumes BB is the one to be deleted. - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead. - adding some test owner as subscribers for the interesting tests modified: test/CodeGen/X86/avx-cmp.ll test/CodeGen/AMDGPU/nested-loop-conditions.ll test/CodeGen/AMDGPU/si-annotate-cf.ll test/CodeGen/X86/hoist-spill.ll test/CodeGen/X86/2006-11-17-IllegalMove.ll 3. lib/Transforms/Scalar/JumpThreading.cpp Not covered in this patch. It is the only use case using the DeferredDominance. I would defer to Brian Rzycki to make this replacement. Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits Differential Revision: https://reviews.llvm.org/D48202 llvm-svn: 335183	2018-06-20 22:01:04 +00:00
Hiroshi Inoue	c73b6d6bf7	[NFC] fix trivial typos in comments llvm-svn: 335096	2018-06-20 05:29:26 +00:00
Guozhi Wei	c4c6b548c5	[CodeGenPrepare] Move Extension Instructions Through Logical And Shift Instructions CodeGenPrepare pass move extension instructions close to load instructions in different BB, so they can be combined later. But the extension instructions can't move through logical and shift instructions in current implementation. This patch enables this enhancement, so we can eliminate more extension instructions. Differential Revision: https://reviews.llvm.org/D45537 This is re-commit of r331783, which was reverted by r333305. The performance regression was caused by some unlucky alignment, not a code generation problem. llvm-svn: 334049	2018-06-05 21:03:52 +00:00
David Blaikie	31b98d2e99	Move Analysis/Utils/Local.h back to Transforms Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954	2018-06-04 21:23:21 +00:00
Guozhi Wei	03005d3f03	[CodeGenPrepare] Revert r331783 The patch r331783 caused regression in one of our internal application. So revert it now, will investigate it further. llvm-svn: 333305	2018-05-25 20:30:26 +00:00
Vedant Kumar	40399a213d	[DebugInfo] Maintain DI when converting GEP to bitcast When a GEP with all zero indices is converted to bitcast, its DI wasn't copied over to the newly created instruction. This patch fixes that bug. Patch by Kareem Ergawy! Differential Revision: https://reviews.llvm.org/D47347 llvm-svn: 333235	2018-05-24 23:00:21 +00:00
Vedant Kumar	9374c0432b	[DebugInfo] Maintain DI for sunken bitcasts When a bitcast is being sunk in -codegenprepare pass, its DI wasn't copied over to the newly created instruction. This patch fixes that bug. Patch by Kareem Ergawy! Differential Revision: https://reviews.llvm.org/D47282 llvm-svn: 333133	2018-05-23 22:03:48 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Vedant Kumar	e0b5f86b30	[STLExtras] Add distance() for ranges, pred_size(), and succ_size() This commit adds a wrapper for std::distance() which works with ranges. As it would be a common case to write `distance(predecessors(BB))`, this also introduces `pred_size()` and `succ_size()` helpers to make that easier to write. Differential Revision: https://reviews.llvm.org/D46668 llvm-svn: 332057	2018-05-10 23:01:54 +00:00
Haicheng Wu	0aae2bc260	[CGP] Split large data structres to sink more GEPs Accessing the members of a large data structures needs a lot of GEPs which usually have large offsets due to the size of the underlying data structure. If the offsets are too large to fit into the r+i addressing mode, these GEPs cannot be sunk to their users' blocks and many extra registers are needed then to carry the values of these GEPs. This patch tries to split a large data struct starting from %base like the following. Before: BB0: %base = BB1: %gep0 = gep %base, off0 %gep1 = gep %base, off1 %gep2 = gep %base, off2 BB2: %load1 = load %gep0 %load2 = load %gep1 %load3 = load %gep2 After: BB0: %base = %new_base = gep %base, off0 BB1: %new_gep0 = %new_base %new_gep1 = gep %new_base, off1 - off0 %new_gep2 = gep %new_base, off2 - off0 BB2: %load1 = load i32, i32* %new_gep0 %load2 = load i32, i32* %new_gep1 %load3 = load i32, i32* %new_gep2 In the above example, the struct is split into two parts. The first part still starts from %base and the second part starts from %new_base. After the splitting, %new_gep1 and %new_gep2 have smaller offsets and then can be sunk to BB2 and folded into their users. The algorithm to split data structure is simple and very similar to the work of merging SExts. First, it collects GEPs that have large offsets when iterating the blocks. Second, it splits the underlying data structures and updates the collected GEPs to use smaller offsets. Differential Revision: https://reviews.llvm.org/D42759 llvm-svn: 332015	2018-05-10 18:27:36 +00:00
Guozhi Wei	1aea95a9ea	[CodeGenPrepare] Move Extension Instructions Through Logical And Shift Instructions CodeGenPrepare pass move extension instructions close to load instructions in different BB, so they can be combined later. But the extension instructions can't move through logical and shift instructions in current implementation. This patch enables this enhancement, so we can eliminate more extension instructions. Differential Revision: https://reviews.llvm.org/D45537 llvm-svn: 331783	2018-05-08 17:58:32 +00:00
Piotr Padlewski	5dde809404	Rename invariant.group.barrier to launder.invariant.group Summary: This is one of the initial commit of "RFC: Devirtualization v2" proposal: https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo/edit?usp=sharing Reviewers: rsmith, amharc, kuhar, sanjoy Subscribers: arsenm, nhaehnle, javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45111 llvm-svn: 331448	2018-05-03 11:03:01 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Nico Weber	432a38838d	IWYU for llvm-config.h in llvm, additions. See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff \| diffstat -l \| xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184	2018-04-30 14:59:11 +00:00
Craig Topper	2fa1436206	[IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806	2018-03-29 17:21:10 +00:00
David Blaikie	8ad9a97310	Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency Thanks to echristo for the pointers on direction. llvm-svn: 328737	2018-03-28 22:28:50 +00:00
David Blaikie	36a0f226b1	Fix layering by moving ValueTypes.h from CodeGen to IR ValueTypes.h is implemented in IR already. llvm-svn: 328397	2018-03-23 23:58:31 +00:00
David Blaikie	13e77db2df	Fix layering of MachineValueType.h by moving it from CodeGen to Support This is used by llvm tblgen as well as by LLVM Targets, so the only common place is Support for now. (maybe we need another target for these sorts of things - but for now I'm at least making them correct & we can make them better if/when people have strong feelings) llvm-svn: 328395	2018-03-23 23:58:25 +00:00
David Blaikie	2be3922807	Fix a couple of layering violations in Transforms Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering. Transforms depends on Transforms/Utils, not the other way around. So remove the header and the "createStripGCRelocatesPass" function declaration (& definition) that is unused and motivated this dependency. Move Transforms/Utils/Local.h into Analysis because it's used by Analysis/MemoryBuiltins.cpp. llvm-svn: 328165	2018-03-21 22:34:23 +00:00
Bjorn Pettersson	bf3213e485	[CGP] Avoid segmentation fault when doing PHI node simplifications Summary: Made PHI node simplifiations more robust in several ways: - Minor refactoring to let the SimplificationTracker own the sets with new PHI/Select nodes that are introduced. This is maybe not mapping to the original intention with the SimplificationTracker, but IMHO it encapsulates the logic behind those sets a little bit better. - MatchPhiNode can sometimes populate the Matched set with several entries, where it maps one PHI node to different candidates for replacement. The Matched set is changed into a SmallSetVector to make sure we get a deterministic iteration when doing the replacements. - As described above we may get several different replacements for a single PHI node. The loop in MatchPhiSet that is doing the replacements could end up calling eraseFromParent several times for the same PHI node, resulting in segmentation faults. This problem was supposed to be fixed in rL327250, but due to the non-determinism(?) it only appeared to be fixed (I still got crashes sometime when turning on/off -print-after-all etc to get different iteration order in the DenseSets). With this patch we follow the deterministic ordering in the Matched set when replacing the PHI nodes. If we find a new replacement for an already replaced PHI node we replace the new replacement by the old replacement instead. This is quite similar to what happened in the rl327250 patch, but here we also recursively verify that the old replacement hasn't been replaced already. - It was really hard to track down the fault described above (segementation fault due to doing eraseFromParent multiple times for the same instruction). The fault was intermittent and small changes in the code, or simply turning on -print-after-all etc could make the problem go away. This was basically due to the iteration over PhiNodesToMatch in MatchPhiSet no being deterministic. Therefore I've changed the data structure for the SimplificationTracker::AllPhiNodes into an SmallSetVector. This gives a deterministic behavior. Reviewers: skatkov, john.brawn Reviewed By: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44571 llvm-svn: 327961	2018-03-20 09:06:37 +00:00
Jonas Paulsson	5612bb292c	[CodeGenPrepare] Respect endianness in splitMergedValStore. splitMergedValStore will split a store into two if target prefers this, or if -force-split-store is passed. This patch adds the missing handling for endianness in this function along with a test case. Review: Eli Friedman https://reviews.llvm.org/D44396 llvm-svn: 327375	2018-03-13 08:36:20 +00:00
Serguei Katkov	a20e05bb94	[CGP] Fix the remove of matched phis in complex addressing mode When we replace the Phi we created with matched ones it is possible that there are two identical phi nodes in IR. And matcher is smart enough to find that new created phi matches both of them. So we try to replace our phi node with matched ones twice and what is bad we delete our phi node twice causing a crash. As soon as we found that we have two identical Phi nodes it makes sense to do a clean-up and replace one phi node by other one. The patch implements it. Reviewers: john.brawn, reames Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43758 llvm-svn: 327250	2018-03-12 03:50:07 +00:00
Elena Demikhovsky	945b7e5aa6	Adding a width of the GEP index to the Data Layout. Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout. p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits. The index size parameter is optional, if not specified, it is equal to the pointer size. Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width. It works fine if you can convert pointer to integer for address calculation and all registered targets do this. But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout. http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account. This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size. Differential Revision: https://reviews.llvm.org/D42123 llvm-svn: 325102	2018-02-14 06:58:08 +00:00
Daniel Neilson	be58a220e9	[CodeGenPrepare] Improve source and dest alignments of memory intrinsics independently Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the CodeGenPrepare pass to be more aggressive in improving the source and destination alignments of memcpy/memmove/memset by exploiting our new ability to record independent alignments for each argument. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 323891	2018-01-31 17:24:53 +00:00
Serguei Katkov	9fe0524ee6	[CGP] Re-enable Select in complex addressing mode. Switch Select handling on after fixing two bugs: rL323192 and rL323497. llvm-svn: 323498	2018-01-26 06:26:56 +00:00
Serguei Katkov	17e5794f11	[CGP] Fix the GV handling in complex addressing mode If in complex addressing mode the difference is in GV then base reg should not be installed because we plan to use base reg as a merge point of different GVs. This is a fix for PR35980. Reviewers: reames, john.brawn, santosh Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42230 llvm-svn: 323192	2018-01-23 12:07:49 +00:00
Serguei Katkov	22bb1c0e17	Revert [CGP] Re-enable Select in complex addressing mode One of buildbots failed. Revert for now till fix the issue. llvm-svn: 322923	2018-01-19 04:52:39 +00:00
Daniel Neilson	2409d24201	[NFC] Change MemIntrinsicInst::setAlignment() to take an unsigned instead of a Constant Summary: In preparation for https://reviews.llvm.org/D41675 this NFC changes this prototype of MemIntrinsicInst::setAlignment() to accept an unsigned instead of a Constant. llvm-svn: 322403	2018-01-12 21:33:37 +00:00
Serguei Katkov	76a1de3cd5	[CGP] Re-enable Select in complex addressing mode Re-enable Select after a couple of fixes. Differential Revision: https://reviews.llvm.org/D40634 llvm-svn: 322358	2018-01-12 08:33:34 +00:00
Eric Christopher	d72f78e7c8	Tidy some grammar in some comments llvm-svn: 322133	2018-01-09 23:25:38 +00:00
Serguei Katkov	4d1dd6b53a	[CGP] Fix Complex addressing mode for offset If the offset is differ in two addressing mode we can continue only if ScaleReg is not set due to we will use it as merge of different offsets. It should fix PR35799 and PR35805. Reviewers: john.brawn, reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41227 llvm-svn: 322056	2018-01-09 04:37:06 +00:00
Benjamin Kramer	c7fc81e659	Use phi ranges to simplify code. No functionality change intended. llvm-svn: 321585	2017-12-30 15:27:33 +00:00
Teresa Johnson	a4ce3bfdda	[PGO] Function section hotness prefix should look at all blocks Summary: The function section prefix for PGO based layout (e.g. hot/unlikely) should look at the hotness of all blocks not just the entry BB. A function with a cold entry but a very hot loop should be placed in the hot section, for example, so that it is located close to other hot functions it may call. For SamplePGO it was already looking at the branch weights on calls, and I made that code conditional on whether this is SamplePGO since it was essentially a noop for instrumentation PGO anyway. Reviewers: davidxl Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D41395 llvm-svn: 321197	2017-12-20 17:53:10 +00:00
Haicheng Wu	0be8825146	[CGP] Format. NFC Clang-format. llvm-svn: 321107	2017-12-19 20:53:32 +00:00
Serguei Katkov	b0b67a8d38	[CGP] Fix the handling select inst in complex addressing mode When we put the value in select placeholder we must pass the value through simplification tracker due to the value might be already simplified and erased. This is a fix for PR35658. Reviewers: john.brawn, uabelho Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41251 llvm-svn: 320956	2017-12-18 04:25:07 +00:00
Serguei Katkov	ac4a8fb1cd	Revert "[CGP] Enable select in complex addr mode" Causes: Assertion `ScaledReg == nullptr' failed. This actually a revert of rL320551. llvm-svn: 320553	2017-12-13 07:39:35 +00:00
Serguei Katkov	b8cb5da28d	[CGP] Enable select in complex addr mode Enable select instruction handling in complex addr modes. Reviewers: john.brawn, reames, aaboud Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40634 llvm-svn: 320551	2017-12-13 06:57:59 +00:00
Hiroshi Yamauchi	9364fa3434	Move splitIndirectCriticalEdges() to BasicBlockUtils.h. Summary: Move splitIndirectCriticalEdges() from CodeGenPrepare to BasicBlockUtils.h so that it can be called from other places. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40750 llvm-svn: 319689	2017-12-04 20:36:01 +00:00
Serguei Katkov	d4df744434	[CGP] Enable complex addr mode Enable complex addr modes after two critical fixes: rL319109 and rL319292 llvm-svn: 319302	2017-11-29 09:48:50 +00:00

... 2 3 4 5 6 ...

665 Commits