llvm-project

Commit Graph

Author	SHA1	Message	Date
Wang, Pengfei	16c2067cf2	[X86][AMX] Fix compilation warning introduced by `981a0bd8`.	2020-12-30 22:22:13 +08:00
Juneyoung Lee	9b29610228	Use unary CreateShuffleVector if possible As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them. Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793) The order is swapped, but in terms of correctness it is still fine. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93923	2020-12-30 22:36:08 +09:00
Arthur O'Dwyer	22cf54a7fb	Replace `T(x)` with `reinterpret_cast<T>(x)` everywhere it means reinterpret_cast. NFC. Differential Revision: https://reviews.llvm.org/D76572	2020-12-22 19:54:29 -05:00
Kazu Hirata	966f1431de	[Target] Use llvm::erase_if (NFC)	2020-12-20 17:43:22 -08:00
Krzysztof Parzyszek	fe0527e1c7	[Hexagon] Temporarily disable vector realignment for non-HVX vectors	2020-12-15 19:03:07 -06:00
Krzysztof Parzyszek	16385643bb	[Hexagon] Emit enough stores when aligning vector addresses	2020-12-15 18:59:53 -06:00
Krzysztof Parzyszek	71601d2ac9	[Hexagon] Fix bitcasting v1i8 -> i8	2020-12-15 16:01:24 -06:00
Reid Kleckner	55fc64bce0	[Hexagon] Tweak _MSC_VER workaround version My bot runs VS 2019, but it could not compile this code. Message: [55/2465] Building CXX object lib\Target\Hexagon\CMakeFiles\LLVMHexagonCodeGen.dir\HexagonVectorCombine.cpp.obj FAILED: lib/Target/Hexagon/CMakeFiles/LLVMHexagonCodeGen.dir/HexagonVectorCombine.cpp.obj ... C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.23.28105\include\map(71): error C2976: 'std::map': too few template arguments C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.23.28105\include\map(71): note: see declaration of 'std::map' The version in the path, 14.23, corresponds to _MSC_VER 1923, so raise the version floor to 1924. I have not tested with versions between 1924 and 1928 (latest), but the latest works with the variadic version.	2020-12-14 11:26:36 -08:00
Kazu Hirata	913515e465	[Target] Use llvm::is_contained (NFC)	2020-12-13 19:35:10 -08:00
Krzysztof Parzyszek	baf931a842	[Hexagon] Reconsider getMask fix, return original mask, convert later The getPayload/getMask/getPassThrough functions should return values that could be composed into a masked load/store without any additional type casts. The previous fix violated that. Instead, convert scalar mask to a vector right before rescaling.	2020-12-12 13:27:22 -06:00
Krzysztof Parzyszek	2cf5310471	[Hexagon] Create vector masks for scalar loads/stores AlignVectors treats all loaded/stored values as vectors of bytes, and masks as corresponding vectors of booleans, so make getMask produce a 1-element vector for scalars from the start.	2020-12-12 11:12:17 -06:00
Krzysztof Parzyszek	2d8cc5479b	[Hexagon] Workaround for compilation error with VS2017	2020-12-11 15:11:44 -06:00
Krzysztof Parzyszek	7c9afe9183	[Hexagon] Fix gcc6 compilation issue	2020-12-10 08:17:07 -06:00
Benjamin Kramer	eeb713bbe2	[Hexagon] Fold single-use variables into assert. NFCI. Silences unused variable warnings in Release builds.	2020-12-10 10:53:56 +01:00
Krzysztof Parzyszek	e3b2828b9d	[Hexagon] Silence warnings about unused objects	2020-12-09 17:54:10 -06:00
Krzysztof Parzyszek	43d1c7a564	[Hexagon] Fix build: move template specialization into namespace scope	2020-12-09 17:40:15 -06:00
Krzysztof Parzyszek	f5d07a05bb	[Hexagon] Realign HVX vectors wherever possible Introduce HexagonVectorCombine as a helper class for vector-related optimizations.	2020-12-09 17:11:25 -06:00
Mircea Trofin	bab72dd5d5	[NFC][MC] TargetRegisterInfo::getSubReg is a MCRegister. Typing the API appropriately. Differential Revision: https://reviews.llvm.org/D92341	2020-12-02 15:46:38 -08:00
Krzysztof Parzyszek	b7bde0e4f3	[Hexagon] Improve check for HVX types Allow non-simple types, like <17 x i32> to be treated as HVX vector types.	2020-11-27 13:33:10 -06:00
Simon Pilgrim	c4628460b7	[Hexagon] Add HVX support for ISD::SMAX/SMIN/UMAX/UMIN instead of custom dag patterns Followup to D92112 now that I've learnt about HVX type splitting. This is some necessary cleanup work for min/max ops to eventually help us move the add/sub sat patterns into DAGCombine - D91876. Differential Revision: https://reviews.llvm.org/D92169	2020-11-27 15:46:11 +00:00
Nikita Popov	4df8efce80	[AA] Split up LocationSize::unknown() Currently, we have some confusion in the codebase regarding the meaning of LocationSize::unknown(): Some parts (including most of BasicAA) assume that LocationSize::unknown() only allows accesses after the base pointer. Some parts (various callers of AA) assume that LocationSize::unknown() allows accesses both before and after the base pointer (but within the underlying object). This patch splits up LocationSize::unknown() into LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer() to make this completely unambiguous. I tried my best to determine which one is appropriate for all the existing uses. The test changes in cs-cs.ll in particular illustrate a previously clearly incorrect AA result: We were effectively assuming that argmemonly functions were only allowed to access their arguments after the passed pointer, but not before it. I'm pretty sure that this was not intentional, and it's certainly not specified by LangRef that way. Differential Revision: https://reviews.llvm.org/D91649	2020-11-26 18:39:55 +01:00
Simon Pilgrim	a015635629	[Hexagon] Add support for ISD::SMAX/SMIN/UMAX/UMIN instead of custom dag patterns This should handle the basic integer min/max handling - the HVX ops are still TODO. This is some necessary cleanup work for min/max ops to eventually help us move the add/sub sat patterns into DAGCombine - D91876. Differential Revision: https://reviews.llvm.org/D92112	2020-11-25 19:02:17 +00:00
Craig Topper	4252f7773a	[SelectionDAG][ARM][AArch64][Hexagon][RISCV][X86] Add SDNPCommutative to fma and fmad nodes in tablegen. Remove explicit commuted patterns from targets. X86 was already specially marking fma as commutable which allowed tablegen to autogenerate commuted patterns. This moves it to the target-independent definition and fixes up the targets to remove now-unneeded patterns. Unfortunately, the tests change because the commuted versions of the patterns generate operands in a different order than the explicit patterns. Differential Revision: https://reviews.llvm.org/D91842	2020-11-23 10:09:20 -08:00
Arthur Eubanks	ac7419bb4f	[Hexagon][NewPM] Port -hexagon-loop-idiom and add to pipeline Fixes pmpy-mod.ll under NPM Reviewed By: kparzysz Differential Revision: https://reviews.llvm.org/D91829	2020-11-20 09:34:37 -08:00
Gaurav Jain	06fcc4f06f	[NFC] Use [MC]Register for Hexagon target Differential Revision: https://reviews.llvm.org/D91160	2020-11-18 08:17:07 -08:00
Florian Hahn	b2f4c5fddc	[AsmWriter] Factor out mnemonic generation to accessible getMnemonic. This patch factors out the part of printInstruction that gets the mnemonic string for a given MCInst. This is intended to be used subsequently for the instruction-mix remarks to display the final mnemonic (D90040). Unfortunately making `getMnemonic` available to the AsmPrinter seems to require making it virtual. Not sure if there's a way around that with the current layering of the AsmPrinters. Reviewed By: Paul-C-Anagnostopoulos Differential Revision: https://reviews.llvm.org/D90039	2020-11-17 09:47:38 +00:00
serge-sans-paille	9218ff50f9	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These functions store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Sander de Smalen	d57bba7cf8	[SVE] Return StackOffset for TargetFrameLowering::getFrameIndexReference. To accommodate frame layouts that have both fixed and scalable objects on the stack, describing a stack location or offset using a pointer + uint64_t is not sufficient. For this reason, we've introduced the StackOffset class, which models both the fixed- and scalable sized offsets. The TargetFrameLowering::getFrameIndexReference is made to return a StackOffset, so that this can be used in other interfaces, such as to eliminate frame indices in PEI or to emit Debug locations for variables on the stack. This patch is purely mechanical and doesn't change the behaviour of how the result of this function is used for fixed-sized offsets. The patch adds various checks to assert that the offset has no scalable component, as frame offsets with a scalable component are not yet supported in various places. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D90018	2020-11-05 11:02:18 +00:00
Krzysztof Parzyszek	b26a2755dc	[Hexagon] Move isTypeForHVX from Hexagon TTI to HexagonSubtarget, NFC It's useful outside of Hexagon TTI, and with how TTI is implemented, it is not accessible outside of TTI.	2020-11-02 14:00:45 -06:00
Florian Hahn	b3b993a7ad	Reland "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts the revert commit `408c4408fa`. This version of the patch includes a fix for a crash caused by treating ICmp/FCmp constant expressions as instructions. Original message: On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV.	2020-11-02 15:39:29 +00:00
Florian Hahn	408c4408fa	Revert "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts commit `73f01e3df5`. This appears to break http://lab.llvm.org:8011/#/builders/85/builds/383.	2020-10-30 21:26:14 +00:00
Florian Hahn	73f01e3df5	[TTI] Add VecPred argument to getCmpSelInstrCost. On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV. Reviewed By: dmgreen, RKSimon Differential Revision: https://reviews.llvm.org/D90070	2020-10-30 13:49:08 +00:00
Krzysztof Parzyszek	db60e64036	[Hexagon] Handle additional shuffles that can be made perfect	2020-10-29 19:09:00 -05:00
Krzysztof Parzyszek	1b5baa42bc	[Hexagon] Handle selection between HVX vector predicates Make sure that (select i1 q0 q1) is handled properly.	2020-10-23 18:22:03 -05:00
Nicholas Guy	9a2d2bedb7	Add "SkipDead" parameter to TargetInstrInfo::DefinesPredicate Some instructions may be removable through processes such as IfConversion; however, DefinesPredicate cannot be made aware of when this should be considered. This parameter allows DefinesPredicate to distinguish these removable instructions on a per-call basis, allowing for more fine-grained control from processes like IfConversion. Renames DefinesPredicate to ClobbersPredicate, to better reflect its purpose. Differential Revision: https://reviews.llvm.org/D88494	2020-10-21 11:52:47 +01:00
Krzysztof Parzyszek	97533b10b2	[Hexagon] Fix license headers in some .td files, NFC	2020-10-16 10:03:05 -05:00
Krzysztof Parzyszek	670cd3c6e3	[Hexagon] Generate better splat code on v62+	2020-10-14 12:55:20 -05:00
Krzysztof Parzyszek	9237e73ae8	[Hexagon] Replace HexagonISD::VSPLAT with ISD::SPLAT_VECTOR This removes VSPLAT and VZERO. VZERO is now SPLAT_VECTOR of (i32 0). Included is also a testcase for the previous (target-independent) commit.	2020-10-10 19:49:47 -05:00
Krzysztof Parzyszek	6fd994b4b7	[Hexagon] Remove ISD node VSPLATW, use VSPLAT instead This is a step towards improving HVX codegen for splat.	2020-10-09 15:38:02 -05:00
Krzysztof Parzyszek	33bb3efbb3	[Hexagon] Generalize handling of SDNodes created during ISel The selection of HVX shuffles can produce more nodes in the DAG, which need special handling, or otherwise they would be left unselected by the main selection code. Make the handling of such nodes more general.	2020-10-09 15:38:02 -05:00
Krzysztof Parzyszek	99cafe0094	[Hexagon] Return 1 instead of 0 from getMaxInterleaveFactor	2020-10-09 09:46:18 -05:00
Krzysztof Parzyszek	f528816d58	[Hexagon] Move selection of HVX multiply from lowering to patterns Also, change i32*i32 to V6_vmpyieoh + V6_vmpyiewuh_acc, which works on V60 as well.	2020-10-02 16:04:34 -05:00
David Sherwood	b8ce6a6756	[SVE][CodeGen] Add new EVT/MVT getFixedSizeInBits() functions When we know that a particular type is always going to be fixed width we have so far been writing code like this: getSizeInBits().getFixedSize() Since we are doing this in quite a few places now it seems to make sense to add a new helper function that allows us to replace these calls with a single getFixedSizeInBits() call. Differential Revision: https://reviews.llvm.org/D88649	2020-10-02 07:47:31 +01:00
Arthur Eubanks	ce5379f0f0	[NPM] Add target specific hook to add passes for New Pass Manager The patch adds a new TargetMachine member "registerPassBuilderCallbacks" for targets to add passes to the pass pipeline using the New Pass Manager (similar to adjustPassManager for the Legacy Pass Manager). Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D88138	2020-09-30 13:29:43 -07:00
Krzysztof Parzyszek	3185839bcf	[Hexagon] Avoid crash on CONCAT_VECTORS with illegal element types Legal vector element types may not be legal as scalar types. When CONCAT_VECTORS is converted to BUILD_VECTOR, the individual vector elements become standalone operands to the build operation. If they have illegal (scalar) types, they need to be made legal. In doing so, the case of TRUNCATE was not handled, causing an assertion to fail.	2020-09-24 20:05:23 -05:00
Stefanos Baziotis	a7873e5abc	Small fixes for "[LoopInfo] empty() -> isInnermost(), add isOutermost()"	2020-09-22 23:59:34 +03:00
Stefanos Baziotis	89c1e35f3c	[LoopInfo] empty() -> isInnermost(), add isOutermost() Differential Revision: https://reviews.llvm.org/D82895	2020-09-22 23:28:51 +03:00
Pengxuan Zheng	e5fea37f1a	[Hexagon] Make HexagonVLCR compatible with New PM The patch modifies HexagonVectorLoopCarriedReuse pass to make it compatible with both Legacy Pass Manager through HexagonVectorLoopCarriedReuseLegacyPass and with New Pass Manager through HexagonVectorLoopCarriedReusePass. Reviewed By: pzheng Differential Revision: https://reviews.llvm.org/D86955	2020-09-21 13:45:12 -07:00
Krzysztof Parzyszek	5f4abb7fab	[Hexagon] Replace incorrect pattern for vpackl HWI32 -> HVi8 V6_vdealb4w is not correct for pairs, use V6_vpackeh/V6_vpackeb instead.	2020-09-15 20:34:50 -05:00
Krzysztof Parzyszek	bb877d1af2	[Hexagon] Widen loads and handle any-/sign-/zero-extensions	2020-09-14 18:10:23 -05:00
Krzysztof Parzyszek	6352381039	[Hexagon] Some HVX DAG combines 1. VINSERTW0 x, undef -> x 2. VROR (VROR x, a), b) -> VROR x, a+b	2020-09-14 18:10:23 -05:00
Craig Topper	c193a689b4	[SelectionDAG] Use Align/MaybeAlign in calls to getLoad/getStore/getExtLoad/getTruncStore. The versions that take 'unsigned' will be removed in the future. I tried to use getOriginalAlign instead of getAlign in some places. getAlign factors in the minimum alignment implied by the offset in the pointer info. Since we're also passing the pointer info we can use the original alignment. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D87592	2020-09-14 13:54:50 -07:00
Krzysztof Parzyszek	9d300bc8d2	[Hexagon] Avoid widening vectors with non-HVX element types	2020-09-12 20:26:54 -05:00
Krzysztof Parzyszek	783e28a508	[Hexagon] Split pair-based masked memops	2020-09-10 14:24:42 -05:00
Simon Pilgrim	601557e9f9	Hexagon.h - remove unnecessary includes. NFCI. Replace with forward declarations and move includes to implicit dependent files.	2020-09-10 16:59:43 +01:00
Krzysztof Parzyszek	0ee54cf883	[Hexagon] Account for truncating pairs to non-pairs when widening truncates Added missing selection patterns for vpackl.	2020-09-09 14:31:52 -05:00
Krzysztof Parzyszek	c2b7b9b642	[Hexagon] Fix order of operands in V6_vdealb4w	2020-09-08 22:09:28 -05:00
Krzysztof Parzyszek	d183f47261	[Hexagon] Handle widening of truncation's operand with legal result Failing example: v8i8 = truncate v8i32. v8i8 is legal, but v8i32 was widened to HVX. Make sure that v8i8 does not get altered (even if it's changed to another legal type).	2020-09-08 16:07:39 -05:00
Roman Lebedev	bb7d3af113	Reland [SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline This was reverted in `503deec218` because it caused gigantic increase (3x) in branch mispredictions in certain benchmarks on certain CPU's, see https://reviews.llvm.org/D84108#2227365. It has since been investigated and here are the results: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200907/827578.html > It's an amazingly severe regression, but it's also all due to branch > mispredicts (about 3x without this). The code layout looks ok so there's > probably something else to deal with. I'm not sure there's anything we can > reasonably do so we'll just have to take the hit for now and wait for > another code reorganization to make the branch predictor a bit more happy :) > > Thanks for giving us some time to investigate and feel free to recommit > whenever you'd like. > > -eric So let's just reland this. Original commit message: I've been looking at missed vectorizations in one codebase. One particular thing that stands out is that some of the loops reach vectorizer in a rather mangled form, with weird PHI's, and some of the loops aren't even in a rotated form. After taking a more detailed look, that happened because the loop's headers were too big by then. It is evident that SimplifyCFG's common code hoisting transform is at fault there, because the pattern it handles is precisely the unrotated loop basic block structure. Surprizingly, `SimplifyCFGOpt::HoistThenElseCodeToIf()` is enabled by default, and is always run, unlike it's friend, common code sinking transform, `SinkCommonCodeFromPredecessors()`, which is not enabled by default and is only run once very late in the pipeline. I'm proposing to harmonize this, and disable common code hoisting until //late// in pipeline. 
Definition of //late// may vary, here currently i've picked the same one as for code sinking, but i suppose we could enable it as soon as right after loop rotation happens. Experimentation shows that this does indeed unsurprizingly help, more loops got rotated, although other issues remain elsewhere. Now, this undoubtedly seriously shakes phase ordering. This will undoubtedly be a mixed bag in terms of both compile- and run- time performance, codesize. Since we no longer aggressively hoist+deduplicate common code, we don't pay the price of said hoisting (which wasn't big). That may allow more loops to be rotated, so we pay that price. That, in turn, that may enable all the transforms that require canonical (rotated) loop form, including but not limited to vectorization, so we pay that too. And in general, no deduplication means more [duplicate] instructions going through the optimizations. But there's still late hoisting, some of them will be caught late. As per benchmarks i've run {F12360204}, this is mostly within the noise, there are some small improvements, some small regressions. One big regression i saw i fixed in rG8d487668d09fb0e4e54f36207f07c1480ffabbfd, but i'm sure this will expose many more pre-existing missed optimizations, as usual :S llvm-compile-time-tracker.com thoughts on this: http://llvm-compile-time-tracker.com/compare.php?from=e40315d2b4ed1e38962a8f33ff151693ed4ada63&to=c8289c0ecbf235da9fb0e3bc052e3c0d6bff5cf9&stat=instructions * this does regress compile-time by +0.5% geomean (unsurprizingly) * size impact varies; for ThinLTO it's actually an improvement The largest fallout appears to be in GVN's load partial redundancy elimination, it spends much more time in `MemoryDependenceResults::getNonLocalPointerDependency()`. Non-local `MemoryDependenceResults` is widely-known to be, uh, costly. 
There does not appear to be a proper solution to this issue, other than silencing the compile-time performance regression by tuning cut-off thresholds in `MemoryDependenceResults`, at the cost of potentially regressing run-time performance. D84609 attempts to move in that direction, but the path is unclear and is going to take some time. If we look at stats before/after diffs, some excerpts: * RawSpeed (the target) {F12360200} * -14 (-73.68%) loops not rotated due to the header size (yay) * -272 (-0.67%) `"Number of live out of a loop variables"` - good for vectorizer * -3937 (-64.19%) common instructions hoisted * +561 (+0.06%) x86 asm instructions * -2 basic blocks * +2418 (+0.11%) IR instructions * vanilla test-suite + RawSpeed + darktable {F12360201} * -36396 (-65.29%) common instructions hoisted * +1676 (+0.02%) x86 asm instructions * +662 (+0.06%) basic blocks * +4395 (+0.04%) IR instructions It is likely to be sub-optimal for when optimizing for code size, so one might want to change tune pipeline by enabling sinking/hoisting when optimizing for size. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D84108 This reverts commit `503deec218`.	2020-09-08 00:24:03 +03:00
Krzysztof Parzyszek	62f89a89f3	[Hexagon] Add assertions about V6_pred_scalar2	2020-09-05 18:20:23 -05:00
Krzysztof Parzyszek	9518f032e4	[Hexagon] When widening truncate result, also widen operand if necessary	2020-09-05 18:19:32 -05:00
Krzysztof Parzyszek	8789f2bbde	[Hexagon] Resize the mem operand when widening loads and stores	2020-09-05 18:17:48 -05:00
Krzysztof Parzyszek	1387f96ab3	[Hexagon] Handle widening of vector truncate	2020-09-05 15:07:38 -05:00
Krzysztof Parzyszek	89a4fe79d4	[Hexagon] Unindent everything in HexagonISelLowering.h, NFC Just a shift, no other formatting changes.	2020-09-04 17:25:29 -05:00
Krzysztof Parzyszek	69fac677bc	[Hexagon] Fix perfect shuffle generation for single vectors Perfect shuffle instructions (vdealvdd/vshuffvdd) work on vector pairs. When given a single input vector, half of it first needs to be transposed into the other vector before the generated shuffles can take effect. Also, the first transpose needs to be undone at the end (this last step was missing).	2020-08-30 06:43:16 -05:00
Craig Topper	aab90384a3	[Attributes] Add a method to check if an Attribute has AttrKind None. Use instead of hasAttribute(Attribute::None) There's a special case in hasAttribute for None when pImpl is null. If pImpl is not null we dispatch to pImpl->hasAttribute which will always return false for Attribute::None. So if we just want to check for None its sufficient to just check that pImpl is null. Which can even be done inline. This patch adds a helper for that case which I hope will speed up our getSubtargetImpl implementations. Differential Revision: https://reviews.llvm.org/D86744	2020-08-28 13:23:45 -07:00
Krzysztof Parzyszek	4ef9275b9b	[Hexagon] Emit better 32-bit multiplication sequence for HVXv62+	2020-08-27 15:24:32 -05:00
Benjamin Kramer	b5924a8e27	[Hexagon] Fold another layer of single-use variable into assert. NFCI.	2020-08-27 16:52:34 +02:00
Benjamin Kramer	2b7df2707f	[Hexagon] Fold single-use variable into assert. NFCI.	2020-08-27 16:44:22 +02:00
Krzysztof Parzyszek	154daf1f94	[Hexagon] Widen short vector stores to HVX vectors using masked stores Also invent a flag -hexagon-hvx-widen=N to set the minimum threshold for widening short vectors to HVX vectors.	2020-08-27 09:25:08 -05:00
Krzysztof Parzyszek	e15143d31b	[Hexagon] Implement llvm.masked.load and llvm.masked.store for HVX	2020-08-26 13:10:22 -05:00
Ankit Aggarwal	2da1eefb58	[Hexagon] Check if EVT is simple type in HVX lowering	2020-08-25 15:02:44 -05:00
Krzysztof Parzyszek	dcef5e0c37	[Hexagon] Remove (redundant) HexagonISelLowering::isHvxOperation(SDValue) Use isHvxOperation(SDNode*) instead.	2020-08-25 11:45:08 -05:00
Roman Lebedev	503deec218	Temporarily revert "[SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline" As discussed in post-commit review starting with https://reviews.llvm.org/D84108#2227365, while this appears to be mostly a win overall, especially code-size-wise, it appears to shake //certain// code patterns in a way that is extremely unfavorable for performance (+30% runtime regression) on certain CPUs (I personally can't reproduce). So until the behaviour is better understood, and a path forward is mapped, let's back this out for now. This reverts commit `1d51dc38d8`.	2020-08-22 00:33:22 +03:00
Craig Topper	c7a0b2684f	[X86][MC][Target] Initial backend support a tune CPU to support -mtune This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present, X86 will use the resolved CPU from the target-cpu attribute or command line. This patch adds MC layer support for a tune CPU. Each CPU now has two sets of features stored in its GenSubtargetInfo.inc tables. These feature lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86 targets. This annoyingly increases the size of static tables on all targets, as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned. One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU. I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning. Differential Revision: https://reviews.llvm.org/D85165	2020-08-14 15:31:50 -07:00
Krzysztof Parzyszek	a2dc19b81b	[Hexagon] Return scalar size in getMinVectorRegisterBitWidth() when no HVX This fixes https://llvm.org/PR47128.	2020-08-12 10:13:58 -05:00
Kerry McLaughlin	85c7e89f3b	[CodeGen] Refactor getMemBasePlusOffset & getObjectPtrOffset to accept a TypeSize Changes the Offset arguments to both functions from int64_t to TypeSize & updates all uses of the functions to create the offset using TypeSize::Fixed() Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85220	2020-08-11 12:17:10 +01:00
Arthur Eubanks	f50b3ff02e	[Hexagon] Use InstSimplify instead of ConstantProp This is the last remaining use of ConstantProp, migrate it to InstSimplify in the goal of removing ConstantProp. Add -hexagon-instsimplify option to enable skipping of instsimplify in tests that can't handle the extra optimization. Differential Revision: https://reviews.llvm.org/D85047	2020-08-04 15:42:39 -07:00
Krzysztof Parzyszek	09897b146a	[RDF] Remove uses of RDFRegisters::normalize (deprecate) This function has been reduced to an identity function for some time.	2020-08-04 17:02:12 -05:00
hgreving	509f5c4ec2	[MC] Fix memory leak when allocating MCInst with bump allocator Adds the function createMCInst() to MCContext that creates an MCInst using a typed bump allocator. MCInst contains a SmallVector<MCOperand, 8>. The SmallVector is POD only for <= 8 operands. The default untyped bump pointer allocator of MCContext does not delete the MCInst, so if the SmallVector grows, it's a leak. This fixes https://bugs.llvm.org/show_bug.cgi?id=46900.	2020-08-03 16:08:26 -07:00
Sidharth Baveja	b7cfa6ca92	[Loop Peeling] Separate the Loop Peeling Utilities from the Loop Unrolling Utilities Summary: This patch separates the Loop Peeling Utilities from Loop Unrolling. The reason for this change is that Loop Peeling is no longer only being used by loop unrolling; Patch D82927 introduces loop peeling with fusion, such that loops can be modified to have the same trip count, making them legal to be peeled. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D83056	2020-07-31 18:31:58 +00:00
Roman Lebedev	1d51dc38d8	[SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline I've been looking at missed vectorizations in one codebase. One particular thing that stands out is that some of the loops reach vectorizer in a rather mangled form, with weird PHI's, and some of the loops aren't even in a rotated form. After taking a more detailed look, that happened because the loop's headers were too big by then. It is evident that SimplifyCFG's common code hoisting transform is at fault there, because the pattern it handles is precisely the unrotated loop basic block structure. Surprizingly, `SimplifyCFGOpt::HoistThenElseCodeToIf()` is enabled by default, and is always run, unlike it's friend, common code sinking transform, `SinkCommonCodeFromPredecessors()`, which is not enabled by default and is only run once very late in the pipeline. I'm proposing to harmonize this, and disable common code hoisting until //late// in pipeline. Definition of //late// may vary, here currently i've picked the same one as for code sinking, but i suppose we could enable it as soon as right after loop rotation happens. Experimentation shows that this does indeed unsurprizingly help, more loops got rotated, although other issues remain elsewhere. Now, this undoubtedly seriously shakes phase ordering. This will undoubtedly be a mixed bag in terms of both compile- and run- time performance, codesize. Since we no longer aggressively hoist+deduplicate common code, we don't pay the price of said hoisting (which wasn't big). That may allow more loops to be rotated, so we pay that price. That, in turn, that may enable all the transforms that require canonical (rotated) loop form, including but not limited to vectorization, so we pay that too. And in general, no deduplication means more [duplicate] instructions going through the optimizations. But there's still late hoisting, some of them will be caught late. 
As per benchmarks i've run {F12360204}, this is mostly within the noise, there are some small improvements, some small regressions. One big regression i saw i fixed in rG8d487668d09fb0e4e54f36207f07c1480ffabbfd, but i'm sure this will expose many more pre-existing missed optimizations, as usual :S llvm-compile-time-tracker.com thoughts on this: http://llvm-compile-time-tracker.com/compare.php?from=e40315d2b4ed1e38962a8f33ff151693ed4ada63&to=c8289c0ecbf235da9fb0e3bc052e3c0d6bff5cf9&stat=instructions * this does regress compile-time by +0.5% geomean (unsurprizingly) * size impact varies; for ThinLTO it's actually an improvement The largest fallout appears to be in GVN's load partial redundancy elimination, it spends much more time in `MemoryDependenceResults::getNonLocalPointerDependency()`. Non-local `MemoryDependenceResults` is widely-known to be, uh, costly. There does not appear to be a proper solution to this issue, other than silencing the compile-time performance regression by tuning cut-off thresholds in `MemoryDependenceResults`, at the cost of potentially regressing run-time performance. D84609 attempts to move in that direction, but the path is unclear and is going to take some time. If we look at stats before/after diffs, some excerpts: * RawSpeed (the target) {F12360200} * -14 (-73.68%) loops not rotated due to the header size (yay) * -272 (-0.67%) `"Number of live out of a loop variables"` - good for vectorizer * -3937 (-64.19%) common instructions hoisted * +561 (+0.06%) x86 asm instructions * -2 basic blocks * +2418 (+0.11%) IR instructions * vanilla test-suite + RawSpeed + darktable {F12360201} * -36396 (-65.29%) common instructions hoisted * +1676 (+0.02%) x86 asm instructions * +662 (+0.06%) basic blocks * +4395 (+0.04%) IR instructions It is likely to be sub-optimal for when optimizing for code size, so one might want to change tune pipeline by enabling sinking/hoisting when optimizing for size. 
Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D84108	2020-07-29 20:05:30 +03:00
David Green	60280e9818	[Analysis] TTI: Add CastContextHint for getCastInstrCost Currently, getCastInstrCost has limited information about the cast it's rating, often just the opcode and types. Sometimes there is a context instruction as well, but it isn't trustworthy: for instance, when the vectorizer is rating a plan, it calls getCastInstrCost with the old instructions when, in fact, it's trying to evaluate the cost of the instruction post-vectorization. Thus, the current system can get the cost of certain casts wrong, as the correct cost can vary greatly based on the context in which it's used. For example, if the vectorizer queries getCastInstrCost to evaluate the cost of a sext(load) with tail predication enabled, getCastInstrCost will think it's free most of the time, but it's not always free. On ARM MVE, a VLD2 group cannot be extended like a normal VLDR can. Similar situations can come up with how masked loads can be extended when being split. To fix that, this patch adds a new parameter to getCastInstrCost to give it a hint about the context of the cast. It adds a CastContextHint enum which contains the type of the load/store being created by the vectorizer - one for each of the types it can produce. Original patch by Pierre van Houtryve Differential Revision: https://reviews.llvm.org/D79162	2020-07-29 13:32:53 +01:00
Ikhlas Ajbar	d50d4c3d44	[Hexagon] Correct the order of operands when lowering funnel shift-left This patch corrects the order of operands in the pattern that lowers fshl in Hexagon.	2020-07-28 21:22:41 -05:00
Simon Pilgrim	017e5c949b	MCFixup.h - remove unnecessary MCExpr.h include. NFCI. Move the include down to files that actually depend on MCExpr definitions. Also exposes an implicit dependency on MCContext in AVRAsmBackend.h	2020-07-20 15:17:19 +01:00
Roman Lebedev	fb432a51f4	Reland "[NFCI] createCFGSimplificationPass(): migrate to also take SimplifyCFGOptions" This reverts commit `1067d3e176`, which reverted commit `b2018198c3`, because it introduced a Dependency Cycle between Transforms/Scalar and Transforms/Utils. So let's just move SimplifyCFGOptions.h into Utils/, thus avoiding the cycle.	2020-07-16 13:40:01 +03:00
Adrian Kuegel	1067d3e176	Revert "[NFCI] createCFGSimplificationPass(): migrate to also take SimplifyCFGOptions" This reverts commit `b2018198c3`. This commit introduced a Dependency Cycle between Transforms/Scalar and Transforms/Utils. Transforms/Scalar already depends on Transforms/Utils, so if SimplifyCFGOptions.h is moved to Scalar, and Utils/Local.h still depends on it, we have a cycle.	2020-07-16 10:54:10 +02:00
Roman Lebedev	b2018198c3	[NFCI] createCFGSimplificationPass(): migrate to also take SimplifyCFGOptions Taking so many parameters is simply unmaintainable. We don't want to include the entire llvm/Transforms/Utils/Local.h into llvm/Transforms/Scalar.h so i've split SimplifyCFGOptions into its own header.	2020-07-16 01:27:54 +03:00
serge-sans-paille	62881fda58	Fix HexagonGenExtract return status Differential Revision: https://reviews.llvm.org/D83460	2020-07-13 20:41:59 +02:00
Sidharth Baveja	e541e1b757	[NFC] Separate Peeling Properties into its own struct (re-land after minor fix) Summary: This patch separates the peeling specific parameters from the UnrollingPreferences, and creates a new struct called PeelingPreferences. Functions which used the UnrollingPreferences struct for peeling have been updated to use the PeelingPreferences struct. Author: sidbav (Sidharth Baveja) Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel), anhtuyen (Anh Tuyen Tran), nikic (Nikita Popov) Reviewed By: Meinersbur (Michael Kruse) Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D80580	2020-07-10 18:39:30 +00:00
Nikita Popov	0b39d2d752	Revert "[NFC] Separate Peeling Properties into its own struct" This reverts commit `0369dc98f9`. Many failing tests.	2020-07-08 21:43:32 +02:00
Sidharth Baveja	0369dc98f9	[NFC] Separate Peeling Properties into its own struct Summary: This patch makes the peeling properties of the loop accessible by other loop transformations. Author: sidbav (Sidharth Baveja) Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel) Reviewed By: Meinersbur (Michael Kruse) Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D80580	2020-07-08 18:59:59 +00:00
Anh Tuyen Tran	6965af43e6	Revert "[NFC] Separate Peeling Properties into its own struct" This reverts commit `fead250b43`.	2020-07-08 18:58:05 +00:00
Anh Tuyen Tran	fead250b43	[NFC] Separate Peeling Properties into its own struct Summary: This patch makes the peeling properties of the loop accessible by other loop transformations. Author: sidbav (Sidharth Baveja) Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel) Reviewed By: Meinersbur (Michael Kruse) Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D80580	2020-07-08 18:56:03 +00:00
Guillaume Chatelet	87e2751cf0	[Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D83082	2020-07-03 08:06:43 +00:00
Guillaume Chatelet	8dbafd24d6	[Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82977	2020-07-02 11:28:02 +00:00
James Y Knight	4b0aa5724f	Change the INLINEASM_BR MachineInstr to be a non-terminating instruction. Before this instruction supported output values, it fit fairly naturally as a terminator. However, being a terminator while also supporting outputs causes some trouble, as the physreg->vreg COPY operations cannot be in the same block. Modeling it as a non-terminator allows it to be handled the same way as invoke is handled already. Most of the changes here were created by auditing all the existing users of MachineBasicBlock::isEHPad() and MachineBasicBlock::hasEHPadSuccessor(), and adding calls to isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate. Reviewed By: nickdesaulniers, void Differential Revision: https://reviews.llvm.org/D79794	2020-07-01 12:51:50 -04:00
Guillaume Chatelet	d3085c2501	[Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82956	2020-07-01 14:31:56 +00:00
Adam Balogh	ec5ba353fa	[Hexagon][NFC] Remove redundant condition Condition `secondReg` is checked both in an outer and in an inner `if` statement in static function `canCompareBeNewValueJump()` in file `HexagonNewValueJump.cpp`. This patch removes the redundant inner check. The issue was found using `clang-tidy` check under review `misc-redundant-condition`. See https://reviews.llvm.org/D81272. Differential Revision: https://reviews.llvm.org/D82556	2020-07-01 09:04:26 +02:00
Guillaume Chatelet	c1cd61e02a	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemcpy to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82849	2020-06-30 13:12:31 +00:00
Guillaume Chatelet	5f8bdb3e6a	[Alignment][NFC] TargetLowering::allowsMemoryAccess Second patch of a series to adapt TargetLowering::allowsXXX functions This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82785	2020-06-30 08:17:00 +00:00
Guillaume Chatelet	4f5133a4dc	[Alignment][NFC] Migrate AArch64, ARM, Hexagon, MSP and NVPTX backends to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82749	2020-06-30 07:56:17 +00:00
Guillaume Chatelet	b66e33a689	[Alignment][NFC] Migrate TTI::getGatherScatterOpCost to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82577	2020-06-26 11:08:27 +00:00
Guillaume Chatelet	fdc7c7fb87	[Alignment][NFC] Migrate TTI::getInterleavedMemoryOpCost to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82573	2020-06-26 11:00:53 +00:00
Guillaume Chatelet	7e1f79c3de	[Alignment][NFC] Migrate TTI::getMaskedMemoryOpCost to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82569	2020-06-26 10:14:16 +00:00
dfukalov	7ddee0922f	[NFCI][CostModel] Add const to Value*. Summary: Get back `const` partially lost in one of the recent changes. Additionally specify explicit qualifiers in a few places. Reviewers: samparker Reviewed By: samparker Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82383	2020-06-24 23:16:08 +03:00
Ikhlas Ajbar	085701b8b0	[Hexagon] Reducing minimum alignment requirement This patch reduces minimum alignment requirement to 1 byte for arguments passed by value on stack.	2020-06-24 10:28:37 -05:00
Sam Parker	fa8bff0cd1	[CostModel] Unify getArithmeticInstrCost Add the remaining arithmetic opcodes into the generic implementation of getUserCost and then call this from getInstructionThroughput. Most of the backends have been modified to return the base implementation for cost kinds other than RecipThroughput. The outlier here is AMDGPU which already uses getArithmeticInstrCost for all the cost kinds. This change means that most of the opcodes can be removed from that backend's implementation of getUserCost. Differential Revision: https://reviews.llvm.org/D80992	2020-06-10 09:08:45 +01:00
Guillaume Chatelet	800e100588	Revert "[Alignment][NFC] Migrate TargetLowering::allowsMemoryAccess" This reverts commit `f21c52667e`.	2020-06-09 10:43:59 +00:00
Guillaume Chatelet	f21c52667e	[Alignment][NFC] Migrate TargetLowering::allowsMemoryAccess Summary: Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::allowsMemoryAccess` without marking it override. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81379	2020-06-09 10:11:07 +00:00
Sam Parker	37289615c0	[NFCI][CostModel] Unify getCmpSelInstrCost Add cases for icmp, fcmp and select into the switch statement of the generic getUserCost implementation with getInstructionThroughput then calling into it. The BasicTTI and backend implementations have been set to return a default value (1) when a cost other than throughput is being queried. Differential Revision: https://reviews.llvm.org/D80550	2020-06-09 07:41:22 +01:00
James Y Knight	1978309db1	MachineBasicBlock::updateTerminator now requires an explicit layout successor. Previously, it tried to infer the correct destination block from the successor list, but this is a rather tricky prospect, given the existence of successors that occur mid-block, such as invoke, and potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in particular would be problematic, because its successor blocks are not distinct from "normal" successors, as EHPads are.) Instead, require the caller to pass in the expected fallthrough successor explicitly. In most callers, the correct block is immediately clear. But, in MachineBlockPlacement, we do need to record the original ordering, before starting to reorder blocks. Unfortunately, the goal of decoupling the behavior of end-of-block jumps from the successor list has not been fully accomplished in this patch, as there is currently no other way to determine whether a block is intended to fall-through, or end as unreachable. Further work is needed there. Differential Revision: https://reviews.llvm.org/D79605	2020-06-06 22:30:51 -04:00
Sam Parker	9303546b42	[CostModel] Unify getMemoryOpCost Use getMemoryOpCost from the generic implementation of getUserCost and have getInstructionThroughput return the result of that for loads and stores. This also means that the X86 implementation of getUserCost can be removed with the functionality folded into its getMemoryOpCost. Differential Revision: https://reviews.llvm.org/D80984	2020-06-05 10:13:38 +01:00
hsmahesha	0ed2c04636	[AMDGPU/MemOpsCluster] Let mem ops clustering logic also consider number of clustered bytes Summary: While clustering mem ops, AMDGPU target needs to consider number of clustered bytes to decide on max number of mem ops that can be clustered. This patch adds support to pass number of clustered bytes to target mem ops clustering logic. Reviewers: foad, rampitec, arsenm, vpykhtin, javedabsar Reviewed By: foad Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80545	2020-06-01 22:52:34 +05:30
Sam Parker	8aaabadece	[CostModel] Unify getCastInstrCost Add the remaining cast instruction opcodes to the base implementation of getUserCost and directly return the result. This allows getInstructionThroughput to return getUserCost for the casts. This has required changes to PPC and SystemZ because they implement getUserCost and/or getCastInstrCost with adjustments for vector operations. Adjustments have also been made in the remaining backends that implement the method so that they still produce a cost of zero or one for cost kinds other than throughput. Differential Revision: https://reviews.llvm.org/D79848	2020-05-26 11:29:57 +01:00
Fangrui Song	7e49dc6184	[MC] Change MCCFIInstruction::createDefCfa to cfiDefCfa which does not negate Offset The negative Offset has caused a bunch of problems and confused quite a few call sites. Delete the unneeded negation and fix all call sites.	2020-05-22 15:47:26 -07:00
Marek Kurdej	9301e3aaca	[Target] Fix typos. NFC	2020-05-22 14:40:43 +02:00
Sam Parker	8cc911fa5b	[NFCI][CostModel] Refactor getIntrinsicInstrCost Combine the two API calls into one by introducing a structure to hold the relevant data. This has the added benefit of moving the boiler plate code for arguments and flags, into the constructors. This is intended to be a non-functional change, but the complicated web of logic involved here makes it very hard to guarantee. Differential Revision: https://reviews.llvm.org/D79941	2020-05-20 11:59:08 +01:00
Florian Hahn	bcbd26bfe6	[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC). SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. This patch was originally committed as `b8a3c34eee`, but broke the modules build, as LoopAccessAnalysis was using the Expander. The code-gen part of LAA was moved to lib/Transforms recently, so this patch can be landed again. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537	2020-05-20 10:53:40 +01:00
Brian Cain	cfba1a9668	[Hexagon] pX.new cannot be used with p3:0 as producer Writes to p3:0 do not produce new values, we should bar any .new consumer trying to use it as a producer.	2020-05-19 17:06:34 -05:00
Simon Pilgrim	cdafe59f95	TargetLoweringObjectFile.h - remove unnecessary includes. NFCI. Replace with forward declarations and move includes down to source files where required. I also needed to move the TargetLoweringObjectFile::SectionForGlobal wrapper implementation down into TargetLoweringObjectFile.cpp	2020-05-19 09:28:13 +01:00
Craig Topper	c9f63297e2	Fix several places that were calling verifyFunction or verifyModule without checking the return value. verifyFunction/verifyModule don't assert or error internally. They also don't print anything if you don't pass a raw_ostream to them. So the caller needs to check the result and ideally pass a stream to get the messages. Otherwise they're just really expensive no-ops. I've filed PR45965 for another instance in SLPVectorizer that causes a lit test failure. Differential Revision: https://reviews.llvm.org/D80106	2020-05-18 13:28:46 -07:00
Ties Stuij	8c24f33158	[IR][BFloat] Add BFloat IR type Summary: The BFloat IR type is introduced to provide support for, initially, the BFloat16 datatype introduced with the Armv8.6 architecture (optional from Armv8.2 onwards). It has an 8-bit exponent and a 7-bit mantissa and behaves like an IEEE 754 floating point IR type. This is part of a patch series upstreaming Armv8.6 features. Subsequent patches will upstream intrinsics support and C-lang support for BFloat. Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, sdesmalen, deadalnix, ctetreau Subscribers: hiraditya, llvm-commits, danielkiss, arphaman, kristof.beyls, dexonsmith Tags: #llvm Differential Revision: https://reviews.llvm.org/D78190	2020-05-15 14:43:43 +01:00
Xinglong Liao	5f3f45dc53	[Hexagon] Check isInstr() before getInstr() with SUnit An SUnit represents a MachineInstr in post-regalloc scheduling but an SDNode in pre-regalloc scheduling. Passing -enable-hexagon-sdnode-sched to the Hexagon backend at -O1 and above may therefore cause an assertion failure. Fixes PR45194. Differential Revision: https://reviews.llvm.org/D76134	2020-05-14 08:47:54 -05:00
Christopher Tetreault	2a77d1d0ed	[SVE] Remove usages of VectorType::getNumElements() from Hexagon Reviewers: efriedma, kmclaughlin, sdesmalen, kparzysz Reviewed By: kparzysz Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79819	2020-05-13 17:13:12 -07:00
Craig Topper	d1119980e5	[SelectionDAG] Use Align/MaybeAlign for ConstantPoolSDNode. This patch stores the alignment for ConstantPoolSDNode as an Align and updates the getConstantPool interface to take a MaybeAlign. Removing getAlignment() will be done as a follow up. Differential Revision: https://reviews.llvm.org/D79436	2020-05-08 16:04:11 -07:00
Simon Pilgrim	4e3c005554	[TTI] getScalarizationOverhead - use explicit VectorType operand getScalarizationOverhead is only ever called with vectors (and we already had a load of cast<VectorType> calls immediately inside the functions). Followup to D78357 Reviewed By: @samparker Differential Revision: https://reviews.llvm.org/D79341	2020-05-05 16:59:23 +01:00
Sam Parker	40574fefe9	[NFC][CostModel] Add TargetCostKind to relevant APIs Make the kind of cost explicit throughout the cost model which, apart from making the cost clear, will allow the generic parts to calculate better costs. It will also allow some backends to approximate and correlate the different costs if they wish. Another benefit is that it will also help simplify the cost model around immediate and intrinsic costs, where we currently have multiple APIs. RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html Differential Revision: https://reviews.llvm.org/D79002	2020-05-05 10:35:54 +01:00
Sam McCall	d10c995b4d	std::isspace -> llvm::isSpace (where locale should be ignored) I've left out some cases where I wasn't totally sure this was right or whether the include was ok (compiler-rt) or idiomatic (flang).	2020-05-02 15:36:04 +02:00
Suyog Sarda	ea093f6481	Handle cases for subregisters. While restoring latency, check whether any register of the source instruction is a subregister of a register in the successor instruction, in addition to checking for the same register.	2020-04-30 20:32:33 -05:00
Simon Pilgrim	090cae8491	[TTI] Add DemandedElts to getScalarizationOverhead The improvements to the x86 vector insert/extract element costs in D74976 resulted in the estimated costs for vector initialization and scalarization increasing higher than should be expected. This is particularly noticeable on pre-SSE4 targets where the availability of legal INSERT_VECTOR_ELT ops is more limited. This patch does 2 things: 1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of an ISD::BUILD_VECTOR pattern. 2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs. This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing. A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well. Reviewed By: @craig.topper Differential Revision: https://reviews.llvm.org/D78216	2020-04-29 12:00:38 +01:00
Krzysztof Parzyszek	25a4b1904c	Handle part-word LL/SC in atomic expansion pass Differential Revision: https://reviews.llvm.org/D77213	2020-04-28 10:07:39 -05:00
Sam Parker	e9c9329aa4	[TTI] Add TargetCostKind argument to getUserCost There are several different types of cost that TTI tries to provide explicit information for: throughput, latency, code size along with a vague 'intersection of code-size cost and execution cost'. The vectorizer is a keen user of RecipThroughput and there's at least 'getInstructionThroughput' and 'getArithmeticInstrCost' designed to help with this cost. The latency cost has a single use and a single implementation. The intersection cost appears to cover most of the rest of the API. getUserCost is explicitly called from within TTI when the user has been explicit in wanting the code size (also only one use) as well as a few passes which are concerned with a mixture of size and/or a relative cost. In many cases these costs are closely related, such as when multiple instructions are required, but one evident diverging cost in this function is for div/rem. This patch adds an argument so that the cost required is explicit, so that we can make the important distinction when necessary. Differential Revision: https://reviews.llvm.org/D78635	2020-04-28 08:57:45 +01:00
Simon Pilgrim	a3982491db	[Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h. Differential Revision: https://reviews.llvm.org/D78815	2020-04-26 12:58:20 +01:00
Fangrui Song	2cb48d620f	[TableGen] Drop deprecated leading # operation (NOP) and replace ## with #	2020-04-25 16:26:45 -07:00
Simon Pilgrim	fd8035cf32	HexagonShuffler.h - remove duplicate STLExtras.h include. NFC.	2020-04-24 13:27:56 +01:00
Krzysztof Parzyszek	5c7a2cfac1	[Hexagon] Fix result word order when bitcasting vector pred to int64/128	2020-04-23 19:15:11 -05:00
Christopher Tetreault	18c611ed92	[SVE] Remove calls to isScalable from Hexagon Reviewers: efriedma, sdesmalen, kparzysz, colinl Reviewed By: kparzysz Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77757	2020-04-23 14:02:14 -07:00
Kazuaki Ishizaki	0312b9f550	[llvm] NFC: Fix trivial typo in rst and td files Differential Revision: https://reviews.llvm.org/D77469	2020-04-23 14:26:32 +09:00
Simon Pilgrim	f8a5e746c6	[Hexagon] Remove unused forward declarations. NFC.	2020-04-22 18:26:50 +01:00
Benjamin Kramer	4b33c935db	[Hexagon] Silence warning llvm/lib/Target/Hexagon/HexagonTargetObjectFile.cpp:296:11: warning: enumeration value 'ScalableVectorTyID' not handled in switch [-Wswitch] switch (Ty->getTypeID()) { ^	2020-04-22 18:57:08 +02:00
Christopher Tetreault	2dea3f1298	[SVE] Add new VectorType subclasses Summary: Introduce new types for fixed width and scalable vectors. Does not remove getNumElements yet so as to not break code during transition period. Reviewers: deadalnix, efriedma, sdesmalen, craig.topper, huntergr Reviewed By: sdesmalen Subscribers: jholewinski, arsenm, jvesely, nhaehnle, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, kerbowa, Joonsoo, grosul1, frgossen, lldb-commits, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm, #lldb Differential Revision: https://reviews.llvm.org/D77587	2020-04-22 08:59:01 -07:00
Shengchen Kan	8bb059ab63	[MC][Bugfix] Remove redundant parameter for relaxInstruction Summary: Before this patch, `relaxInstruction` takes three arguments, the first argument refers to the instruction before relaxation and the third argument is the output instruction after relaxation. There are two quite strange things: 1) The first argument's type is `const MCInst &`, the third argument's type is `MCInst &`, but they may be aliased to the same variable 2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third argument is a fresh uninitialized `MCInst` even if `relaxInstruction` may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a loop. In this patch, we drop the third argument, and let `relaxInstruction` directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon. Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain Reviewed By: Razer6, MaskRay, bcain Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78364	2020-04-21 11:06:55 +08:00
Fraser Cormack	c819ef9653	Provide operand indices to adjustSchedDependency This allows targets to know exactly which operands are contributing to the dependency, which is required for targets with per-operand scheduling models. Differential Revision: https://reviews.llvm.org/D77135	2020-04-17 11:08:44 +01:00
Simon Pilgrim	bcd7f77713	MCObjectWriter.h - remove Endian.h/EndianStream.h/raw_ostream.h includes. NFC Push these includes down to the writers that actually need them, a number of which were implicitly relying on the MCObjectWriter.h.	2020-04-17 10:44:08 +01:00
Christopher Tetreault	e68f1f2d43	[SVE] Remove calls to getBitWidth from Hexagon Reviewers: efriedma, sdesmalen, kparzysz Reviewed By: kparzysz Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77899	2020-04-14 11:09:49 -07:00
Craig Topper	113f37a1f9	[CallSite removal][TargetLowering] Replace ImmutableCallSite with CallBase Differential Revision: https://reviews.llvm.org/D77995	2020-04-13 13:50:15 -07:00
Fangrui Song	d2e5157c1f	[MC] Add UseIntegratedAssembler = false. NFC	2020-04-11 10:13:49 -07:00
Matt Arsenault	84aa58cbe2	CodeGen: Use Register in TargetLowering	2020-04-08 12:10:58 -04:00
Matt Arsenault	2481f26ac3	CodeGen: Use Register in TargetFrameLowering	2020-04-07 17:07:44 -04:00

2661 Commits
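
Note on commit 8c24f33158 (BFloat IR type): the message describes a format with an 8-bit exponent and a 7-bit mantissa. Together with a sign bit, that is the upper 16 bits of an IEEE 754 binary32 value. The sketch below is only an illustration of that bit layout, not code from the patch. The helper names are hypothetical, `float` is assumed to be IEEE 754 binary32, and the conversion simply truncates (production conversions typically round to nearest even).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an LLVM API): keep the sign bit, the 8 exponent
// bits and the top 7 mantissa bits of a binary32 value, dropping the low
// 16 mantissa bits. Assumes float is IEEE 754 binary32. Truncates only.
inline std::uint16_t floatToBFloat16Bits(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits)); // well-defined type punning
  return static_cast<std::uint16_t>(Bits >> 16);
}

// Hypothetical helper: widen a bfloat16 bit pattern back to float by
// placing it in the upper half of a binary32 word and zero-filling the rest.
inline float bfloat16BitsToFloat(std::uint16_t H) {
  std::uint32_t Bits = static_cast<std::uint32_t>(H) << 16;
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}
```

Because the exponent field has the same width as in binary32, the widening direction is exact; only the narrowing direction loses precision.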