llvm-project

Commit Graph

Author	SHA1	Message	Date
Carl Ritson	cfbb92441f	[SDAG] Fix pow2 assumption when splitting vectors When reducing vector builds to shuffles it is possible that the DAG combiner may try to extract invalid subvectors. This happens as the existing code assumes vectors will be power of 2 sizes, which is already untrue, but becomes more noticeable with v6 and v7 types. Specifically the existing code assumes that half PowerOf2Ceil of a given vector index will fit twice into a given vector. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D103880	2021-06-11 08:58:16 +09:00
David Spickett	64de8763aa	Revert "Implementation of global.get/set for reftypes in LLVM IR" This reverts commit `31859f896c`. Causing SVE and RISCV-V test failures on bots.	2021-06-10 10:11:17 +00:00
Paulo Matos	31859f896c	Implementation of global.get/set for reftypes in LLVM IR This change implements new DAG nodes GLOBAL_GET/GLOBAL_SET, and lowering methods for load and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen pattern matches those and converts them to Wasm global.get/set. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D95425	2021-06-10 10:07:45 +02:00
Sanjay Patel	dd763ac791	[SDAG] fix miscompile from merging stores of different sizes As shown in: https://llvm.org/PR50623 ...and the similar tests here, we were not accounting for store merging of different sizes that do not cover the entire range of the wide value to be stored. This is the easy fix: just make sure that all of the original stores are the same size, so when we calculate the wide width, it's a simple N * M check. This still allows all of the motivating optimizations from: D86420 / `54a5dd485c` D87112 / `7a06b166b1` We could enhance this code to track individual bytes and allow merging multiple sizes.	2021-06-09 09:51:39 -04:00
Simon Pilgrim	114e712c34	InstrEmitter.cpp - don't dereference a dyn_cast<>. dyn_cast<> can return nullptr which we would then dereference - use cast<> which will assert that the type is correct.	2021-06-08 17:59:04 +01:00
Simon Pilgrim	61a2d6bfe4	[DAG] foldShuffleOfConcatUndefs - ensure shuffles of upper (undef) subvector elements is undef (PR50609) shuffle(concat(x,undef),concat(y,undef)) -> concat(shuffle(x,y),shuffle(x,y)) If the original shuffle references any of the upper (undef) subvector elements, ensure the split shuffle masks uses undef instead of an out-of-bounds value. Fixes PR50609	2021-06-08 15:49:41 +01:00
Hans Wennborg	386b66b2fc	Revert "3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"" > This reapplies `c0f3dfb9`, which was reverted following the discovery of > crashes on linux kernel and chromium builds - these issues have since > been fixed, allowing this patch to re-land. This reverts commit `36ec97f76a`. The change caused non-determinism in the compiler, see comments on the code review at https://reviews.llvm.org/D91722. Reverting to unbreak people's builds until that can be addressed. This also reverts the follow-up "[DebugInfo] Limit the number of values that may be referenced by a dbg.value" in `a0bd6105d8`.	2021-06-08 14:54:08 +02:00
David Green	b889c6ee99	[DAG] Allow isNullOrNullSplat to see truncated zeroes This sets the AllowTruncation flag on isConstOrConstSplat in isNullOrNullSplat, allowing it to see truncated constant zeroes on architectures such as AArch64, where only i32/i64 are legal. As a truncation of 0 is always 0, this should always be valid, allowing some extra folding to happen including some of the cases from D103755. Differential Revision: https://reviews.llvm.org/D103756	2021-06-08 10:18:58 +01:00
Arthur Eubanks	47211fa889	Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry" Needs to be discussed more. This reverts commit 255a5c1baa6020c009934b4fa342f9f6dbbcc46 This reverts commit df2056ff3730316f376f29d9986c9913b95ceb1 This reverts commit faff79b7ca144e505da6bc74aa2b2f7cffbbf23 This reverts commit d2a9020785c6e02afebc876aa2778fa64c5cafd	2021-06-07 16:07:44 -07:00
Guillaume Chatelet	1da2c7d25c	[NFC] Fix semantic discrepancy for MVT::LAST_VALUETYPE Differential Revision: https://reviews.llvm.org/D103251	2021-06-07 10:04:16 +00:00
Fraser Cormack	aec9cbbeb8	[SelectionDAG] Extend FoldConstantVectorArithmetic to SPLAT_VECTOR This patch extends the SelectionDAG's ability to constant-fold vector arithmetic to include support for SPLAT_VECTOR. This is not only for scalable-vector types but also for fixed-length vector types, which helps Hexagon in a couple of cases. The original RISC-V test case was in fact an infinite DAGCombine loop. The pattern `and (truncate v1), (truncate v2)` can be combined to `truncate (and v1, v2)` but the truncate can similarly be combined back to `and (truncate v1), (truncate v2)` (but, crucially, only when one of `v1` or `v2` is a constant vector). It wasn't exposed on fixed-length types because a TRUNCATE of a constant BUILD_VECTOR was folded into the BUILD_VECTOR itself, whereas this did not happen for the equivalent (scalable-vector) SPLAT_VECTOR. Reviewed By: RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D103246	2021-06-04 09:53:15 +01:00
Arthur Eubanks	9255a5c1ba	[TargetLowering] Only inspect attributes in the arguments for ArgListEntry Parameter attributes are considered part of the function [1], and like mismatched calling conventions [2], we can't have the verifier check for mismatched parameter attributes. Issues can be diagnosed with D103412. [1] https://llvm.org/docs/LangRef.html#parameter-attributes [2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D101806	2021-06-03 15:52:01 -07:00
Fraser Cormack	1de1887f5f	[CodeGen] Fix a scalable-vector crash in VSELECT legalization The `DAGTypeLegalizer::WidenVSELECTMask` function is not (yet) ready for scalable vector types, and has numerous places in which it tries to grab either the fixed size or number of elements of its types. I believe that it should be possible to update this method to properly account for scalable-vector types, but we don't have test cases for that; RISC-V bails out early on as it has legal i1 vector masks. As such, this patch just prevents it from crashing. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103536	2021-06-03 10:24:55 +01:00
Sanjay Patel	0718ac706d	[SDAG] allow cast folding for vector sext-of-setcc with signed compare This extends `434c8e013a` and `ede3982792` to handle signed predicates by sign-extending the setcc operands. This is not shown directly in https://llvm.org/PR50055 , but the pattern is visible by changing the unsigned convert to signed in the source code.	2021-06-02 15:05:02 -04:00
Sanjay Patel	ede3982792	[SDAG] allow more cast folding for vector sext-of-setcc This is a follow-up to D103280 that eases the use restrictions, so we can handle the motivating case from: https://llvm.org/PR50055 The loop code is adapted from similar use checks in ExtendUsesToFormExtLoad() and SliceUpLoad(). I did not see an easier way to filter out non-chain uses of load values. Differential Revision: https://reviews.llvm.org/D103462	2021-06-02 13:14:49 -04:00
Bjorn Pettersson	536e02a23c	[CodeGen] Refactor libcall lookups for RTLIB::POWI_* Use RuntimeLibcalls to get a common way to pick correct RTLIB::POWI_* libcall for a given value type. This includes a small refactoring of ExpandFPLibCall and ExpandArgFPLibCall in SelectionDAGLegalize to share a bit of code, plus adding an ExpandFPLibCall version that can be called directly when expanding FPOWI/STRICT_FPOWI to ensure that we actually use the same RTLIB::Libcall when expanding the libcall as we used when checking the legality of such a call by doing a getLibcallName check. Differential Revision: https://reviews.llvm.org/D103050	2021-06-02 11:40:34 +02:00
Bjorn Pettersson	d1273d39d3	[LegalizeTypes] Avoid promotion of exponent in FPOWI The FPOWI DAG node is normally lowered to a libcall to one of the RTLIB::POWI* runtime functions and the exponent should normally have a type matching sizeof(int) when making the call. Thus, type promotion of the exponent could lead to an FPOWI with a type for the second operand that would be incorrect when doing the libcall (a situation which would be hard to detect post-legalization if we allow such FPOWI nodes). This patch is changing DAGTypeLegalizer::PromoteIntOp_FPOWI to do the rewrite into a libcall directly instead of promoting the operand. This way we can check that the exponent is smaller than sizeof(int) and we can let TargetLowering handle promotion as part of making the libcall. It could be noticed here that makeLibCall has some knowledge about targets such as 64-bit RISCV, for which the libcall argument should be extended to a type larger than sizeof(int). Differential Revision: https://reviews.llvm.org/D102950	2021-06-02 11:40:34 +02:00
Sanjay Patel	1b14f3951a	[SDAG] add helper function for sext-of-setcc folds; NFC Try to make this easier to read as noted in D103280	2021-06-01 08:07:17 -04:00
Sanjay Patel	63fe4cb082	[SDAG] add check to sext-of-setcc fold to bypass changing a legal op I accidentally pushed a draft of D103280 that was discussed during the review, but it was not supposed to be the final version. Rather than revert and recommit, I'm updating the existing code. This way we have a record of the codegen diff that would result if we decide to remove this predicate in the future.	2021-05-31 08:58:11 -04:00
Sanjay Patel	434c8e013a	[SDAG] try harder to fold casts into vector compare sext (vsetcc X, Y) --> vsetcc (zext X), (zext Y) -- (when the zexts are free and a bunch of other conditions) We have a couple of similar folds to this already for vector selects, but this pattern slips through because it is only a setcc. The tests are based on the motivating case from: https://llvm.org/PR50055 ...but we need extra logic to get that example, so I've left that as a TODO for now. Differential Revision: https://reviews.llvm.org/D103280	2021-05-31 07:14:01 -04:00
Florian Hahn	126f90b252	[DAGCombine] Poison-prove scalarizeExtractedVectorLoad. extractelement is poison if the index is out-of-bounds, so just scalarizing the load may introduce an out-of-bounds load, which is UB. To avoid introducing new UB, we can mask the index so it only contains valid indices. Fixes PR50382. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D103077	2021-05-30 11:40:55 +01:00
Arthur Eubanks	71cca4f728	Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry" This reverts commit `1c7f32334d`. Some code still needs to properly set parameter ABI attributes, see D101806.	2021-05-29 23:08:15 -07:00
Arthur Eubanks	3a6f12f915	Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering" This reverts commit `bc7d15c61d`. Dependent change is to be reverted.	2021-05-29 22:40:33 -07:00
Eli Friedman	0b3b0a727a	[AArch64][RISCV] Make sure isel correctly honors failure orderings. If a cmpxchg specifies acquire or seq_cst on failure, make sure we generate code consistent with that ordering even if the success ordering is not acquire/seq_cst. At one point, it was ambiguous whether this sort of construct was valid, but the C++ standard and LLVM now accept arbitrary combinations of success/failure orderings. This doesn't address the corresponding issue in AtomicExpand. (This was reported as https://bugs.llvm.org/show_bug.cgi?id=33332 .) Fixes https://bugs.llvm.org/show_bug.cgi?id=50512. Differential Revision: https://reviews.llvm.org/D103284	2021-05-28 12:47:40 -07:00
Craig Topper	2830d924b0	[VP] Make getMaskParamPos/getVectorLengthParamPos return unsigned. Lowercase function names. Parameter positions seem like they should be unsigned. While there, make function names lowercase per coding standards. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D103224	2021-05-28 11:28:47 -07:00
Craig Topper	d24d2447cd	[SelectionDAG] Fix typo in assert. NFC	2021-05-28 10:37:11 -07:00
Tim Northover	9ff2eb1ea5	SwiftTailCC: teach verifier musttail rules applicable to this CC. SwiftTailCC has a different set of requirements than the C calling convention for a tail call. The exact argument sequence doesn't have to match, but fewer ABI-affecting attributes are allowed. Also make sure the musttail diagnostic triggers if a musttail call isn't actually a tail call.	2021-05-28 11:12:00 +01:00
Fraser Cormack	5a80dc4988	[VP][SelectionDAG] Add a target-configurable EVL operand type This patch adds a way for the target to configure the type it uses for the explicit vector length operands of VP SDNodes. The type must be a legal integer type (there is still no target-independent legalization of this operand) and must currently be at least as big as i32, the type used by the IR intrinsics. An implicit zero-extension takes place on targets which choose a larger type. All VP nodes should be created with this type used for the EVL operand. This allows 64-bit RISC-V to avoid custom legalization of all VP nodes, keeping them in their target-independent form for that bit longer. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D103027	2021-05-27 15:27:36 +01:00
Fraser Cormack	b7101e218c	[DAGCombine][RISCV] Don't try to trunc-store combined vector stores DAGCombine's `mergeStoresOfConstantsOrVecElts` optimization is told whether it's to use vector types and also whether it's to issue a truncating store. However, the truncating store code path assumes a scalar integer `ConstantSDNode`, and when using vector types it creates either a `BUILD_VECTOR` or `CONCAT_VECTORS` to store: neither of which is a constant. The `riscv64` target is able to expose a crash here because it switches on both code paths at the same time. The `f32` is stored as `i32` which must be promoted to `i64`, necessitating a truncating store. It also decides later that it prefers a vector store of `v2f32`. While vector truncating stores are legal, this combine is not able to emit them. We also don't have a test case. This patch adds an assert to catch this case more gracefully, and updates one of the function's callers to turn off the use of truncating stores when preferring vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103173	2021-05-27 14:16:32 +01:00
Fraser Cormack	772b58a641	[SelectionDAG][RISCV] Don't unroll 0/1-type bool VSELECTs This patch extends the cases in which the legalizer is able to express VSELECT in terms of XOR/AND/OR. When dealing with a VSELECT between boolean vector types, the mask itself is an all-zeros or all-ones value of the operand type, so a 0/1 boolean type behaves identically to a 0/-1 type. This greatly helps RISC-V which relies on expansion for these nodes. It also allows scalable-vector bool VSELECTs to use the default expansion, where before it would crash in SelectionDAG::UnrollVectorOp. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103147	2021-05-27 10:08:57 +01:00
Jonas Paulsson	d058262b14	[SystemZ] Support i128 inline asm operands. Support virtual, physical and tied i128 register operands in inline assembly. i128 is not really supported on SystemZ and is not a legal type; such a value will generally be split into two i64 parts. There are however some instructions that require a pair of two GPR64 registers contained in the GR128 bit reg class, which is untyped. For inline assembly operands, it proved to be very cumbersome to first follow the general behavior of splitting an i128 operand into two parts and then later rebuild the INLINEASM MI to have one GR128 register. Instead, some minor common code changes were made to SelectionDAGBuilder to only create one GR128 register part to begin with. In particular: - getNumRegisters() now has an optional parameter "RegisterVT" which is passed by AddInlineAsmOperands() and GetRegistersForValue(). - The bitcasting in GetRegistersForValue is not performed if RegVT is Untyped. - The RC for a tied use in AddInlineAsmOperands() is now computed either from the tied def (virtual register), or by getMinimalPhysRegClass() (physical register). - InstrEmitter.cpp:EmitCopyFromReg() has been fixed so that the register class (DstRC) can also be computed for an illegal type. In the SystemZ backend getNumRegisters(), splitValueIntoRegisterParts() and joinRegisterPartsIntoValue() have been implemented to handle i128 operands. Differential Revision: https://reviews.llvm.org/D100788 Review: Ulrich Weigand	2021-05-26 10:08:32 -05:00
Michael Liao	c9dd29925f	[SelectionDAG] Propagate scoped AA metadata when lowering mem intrinsics. - When memory intrinsics, such as memcpy, are lowered, the attached scoped AA metadata is not passed down to the backend. As a result, the backend cannot schedule relevant memory operations around them following that hint. In this patch, SelectionDAG is enhanced to propagate that metadata (scoped AA only) when they are lowered into loads and stores. Differential Revision: https://reviews.llvm.org/D102215	2021-05-25 14:42:26 -04:00
LemonBoy	fd5cc41818	[SelectionDAG] Fix argument copy elision with irregular types D29668 made it possible to avoid a useless copy of the argument value into an alloca if the caller places it in memory (as often happens on x86) by directly forwarding the pointer to it. This optimization is illegal if the type contains padding bytes: if a truncating store into the alloca is replaced, the upper bits are filled with garbage and the resulting code misbehaves at runtime. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D102153	2021-05-22 09:43:37 +02:00
Stephen Tozer	36ec97f76a	3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" This reapplies `c0f3dfb9`, which was reverted following the discovery of crashes on linux kernel and chromium builds - these issues have since been fixed, allowing this patch to re-land. This reverts commit `4397b7095d`.	2021-05-21 11:06:20 +01:00
Jessica Clarke	e10958c807	[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics Unlike normal loads these don't have an extension field, but we know from TargetLowering whether these are sign-extending or zero-extending, and so can optimise away unnecessary extensions. This was noticed on RISC-V, where sign extensions in the calling convention would result in unnecessary explicit extension instructions, but this also fixes some Mips inefficiencies. PowerPC sees churn in the tests as all the zero extensions are only for promoting 32-bit to 64-bit, but these zero extensions are still not optimised away as they should be, likely due to i32 being a legal type. This also simplifies the WebAssembly code somewhat, which currently works around the lack of target-independent combines with some ugly patterns that break once they're optimised away. Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits, where zero-extending atomics were incorrectly returning 0 rather than the (slightly confusing) required return value of 1. Re-landed again after D102819 fixed PowerPC to correctly zero-extend all of its atomics as it claimed to do, since the combination of that bug and this optimisation caused buildbot regressions. Reviewed By: RKSimon, atanasyan Differential Revision: https://reviews.llvm.org/D101342	2021-05-20 20:34:23 +01:00
Fraser Cormack	26bd2250c1	[RISCV] Ensure shuffle splat operands are type-legal The use of `SelectionDAG::getSplatValue` isn't guaranteed to return a type-legal splat value as it may implicitly extract a vector element from another shuffle. It is not permitted to introduce an illegal type when lowering shuffles. This patch addresses the crash by adding a boolean flag to `getSplatValue`, defaulting to false, which when set will ensure a type-legal return value. If it is unable to do that it will fail to return a splat value. I've been through the existing uses of `getSplatValue` in other targets and was unable to find a need or test cases showing a need to update their uses. In some cases, the call is made during `LegalizeVectorOps` which may still produce illegal scalar types. In other situations, the illegally-typed splat value may be quickly patched up to a legal type (such as any-extending the returned `extract_vector_elt` up to a legal type) before `LegalizeDAG` notices. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102687	2021-05-20 18:00:03 +01:00
Stephen Tozer	cf725dde9c	[DebugInfo] Handle DIArgList in FastISel or GlobalISel Currently, variadic dbg.values (i.e. those using a DIArgList as part of their location) are not handled properly by FastISel or GlobalISel, and will produce invalid DBG_VALUE instructions if they encounter them. This patch fixes this issue by emitting undef DBG_VALUE instructions for variadic dbg.values, so that no incorrect instruction is produced and any prior variable location is terminated. This is simply a quick-fix to prevent errors; a correct implementation should come later for these ISel pipelines to ensure that we do not drop debug information unnecessarily. Differential Revision: https://reviews.llvm.org/D102500	2021-05-20 17:37:28 +01:00
David Sherwood	a21bff0673	[CodeGen] Add support for widening the result of EXTRACT_SUBVECTOR When trying to return a type such as <vscale x 1 x i32> from a function we crash in DAGTypeLegalizer::WidenVecRes_EXTRACT_SUBVECTOR when attempting to get the fixed number of elements in the vector. For the simple case we are dealing with, i.e. extracting <vscale x 1 x i32> from index 0 of input vector <vscale x 4 x i32> we can simply rely upon existing code that just returns the input. Differential Revision: https://reviews.llvm.org/D102605	2021-05-20 12:27:08 +01:00
David Sherwood	d07d5c1b06	[CodeGen] Add support for widening INSERT_SUBVECTOR operands When attempting to return something like a <vscale x 1 x i32> type from a function we end up trying to widen the vector by inserting a <vscale x 1 x i32> subvector into an undefined <vscale x 4 x i32> vector. However, during legalisation we then attempt to widen the INSERT_SUBVECTOR operands and hit an error in WidenVectorOperand. This patch adds a new WidenVecOp_INSERT_SUBVECTOR function that currently only supports inserting subvectors into undefined vectors. Differential Revision: https://reviews.llvm.org/D102501	2021-05-20 10:37:03 +01:00
Sanjay Patel	6025663578	[SDAG] propagate FMF from target-specific IR intrinsics This is a step towards relying more on node-level FMF rather than function-wide or target settings. I think it was just an oversight that we didn't get this path in D87361 or follow-on patches. The lack of FMF propagation is blocking D90901 from converting tests to IR-level FMF. We can't do much more than this currently because we also fail to propagate flags from x86-specific node to generic FMA node. That would be another patch, so the test just verifies that we can transfer from IR to initial SDAG node. Differential Revision: https://reviews.llvm.org/D102725	2021-05-19 07:50:50 -04:00
Arthur Eubanks	bc7d15c61d	[NFC] Use ArgListEntry indirect types more in ISel lowering For opaque pointers, we're trying to avoid uses of PointerType::getElementType(). A couple of ISel places use PointerType::getElementType(). Some of these are easy to fix by using ArgListEntry's indirect types. The inalloca type wasn't stored there, as opposed to preallocated and byval which have their indirect types available, so add it and use it. This is a reland after an MSan fix in D102667. Differential Revision: https://reviews.llvm.org/D101713	2021-05-18 14:30:22 -07:00
Arthur Eubanks	1c7f32334d	[TargetLowering] Only inspect attributes in the arguments for ArgListEntry Parameter attributes are considered part of the function [1], and like mismatched calling conventions [2], we can't have the verifier check for mismatched parameter attributes. This is a reland after fixing MSan issues in D102667. [1] https://llvm.org/docs/LangRef.html#parameter-attributes [2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D101806	2021-05-18 14:30:22 -07:00
Ten Tzen	797ad70152	[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1 This patch is the Part-1 (FE Clang) implementation of HW Exception handling. This new feature adds the support of Hardware Exception for Microsoft Windows SEH (Structured Exception Handling). This is the first step of this project; only X86_64 target is enabled in this patch. Compiler options: For clang-cl.exe, the option is -EHa, the same as MSVC. For clang.exe, the extra option is -fasync-exceptions, plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual. NOTE: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change. The rules for C code: For C-code, one way (MSVC approach) to achieve SEH -EHa semantics is to follow three rules: * First, no exception can move in or out of a _try region, i.e., no potentially faulty instruction can be moved across a _try boundary. * Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees). * Finally, global states (local/global/heap variables) that can be read outside of _try region must be updated in memory (not just in register) before the subsequent exception occurs. The impact to C++ code: Although SEH is a feature for C code, -EHa does have a profound effect on C++ side. When a C++ function (in the same compilation unit with option -EHa ) is called by a SEH C function, a hardware exception that occurs in C++ code can also be handled properly by an upstream SEH _try-handler or a C++ catch(...). As such, when that happens in the middle of an object's life scope, the dtor must be invoked the same way as C++ Synchronous Exception during unwinding process. Design: A natural way to achieve the rules above in LLVM today is to allow an EH edge added on memory/computation instruction (previous iload/istore idea) so that exception path is modeled in Flow graph precisely. 
However, tracking every single memory instruction and potential faulty instruction can create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations aware of the new semantics is also a substantial effort. This design does not intend to model exception path at instruction level. Instead, the proposed design tracks and reports EH state at BLOCK-level to reduce the complexity of flow graph and minimize the performance-impact on CPP code under -EHa option. One key element of this design is the ability to compute State number at block-level. Our algorithm is based on the following rationales: A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping into a _try is not allowed. The single entry must start with a seh_try_begin() invoke with a correct State number that is the initial state of the SEME. Through control-flow, state number is propagated into all blocks. Side exits marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[]. Note side exits can ONLY jump into parent scopes (lower state number). Thus, when a block succeeds various states from its predecessors, the lowest State trumps the others. If some exits flow to unreachable, propagation on those paths terminates, not affecting remaining blocks. For CPP code, object lifetime region is usually a SEME as SEH _try. However there is one rare exception: jumping into a lifetime that has Dtor but has no Ctor is warned, but allowed: Warning: jump bypasses variable with a non-trivial destructor In that case, the region is actually a MEME (multiple entry multiple exits). Our solution is to inject an eha_scope_begin() invoke in the side entry block to ensure a correct State. Implementation: Part-1: Clang implementation described below. Two intrinsics are created to track CPP object scopes; eha_scope_begin() and eha_scope_end(). 
_scope_begin() is immediately added after ctor() is called and EHStack is pushed. So it must be an invoke, not a call. With that it's also guaranteed an EH-cleanup-pad is created regardless of whether there exists a call in this scope. _scope_end is added before dtor(). These two intrinsics make the computation of Block-State possible in downstream code gen pass, even in the presence of ctor/dtor inlining. Two intrinsics, seh_try_begin() and seh_try_end(), are added for C-code to mark _try boundary and to prevent exceptions from being moved across _try boundary. All memory instructions inside a _try are considered as 'volatile' to assure 2nd and 3rd rules for C-code above. This is slightly suboptimal, but it's acceptable as the amount of code directly under _try is very small. Part-2 (will be in Part-2 patch): LLVM implementation described below. For both C++ & C-code, the state of each block is computed at the same place in BE (WinEHPreparing pass) where all other EH tables/maps are calculated. In addition to _scope_begin & _scope_end, the computation of block state also relies on the existing State tracking code (UnwindMap and InvokeStateMap). For both C++ & C-code, the state of each block with potential trap instruction is marked and reported in DAG Instruction Selection pass, the same place where the state for -EHsc (synchronous exceptions) is done. If the first instruction in a reported block scope can trap, a Nop is injected before this instruction. This nop is needed to accommodate LLVM Windows EH implementation, in which the address in IPToState table is offset by +1. (note the purpose of that is to ensure the return address of a call is in the same scope as the call address.) The handler for catch(...) for -EHa must handle HW exception. So its 'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches C++ exceptions). Suppress push/popTerminate() scope (from noexcept/nothrow) so that HW exceptions can be passed through. 
Original llvm-dev [RFC] discussions can be found in these two threads below: https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html Differential Revision: https://reviews.llvm.org/D80344/new/	2021-05-17 22:42:17 -07:00
Serguei Katkov	57c660f374	[Statepoint Lowering] Cleanup: remove unused option statepoint-always-spill-base.	2021-05-18 12:15:15 +07:00
Simon Pilgrim	c29522d648	[TargetLowering] prepareUREMEqFold/prepareSREMEqFold - account for non legal shift types Ensure we tell getShiftAmountTy that we're working with pre-legalized types to prevent cases where the (legalized) shift type can no longer handle the (non-legalized) type width. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34366	2021-05-17 11:03:27 +01:00
Fraser Cormack	85e31eddf2	[DAGCombiner] Relax an assertion to an early return The select-of-constants transform was asserting that its constant vector inputs did not implicitly truncate their input without that as an explicit precondition to the function. This patch relaxes that assertion into an early return to skip the optimization. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D102393	2021-05-17 09:15:55 +01:00
Arthur Eubanks	341902672c	Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry" This reverts commit `16748bd2fb`. Causes https://crbug.com/1209013	2021-05-16 22:02:10 -07:00
Arthur Eubanks	7647cb14dc	Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering" This reverts commit `85af8a8c1b`.	2021-05-16 22:00:54 -07:00
Pan, Tao	976a3e5f61	[SelectionDAG] Make fast and linearize visible by clang -pre-RA-sched ScheduleDAGFast.cpp is compiled to an object file, but that object file isn't linked into the clang executable because no symbol in it is referenced from outside. Add calls to the createXxx functions of ScheduleDAGFast.cpp so that the ScheduleDAGFast object file is linked into the clang executable. The static RegisterScheduler will then register the fast and linearize schedulers at clang startup. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D101601	2021-05-17 11:25:15 +08:00
Nikita Popov	fb9ed1979a	[IR] Add BasicBlock::isEntryBlock() (NFC) This is a recurring and somewhat awkward pattern. Add a helper method for it.	2021-05-15 12:41:58 +02:00
Sanjay Patel	9dfd7f9b67	[SDAG] reduce code duplication for extend_vec_inreg combines; NFC These are identical so far, and I was looking at adding a fold for a pattern with scalar_to_vector which would also end up duplicated.	2021-05-14 08:29:57 -04:00
Tim Northover	ea0eec69f1	IR+AArch64: add a "swiftasync" argument attribute. This extends any frame record created in the function to include that parameter, passed in X22. The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001 in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect of this is that tools walking the stack should expect to see one of three values there: * 0b0000 => a normal, non-extended record with just [FP, LR] * 0b0001 => the extended record [X22, FP, LR] * 0b1111 => kernel space, and a non-extended record. All other values are currently reserved. If compiling for arm64e this context pointer is address-discriminated with the discriminator 0xc31a and the DB (process-specific) key. There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing front-ends access to this slot (and forcing its creation initialized to nullptr if necessary).	2021-05-14 11:43:58 +01:00
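The tag values listed in the swiftasync commit above can be illustrated with a small stack-walker helper. This is only a sketch: `FrameRecordKind` and `classifyFrameRecord` are hypothetical names introduced here, not part of LLVM, and the sketch assumes the walker has already read the stored FP as a 64-bit value.

```cpp
#include <cstdint>

// Kinds of frame record, keyed by bits 63:60 of the stored FP.
// The values come from the commit message; the names are hypothetical.
enum class FrameRecordKind { Normal, ExtendedAsync, Kernel, Reserved };

// Classify a stored FP value by inspecting its top four bits.
inline FrameRecordKind classifyFrameRecord(std::uint64_t StoredFP) {
  switch (StoredFP >> 60) {
  case 0x0:
    return FrameRecordKind::Normal;        // non-extended record [FP, LR]
  case 0x1:
    return FrameRecordKind::ExtendedAsync; // extended record [X22, FP, LR]
  case 0xF:
    return FrameRecordKind::Kernel;        // kernel space, non-extended record
  default:
    return FrameRecordKind::Reserved;      // all other values are reserved
  }
}
```

A tool walking the stack would call `classifyFrameRecord` on each saved FP before deciding whether an extra context slot is present in the record.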
cynecx	8ec9fd4839	Support unwinding from inline assembly I've taken the following steps to add unwinding support from inline assembly: 1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax: ``` invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"() to label %exit unwind label %uexit ``` 2) Add Bitcode writing/reading support + LLVM-IR parsing. 3) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled. 4) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind. 5) Add clang support by introducing a new clobber: "unwind", which lowers to `canThrow` being enabled. 6) Don't allow unwinding callbr. Reviewed By: Amanieu Differential Revision: https://reviews.llvm.org/D95745	2021-05-13 19:13:03 +01:00
Max Kazantsev	d8b37de8a4	[GC][NFC] Move GCStrategy from CodeGen to IR We want it to be available in analyses so that we can use the CodeGen notion in middle-end passes (for example, to check if a GC may free some particular pointer). This is a preparatory patch that simply moves the files around. Note: if this causes build issues, this patch can simply be reverted. Differential Revision: https://reviews.llvm.org/D100557 Reviewed By: reames	2021-05-13 12:31:59 +07:00
Fraser Cormack	c5ec00e62b	[TargetLowering] Improve legalization of scalable vector types This patch extends the vector type-conversion and legalization capabilities of scalable vector types. Firstly, `vscale x 1` types now behave more like the corresponding `vscale x 2+` types. This enables the integer promotion legalization of extended scalable types, such as the promotion of `<vscale x 1 x i5>` to `<vscale x 1 x i8>`. These `vscale x 1` types are also now better handled by `getVectorTypeBreakdown`, where what looks like older handling for 1-element fixed-length vector types was spuriously updated to include scalable types. Widening of scalable types is now better supported, by using `INSERT_SUBVECTOR` to insert the smaller scalable vector "value" type into the wider scalable vector "part" type. This allows AArch64 to pass and return `vscale x 1` types by value by widening. There are still cases where we are unable to legalize `vscale x 1` types, such as where expansion would require splitting the vector in two. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D102073	2021-05-12 16:33:07 +01:00
Stefan Pintilie	8d37411e48	Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics" This reverts commit `6c80361b84`. Breaks PowerPC Big Endian buildbots.	2021-05-12 09:46:18 -05:00
Hendrik Greving	762ac725bf	[DAGCombiner] Fix DAG combine store elimination, different address space. Fixes a bug in the DAG combiner where stores were eliminated because it failed to inspect the address space of the pointers. %v = load %ptr_as1 // no chain side effect store %v, %ptr_as2 As well as store %v, %ptr_as1 store %v, %ptr_as2 Fixes a test for the above in X86. Differential Revision: https://reviews.llvm.org/D102096	2021-05-12 07:14:22 -07:00
Arthur Eubanks	85af8a8c1b	[NFC] Use ArgListEntry indirect types more in ISel lowering For opaque pointers, we're trying to avoid uses of PointerType::getElementType(). A couple of ISel places use PointerType::getElementType(). Some of these are easy to fix by using ArgListEntry's indirect types. The inalloca type wasn't stored there, as opposed to preallocated and byval which have their indirect types available, so add it and use it. Differential Revision: https://reviews.llvm.org/D101713	2021-05-10 13:05:15 -07:00
Arthur Eubanks	16748bd2fb	[TargetLowering] Only inspect attributes in the arguments for ArgListEntry Parameter attributes are considered part of the function [1], and like mismatched calling conventions [2], we can't have the verifier check for mismatched parameter attributes. [1] https://llvm.org/docs/LangRef.html#parameter-attributes [2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D101806	2021-05-10 12:35:11 -07:00
Bradley Smith	635164b95a	[AArch64][SVE] Improve SVE codegen for fixed length BITCAST Expanding a fixed length operation involves wrapping the operation in an insert/extract subvector pair, as such, when this is done to bitcast we end up with an extract_subvector of a bitcast. DAGCombine tries to convert this into a bitcast of an extract_subvector which restores the initial fixed length bitcast, causing an infinite loop of legalization. As part of this patch, we must make sure the above DAGCombine does not trigger after legalization if the created bitcast would not be legal. Differential Revision: https://reviews.llvm.org/D101990	2021-05-10 14:43:53 +01:00
Fraser Cormack	6db0cedd23	[LegalizeVectorOps][RISCV] Add scalable-vector SELECT expansion This patch extends VectorLegalizer::ExpandSELECT to permit expansion also for scalable vector types. The only real change is conditionally checking for BUILD_VECTOR or SPLAT_VECTOR legality depending on the vector type. We can use this to fix "cannot select" errors for scalable vector selects on the RISCV target. Note that in future patches RISCV will possibly custom-lower vector SELECTs to VSELECTs for branchless codegen. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D102063	2021-05-10 08:22:35 +01:00
Simon Pilgrim	dd21c6b843	[DAG] Ensure all SD classes consistently return a const reference with getDebugLoc(). NFCI. Avoids a lot of unnecessary tracking increments/decrements of the underlying TrackingMDNodeRef.	2021-05-07 14:48:23 +01:00
Stephen Tozer	ce0c1f3ced	[DebugInfo] Fix crash when emitting an invalidated SDDbgValue This patch fixes a crash in the compiler that occurs when certain invalidated SDDbgValues are emitted. The cause of this was that we would attempt to check the liveness of the debug value's operands, which triggers an assert if any of those operands are invalid. This patch changes this check such that it only occurs if the SDDbgValue is valid; if not, the check is irrelevant anyway, so can be safely ignored. Differential Revision: https://reviews.llvm.org/D101540	2021-05-07 13:13:56 +01:00
Simon Pilgrim	280aa3415e	[DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts Based off a discussion on D89281 - where the AARCH64 implementations were being replaced to use funnel shifts. Any target that has efficient funnel shift lowering can handle the shift parts expansion using the same expansion, avoiding a lot of duplication. I've generalized the X86 implementation and moved it to TargetLowering - so far I've found that AARCH64 and AMDGPU benefit, but many other targets (ARM, PowerPC + RISCV in particular) could easily use this with a few minor improvements to their funnel shift lowering (or the folding of their target ops that funnel shifts lower to). NOTE: I'm trying to avoid adding full SHIFT_PARTS legalizer handling as I think it might actually be possible to remove these opcodes in the medium-term and use funnel shift / libcall expansion directly. Differential Revision: https://reviews.llvm.org/D101987	2021-05-07 13:12:30 +01:00
Jessica Clarke	6c80361b84	[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics Unlike normal loads these don't have an extension field, but we know from TargetLowering whether these are sign-extending or zero-extending, and so can optimise away unnecessary extensions. This was noticed on RISC-V, where sign extensions in the calling convention would result in unnecessary explicit extension instructions, but this also fixes some Mips inefficiencies. PowerPC sees churn in the tests as all the zero extensions are only for promoting 32-bit to 64-bit, but these zero extensions are still not optimised away as they should be, likely due to i32 being a legal type. This also simplifies the WebAssembly code somewhat, which currently works around the lack of target-independent combines with some ugly patterns that break once they're optimised away. Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits, where zero-extending atomics were incorrectly returning 0 rather than the (slightly confusing) required return value of 1. Reviewed By: RKSimon, atanasyan Differential Revision: https://reviews.llvm.org/D101342	2021-05-06 04:01:20 +01:00
Jessica Clarke	897d7bceb9	Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics" This seems to have broken sanitizers, giving lots of Assertion `NumBits <= MAX_INT_BITS && "bitwidth too large"' failed. failures across multiple targets (currently X86 and PowerPC). Reverting until I have a chance to reproduce and debug. This reverts commit `6e876f9ded`.	2021-05-05 17:02:05 +01:00
Jessica Clarke	6e876f9ded	[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics Unlike normal loads these don't have an extension field, but we know from TargetLowering whether these are sign-extending or zero-extending, and so can optimise away unnecessary extensions. This was noticed on RISC-V, where sign extensions in the calling convention would result in unnecessary explicit extension instructions, but this also fixes some Mips inefficiencies. PowerPC sees churn in the tests as all the zero extensions are only for promoting 32-bit to 64-bit, but these zero extensions are still not optimised away as they should be, likely due to i32 being a legal type. This also simplifies the WebAssembly code somewhat, which currently works around the lack of target-independent combines with some ugly patterns that break once they're optimised away. Reviewed By: RKSimon, atanasyan Differential Revision: https://reviews.llvm.org/D101342	2021-05-05 16:34:45 +01:00
Christudasan Devadasan	80c79035ef	DAG: Cleanup assertion in EmitFuncArgumentDbgValue Removing an assertion introduced with D68945. The patch was later reverted with `6531a78ac4`, but failed to remove this assertion. It causes a problem while trying to split a 64-bit argument into sub registers. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D101594	2021-05-04 21:48:58 +05:30
Tomas Matheson	9d86095ff8	Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" This reverts commit `753185031d`.	2021-05-03 21:48:20 +01:00
Tomas Matheson	753185031d	[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164 Originally submitted as `3338290c18`. Reverted in `c7df6b1223`.	2021-05-03 20:25:15 +01:00
Craig Topper	6430430958	[TableGen] Use sign rotated VBR for OPC_EmitInteger. This allows for a much more efficient encoding for small negative numbers by storing the sign bit first and negating the rest of the bits. This was already being used for OPC_CheckInteger. For every in tree target this affects, the table got smaller. R600GenDAGISel.inc saw the largest reduction of 7K. I did have to add a new opcode for StringIntegers used for register class ids and subregister indices since we don't have the integer value to encode. The enum name is emitted directly into the table. Previously we assumed the enum would expand to a positive 7-bit number. We might be able to just shift that right by 1 and assume it is a positive 6 bit number, but that will need more investigation.	2021-05-02 12:40:44 -07:00
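The space saving described in the sign rotated VBR commit above can be seen in a minimal sketch. The helper names below (`signRotate`, `signUnrotate`, `encodeVBR`, `decodeVBR`) are hypothetical and are not the TableGen functions themselves; the sketch assumes a VBR of 7 payload bits per byte with the high bit as a continuation flag, and treats the rotated value 1 ("negative zero") as the minimum 64-bit integer.

```cpp
#include <cstdint>
#include <vector>

// Sign-rotate: put the sign in bit 0 and the magnitude in the bits above it.
inline std::uint64_t signRotate(std::int64_t V) {
  std::uint64_t U = static_cast<std::uint64_t>(V);
  if (V >= 0)
    return U << 1;
  return ((~U + 1) << 1) | 1; // ~U + 1 is the magnitude of a negative V
}

// Inverse of signRotate. The rotated value 1 would mean "-0", which this
// sketch reuses for the minimum 64-bit integer.
inline std::int64_t signUnrotate(std::uint64_t R) {
  std::uint64_t Mag = R >> 1;
  if ((R & 1) == 0)
    return static_cast<std::int64_t>(Mag);
  if (R == 1)
    return INT64_MIN;
  return static_cast<std::int64_t>(~Mag + 1);
}

// Variable bit rate encoding: 7 payload bits per byte, low bits first,
// high bit set on every byte except the last.
inline std::vector<std::uint8_t> encodeVBR(std::uint64_t V) {
  std::vector<std::uint8_t> Out;
  while (V >= 128) {
    Out.push_back(static_cast<std::uint8_t>((V & 127) | 128));
    V >>= 7;
  }
  Out.push_back(static_cast<std::uint8_t>(V));
  return Out;
}

// Decode a well-formed VBR byte sequence produced by encodeVBR.
inline std::uint64_t decodeVBR(const std::vector<std::uint8_t> &Bytes) {
  std::uint64_t V = 0;
  unsigned Shift = 0;
  for (std::uint8_t B : Bytes) {
    V |= static_cast<std::uint64_t>(B & 127) << Shift;
    Shift += 7;
    if ((B & 128) == 0)
      break;
  }
  return V;
}
```

Under these assumptions, -1 rotates to 3 and fits in a single byte, whereas feeding the raw two's complement bit pattern of -1 to `encodeVBR` takes ten bytes.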
Nathan Chancellor	4397b7095d	Revert "Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"" This reverts commit `791930d740`, as per https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy. I observed breakage with the Linux kernel, as reported at https://reviews.llvm.org/D91722#2724321 Fixes exist at https://reviews.llvm.org/D101523 https://reviews.llvm.org/D101540 but they have not landed so to unbreak the tree for the weekend, revert this commit. Commit `b11e4c9907` ("Revert "[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands"") only reverted one follow-up fix, not the original patch that broke the kernel.	2021-04-30 20:23:21 -07:00
Tomas Matheson	c7df6b1223	Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0" This reverts commit `3338290c18`. Broke expensive checks on debian.	2021-04-30 16:53:14 +01:00
Tomas Matheson	3338290c18	[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0 atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164	2021-04-30 16:40:33 +01:00
Craig Topper	0c330afdfa	[RISCV] Enable SPLAT_VECTOR for fixed vXi64 types on RV32. This replaces D98479. This allows type legalization to form SPLAT_VECTOR_PARTS so we don't lose the splattedness when the scalar type is split. I'm handling SPLAT_VECTOR_PARTS for fixed vectors separately so we can continue using non-VL nodes for scalable vectors. I limited to RV32+vXi64 because DAGCombiner::visitBUILD_VECTOR likes to form SPLAT_VECTOR before seeing if it can replace the BUILD_VECTOR with other operations. Especially interesting is a splat BUILD_VECTOR of the extract_vector_elt which can become a splat shuffle, but won't if we form SPLAT_VECTOR first. We either need to reorder visitBUILD_VECTOR or add visitSPLAT_VECTOR. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D100803	2021-04-29 08:20:09 -07:00
Craig Topper	3067520bf4	[SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT Previously we used an i32 constant to store the saturation width, but i32 isn't legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the type legalizer. This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes it opaque to the type legalizer. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101262	2021-04-27 14:38:42 -07:00
Dávid Bolvanský	ef2dc7ed9f	[Analysis] Attribute alignment should not prevent tail call optimization Fixes tail folding issue mentioned in D100879. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D101230	2021-04-24 19:57:42 +02:00
Stephen Tozer	791930d740	Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" Previous build failures were caused by an error in bitcode reading and writing for DIArgList metadata, which has been fixed in `e5d844b587`. There were also some unnecessary asserts that were being triggered on certain builds, which have been removed. This reverts commit `dad5caa59e`.	2021-04-23 10:54:01 +01:00
Craig Topper	5185b52988	[RISCV] Fix crash with fptosi.sat/fptoui.sat intrinsics on RV64. Add test cases. Add PromoteIntOp_FP_TO_XINT_SAT to type legalize the bit width operand from i32 to i64 for RV64. Add test cases for the saturating intrinsics for half/float/double and i32/i64. CodeGen is definitely not optimal. We can probably make use of the native behavior of fcvt instructions in many cases. Fixes PR50083	2021-04-22 15:18:15 -07:00
Jun Ma	978eb3f168	[DAGCombiner] Allow operand of step_vector to be negative. It is valid to relax the non-negative limitation on the step_vector operand. This patch also adds another combine for step_vector: (sub X, step_vector(C)) -> (add X, step_vector(-C)) Differential Revision: https://reviews.llvm.org/D100812	2021-04-22 20:58:03 +08:00
Simon Pilgrim	d860bf2d0e	[DAG] TargetLowering.cpp - breakup if-else chains where each block returns. NFCI. Match style guide that requests that if+return blocks are separate.	2021-04-21 11:17:27 +01:00
Fraser Cormack	c141bd3cf9	[DAGCombiner] Support all-ones/all-zeros SPLAT_VECTOR in more combines This patch adds incrementally-better support for SPLAT_VECTOR in a handful of vector combines by changing a few more isBuildVectorAllOnes/isBuildVectorAllZeros to the equivalent isConstantSplatVectorAllOnes/Zeros calls. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D100851	2021-04-21 11:05:37 +01:00
Philip Reames	4824d876f0	Revert "Allow invokable sub-classes of IntrinsicInst" This reverts commit `d87b9b81cc`. Post commit review raised concerns, reverting while discussion happens.	2021-04-20 15:38:38 -07:00
Philip Reames	d87b9b81cc	Allow invokable sub-classes of IntrinsicInst It used to be that all of our intrinsics were call instructions, but over time, we've added more and more invokable intrinsics. According to the verifier, we're up to 8 right now. As IntrinsicInst is a sub-class of CallInst, this puts us in an awkward spot where the idiomatic means to check for intrinsic has a false negative if the intrinsic is invoked. This change switches IntrinsicInst from being a sub-class of CallInst to being a subclass of CallBase. This allows invoked intrinsics to be instances of IntrinsicInst, at the cost of requiring a few more casts to CallInst in places where the intrinsic really is known to be a call, not an invoke. After this lands and has baked for a couple days, planned cleanups: Make GCStatepointInst an IntrinsicInst subclass. Merge intrinsic handling in InstCombine and use idiomatic visitIntrinsicInst entry point for InstVisitor. Do the same in SelectionDAG. Do the same in FastISEL. Differential Revision: https://reviews.llvm.org/D99976	2021-04-20 15:03:49 -07:00
Simon Pilgrim	2a419a0b99	[X86][SSE] combineX86ShuffleChain - check if we're blending with zero into already zero elements Add a SelectionDAG::MaskedElementsAreZero helper that wraps SelectionDAG::MaskedValueIsZero testing for entirely zero vector elements	2021-04-20 17:09:49 +01:00
Simon Pilgrim	e156f2515c	[DAG] SelectionDAG.cpp - breakup if-else chains where each block returns. NFCI. Match style guide that requests that if+return blocks are separate.	2021-04-20 12:37:00 +01:00
Jun Ma	1ef5699d1a	[DAGCombiner] Support fold zero scalar vector. This patch changes ISD::isBuildVectorAllZeros to ISD::isConstantSplatVectorAllZeros which handles zero scalar vector. TestPlan: check-llvm Differential Revision: https://reviews.llvm.org/D100813	2021-04-20 16:28:43 +08:00
Fraser Cormack	457da7f298	[SelectionDAG] Relax constraints on STEP_VECTOR step operand This patch relaxes the requirement that the STEP_VECTOR step constant must be of a type at least as large as the vector element type. That requirement prevents its use on targets which have legal vector element types larger than the largest legal scalar type, such as i64 vectors on RV32. As such, the requirement has been loosened so that the step operand may be of any scalar type so long as the constant immediate is non-negative and the value fits inside the vector element type. This limits combining optimizations in certain circumstances but in practice it's unlikely to be a hindrance. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D100660	2021-04-20 08:41:42 +01:00
David Sherwood	83f5fa519e	[CodeGen] Improve code generation for clamping of constant indices with scalable vectors When trying to clamp a constant index into a scalable vector we can test if the index is less than the minimum number of elements in the vector. If so, we can simply return the index because we know it is guaranteed to fit inside the vector. Differential Revision: https://reviews.llvm.org/D100639	2021-04-19 08:34:17 +01:00
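The reasoning in the clamping commit above is that a scalable vector always has at least its minimum element count, so a constant index below that minimum is in range for every runtime size. A minimal model of that check follows; `VecShape` and `clampConstantIndex` are hypothetical names, the fixed-length branch is an assumption of this sketch rather than a description of the patch, and `MinElts` is assumed to be at least 1.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a vector type: MinElts elements, multiplied by
// an unknown runtime factor (>= 1) when Scalable is true.
struct VecShape {
  std::uint64_t MinElts;
  bool Scalable;
};

// Returns the clamped index when it can be decided at compile time, or
// std::nullopt when a runtime clamp is still required.
inline std::optional<std::uint64_t> clampConstantIndex(VecShape V,
                                                       std::uint64_t Idx) {
  if (Idx < V.MinElts)
    return Idx;           // guaranteed to fit, whatever the runtime size is
  if (!V.Scalable)
    return V.MinElts - 1; // fixed length: clamp to the last element
  return std::nullopt;    // scalable and possibly out of range
}
```

For a shape with a minimum of 4 elements, index 3 is returned unchanged, while index 5 on a scalable shape still needs a runtime comparison.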
Serge Guelton	d6de1e1a71	Normalize interaction with boolean attributes Such attributes can either be unset, or set to "true" or "false" (as string). Throughout the codebase, this led to inelegant checks ranging from if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") to if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") Introduce a getValueAsBool that normalizes the check, with the following behavior: no attribute or attribute set to "false" => return false; attribute set to "true" => return true Differential Revision: https://reviews.llvm.org/D99299	2021-04-17 08:17:33 +02:00
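The behavior table in the boolean attributes commit above can be modelled without any LLVM types. In this sketch `AttrMap` and `getAttrAsBool` are hypothetical stand-ins for a function's string attributes and for the normalized check; they are not the LLVM API.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a function's string attributes (name -> value).
using AttrMap = std::map<std::string, std::string>;

// Normalized check: a missing attribute and the value "false" both yield
// false; only the string "true" yields true.
inline bool getAttrAsBool(const AttrMap &Attrs, const std::string &Name) {
  auto It = Attrs.find(Name);
  return It != Attrs.end() && It->second == "true";
}
```

A call site then reduces to one call such as `getAttrAsBool(Attrs, "no-jump-tables")` instead of a presence test followed by a string comparison.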
Simon Pilgrim	37a4621fb6	[DAG] SelectionDAG::isSplatValue - early out if binop is not splat. NFCI. Just return false if we fail to match splats - the remainder of the code is for (fixed)vector operations - shuffles/insertions etc.	2021-04-16 18:26:33 +01:00
Momchil Velikov	f9d932e673	[clang][AArch64] Correctly align HFA arguments when passed on the stack When we pass an AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example struct S { __attribute__ ((__aligned__(16))) double v[4]; }; Clang uses `[4 x double]` for the parameter, which is passed on the stack at alignment 8, whereas it should be at alignment 16, following Rule C.4 in AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules) Currently we don't have a way to express in LLVM IR the alignment requirements of the function arguments. The align attribute is applicable to pointers only, and only for some special ways of passing arguments (e.g. byval). When implementing AAPCS32/AAPCS64, clang resorts to dubious hacks of coercing to types, which naturally have the needed alignment. We don't have enough types to cover all the cases, though. This patch introduces a new use of the stackalign attribute to control stack slot alignment, when and if an argument is passed in memory. The attribute align is left as an optimizer hint - it still applies to pointer types only and pertains to the content of the pointer, whereas the alignment of the pointer itself is determined by the stackalign attribute. For byval arguments, the stackalign attribute assumes the role previously performed by align, falling back to align if stackalign is absent. On the clang side, when passing arguments using the "direct" style (cf. `ABIArgInfo::Kind`), now we can optionally specify an alignment, which is emitted as the new `stackalign` attribute. Patch by Momchil Velikov and Lucas Prates. Differential Revision: https://reviews.llvm.org/D98794	2021-04-15 22:58:14 +01:00
Jun Ma	7e1422c1e4	[DAGCombiner] Fold step_vector with add/mul/shl This patch implements some DAG combines for STEP_VECTOR: add step_vector(C1), step_vector(C2) -> step_vector(C1+C2) add (add X step_vector(C1)), step_vector(C2) -> add X step_vector(C1+C2) mul step_vector(C1), C2 -> step_vector(C1*C2) shl step_vector(C1), C2 -> step_vector(C1<<C2) TestPlan: check-llvm Differential Revision: https://reviews.llvm.org/D100088	2021-04-15 18:06:35 +08:00
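The step_vector folds listed in the commit above can be checked on a plain array model. The sketch below assumes step_vector(C) means the lanes <0, C, 2C, ...>, uses 32-bit lanes with wraparound arithmetic, and introduces the hypothetical helpers `stepVector`, `addLanes`, `mulLanes` and `shlLanes`; none of these are LLVM functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Lanes = std::vector<std::uint32_t>;

// Model of step_vector(C) with N lanes: <0, C, 2C, ...> (wrapping arithmetic).
inline Lanes stepVector(std::uint32_t C, std::size_t N) {
  Lanes Out(N);
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = static_cast<std::uint32_t>(I) * C;
  return Out;
}

// Lane-wise add of two vectors of equal length.
inline Lanes addLanes(const Lanes &A, const Lanes &B) {
  Lanes Out(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Out[I] = A[I] + B[I];
  return Out;
}

// Multiply every lane by the scalar C.
inline Lanes mulLanes(const Lanes &A, std::uint32_t C) {
  Lanes Out(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Out[I] = A[I] * C;
  return Out;
}

// Shift every lane left by Amt bits (Amt must be below 32).
inline Lanes shlLanes(const Lanes &A, unsigned Amt) {
  Lanes Out(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Out[I] = A[I] << Amt;
  return Out;
}
```

In this model each fold is an equality between two lane vectors, for example `addLanes(stepVector(C1, N), stepVector(C2, N))` against `stepVector(C1 + C2, N)`.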
Tim Northover	6401b78ab3	SDAG: constant fold bf16 -> i16 casts This direction is particularly useful because i16 constants are much more likely to be legal than bf16.	2021-04-14 11:27:46 +01:00
Tim Northover	5e3d9fcc3a	StackProtector: ensure protection does not interfere with tail call frame. The IR stack protector pass must insert stack checks before the call instead of between it and the return. Similarly, SDAG one should recognize that ADJCALLFRAME instructions could be part of the terminal sequence of a tail call. In this case because such call frames cannot be nested in LLVM the stack protection code must skip over the whole sequence (or risk clobbering argument registers).	2021-04-13 15:14:57 +01:00
Amy Huang	dad5caa59e	Revert "Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"" This change causes an assert / segmentation fault in LTO builds. This reverts commit `f2e4f3eff3`.	2021-04-12 20:10:17 -07:00
Stephen Tozer	f2e4f3eff3	Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" The causes of the previous build errors have been fixed in revisions `aa3e78a59f`, and `140757bfaa` This reverts commit `f40976bd01`.	2021-04-12 16:57:29 +01:00
Stephen Tozer	aa3e78a59f	Reapply "[DebugInfo] Correctly track SDNode dependencies for list debug values" Fixed memory leak error by using BumpAllocator for SDDbgValue arrays. This reverts commit `1b589172bd`.	2021-04-12 12:51:29 +01:00
dfukalov	d066079728	[NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. Main reason is preparation to transform AliasResult to class that contains offset for PartialAlias case. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D98027	2021-04-09 12:54:22 +03:00
Stephen Tozer	1b589172bd	Revert "[DebugInfo] Correctly track SDNode dependencies for list debug values" Reverted due to failure on the sanitizer-x86_64-linux-fast bot. This reverts commit `e10493eb50`.	2021-04-08 17:55:45 +01:00
Stephen Tozer	e10493eb50	[DebugInfo] Correctly track SDNode dependencies for list debug values During SelectionDAG, we must track the SDNodes that each SDDbgValue depends on to compute its value. These are ultimately derived from the location operands to the SDDbgValue, but were stored in a separate vector prior to this patch. This resulted in cases where one of the lists was updated incorrectly, resulting in crashes during compilation. This patch fixes the issue by directly recomputing the dependency list from the SDDbgOperands in getDependencies(). Differential Revision: https://reviews.llvm.org/D99423	2021-04-08 17:01:45 +01:00
Craig Topper	67953311e2	[SelectionDAG] Teach SelectionDAG::FoldConstantArithmetic to handle SPLAT_VECTOR This allows FoldConstantArithmetic to handle SPLAT_VECTOR in addition to BUILD_VECTOR. This allows it to support scalable vectors. I'm also allowing fixed length SPLAT_VECTOR which is used by some targets, but I'm not familiar enough to write tests for those targets. I had to block this function from running on CONCAT_VECTORS to avoid calling getNode for a CONCAT_VECTORS of 2 scalars. This can happen because the 2 operand getNode calls this function for any opcode. Previously we were protected because CONCAT_VECTORs of BUILD_VECTOR is folded to a larger BUILD_VECTOR before that call. But it's not always possible to fold a CONCAT_VECTORS of SPLAT_VECTORs, and we don't even try. This fixes PR49781 where DAG combine thought constant folding should be possible, but FoldConstantArithmetic couldn't do it. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D99682	2021-04-07 10:03:33 -07:00
Yevgeny Rouban	3e738afae4	[Statepoint Lowering] Allow other than N byte sized types in deopt bundle I do not see any bit-width restriction from the point of the LLVM Lang Ref - Operand Bundles on the types of the deopt bundle operands. Statepoint Lowering seems to be able to work with any types. This patch relaxes the two related assertions and adds a new test for this change. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D100006	2021-04-07 17:48:31 +07:00
Philip Reames	fb41cae039	More precisely type code used for gc.relocate assertions [nfc]	2021-04-06 11:27:36 -07:00
Simon Pilgrim	ddbb58736a	[KnownBits] Rename KnownBits::computeForMul to KnownBits::mul. NFCI. As promised in D98866	2021-04-06 10:11:41 +01:00
Nikita Popov	665065821e	[FastISel] Remove kill tracking This is a followup to D98145: As far as I know, tracking of kill flags in FastISel is just a compile-time optimization. However, I'm not actually seeing any compile-time regression when removing the tracking. This probably used to be more important in the past, before FastRA was switched to allocate instructions in reverse order, which means that it discovers kills as a matter of course. As such, the kill tracking doesn't really seem to serve a purpose anymore, and just adds additional complexity and potential for errors. This patch removes it entirely. The primary changes are dropping the hasTrivialKill() method and removing the kill arguments from the emitFast methods. The rest is mechanical fixup. Differential Revision: https://reviews.llvm.org/D98294	2021-04-03 15:50:13 +02:00
Simon Pilgrim	4ea5475a3f	[KnownBits] Add KnownBits::haveNoCommonBitsSet helper. NFCI. Include exhaustive test coverage.	2021-04-02 21:44:33 +01:00
Jun Ma	274ac9d40e	[AArch64][SVE] Lowering sve.dot to DOT node Differential Revision: https://reviews.llvm.org/D99699	2021-04-02 20:05:17 +08:00
Sander de Smalen	0f7bbbc481	Always emit error for wrong interfaces to scalable vectors, unless cmdline flag is passed. In order to bring up scalable vector support in LLVM incrementally, we introduced behaviour to emit a warning, instead of an error, when asking the wrong question of a scalable vector, like asking for the fixed number of elements. This patch puts that behaviour under a flag. The default behaviour is that the compiler will always error, which means that all LLVM unit tests and regression tests will now fail when a code-path is taken that still uses the wrong interface. The behaviour to demote an error to a warning can be individually enabled for tools that want to support experimental use of scalable vectors. This patch enables that behaviour when driving compilation from Clang. This means that for users who want to try out scalable-vector support, fixed-width codegen support, or build user-code with scalable vector intrinsics, Clang will not crash and burn when the compiler encounters such a case. This allows us to do away with the following pattern in many of the SVE tests: RUN: .... 2>%t RUN: cat %t | FileCheck --check-prefix=WARN WARN-NOT: warning: ... The behaviour to emit warnings is only temporary and we expect this flag to be removed in the future when scalable vector support is more stable. This patch also fixes the following tests: unittests: ScalableVectorMVTsTest.SizeQueries SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG regression tests: Transforms/InstCombine/vscale_gep.ll Reviewed By: paulwalker-arm, ctetreau Differential Revision: https://reviews.llvm.org/D98856	2021-04-02 10:55:22 +01:00
Simon Pilgrim	77d625f8d8	[DAG] MergeInnerShuffle with BinOps - sometimes accept undef mask elements If the inner shuffle already contains undef elements, then accept them in the merged shuffle as well. This helps some X86 HADD/SUB patterns where slow targets were ending up with HADD/SUB because the (un)merged shuffles were stuck either side of the ADD/SUB - meaning we ended up with a total cost much higher than the "2*shuffle+add" that a slow target usually expands a HADD/SUB to.	2021-04-01 14:33:00 +01:00
Simonas Kazlauskas	777a58e05b	Support {S,U}REMEqFold before legalization This allows these optimisations to apply to e.g. `urem i16` directly before `urem` is promoted to i32 on architectures where i16 operations are not intrinsically legal (such as on Aarch64). The legalization then later can happen more directly and generated code gets a chance to avoid wasting time on computing results in types wider than necessary, in the end. Seems like mostly an improvement in terms of results at least as far as x86_64 and aarch64 are concerned, with a few regressions here and there. It also helps in preventing regressions in changes like {D87976}. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D88785	2021-04-01 01:35:41 +03:00
Craig Topper	9e00b6660d	[SelectionDAG] Remove unneeded vector resize from the end of FoldConstantArithmetic. NFC There's an assert right before that makes sure the size already matches. Earlier in this function's life, scalars and vectors shared more code.	2021-03-31 12:33:10 -07:00
Tomas Matheson	a9968c0a33	[NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions Currently needsStackRealignment returns false if canRealignStack returns false. This means that the behavior of needsStackRealignment does not correspond to its name and description; a function might need stack realignment, but if it is not possible then this function returns false. Furthermore, needsStackRealignment is not virtual and therefore some backends have made use of canRealignStack to indicate whether a function needs stack realignment. This patch attempts to clarify the situation by separating them and introducing new names: - shouldRealignStack - true if there is any reason the stack should be realigned - canRealignStack - true if we are still able to realign the stack (e.g. we can still reserve/have reserved a frame pointer) - hasStackRealignment = shouldRealignStack && canRealignStack (not target customisable) Targets can now override shouldRealignStack to indicate that stack realignment is required. This change will make it easier in a future change to handle the case where we need to realign the stack but can't do so (for example when the register allocator creates an aligned spill after the frame pointer has been eliminated). Differential Revision: https://reviews.llvm.org/D98716 Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87	2021-03-30 17:31:39 +01:00
Bradley Smith	9745dce8c3	[SelectionDAG][AArch64][SVE] Perform SETCC condition legalization in LegalizeVectorOps This is currently performed in SelectionDAGLegalize, here we make it also happen in LegalizeVectorOps, allowing a target to lower the SETCC condition codes first in LegalizeVectorOps and then lower to a custom node afterwards, without having to duplicate all of the SETCC condition legalization in the target specific lowering. As a result of this, fixed length floating point SETCC nodes can now be properly lowered for SVE. Differential Revision: https://reviews.llvm.org/D98939	2021-03-29 15:32:25 +01:00
Florian Hahn	eb3d9f2eb6	[SelDag] Add isIntOrFPConstant helper function. This patch adds a new isIntOrFPConstant helper function to check if a SDValue is an integer or FP constant. This pattern is used in various places. There also are places that incorrectly just check for integer constants, e.g. D99384, so hopefully this helper will help people avoid that issue. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D99428	2021-03-28 12:48:58 +01:00
David Sherwood	748ae5281d	[IR][SVE] Add new llvm.experimental.stepvector intrinsic This patch adds a new llvm.experimental.stepvector intrinsic, which takes no arguments and returns a linear integer sequence of values of the form <0, 1, ...>. It is primarily intended for scalable vectors, although it will work for fixed width vectors too. It is intended that later patches will make use of this new intrinsic when vectorising induction variables, currently only supported for fixed width. I've added a new CreateStepVector method to the IRBuilder, which will generate a call to this intrinsic for scalable vectors and fall back on creating a ConstantVector for fixed width. For scalable vectors this intrinsic is lowered to a new ISD node called STEP_VECTOR, which takes a single constant integer argument as the step. During lowering this argument is set to a value of 1. The reason for this additional argument at the codegen level is because in future patches we will introduce various generic DAG combines such as mul step_vector(1), 2 -> step_vector(2) add step_vector(1), step_vector(1) -> step_vector(2) shl step_vector(1), 1 -> step_vector(2) etc. that encourage a canonical format for all targets. This hopefully means all other targets supporting scalable vectors can benefit from this too. I've added cost model tests for both fixed width and scalable vectors: llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll as well as codegen lowering tests for fixed width and scalable vectors: llvm/test/CodeGen/AArch64/neon-stepvector.ll llvm/test/CodeGen/AArch64/sve-stepvector.ll See this thread for discussion of the intrinsic: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html	2021-03-23 10:43:35 +00:00
Craig Topper	2f13e63f9e	[LegalizeDAG] Add asserts to verify the types of custom legalized operation matches the original node. We've messed this up a few times recently on RISCV. Experiments with these asserts found a couple issues on other targets as well. They've all been cleaned up now so we can put in these asserts to catch future issues. I had to waive Glue because ADDC/ADDE/etc legalization replaces Glue with i32 on at least AArch64. X86 used to do the same before we switched to ADDCARRY. So I guess that's just how that works. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D98979	2021-03-22 10:28:51 -07:00
Craig Topper	30080b003e	[DAGCombiner] Minor compile time improvement to (sext_in_reg (sign_extend_vector_inreg x)) optimization. Don't bother calling ComputeNumSignBits if N00Bits < ExtVTBits. No matter what answer we get back this will be true: (N00Bits - DAG.ComputeNumSignBits(N00, DemandedSrcElts)) < ExtVTBits) So we might as well save the computation. This makes the code more consistent with the similar (sext_in_reg (sext x)) handling above.	2021-03-21 11:16:41 -07:00
Simon Pilgrim	64c2641c89	[DAG] Limit (sext_in_reg (zero_extend_vector_inreg x)) to exact sign extension As commented by @craig.topper on rG1ba5c550d418, we can't guarantee that we'll be extending zero bits, just sign bit. So, revert to the old code for zero_extend_vector_inreg cases.	2021-03-21 14:01:37 +00:00
Simon Pilgrim	9d2df96407	[DAG] computeKnownBits - add ISD::MULHS/MULHU/SMUL_LOHI/UMUL_LOHI handling Reuse the existing KnownBits multiplication code to handle the 'extend + multiply + extract high bits' pattern for multiply-high ops. Noticed while looking at the codegen for D88785 / D98587 - the patch helps division-by-constant expansion code in particular, which suggests that we might have some further KnownBits div/rem cases we could handle - but this was far easier to implement. Differential Revision: https://reviews.llvm.org/D98857	2021-03-19 16:02:31 +00:00
Simon Pilgrim	ffb2887103	[DAG] Fold shuffle(bop(shuffle(x,y),shuffle(z,w)),undef) -> bop(shuffle'(x,y),shuffle'(z,w)) Followup to D96345, handle unary shuffles of binops (as well as binary shuffles) if we can merge the shuffle with inner operand shuffles. Differential Revision: https://reviews.llvm.org/D98646	2021-03-19 14:14:56 +00:00
Craig Topper	182b831aeb	[DAGCombiner][RISCV] Teach visitMGATHER/MSCATTER to remove gather/scatters with all zeros masks that use SPLAT_VECTOR. Previously only all zeros BUILD_VECTOR was recognized.	2021-03-18 15:34:14 -07:00
Simon Pilgrim	1ba5c550d4	[DAG] Improve folding (sext_in_reg (*_extend_vector_inreg x)) -> (sext_vector_inreg x) Extend this to support ComputeNumSignBits of the (used) source vector elements so that we can handle more than just the case where we're sext_in_reg from the source element signbit. Noticed while investigating the poor codegen in D98587.	2021-03-18 15:34:53 +00:00
Simon Pilgrim	b1afa187c8	[DAG] SelectionDAG::isSplatValue - add ISD::ABS handling Add ISD::ABS to the existing unary instructions handling for splat detection This is similar to D83605, but doesn't appear to need to touch any of the wasm refactoring. Differential Revision: https://reviews.llvm.org/D98778	2021-03-18 10:28:29 +00:00
Stephen Tozer	3bfddc2593	Reapply "[DebugInfo] Handle multiple variable location operands in IR" Fixed section of code that iterated through a SmallDenseMap and added instructions in each iteration, causing non-deterministic code; replaced SmallDenseMap with MapVector to prevent non-determinism. This reverts commit `01ac6d1587`.	2021-03-17 16:45:25 +00:00
Hans Wennborg	01ac6d1587	Revert "[DebugInfo] Handle multiple variable location operands in IR" This caused non-deterministic compiler output; see comment on the code review. > This patch updates the various IR passes to correctly handle dbg.values with a > DIArgList location. This patch does not actually allow DIArgLists to be produced > by salvageDebugInfo, and it does not affect any pass after codegen-prepare. > Other than that, it should cover every IR pass. > > Most of the changes simply extend code that operated on a single debug value to > operate on the list of debug values in the style of any_of, all_of, for_each, > etc. Instances of setOperand(0, ...) have been replaced with > replaceVariableLocationOp, which takes the value that is being replaced as an > additional argument. In places where this value isn't readily available, we have > to track the old value through to the point where it gets replaced. > > Differential Revision: https://reviews.llvm.org/D88232 This reverts commit `df69c69427`.	2021-03-17 13:36:48 +01:00
serge-sans-paille	6e040a19db	[NFC] Wisely nest dyn_cast in FunctionLoweringInfo Take advantage of the inheritance tree to avoid a few comparisons.	2021-03-16 10:22:44 +01:00
Fraser Cormack	0035decae7	[CodeGen] Fix issues with scalable-vector INSERT/EXTRACT_SUBVECTORs This patch addresses a few issues when dealing with scalable-vector INSERT_SUBVECTOR and EXTRACT_SUBVECTOR nodes. When legalizing in DAGTypeLegalizer::SplitVecRes_INSERT_SUBVECTOR, we store the low and high halves to the stack separately. The offset for the high half was calculated incorrectly. Additionally, we can optimize this process when we can detect that the subvector is contained entirely within the low/high split vector type. While this optimization is valid on scalable vectors, when performing the 'high' optimization, the subvector must also be a scalable vector. Note that the 'low' optimization is still conservative: it may be possible to insert v2i32 into the low half of a split nxv1i32/nxv1i32, but we can't guarantee it. It is always possible to insert v2i32 into nxv2i32 or v2i32 into nxv4i32+2 as we know vscale is at least 1. Lastly, in SelectionDAG::isSplatValue, we early-exit on the extracted subvector value type being a scalable vector, forgetting that we can also extract a fixed-length vector from a scalable one. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98495	2021-03-15 17:04:21 +00:00
Craig Topper	5b825433d7	[DAGCombiner] Optimize 1-bit smulo to AND+SETNE. A 1-bit smulo overflows if both inputs are -1 since the result should be +1 which can't be represented in a signed 1 bit value. We can detect this with an AND and a setcc. The multiply result can also use the same AND. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D97634	2021-03-13 09:39:36 -08:00
Craig Topper	2ea7014089	[DAGCombiner] Use isConstantSplatVectorAllZeros/Ones instead of isBuildVectorAllZeros/Ones in visitMSTORE and visitMLOAD. This allows us to optimize when the mask is a splat_vector in addition to build_vector.	2021-03-12 12:14:56 -08:00
LemonBoy	cfe69c8efd	[SelectionDAG] Improve scalarization of irregular vector types Use a more general strategy when splitting a vector into scalar parts (and vice-versa) to correctly handle vector types whose element size is not a power of 2 (and a multiple of 8). Reviewed By: atanasyan Differential Revision: https://reviews.llvm.org/D98273	2021-03-11 19:57:13 +01:00
Stephen Tozer	f40976bd01	Revert "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands" This reverts commit `c0f3dfb9f1`. Reverted due to an error on the clang-x64-windows-msvc buildbot.	2021-03-11 14:48:01 +00:00
gbtozers	c0f3dfb9f1	[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands This patch improves salvageDebugInfoImpl by allowing it to salvage arithmetic operations with two or more non-const operands; this includes the GetElementPtr instruction, and most Binary Operator instructions. These salvages produce DIArgList locations and are only valid for dbg.values, as currently variadic DIExpressions must use DW_OP_stack_value. This functionality is also only added for salvageDebugInfoForDbgValues; other functions that directly call salvageDebugInfoImpl (such as in ISel or Coroutine frame building) can be updated in a later patch. Differential Revision: https://reviews.llvm.org/D91722	2021-03-11 13:33:49 +00:00
Serguei Katkov	0480927712	[Statepoint Lowering] Handle the case with several gc.result Recently gc.result has been marked with readnone instead of readonly and this opens the door for various optimizations to duplicate gc.result. Statepoint lowering is not ready to see several gc.results. The problem appears when there are gc.results with one located in the same basic block and another located in a different basic block. In this case we need to both export the VR and fill the local setValue. Note that this case only arises when insufficient optimization was done before CodeGen. It is evident that the local gc.result dominates all other gc.results, and this is handled by GVN and EarlyCSE. But anyway, even if the IR is not optimal, the backend should not crash on valid IR. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98393	2021-03-11 18:44:44 +07:00
Quentin Colombet	66dab2fa84	[NFC] Fix compiler warnings Fix warnings caused by -Wrange-loop-analysis. Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D98298	2021-03-10 11:03:50 -08:00
Craig Topper	9106d04554	[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32. On riscv32, i64 isn't a legal scalar type but we would like to support scalable vectors of i64. This patch introduces a new node that can represent a splat made of multiple scalar values. I've used this new node to solve the current crashes we experience when getConstant is used after type legalization. For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS when needed and then handling the SPLAT_VECTOR_PARTS later during LegalizeOps. I've removed the special case I previously put in for ABS for D97991 as the default expansion is now able to successfully use getConstant. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98004	2021-03-10 09:46:18 -08:00
Jinzheng Tu	481079e284	[NFC] Unify FIME with FIXME in comments There are 5 occurrences of FIME and 15333 of FIXME. All of them should be FIXME. Reviewed By: alexfh Differential Revision: https://reviews.llvm.org/D98321	2021-03-10 14:00:51 +01:00
Serguei Katkov	2fccd1b00a	[Statepoint Lowering] Fix the crash with gc.relocate in a separate block If it was decided to relocate a derived pointer using the spill, its value is not exported in the general case. When gc.relocate is located in a different block than the statepoint we cannot get the SD for the derived value, but for the spill case it is not required at all. However, the implementation of gc.relocate lowering unconditionally requests the SD value, triggering the assert. The CL fixes this by handling the spill case before the SD is really required. Reviewers: reames, dantrushin Reviewed By: dantrushin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D98324	2021-03-10 19:51:04 +07:00
Nikita Popov	55ae279ba7	[FastISel] Don't trivially kill extractvalues (PR49467) All extractvalues of the same value at the same index will map to the same register, so even if one specific extractvalue only has one use, we should not mark it as a trivial kill, as there may be more extractvalues later. Fixes https://bugs.llvm.org/show_bug.cgi?id=49467. Differential Revision: https://reviews.llvm.org/D98145	2021-03-09 18:46:38 +01:00
gbtozers	df69c69427	[DebugInfo] Handle multiple variable location operands in IR This patch updates the various IR passes to correctly handle dbg.values with a DIArgList location. This patch does not actually allow DIArgLists to be produced by salvageDebugInfo, and it does not affect any pass after codegen-prepare. Other than that, it should cover every IR pass. Most of the changes simply extend code that operated on a single debug value to operate on the list of debug values in the style of any_of, all_of, for_each, etc. Instances of setOperand(0, ...) have been replaced with replaceVariableLocationOp, which takes the value that is being replaced as an additional argument. In places where this value isn't readily available, we have to track the old value through to the point where it gets replaced. Differential Revision: https://reviews.llvm.org/D88232	2021-03-09 16:44:38 +00:00
gbtozers	5491a86f59	[DebugInfo] Emit DBG_VALUE_LIST from ISel This patch completes ISel support for DIArgList dbg.values by allowing SDDbgValues with multiple location operands to be emitted as DBG_VALUE_LIST instructions. The primary change of this patch is refactoring EmitDbgValue by pulling location operand emission out to the new function AddDbgValueLocationOps, which is used for both DIArgList and single value dbg.values. Outside of that, the only behaviour change is that the scheduler has a lambda added, HasUnknownVReg, to prevent us from attempting to emit a DBG_VALUE_LIST before all of its used VRegs have become available. Differential Revision: https://reviews.llvm.org/D88592	2021-03-09 12:17:39 +00:00
Cullen Rhodes	2750f3ed31	[IR] Introduce llvm.experimental.vector.splice intrinsic This patch introduces a new intrinsic @llvm.experimental.vector.splice that constructs a vector of the same type as the two input vectors, based on an immediate where the sign of the immediate distinguishes two variants. A positive immediate specifies an index into the first vector and a negative immediate specifies the number of trailing elements to extract from the first vector. For example: @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E> ; index @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing element count These intrinsics support both fixed and scalable vectors, where the former is lowered to a shufflevector to maintain existing behaviour, although while marked as experimental the recommended way to express this operation for fixed-width vectors is to use shufflevector. For scalable vectors where it is not possible to express a shufflevector mask for this operation, a new ISD node has been implemented. This is one of the named shufflevector intrinsics proposed on the mailing-list in the RFC at [1]. Patch by Paul Walker and Cullen Rhodes. [1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D94708	2021-03-09 10:44:22 +00:00
gbtozers	93b170ea24	[DebugInfo] Handle dbg.values with multiple variable location operands in ISel This patch adds partial support in Instruction Selection for dbg.values that use a DIArgList. This patch does not add support for producing DBG_VALUE_LIST, but adds the logic for processing DIArgLists within the ISel pass. This change is largely focused on handleDebugValue and some of the functions that it calls. Outside of this, salvageDebugInfo and transferDbgValues have been modified to replace individual operands instead of the entire value; dangling debug info for variadic debug values is not currently supported (but may be added later). Differential Revision: https://reviews.llvm.org/D88589	2021-03-09 09:48:03 +00:00
Jessica Paquette	f7d73a6b9e	[SelectionDAG] Don't scalarize vector fpround sources that don't need it. Similar to the workaround code in ScalarizeVecRes_UnaryOp, ScalarizeVecRes_SETCC , ScalarizeVecRes_VSELECT, etc. If we have a case like this: ``` define <1 x half> @func(<1 x float> %x) { %tmp = fptrunc <1 x float> %x to <1 x half> ret <1 x half> %tmp } ``` On AArch64, the <1 x float> is legal. So, this will crash if we call GetScalarizedVector on it. Differential Revision: https://reviews.llvm.org/D98208	2021-03-08 14:37:33 -08:00
Stephen Tozer	c0450af559	Fix: [DebugInfo] Support representation of multiple location operands in SDDbgValue Removes a "default" label from a fully covered switch, causing errors on -Wcovered-switch-default builds.	2021-03-08 19:14:12 +00:00
gbtozers	9525af7b91	[DebugInfo] Support representation of multiple location operands in SDDbgValue This patch modifies the class that represents debug values during ISel, SDDbgValue, to support multiple location operands (to represent a dbg.value that uses a DIArgList). Part of this class's functionality has been split off into a new class, SDDbgOperand. The new class SDDbgOperand represents a single value, corresponding to an SSA value or MachineOperand in the IR and MIR respectively. Members of SDDbgValue that were previously related to that specific value (as opposed to the variable or DIExpression), such as the Kind enum, have been moved to SDDbgOperand. SDDbgValue now contains an array of SDDbgOperand instead, allowing it to hold more than one of these values. All changes outside SDDbgValue are simply updates to use the new interface. Differential Revision: https://reviews.llvm.org/D88585	2021-03-08 18:45:17 +00:00
gbtozers	e5d958c456	[DebugInfo] Support DIArgList in DbgVariableIntrinsic This patch updates DbgVariableIntrinsics to support use of a DIArgList for the location operand, resulting in a significant change to its interface. This patch does not update all IR passes to support multiple location operands in a dbg.value; the only change is to update the DbgVariableIntrinsic interface and its uses. All code outside of the intrinsic classes assumes that an intrinsic will always have exactly one location operand; they will still support DIArgLists, but only if they contain exactly one Value. Among other changes, the setOperand and setArgOperand functions in DbgVariableIntrinsic have been made private. This is to prevent code from setting the operands of these intrinsics directly, which could easily result in incorrect/invalid operands being set. This does not prevent these functions from being called on a debug intrinsic at all, as they can still be called on any CallInst pointer; it is assumed that any code directly setting the operands on a generic call instruction is doing so safely. The intention for making these functions private is to prevent DIArgLists from being overwritten by code that's naively trying to replace one of the Values it points to, and also to fail fast if a DbgVariableIntrinsic is updated to use a DIArgList without a valid corresponding DIExpression.	2021-03-08 14:36:13 +00:00
Craig Topper	0eb405c3b8	[SelectionDAG] Add computeKnownBits support for ISD::USUBSAT. The result of ISD::USUBSAT will never be larger than the LHS. We can use this to put a bound on the number of leading zeros. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D98133	2021-03-07 09:48:42 -08:00
LemonBoy	2ec43e4167	[LegalizeDAG] Implement promotion rules for SELECT_CC Implement the promotion rule for SELECT_CC nodes by upcasting all the parameters and downcasting the result. The AArch64 target makes use of this rule and, since it was not implemented, in some cases the instruction selector would hit an assertion upon encountering the illegal node. This patch requires D97840, the included test cases hit both problems. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97859	2021-03-05 18:22:55 +01:00
Craig Topper	ad532be012	[SelectionDAG] Assert that operands to SelectionDAG::getNode are not DELETED_NODE to catch issues like PR49393 earlier. I'm not sure this would catch all such issues, but it would catch some. The problem for PR49393 was that we were holding a reference to a node that wasn't connected to the DAG across a function that could delete unused nodes. In this particular case we managed to try to use the deleted node while it was in the deleted state before its memory got recycled. It could also happen that we delete the node, something allocates a new node which recycles the memory. Then we try to use the reference we were holding and it is now a completely different node with different valid opcode. This patch would not catch that. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D97969	2021-03-04 23:05:32 -08:00
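
Note on commit 2750f3ed31 (llvm.experimental.vector.splice): the two variants described in that message can be modelled on plain lists. What follows is only a rough sketch in Python, not LLVM code. The helper name vector_splice is hypothetical, and the accepted range of the immediate (-N <= imm < N for N-element vectors) is an assumption that the commit message itself does not state.

```python
def vector_splice(v1, v2, imm):
    """Model the splice of two equal-length vectors (hypothetical helper)."""
    n = len(v1)
    if len(v2) != n:
        raise ValueError("both input vectors must have the same length")
    # Assumed immediate range (not stated in the commit message): -n <= imm < n.
    if not -n <= imm < n:
        raise ValueError("immediate out of range")
    # Positive immediate: index into the first vector.
    # Negative immediate: number of trailing elements taken from the first vector.
    start = imm if imm >= 0 else n + imm
    # Take n consecutive elements from the concatenation of both inputs.
    return (list(v1) + list(v2))[start:start + n]


if __name__ == "__main__":
    print(vector_splice(list("ABCD"), list("EFGH"), 1))   # index variant
    print(vector_splice(list("ABCD"), list("EFGH"), -3))  # trailing-count variant
```

Both calls reproduce the <B, C, D, E> results quoted in the commit message, since each selects four consecutive elements starting at position 1 of the concatenated inputs.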

11556 Commits