llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	7ed3e87825	[Attributes] Determine attribute properties from TableGen data Continuing from D105763, this allows placing certain properties about attributes in the TableGen definition. In particular, we store whether an attribute applies to fn/param/ret (or a combination thereof). This information is used by the Verifier, as well as the ForceFunctionAttrs pass. I also plan to use this in LLParser, which also duplicates info on which attributes are valid where. This keeps metadata about attributes in one place, and makes it more likely that it stays in sync, rather than in various functions spread across the codebase. Differential Revision: https://reviews.llvm.org/D105780	2021-07-12 22:13:38 +02:00
Nikita Popov	6ac32872ee	[Attributes] Replace doesAttrKindHaveArgument() (NFC) This is now the same as isIntAttrKind(), so use that instead, as it does not require manual maintenance. The naming is also more accurate in that both int and type attributes have an argument, but this method was only targeting int attributes. I initially wanted to tighten the AttrBuilder assertion, but we have some in-tree uses that would violate it.	2021-07-12 21:57:26 +02:00
Nikita Popov	333c0acb9b	[Verifier] Support opaque pointers for global_ctors Adjust the assertion to allow opaque pointers.	2021-06-28 21:40:54 +02:00
Akira Hatanaka	f85b9d6443	[ObjC][ARC] Ignore operand bundle "clang.arc.attachedcall" on a call if the call's return type is void Instead of trying hard to prevent global optimization passes such as deadargelim from changing the return type to void, just ignore the bundle if the return type is void. clang currently emits calls to @llvm.objc.clang.arc.noop.use, which consumes the function call result, immediately after the function call to prevent changes to the return type, but optimization passes can delete the call to @llvm.objc.clang.arc.noop.use if the function call doesn't return, which enables deadargelim to change the return type. rdar://76671438 Differential Revision: https://reviews.llvm.org/D103062	2021-06-28 11:02:30 -07:00
Nikita Popov	8c2d4621d9	[Verifier] Support masked load/store with opaque pointers	2021-06-26 18:11:59 +02:00
Nikita Popov	f660af46e3	[OpaquePtr] Support call instruction Add support for call of opaque pointer, currently only possible for indirect calls. This requires a bit of special casing in LLParser, as calls do not specify the callee operand type explicitly. Differential Revision: https://reviews.llvm.org/D104740	2021-06-23 20:17:26 +02:00
Joe Ellis	3c4dbf6ea9	[Verifier] Fail on overrunning and invalid indices for {insert,extract} vector intrinsics With regards to overrunning, the langref (llvm/docs/LangRef.rst) specifies: (llvm.experimental.vector.insert) Elements ``idx`` through (``idx`` + num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition cannot be determined statically but is false at runtime, then the result vector is undefined. (llvm.experimental.vector.extract) Elements ``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector indices. If this condition cannot be determined statically but is false at runtime, then the result vector is undefined. For the non-mixed cases (e.g. inserting/extracting a scalable into/from another scalable, or inserting/extracting a fixed into/from another fixed), it is possible to statically check whether or not the above conditions are met. This was previously missing from the verifier, and if the conditions were found to be false, the result of the insertion/extraction would be replaced with an undef. With regards to invalid indices, the langref (llvm/docs/LangRef.rst) specifies: (llvm.experimental.vector.insert) ``idx`` represents the starting element number at which ``subvec`` will be inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum vector length. (llvm.experimental.vector.extract) The ``idx`` specifies the starting element number within ``vec`` from which a subvector is extracted. ``idx`` must be a constant multiple of the known-minimum vector length of the result type. Similarly, these conditions were not previously enforced in the verifier. In some circumstances, invalid indices were permitted silently, and in other circumstances, an undef was spawned where a verifier error would have been preferred. This commit adds verifier checks to enforce the constraints above. Differential Revision: https://reviews.llvm.org/D104468	2021-06-23 10:33:22 +00:00
Nick Desaulniers	8ace121305	[IR] convert warn-stack-size from module flag to fn attr Otherwise, this causes issues when building with LTO for object files that use different values. Link: https://github.com/ClangBuiltLinux/linux/issues/1395 Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D104342	2021-06-21 15:09:25 -07:00
Zequan Wu	fad8d4230f	[OpaquePtr] Verify Opaque pointer in function parameter Verifying opaque pointer as function parameter when using with `byval`, `byref`, `inalloca`, `preallocated`. Differential Revision: https://reviews.llvm.org/D104309	2021-06-15 14:57:48 -07:00
Philip Reames	ac81cb7e6d	Allow ptrtoint/inttoptr of non-integral pointer types in IR I don't like landing this change, but it's an acknowledgement of a practical reality. Despite not having well specified semantics for inttoptr and ptrtoint involving non-integral pointer types, they are used in practice. Here's a quick summary of the current pragmatic reality: * I happen to know that the main external user of non-integral pointers has effectively disabled the verifier rules. * RS4GC (the lowering pass for abstract GC machine model which is the key motivation for non-integral pointers), even supports them. We just have all the tests using an integral pointer space to let the verifier run. * Certain idioms (such as alignment checks for alignment N, where any relocation is guaranteed to be N byte aligned) are fine in practice. * As implemented, inttoptr/ptrtoint are CSEd and are not control dependent. This means that any code which is intending to check a particular bit pattern at site of use must be wrapped in an intrinsic or external function call. This change allows them in the Verifier, and updates the LangRef to specify them as implementation dependent. This allows us to acknowledge current reality while still leaving ourselves room to punt on figuring out "good" semantics until the future.	2021-06-11 13:38:32 -07:00
Tim Northover	9ff2eb1ea5	SwiftTailCC: teach verifier musttail rules applicable to this CC. SwiftTailCC has a different set of requirements than the C calling convention for a tail call. The exact argument sequence doesn't have to match, but fewer ABI-affecting attributes are allowed. Also make sure the musttail diagnostic triggers if a musttail call isn't actually a tail call.	2021-05-28 11:12:00 +01:00
Yevgeny Rouban	4d26f41f76	[RS4GC] Introduce intrinsics to get base ptr and offset There can be a need for some optimizations to get (base, offset) for any GC pointer. The base can be calculated by generating needed instructions as it is done by the RewriteStatepointsForGC::findBasePointer() function. The offset can be calculated in the same way. Though to not expose the base calculation and to make the offset calculation as simple as ptrtoint(derived_ptr) - ptrtoint(base_ptr), which is illegal outside RS4GC, this patch introduces 2 intrinsics: @llvm.experimental.gc.get.pointer.base(%derived_ptr) @llvm.experimental.gc.get.pointer.offset(%derived_ptr) These intrinsics are inlined by RS4GC along with generation of statepoint sequences. With these new intrinsics the GC parseable lowering for atomic memcpy intrinsics (`6ec2c5e402`) could be implemented as a separate pass. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D100445	2021-05-27 09:14:14 +07:00
Marco Elver	280333021e	[SanitizeCoverage] Add support for NoSanitizeCoverage function attribute We really ought to support no_sanitize("coverage") in line with other sanitizers. This came up again in discussions on the Linux-kernel mailing lists, because we currently do workarounds using objtool to remove coverage instrumentation. Since that support is only on x86, to continue support coverage instrumentation on other architectures, we must support selectively disabling coverage instrumentation via function attributes. Unfortunately, for SanitizeCoverage, it has not been implemented as a sanitizer via fsanitize= and associated options in Sanitizers.def, but rolls its own option fsanitize-coverage. This meant that we never got "automatic" no_sanitize attribute support. Implement no_sanitize attribute support by special-casing the string "coverage" in the NoSanitizeAttr implementation. To keep the feature as unintrusive to existing IR generation as possible, define a new negative function attribute NoSanitizeCoverage to propagate the information through to the instrumentation pass. Fixes: https://bugs.llvm.org/show_bug.cgi?id=49035 Reviewed By: vitalybuka, morehouse Differential Revision: https://reviews.llvm.org/D102772	2021-05-25 12:57:14 +02:00
Arthur Eubanks	7a29a12301	[Verifier] Move some atomicrmw/cmpxchg checks to instruction creation These checks already exist as asserts when creating the corresponding instruction. Anybody creating these instructions already need to take care to not break these checks. Move the checks for success/failure ordering in cmpxchg from the verifier to the LLParser and BitcodeReader plus an assert. Add some tests for cmpxchg ordering. The .bc files are created from the .ll files with an llvm-as with these checks disabled. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102803	2021-05-21 13:41:17 -07:00
Andy Wingo	81bc732816	[IR][Verifier] Relax restriction on alloca address spaces In the WebAssembly target, we would like to allow alloca in two address spaces. The alloca instruction already has an address space argument, but the verifier asserts that the address space of an alloca is the default alloca address space from the datalayout. This patch removes this restriction. Targets that would like to impose additional restrictions should do so via target-specific verification passes. Differential Revision: https://reviews.llvm.org/D101045	2021-05-21 11:52:45 +02:00
Arthur Eubanks	0bebda17be	[OpaquePtr] Make atomicrmw work with opaque pointers FullTy is only necessary when we need to figure out what type an instruction works with given a pointer's pointee type. However, we just end up using the value operand's type, so FullTy isn't necessary. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102788	2021-05-19 12:49:28 -07:00
Arthur Eubanks	1b25fce404	[OpaquePtr] Make cmpxchg work with opaque pointers Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102745	2021-05-19 12:44:10 -07:00
Arthur Eubanks	6013d84392	[OpaquePtr] Make loads and stores work with opaque pointers Don't check that types match when the pointer operand is an opaque pointer. I would separate the Assembler and Verifier changes, but verify-uselistorder in the Assembler test ends up running the verifier. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102450	2021-05-18 13:43:50 -07:00
Ten Tzen	797ad70152	[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1 This patch is the Part-1 (FE Clang) implementation of HW Exception handling. This new feature adds the support of Hardware Exception for Microsoft Windows SEH (Structured Exception Handling). This is the first step of this project; only X86_64 target is enabled in this patch. Compiler options: For clang-cl.exe, the option is -EHa, the same as MSVC. For clang.exe, the extra option is -fasync-exceptions, plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual. NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change. The rules for C code: For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules: * First, no exception can move in or out of _try region., i.e., no "potential faulty instruction can be moved across _try boundary. * Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees). * Finally, global states (local/global/heap variables) that can be read outside of _try region must be updated in memory (not just in register) before the subsequent exception occurs. The impact to C++ code: Although SEH is a feature for C code, -EHa does have a profound effect on C++ side. When a C++ function (in the same compilation unit with option -EHa ) is called by a SEH C function, a hardware exception occurs in C++ code can also be handled properly by an upstream SEH _try-handler or a C++ catch(...). As such, when that happens in the middle of an object's life scope, the dtor must be invoked the same way as C++ Synchronous Exception during unwinding process. Design: A natural way to achieve the rules above in LLVM today is to allow an EH edge added on memory/computation instruction (previous iload/istore idea) so that exception path is modeled in Flow graph preciously. 
However, tracking every single memory instruction and potential faulty instruction can create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations be aware of the new semantic is also substantial. This design does not intend to model exception path at instruction level. Instead, the proposed design tracks and reports EH state at BLOCK-level to reduce the complexity of flow graph and minimize the performance-impact on CPP code under -EHa option. One key element of this design is the ability to compute State number at block-level. Our algorithm is based on the following rationales: A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping into a _try is not allowed. The single entry must start with a seh_try_begin() invoke with a correct State number that is the initial state of the SEME. Through control-flow, state number is propagated into all blocks. Side exits marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[]. Note side exits can ONLY jump into parent scopes (lower state number). Thus, when a block succeeds various states from its predecessors, the lowest State triumphs others. If some exits flow to unreachable, propagation on those paths terminate, not affecting remaining blocks. For CPP code, object lifetime region is usually a SEME as SEH _try. However there is one rare exception: jumping into a lifetime that has Dtor but has no Ctor is warned, but allowed: Warning: jump bypasses variable with a non-trivial destructor In that case, the region is actually a MEME (multiple entry multiple exits). Our solution is to inject a eha_scope_begin() invoke in the side entry block to ensure a correct State. Implementation: Part-1: Clang implementation described below. Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end(). 
_scope_begin() is immediately added after ctor() is called and EHStack is pushed. So it must be an invoke, not a call. With that it's also guaranteed an EH-cleanup-pad is created regardless whether there exists a call in this scope. _scope_end is added before dtor(). These two intrinsics make the computation of Block-State possible in downstream code gen pass, even in the presence of ctor/dtor inlining. Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark _try boundary and to prevent from exceptions being moved across _try boundary. All memory instructions inside a _try are considered as 'volatile' to assure 2nd and 3rd rules for C-code above. This is a little sub-optimized. But it's acceptable as the amount of code directly under _try is very small. Part-2 (will be in Part-2 patch): LLVM implementation described below. For both C++ & C-code, the state of each block is computed at the same place in BE (WinEHPreparing pass) where all other EH tables/maps are calculated. In addition to _scope_begin & _scope_end, the computation of block state also rely on the existing State tracking code (UnwindMap and InvokeStateMap). For both C++ & C-code, the state of each block with potential trap instruction is marked and reported in DAG Instruction Selection pass, the same place where the state for -EHsc (synchronous exceptions) is done. If the first instruction in a reported block scope can trap, a Nop is injected before this instruction. This nop is needed to accommodate LLVM Windows EH implementation, in which the address in IPToState table is offset by +1. (note the purpose of that is to ensure the return address of a call is in the same scope as the call address. The handler for catch(...) for -EHa must handle HW exception. So it is 'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches C++ exceptions). Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW exceptions can be passed through. 
Original llvm-dev [RFC] discussions can be found in these two threads below: https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html Differential Revision: https://reviews.llvm.org/D80344/new/	2021-05-17 22:42:17 -07:00
Tim Northover	dbf8cc7b66	Verifier: second attempt to fix what I broke with swiftasync. During a rebase I messed up this array, so trying to put it back to as it was before with just one SwiftAsync entry.	2021-05-15 08:04:57 +01:00
Tim Northover	709f2c7e14	SwiftAsync: remove duplicate instance in array. NFC.	2021-05-14 19:21:54 +01:00
Tim Northover	ea0eec69f1	IR+AArch64: add a "swiftasync" argument attribute. This extends any frame record created in the function to include that parameter, passed in X22. The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001 in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect of this is that tools walking the stack should expect to see one of three values there: * 0b0000 => a normal, non-extended record with just [FP, LR] * 0b0001 => the extended record [X22, FP, LR] * 0b1111 => kernel space, and a non-extended record. All other values are currently reserved. If compiling for arm64e this context pointer is address-discriminated with the discriminator 0xc31a and the DB (process-specific) key. There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing front-ends access to this slot (and forcing its creation initialized to nullptr if necessary).	2021-05-14 11:43:58 +01:00
cynecx	8ec9fd4839	Support unwinding from inline assembly I've taken the following steps to add unwinding support from inline assembly: 1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax: ``` invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"() to label %exit unwind label %uexit ``` 2.) Add Bitcode writing/reading support + LLVM-IR parsing. 3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled. 4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind. 5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled. 6.) Don't allow unwinding callbr. Reviewed By: Amanieu Differential Revision: https://reviews.llvm.org/D95745	2021-05-13 19:13:03 +01:00
Bruno Cardoso Lopes	819e0d105e	[CGAtomic] Lift strong requirement for remaining compare_exchange combinations Follow up on `431e3138a` and complete the other possible combinations. Besides enforcing the new behavior, it also mitigates TSAN false positives when combining orders that used to be stronger.	2021-05-06 21:05:20 -07:00
Nick Lewycky	30bbfda01f	Improve error messages for attributes in the wrong context. verifyFunctionAttrs has a comment that the value V is printed in error messages. The recently added errors for attributes didn't print V. Make them print V. Change the stringification of AttributeList. Firstly they started with 'PAL[' which stood for ParamAttrsList. Change that to 'AttributeList[' matching its current name AttributeList. Print out semantic meaning of the index instead of the raw index value (i.e. 'return', 'function' or 'arg(n)'). Differential revision: https://reviews.llvm.org/D101484	2021-04-29 01:44:16 -07:00
Luo, Yuanke	bcdaccfe34	[X86][AMX] Verify illegal types or instructions for x86_amx. This patch is related to https://reviews.llvm.org/D100032 which define some illegal types or operations for x86_amx. There are no arguments, arrays, pointers, vectors or constants of x86_amx. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D100472	2021-04-20 16:14:22 +08:00
Nick Lewycky	cf899a31ae	Add a cache of checked AttributeLists. Differential Revision: https://reviews.llvm.org/D100738	2021-04-19 16:01:06 -07:00
Serge Guelton	d6de1e1a71	Normalize interaction with boolean attributes Such attributes can either be unset, or set to "true" or "false" (as string). Throughout the codebase, this led to inelegant checks ranging from if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") to if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") Introduce a getValueAsBool that normalizes the check, with the following behavior: no attributes or attribute set to "false" => return false attribute set to "true" => return true Differential Revision: https://reviews.llvm.org/D99299	2021-04-17 08:17:33 +02:00
Nick Lewycky	244d9d6e41	Verify the LLVMContext that an Attribute belongs to. Attributes don't know their parent Context, adding this would make Attribute larger. Instead, we add hasParentContext that answers whether this Attribute belongs to a particular LLVMContext by checking for itself inside the context's FoldingSet. Same with AttributeSet and AttributeList. The Verifier checks them with the Module context. Differential Revision: https://reviews.llvm.org/D99362	2021-04-16 09:44:38 -07:00
Momchil Velikov	f9d932e673	[clang][AArch64] Correctly align HFA arguments when passed on the stack When we pass an AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example struct S { __attribute__ ((__aligned__(16))) double v[4]; }; Clang uses `[4 x double]` for the parameter, which is passed on the stack at alignment 8, whereas it should be at alignment 16, following Rule C.4 in AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules) Currently we don't have a way to express in LLVM IR the alignment requirements of the function arguments. The align attribute is applicable to pointers only, and only for some special ways of passing arguments (e.g. byval). When implementing AAPCS32/AAPCS64, clang resorts to dubious hacks of coercing to types, which naturally have the needed alignment. We don't have enough types to cover all the cases, though. This patch introduces a new use of the stackalign attribute to control stack slot alignment, when and if an argument is passed in memory. The attribute align is left as an optimizer hint - it still applies to pointer types only and pertains to the content of the pointer, whereas the alignment of the pointer itself is determined by the stackalign attribute. For byval arguments, the stackalign attribute assumes the role previously performed by align, falling back to align if stackalign is absent. On the clang side, when passing arguments using the "direct" style (cf. `ABIArgInfo::Kind`), now we can optionally specify an alignment, which is emitted as the new `stackalign` attribute. Patch by Momchil Velikov and Lucas Prates. Differential Revision: https://reviews.llvm.org/D98794	2021-04-15 22:58:14 +01:00
Alok Kumar Sharma	9fb0025f70	[DebugInfo] Upgrade DISubrange::count to accept DIExpression also This is needed for Fortran assumed shape arrays whose dimensions are defined as, - 'count' is taken from array descriptor passed as parameter by caller, access from descriptor is defined by type DIExpression. - 'lowerBound' is defined by callee. The current alternate way represents using upperBound in place of count, where upperBound is calculated in callee in a temp variable using lowerBound and count. Representation with count (DIExpression) is not only clearer as compared to upperBound (DIVariable) but it has another advantage: variable count, being accessed as a parameter, has a better chance of survival at higher optimization levels than upperBound, which is a local variable. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99335	2021-03-30 09:16:55 +05:30
Adrian Prantl	8573c28a51	Add debug support for set types This commit adds debugging support for set types defined in languages such as Pascal and Modula-2. Patch by Peter McKinna! Differential Revision: https://reviews.llvm.org/D76115	2021-03-29 18:04:48 -07:00
Matt Arsenault	9a0c9402fa	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `07e46367ba`.	2021-03-29 08:55:30 -04:00
Oliver Stannard	07e46367ba	Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute"" Reverting because test 'Bindings/Go/go.test' is failing on most buildbots. This reverts commit `fc9df30991`.	2021-03-29 11:32:22 +01:00
Matt Arsenault	fc9df30991	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `20d5c42e0e`.	2021-03-28 13:35:21 -04:00
Nico Weber	20d5c42e0e	Revert "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `4fefed6563`. Broke check-clang everywhere.	2021-03-28 13:02:52 -04:00
Matt Arsenault	4fefed6563	OpaquePtr: Turn inalloca into a type attribute I think byval/sret and the others are close to being able to rip out the code to support the missing type case. A lot of this code is shared with inalloca, so catch this up to the others so that can happen.	2021-03-28 11:12:23 -04:00
Nick Lewycky	80f6c99a78	Verify that MDNodes belong to the same context as the Module. Differential Revision: https://reviews.llvm.org/D99289	2021-03-24 12:38:05 -07:00
David Sherwood	748ae5281d	[IR][SVE] Add new llvm.experimental.stepvector intrinsic This patch adds a new llvm.experimental.stepvector intrinsic, which takes no arguments and returns a linear integer sequence of values of the form <0, 1, ...>. It is primarily intended for scalable vectors, although it will work for fixed width vectors too. It is intended that later patches will make use of this new intrinsic when vectorising induction variables, currently only supported for fixed width. I've added a new CreateStepVector method to the IRBuilder, which will generate a call to this intrinsic for scalable vectors and fall back on creating a ConstantVector for fixed width. For scalable vectors this intrinsic is lowered to a new ISD node called STEP_VECTOR, which takes a single constant integer argument as the step. During lowering this argument is set to a value of 1. The reason for this additional argument at the codegen level is because in future patches we will introduce various generic DAG combines such as mul step_vector(1), 2 -> step_vector(2) add step_vector(1), step_vector(1) -> step_vector(2) shl step_vector(1), 1 -> step_vector(2) etc. that encourage a canonical format for all targets. This hopefully means all other targets supporting scalable vectors can benefit from this too. I've added cost model tests for both fixed width and scalable vectors: llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll as well as codegen lowering tests for fixed width and scalable vectors: llvm/test/CodeGen/AArch64/neon-stepvector.ll llvm/test/CodeGen/AArch64/sve-stepvector.ll See this thread for discussion of the intrinsic: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html	2021-03-23 10:43:35 +00:00
Bradley Smith	48f5a392cb	[IR] Add vscale_range IR function attribute This attribute represents the minimum and maximum values vscale can take. For now this attribute is not hooked up to anything during codegen, this will be added in the future when such codegen is considered stable. Additionally hook up the -msve-vector-bits=<x> clang option to emit this attribute. Differential Revision: https://reviews.llvm.org/D98030	2021-03-22 12:05:06 +00:00
Jeroen Dobbelaere	04790d9cfb	Support intrinsic overloading on unnamed types This patch adds support for intrinsic overloading on unnamed types. This fixes PR38117 and PR48340 and will also be needed for the Full Restrict Patches (D68484). The main problem is that the intrinsic overloading name mangling is using 's_s' for unnamed types. This can result in identical intrinsic mangled names for different function prototypes. This patch changes this by adding a '.XXXXX' to the intrinsic mangled name when at least one of the types is based on an unnamed type, ensuring that we get a unique name. Implementation details: - The mapping is created on demand and kept in Module. - It also checks for existing clashes and recycles potentially existing prototypes and declarations. - Because of extra data in Module, Intrinsic::getName needs an extra Module* argument and, for speed, an optional FunctionType* argument. - I still kept the original two-argument 'Intrinsic::getName' around which keeps the original behavior (providing the base name). -- Main reason is that I did not want to change the LLVMIntrinsicGetName version, as I don't know how acceptable such a change is -- The current situation already has a limitation. So that should not get worse with this patch. - Intrinsic::getDeclaration and the verifier are now using the new version. Other notes: - As far as I see, this should not suffer from stability issues. The count is only added for prototypes depending on at least one anonymous struct - The initial count starts from 0 for each intrinsic mangled name. - In case of name clashes, existing prototypes are remembered and reused when that makes sense. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D91250	2021-03-19 14:34:25 +01:00
gbtozers	e5d958c456	[DebugInfo] Support DIArgList in DbgVariableIntrinsic This patch updates DbgVariableIntrinsics to support use of a DIArgList for the location operand, resulting in a significant change to its interface. This patch does not update all IR passes to support multiple location operands in a dbg.value; the only change is to update the DbgVariableIntrinsic interface and its uses. All code outside of the intrinsic classes assumes that an intrinsic will always have exactly one location operand; they will still support DIArgLists, but only if they contain exactly one Value. Among other changes, the setOperand and setArgOperand functions in DbgVariableIntrinsic have been made private. This is to prevent code from setting the operands of these intrinsics directly, which could easily result in incorrect/invalid operands being set. This does not prevent these functions from being called on a debug intrinsic at all, as they can still be called on any CallInst pointer; it is assumed that any code directly setting the operands on a generic call instruction is doing so safely. The intention for making these functions private is to prevent DIArgLists from being overwritten by code that's naively trying to replace one of the Values it points to, and also to fail fast if a DbgVariableIntrinsic is updated to use a DIArgList without a valid corresponding DIExpression.	2021-03-08 14:36:13 +00:00
gbtozers	65600cb2a7	[DebugInfo] Add DIArgList MD to store multiple values in DbgVariableIntrinsics This patch adds a new metadata node, DIArgList, which contains a list of SSA values. This node is in many ways similar in function to the existing ValueAsMetadata node, with the difference being that it tracks a list instead of a single value. Internally, it uses ValueAsMetadata to track the individual values, but there is also a reasonable amount of DIArgList-specific value-tracking logic on top of that. Similar to ValueAsMetadata, it is a special case in parsing and printing due to the fact that it requires a function state (as it may reference function-local values). This patch should not result in any immediate functional change; it allows for DIArgLists to be parsed and printed, but debug variable intrinsics do not yet recognize them as a valid argument (outside of parsing). Differential Revision: https://reviews.llvm.org/D88175	2021-03-05 17:02:24 +00:00
Akira Hatanaka	1900503595	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR This reapplies `ed4718eccb`, which was reverted because it was causing a miscompile. The bug that was causing the miscompile has been fixed in `75805dce5f`. Original commit message: Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). 
- The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-03-04 11:22:30 -08:00
Hans Wennborg	0a5dd06718	Revert "[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR" This caused miscompiles of Chromium tests for iOS due to clobbering of live registers. See discussion on the code review for details. > Background: > > This fixes a longstanding problem where llvm breaks ARC's autorelease > optimization (see the link below) by separating calls from the marker > instructions or retainRV/claimRV calls. The backend changes are in > https://reviews.llvm.org/D92569. > > https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue > > What this patch does to fix the problem: > > - The front-end adds operand bundle "clang.arc.attachedcall" to calls, > which indicates the call is implicitly followed by a marker > instruction and an implicit retainRV/claimRV call that consumes the > call result. In addition, it emits a call to > @llvm.objc.clang.arc.noop.use, which consumes the call result, to > prevent the middle-end passes from changing the return type of the > called function. This is currently done only when the target is arm64 > and the optimization level is higher than -O0. > > - ARC optimizer temporarily emits retainRV/claimRV calls after the calls > with the operand bundle in the IR and removes the inserted calls after > processing the function. > > - ARC contract pass emits retainRV/claimRV calls after the call with the > operand bundle. It doesn't remove the operand bundle on the call since > the backend needs it to emit the marker instruction. The retainRV and > claimRV calls are emitted late in the pipeline to prevent optimization > passes from transforming the IR in a way that makes it harder for the > ARC middle-end passes to figure out the def-use relationship between > the call and the retainRV/claimRV calls (which is the cause of > PR31925). 
> > - The function inliner removes an autoreleaseRV call in the callee if > nothing in the callee prevents it from being paired up with the > retainRV/claimRV call in the caller. It then inserts a release call if > claimRV is attached to the call since autoreleaseRV+claimRV is > equivalent to a release. If it cannot find an autoreleaseRV call, it > tries to transfer the operand bundle to a function call in the callee. > This is important since the ARC optimizer can remove the autoreleaseRV > returning the callee result, which makes it impossible to pair it up > with the retainRV/claimRV call in the caller. If that fails, it simply > emits a retain call in the IR if retainRV is attached to the call and > does nothing if claimRV is attached to it. > > - SCCP refrains from replacing the return value of a call with a > constant value if the call has the operand bundle. This ensures the > call always has at least one user (the call to > @llvm.objc.clang.arc.noop.use). > > - This patch also fixes a bug in replaceUsesOfNonProtoConstant where > multiple operand bundles of the same kind were being added to a call. > > Future work: > > - Use the operand bundle on x86-64. > > - Fix the auto upgrader to convert call+retainRV/claimRV pairs into > calls with the operand bundles. > > rdar://71443534 > > Differential Revision: https://reviews.llvm.org/D92808 This reverts commit `ed4718eccb`.	2021-03-03 15:51:40 +01:00
Kazu Hirata	4444b343d7	[IR] Use range-based for loops (NFC)	2021-03-01 23:40:33 -08:00
Sanjay Patel	215bb15791	[IR] restrict vector reduction intrinsic types The arguments in all cases should be vectors of exactly one of integer or FP. All of the tests currently pass the verifier because we check for any vector type regardless of the type of reduction. This obviously can't work if we mix up integer and FP, and based on current LangRef text it was not intended to work for pointers either. The pointer case from https://llvm.org/PR49215 is what led me here. That example was avoided with `5b250a27ec`. Differential Revision: https://reviews.llvm.org/D96904	2021-02-21 12:37:00 -05:00
Sanjay Patel	d79063129c	[Verifier] remove dead code for saturating intrinsics; NFC Test coverage shows that we assert with the string from the tablegen defs file for these intrinsics, so these cases should never be live.	2021-02-19 14:58:25 -05:00
Xun Li	a0d09ce460	[NFC][Coroutine] Fix an error message on coro.id verification The error message should be about coro.id, not coro.begin Differential Revision: https://reviews.llvm.org/D96447	2021-02-12 10:44:03 -08:00
Akira Hatanaka	ed4718eccb	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. 
If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-02-12 09:51:57 -08:00

810 Commits