llvm-project

Commit Graph

Author	SHA1	Message	Date
Hans Wennborg	2bc57d85eb	Don't override __attribute__((no_stack_protector)) by inlining (PR52886) Since `26c6a3e736`, LLVM's inliner will "upgrade" the caller's stack protector attribute based on the callee. This lead to surprising results with Clang's no_stack_protector attribute added in `4fbf84c173` (D46300). Consider the following code compiled with clang -fstack-protector-strong -Os (https://godbolt.org/z/7s3rW7a1q). extern void h(int* p); inline __attribute__((always_inline)) int g() { return 0; } int __attribute__((__no_stack_protector__)) f() { int a[1]; h(a); return g(); } LLVM will inline g() into f(), and f() would get a stack protector, against the users explicit wishes, potentially breaking the program e.g. if h() changes the value of the stack cookie. That's a miscompile. More recently, `bc044a88ee` (D91816) addressed this problem by preventing inlining when the stack protector is disabled in the caller and enabled in the callee or vice versa. However, the problem remained if the callee is marked always_inline as in the example above. This affected users, see e.g. http://crbug.com/1274129 and http://llvm.org/pr52886. One way to fix this would be to prevent inlining also in the always_inline case. Despite the name, always_inline does not guarantee inlining, so this would be legal but potentially surprising to users. However, I think the better fix is to not enable the stack protector in a caller based on the callee. The motivation for the old behaviour is unclear, it seems counter-intuitive, and causes real problems as we've seen. This commit implements that fix, which means in the example above, g() gets inlined into f() (also without always_inline), and f() is emitted without stack protector. I think that matches most developers' expectations, and that's also what GCC does. Another effect of this change is that a no_stack_protector function can now be inlined into a stack protected function, e.g. (https://godbolt.org/z/hafP6W856): extern void h(int* p); inline int __attribute__((__no_stack_protector__)) __attribute__((always_inline)) g() { return 0; } int f() { int a[1]; h(a); return g(); } I think that's fine. Such code would be unusual since no_stack_protector is normally applied to a program entry point which sets up the stack canary. And even if such code exists, inlining doesn't change the semantics: there is still no stack cookie setup/check around entry/exit of the g() code region, but there may be in the surrounding context, as there was before inlining. This also matches GCC. See also the discussion at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722 Differential revision: https://reviews.llvm.org/D116589	2022-01-13 12:04:49 +01:00
Sebastian Neubauer	f4139440f1	[Docs] Fix IR and TableGen grammar inconsistencies IR: - globals (and functions, ifuncs, aliases) can have a partition - catchret has a `to` before the label - the sint/int types do not exist - signext comes after the type - a variable was missing its type TableGen: - The second value after a `#` concatenation is optional See e.g. llvm/lib/Target/X86/X86InstrAVX512.td:L3351 - IncludeDirective and PreprocessorDirective were never referenced in the grammar - Add some missing ; - Parent classes of multiclasses can have generic arguments. Reuse the `ParentClassList` that is already used in other places. MIR: - liveins only allows physical registers, which start with a $ Differential Revision: https://reviews.llvm.org/D116674	2022-01-13 11:55:13 +01:00
Simon Moll	33efbc8184	[VP] llvm.vp.merge intrinsic and LangRef llvm.vp.merge interprets the %evl operand differently than the other vp intrinsics: all lanes at positions greater or equal than the %evl operand are passed through from the second vector input. Otherwise it behaves like llvm.vp.select. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116725	2022-01-12 14:06:56 +01:00
David Sherwood	51497dc0b2	[IR] Change vector.splice intrinsic to reject out-of-bounds indices I've changed the definition of the experimental.vector.splice instrinsic to reject indices that are known to be or possibly out-of-bounds. In practice, this means changing the definition so that the index is now only valid in the range [-VL, VL-1] where VL is the known minimum vector length. We use the vscale_range attribute to take the minimum vscale value into account so that we can permit more indices when the attribute is present. The splice intrinsic is currently only ever generated by the vectoriser, which will never attempt to splice vectors with out-of-bounds values. Changing the definition also makes things simpler for codegen since we can always assume that the index is valid. This patch was created in response to review comments on D115863 Differential Revision: https://reviews.llvm.org/D115933	2022-01-11 09:37:39 +00:00
Luís Ferreira	34435fd105	[llvm] Add support for DW_TAG_immutable_type Added documentation about DW_TAG_immutable_type too. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D113633	2022-01-05 19:17:08 +00:00
Nikita Popov	8484bab9cd	[LangRef] Require elementtype attribute for indirect inline asm operands Indirect inline asm operands may require the materialization of a memory access according to the pointer element type. As this will no longer be available with opaque pointers, we require it to be explicitly annotated using the elementtype attribute, for example: define void @test(i32* %p, i32 %x) { call void asm "addl $1, $0", "=rm,r"(i32 elementtype(i32) %p, i32 %x) ret void } This patch only includes the LangRef change and Verifier updates to allow adding the elementtype attribute in this position. It does not yet enforce this, as this will require changes on the clang side (and test updates) first. Something I'm a bit unsure about is whether we really need the elementtype for all indirect constraints, rather than only indirect register constraints. I think indirect memory constraints might not strictly need it (though the backend code is written in a way that does require it). I think it's okay to just make this a general requirement though, as this means we don't need to carefully deal with multiple or alternative constraints. In addition, I believe that MemorySanitizer benefits from having the element type even in cases where it may not be strictly necessary for normal lowering (`cd2b050fa4/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (L4066)`). Differential Revision: https://reviews.llvm.org/D116531	2022-01-04 10:02:06 +01:00
Fraser Cormack	d762794040	[IR] Allow the 'align' param attr on vectors of pointers This patch extends the available uses of the 'align' parameter attribute to include vectors of pointers. The attribute specifies pointer alignment element-wise. This change was previously requested and discussed in D87304. The vector predication (VP) intrinsics intend to use this for scatter and gather operations, as they lack the explicit alignment parameter that the masked versions use. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D115161	2022-01-03 12:32:46 +00:00
Nuno Lopes	b23669123a	[docs] Mark @llvm.sideeffect() as willreturn Changed by https://reviews.llvm.org/D65455	2022-01-01 18:04:04 +00:00
Sami Tolvanen	5dc8aaac39	[llvm][IR] Add no_cfi constant With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces function references with CFI jump table references, which is a problem for low-level code that needs the address of the actual function body. For example, in the Linux kernel, the code that sets up interrupt handlers needs to take the address of the interrupt handler function instead of the CFI jump table, as the jump table may not even be mapped into memory when an interrupt is triggered. This change adds the no_cfi constant type, which wraps function references in a value that LowerTypeTestsModule::replaceCfiUses does not replace. Link: https://github.com/ClangBuiltLinux/linux/issues/1353 Reviewed By: nickdesaulniers, pcc Differential Revision: https://reviews.llvm.org/D108478	2021-12-20 12:55:32 -08:00
Fraser Cormack	8002fa6760	[LangRef] Remove incorrect vector alignment rules The LangRef incorrectly says that if no exact match is found when seeking alignment for a vector type, the largest vector type smaller than the sought-after vector type. This is incorrect as vector types require an exact match, else they fall back to reporting the natural alignment. The corrected rule was not added in its place, as rules for other types (e.g., floating-point types) aren't documented. A unit test was added to demonstrate this. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D112463	2021-12-14 14:35:40 +00:00
Fraser Cormack	2d60bc87a2	[VP] [NFC] Fix vp_store signature and vp_gather examples Reviewed By: frasercrmck, simoll Differential Revision: https://reviews.llvm.org/D115027	2021-12-13 17:53:19 +00:00
Kito Cheng	39c861719b	[RISCV] Fix vm operand constraint to fit GCC's behavior - `vm` constraint is used for masking operand, which always v0. - Update testcase, only masking operand should use `vm`, vector mask operations should just use `vr` for any vector register. - Revise the description of `vm` constraint. - This patch also fix issue on RISCVRegisterInfo.td and RISCVISelLowering.cpp. RISCVRegisterInfo.td: - The first VT in the list must be the largest total size since the SelectionDAGBuilder uses the first register in the list as the canonical type for the register. RISCVISelLowering.cpp: - Fix RISCVTargetLowering::splitValueIntoRegisterParts and RISCVTargetLowering::joinRegisterPartsIntoValue for handling vectors with different total size, that will happened on fractional LMUL since fractional LMUL is always occupy one vector register. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D112599	2021-12-09 14:46:49 +08:00
Fraser Cormack	3460cc2585	[VP] Propagate align parameter attr on VP load/store to ISel This patch fixes a case where the 'align' parameter attribute on the pointer operands to llvm.vp.load and llvm.vp.store was being dropped during the conversion to the SelectionDAG. The default alignment equal to the ABI type alignment of the vector type was kept. It also updates the documentation to reflect the fact that the parameter attribute is now properly supported. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D114422	2021-12-07 10:16:16 +00:00
Cullen Rhodes	698584f89b	[IR] Remove unbounded as possible value for vscale_range minimum The default for min is changed to 1. The behaviour of -mvscale-{min,max} in Clang is also changed such that 16 is the max vscale when targeting SVE and no max is specified. Reviewed By: sdesmalen, paulwalker-arm Differential Revision: https://reviews.llvm.org/D113294	2021-12-07 09:52:21 +00:00
Shivam Gupta	b1eb6a3589	[Docs] Fix a link current link is pointing to https://llvm.org/docs/CodeGenerator.html#segmented-stacks while it point to https://llvm.org/docs/CodeGenerator.html#tail-call-optimization or id81. Differential Revision: https://reviews.llvm.org/D115119	2021-12-06 10:02:06 +05:30
Fraser Cormack	92d279fd6d	[LangRef][VP] Correct operands' types in vp.select documentation The types of llvm.vp.select's operands much match the return type.	2021-11-19 12:08:34 +00:00
Shao-Ce SUN	0c660256eb	[NFC] Trim trailing whitespace in *.rst	2021-11-15 09:17:08 +08:00
Ahmed Bougacha	68854f4e57	[IR] Define ptrauth intrinsics. This defines the new `@llvm.ptrauth.` pointer authentication intrinsics: sign, auth, strip, blend, and sign_generic, documented in PointerAuth.md. Pointer Authentication is a mechanism by which certain pointers are signed. When a pointer gets signed, a cryptographic hash of its value and other values (pepper and salt) is stored in unused bits of that pointer. Before the pointer is used, it needs to be authenticated, i.e., have its signature checked. This prevents pointer values of unknown origin from being used to replace the signed pointer value. sign and auth provide the core operations. strip removes the ptrauth bits from a signed pointer without checking them. sign_generic allows signing non-pointer values. Finally, blend combines salt values ("discriminators") to derive more targeted and less reusable ones. In later patches, we implement primary backend support for these intrinsics using the AArch64 PAuth feature, and build on that to implement the arm64e Darwin ABI and ELF PAuth ABI Extension in clang. For more details, see the docs page, as well as our llvm-dev RFC: http://lists.llvm.org/pipermail/llvm-dev/2019-October/136091.html or our 2019 Developers' Meeting talk. Differential Revision: https://reviews.llvm.org/D90868	2021-11-14 07:59:00 -08:00
Arthur Eubanks	05963a3d66	Revert "[DebugInfo] Enforce implicit constraints on `distinct` MDNodes" This reverts commit `ee76525698`. Causes crashes, see comments in D104827.	2021-11-09 14:27:55 -08:00
Scott Linder	ee76525698	[DebugInfo] Enforce implicit constraints on `distinct` MDNodes Add UNIQUED and DISTINCT properties in Metadata.def and use them to implement restrictions on the `distinct` property of MDNodes: * DIExpression can currently be parsed from IR or read from bitcode as `distinct`, but this property is silently dropped when printing to IR. This causes accepted IR to fail to round-trip. As DIExpression appears inline at each use in the canonical form of IR, it cannot actually be `distinct` anyway, as there is no syntax to describe it. * Similarly, DIArgList is conceptually always uniqued. It is currently restricted to only appearing in contexts where there is no syntax for `distinct`, but for consistency it is treated equivalently to DIExpression in this patch. * DICompileUnit is already restricted to always being `distinct`, but along with adding general support for the inverse restriction I went ahead and described this in Metadata.def and updated the parser to be general. Future nodes which have this restriction can share this support. The new UNIQUED property applies to DIExpression and DIArgList, and forbids them to be `distinct`. It also implies they are canonically printed inline at each use, rather than via MDNode ID. The new DISTINCT property applies to DICompileUnit, and requires it to be `distinct`. A potential alternative change is to forbid the non-inline syntax for DIExpression entirely, as is done with DIArgList implicitly by requiring it appear in the context of a function. For example, we would forbid: !named = !{!0} !0 = !DIExpression() Instead we would only accept the equivalent inlined version: !named = !{!DIExpression()} This essentially removes the ability to create a `distinct` DIExpression by construction, as there is no syntax for `distinct` inline. If this patch is accepted as-is, the result would be that the non-canonical version is accepted, but the following would be an error and produce a diagnostic: !named = !{!0} ; error: 'distinct' not allowed for !DIExpression() !0 = distinct !DIExpression() Also update some documentation to consistently use the inline syntax for DIExpression, and to describe the restrictions on `distinct` for nodes where applicable. Reviewed By: StephenTozer, t-tye Differential Revision: https://reviews.llvm.org/D104827	2021-11-09 18:19:11 +00:00
Fraser Cormack	3a11fb572c	[LangRef][VP] Document vp.gather and vp.scatter intrinsics This patch fleshes out the missing documentation for the final two VP intrinsics introduced in D99355: `llvm.vp.gather` and `llvm.vp.scatter`. It does so mostly by deferring to the `llvm.masked.gather` and `llvm.masked.scatter` intrinsics, respectively. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D112997	2021-11-05 11:36:03 +00:00
Fraser Cormack	93e1802af3	[LangRef][VP] Document vp.load and vp.store intrinsics This patch fleshes out the missing documentation for two of the VP intrinsics introduced in D99355: `llvm.vp.load` and `llvm.vp.store`. It does so mostly by deferring to the `llvm.masked.load` and `llvm.masked.store` intrinsics, respectively. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D112930	2021-11-05 10:39:34 +00:00
Fraser Cormack	6fb41c3dea	[LangRef][VP] Correct mask type in vp.slice documentation The mask type for the llvm.experimental.vp.splice intrinsics must have the same number of elements as the result type. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D112924	2021-11-02 14:48:23 +00:00
Fraser Cormack	c3dce37a55	[LangRef] Document that DataLayout defaults to little-endian Little-endian has apparently been the default since 2014. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D112316	2021-10-26 10:54:23 +01:00
William Woodruff	86a4a93a1c	[docs] [NFC] Clarify the datalayout documentation This patch fixes a couple of small oversights in the documentation for the datalayout specification: * The v and f specifications are subject to the same constraints on <size> as i is. * The p[n] specification didn't mark <idx> as optional, despite being documented and parsed as such. * Similarly, none of the alignment specifications require <pref>.	2021-10-12 23:21:48 +05:30
modimo	ef643617b8	[NFC][LangRef] Update description for FuncFlags Add the additional flags from D36850 as well as noInline/alwaysInline from previous changes. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D111600	2021-10-11 22:03:53 -07:00
Arthur Eubanks	b41cfbfcbb	[docs] Mention in release notes that we now support 2^32 alignment Missed in D110451. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D111472	2021-10-11 10:23:15 -07:00
Yuanfang Chen	8e3b9f453f	[LangRef] Fix a typo in DISubrange section	2021-10-08 16:46:31 -07:00
Fangrui Song	f66b1b2717	[LangRef] Update ifunc syntax Extracted from Itay Bookstein's D108872.	2021-10-07 11:14:40 -07:00
Simon Moll	72a08c0b94	[VP] Vector predicated vector splice intrinsic This patch introduces the vector-predicated version of the experimental_vector_splice intrinsic [1] at the IR level. It considers the active vector length for both vectors and and uses a vector mask to disable certain lanes in the result. [1] https://reviews.llvm.org/D94708 Change originally authored by Vineet Kumar <vineet.kumar@bsc.es> Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D103898	2021-09-29 10:43:36 +02:00
Arthur Eubanks	aa53785f23	Reland [clang] Rework dontcall attributes To avoid using the AST when emitting diagnostics, split the "dontcall" attribute into "dontcall-warn" and "dontcall-error", and also add the frontend attribute value as the LLVM attribute value. This gives us all the information to report diagnostics we need from within the IR (aside from access to the original source). One downside is we directly use LLVM's demangler rather than using the existing Clang diagnostic pretty printing of symbols. Previous revisions didn't properly declare the new dependencies. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D110364	2021-09-28 15:31:30 -07:00
Arthur Eubanks	7833d20f1f	Revert "[clang] Rework dontcall attributes" This reverts commit `2943071e2e`. Breaks bots	2021-09-28 14:49:27 -07:00
Arthur Eubanks	2943071e2e	[clang] Rework dontcall attributes To avoid using the AST when emitting diagnostics, split the "dontcall" attribute into "dontcall-warn" and "dontcall-error", and also add the frontend attribute value as the LLVM attribute value. This gives us all the information to report diagnostics we need from within the IR (aside from access to the original source). One downside is we directly use LLVM's demangler rather than using the existing Clang diagnostic pretty printing of symbols. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D110364	2021-09-28 14:21:10 -07:00
Anirudh Prasad	e09a1dc475	[SystemZ][z/OS] Add GOFF Support to the DataLayout - This patch adds in the GOFF mangling support to the LLVM data layout string. A corresponding additional line has been added into the data layout section in the language reference documentation. - Furthermore, this patch also sets the right data layout string for the z/OS target in the SystemZ backend. Reviewed By: uweigand, Kai, abhina.sreeskantharajan, MaskRay Differential Revision: https://reviews.llvm.org/D109362	2021-09-24 14:09:01 -04:00
Alok Kumar Sharma	a5b72abc9e	[DebugInfo] Enhance DIImportedEntity to accept children entities New field `elements` is added to '!DIImportedEntity', representing list of aliased entities. This is needed to dump optimized debugging information where all names in a module are imported, but a few names are imported with overriding aliases. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D109343	2021-09-16 10:41:55 +05:30
Craig Topper	2fd180bbb9	[IR] Reduce max supported integer from 2^24-1 to 2^23. SelectionDAG will promote illegal types up to a power of 2 before splitting down to a legal type. This will create an IntegerType with a bit width that must be <= MAX_INT_BITS. This places an effective upper limit on any type of 2^23 so that we don't try create a 2^24 type. I considered putting a fatal error somewhere in the path from TargetLowering::getTypeConversion down to IntegerType::get, but limiting the type in IR seemed better. This breaks backwards compatibility with IR that is using a really large type. I suspect such IR is going to be very rare due to the the compile time costs such a type likely incurs. Prevents the ICE in PR51829. Reviewed By: efriedma, aaron.ballman Differential Revision: https://reviews.llvm.org/D109721	2021-09-14 07:52:10 -07:00
Akira Hatanaka	dea6f71af0	[ObjC][ARC] Use the addresses of the ARC runtime functions instead of integer 0/1 for the operand of bundle "clang.arc.attachedcall" https://reviews.llvm.org/D102996 changes the operand of bundle "clang.arc.attachedcall". This patch makes changes to llvm that are needed to handle the new IR. This should make it easier to understand what the IR is doing and also simplify some of the passes as they no longer have to translate the integer values to the runtime functions. Differential Revision: https://reviews.llvm.org/D103000	2021-09-08 11:58:03 -07:00
Roman Lebedev	3f1f08f0ed	Revert @llvm.isnan intrinsic patchset. Please refer to https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html (and that whole thread.) TLDR: the original patch had no prior RFC, yet it had some changes that really need a proper RFC discussion. It won't be productive to discuss such an RFC, once it's actually posted, while said patch is already committed, because that introduces bias towards already-committed stuff, and the tree is potentially in broken state meanwhile. While the end result of discussion may lead back to the current design, it may also not lead to the current design. Therefore i take it upon myself to revert the tree back to last known good state. This reverts commit `4c4093e6e3`. This reverts commit `0a2b1ba33a`. This reverts commit `d9873711cb`. This reverts commit `791006fb8c`. This reverts commit `c22b64ef66`. This reverts commit `72ebcd3198`. This reverts commit `5fa6039a5f`. This reverts commit `9efda541bf`. This reverts commit `94d3ff09cf`.	2021-09-02 13:53:56 +03:00
Simon Moll	ea2cdbf5e6	[VP] Declaration and docs for vp.select intrinsic llvm.vp.select extends the regular select instruction with an explicit vector length (%evl). All lanes with indexes at and above %evl are undefined. Lanes below %evl are taken from the first input where the mask is true and from the second input otherwise. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D105351	2021-09-02 11:17:14 +02:00
Kazu Hirata	5294a0f7c3	[llvm] Fix typos in documentation (NFC)	2021-08-28 06:37:03 -07:00
Nick Desaulniers	846e562dcc	[Clang] add support for error+warning fn attrs Add support for the GNU C style __attribute__((error(""))) and __attribute__((warning(""))). These attributes are meant to be put on declarations of functions whom should not be called. They are frequently used to provide compile time diagnostics similar to _Static_assert, but which may rely on non-ICE conditions (ie. relying on compiler optimizations). This is also similar to diagnose_if function attribute, but can diagnose after optimizations have been run. While users may instead simply call undefined functions in such cases to get a linkage failure from the linker, these provide a much more ergonomic and actionable diagnostic to users and do so at compile time rather than at link time. Users instead may be able use inline asm .err directives. These are used throughout the Linux kernel in its implementation of BUILD_BUG and BUILD_BUG_ON macros. These macros generally cannot be converted to use _Static_assert because many of the parameters are not ICEs. The Linux kernel still needs to be modified to make use of these when building with Clang; I have a patch that does so I will send once this feature is landed. To do so, we create a new IR level Function attribute, "dontcall" (both error and warning boil down to one IR Fn Attr). Then, similar to calls to inline asm, we attach a !srcloc Metadata node to call sites of such attributed callees. The backend diagnoses these during instruction selection, while we still know that a call is a call (vs say a JMP that's a tail call) in an arch agnostic manner. The frontend then reconstructs the SourceLocation from that Metadata, and determines whether to emit an error or warning based on the callee's attribute. Link: https://bugs.llvm.org/show_bug.cgi?id=16428 Link: https://github.com/ClangBuiltLinux/linux/issues/1173 Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D106030	2021-08-25 10:34:18 -07:00
Alexander Potapenko	b0391dfc73	[clang][Codegen] Introduce the disable_sanitizer_instrumentation attribute The purpose of __attribute__((disable_sanitizer_instrumentation)) is to prevent all kinds of sanitizer instrumentation applied to a certain function, Objective-C method, or global variable. The no_sanitize(...) attribute drops instrumentation checks, but may still insert code preventing false positive reports. In some cases though (e.g. when building Linux kernel with -fsanitize=kernel-memory or -fsanitize=thread) the users may want to avoid any kind of instrumentation. Differential Revision: https://reviews.llvm.org/D108029	2021-08-20 14:01:06 +02:00
Fraser Cormack	f3e9047249	[VP] Add vector-predicated reduction intrinsics This patch adds vector-predicated ("VP") reduction intrinsics corresponding to each of the existing unpredicated `llvm.vector.reduce.*` versions. Unlike the unpredicated reductions, all VP reductions have a start value. This start value is returned when the no vector element is active. Support for expansion on targets without native vector-predication support is included. This patch is based on the ["reduction slice"](https://reviews.llvm.org/D57504#1732277) of the LLVM-VP reference patch (https://reviews.llvm.org/D57504). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104308	2021-08-17 17:56:35 +01:00
Florian Hahn	f999312872	Recommit "[Matrix] Overload stride arg in matrix.columnwise.load/store." This reverts the revert `28c04794df`. The failing MLIR test that caused the revert should be fixed in this version. Also includes a PPC test fix previously in `1f87c7c478`.	2021-08-12 18:31:57 +01:00
Mehdi Amini	28c04794df	Revert "[Matrix] Overload stride arg in matrix.columnwise.load/store." This reverts commit `a1ef81de35`. Broke the MLIR buildbot.	2021-08-12 11:57:19 +00:00
Florian Hahn	a1ef81de35	[Matrix] Overload stride arg in matrix.columnwise.load/store. This patch adjusts the intrinsics definition of llvm.matrix.column.major.load and llvm.matrix.column.major.store to allow overloading the type of the stride. The bitwidth of the stride is used to perform the offset computation. This fixes a crash when using __builtin_matrix_column_major_load or __builtin_matrix_column_major_store on 32 bit platforms. The stride argument of the builtins are defined as `size_t`, which is 32 bits wide on 32 bit platforms. Note that we still perform offset computations with 64 bit width on 32 bit platforms for accesses that do not take a user-specified stride. This can be fixed separately. Fixes PR51304. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D107349	2021-08-12 10:45:25 +01:00
Serge Pavlov	4c4093e6e3	Introduce intrinsic llvm.isnan This is recommit of the patch `16ff91ebcc`, reverted in `0c28a7c990` because it had an error in call of getFastMathFlags (base type should be FPMathOperator but not Instruction). The original commit message is duplicated below: Clang has builtin function '__builtin_isnan', which implements C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands the function into set of IR operations. There are three mechanisms by which the expansion can be made. * The most common mechanism is using an unordered comparison made by instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It however is not suitable if floating point exceptions are tracked. Corresponding IEEE 754 operation and C function must never raise FP exception, even if the argument is a signaling NaN. Compare instructions usually does not have such property, they raise 'invalid' exception in such case. So this mechanism is unsuitable when exception behavior is strict. In particular it could result in unexpected trapping if argument is SNaN. * Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, however some can do the check in more efficient way. * Solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target specific code into IR. Now only SystemZ implements this hook and it generates a call to target specific intrinsic function. Although these mechanisms allow to implement 'isnan' with enough efficiency, expanding 'isnan' in clang has drawbacks: * The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. It complicates analysis and can prevent some optimizations. * IR can be created by tools other than clang, in this case treatment of 'isnan' has to be duplicated in that tool. Another issue with the current implementation of 'isnan' comes from the use of options '-ffast-math' or '-fno-honor-nans'. If such option is specified, 'fcmp uno' may be optimized to 'false'. It is valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false': std::isnan(std::numeric_limits<float>::quiet_NaN()) The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption however should not be applied to the functions that check FP number properties, including 'isnan'. If such function returns expected result instead of actually making checks, it becomes useless in many cases. The option '-ffast-math' is often used for performance critical code, as it can speed up execution by the expense of manual treatment of corner cases. If 'isnan' returns assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion, that limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'. To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by IEEE-754 and C standards in target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where is lowered in target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so the resulting code satisfies requested semantics. Differential Revision: https://reviews.llvm.org/D104854	2021-08-06 14:32:27 +07:00
Serge Pavlov	0c28a7c990	Revert "Introduce intrinsic llvm.isnan" This reverts commit `16ff91ebcc`. Several errors were reported mainly test-suite execution time. Reverted for investigation.	2021-08-04 17:18:15 +07:00
Serge Pavlov	16ff91ebcc	Introduce intrinsic llvm.isnan Clang has builtin function '__builtin_isnan', which implements C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands the function into set of IR operations. There are three mechanisms by which the expansion can be made. * The most common mechanism is using an unordered comparison made by instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It however is not suitable if floating point exceptions are tracked. Corresponding IEEE 754 operation and C function must never raise FP exception, even if the argument is a signaling NaN. Compare instructions usually does not have such property, they raise 'invalid' exception in such case. So this mechanism is unsuitable when exception behavior is strict. In particular it could result in unexpected trapping if argument is SNaN. * Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, however some can do the check in more efficient way. * Solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target specific code into IR. Now only SystemZ implements this hook and it generates a call to target specific intrinsic function. Although these mechanisms allow to implement 'isnan' with enough efficiency, expanding 'isnan' in clang has drawbacks: * The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. It complicates analysis and can prevent some optimizations. * IR can be created by tools other than clang, in this case treatment of 'isnan' has to be duplicated in that tool. Another issue with the current implementation of 'isnan' comes from the use of options '-ffast-math' or '-fno-honor-nans'. If such option is specified, 'fcmp uno' may be optimized to 'false'. It is valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false': std::isnan(std::numeric_limits<float>::quiet_NaN()) The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption however should not be applied to the functions that check FP number properties, including 'isnan'. If such function returns expected result instead of actually making checks, it becomes useless in many cases. The option '-ffast-math' is often used for performance critical code, as it can speed up execution by the expense of manual treatment of corner cases. If 'isnan' returns assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion, that limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'. To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by IEEE-754 and C standards in target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where is lowered in target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so the resulting code satisfies requested semantics. Differential Revision: https://reviews.llvm.org/D104854	2021-08-04 15:27:49 +07:00
Hsiangkai Wang	ee3aef93b7	[RISCV][Docs] Add description about inline asm constraint for V. Add inline asm constraint 'vr' for vector registers and 'vm' for vector mask registers. Differential Revision: https://reviews.llvm.org/D106633	2021-08-01 05:58:17 +08:00

1 2 3 4 5 ...

1008 Commits