llvm-project


Author	SHA1	Message	Date
Stella Laurenzo	2dc68b5398	Add APFloat and MLIR type support for fp8 (e5m2). This is a first step towards high level representation for fp8 types that have been built in to hardware with near term roadmaps. Like the BFLOAT16 type, the family of fp8 types are inspired by IEEE-754 binary floating point formats but, due to the size limits, have been tweaked in various ways in order to maximally use the range/precision in various scenarios. The list of variants is small/finite and bounded by real hardware. This patch introduces the E5M2 FP8 format as proposed by Nvidia, ARM, and Intel in the paper: https://arxiv.org/pdf/2209.05433.pdf As the more conformant of the two implemented datatypes, we are plumbing it through LLVM's APFloat type and MLIR's type system first as a template. It will be followed by the range optimized E4M3 FP8 format described in the paper. Since that format deviates further from the IEEE-754 norms, it may require more debate and implementation complexity. Given that we see two parts of the FP8 implementation space represented by these cases, we are recommending naming of: * `F8M<N>` : For FP8 types that can be conceived of as following the same rules as FP16 but with a smaller number of mantissa/exponent bits. Including the number of mantissa bits in the type name is enough to fully specify the type. This naming scheme is used to represent the E5M2 type described in the paper. * `F8M<N>F` : For FP8 types such as E4M3 which only support finite values. The first of these (this patch) seems fairly non-controversial. The second is previewed here to illustrate options for extending to the other known variant (but can be discussed in detail in the patch which implements it). Many conversations about these types focus on the Machine-Learning ecosystem where they are used to represent mixed-datatype computations at a high level. 
At that level (which is why we also expose them in MLIR), it is important to retain the actual type definition so that when lowering to actual kernels or target specific code, the correct promotions, casts and rescalings can be done as needed. We expect that most LLVM backends will only experience these types as opaque `I8` values that are applicable to some instructions. MLIR does not make it particularly easy to add new floating point types (i.e. the FloatType hierarchy is not open). Given the need to fully model FloatTypes and make them interop with tooling, such types will always be "heavy-weight" and it is not expected that a highly open type system will be particularly helpful. There are also a bounded number of floating point types in use for current and upcoming hardware, and we can just implement them like this (perhaps looking for some cosmetic ways to reduce the number of places that need to change). Creating a more generic mechanism for extending floating point types seems like it wouldn't be worth it and we should just deal with defining them one by one on an as-needed basis when real hardware implements a new scheme. Hopefully, with some additional production use and complete software stacks, hardware makers will converge on a set of such types that is not terribly divergent at the level that the compiler cares about. (I cleaned up some old formatting and sorted some items for this case: If we converge on landing this in some form, I will NFC commit format only changes as a separate commit) Differential Revision: https://reviews.llvm.org/D133823	2022-10-02 17:17:08 -07:00
Yeting Kuo	cefb7aab61	[VP][RISCV] Add vp.copysign and RISC-V support. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134935	2022-10-01 10:19:10 +08:00
Xiang Li	a80a888de5	[DirectX backend] Support global ctor for DXILBitcodeWriter. 1. Save typed pointer type for GlobalVariable/Function instead of the ObjectType. This will allow use GlobalVariable/Function as value. 2. Save target type for global ctors for Constant. 3. In DXILBitcodeWriter::getTypeID, check PointerMap first for Constant case. Reviewed By: beanz Differential Revision: https://reviews.llvm.org/D133283	2022-09-30 11:27:23 -07:00
Serge Pavlov	b934be2c05	[Support] Class for response file expansion (NFC) Functions that implement expansion of response and config files depend on many options, which are passed as arguments. Extending the expansion requires new options, which in turn requires changing calls in various places, making them even bulkier. This change introduces a class ExpansionContext, which represents the set of options that control the expansion. Its methods implement expansion of response files, including config files. It makes extending the expansion easier. No functional changes. Differential Revision: https://reviews.llvm.org/D132379	2022-09-29 19:15:01 +07:00
chenglin.bi	0346f78a6f	[ARM64EC] Add arm64ec for getArchName Followup D125412, return the correct arch name for Arm64EC Reviewed By: efriedma, mstorsjo Differential Revision: https://reviews.llvm.org/D134787	2022-09-29 09:05:17 +08:00
Jessica Paquette	704b2e162c	[GlobalISel] Add isConstFalseVal helper to Utils Add a utility function which returns true if the given value is a constant false value. This is necessary to port one of the compare simplifications in TargetLowering::SimplifySetCC. Differential Revision: https://reviews.llvm.org/D91754	2022-09-28 15:44:26 -07:00
Aiden Grossman	8d77f8fde7	[MLGO] Add per-instruction MBB frequencies to regalloc dev features This commit adds two new features to the ML regalloc eviction analysis that can be used in ML models: a vector of MBB frequencies and a vector of indices mapping instructions to their corresponding basic blocks. This will allow for further experimentation with per-instruction features and give a lot more flexibility for future experimentation over how we're extracting MBB frequency data currently. Reviewed By: mtrofin, jacobhegna Differential Revision: https://reviews.llvm.org/D134166	2022-09-28 18:45:04 +00:00
Serge Pavlov	5ddde5f80a	Revert "[Support] Class for response file expansion (NFC)" This reverts commit `6e491c48d6`. There are missed changes in flang.	2022-09-28 13:33:28 +07:00
Serge Pavlov	6e491c48d6	[Support] Class for response file expansion (NFC) Functions that implement expansion of response and config files depend on many options, which are passed as arguments. Extending the expansion requires new options, which in turn requires changing calls in various places, making them even bulkier. This change introduces a class ExpansionContext, which represents the set of options that control the expansion. Its methods implement expansion of response files, including config files. It makes extending the expansion easier. No functional changes. Differential Revision: https://reviews.llvm.org/D132379	2022-09-28 11:47:59 +07:00
eopXD	9677d70eb2	[VP][RISCV] Add vp.floor, vp.round, vp.roundeven and their RISC-V support Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134759	2022-09-27 19:45:58 -07:00
eopXD	163cb33854	[VP][RISCV] Add vp.ceil and RISC-V support Previous commit `8b00b24f85` did not add the `int_ceil` anchor for the llvm.ceil.* section in LangRef.rst Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134586	2022-09-27 12:04:09 -07:00
eopXD	384b8b3da7	Revert "[VP][RISCV] Add vp.ceil and RISC-V support" This reverts commit `8b00b24f85`.	2022-09-27 11:12:57 -07:00
eopXD	8b00b24f85	[VP][RISCV] Add vp.ceil and RISC-V support Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134586	2022-09-27 11:08:27 -07:00
Craig Topper	a6383bb51c	[VP][RISCV] Add vp.fmuladd. Expanded in SelectionDAGBuilder similar to llvm.fmuladd. Reviewed By: frasercrmck, simoll Differential Revision: https://reviews.llvm.org/D134474	2022-09-27 10:02:37 -07:00
Carlos Alberto Enciso	bab129f2d3	[ADT] Add IntervalTree - light tree data structure to hold intervals. Fix build failure in: https://lab.llvm.org/buildbot/#/builders/36/builds/25424 error: comparison of integers of different signs: 'const unsigned long' and 'const int' [-Werror,-Wsign-compare] Reviewed By: Orlando Differential Revision: https://reviews.llvm.org/D125776	2022-09-27 12:48:44 +01:00
Daniel Kiss	712de9d171	[AArch64] Add all predecessor archs in target info A given function is compatible with all previous arch versions. To avoid comparing values of the attribute, this logic adds all predecessor architecture values. Reviewed By: dmgreen, DavidSpickett Differential Revision: https://reviews.llvm.org/D134353	2022-09-27 10:23:21 +02:00
David Sherwood	fbb119412f	[AArch64] Add Neoverse V2 CPU support Adds support for the Neoverse V2 CPU to the AArch64 backend. Differential Revision: https://reviews.llvm.org/D134352	2022-09-27 07:56:08 +00:00
Carlos Alberto Enciso	6584d1f930	[ADT] Add IntervalTree - light tree data structure to hold intervals. It allows finding all intervals that overlap with any given point. At this time, it does not support any deletion or rebalancing operations. The IntervalTree is designed to be set up once, and then queried without any further additions. Reviewed By: psamolysov, probinson Differential Revision: https://reviews.llvm.org/D125776	2022-09-27 08:22:28 +01:00
Yeting Kuo	04e1301f3d	[VP][RISCV] Add vp.maxnum and vp.minnum intrinsics and RISC-V support. Add vp.maxnum and vp.minnum which are vector predicted intrinsics of llvm.maxnum and llvm.minnum. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134639	2022-09-27 13:36:45 +08:00
Lang Hames	a5f0054915	[ORC] Update LinkGraph unit tests for API change in `75404e9ef8`.	2022-09-25 22:02:15 -07:00
Yeting Kuo	43c5fbdd3a	[VP][RISCV] Add vp.sqrt intrinsic and RISC-V support. The patch modeled vp.fabs patch D132793. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D133690	2022-09-26 10:47:40 +08:00
Daniel Kiss	7e1a873872	[Arm][AArch64] Make getArchFeatures to use TargetParser.def Prefixing the SubArch with a plus sign gives the ArchFeature name. Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D134349	2022-09-23 10:25:37 +02:00
Shraiysh Vaishay	95eb5109af	[OpenMP][IRBuilder] Added if clause to task This patch adds support for if clause to task construct in OpenMP IRBuilder. Reviewed By: raghavendhra Differential Revision: https://reviews.llvm.org/D130615	2022-09-23 01:39:41 +00:00
Arthur Eubanks	a8f1da128d	[LazyCallGraph] Handle spurious ref edges when deleting a dead function Spurious ref edges are ref edges that still exist in the call graph even though the corresponding IR reference no longer exists. This can cause issues when deleting a dead function which has a spurious ref edge pointed at it because currently we expect the dead function's RefSCC to be trivial. In the case that the dead function's RefSCC is not trivial, remove all ref edges from other nodes in the RefSCC to it. Removing a ref edge can result in splitting RefSCCs. There's actually no reason to revisit those RefSCCs because currently we only run passes on SCCs, and we've already added all SCCs in the RefSCC to the worklist. (as opposed to removing the ref edge in updateCGAndAnalysisManagerForPass() which can modify the call graph of SCCs we have not visited yet). We also don't expect that RefSCC refinement will allow us to glean any more information for optimization use. Also, doing so would drastically increase the complexity of LazyCallGraph::removeDeadFunction(), requiring us to return a list of invalidated RefSCCs and new RefSCCs to add to the worklist. Fixes #56503 Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D133907	2022-09-22 15:01:15 -07:00
Amara Emerson	885a87033c	[GlobalISel] Enforce G_ASSERT_ALIGN to have a valid alignment > 0.	2022-09-22 16:05:07 +01:00
Tim Northover	677da09d02	AArch64: add support for newer Apple CPUs They're roughly ARMv8.6. This works in the .td file, but in AArch64TargetParser.def, marking them v8.6 brings in support for the SM4 cryptographic hash and we don't actually have that. So on the TargetParser side they're marked as v8.5, with the extra features (BF16 and I8MM) added manually. Finally, A16 supports the HCX extension in addition to v8.6. This has no TargetParser implications.	2022-09-22 11:58:51 +01:00
Clement Courbet	e52f8406e8	Re-land "[llvm-exegesis] Support analyzing results from a different target." With Mips fixes. This reverts commit `7daf60e344`.	2022-09-22 11:39:52 +02:00
Clement Courbet	7daf60e344	Revert "[llvm-exegesis] Support analyzing results from a different target." Breaks MIPS compile. This reverts commit `cc61c822e0`.	2022-09-22 11:19:01 +02:00
Clement Courbet	cc61c822e0	[llvm-exegesis] Support analyzing results from a different target. We were using the native triple to parse the benchmarks. Use the triple from the benchmarks file. Right now this still only allows analyzing files produced by the current target until D133605 is in. This also makes the `Analysis` class much less ad-hoc. Differential Revision: https://reviews.llvm.org/D133697	2022-09-22 11:11:18 +02:00
Corentin Jabot	c932cef32a	Update Unicode to 15.0 Unicode 15.0 adds 4,489 characters, for a total of 149,186 characters. These additions include 2 new scripts along with 20 new emoji characters, and 4,193 CJK ideographs. These changes modify most existing tables, including: - XID_Start/XID_Continue in Clang - The character name database (used by \N{} in Clang) - The list of formattable/printable codepoints - The case folding algorithm (which we had not updated since Unicode 9) - The list of nonspacing/enclosing marks used by the column width computation algorithm. The rest of the column width algorithm is not updated. Reviewed By: tahonermann Differential Revision: https://reviews.llvm.org/D133807	2022-09-22 05:03:01 +02:00
Amara Emerson	85cd376f70	[GlobalISel] Fix known bits for G_ASSERT_ALIGN. I don't know what was going on originally with these tests. It seems reasonable to have the immediate be the same byte alignment unit as the IR, in which case we need to take the log2 in order to set the right number of low bits. This fixes a miscompile in chromium. Differential Revision: https://reviews.llvm.org/D134380	2022-09-21 21:34:05 +01:00
Shubham Sandeep Rastogi	636de2bf34	Change isLittleEndian to follow llvm style and add an accessor Differential Revision: https://reviews.llvm.org/D134290	2022-09-20 17:00:47 -07:00
Eric Li	86118ec2d0	[Support] Provide access to the full mapping in llvm::Annotations Providing access to the mapping of annotations allows test helpers to be expressive by using the annotations as expectations. For example, a matcher could verify that all annotated points were matched by a matcher, or that a refactoring surgically modifies specific ranges. Differential Revision: https://reviews.llvm.org/D134072	2022-09-20 11:06:21 -04:00
Simon Pilgrim	1146d40d9a	[UnitTests] Add ShuffleVectorInst unit test coverage for shuffle mask kind matchers Add tests for the core static shuffle pattern match helpers	2022-09-19 11:53:30 +01:00
Kazu Hirata	6b49f30fca	[llvm] Deprecate llvm::empty (NFC) This patch deprecates llvm::empty as I've migrated all known uses of llvm::empty(x) to x.empty(). Differential Revision: https://reviews.llvm.org/D134141	2022-09-18 22:01:32 -07:00
Kazu Hirata	1cd4563013	[llvm] Use has_value instead of hasValue (NFC)	2022-09-18 19:45:34 -07:00
Lang Hames	0e43f3b04d	[ORC][ORC-RT] Make WrapperFunctionCall::Create support void functions. Serialized calls to void-wrapper-functions should have zero bytes of argument data, but accessing ArgData[0] may (and will, in the case of SmallVector) fail if the argument data buffer is empty. This commit fixes the issue by adding a check for empty argument buffers.	2022-09-18 17:53:45 -07:00
Kazu Hirata	a2842a43a1	[llvm] Use x.empty() instead of llvm::empty(x) (NFC) I'm planning to deprecate and eventually remove llvm::empty. Note that no use of llvm::empty requires the ability of llvm::empty to determine the emptiness from begin/end only.	2022-09-18 11:21:16 -07:00
Aiden Grossman	e5e3dccd07	[mlgo] Add in-development instruction based features for regalloc advisor This patch adds instruction based features to the regalloc advisor, gated behind a flag so a user can decide at runtime whether or not they want to enable the feature. The features are only enabled when LLVM is compiled in MLGO development mode (LLVM_HAVE_TF_API is set to true). To extract the instruction features, I'm taking a list of segments from each LiveInterval and noting the start and end SlotIndices. This list is then sorted based on the start SlotIndex and I iterate through each SlotIndex to grab instructions, making sure to check for overlaps. This results in a vector of opcodes and a binary mapping matrix that maps live ranges to the opcodes of the instructions within that LR. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D131930	2022-09-17 19:54:45 +00:00
Fangrui Song	367997d0d6	[Support] Rename llvm::compression::{zlib,zstd}::uncompress to more appropriate decompress This improves consistency with other places (e.g. llvm::compression::decompress, llvm::object::Decompressor::decompress, llvm-objcopy). Note: when zstd::uncompress was added, we noticed that the API `ZSTD_decompress` is fine while the zlib API `uncompress` is a misnomer.	2022-09-17 12:35:17 -07:00
Kazu Hirata	29c841ce93	Revert "[llvm] Remove llvm::is_trivially_{copy/move}_constructible (NFC)" This reverts commit `01ffe31cbb`. A build breakage with GCC 7.3 has been reported: https://reviews.llvm.org/D132311#3797053 FWIW, GCC 7.5 is OK according to Pavel Chupin. I also personally tested GCC 8.4.0.	2022-09-16 18:26:20 -07:00
mbs	7061a3f3f8	[support] Prepare TimeProfiler for cross-thread support This NFC prepares the TimeProfiler to support the construction and completion of time profiling 'entries' across threads. Add ClockType alias so we can change the clock in one place. (trivial) Use c++ usings instead of typedefs Rename Entry to TimeTraceProfilerEntry since this type will eventually become public. Add an intro comment. Add some smoke unit tests. Reviewed By: russell.gallop, rriddle, lattner, jloser Differential Revision: https://reviews.llvm.org/D133153	2022-09-16 10:20:18 -06:00
Craig Topper	ace05124f5	[IntegerDivision][AMDGPU] Use CreateLogicalOr to block poison propagation. There are two ctlz intrinsics here with the zero_is_poison flag set. There are also two comparisons that check if either of the inputs to the ctlzs is zero. We need to use a logical or to block the poison from the ctlz if either of the inputs is zero. Reviewed By: arsenm, aqjune Differential Revision: https://reviews.llvm.org/D130680	2022-09-15 09:38:02 -07:00
Michael Platings	f0c234d2a6	[NFC] Don't assume llvm directory is CMake root This makes the file consistent with ARM/CMakeLists.txt	2022-09-15 13:06:54 +01:00
Nikita Popov	b1cd393f9e	[AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI) Currently, FunctionModRefBehavior tracks whether the function reads or writes memory (ModRefInfo) and which locations it can access (argmem, inaccessiblemem and other). This patch changes it to track ModRef information per-location instead. To give two examples of why this is useful: * D117095 highlights a weakness of ModRef modelling in the presence of operand bundles. For a memcpy call with deopt operand bundle, we want to say that it can read any memory, but only write argument memory. This would allow them to be treated like any other calls. However, we currently can't express this and have to say that it can read or write any memory. * D127383 would ideally be modelled as a separate threadid location, where threadid Refs outside pre-split coroutines can be ignored (like other accesses to constant memory). The current representation does not allow modelling this precisely. The patch as implemented is intended to be NFC, but there are some obvious opportunities for improvements and simplification. To fully capitalize on this we would also want to change the way we represent memory attributes on functions, but that's a larger change, and I think it makes sense to separate out the FunctionModRefBehavior refactoring. Differential Revision: https://reviews.llvm.org/D130896	2022-09-14 16:34:41 +02:00
Chris Bieneman	4b96f8996a	[DX] DXContainer does not support COMDAT The DXContainer is pretty primitive, but doesn't support COMDAT. We need to set that in the Triple so that Clang won't try to emit COMDATs.	2022-09-13 13:59:47 -05:00
Amara Emerson	25bcc8c797	[GlobalISel][Legalizer] Fix minScalarEltSameAsIf to handle p0 element types. The mutation the action generates tries to change the input type into the element type of the larger vector type. This doesn't work if the larger type is a vector of pointers, since it creates an illegal mutation between scalar and pointer types. Differential Revision: https://reviews.llvm.org/D133671	2022-09-13 00:01:37 +01:00
Sander de Smalen	cf72dddaef	[AArch64][SME] Add utility class for handling SME attributes. This patch adds a utility class that will be used in subsequent patches for parsing the function/callsite attributes and determining whether changes to PSTATE.SM are needed, or whether a lazy-save mechanism is required. It also implements some of the restrictions on the SME attributes in the IR Verifier pass. More details about the SME attributes and design can be found in D131562. Reviewed By: david-arm, aemerson Differential Revision: https://reviews.llvm.org/D131570	2022-09-12 12:41:30 +00:00
Clement Courbet	7053e863a1	[llvm-exegesis][NFC] Use factory function for LlvmState. This allows failing more gracefully.	2022-09-12 14:19:33 +02:00
Aiden Grossman	ec83c7e358	[MLGO] Make TFLiteUtils throw an error if some features haven't been passed to the model In the Tensorflow C lib utilities, an error gets thrown if some features haven't gotten passed into the model (due to differences in ordering which now don't exist with the transition to TFLite). However, this is not currently the case when using TFLiteUtils. This patch makes some minor changes to throw an error when not all inputs of the model have been passed, which when not handled will result in a seg fault within TFLite. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D133451	2022-09-10 22:59:03 +00:00
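
The E5M2 layout introduced by commit 2dc68b5398 in the log above can be illustrated with a small decoder. This is only a sketch, not LLVM's APFloat code: it assumes the IEEE-754-style reading of E5M2 given in the paper linked from that commit (1 sign bit, 5 exponent bits with bias 15, 2 mantissa bits, with subnormals, infinities and NaNs), and the function name `decode_e5m2` is hypothetical.

```python
import math


def decode_e5m2(byte: int) -> float:
    """Decode one E5M2 byte into a Python float.

    Assumed layout (not taken from LLVM sources): bit 7 is the sign,
    bits 6..2 are the exponent (bias 15), bits 1..0 are the mantissa.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")

    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 2) & 0x1F
    mantissa = byte & 0x03

    if exponent == 0x1F:
        # All-ones exponent: infinity when the mantissa is zero, NaN otherwise.
        return sign * math.inf if mantissa == 0 else math.nan
    if exponent == 0:
        # Zero and subnormals: no implicit leading one, fixed exponent 1 - bias.
        return sign * (mantissa / 4.0) * 2.0 ** (1 - 15)
    # Normal numbers: implicit leading one plus a 2-bit fraction.
    return sign * (1.0 + mantissa / 4.0) * 2.0 ** (exponent - 15)
```

Under these assumptions, `decode_e5m2(0x3C)` is 1.0 and the largest finite value, `decode_e5m2(0x7B)`, is 57344.0. The finite-only E4M3 variant mentioned in the same commit message follows different rules and is not covered by this sketch.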


7992 Commits