Commit Graph

49278 Commits

Author SHA1 Message Date
Stella Laurenzo 2dc68b5398 Add APFloat and MLIR type support for fp8 (e5m2).
This is a first step towards high level representation for fp8 types
that have been built in to hardware with near term roadmaps. Like the
BFLOAT16 type, the family of fp8 types are inspired by IEEE-754 binary
floating point formats but, due to the size limits, have been tweaked in
various ways in order to maximally use the range/precision in various
scenarios. The list of variants is small/finite and bounded by real
hardware.

This patch introduces the E5M2 FP8 format as proposed by Nvidia, ARM,
and Intel in the paper: https://arxiv.org/pdf/2209.05433.pdf

As the more conformant of the two implemented datatypes, we are plumbing
it through LLVM's APFloat type and MLIR's type system first as a
template. It will be followed by the range optimized E4M3 FP8 format
described in the paper. Since that format deviates further from the
IEEE-754 norms, it may require more debate and implementation
complexity.

Given that we see two parts of the FP8 implementation space represented
by these cases, we are recommending naming of:

* `F8M<N>` : For FP8 types that can be conceived of as following the
  same rules as FP16 but with a smaller number of mantissa/exponent
  bits. Including the number of mantissa bits in the type name is enough
  to fully specify the type. This naming scheme is used to represent
  the E5M2 type described in the paper.
* `F8M<N>F` : For FP8 types such as E4M3 which only support finite
  values.

The first of these (this patch) seems fairly non-controversial. The
second is previewed here to illustrate options for extending to the
other known variant (but can be discussed in detail in the patch
which implements it).

Many conversations about these types focus on the Machine-Learning
ecosystem where they are used to represent mixed-datatype computations
at a high level. At that level (which is why we also expose them in
MLIR), it is important to retain the actual type definition so that when
lowering to actual kernels or target specific code, the correct
promotions, casts and rescalings can be done as needed. We expect that
most LLVM backends will only experience these types as opaque `I8`
values that are applicable to some instructions.

MLIR does not make it particularly easy to add new floating point types
(i.e. the FloatType hierarchy is not open). Given the need to fully
model FloatTypes and make them interop with tooling, such types will
always be "heavy-weight" and it is not expected that a highly open type
system will be particularly helpful. There are also a bounded number of
floating point types in use for current and upcoming hardware, and we
can just implement them like this (perhaps looking for some cosmetic
ways to reduce the number of places that need to change). Creating a
more generic mechanism for extending floating point types seems like it
wouldn't be worth it and we should just deal with defining them one by
one on an as-needed basis when real hardware implements a new scheme.
Hopefully, with some additional production use and complete software
stacks, hardware makers will converge on a set of such types that is not
terribly divergent at the level that the compiler cares about.

(I cleaned up some old formatting and sorted some items for this case:
If we converge on landing this in some form, I will NFC commit format
only changes as a separate commit)

Differential Revision: https://reviews.llvm.org/D133823
2022-10-02 17:17:08 -07:00
Kazu Hirata 66bb0ac251 [ADT] Use std::common_type_t (NFC) 2022-10-01 17:24:52 -07:00
Arthur Eubanks 5df4ab55f9 [llvm] Migrate PAEval to new pass manager 2022-10-01 16:41:58 -07:00
Jessica Paquette 8aedb435db [GlobalISel] Combine abs(undef) -> 0
SDAG does this, GISel doesn't.

See https://gcc.godbolt.org/z/sqjMx3Tfv

More context:
https://github.com/llvm/llvm-project/issues/57256

Differential Revision: https://reviews.llvm.org/D135021
2022-10-01 15:16:32 -07:00
Jessica Paquette 24553df57d [GlobalISel] Combine `undef / X -> 0` and `undef % X -> 0`
This fixes the `urem_undef_lhs` case in the following:

https://gcc.godbolt.org/z/Wo9x7o679

Also see https://github.com/llvm/llvm-project/issues/57256 for more related
bugs.

This is equivalent to the undef bits in `simplifyDivRem` in the DAGCombiner.

Differential Revision: https://reviews.llvm.org/D135020
2022-10-01 13:48:55 -07:00
Florian Hahn 7c0ff64b0f
[LAA] Change to function analysis for new PM.
At the moment, LoopAccessAnalysis is a loop analysis for the new pass
manager. The issue with that is that LAI caches SCEV expressions and
modifications in a loop may impact SCEV expressions in other loops, but
we do not have a convenient way to invalidate LAI for other loops
withing a loop pipeline.

To avoid this issue, turn it into a function analysis which returns a
manager object that keeps track of the individual LAI objects per loop.

Fixes #50940.

Fixes #51669.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134606
2022-10-01 15:44:27 +01:00
Teresa Johnson 43417d8159 [MemProf] Update metadata during inlining
Update both memprof and callsite metadata to reflect inlined functions.

For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.

For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original).

Depends on D128142.

Reviewed By: snehasish

Differential Revision: https://reviews.llvm.org/D128143
2022-09-30 19:21:15 -07:00
Yeting Kuo cefb7aab61 [VP][RISCV] Add vp.copysign and RISC-V support.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134935
2022-10-01 10:19:10 +08:00
Teresa Johnson 4d243348fb Revert "[MemProf] Update metadata during inlining" and preceeding commit
This reverts commit 0d7f3464ce and
commit f9403ca41e. The latter was
"Profile matching and IR annotation for memprof profiles." and was left
from a bad rebase from a commit already pushed upstream.
2022-09-30 17:01:30 -07:00
Teresa Johnson 0d7f3464ce [MemProf] Update metadata during inlining
Update both memprof and callsite metadata to reflect inlined functions.

For callsite metadata this is simply a concatenation of each cloned
call's call stack with that of the inlined callsite's.

For memprof metadata, each profiled memory info block (MIB) is either
moved to the cloned allocation call or left on the original allocation
call depending on whether its context matches the newly refined call
stack context on the cloned call. We also reapply context trimming
optimizations based on the refined set of contexts on each of the calls
(cloned and original), via utilities in MemoryProfileInfo.

Depends on D128142.

Differential Revision: https://reviews.llvm.org/D128143
2022-09-30 16:46:17 -07:00
Michael Maitland 19f8176eb6 [TableGen] Add div bang operator
This patch adds the div bang operator which performs division.

Differential Revision: https://reviews.llvm.org/D134001
2022-09-30 12:08:28 -07:00
Nikita Popov 57f7f0d6cf [AST] Use BatchAA in aliasesPointer() (NFC) 2022-09-30 16:22:29 +02:00
Pierre van Houtryve 653beae5a1 [AMDGPU][GISel] Add Identity BUILD_VECTOR Combines
Folds-away BUILD_VECTOR-related noops in the post-legalizer combiner.

Depends on D134433

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134953
2022-09-30 14:07:13 +00:00
Lang Hames 565f4eb8e6 [JITLink] Update external symbol scopes to reflect scopes of resolved defs.
This is a counterpart to ffe2dda29f, and does for scope what that commit did
for linkage.

Making the scope of external definitions visible to JITLink plugins will
allow us to distinguish hidden weak defs (which do not need to be tracked by
default) from default-scoped weak defs (which need to be updated to point at
a single chosen definition at runtime).
2022-09-29 20:32:46 -07:00
Yeting Kuo 1cc02b05b7 [SelectionDAG] Add helper function to check whether a SDValue is neutral element. NFC.
Using this helper makes work about neutral elements more easier. Although I only
find one case now, I think it will have more chance to be used since so many
combine works are related to neutral elements.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D133866
2022-09-30 11:29:11 +08:00
Ben Dunbobbin 7eee2a2d44 [IR] Don't allow DLL storage-class and local linkage
Disallow this meaningless combination. Doing so simplifies analysis
of LLVM code w.r.t t DLL storage-class, and prevents mistakes with
DLL storage class.

- Change the assembler to reject DLL storage class on symbols with
  local linkage.
- Change the bitcode reader to clear the DLL Storage class when the
  linkage is local for auto-upgrading
- Update LangRef.

There is an existing restriction on non-default visibility and local
linkage which this is modelled on.

Differential Review: https://reviews.llvm.org/D134784
2022-09-30 00:26:01 +01:00
Adrian Prantl c1a578b09f add missing textual header to modulemap 2022-09-29 15:56:31 -07:00
Florian Hahn 9933a2e9fd
[SCEVExpander] Move LCSSA fixup to ::expand.
Move LCSSA fixup from ::expandCodeForImpl to ::expand(). This has
the advantage that we directly preserve LCSSA nodes here instead of
relying on doing so in rememberInstruction. It also ensures that we
 don't add the non-LCSSA-safe value to InsertedExpressions.

Alternative to D132704.

Fixes #57000.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D134739
2022-09-29 20:49:56 +01:00
Chris Bieneman 49dc58f551 [DX] [ObjectYAML] Support DX shader feature flags
DXContainers contain a feature flag part, which stores a bitfield used
to denote what underlying hardware features the shader requires. This
change adds feature flags to the DXContainer YAML tooling to enable
testing generating feature flags during HLSL code generation.

Depends on D133980

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D134315
2022-09-29 12:37:11 -05:00
Chris Bieneman 63accaf46f [NFC] Refactor DXContainer to support more parts
This patch refactors some of the DXContainer Object and YAML code to
make it easier to add more part parsing.

DXContainer has a whole bunch of constant values, so I've added a
DXContainerConstants.def file which will grow with constant
definitions, but starts with just part identifiers. I've also added a
utility to parse the part magic string into an enum, and converted the
code to use that utility and the enum instead of the part literal
string.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D133980
2022-09-29 11:59:52 -05:00
Serge Pavlov b934be2c05 [Support] Class for response file expansion (NFC)
Functions that implement expansion of response and config files depend
on many options, which are passes as arguments. Extending the expansion
requires new options, it in turn causes changing calls in various places
making them even more bulky.

This change introduces a class ExpansionContext, which represents set of
options that control the expansion. Its methods implements expansion of
responce files including config files. It makes extending the expansion
easier.

No functional changes.

Differential Revision: https://reviews.llvm.org/D132379
2022-09-29 19:15:01 +07:00
Fangrui Song 04a65d62a0 Revert D134638 "[Clang][LoongArch] Add inline asm support for constraints k/m/ZB/ZC"
This reverts commit b7baddc755.

Broke CodeGen/X86/callbr-asm-kill.mir
We shall pay attention when adding new constraints.
2022-09-29 00:54:56 -07:00
Weining Lu b7baddc755 [Clang][LoongArch] Add inline asm support for constraints k/m/ZB/ZC
k: A memory operand whose address is formed by a base register and
(optionally scaled) index register.

m: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as st.w and ld.w.

ZB: An address that is held in a general-purpose register. The offset
is zero.

ZC: A memory operand whose address is formed by a base register and
offset that is suitable for use in instructions with the same
addressing mode as ll.w and sc.w.

Differential Revision: https://reviews.llvm.org/D134638
2022-09-29 15:02:08 +08:00
Carlos Alberto Enciso b55e76de8b [ADT] IntervalTree - Fix random unittests failures in a debug builds.
On a debug build with _LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY
enabled from 100 executions around 80 are failing.

More details in https://reviews.llvm.org/D125776#3820399

The issue is related to the use of std::sort.

Reviewed By: antondaubert, jryans, probinson

Differential Revision: https://reviews.llvm.org/D134805
2022-09-29 05:17:36 +01:00
Jessica Paquette 1eb49bbab6 [GlobalISel][CallLowering] Use hasRetAttr for return flags on CallBases
Given something like this:

```
declare signext i16 @signext_callee()
define i32 @caller() {
  %res = call i16 @signext_callee()
  ...
}
```

CallLowering would miss that signext_callee's return value is sign extended,
because it isn't on the call.

Use hasRetAttr on the CallBase to allow us to catch this.

(This now inserts G_ASSERT_SEXT/G_ASSERT_ZEXT like in the original review.)

Differential Revision: https://reviews.llvm.org/D86228
2022-09-28 19:38:24 -07:00
Florian Mayer 0401dc2913 [MTE] [HWASan] unify isInterestingAlloca
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D134779
2022-09-28 15:52:34 -07:00
Jessica Paquette 704b2e162c [GlobalISel] Add isConstFalseVal helper to Utils
Add a utility function which returns true if the given value is a constant
false value.

This is necessary to port one of the compare simplifications in
TargetLowering::SimplifySetCC.

Differential Revision: https://reviews.llvm.org/D91754
2022-09-28 15:44:26 -07:00
Daniel Thornburgh e61d89efd7 [NFC] [Object] Create library to fetch debug info by build ID.
This creates a library for fetching debug info by build ID, whether
locally or remotely via debuginfod. The functionality was refactored
out of existing code in the Symboliize library. Existing utilities
were refactored to use this library.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D132504
2022-09-28 13:35:35 -07:00
serge-sans-paille 16544cbe64 [iwyu] Move <cmath> out of llvm/Support/MathExtras.h
Interestingly, MathExtras.h doesn't use <cmath> declaration, so move it out of
that header and include it when needed.

No functional change intended, but there's no longer a transitive include
fromMathExtras.h to cmath.
2022-09-28 20:49:01 +02:00
serge-sans-paille 99c7f83b99 [iwyu] Move <iostream> out of llvm/DebugInfo/Symbolize/Markup.h header
It's only used in the implementation. No functional change intended.
2022-09-28 20:49:00 +02:00
Qiongsi Wu ef399d1cfc [LTO][AIX] Invoking AIX System Assembler in LTO CodeGen
This patch teaches LTOCodeGenerator to call into the AIX system assembler to generate object files. This is in contrast to the approach taken on other platforms, where the LTOCodeGenerate calls the integrated assembler to generate object files.  We need to rely on the system assembler because the integrated assembler is incomplete at the moment.

Reviewed By: w2yehia, MaskRay

Differential Revision: https://reviews.llvm.org/D134375
2022-09-28 14:26:50 -04:00
Sameer Sahasrabuddhe 3f078b308b [AAPointerInfo] OffsetInfo: Unassigned is distinct from Unknown
A User like the PHINode may be visited multiple times for the same pointer along
different def-use edges. The uninitialized state of OffsetInfo at the first
visit needs to be distinct from the Unknown value that may be assigned after
processing the PHINode. Without that, a PHINode with all inputs Unknown is never
followed to its uses. This results in incorrect optimization because some
interfering accessess are missed.

Differential Revision: https://reviews.llvm.org/D134704
2022-09-28 20:31:36 +05:30
Archibald Elliott dcc0048279 [AArch64] Correct v9.x-a Features
A change to D109517 during review stated it was disabling the crypto
extensions by default in armv9a, but it also ended up removing two other
non-crypto features: i8mm and bf16. This error was also made in D116158.

This patch re-adds those two extensions to the feature bitmaps for the
affected armv9a versions in the target parser.

Differential Revision: https://reviews.llvm.org/D134647
2022-09-28 14:52:44 +01:00
Florian Hahn ed47bc8b58
[SCEVExpander] Remove dead Root argument from expandCodeForImpl (NFC).
The argument is unused and can be removed.
2022-09-28 12:08:36 +01:00
Alvin Wong bf0cda9ed2 [lldb][COFF] Rewrite ParseSymtab to list both export and symbol tables
This reimplements `ObjectFilePECOFF::ParseSymtab` to replace the manual
data extraction with what `COFFObjectFile` already provides. Also use
`SymTab::AddSymbol` instead of resizing the SymTab then assigning each
elements afterwards.

Previously, ParseSymTab loads symbols from both the COFF symbol table
and the export table, but if there are any entries in the export table,
it overwrites all the symbols already loaded from the COFF symbol table.
Due to the change to use AddSymbols, this no longer happens, and so the
SymTab now contains all symbols from both tables as expected.

The export symbols are now ordered by ordinal, instead of by the name
table order.

In its current state, it is possible for symbols in the COFF symbol
table to be duplicated by those in the export table. This behaviour will
be modified in a separate change.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D134196
2022-09-28 12:57:10 +03:00
River Riddle 50d96f59d0 [TableGen] Track reference locations of Records/RecordVals
This is extremely useful for language tooling as it allows
for providing go-to-def/find-references/etc. for many
more situations than what is currently possible.

Differential Revision: https://reviews.llvm.org/D134087
2022-09-27 23:48:16 -07:00
Serge Pavlov 5ddde5f80a Revert "[Support] Class for response file expansion (NFC)"
This reverts commit 6e491c48d6.
There are missed changes in flang.
2022-09-28 13:33:28 +07:00
Serge Pavlov 6e491c48d6 [Support] Class for response file expansion (NFC)
Functions that implement expansion of response and config files depend
on many options, which are passes as arguments. Extending the expansion
requires new options, it in turn causes changing calls in various places
making them even more bulky.

This change introduces a class ExpansionContext, which represents set of
options that control the expansion. Its methods implements expansion of
responce files including config files. It makes extending the expansion
easier.

No functional changes.

Differential Revision: https://reviews.llvm.org/D132379
2022-09-28 11:47:59 +07:00
eopXD 9677d70eb2 [VP][RISCV] Add vp.floor, vp.round, vp.roundeven and their RISC-V support
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134759
2022-09-27 19:45:58 -07:00
Philip Reames f6d110e26f [LAA] Make getPtrStride return Option instead of overloading zero as error value [nfc]
This is purely NFC restructure in advance of a change which actually exposes zero strides.  This is mostly because I find this interface confusing each time I look at it.
2022-09-27 15:55:44 -07:00
Martin Sebor e80e134c77 [InstCombine] Add support for stpncpy folding
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D130922
2022-09-27 14:44:33 -06:00
eopXD 163cb33854 [VP][RISCV] Add vp.ceil and RISC-V support
Previous commit 8b00b24f85 missed to add `int_ceil` anchor for the
llvm.ceil.* section under LangRef.rst

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134586
2022-09-27 12:04:09 -07:00
eopXD 384b8b3da7 Revert "[VP][RISCV] Add vp.ceil and RISC-V support"
This reverts commit 8b00b24f85.
2022-09-27 11:12:57 -07:00
eopXD 8b00b24f85 [VP][RISCV] Add vp.ceil and RISC-V support
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134586
2022-09-27 11:08:27 -07:00
Lang Hames ffe2dda29f [ORC][JITLink] Retain Weak flags in JITDylib interfaces, propagate to LinkGraph.
Previously we stripped Weak flags from JITDylib symbol table entries once they
were resolved (there was no particularly good reason for this). Now we want to
retain them and query them when setting the Linkage on external symbols in
LinkGraphs during symbol resolution (this was the motivation for 75404e9ef8).
Making weak linkage of external definitions discoverable in the LinkGraph will
in turn allow future plugins to implement correct handling for them (by
recording locations that depend on exported weak definitions and pointing all
of these at one chosen definition at runtime).
2022-09-27 10:04:59 -07:00
Lang Hames 57f6fd0a34 [JITLink] Fix typo in debugging output. 2022-09-27 10:04:59 -07:00
Craig Topper a6383bb51c [VP][RISCV] Add vp.fmuladd.
Expanded in SelectionDAGBuilder similar to llvm.fmuladd.

Reviewed By: frasercrmck, simoll

Differential Revision: https://reviews.llvm.org/D134474
2022-09-27 10:02:37 -07:00
Simon Pilgrim 404dc7af5d [TTI] getExtractWithExtendCost - remove default Index = -1 value
We currently always specify the index (even if we set it to -1), and a future patch will need to adjust this method to make it more compatible with getScalarizationOverhead
2022-09-27 15:44:29 +01:00
Daniel Kiss 712de9d171 [AArch64] Add all predecessor archs in target info
A given function is compatible with all previous arch versions.
To avoid compering values of the attribute this logic adds all predecessor
architecture values.

Reviewed By: dmgreen, DavidSpickett

Differential Revision: https://reviews.llvm.org/D134353
2022-09-27 10:23:21 +02:00
David Sherwood fbb119412f [AArch64] Add Neoverse V2 CPU support
Adds support for the Neoverse V2 CPU to the AArch64 backend.

Differential Revision: https://reviews.llvm.org/D134352
2022-09-27 07:56:08 +00:00