Commit Graph

49278 Commits

Author SHA1 Message Date
Jay Foad 3822a01e0b [AMDGPU] Add GFX11 ds_bvh_stack_rtn_b32 instruction
Differential Revision: https://reviews.llvm.org/D133928
2022-09-15 16:46:14 +01:00
Sander de Smalen 45d28779c5 [AArch64][SME] Fix lowering of llvm.aarch64.get.pstatesm()
A thread may not have access to SME or TPIDR2_EL0, so in order to
safely query PSTATE.SM in a streaming-compatible function, the
code should call `__arm_sme_state()`, as described in the ABI:

  c2bb09c4d4

This means that the value of pstate.sm is:
* 0 if the function is non-streaming.
* 1 if the function has `arm_streaming` or `arm_locally_streaming`.
* evaluated at runtime by a call to __arm_sme_state() otherwise.

This patch also adds a calling convention for calls to SME support routines.

At some point we can remove the need for the llvm.aarch64.get.pstatesm() intrinsic
and use function calls (with the corresponding cc) directly instead.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D131571
2022-09-15 15:14:13 +00:00
Matt Arsenault 63d1d37d35 RegAllocGreedy: Avoid overflowing priority bitfields
The class priority is expected to be at most 5 bits before it starts
clobbering bits used for other fields. Also clamp the instruction
distance in case we have millions of instructions.

AMDGPU was accidentally overflowing into the global priority bit in
some cases. I think in principal we would have wanted this, but in the
cases I've looked at, it had the counter intuitive effect and
de-prioritized the large register tuple.

Avoid using weird bit hack PPC uses for global priority. The
AllocationPriority field is really 5 bits, and PPC was relying on
overflowing this to 6-bits to forcibly set the global priority
bit. Split this out as a separate flag to avoid having magic behavior
for values above 31.
2022-09-15 10:38:40 -04:00
Alexey Lapshin adabfb5e32 [DWARFLinker][NFC] Set the target DWARF version explicitly.
Currently, DWARFLinker determines the target DWARF version internally.
It examines incoming object files, detects maximal
DWARF version and uses that version for the output file.
This patch allows explicitly setting output DWARF version by the consumer
of DWARFLinker. So that DWARFLinker uses a specified version instead
of autodetected one. It allows consumers to use different logic for
setting the target DWARF version. f.e. instead of the maximally used version
someone could set a higher version to convert from DWARFv4 to DWARFv5
(This possibility is not supported yet, but it would be good if
the interface will support it). Or another variant is to set the target
version through the command line. In this patch, the autodetection is moved
into the consumers(DwarfLinkerForBinary.cpp, DebugInfoLinker.cpp).

Differential Revision: https://reviews.llvm.org/D132755
2022-09-15 16:06:10 +03:00
Dhruva Chakrabarti 839ac62c50 Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 7539e9cf81.
2022-09-15 03:08:46 +00:00
Vitaly Buka 72b776168c [IRBuilder] Add CreateMaskedExpandLoad and CreateMaskedCompressStore 2022-09-14 19:18:52 -07:00
Giorgis Georgakoudis 7539e9cf81 [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3)  forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert, jhuber6, ABataev

Differential Revision: https://reviews.llvm.org/D102107
2022-09-15 00:54:05 +00:00
Sam Clegg 8273ca1421 [MC] Fix typo in getSectionAddressSize comment. NFC
The comment was refering to a now non-existant function that was removed
in 93e3cf0ebd.

Differential Revision: https://reviews.llvm.org/D133098
2022-09-14 15:15:41 -07:00
Craig Topper 50a699e362 [IR][VP] Remove IntrArgMemOnly from vp.gather/scatter.
IntrArgMemOnly is only valid for intrinsics that use a scalar
pointer argument. These intrinsics use a vector of pointer.

Alias analysis will try to find a scalar pointer argument and
will return incorrect alias results when it doesn't find one.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133898
2022-09-14 15:00:07 -07:00
Arthur Eubanks ccc9107ad6 [OptBisect] Add flag to print IR when opt-bisect kicks in
-opt-bisect-print-ir-path=foo will dump the IR to foo when opt-bisect-limit starts skipping passes.

Currently we don't print the IR if the opt-bisect-limit is higher than the total number of times opt-bisect is called.

This makes getting the IR right before a bad transform easier.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D133809
2022-09-14 13:48:03 -07:00
Fangrui Song 25394c9d10 [llvm-objdump] Change printSymbolVersionDependency to use ELFFile API
When .gnu.version_r is empty (allowed by readelf but warned by objdump),
llvm-objdump -p may decode the next section as .gnu.version_r and may crash due
to out-of-bounds C string reference. ELFFile<ELFT>::getVersionDependencies
handles 0-entry .gnu.version_r gracefully. Just use it.

Fix https://github.com/llvm/llvm-project/issues/57707

Differential Revision: https://reviews.llvm.org/D133751
2022-09-14 12:30:34 -07:00
Nikita Popov b1cd393f9e [AA] Tracking per-location ModRef info in FunctionModRefBehavior (NFCI)
Currently, FunctionModRefBehavior tracks whether the function reads
or writes memory (ModRefInfo) and which locations it can access
(argmem, inaccessiblemem and other). This patch changes it to track
ModRef information per-location instead.

To give two examples of why this is useful:

* D117095 highlights a weakness of ModRef modelling in the presence
  of operand bundles. For a memcpy call with deopt operand bundle,
  we want to say that it can read any memory, but only write argument
  memory. This would allow them to be treated like any other calls.
  However, we currently can't express this and have to say that it
  can read or write any memory.
* D127383 would ideally be modelled as a separate threadid location,
  where threadid Refs outside pre-split coroutines can be ignored
  (like other accesses to constant memory). The current representation
  does not allow modelling this precisely.

The patch as implemented is intended to be NFC, but there are some
obvious opportunities for improvements and simplification. To fully
capitalize on this we would also want to change the way we represent
memory attributes on functions, but that's a larger change, and I
think it makes sense to separate out the FunctionModRefBehavior
refactoring.

Differential Revision: https://reviews.llvm.org/D130896
2022-09-14 16:34:41 +02:00
Martin Storsjö e280940bfb [Support] Access threadIndex via a wrapper function
On Unix platforms, this wrapper function is inline, so it should
expand to the same direct access to the thread local variable. On
Windows, it's a non-inline function within Parallel.cpp, allowing
making the thread_local variable static.

Windows Native TLS doesn't support direct access to thread local
variables in a different DLL, and GCC/binutils on Windows occasionally
has problems with non-static thread local variables too.

This fixes mingw dylib builds with native TLS after
e6aebff674.

At the same time, move the whole thread local variable within
    #if LLVM_ENABLE_THREADS
to fix builds without threading support.

Differential Revision: https://reviews.llvm.org/D133759
2022-09-14 09:19:27 +03:00
Pengxuan Zheng ecb5ea6a26 [Object][COFF] Allow section symbol to be common symbol
I ran into an lld-link error due to a symbol named ".idata$4" coming from some
static library:
  .idata$4 should not refer to special section 0.

Here is the symbol table entry for .idata$4:

  Symbol {
      Name: .idata$4
      Value: 3221225536
      Section: IMAGE_SYM_UNDEFINED (0)
      BaseType: Null (0x0)
      ComplexType: Null (0x0)
      StorageClass: Section (0x68)
      AuxSymbolCount: 0
  }

The symbol .idata$4 is a section symbol (IMAGE_SYM_CLASS_SECTION) and LLD
currently handles it as a regular defined symbol since isCommon() returns false
for this symbol. This results in the error ".idata$4 should not refer to special
section 0" because lld-link asserts that regular defined symbols should not
refer to section 0.

Should this symbol be handled as a common symbol instead? LLVM currently only
allows external symbols (IMAGE_SYM_CLASS_EXTERNAL) to be common
symbols. However, the PE/COFF spec (see section "Section Number Values") does
not seem to mention this restriction. Any thoughts?

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D133627
2022-09-13 18:07:02 -07:00
YongKang Zhu 5fa6b24354 Address feedback in https://reviews.llvm.org/D133637
https://reviews.llvm.org/D133637 fixes the problem where we should hash raw content of
register mask instead of the pointer to it.

Fix the same issue in `llvm::hash_value()`.

Remove the added API `MachineOperand::getRegMaskSize()` to avoid potential confusion.

Add an assert to emphasize that we probably should hash a machine operand iff it has
associated machine function, but keep the fallback logic in the original change.

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D133747
2022-09-13 16:12:41 -07:00
Chris Bieneman 4b96f8996a [DX] DXContainer does not support COMDAT
The DXContainer is pretty primitive, but doesn't support COMDAT. We need
to set that in the Triple so that Clang won't try to emit COMDATs.
2022-09-13 13:59:47 -05:00
theidexisted 0a1c8522f3 [NFC][ADT] Fix assert message
Reviewed By: gchatelet

Differential Revision: https://reviews.llvm.org/D129632
2022-09-13 18:55:57 +00:00
Hendrik Greving 393a17b5d1 [ValueTypes] Define MVTs for v256i2/v128i4.
Adds MVT::v256i2, MVT::v128i4.

Differential Revision: https://reviews.llvm.org/D133603
2022-09-13 09:02:23 -07:00
Pavel Samolysov 02aaf8e3d6 [NFC][ScheduleDAGInstrs] Use structure bindings and emplace_back
Some uses of std::make_pair and the std::pair's first/second members
in the ScheduleDAGInstrs.[cpp|h] files were replaced with using of the
vector's emplace_back along with structure bindings from C++17.
2022-09-13 12:49:04 +03:00
Sylvestre Ledru cd20a18286 Revert "[clang, llvm] Add __declspec(safebuffers), support it in CodeView"
Causing:
https://github.com/llvm/llvm-project/issues/57709

This reverts commit ab56719acd.
2022-09-13 10:53:59 +02:00
Max Kazantsev 86d5586d78 [SCEVExpander] Recompute poison-generating flags on hoisting. PR57187
Instruction being hoisted could have nuw/nsw flags inferred from the old
context, and we cannot simply move it to the new location keeping them
because we are going to introduce new uses to them that didn't exist before.

Example in https://github.com/llvm/llvm-project/issues/57187 shows how
this can produce branch by poison from initially well-defined program.

This patch forcefully recomputes poison-generating flag in the new context.

Differential Revision: https://reviews.llvm.org/D132022
Reviewed By: fhahn, nikic
2022-09-13 12:56:35 +07:00
Aiden Grossman eec183c171 [nfc] Refactor SlotIndex::getInstrDistance to better reflect actual functionality
This patch refactors SlotIndex::getInstrDistance to
SlotIndex::getApproxInstrDistance to better describe the actual
functionality of this function. This patch also adds in some additional
comments better documenting the assumptions that this function makes to
increase clarity.

Based on discussion on the LLVM Discourse:
https://discourse.llvm.org/t/odd-behavior-in-slotindex-getinstrdistance/64934/5

Reviewed By: mtrofin, foad

Differential Revision: https://reviews.llvm.org/D133386
2022-09-12 23:33:35 +00:00
Amara Emerson 25bcc8c797 [GlobalISel][Legalizer] Fix minScalarEltSameAsIf to handle p0 element types.
The mutation the action generates tries to change the input type into the
element type of larger vector type. This doesn't work if the larger element
type is a vector of pointers since it creates an illegal mutation between
scalar and pointer types.

Differential Revision: https://reviews.llvm.org/D133671
2022-09-13 00:01:37 +01:00
David Majnemer ab56719acd [clang, llvm] Add __declspec(safebuffers), support it in CodeView
__declspec(safebuffers) is equivalent to
__attribute__((no_stack_protector)).  This information is recorded in
CodeView.

While we are here, add support for strict_gs_check.
2022-09-12 21:15:34 +00:00
Kazu Hirata 9606608474 [llvm] Use x.empty() instead of llvm::empty(x) (NFC)
I'm planning to deprecate and eventually remove llvm::empty.

I thought about replacing llvm::empty(x) with std::empty(x), but it
turns out that all uses can be converted to x.empty().  That is, no
use requires the ability of std::empty to accept C arrays and
std::initializer_list.

Differential Revision: https://reviews.llvm.org/D133677
2022-09-12 13:34:35 -07:00
YongKang Zhu 481a32f587 Bug fix on stable hash calculation for machine operands RegisterMask and RegisterLiveOut
MachineOperand::getRegMask() returns a pointer to register mask.  We should hash the raw content of register mask instead of its pointer.

Reviewed By: kyulee

Differential Revision: https://reviews.llvm.org/D133637
2022-09-12 13:25:04 -07:00
Fangrui Song e6aebff674 [ELF] Parallelize relocation scanning
* Change `Symbol::flags` to a `std::atomic<uint16_t>`
* Add `llvm::parallel::threadIndex` as a thread-local non-negative integer
* Add `relocsVec` to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex
* Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output.

MIPS and PPC64 use global states for relocation scanning. Keep serial scanning.

Speed-up with mimalloc and --threads=8 on an Intel Skylake machine:

* clang (Release): 1.27x as fast
* clang (Debug): 1.06x as fast
* chrome (default): 1.05x as fast
* scylladb (default): 1.04x as fast

Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64):

* clang (Release): 1.31x as fast
* scylladb (default): 1.06x as fast

Reviewed By: andrewng

Differential Revision: https://reviews.llvm.org/D133003
2022-09-12 12:56:35 -07:00
Craig Topper 38ffa2bb96 [LegalizeTypes] Improve splitting for urem/udiv by constant for some constants.
For remainder:
If (1 << (Bitwidth / 2)) % Divisor == 1, we can add the high and low halves
together and use a (Bitwidth / 2) urem. If (BitWidth /2) is a legal integer
type, this urem will be expand by DAGCombiner using multiply by magic
constant. We do have to take into account that adding high and low
together can produce a carry, making it a (BitWidth / 2)+1 bit number.
So we need to also add back in the carry from the first addition.

For division:
We can use the above trick to compute the remainder, subtract that
remainder from the dividend, then multiply by the multiplicative
inverse of the Divisor modulo (1 << BitWidth).

This is based on the section "Remainder by Summing Digits" in
Hacker's delight.

The remainder trick is similar to a trick you may have learned for
determining if a decimal number is divisible by 3. You can add all the
digits together and see if the sum is divisible by 3. If you're not sure
if the sum is divisible by 3, you can add its digits together. This
can be repeated until you have a single decimal digit. If that digit
is 3, 6, or 9, then the original number is divisible by 3. This works
because 10 % 3 == 1.

gcc already does this same trick. There are additional tricks gcc
does urem as well as srem, udiv, and sdiv that I plan to add in
future patches.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130862
2022-09-12 10:34:52 -07:00
Matthias Gehre c1502425ba Move TargetTransformInfo::maxLegalDivRemBitWidth -> TargetLowering::maxSupportedDivRemBitWidth
Also remove new-pass-manager version of ExpandLargeDivRem because there is no way
yet to access TargetLowering in the new pass manager.

Differential Revision: https://reviews.llvm.org/D133691
2022-09-12 17:06:16 +01:00
Jay Foad 210e6a993d [GlobalISel] Simplify extended add/sub to add/sub with carry
Simplify extended add/sub (with carry-in and carry-out) to add/sub with
carry (with carry-out only) if carry-in is known to be zero.

Differential Revision: https://reviews.llvm.org/D133702
2022-09-12 17:05:44 +01:00
Alexey Bataev dfe1e9dd79 [SLP]Improve reordering of clustered reused scalars.
If the reused scalars are clustered, i.e. each part of the reused mask
contains all elements of the original scalars exactly once, we can
reorder those clusters to improve the whole ordering of of the clustered
vectors.

Differential Revision: https://reviews.llvm.org/D133524
2022-09-12 06:52:25 -07:00
Matt Arsenault 7834194837 TableGen: Introduce generated getSubRegisterClass function
Currently there isn't a generic way to get a smaller register class
that can be produced from a subregister of a larger class. Replaces a
manually implemented version for AMDGPU. This will be used to improve
subregister support in the allocator.
2022-09-12 09:03:37 -04:00
Matt Arsenault bb70b5d406 CodeGen: Set MODereferenceable from isDereferenceableAndAlignedPointer
Previously this was assuming piontsToConstantMemory implies
dereferenceable.
2022-09-12 08:38:35 -04:00
David Spickett 739b69e655 [LLVM][AArch64] Explain that X19 is used as the frame base pointer register
Fixes #50098

LLVM uses X19 as the frame base pointer, if it needs to. Meaning you
can get warnings if you clobber that with inline asm.

However, it doesn't explain why. The frame base register is not part
of the ABI so it's pretty confusing why you get that warning out of the blue.

This adds a method to explain a reserved register with X19 as the first one.
The logic is the same as getReservedRegs.

I could have added a return parameter to isASMClobberable and friends
but found that there's a lot of things that call isReservedReg in various
ways.

So while one more method on the pile isn't great design, it is simpler
right now to do it this way and only pay the cost if you are actually using
a reserved register.

Reviewed By: lenary

Differential Revision: https://reviews.llvm.org/D133213
2022-09-12 09:18:09 +00:00
Johannes Doerfert c922cac868 Revert "[Attributor] AAPointerInfo should allow "harmless" uses"
Revert "[Attributor] Teach AAPointerInfo to look into aggregates"

This reverts commit 844f6c5d03 and
4ed0a88cd8 as they broke the buildbots
that run openmp/libomptarget/test/offloading/bug49021.cpp.
2022-09-11 21:37:54 -07:00
Johannes Doerfert 4ed0a88cd8 [Attributor] Teach AAPointerInfo to look into aggregates
If we have a constant aggregate, e.g., as an initializer, we usually
failed to extract the proper value/type from it. This patch provides the
size and offset information necessary to extract the right part of the
constant.
2022-09-11 20:16:11 -07:00
Johannes Doerfert 21711039e3 [OpenMP] Allow the Attributor to look at functions we also internalized
This is important as we have accesses to globals in those which we need to
categorize.
2022-09-11 20:16:11 -07:00
Kazu Hirata a21c8be1cc [llvm] Use std::aligned_storage_t (NFC) 2022-09-11 16:11:39 -07:00
Junduo Dong 6975ab7126 [Clang] Reimplement time tracing of NewPassManager by PassInstrumentation framework
The previous implementation of time tracing in NewPassManager is direct but messive.

The key codes are like the demo below:
```
  /// Runs the function pass across every function in the module.
  PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
                        LazyCallGraph &CG, CGSCCUpdateResult &UR) {
      /// ...
      PreservedAnalyses PassPA;
      {
        TimeTraceScope TimeScope(Pass.name());
        PassPA = Pass.run(F, FAM);
      }
      /// ...
 }
```

It can be bothered to judge where should we add the tracing codes by hands.

With the PassInstrumentation framework, we can easily add `Before/After` callback
functions to add time tracing codes.

Differential Revision: https://reviews.llvm.org/D131960
2022-09-11 05:42:55 -07:00
sunho d1c4d96126 [ORC][ORC_RT][COFF] Remove public bootstrap method.
Removes public bootstrap method that is not really necessary and not consistent with other platform API.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D132780
2022-09-10 15:25:50 +09:00
sunho 73c4033987 [ORC][ORC_RT][COFF] Support dynamic VC runtime.
Supports dynamic VC runtime. It implements atexits handling which is required to load msvcrt.lib successfully. (the object file containing atexit symbol somehow resolves to static vc runtim symbols) It also default to dynamic vc runtime which tends to be more robust.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D132525
2022-09-10 15:25:49 +09:00
Joe Loser 62b8a61d6c [llvm] Remove includes of `llvm/Support/STLArrayExtras.h`
`llvm` and downstream internal callers no longer use `array_lengthof`, so drop
the include everywhere.

Differential Revision: https://reviews.llvm.org/D133600
2022-09-09 17:44:00 -06:00
Fangrui Song 058f17d3af [ADT] Move LLVM_DEPRECATED before type after D133502
`[[deprecated(...)]]` cannot appear between `inline size_t`.
2022-09-09 15:56:58 -07:00
Joe Loser 5758c824da [ADT] Mark `llvm::array_lengthof` as deprecated
As a follow-up of 5e96cea1db, mark
`llvm::array_lengthof` as deprecated in favor of using `std::size` function
directly.

Differential Revision: https://reviews.llvm.org/D133502
2022-09-09 15:31:00 -06:00
Augie Fackler 4fea8ee540 OpenMP: mark allocptr attribute on __kmpc_free_shared
Differential Revision: https://reviews.llvm.org/D124491
2022-09-09 14:09:18 -04:00
Nikita Popov a9f312c7f4 [AST] Use BatchAA in aliasesUnknownInst() (NFCI) 2022-09-09 15:54:48 +02:00
Sebastian Neubauer c7750c522e Add helper func to get first non-alloca position
The LLVM performance tips suggest that allocas should be placed at the
beginning of the entry block. So far, llvm doesn’t provide any helper to
find that position.

Add BasicBlock::getFirstNonPHIOrDbgOrAlloca and IRBuilder::SetInsertPointPastAllocas(Function*)
that get an insert position after the (static) allocas at the start of a
function and use it in ShadowStackGCLowering.

Differential Revision: https://reviews.llvm.org/D132554
2022-09-09 15:39:53 +02:00
Namhyung Kim 43efb5e445 [llvm-objdump] Create name for fake sections
It doesn't have a section header string table so add a vector to have
the strings and create name based on the program header type and the
index.

Differential Revision: https://reviews.llvm.org/D131290
2022-09-09 12:27:07 +01:00
Serge Pavlov 7b9fae05b4 [Clang] Use virtual FS in processing config files
Clang has support of virtual file system for the purpose of testing, but
treatment of config files did not use it. This change enables VFS in it
as well.

Differential Revision: https://reviews.llvm.org/D132867
2022-09-09 18:24:45 +07:00
Serge Pavlov 55e1441f7b Revert "[Clang] Use virtual FS in processing config files"
This reverts commit 9424497e43.
Some buildbots failed, reverted for investigation.
2022-09-09 16:43:15 +07:00
Serge Pavlov 9424497e43 [Clang] Use virtual FS in processing config files
Clang has support of virtual file system for the purpose of testing, but
treatment of config files did not use it. This change enables VFS in it
as well.

Differential Revision: https://reviews.llvm.org/D132867
2022-09-09 16:28:51 +07:00
Fangrui Song 781dea021a [Support] Rename DebugCompressionType::Z to Zlib
"Z" was so named when we had both gABI ELFCOMPRESS_ZLIB and the legacy .zdebug support.
Now we have just one zlib format, we should use the more descriptive name.
2022-09-08 16:11:29 -07:00
raghavmedicherla 5d3cf8267f Revert "Support: Add mapped_file_region::sync(), equivalent to msync"
This reverts commit 142f51fc2f.

This shouldn't be committed, it got committed accidentally.
2022-09-08 12:49:52 -04:00
Thomas Lively ac3b8df8f2 [WebAssembly] Prototype `f32x4.relaxed_dot_bf16x8_add_f32`
As proposed in https://github.com/WebAssembly/relaxed-simd/issues/77. Only an
LLVM intrinsic and a clang builtin are implemented. Since there is no bfloat16
type, use u16 to represent the bfloats in the builtin function arguments.

Differential Revision: https://reviews.llvm.org/D133428
2022-09-08 08:07:49 -07:00
Joe Loser 5e96cea1db [llvm] Use std::size instead of llvm::array_lengthof
LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.

Change call sites to use `std::size` instead.

Differential Revision: https://reviews.llvm.org/D133429
2022-09-08 09:01:53 -06:00
Eric Wang d8a2d3f7d4 [NFC][Regalloc] Introduce the RegAllocPriorityAdvisorAnalysis
This patch introduces the priority analysis and the priority advisor,
the default implementation, and the scaffolding for introducing the
other implementations of the advisor.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D132835
2022-09-08 07:50:03 -07:00
David Spickett e428baf001 [LLVM][ARM] Remove options for armv2, 2A, 3 and 3M
Fixes #57486

These pre v4 architectures are not specifically supported
by codegen. As demonstrated in the linked issue.

GCC has not supported 3M since GCC 9 and presumably
2 and 2A earlier than that. So we are aligned in that sense.

(see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2abd6e34fcf3bd9f9ffafcaa47cdc3ed443f9add)

This removes the options and associated testing.

The Pre_v4 build attribute remains mainly because its absence
would be more confusing. It will not be used other than to
complete the list of build attributes as shown in the ABI.

https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#3352the-target-related-attributes

Reviewed By: nickdesaulniers, peter.smith, rengolin

Differential Revision: https://reviews.llvm.org/D133109
2022-09-08 09:49:48 +00:00
Nikita Popov 96cb7c2273 [ConstantExpr] Remove fneg expression
As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179,
this removes the fneg constant expression (which is, incidentally,
the only unary operator expression).

Differential Revision: https://reviews.llvm.org/D133418
2022-09-08 10:24:55 +02:00
Fangrui Song b6e1fd761d [llvm-objcopy] Support --{,de}compress-debug-sections for zstd
Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal:
https://groups.google.com/g/generic-abi/c/satyPkuMisk
("Add new ch_type value: ELFCOMPRESS_ZSTD")

Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")

Reviewed By: jhenderson, dblaikie

Differential Revision: https://reviews.llvm.org/D130458
2022-09-08 00:59:14 -07:00
Fangrui Song a41977dd0f [Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}
as high-level API on top of `llvm::compression::{zlib,zstd}::*`:

* getReasonIfUnsupported: return nullptr if the specified format is
  supported, or (if unsupported) a string like `LLVM was not built with LLVM_ENABLE_ZLIB ...`
* compress: dispatch to zlib::uncompress or zstd::uncompress
* decompress: dispatch to zlib::uncompress or zstd::uncompress

Move `llvm::DebugCompressionType` from MC to Support to avoid Support->MC cyclic
dependency. There are 40+ uses in llvm-project.

Add another enum class `llvm::compression::Format` to represent supported
compression formats, which may be a superset of ELF compression formats.

See D130458 (llvm-objcopy --{,de}compress-debug-sections for zstd) for a use
case.

Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")

---

Note: this patch alone will cause -Wswitch to llvm/lib/ObjCopy/ELF/ELFObject.cpp

Reviewed By: ckissane, dblaikie

Differential Revision: https://reviews.llvm.org/D130506
2022-09-08 00:58:55 -07:00
Nikita Popov 0444b40ed3 Revert "[Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}"
This reverts commit 19dc3cff0f.
This reverts commit 5b19a1f8e8.
This reverts commit 9397648ac8.
This reverts commit 10842b4475.

Breaks the GCC build, as reported here:
https://reviews.llvm.org/D130506#3776415
2022-09-08 09:33:12 +02:00
Fangrui Song 10842b4475 [Support] Work around GCC's enum support 2022-09-08 00:13:25 -07:00
Fangrui Song 5b19a1f8e8 [llvm-objcopy] Support --{,de}compress-debug-sections for zstd
Also, add ELFCOMPRESS_ZSTD (2) from the approved generic-abi proposal:
https://groups.google.com/g/generic-abi/c/satyPkuMisk
("Add new ch_type value: ELFCOMPRESS_ZSTD")

Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")

Reviewed By: jhenderson, dblaikie

Differential Revision: https://reviews.llvm.org/D130458
2022-09-07 23:53:40 -07:00
Fangrui Song 19dc3cff0f [Support] Add llvm::compression::{getReasonIfUnsupported,compress,decompress}
as high-level API on top of `llvm::compression::{zlib,zstd}::*`:

* getReasonIfUnsupported: return nullptr if the specified format is
  supported, or (if unsupported) a string like `LLVM was not built with LLVM_ENABLE_ZLIB ...`
* compress: dispatch to zlib::uncompress or zstd::uncompress
* decompress: dispatch to zlib::uncompress or zstd::uncompress

Move `llvm::DebugCompressionType` from MC to Support to avoid Support->MC cyclic
dependency. There are 40+ uses in llvm-project.

Add another enum class `llvm::compression::Format` to represent supported
compression formats, which may be a superset of ELF compression formats.

See D130458 (llvm-objcopy --{,de}compress-debug-sections for zstd) for a use
case.

Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")

Differential Revision: https://reviews.llvm.org/D130506
2022-09-07 23:53:14 -07:00
gonglingqin d5f7a2182d [LoongArch] Add codegen support for atomicrmw xchg operation on LA32
Depends on D131228

Differential Revision: https://reviews.llvm.org/D131229
2022-09-08 13:57:53 +08:00
gonglingqin b60f801607 [LoongArch] Add codegen support for atomicrmw xchg operation on LA64
In order to avoid the patch being too large, the atomicrmw xchg operation
on LA32 will be added later

Differential Revision: https://reviews.llvm.org/D131228
2022-09-08 13:57:26 +08:00
Fangrui Song f48931f3a8 [NewPM] Switch -filter-passes from ClassName to pass-name
NewPM -filter-passes (D86360) uses ClassName instead of pass-name as used in
`-passes`, `-print-after`, etc. D87216 has added a mechanism to map
ClassName to pass-name. Adopt it for -filter-passes.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D133263
2022-09-07 22:02:26 -07:00
Marco Elver 97c2220565 [SanitizerBinaryMetadata] Introduce SanitizerBinaryMetadata instrumentation pass
Introduces the SanitizerBinaryMetadata instrumentation pass which uses
the new MD_pcsections metadata kinds to instrument certain types of
instructions and functions required for breakpoint-based sanitizers.

The first intended user of the binary metadata emitted will be a variant
of GWP-TSan [1]. GWP-TSan will require information about atomic
accesses; to unambiguously determine if an access is atomic or not, we
also require "covered" information which code has been compiled with
SanitizerBinaryMetadata instrumentation enabled.

[1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D130887
2022-09-07 21:25:40 +02:00
Andrea Di Biagio 3262794804 [MCA] Correctly check pipeline availability for partially overlapping resource groups.
This patch mostly reverts commit 70b37f4c03 which fixed PR50725.

In case of explicit consumption of multiple partially overlapping group
resources, the ResourceManager was not correctly checking pipeline
esources availability.

The fix for PR50725 only partially addressed a few instances of that issue.
This is a more general (although, technically slower) fix for that same issue.

It also fixes Issue #57548

Thanks to Haohai Wen for the small reproducible.
2022-09-07 12:17:59 +01:00
Marco Elver 343700358f [AsmPrinter] Emit PCs into requested PCSections
Interpret MD_pcsections in AsmPrinter emitting the requested metadata to
the associated sections. Functions and normal instructions are handled.

Differential Revision: https://reviews.llvm.org/D130879
2022-09-07 11:36:02 +02:00
Marco Elver 31a548021b [GlobalISel] Propagate PCSections metadata to MachineInstr
Propagate (most) PC sections metadata to MachineInstr when GlobalISel is
doing instruction selection.

This change results in support for architectures using GlobalISel (such
as -O0 with AArch64). Not all instructions may be supported yet, and
requires further target-specific handling (such as done for AArch64
pseudo-atomics). Expanding supported instructions is planned on a
case-by-case basis and new use cases for PC sections metadata.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D130886
2022-09-07 11:36:02 +02:00
Marco Elver 0ba8886af5 [FastISel] Propagate PCSections metadata to MachineInstr
Propagate PC sections metadata to MachineInstr when FastISel is doing
instruction selection.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D130884
2022-09-07 11:36:01 +02:00
Nikita Popov 98a3a340c3 [ConstantExpr] Don't create fneg expressions
Don't create fneg expressions unless explicitly requested by IR or
bitcode.
2022-09-07 11:27:25 +02:00
Marco Elver da695de628 [MachineInstrBuilder] Introduce MIMetadata to simplify metadata propagation
In many places DebugLoc and PCSections metadata are just copied along to
propagate them through MachineInstrs. Simplify doing so by bundling them
up in a MIMetadata class that replaces the DebugLoc argument to most
BuildMI() variants.

The DebugLoc-only constructors allow implicit construction, so that
existing usage of `BuildMI(.., DL, ..)` works as before, and the rest of
the codebase using BuildMI() does not require changes.

NFC.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D130883
2022-09-07 11:22:50 +02:00
Marco Elver 4c58b00801 [SelectionDAG] Propagate PCSections through SDNodes
Add a new entry to SDNodeExtraInfo to propagate PCSections through
SelectionDAG.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D130882
2022-09-07 11:22:50 +02:00
Jay Foad 1427d55d70 [TableGen] Document sequence with stride
Document (in comments) the optional fourth "stride" argument to the
sequence operator, which was added in svn r157416.

Differential Revision: https://reviews.llvm.org/D133297
2022-09-07 09:58:22 +01:00
Vitaly Buka 4c18670776 [NFC][sancov] Rename ModuleSanitizerCoveragePass 2022-09-06 20:55:39 -07:00
Vitaly Buka 5e38b2a456 [NFC][msan] Rename ModuleMemorySanitizerPass 2022-09-06 20:30:35 -07:00
Vitaly Buka 93600eb50c [NFC][asan] Rename ModuleAddressSanitizerPass 2022-09-06 15:02:11 -07:00
Vitaly Buka e7bac3b9fa [msan] Convert Msan to ModulePass
MemorySanitizerPass function pass violatied requirement 4 of function
pass to do not insert globals. Msan nees to insert globals for origin
tracking, and paramereters tracking.

https://llvm.org/docs/WritingAnLLVMPass.html#the-functionpass-class

Reviewed By: kstoimenov, fmayer

Differential Revision: https://reviews.llvm.org/D133336
2022-09-06 15:01:04 -07:00
Arthur Eubanks 7f57c97d30 [ThinLTOBitcodeWriter] Mark pass as required
Or else with -opt-bisect-limit we don't write ThinLTO bitcode.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D133378
2022-09-06 14:47:34 -07:00
bzcheeseman 716b9f7a1a [LLVM][Support/ADT] Add assert for isPresent to dyn_cast.
This change adds an assert to dyn_cast that the value passed-in is present. In the past, this relied on the isa_impl assertion (which still works in many cases) but which we can tighten up for a better QoI.

The PointerUnion change is because it seems like (based on the call sites) the semantics of the member dyn_cast are actually dyn_cast_if_present.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D133221
2022-09-06 13:58:56 -07:00
raghavmedicherla 142f51fc2f Support: Add mapped_file_region::sync(), equivalent to msync
Add mapped_file_region::sync(), equivalent to POSIX msync,
synchronizing written content to disk without unmapping the region.
Asserts if the mode is not mapped_file_region::readwrite.

Note that I don't have access to a Windows machine, so I can't
easily run those unit tests.

Change by dexonsmith

Differential Revision: https://reviews.llvm.org/D95494
2022-09-06 16:46:37 -04:00
Markus Böck f049b2c3fc [MC] Emit Stackmaps before debug info
This patch is essentially an alternative to https://reviews.llvm.org/D75836 and was mentioned by @lhames in a comment.

The gist of the issue is that Mach-O has restrictions on which kind of sections are allowed after debug info has been emitted, which is also properly asserted within LLVM. Problem is that stack maps are currently emitted as one of the last sections in each target-specific AsmPrinter so far, which would cause the assertion to trigger. The current approach of special casing for the `__LLVM_STACKMAPS` section is not viable either, as downstream users can overwrite the stackmap format using plugins, which may want to use different sections.

This patch fixes the issue by emitting the stack map earlier, right before debug info is emitted. The way this is implemented is by taking the choice when to emit the StackMap away from the target AsmPrinter and doing so in the base class. The only disadvantage of this approach is that the `StackMaps` member is now part of the base class, even for targets that do not support them. This is functionaly not a problem however, as emitting an empty `StackMaps` is a no-op.

Differential Revision: https://reviews.llvm.org/D132708
2022-09-06 20:20:56 +02:00
Joseph Huber 58645d3252 [OpenMP] Fix `omp_get_wtime` function being marked incorrectly as readonly
OpenMP has a list of of optimistic attributes that can be attached to
known runtime functions to aid some analysis. The `omp_get_wtime`
function incorrectly used the `readonly` attribute. This is not correct
at the `omp_get_wtime` function changes values depending on some
external state. This is more correctly modeled with
`inaccessiblememonly` meaning that the value does not depend on anything
within the module, but can not be removes as it depends on external
state.

Fixes #57578

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D133360
2022-09-06 12:59:00 -05:00
Jakub Kuderski 20573d11b7 [ADT] Remove is_splat
`is_splat` is superseded by `all_equal` and marked as deprecated.
See the discussion thread for more details:
https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D132336
2022-09-06 13:49:26 -04:00
Matthias Gehre 7948d89afe Fix "[llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64" compilation on Windows 2022-09-06 16:11:14 +01:00
Marco Elver 7d63983c65 [SelectionDAG] Properly copy ExtraInfo on RAUW
During SelectionDAG legalization SDNodes with associated extra info may
be replaced with a new SDNode. Preserve associated extra info on
ReplaceAllUsesWith and remove entries in DeallocateNode.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D130881
2022-09-06 16:32:50 +02:00
Marco Elver cc3faf4226 [SelectionDAG] Rename CallSiteDbgInfo to NodeExtraInfo
For information infrequently attached to SDNodes, it is useful to
provide a way to add this information out-of-line. This is already done
for call-site specific information.

Rename CallSiteDbgInfo to NodeExtraInfo in preparation of adding
additional information not necessarily related to call sites only.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D130880
2022-09-06 16:32:50 +02:00
Matthias Gehre 2090e85fee [llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64
This adds the ExpandLargeDivRem to the default pass pipeline.
The limit at which it expands div/rem instructions is configured
via a new TargetTransformInfo hook (default: no expansion)
X86, Arm and AArch64 backends implement this hook to expand div/rem
instructions with more than 128 bits.

Differential Revision: https://reviews.llvm.org/D130076
2022-09-06 15:32:04 +01:00
Joseph Huber 5dbc7cf7ca [Object] Refactor code for extracting offload binaries
We currently extract offload binaries inside of the linker wrapper.
Other tools may wish to do the same extraction operation. This patch
simply factors out this handling into the `OffloadBinary.h` interface.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D132689
2022-09-06 08:55:16 -05:00
Marco Elver 42836e283f [MachineInstr] Allow setting PCSections in ExtraInfo
Provide MachineInstr::setPCSection(), to propagate relevant metadata
through the backend. Use ExtraInfo to store the metadata.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D130876
2022-09-06 15:52:44 +02:00
Marco Elver c70f6e1362 [Metadata] Introduce MD_pcsections
Introduces MD_pcsections metadata kind. See added documentation for
more details.

Subsequent patches enable propagating PC sections metadata through code
generation to the AsmPrinter.

RFC: https://discourse.llvm.org/t/rfc-pc-keyed-metadata-at-runtime/64191

Reviewed By: dvyukov, vitalybuka

Differential Revision: https://reviews.llvm.org/D130875
2022-09-06 15:52:44 +02:00
Amara Emerson 3dd861818a [GlobalISel] Combine G_INSERT/EXTRACT_VECTOR_ELT with out of bounds indices to undef.
Differential Revision: https://reviews.llvm.org/D133309
2022-09-06 13:45:04 +01:00
luxufan 2e7aed1947 [MemorySSA][NFC] Simplify if condition
Differential Revision: https://reviews.llvm.org/D133332
2022-09-05 10:43:17 +00:00
Eli Friedman 63335afb4e [ARM64EC 2/?] Add target triple, and allow targeting it.
Part of patchset to add initial support for ARM64EC.

Per discussion on review, using the triple arm64ec-pc-windows-msvc. The
parsing works the same way as Apple's alternate Arm ABI "arm64e".

Differential Revision: https://reviews.llvm.org/D125412
2022-09-05 12:27:10 -07:00
Eli Friedman 488ad99ecf [ARM64EC 1/?] Add parsing support to llvm-objdump/llvm-readobj.
This is the first patch of a patchset to add initial support for
ARM64EC. Basic documentation is available at
https://docs.microsoft.com/en-us/windows/uwp/porting/arm64ec-abi .
(Discourse post:
https://discourse.llvm.org/t/initial-patches-for-arm64ec-windows-11-now-posted/62449
.)

The file format for ARM64EC is basically identical to normal ARM64.
There are a few extra sections, but the existing code for reading ARM64
object files just works.

Differential Revision: https://reviews.llvm.org/D125411
2022-09-05 12:25:08 -07:00
Joseph Huber c1d19a8489 [ELF] Provide the GNU hash function in libObject
GNU uses a different hashing function compared to the sys-V standard
function already provided in libObject. This is already used internally
in LLD for generating synthetic sections. This patch simply extracts
this definition and makes it availible to other users of `libObject`.
This is done in preparation for supporting symbol name lookups via the
GNU hash table.

Reviewed By: MaskRay, jhenderson

Differential Revision: https://reviews.llvm.org/D132696
2022-09-05 11:04:57 -05:00
Kazu Hirata 2bb43d72d9 [ADT] Use std::tuple_element_t (NFC) 2022-09-03 23:27:24 -07:00
Kazu Hirata 03c3c2db10 [llvm] Use std::remove_reference_t (NFC) 2022-09-03 23:27:22 -07:00
Kazu Hirata 230e57d221 [ADT] Use std::add_pointer_t (NFC) 2022-09-03 23:27:18 -07:00
Kazu Hirata 9dc6223117 [ADT] Use std::add_lvalue_reference_t (NFC) 2022-09-03 23:27:17 -07:00
Kazu Hirata 2423cf4f88 [Support] Simplify reverseBits with constexpr if (NFC)
Differential Revision: https://reviews.llvm.org/D132814
2022-09-03 23:27:15 -07:00
Kazu Hirata ee40ef7aaf [Support] Simplify isInt and isUInt with constexpr if (NFC)
Differential Revision: https://reviews.llvm.org/D132813
2022-09-03 23:27:13 -07:00
Kazu Hirata 86e8164a8f [llvm] Qualify auto in range-based for loops (NFC)
Identified with readability-qualified-auto.
2022-09-03 11:17:49 -07:00
Kazu Hirata 32aa35b504 Drop empty string literals from static_assert (NFC)
Identified with modernize-unary-static-assert.
2022-09-03 11:17:47 -07:00
Kazu Hirata baee196abb [llvm] Use std::remove_const_t (NFC) 2022-09-03 11:17:45 -07:00
Kazu Hirata 9eca5ed790 [llvm] Use std::enable_if_t (NFC) 2022-09-03 11:17:44 -07:00
Kazu Hirata a7a2872bb7 [ADT] Use std::add_const_t (NFC) 2022-09-03 11:17:42 -07:00
Simon Pilgrim e2d140e9c3 [TTI] Add isExpensiveToSpeculativelyExecute wrapper
CGP uses a raw `getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency) >= TCC_Expensive` check to see if its better to move an expensive instruction used in a select behind a branch instead.

This is causing issues with upcoming improvements to TCK_SizeAndLatency costs on X86 as we need to use TCK_SizeAndLatency as an uop count (so its compatible with various target-specific buffer sizes - see D132288), but we can have instructions that have a low TCK_SizeAndLatency value but should still be treated as 'expensive' (FDIV for example) - by adding a isExpensiveToSpeculativelyExecute wrapper we can keep the current behaviour but still add an x86 override in a future patch when the cost tables are updated to compensate.
2022-09-03 13:12:22 +01:00
Alexey Lapshin 79c8f51c34 [DWARFLinker] Refactor clang modules loading code.
Current implementation of registerModuleReference() function not only
"registers" module reference, but also clones referenced module
(inside loadClangModule()). That may lead to cloning the module with
incorrect options (registerModuleReference() examines module references
and additionally accumulates MaxDwarfVersion and accel tables info).
Since accumulated options may differ from the current values,
it is incorrect to clone modules before options are fully accumulated.

This patch separates "cloning" code from "registering" code. So,
that accumulating option is done in the "registering stage" and
"cloning" is done after all modules are registered and options accumulated.
It also adds a callback for loaded compile units which can be used for
D132755 and D132371(to allow doing options accumulation outside
of DWARFLinker).

Differential Revision: https://reviews.llvm.org/D133047
2022-09-03 11:23:52 +03:00
Craig Topper 5cf510115a [VP] Correct LEGALPOS for more VP nodes.
LEGALPOS appears to only be used by LegalizeVectorOps. It needs
to point at a vector operand. Stores need to point at the second
operand since the result and the first operand are MVT::Other.
Reductions need to point at the second operand since the result
and the first operand are scalsrs.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D133048
2022-09-02 08:04:28 -07:00
Kadir Cetinkaya 4940f205d4
[llvm][Support] Add DenseMapInfo for std::variant
Differential Revision: https://reviews.llvm.org/D133200
2022-09-02 15:36:10 +02:00
Simon Pilgrim 7338f9709b [TTI] Improve description of TargetCostKind enums to aid targets in choosing cost values
I'm not sure how much to add to the description as we've tried to allow targets to interpret the TargetCostKind enums in their own way. But we need to make it clear that certain cost kinds need to match threshold numbers used by various passes (and vice-versa when passes are determining a cost-benefit threshold).

I'm not keen on the "The weighted sum of size and latency" description, but its very difficult to come up with anything else that's suitably generic (e.g. X86 will use uop counts here to easily work with LoopMicroOpBufferSize thresholds, even though high latency fdiv/fsqrt instructions still often have low uop counts).

Differential Revision: https://reviews.llvm.org/D132288
2022-09-02 11:09:06 +01:00
Lang Hames 6ca9f42189 [ORC][ORC-RT] Consistently use pointed-to type as template arg to wrap/unwrap.
Saves wrap/unwrap implementers from having to use std::remove_pointer_t to get
at the pointed-to type.
2022-09-01 20:54:24 -07:00
Lang Hames 06c4634483 [JITLink] Sink ELFX86RelocationKind into implementation file (ELF_x86_64.cpp).
The ELF/x86-64 backend uses the generic x86_64 edges now, so the
ELFX86RelocationKind is just an implementation detail.
2022-09-01 13:36:49 -07:00
Simon Pilgrim e5804a5a61 [ADT] bit.h - replace <stdint.h> with <cstdint>
This is a C++ header after all.
2022-09-01 20:44:56 +01:00
Fangrui Song 8d95fd7e56 [MachineFunctionPass] Support -filter-passes for -print-changed
[MachineFunctionPass] Support -filter-passes for -print-changed

-filter-passes specifies a `PassID` (a lower-case dashed-separated pass name,
also used by -print-after, -stop-after, etc) instead of a CamelCasePass.

`-filter-passes=CamelCaseNewPMPass` seems like a workaround for new PM passes before
we can use lower-case dashed-separated pass names (as used by `-passes=`).

Example:
```
# getPassName() is "IRTranslator". PassID is "irtranslator"
llc -mtriple=aarch64 -print-changed -filter-passes=irtranslator < print-changed-machine.ll
```

Close https://github.com/llvm/llvm-project/issues/57453

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D133055
2022-09-01 11:06:06 -07:00
Wei Yi Tee f6b66cbc7d [llvm][Testing/ADT] Implement `IsStringMapEntry` testing matcher for verifying the entries in a `StringMap`.
Reviewed By: gribozavr2, ymandel, sgatev

Differential Revision: https://reviews.llvm.org/D132753
2022-09-01 17:30:41 +00:00
Ilia Diachkov 698c800142 [SPIRV] support builtin types and ExtInsts selection
The patch adds the support of OpenCL and SPIR-V built-in types. It also
implements ExtInst selection and adds spv_unreachable and spv_alloca
intrinsics which improve the generation of the corresponding SPIR-V code.
Five LIT tests are included to demonstrate the improvement.

Differential Revision: https://reviews.llvm.org/D132648

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2022-09-01 16:44:54 +03:00
Amara Emerson 4cf3db41da [GlobalISel] Add sdiv exact (X, constant) -> mul combine.
This port of the SDAG optimization is only for exact sdiv case.

Differential Revision: https://reviews.llvm.org/D130517
2022-09-01 13:34:00 +01:00
David Spickett 6829cd17b5 [LLVM] Add missing stdint include to Bit.h
To fix failing builds on Windows on Arm:
https://lab.llvm.org/staging/#/builders/59/builds/928/steps/4/logs/stdio

<...>/ADT/bit.h(50,5): error: unknown type name 'uint32_t'
    uint32_t v = Value;
    ^
2022-09-01 09:17:35 +00:00
Sam Clegg 92920c4fe3 [MC][WebAssembly] Allow accurate errors in doBeforeLabelEmit
Although we only currently have one error produced in this function I am
working on changes right now that add some more.  This change makes the
error location more accurate.

Differential Revision: https://reviews.llvm.org/D133016
2022-09-01 01:26:33 -07:00
Arthur Eubanks 04f3c20989 [NFC][LICM] Stop passing around unused BFI
Uses of this were removed in 1a25d0bfbb.
2022-08-31 19:15:34 -07:00
Mark Zhuang 62454e83b0 [NFC] Fix typo
Reviewed By: eopXD

Differential Revision: https://reviews.llvm.org/D133079
2022-08-31 19:08:46 -07:00
Craig Topper 8dce3507a0 [VP] Correct the LEGALPOS for VP_STORE.
VP_STORE has a Chain for operand 0, so the LEGALPOS should be 1.

VP_STORE is always considered Legal for MVT::Other. So I suspect this
was causing vp_store to be ignored by LegalizeVectorOps and instead
handled in LegalizeDAG.

VP_LOAD is Custom expanded in LegalizeVectorOps for RISC-V.

Differential Revision: https://reviews.llvm.org/D132972
2022-08-31 11:15:47 -07:00
Wei Yi Tee d45c04da7c [llvm][ADT] Overload output stream operator `<<` for `StringMapEntry` and `StringMap`.
Printing support enables the production of more useful error messages in unit testing e.g. when using matchers such as `UnorderedElementsAre()` to inspect the contents of a `StringMap`.

Reviewed By: gribozavr2, sgatev, ymandel

Differential Revision: https://reviews.llvm.org/D132747
2022-08-31 17:37:58 +00:00
Arthur Eubanks d0b9c9c0a3 [NFC] clang-format Any.h
To trigger some bots.

Differential Revision: https://reviews.llvm.org/D133033
2022-08-31 10:21:30 -07:00
Daniel Thornburgh ea99225521 [Symbolizer] Handle {{{bt}}} symbolizer markup element.
This adds support for backtrace generation to the llvm-symbolizer markup
filter, which is likely the largest use case.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D132706
2022-08-31 09:49:32 -07:00
Daniel Bertalan f7b752d277
[lld-macho] Set the SG_READ_ONLY flag on __DATA_CONST
This flag instructs dyld to make the segment read-only after fixups have
been performed.

I'm not sure why this flag is needed, as on macOS 13 beta at least,
__DATA_CONST is read-only even without this flag; but ld64 sets it as
well.

Differential Revision: https://reviews.llvm.org/D133010
2022-08-31 17:04:20 +02:00
Hassnaa Hamdi a6d9c944df [AArch64 - SVE]: Use SVE to lower reduce.fadd.
Differential Revision: https://reviews.llvm.org/D132573

skip custom-lowering for v1f64 to be expanded instead, because it has only one lane

Differential Revision: https://reviews.llvm.org/D132959
2022-08-31 12:31:06 +00:00
Alvin Wong 12d865415f [COFF] Use the more accurate GuardFlags definition everywhere
This also modifies llvm-readobj to be more future-proof when printing
the guard FIDs table by calculating the entry size correctly according
to MS docs.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D132924
2022-08-31 15:11:34 +03:00
Alvin Wong 94baaa6a5c [llvm-readobj][COFF] Print load config GuardFlags as enum flags
Print flags as documented in MS docs.
https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#load-configuration-layout
https://docs.microsoft.com/en-us/windows/win32/secbp/pe-metadata

EH_CONTINUATION_TABLE_PRESENT is not mentioned in the docs but is
instead taken from Windows SDK headers.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D132823
2022-08-31 15:01:57 +03:00
Nikita Popov 972840aa3b [IR] Add Instruction::getInsertionPointAfterDef()
Transforms occasionally want to insert an instruction directly
after the definition point of a value. This involves quite a few
different edge cases, e.g. for phi nodes the next insertion point
is not the next instruction, and for invokes and callbrs its not
even in the same block. Additionally, the insertion point may not
exist at all if catchswitch is involved.

This adds a general Instruction::getInsertionPointAfterDef() API to
implement the necessary logic. For now it is used in two places
where this should be mostly NFC. I will follow up with additional
uses where this fixes specific bugs in the existing implementations.

Differential Revision: https://reviews.llvm.org/D129660
2022-08-31 10:50:10 +02:00
Daniel Bertalan 389e0a81a1
[lld-macho] Support synthesizing __TEXT,__init_offsets
This section stores 32-bit `__TEXT` segment offsets of initializer
functions, and is used instead of `__mod_init_func` when chained fixups
are enabled.

Storing the offsets lets us avoid emitting fixups for the initializers.

Differential Revision: https://reviews.llvm.org/D132947
2022-08-31 10:13:45 +02:00
Greg Clayton ea9ac3519c An upcoming patch to LLDB will require the ability to decode base64. This patch adds support for decoding base64 and adds tests.
Resubmission of https://reviews.llvm.org/D126254 with where decodeBase64Byte is no longer a lambda but a static function. Some compilers have different errors or warnings with respect to what needs to be captured and what doesn't (see comments in https://reviews.llvm.org/D126254 for details).

Differential Revision: https://reviews.llvm.org/D128560
2022-08-30 15:52:08 -07:00
Pavel Samolysov 88581db62f [LazyCallGraph] Reformat the code in accordance with the code style. NFC
Also, some local variables were renamed in accordance with the code
style as well as `std::tie` occurrences and `.first`/`.second` member
uses were replaced with structure bindings.

Differential Revision: https://reviews.llvm.org/D132806
2022-08-30 11:06:42 +03:00
Rong Xu 7bc182ed8a fix buildbot build error. 2022-08-29 17:01:27 -07:00
Rong Xu d7ef0c3970 [llvm-profdata] Improve profile supplementation
Current implementation promotes a non-cold function in the SampleFDO profile
into a hot function in the FDO profile. This is too aggressive. This patch
promotes a hot functions in the SampleFDO profile into a hot function, and a
warm function in SampleFDO into a warm function in FDO.

Differential Revision: https://reviews.llvm.org/D132601
2022-08-29 16:50:42 -07:00
Rong Xu db18f26567 [llvm-profdata] Handle internal linkage functions in profile supplementation
This patch has the following changes:
(1) Handling of internal linkage functions (static functions)
Static functions in FDO have a prefix of source file name, while they do not
have one in SampleFDO. Current implementation does not handle this and we are
not updating the profile for static functions. This patch fixes this.

(2) Handling of -funique-internal-linakge-symbols
Again this is for the internal linkage functions. Option
-funique-internal-linakge-symbols can now be applied to both FDO and SampleFDO
compilation. When it is used, it demangles internal linkage function names and
adds a hash value as the postfix.

When both SampleFDO and FDO profiles use this option, or both
not use this option, changes in (1) should handle this.

Here we also handle when the SampleFDO profile using this option while FDO
profile not using this option, or vice versa.

There is one case where this patch won't work: If one of the profiles used
mangled name and the other does not. For example, if the SampleFDO profile
uses clang c-compiler and without -funique-internal-linakge-symbols, while
the FDO profile uses -funique-internal-linakge-symbols. The SampleFDO profile
contains unmangled names while the FDO profile contains mangled names. If
both profiles use c++ compiler, this won't happen. We think this use case
is rare and does not justify the effort to fix.

Differential Revision: https://reviews.llvm.org/D132600
2022-08-29 16:15:12 -07:00
Craig Topper 2f811a6c7f [VP][RISCV] Add vp.fabs intrinsic and RISC-V support.
Mostly just modeled after vp.fneg except there is a
"functional instruction" for fneg while fabs is always an
intrinsic.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D132793
2022-08-29 09:32:06 -07:00
Wei Yi Tee 72ebcf1a53 [llvm][ADT] Fix formatting for files relevant to `StringMap`.
Differential Revision: https://reviews.llvm.org/D132744
2022-08-29 06:57:29 +00:00
Wei Yi Tee af6a35597f Revert "[llvm][ADT] Fix formatting for files relevant to `StringMap`."
This reverts commit d23df9c9e8.
Revert due to missing review link.
2022-08-29 06:43:48 +00:00
Wei Yi Tee d23df9c9e8 [llvm][ADT] Fix formatting for files relevant to `StringMap`. 2022-08-29 06:40:07 +00:00
Kazu Hirata 8feb60756c [llvm] Use range-based for loops (NFC) 2022-08-28 23:28:58 -07:00
Kazu Hirata 87c38323a2 [Support] Remove greatestCommonDivisor and GreatestCommonDivisor64 (NFC)
This patch removes greatestCommonDivisor and GreatestCommonDivisor64
as I've migrated all the uses to std::gcd.
2022-08-28 17:35:08 -07:00
Kazu Hirata ec8605ff52 [llvm] Use std::is_unsigned instead of std::numeric_limits (NFC) 2022-08-28 17:35:06 -07:00
Kazu Hirata ce9f007c7c [llvm] Use llvm::find_if (NFC) 2022-08-28 10:41:48 -07:00
Daniel Bertalan 47e4663c4e
[llvm-objdump] Add -dyld_info to llvm-otool
This option outputs the location, encoded value and target of chained
fixups, using the same format as `otool -dyld_info`.

This initial implementation only supports the DYLD_CHAINED_PTR_64 and
DYLD_CHAINED_PTR_64_OFFSET pointer encodings, which are used in x86_64
and arm64 userspace binaries.

When Apple's effort to upstream their chained fixups code continues,
we'll replace this code with the then-upstreamed code. But we need
something in the meantime for testing ld64.lld's chained fixups code.

Differential Revision: https://reviews.llvm.org/D132036
2022-08-28 09:22:41 +02:00
Benjamin Kramer 1bcf21ca7f Use std::uninitialized_move where appropriate. NFCI. 2022-08-27 14:56:43 +02:00
Anubhab Ghosh c69df92b4f [Orc] Use MapperJITLinkMemoryManager with InProcessMapper in llvm-jitlink tool
MapperJITLinkMemoryManager has slab allocation. Combined with
InProcessMapper, it can replace InProcessMemoryManager.

It can also replace JITLinkSlabAllocator through the InProcessDeltaMapper
that adds an offset to the executor addresses for use in tests.

Differential Revision: https://reviews.llvm.org/D132315
2022-08-27 11:07:09 +05:30
Lang Hames f828135f91 Reapply "[ORC] Add "wrap" and "unwrap" steps to ExecutorAddr..." with fixes.
Reapplies f14cb494a3 (which was reverted in 2f08f8426c) with a fix for UB in
the ExecutorAddr::Unwrap::Unwrap constructor (which caused failures on some
bots).
2022-08-26 14:53:51 -07:00
Lang Hames 2f08f8426c Revert "[ORC] Add "wrap" and "unwrap" steps to ExecutorAddr toPtr/fromPtr."
This reverts commit f14cb494a3.

Reverting while I investigate bot failures, e.g.
https://lab.llvm.org/buildbot#builders/117/builds/8701
2022-08-26 13:54:30 -07:00
Florian Hahn 9405af1c85
[LAA] Require AddRecs to be in the innermost loop for diff-checks.
The simpler diff-checks require pointers with add-recs from the same
innermost loop, but this property wasn't check completely. Add the
missing check to ensure both addrecs are in the innermost loop.

Fixes #57315.
2022-08-26 20:39:52 +01:00
Lang Hames f14cb494a3 [ORC] Add "wrap" and "unwrap" steps to ExecutorAddr toPtr/fromPtr.
The wrap/unwrap operations are applied to pointers after/before conversion to/from
raw addresses. They can be used to tag, untag, sign, or strip signing from
pointers. They currently default to 'rawPtr' (identity) on all platforms, but it
is expected that the default will be set based on the host architecture, e.g.
they would default to signing/stripping for arm64e.
2022-08-26 12:32:44 -07:00
Daniil Fukalov 9c710ebbdb [TTI] NFC: Reduce InstructionCost::getValue() usage...
in order to propagate `InstructionCost` value upper.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D103406
2022-08-26 16:37:32 +03:00
Benjamin Kramer 2c1796b3d6 [ADT] GCC 7 doesn't have constexpr char_traits, add a workaround
LLVM still supports GCC 7. This workaround can be removed when GCC 8
becomes the oldest supported GCC version.

Fixes #57057
2022-08-26 14:11:21 +02:00
Matthias Gehre 3e39b27101 [llvm/CodeGen] Add ExpandLargeDivRem pass
Adds a pass ExpandLargeDivRem to expand div/rem instructions
with more than 128 bits into a loop computing that value.

As discussed on https://reviews.llvm.org/D120327, this approach has the advantage
that it is independent of the runtime library. This also helps the clang driver,
which otherwise would need to understand enough about the runtime library
to know whether to allow _BitInts with more than 128 bits.

Targets are still free to disable this pass and instead provide a faster
implementation in a runtime library.

Fixes https://github.com/llvm/llvm-project/issues/44994

Differential Revision: https://reviews.llvm.org/D126644
2022-08-26 11:55:15 +01:00
Matthias Gehre 6d13b80fcb Revert "[SelectionDAG] Emit calls to __divei4 and friends for division/remainder of large integers"
This reverts https://reviews.llvm.org/D120329.
I abandoned the PR [0] to add __divei4 functions to compiler-rt
in favor of adding a pass to transform div/rem [1].

This removes the backend code that was supposed to emit calls to the __divei4 functions.

[0] https://reviews.llvm.org/D120327
[1] https://reviews.llvm.org/D130076

Differential Revision: https://reviews.llvm.org/D130079
2022-08-26 10:52:56 +01:00
Alex Richardson 0483b00875 Mark the $local function begin symbol as a function
While this does not matter for most targets, when building for Arm Morello,
we have to mark the symbol as a function and add size information, so that
LLD can correctly evaluate relocations against the local symbol.
Since Morello is an out-of-tree target, I tried to reproduce this with
in-tree backends and with the previous reviews applied this results in
a noticeable difference when targeting Thumb.

Background: Morello uses a method similar Thumb where the encoding mode is
specified in the LSB of the symbol. If we don't mark the target as a
function, the relocation will not have the LSB set and calls will end up
using the wrong encoding mode (which will almost certainly crash).

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D131429
2022-08-26 09:34:04 +00:00
Nicolai Hähnle 5e812e9580 Revert "ManagedStatic: remove from DebugCounter"
This reverts commit b5b6ef1500.
2022-08-26 11:02:58 +02:00
Nicolai Hähnle b5b6ef1500 ManagedStatic: remove from DebugCounter
Follow the pattern used in MLIR for the cl::opt instances.

v2:
- make DebugCounter::isCountingEnabled public so that the
  DebugCounterOwner doesn't have to be a nested class. This simplifies
  later changes

v3:
- remove the indirection via DebugCounterOwner::instance()

Differential Revision: https://reviews.llvm.org/D129116
2022-08-26 09:22:11 +02:00
Joe Loser 77eac32716 [ADT] Make `llvm::identity` a transparent function object
`llvm::identity` is similar to `std::identity` from C++20, but one surprising
thing is that `llvm::identity` is not a transparent function object. Add the
`is_transparent` type alias to denote it can be used as a transparent function
object.

Differential Revision: https://reviews.llvm.org/D132628
2022-08-25 21:06:42 -06:00
Nicolai Hähnle a0a2ddfcc5 Revert "ManagedStatic: remove from DebugCounter"
This reverts commit 51d82502d9.

There is a regression in the flang-aarch64-dylib buildbot which is most
likely caused by this change. Reverting until I can investigate.
2022-08-25 19:45:04 +02:00
Nicolai Hähnle af2e54992d [Timer][Statistics] Make global constructor ordering more robust
It was observed in D129117 that the subtle dependency between statistic
and timer code is not entirely robust: the global destructor
~StatisticInfo indirectly calls CreateInfoOutputFile, which requires
the LibSupportInfoOutputFilename to not have been destructed.

By constructing LibSupportInfoOutputFilename before the StatisticInfo
object, the order of destruction is guaranteed.

Differential Revision: https://reviews.llvm.org/D131059
2022-08-25 19:09:49 +02:00
Nicolai Hähnle 51d82502d9 ManagedStatic: remove from DebugCounter
Follow the pattern used in MLIR for the cl::opt instances.

v2:
- make DebugCounter::isCountingEnabled public so that the
  DebugCounterOwner doesn't have to be a nested class. This simplifies
  later changes

Differential Revision: https://reviews.llvm.org/D129116
2022-08-25 19:09:48 +02:00
Dan McGregor 3922ec46b8 [MCContext] Reverse order of DebugPrefixMap sort for generated assembly debug info
Match Clang's sorting, so that longer (more specific) prefix paths will match
before less specific paths.

Reviewed By: MaskRay, raj.khem, #debug-info

Differential Revision: https://reviews.llvm.org/D132390
2022-08-24 21:43:41 -07:00
Valery N Dmitriev a4c8fb9d1f [SLP][NFC] Refactor SLPVectorizerPass::vectorizeRootInstruction method.
The goal is to separate collecting items for post-processing
and processing them. Post processing also outlined as
dedicated method.

Differential Revision: https://reviews.llvm.org/D132603
2022-08-24 17:07:53 -07:00
Mircea Trofin 5ce4c9aa04 [mlgo] Use TFLite for 'development' mode.
TLite is a lightweight, statically linkable[1], model evaluator, supporting a
subset of what the full tensorflow library does, sufficient for the
types of scenarios we envision having. It is also faster.

We still use saved models as "source of truth" - 'release' mode's AOT
starts from a saved model; and the ML training side operates in terms of
saved models.

Using TFLite solves the following problems compared to using the full TF
C API:

- a compiler-friendly implementation for runtime-loadable (as opposed
  to AOT-embedded) models: it's statically linked; it can be built via
  cmake;
- solves an issue we had when building the compiler with both AOT and
  full TF C API support, whereby, due to a packaging issue on the TF
  side, we needed to have the pip package and the TF C API library at
  the same version. We have no such constraints now.

The main liability is it supporting a subset of what the full TF
framework does. We do not expect that to cause an issue, but should that
be the case, we can always revert back to using the full framework
(after also figuring out a way to address the problems that motivated
the move to TFLite).

Details:

This change switches the development mode to TFLite. Models are still
expected to be placed in a directory - i.e. the parameters to clang
don't change; what changes is the directory content: we still need
an `output_spec.json` file; but instead of the saved_model protobuf and
the `variables` directory, we now just have one file, `model.tflite`.

The change includes a utility showing how to take a saved model and
convert it to TFLite, which it uses for testing.

The full TF implementation can still be built (not side-by-side). We
intend to remove it shortly, after patching downstream dependencies. The
build behavior, however, prioritizes TFLite - i.e. trying to enable both
full TF C API and TFLite will just pick TFLite.

[1] thanks to @petrhosek's changes to TFLite's cmake support and its deps!
2022-08-24 16:07:24 -07:00
Sami Tolvanen cff5bef948 KCFI sanitizer
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.

Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.

KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.

A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:

```
.c:
  int f(void);
  int (*p)(void) = f;
  p();

.s:
  .4byte __kcfi_typeid_f
  .global f
  f:
    ...
```

Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.

As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.

Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.

Relands 67504c9549 with a fix for
32-bit builds.

Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay

Differential Revision: https://reviews.llvm.org/D119296
2022-08-24 22:41:38 +00:00
Peter Cooper 6113998069 Add MachO MH_FILESET support to objdump
https://reviews.llvm.org/D131909
2022-08-24 13:34:43 -07:00
Sami Tolvanen a79060e275 Revert "KCFI sanitizer"
This reverts commit 67504c9549 as using
PointerEmbeddedInt to store 32 bits breaks 32-bit arm builds.
2022-08-24 19:30:13 +00:00
Sami Tolvanen 67504c9549 KCFI sanitizer
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.

Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.

KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.

A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:

```
.c:
  int f(void);
  int (*p)(void) = f;
  p();

.s:
  .4byte __kcfi_typeid_f
  .global f
  f:
    ...
```

Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.

As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.

Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.

Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay

Differential Revision: https://reviews.llvm.org/D119296
2022-08-24 18:52:42 +00:00
Daniel Bertalan 686d8ce1ab
[llvm-objdump] Complete -chained_fixups support
This commit adds definitions for the `dyld_chained_import*` structs.
The imports array is now printed with `llvm-otool -chained_fixups`. This
completes this option's implementation.

A slight difference from cctools otool is that we don't yet dump the
raw bytes of the imports entries.

When Apple's effort to upstream their chained fixups code continues,
we'll replace this code with the then-upstreamed code. But we need
something in the meantime for testing ld64.lld's chained fixups code.

Differential Revision: https://reviews.llvm.org/D131982
2022-08-24 19:29:11 +02:00
spupyrev 8d5b694da1 extending code layout alg
The diff modifies ext-tsp code layout algorithm in the following ways:
(i) fixes merging of cold block chains (this is a port of D129397);
(ii) adjusts the cost model utilized for optimization;
(iii) adjusts some APIs so that the implementation can be used in BOLT; this is
a prerequisite for D129895.

The only non-trivial change is (ii). Here we introduce different weights for
conditional and unconditional branches in the cost model. Based on the new model
it is slightly more important to increase the number of "fall-through
unconditional" jumps, which makes sense, as placing two blocks with an
unconditional jump next to each other reduces the number of jump instructions in
the generated code. Experimentally, this makes a mild impact on the performance;
I've seen up to 0.2%-0.3% perf win on some benchmarks.

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D129893
2022-08-24 09:40:25 -07:00
Fangrui Song 3b4d800911 [ELF] Parallelize writes of different OutputSections
We currently process one OutputSection at a time and for each OutputSection
write contained input sections in parallel. This strategy does not leverage
multi-threading well. Instead, parallelize writes of different OutputSections.

The default TaskSize for parallelFor often leads to inferior sharding. We
prepare the task in the caller instead.

* Move llvm::parallel::detail::TaskGroup to llvm::parallel::TaskGroup
* Add llvm::parallel::TaskGroup::execute.
* Change writeSections to declare TaskGroup and pass it to writeTo.

Speed-up with --threads=8:

* clang -DCMAKE_BUILD_TYPE=Release: 1.11x as fast
* clang -DCMAKE_BUILD_TYPE=Debug: 1.10x as fast
* chrome -DCMAKE_BUILD_TYPE=Release: 1.04x as fast
* scylladb build/release: 1.09x as fast

On M1, many benchmarks are a small fraction of a percentage faster. Mozilla showed the largest difference with the patch being about 1.03x as fast.

Differential Revision: https://reviews.llvm.org/D131247
2022-08-24 09:40:03 -07:00
Jonas Devlieghere e854c17b02
[llvm] Teach LLVM about filesets
Teach LLVM about filesets. Filesets were added in macOS 11 (Big Sur) to
combine multiple Mach-O files. They introduce a new load command
(LC_FILESET_ENTRY) consisting of a fileset_entry_command.

  struct fileset_entry_command {
      uint32_t     cmd;        /* LC_FILESET_ENTRY */
      uint32_t     cmdsize;    /* includes entry_id string */
      uint64_t     vmaddr;     /* memory address of the entry */
      uint64_t     fileoff;    /* file offset of the entry */
      union lc_str entry_id;   /* contained entry id */
      uint32_t     reserved;   /* reserved */
  };

This patch teaches LLVM about the new load command and the corresponding
data.

Differential revision: https://reviews.llvm.org/D132432
2022-08-24 09:33:45 -07:00
Simon Pilgrim f9de13232f [X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch
This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis.

For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling.

Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU.

Differential Revision: https://reviews.llvm.org/D132520
2022-08-24 17:28:18 +01:00
Pierre van Houtryve 59cf9dd923 [AMDGPU][GISel] Enable Selection of ADD3 for G_PTR_ADD
Allows things like `(G_PTR_ADD (G_PTR_ADD a, b), c)` to be
simplified into a single ADD3 instruction instead of two adds.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D131254
2022-08-24 14:44:19 +00:00
Alex Richardson 38107171ed [RegisterInfoEmitter] Generate isConstantPhysReg(). NFCI
This commit moves the information on whether a register is constant into
the Tablegen files to allow generating the implementaiton of
isConstantPhysReg(). I've marked isConstantPhysReg() as final in this
generated file to ensure that changes are made to tablegen instead of
overriding this function, but if that turns out to be too restrictive,
we can remove the qualifier.

This should be pretty much NFC, but I did notice that e.g. the AMDGPU
generated file also includes the LO16/HI16 registers now.

The new isConstant flag will also be used by D131958 to ensure that
constant registers are marked as call-preserved.

Differential Revision: https://reviews.llvm.org/D131962
2022-08-24 14:16:20 +00:00
Teresa Johnson d10c1b88f0 [memprof] Correct max size and access count computations
The existing code resulted in the max size and access counts being equal
to the min. Compute the max instead (max lifetime was already correct).

Differential Revision: https://reviews.llvm.org/D132515
2022-08-23 16:53:46 -07:00
Simon Pilgrim 9317e6311f [TTI] Add SK_Splice shuffle mask detection and X86 costs
Enables fixed sized vectors to detect SK_Splice shuffle patterns and provides basic X86 cost support

Differential Revision: https://reviews.llvm.org/D132374
2022-08-23 20:07:30 +01:00
Simon Pilgrim 336a4e03a4 [ADT] Add llvm::has_single_bit helper similar to the c++20 std::has_single_bit implementation
Converted the llvm::isPowerOf2_32/64 helpers into wrappers
2022-08-23 19:51:05 +01:00
Simon Pilgrim 75767a0f9a [Support] MathExtras.h - use llvm::bitcast<> for float-bits cast helpers. NFCI. 2022-08-23 18:27:13 +01:00
Jakub Kuderski 6fa87ec10f [ADT] Deprecate is_splat and replace all uses with all_equal
See the discussion thread for more details:
https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D132335
2022-08-23 11:36:27 -04:00
Philip Reames c9608d57b8 [TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC]
This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet.  The main point of this change is simple consistency with the recently changes getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.
2022-08-23 07:55:42 -07:00
Stephen Tozer 89d0cc99ec [DebugInfo][InstrRef] Handle transfers of variadic debug values in LDV
This patch adds the last of the changes required to enable
DBG_VALUE_LIST handling in InstrRefLDV, handling variadic debug values
during the transfer tracking step. Most of the changes are fairly
straightforward, and based around tracking multiple locations per
variable in TransferTracker::VLocTracker.

Differential Revision: https://reviews.llvm.org/D128211
2022-08-23 15:01:28 +01:00
Florian Hahn 5913d77056
[Globals] Treat nobuiltin fns as maybe-derefined.
Callsites could be marked as `builtin` while calling `nobuiltin`
functions. This can lead to problems, if local optimizations apply
transformations based on the semantics of the builtin, but then IPO
treats the function as `nobuiltin` and applies a transform that breaks
builtin semantics (assumed earlier).

To avoid this, mark such functions as maybey-derefined, to avoid IPO
transforms on them that may break assumptions of earlier calls.

Fixes #57075
Fixes #48366

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D97735
2022-08-23 13:45:10 +01:00
Simon Pilgrim 42a9c1819c [ADT] Add llvm::popcount to <bit> helper wrapper
This patch proposes to move the llvm::detail::PopulationCounter internal helpers into ADT/bit.h and provide a llvm::popcount implementation.

I've left the countPopulation implementation in place in MathExtras.h for now, but updated it to use llvm::popcount.

Hopefully I've got the type_traits correct - I don't use them very often.

Someday we'll move to C++20 with an actual <bit> std header, and we already have this header in place to simplify matters. We'd probably benefit from moving the other <bit> helpers here at some point, but this is a first step.

Differential Revision: https://reviews.llvm.org/D132407
2022-08-23 10:36:43 +01:00
Florian Hahn ff34432649
[LoopUtils] Remove unused Loop arg from addDiffRuntimeChecks (NFC).
The argument is no longer used, remove it.
2022-08-23 10:15:28 +01:00
liqinweng eaa539afa1 [LV][NFC] Modify code comments
Reviewed By: jacquesguan

Differential Revision: https://reviews.llvm.org/D132093
2022-08-23 12:21:53 +08:00
Jakub Kuderski c9e52fbe4d [ADT] Add all_equal predicate
`llvm::all_equal` checks if all values in the given range are equal, i.e., there are no two elements that are not equal.
Similar to `llvm::all_of`, it returns `true` when the range is empty.

`llvm::all_equal` is intended to supersede `llvm::is_splat`, which will be deprecated and removed in future patches.
See the discussion thread for more details:
https://discourse.llvm.org/t/adt-is-splat-and-empty-ranges/64692.

Reviewed By: dblaikie, shchenz

Differential Revision: https://reviews.llvm.org/D132334
2022-08-22 23:55:23 -04:00
Philip Reames 104fa367ee [TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC]
This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both.

This is the change which motivated the whole sequence which preceeded it.  In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as result, we silently passed additional information through a callsite which previously dropped the power-of-two fact.  This might be harmless in most cases, but at least a couple clearly dependend for correctness on not passing that property through.

I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance.  For instance, every parameter which changes type in this change also changes name.  This was intentional to make sure that every call site possible effected must show up in the diff.  This let me audit each one closely.
2022-08-22 15:16:39 -07:00
Philip Reames 478cf94378 [X86][AArch64][WebAsm][RISCV] Query operand properties instead of using enums directly [nfc]
This is part of an ongoing transition to use OperandValueInfo which combines OperandValueKind and OperandValueProperties.  This change adds some accessor methods and uses them to simplify backend code.  The primary motivation of doing so is removing uses of the parameters so that an upcoming api change is less error prone.
2022-08-22 13:37:59 -07:00
David Penry ced705c440 [ModuloSchedule] Add interface call to accept/reject SMS schedules
This interface allows a target to reject a proposed
SMS schedule.  For Hexagon/PowerPC, all schedules
are accepted, leaving behavior unchanged.  For ARM,
schedules which exceed register pressure limits are
rejected.

Also, two RegisterPressureTracker methods now need to be public so
that register pressure can be computed by more callers.

Reapplication of D128941/(reversion:D132037) with small fix.

Differential Revision: https://reviews.llvm.org/D132170
2022-08-22 12:10:13 -07:00
Philip Reames 27d3321c4f [TTI] Use OperandValueInfo in getMemoryOpCost client api [nfc]
This removes the last use of OperandValueKind from the client side API, and (once this is fully plumbed through TTI implementation) allow use of the same properties in store costing as arithmetic costing.
2022-08-22 11:26:31 -07:00
Philip Reames 274f86e7a6 [TTI] Remove OperandValueKind/Properties from getArithmeticInstrCost interface [nfc]
This completes the client side transition to the OperandValueInfo version of this routine.  Backend TTI implementations still use the prior versions for now.
2022-08-22 11:06:32 -07:00
Philip Reames c42a5f1cc2 [TTI] Migrate getOperandInfo to OperandVaueInfo [nfc]
This is part of merging OperandValueKind and OperandValueProperties.
2022-08-22 10:19:02 -07:00
Philip Reames 5cd427106d [TTI] Start process of merging OperandValueKind and OperandValueProperties [nfc]
OperandValueKind and OperandValueProperties both provide facts about the operands of an instruction for purposes of cost modeling.  We've discussed merging them several times; before I plumb through more flags, let's go ahead and do so.

This change only adds the client side interface for getArithmeticInstrCost and makes a couple of minor changes in client code to prove that it works.  Target TTI implementations still use the split flags.  I'm deliberately splitting what could be one big change into a series of smaller ones so that I can lean on the compiler to catch errors along the way.
2022-08-22 09:48:15 -07:00
Matthias Braun b2542c40b9 RegisterClassInfo: Fix CSR cache invalidation
`RegisterClassInfo` caches information like allocation orders and reuses
it for multiple machine functions where possible. However the `MCPhysReg
*CalleeSavedRegs` field used to test whether the set of callee saved
registers changed did not work: After D28566
`MachineRegisterInfo::getCalleeSavedRegs()` can return dynamically
computed CSR sets that are only valid while the `MachineRegisterInfo`
object of the current function exists.

This changes the code to make a copy of the CSR list instead of keeping
a possibly invalid pointer around.

Differential Revision: https://reviews.llvm.org/D132080
2022-08-22 09:28:26 -07:00