Improve the cycle definition, by avoiding usage of not yet defined
or only vaguely defined terminology inside definitions.
More precisely, the existing definition defined "outermost cycles",
and then proceeded to use the term "cycles" for further definitions,
which in turn were used to actually define "cycles".
Now, instead only define "cycles". This does not change the meaning
of a cycle, which depends on the chosen surrounding (subgraph) of a CFG.
Also mention the function CFG in the first definition, because later
later definitions require it anyways.
Also slightly improve the definition of a closed path, by explicitly
requiring the inner nodes to be distinct.
Differential Revision: https://reviews.llvm.org/D130891
This belongs to a series of patches which try to solve the thread
identification problem in coroutines. See
https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015
for a full background.
The problem consists of two concrete problems: TLS variable and readnone
functions. This patch tries to convert the TLS problem to readnone
problem by converting the access of TLS variable to an intrinsic which
is marked as readnone.
The readnone problem would be addressed in following patches.
Reviewed By: nikic, jyknight, nhaehnle, ychen
Differential Revision: https://reviews.llvm.org/D125291
The table of contents in the HTML version of this doc takes up 25 pages
(in my browser, on my 4K monitor) and is too long for me to navigate
comfortably. And most of it is irrelevant detail like this:
- Bitwise Binary Operations
- 'shl' Instruction
- Syntax:
- Overview:
- Arguments:
- Semantics:
- Example:
- 'lshr' Instruction
- Syntax:
- Overview:
- Arguments:
- Semantics:
- Example:
Reducing the contents depth from 4 to 3 removes most of this detail,
leaving just a list of instructions, which only takes up 7 pages and I
find it much easier to navigate.
Incidentally the depth was set to 3 when this document was first
converted to reST and was only increased to 4 in what looks like an
accidental change: 080133453b
Differential Revision: https://reviews.llvm.org/D130635
The SGI page doesn't exist anymore and isn't really relevant at this day
and age.
While at it, added the "other" main C++ website and moved all URLs to
HTTPS.
1. Slightly document the "mark advanced" variable used to control the
installed CMake package dir.
I would document it more, but I am considering in the future adding
pkg-config support in this manner, after which `_PACKGE_DIR` is
probably better called `_CMAKE_PACKGE_DIR` or similar.
2. Convey the custom path to the legacy `llvm-config` binary.
Reviewed By: sebastian-ne
Differential Revision: https://reviews.llvm.org/D130539
Otherwise we have to work pretty hard to ensure a discarded alloc/free
pair doesn't remove a return value that's still useful.
Differential Revision: https://reviews.llvm.org/D130568
This teaches ProcessElfCore to recognise the MTE tag segments.
https://www.kernel.org/doc/html/latest/arm64/memory-tagging-extension.html#core-dump-support
These segments contain all the tags for a matching memory segment
which will have the same size in virtual address terms. In real terms
it's 2 tags per byte so the data in the segment is much smaller.
Since MTE is the only tag type supported I have hardcoded some
things to those values. We could and should support more formats
as they appear but doing so now would leave code untested until that
happens.
A few things to note:
* /proc/pid/smaps is not in the core file, only the details you have
in "maps". Meaning we mark a region tagged only if it has a tag segment.
* A core file supports memory tagging if it has at least 1 memory
tag segment, there is no other flag we can check to tell if memory
tagging was enabled. (unlike a live process that can support memory
tagging even if there are currently no tagged memory regions)
Tests have been added at the commands level for a core file with
mte and without.
There is a lot of overlap between the "memory tag read" tests here and the unit tests for
MemoryTagManagerAArch64MTE::UnpackTagsFromCoreFileSegment, but I think it's
worth keeping to check ProcessElfCore doesn't cause an assert.
Depends on D129487
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D129489
Update the GEP FAQ to use opaque pointers. This requires more than
a syntactic change in some place, because some of the concerns just
don't make sense anymore (trying to index past a ptr member in a
struct for example).
This also fixes uses of incorrect syntax to declare or reference
globals.
Differential Revision: https://reviews.llvm.org/D130353
Update LangRef examples to use opaque pointers in most places.
I've retained typed pointers in a few cases where opaque pointers
don't make much sense, e.g. pointer to pointer bitcasts.
Differential Revision: https://reviews.llvm.org/D130356
Summary:
1. Added a new option object mode -X for llvm-ar. In AIX OS , there is a object mode option -X for ar command.
please see the "-X mode" part of https://www.ibm.com/docs/ko/aix/7.1?topic=ar-command
Specifies the type of object file ar should examine. The mode must be one of the following:
32
Processes only 32-bit object files
64
Processes only 64-bit object files
32_64
Processes both 32-bit and 64-bit object files
any
Processes all of the supported object files.
The default is to process 32-bit object files (ignore 64-bit objects). The mode can also be set with the OBJECT_MODE environment variable. For example, OBJECT_MODE=64 causes ar to process any 64-bit objects and ignore 32-bit objects. The -X flag overrides the OBJECT_MODE variable.
2. Before adding the new option -X, the default behaviors of llvm-ar like -Xany, but after the adding the new option -X, the default behaviors of llvm-ar change to -X32 ,in order to let some test cases which has 32bit and 64bit object file in the same llvm-ar command, we need to add the "export OBJECT_MODE=any" into test case to change the default behaviors of llvm-ar's object mode.
Reviewers: James Henderson, Owen Reynolds, Fangrui Song
Differential Revision: https://reviews.llvm.org/D127864
Opaque pointers support is complete and default. Specify ptr as
the normal pointer type and i8* as something supported under
non-default options.
A larger update of examples in LangRef is still needed.
This change implements the contextual symbolizer markup elements: reset,
module, and mmap. These provide information about the runtime context of
the binary necessary to resolve addresses to symbolic values.
Summary information is printed to the output about this context.
Multiple mmap elements for the same module line are coalesced together.
The standard requires that such elements occur on their own lines to
allow for this; accordingly, anything after a contextual element on a
line is silently discarded.
Implementing this cleanly requires that the filter drive the parser;
this allows skipped sections to avoid being parsed. This also makes the
filter quite a bit easier to use, at the cost of some unused
flexibility.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D129519
To solve the readnone problems in coroutines. See
https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015
for details.
According to the discussion, we decide to fix the problem by inserting
isPresplitCoroutine() checks in different passes instead of
wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes.
In this direction, we might not be able to cover every case at first.
Let's take a "find and fix" strategy.
Reviewed By: nikic, nhaehnle, jyknight
Differential Revision: https://reviews.llvm.org/D127383
This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html
llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal):
```
./llvm-dwarfutil [options] <input file> <output file>
--garbage-collection Do garbage collection for debug info(default)
-j <value> Alias for --num-threads
--no-garbage-collection Don`t do garbage collection for debug info
--no-odr-deduplication Don`t do ODR deduplication for debug types
--no-odr Alias for --no-odr-deduplication
--no-separate-debug-file
Create single output file, containing debug tables(default)
--num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine
--odr-deduplication Do ODR deduplication for debug types(default)
--odr Alias for --odr-deduplication
--separate-debug-file Create two output files: file w/o debug tables and file with debug tables
--tombstone [bfd,maxpc,exec,universal]
Tombstone value used as a marker of invalid address(default: universal)
=bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec
=maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges
=exec - Match with address ranges of executable sections
=universal - Both: bfd and maxpc
```
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D86539
This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html
llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal):
```
./llvm-dwarfutil [options] <input file> <output file>
--garbage-collection Do garbage collection for debug info(default)
-j <value> Alias for --num-threads
--no-garbage-collection Don`t do garbage collection for debug info
--no-odr-deduplication Don`t do ODR deduplication for debug types
--no-odr Alias for --no-odr-deduplication
--no-separate-debug-file
Create single output file, containing debug tables(default)
--num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine
--odr-deduplication Do ODR deduplication for debug types(default)
--odr Alias for --odr-deduplication
--separate-debug-file Create two output files: file w/o debug tables and file with debug tables
--tombstone [bfd,maxpc,exec,universal]
Tombstone value used as a marker of invalid address(default: universal)
=bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec
=maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges
=exec - Match with address ranges of executable sections
=universal - Both: bfd and maxpc
```
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D86539
Back in 2017, a table was added to the codegen documentation listing
which features various backends support. It received a few updates since
then, but not since the end of 2019. Having such a table is a nice idea,
but it hasn't been kept up to date, it isn't easy to ensure that it is
up to date, and the table probably isn't very discoverable for most
users who would be interested in this information anyway (it would be
better suited to some kind of "what can LLVM do for me?" page).
For all of the above reasons, I believe it makes sense to remove it.
Differential Revision: https://reviews.llvm.org/D129996
D123493 introduced llvm::Module::Min to encode module flags metadata for AArch64
BTI/PAC-RET. llvm::Module::Min does not take effect when the flag is absent in
one module. This behavior is misleading and does not address backward
compatibility problems (when a bitcode with "branch-target-enforcement"==1 and
another without the flag are merged, the merge result is 1 instead of 0).
To address the problems, require Min flags to be non-negative and treat absence
as having a value of zero. For an old bitcode without
"branch-target-enforcement"/"sign-return-address", its value is as if 0.
Differential Revision: https://reviews.llvm.org/D129911
This change introduces the dynamic stack boolean field to code-object-v3
and above under the code properties of the kernel descriptor and under
the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to
the is_dynamic_callstack field of amd_kernel_code_t.
Differential Revision: https://reviews.llvm.org/D128344
is out of range. Both intrinsics return a poison value.
Consequently, mark the intrinsics speculatable.
Differential Revision: https://reviews.llvm.org/D129656
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:
; Before:
%res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
to label %asm.fallthrough [label %foo]
; After:
%res = callbr i8* asm "", "=r,r,!i"(i8* %x)
to label %asm.fallthrough [label %foo]
The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:
* Allow unrolling/peeling/rotation of callbr, or any other
clone-based optimizations
(https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
(https://github.com/llvm/llvm-project/issues/45248)
This is just the IR representation change though, I will follow up
with patches to remove limtations in various transformation passes
that are no longer needed.
Differential Revision: https://reviews.llvm.org/D129288
The request is mentioned on D129053. I feel that having this functionality is
mildly useful (not strong).
* Rename .ctors to .init_array and change sh_type to SHT_INIT_ARRAY (GNU objcopy
detects the special name but we don't).
* Craft tests for a new SHT_LLVM_* extension
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D129337
sanitize_none was never actually committed, and should be removed.
no_sanitize_memtag is to be removed in D128950.
sanitize_memtag is new in D128950.
Also update the comments on other no_sanitize_* to indicate that they're
impacted by the sanitizer ignorelist and the global-disable attribute.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D129410
It is illegal to merge two `llvm.coro.save` calls unless their
`llvm.coro.suspend` users are also merged. Marks it "nomerge" for
the moment.
This reverts D129025.
Alternative to D129025, which affects other token type users like WinEH.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D129530
Users upgrading to PHP 8.1 might start observing failures with `arc`.
Commit @ychen's suggestions as a patch in tree that can be applied since
arcanist is no longer accepting patches.
Also, remove the suggestion to apply an external patch updating CA
certs. It seems that this was fixed in upstream arcanist before they
stopped accepting patches. Compare
e3659d43d8
vs
13d3a3c3b1
Link: https://secure.phabricator.com/book/phabcontrib/article/contributing_code/
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D129232
* GNU objcopy supports --set-section-flags src=... --rename-section src=tst and --set-section-flags runs first.
* GNU objcopy processes --update-section before --rename-section.
To match the two behaviors, postpone --rename-section and allow its use together
with --set-section-flags.
As a side effect, --rename-section=.foo1=.foo2 --add-section=.foo1=/dev/null
leads to .foo2 while GNU objcopy surprisingly produces .foo1 (so
--set-section-flags --add-section --rename-section do not form a total order).
I think the deviation is fine as a total order makes more sense.
Rename set-section-flags-and-rename.test to
set-section-attr-and-rename.test and additionally test --set-section-alignment
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D129336
* Remove crc32 from zlib compression namespace, people should use the `llvm::crc32` instead.
Reviewed By: MaskRay, leonardchan
Differential Revision: https://reviews.llvm.org/D128754
* Refactor compression namespaces across the project, making way for a possible
introduction of alternatives to zlib compression.
Changes are as follows:
* Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`.
Reviewed By: MaskRay, leonardchan, phosek
Differential Revision: https://reviews.llvm.org/D128953
SelectionDAG has a target hook, getExtendForAtomicOps, which it uses
in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty
ugly (as is having a separate load opcode for atomics), so instead
allow making use of atomic zextload. Enable this for AArch64 since the
DAG path defaults in to the zext behavior.
The tablegen changes are pretty ugly, but partially helps migrate
SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with
atomic memory operands. For now the DAG emitter will emit matchers for
patterns which the DAG will not produce.
I'm still a bit confused by the intent of the isLoad/isStore/isAtomic
bits. The DAG implementation rejects trying to use any of these in
combination. For now I've opted to make the isLoad checks also check
isAtomic, although I think having isLoad and isAtomic set on these
makes most sense.
For code contribution, GettingStarted.rst duplicates information in Contributing.rst.
The dedicated Contributing.rst is a better place for code contribution, so move
the content there.
Notes:
* D41665 added `Contributing.rst`
* D110976 mentioned `git cherry-pick e3659d43d8911e91739f3b0c5935598bceb859aa` workaround
Reviewed By: cjdb, fhahn, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D129255
This patchs adds a new metadata kind `exclude` which implies that the
global variable should be given the necessary flags during code
generation to not be included in the final executable. This is done
using the ``SHF_EXCLUDE`` flag on ELF for example. This should make it
easier to specify this flag on a variable without needing to explicitly
check the section name in the target backend.
Depends on D129053 D129052
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D129151
Currently we use the `.llvm.offloading` section to store device-side
objects inside the host, creating a fat binary. The contents of these
sections is currently determined by the name of the section while it
should ideally be determined by its type. This patch adds the new
`SHT_LLVM_OFFLOADING` section type to the ELF section types. Which
should make it easier to identify this specific data format.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D129052
Currently we use the `embedBufferInModule` function to store binary
strings containing device offloading data inside the host object to
create a fatbinary. In the case of LTO, we need to extract this object
from the LLVM-IR. This patch adds a metadata node for the embedded
objects containing the embedded pointers and the sections they were
stored at. This should create a cleaner interface for identifying these
values.
In the future it may be worthwhile to also encode an `ID` in the
metadata corresponding to the object's special section type if relevant.
This would allow us to extract the data from an object file and LLVM-IR
using the same ID.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D129033
Not deleting the loose instruction with metadata associated to it causes
an assertion when the LLVMContext is destroyed. This was previously
hidden by the fact that llvm-c-test does not call LLVMShutdown. The
planned removal of ManagedStatic exposed this issue.
Differential Revision: https://reviews.llvm.org/D129114
This patch adds the support for `fmax` and `fmin` operations in `atomicrmw`
instruction. For now (at least in this patch), the instruction will be expanded
to CAS loop. There are already a couple of targets supporting the feature. I'll
create another patch(es) to enable them accordingly.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D127041
Add support for the RDPRU instruction on Zen2 processors.
User-facing features:
- Clang option -m[no-]rdpru to enable/disable the feature
- Support is implicit for znver2/znver3 processors
- Preprocessor symbol __RDPRU__ to indicate support
- Header rdpruintrin.h to define intrinsics
- "rdpru" mnemonic supported for assembler code
Internal features:
- Clang builtin __builtin_ia32_rdpru
- IR intrinsic @llvm.x86.rdpru
Differential Revision: https://reviews.llvm.org/D128934
This adds a release note entry for the new -mframe-chain option
introduced on D125094.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D129085
D128820 stopped creating div/rem constant expressions by default;
this patch removes support for them entirely.
The getUDiv(), getExactUDiv(), getSDiv(), getExactSDiv(), getURem()
and getSRem() on ConstantExpr are removed, and ConstantExpr::get()
now only accepts binary operators for which
ConstantExpr::isSupportedBinOp() returns true. Uses of these methods
may be replaced either by corresponding IRBuilder methods, or
ConstantFoldBinaryOpOperands (if a constant result is required).
On the C API side, LLVMConstUDiv, LLVMConstExactUDiv, LLVMConstSDiv,
LLVMConstExactSDiv, LLVMConstURem and LLVMConstSRem are removed and
corresponding LLVMBuild methods should be used.
Importantly, this also means that constant expressions can no longer
trap! This patch still keeps the canTrap() method to minimize diff --
I plan to drop it in a separate NFC patch.
Differential Revision: https://reviews.llvm.org/D129148
This patch adds support for Arm's Cortex-M85 CPU. The Cortex-M85 CPU is
an Arm v8.1m Mainline CPU, with optional support for MVE and PACBTI,
both of which are enabled by default.
Parts have been coauthored by by Mark Murray, Alexandros Lamprineas and
David Green.
Differential Revision: https://reviews.llvm.org/D128415
Specifically:
- Diffs are not passed around on mailing lists any more.
- Diffs should be `-U999999`.
- Clarify part about automated emails.
Differential review: https://reviews.llvm.org/D128645
This removes the insertvalue constant expression, as part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
This is very similar to the extractvalue removal from D125795.
insertvalue is also not supported in bitcode, so no auto-ugprade
is necessary.
ConstantExpr::getInsertValue() can be replaced with
IRBuilder::CreateInsertValue() or ConstantFoldInsertValueInstruction(),
depending on whether a constant result is required (with the latter
being fallible).
The ConstantExpr::hasIndices() and ConstantExpr::getIndices()
methods also go away here, because there are no longer any constant
expressions with indices.
Differential Revision: https://reviews.llvm.org/D128719
Summary of changes:
- Remove dst for global_atomic_add_f32, global_atomic_pk_add_f16.
- Make vdata input-only for buffer_atomic_add_f32, buffer_atomic_pk_add_f16.
- Other minor improvements.
Removes an unused function argument from a code listing in the Kaleidoscope turorial in step 9.
Reviewed By: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D128628
With this change, fuzz targets may choose to return -1
to indicate that the input should not be added to the corpus
regardless of the coverage it generated.
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D128749
GNU objdump disassembles all unknown instructions by default. Match this user
friendly behavior with the cpu value `future`.
Differential Revision: https://reviews.llvm.org/D127824
GNU objdump disassembles all unknown instructions by default. Match this user
friendly behavior with the target feature "all" (D128029) designed for disassemblers.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D128030
This option allows printing all sources used by an object file.
Reviewed By: dblaikie, jhenderson
Differential Revision: https://reviews.llvm.org/D87656
clang 14 removed -gz=zlib-gnu support and ld.lld removed linker input support
for zlib-gnu in D126793. Now let's remove zlib-gnu from llvm-objcopy.
* .zdebug* sections are no longer recognized as debug sections. --strip* don't remove them.
They are copied like other opaque sections
* --decompress-debug-sections does not uncompress .zdebug* sections
* --compress-debug-sections=zlib-gnu is not supported
It is very rare but in case a user has object files using .zdebug . They can use
llvm-objcopy<15 or GNU objcopy for uncompression.
--compress-debug-sections=zlib-gnu is unlikely ever used by anyone, so I do not
add a custom diagnostic.
Differential Revision: https://reviews.llvm.org/D128688
From binutils 2.34 onwards, ar supports --output to specify a directory
where archive members should be extracted to. Port this feature.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D128626
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to be used with more flexibility in the future. An use-case example for the feature field is encoding multi-section functions more concisely using a different format.
Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address.
This encoding uses smaller values compared to the existing one (offsets relative to function symbol).
Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary).
The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D121346
This removes the extractvalue constant expression, as part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
extractvalue is already not supported in bitcode, so we do not need
to worry about bitcode auto-upgrade.
Uses of ConstantExpr::getExtractValue() should be replaced with
IRBuilder::CreateExtractValue() (if the fact that the result is
constant is not important) or ConstantFoldExtractValueInstruction()
(if it is). Though for this particular case, it is also possible
and usually preferable to use getAggregateElement() instead.
The C API function LLVMConstExtractValue() is removed, as the
underlying constant expression no longer exists. Instead,
LLVMBuildExtractValue() should be used (which will constant fold
or create an instruction). Depending on the use-case,
LLVMGetAggregateElement() may also be used instead.
Differential Revision: https://reviews.llvm.org/D125795
Information in the function `Prologue Data` is intentionally opaque.
When a function with `Prologue Data` is duplicated. The self (global
value) references inside `Prologue Data` is still pointing to the
original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`.
This patch detaches the information from function `Prologue Data`
and attaches it to a function metadata node.
This and D116130 fix https://github.com/llvm/llvm-project/issues/49689.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D115844
This adds a --filter option to llvm-symbolizer. This takes log-bearing
symbolizer markup from stdin and writes a human-readable version to
stdout.
For now, this only implements the "symbol" markup tag; all others are
passed through unaltered. This is a proof-of-concept bit of
functionalty; implement the various tags is more-or-less just a matter
of hooking up various parts of the Symbolize library to the architecture
established here.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D126980
Adding release note entries for LLVM & Clang to introduce the HLSL &
DirectX support that is being added.
Reviewed By: aaron.ballman, MaskRay
Differential Revision: https://reviews.llvm.org/D127890
When a function does a dynamic stack allocation, the function's stack
size (in the stack map) is reported as UINT64_MAX.
This change tests and documents this property.
Differential Revision: https://reviews.llvm.org/D128525
These intrinsics are now fundemental for SVE code generation and have been
present for a year and a half, hence move them out of the experimental
namespace.
Differential Revision: https://reviews.llvm.org/D127976
The `DW_AT_LLVM_address_space` attribute is mentioned as `DW_AT_address_space` in
one of the notes.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D128210
This adds LLVMGetAggregateElement() as a wrapper for
Constant::getAggregateElement(), which allows fetching a
struct/array/vector element without handling different possible
underlying representations.
As the changed echo test shows, previously you for example had to
treat ConstantArray (use LLVMGetOperand) and ConstantDataArray
(use LLVMGetElementAsConstant) separately, not to mention all the
other possible representations (like PoisonValue).
I've deprecated LLVMGetElementAsConstant() in favor of the new
function, which is strictly more powerful (but I could be convinced
to drop the deprecation).
This is partly motivated by https://reviews.llvm.org/D125795,
which drops LLVMConstExtractValue() because the underlying constant
expression no longer exists. This function could previously be used
as a poor man's getAggregateElement().
Differential Revision: https://reviews.llvm.org/D128417
Let's introduce and publish an LLVM community calendar.
The idea is that organizers of events such as online sync-ups or office hours
invite calendar@llvm.org to the event they're creating. That way, the calendar
publicly visible at
https://calendar.google.com/calendar/u/0/embed?src=calendar@llvm.org will show
the event.
The hope is that having a single calendar showing all LLVM events makes it
easier for both new comers and experienced people to discover events they're
interested in.
This patch partially implements https://github.com/llvm/llvm-project/issues/55426
We could also give pointers to the calendar in a few other places, e.g. from
the main LLVM page, but let's introduce the incrementally.
Differential Revision: https://reviews.llvm.org/D127852
We can cast a string to a record via !cast, but we have no mechanism
to check if it is valid and TableGen will raise an error if failed to
cast. Besides, we have no semantic null in TableGen (we have `?` but
different backends handle uninitialized value differently), so operator
like `dyn_cast<>` is hard to implement.
In this patch, we add a new operator `!exists<T>(s)` to check whether
a record with type `T` and name `s` exists. Self-references are allowed
just like `!cast`.
By doing these, we can write code like:
```
class dyn_cast_to_record<string name> {
R value = !if(!exists<R>(name), !cast<R>(name), default_value);
}
defvar v = dyn_cast_to_record<"R0">.value; // R0 or default_value.
```
Reviewed By: tra, nhaehnle
Differential Revision: https://reviews.llvm.org/D127948
The added section and table here list the object file formats LLVM MC
supports and which targets support each format.
Differential Revision: https://reviews.llvm.org/D127645
This document is a work in progress to begin fleshing out documentation
for the DirectX backend and related changes in the LLVM project.
This is not intended to be exhaustive or complete, it is intended as a
starting
point so taht future changes have a place for documentation to land.
Differential Revision: https://reviews.llvm.org/D127640
This resolves problems reported in commit 1a20252978.
1. Promote to float lowering for nodes XINT_TO_FP
2. Bail out f16 from shuffle combine due to vector type is not legal in the version
Automatically generate a reproducer when dsymutil crashes. We already
support generating reproducers with the --gen-reproducer flag, which
emits a reproducer on exit. This patch adds support for doing the same
on a crash and makes it the default behavior.
rdar://68357665
Differential revision: https://reviews.llvm.org/D127441
GCC and Clang/LLVM will support `_Float16` on X86 in C/C++, following
the latest X86 psABI. (https://gitlab.com/x86-psABIs)
_Float16 arithmetic will be performed using native half-precision. If
native arithmetic instructions are not available, it will be performed
at a higher precision (currently always float) and then truncated down
to _Float16 immediately after each single arithmetic operation.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D107082
In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`.
The idea is to get support from the compiler to easily write efficient memory function implementations.
This patch could be split in two:
- one for the LLVM part adding the `llvm.memset.inline.*` intrinsics.
- and another one for the Clang part providing the instrinsic as a builtin.
Differential Revision: https://reviews.llvm.org/D126903
In GFX10 dlc controlled L1 cache bypass. In GFX11 it has been repurposed
to control MALL NOALLOC, and glc controls L1 as well as L0 cache bypass.
Update the documentation and SIMemoryLegalizer accordingly. Set dlc for
nontemporal and volatile accesses.
Differential Revision: https://reviews.llvm.org/D127405
I've seen a few people try to enable opaque pointers with LLVM 14
already. While LLVM 14 has pretty good baseline support, there are
enough missing pieces that you're definitely going to hit assertion
failures if you try this.
Add some wording to make it clear what the support (or planned
support) for opaque/typed pointers is across LLVM 14, 15, and 16.
They are now `using` aliases and thus the comments about iplist are now
incorrect. Remove them here.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D95210
This adds code blocks and inline code formatting to improve the readability of
the instruction referencing doc.
Reviewed By: Orlando
Differential Revision: https://reviews.llvm.org/D126767
This patch updates the document with some advanced use cases and examples on how to set up and use LLVM-style RTTI. It includes a few motivating examples to get readers comfortable with the concepts.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D126943
This enabled opaque pointers by default in LLVM. The effect of this
is twofold:
* If IR that contains *neither* explicit ptr nor %T* types is passed
to tools, we will now use opaque pointer mode, unless
-opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
It is possible to opt-out by calling setOpaquePointers(false) on
LLVMContext.
A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.
Differential Revision: https://reviews.llvm.org/D126689
LTO code may end up mixing bitcode files from various sources varying in
their use of opaque pointer types. The current strategy to decide
between opaque / typed pointers upon the first bitcode file loaded does
not work here, since we could be loading a non-opaque bitcode file first
and would then be unable to load any files with opaque pointer types
later.
So for LTO this:
- Adds an `lto::Config::OpaquePointer` option and enforces an upfront
decision between the two modes.
- Adds `-opaque-pointers`/`-no-opaque-pointers` options to the gold
plugin; disabled by default.
- `--opaque-pointers`/`--no-opaque-pointers` options with
`-plugin-opt=-opaque-pointers`/`-plugin-opt=-no-opaque-pointers`
aliases to lld; disabled by default.
- Adds an `-lto-opaque-pointers` option to the `llvm-lto2` tool.
- Changes the clang driver to pass `-plugin-opt=-opaque-pointers` to
the linker in LTO modes when clang was configured with opaque
pointers enabled by default.
This fixes https://github.com/llvm/llvm-project/issues/55377
Differential Revision: https://reviews.llvm.org/D125847
While working on a clang-format option RemoveBracesLLVM that removes
braces following the guideline, we were unsure about what to do with
the braces of do-while loops. The ratio of using to omitting the
braces is about 4:1 in the llvm-project source, so it will help to
add an example to the guideline.
Also cleans up the original examples including making the nested if
example more targeted on avoiding potential dangling else situations.
Differential Revision: https://reviews.llvm.org/D126512
I chose to encode the allockind information in a string constant because
otherwise we would get a bit of an explosion of keywords to deal with
the possible permutations of allocation function types.
I'm not sure that CodeGen.h is the correct place for this enum, but it
seemed to kind of match the UWTableKind enum so I put it in the same
place. Constructive suggestions on a better location most certainly
encouraged.
Differential Revision: https://reviews.llvm.org/D123088
This patch fixes formatting inside Functions section of declare
by making it consistent with the way how define is written.
Fixes#39844
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D125581
This change is a small refactor of Functions section
to update placement of define syntax.
Reviewed By: RKSimon
Differential revision: https://reviews.llvm.org/D125831
This adds support for pointer types for `atomic xchg` and let us write
instructions such as `atomicrmw xchg i64** %0, i64* %1 seq_cst`. This
is similar to the patch for allowing atomicrmw xchg on floating point
types: https://reviews.llvm.org/D52416.
Differential Revision: https://reviews.llvm.org/D124728
Latest GNU nm (milestone: 2.39) has added -W/--no-weak and changed -U to mean
--defined-only (instead of --unicode=). The changes match our semantics.
Close#55297
Reviewed by: jhenderson, keith
Differential Revision: https://reviews.llvm.org/D126133
Currently added versions are from v1.0 to v1.5, other versions
can be added as needed.
This change also adds documentation about SPIR-V target support
in LLVM.
Differential Revision: https://reviews.llvm.org/D124776
This used to be D102158, but all the code it describes got re-written, so I
figured I'd take another shot at documenting the new instruction referencing
variable locations, this time from a higher level. Happily there's no longer any
need to describe LiveDebugValues in any detail seeing how it's all SSA-based
now.
Probably the most important part is the explanation of what targets need to do
to support instruction referencing. The list is small, mostly because there's
nothing especially complicated that targets need to do: just instrument their
target-specific optimisations and implement the stack spill/restore recognition
target hooks.
This is a small amount of text (which is a virtue), I'm extremely happy to
expand on anything.
Differential Revision: https://reviews.llvm.org/D113586
Co-authored-by: Jeremy Morse <jeremy.morse@sony.com>
These no longer exist. A few have been added since but I'm not enough of
an expert to provide a useful blurb on them outside of what you see with
`--help`.
Differential Revision: https://reviews.llvm.org/D122361
This summarises the changes made by d9398a91e2.
Which forms the bulk of the fixes needed for non-address bit handling.
Note that in the previous releases we noted memory tagging support,
which is a subset of non-address bits. The recent changes enable
debugging of programs using memory tagging, pointer authentication
and top byte ignore (all at once) on AArch64.
This is off by default. If you get a result and that
memory has memory tags, when --show-tags is given you'll
see the tags inline with the memory content.
```
(lldb) memory read mte_buf mte_buf+64 --show-tags
<...>
0xfffff7ff8020: 00 00 00 00 00 00 00 00 0d f0 fe ca 00 00 00 00 ................ (tag: 0x2)
<...>
(lldb) memory find -e 0xcafef00d mte_buf mte_buf+64 --show-tags
data found at location: 0xfffff7ff8028
0xfffff7ff8028: 0d f0 fe ca 00 00 00 00 00 00 00 00 00 00 00 00 ................ (tags: 0x2 0x3)
0xfffff7ff8038: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ (tags: 0x3 0x4)
```
The logic for handling alignments is the same as for memory read
so in the above example because the line starts misaligned to the
granule it covers 2 granules.
Depends on D125089
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D125090
This reverts commit 3e928c4b9d.
This fixes an issue seen on Windows where we did not properly
get the section names of regions if they overlapped. Windows
has regions like:
[0x00007fff928db000-0x00007fff949a0000) ---
[0x00007fff949a0000-0x00007fff949a1000) r-- PECOFF header
[0x00007fff949a0000-0x00007fff94a3d000) r-x .hexpthk
[0x00007fff949a0000-0x00007fff94a85000) r-- .rdata
[0x00007fff949a0000-0x00007fff94a88000) rw- .data
[0x00007fff949a0000-0x00007fff94a94000) r-- .pdata
[0x00007fff94a94000-0x00007fff95250000) ---
I assumed that you could just resolve the address and get the section
name using the start of the region but here you'd always get
"PECOFF header" because they all have the same start point.
The usual command repeating loop used the end address of the previous
region when requesting the next, or getting the section name.
So I've matched this in the --all scenario.
In the example above, somehow asking for the region at
0x00007fff949a1000 would get you a region that starts at
0x00007fff949a0000 but has a different end point. Using the load
address you get (what I assume is) the correct section name.
Steve Klabnik recently left the Rust project. Josh Stone (the other member of
the Rust Security Response WG) replaces him as one of the vendor contacts for
Rust.
Differential Revision: https://reviews.llvm.org/D119137
This brings clang/llvm into line with GCC. The Pass is still enabled for
the affected cores, but is now opt-in when using `-march=`.
I also took the opportunity to add release notes for this change.
Reviewed By: john.brawn
Differential Revision: https://reviews.llvm.org/D125775
This adds an option to the memory region command
to print all regions at once. Like you can do by
starting at address 0 and repeating the command
manually.
memory region [-a] [<address-expression>]
(lldb) memory region --all
[0x0000000000000000-0x0000000000400000) ---
[0x0000000000400000-0x0000000000401000) r-x <...>/a.out PT_LOAD[0]
<...>
[0x0000fffffffdf000-0x0001000000000000) rw- [stack]
[0x0001000000000000-0xffffffffffffffff) ---
The output matches exactly what you'd get from
repeating the command. Including that it shows
unmapped areas between the mapped regions.
(this is why Process GetMemoryRegions is not
used, that skips unmapped areas)
Help text has been updated to show that you can have
an address or --all but not both.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D111791
this review is extracted from D86539.
1. Rename AccelTableKind to DwarfLinkerAccelTableKind
(to differentiate from AccelTableKind from CodeGen/AsmPrinter/DwarfDebug.h)
2. Add None value to the DwarfLinkerAccelTableKind.
3. added 'None' value for 'accelerator' option of dsymutil.
Differential Revision: https://reviews.llvm.org/D125474
`--symbolize-operands` already symbolizes branch targets based on the disassembly. When the object file is created with `-fbasic-block-sections=labels` (ELF-only) it will include a SHT_LLVM_BB_ADDR_MAP section which maps basic blocks to their addresses. In such case `llvm-objdump` can annotate the disassembly based on labels inferred on this section.
In contrast to the current labels, SHT_LLVM_BB_ADDR_MAP-based labels are created for every machine basic block including empty blocks and those which are not branched into (fallthrough blocks).
The old logic is still executed even when the SHT_LLVM_BB_ADDR_MAP section is present to handle functions which have not been received an entry in this section.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D124560
st_size may not be of importance to the abi if you are not using
copy relocations. This is helpful when you want to check the abi
of a shared object both when instrumented and not because asan
will increase the size of objects to include the redzone.
Differential revision: https://reviews.llvm.org/D124792
st_size may not be of importance to the abi if you are not using
copy relocations. This is helpful when you want to check the abi
of a shared object both when instrumented and not because asan
will increase the size of objects to include the redzone.
Differential revision: https://reviews.llvm.org/D124792
While it originally did, this option no longer affects the cc1
interface. For the cc1 interface, -no-opaque-pointers has to be
passed, there is no cmake option.
There are many more instances of this pattern, but I chose to limit this change to .rst files (docs), anything in libcxx/include, and string literals. These have the highest chance of being seen by end users.
Reviewed By: #libc, Mordante, martong, ldionne
Differential Revision: https://reviews.llvm.org/D124708
After a lot of discussion in this diff the consensus was that it is really hard to guess the users intention with their LLVM build. Instead of trying to guess if Debug or Release is the correct default option we opted for just not specifying CMAKE_BUILD_TYPE a error.
Discussion on discourse here:
https://discourse.llvm.org/t/rfc-select-a-better-linker-by-default-or-warn-about-using-bfd
Reviewed By: hans, mehdi_amini, aaron.ballman, jhenderson, MaskRay, awarzynski
Differential Revision: https://reviews.llvm.org/D124153
X86 codegen uses function attribute `min-legal-vector-width` to select the proper ABI. The intention of the attribute is to reflect user's requirement when they passing or returning vector arguments. So Clang front-end will iterate the vector arguments and set `min-legal-vector-width` to the width of the maximum for both caller and callee.
It is assumed any middle end optimizations won't care of the attribute expect inlining and argument promotion.
- For inlining, we will propagate the attribute of inlined functions because the inlining functions become the newer caller.
- For argument promotion, we check the `min-legal-vector-width` of the caller and callee and refuse to promote when they don't match.
The problem comes from the optimizations' combination, as shown by https://godbolt.org/z/zo3hba8xW. The caller `foo` has two callees `bar` and `baz`. When doing argument promotion, both `foo` and `bar` has the same `min-legal-vector-width`. So the argument was promoted to vector. Then the inlining inlines `baz` to `foo` and updates `min-legal-vector-width`, which results in ABI mismatch between `foo` and `bar`.
This patch fixes the problem by expanding the concept of `min-legal-vector-width` to indicator of functions arguments. That says, any passes touch functions arguments have to set `min-legal-vector-width` to the value reflects the width of vector arguments. It makes sense to me because any arguments modifications are ABI related and should response for the ABI compatibility.
Differential Revision: https://reviews.llvm.org/D123284
This is the first patch of a series to upstream support for the new
subtarget.
Contributors:
Jay Foad <jay.foad@amd.com>
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>
Patch 1/N for upstreaming AMDGPU gfx11 architectures.
Reviewed By: foad, kzhuravl, #amdgpu
Differential Revision: https://reviews.llvm.org/D124536
For the AIX linker, under default options, global or weak symbols which
have no visibility bits set to zero (i.e. no visibility, similar to ELF
default) are only exported if specified on an export list provided to
the linker. So AIX has an additional visibility style called
"exported" which indicates to the linker that the symbol should
be explicitly globally exported.
This change maps "dllexport" in the LLVM IR to correspond to XCOFF
exported as we feel this best models the intended semantic (discussion
on the discourse RFC thread: https://discourse.llvm.org/t/rfc-adding-exported-visibility-style-to-the-ir-to-model-xcoff-exported-visibility/61853)
and allows us to enable writing this visibility for the AIX target
in the assembly path.
Reviewed By: DiggerLin
Differential Revision: https://reviews.llvm.org/D123951
This syntax allows to modify RUN lines based on features
available. For example:
RUN: ... | FileCheck %s --check-prefix=%if windows %{CHECK-W%} %else %{CHECK-NON-W%}
CHECK-W: ...
CHECK-NON-W: ...
The whole command can be put under %if ... %else:
RUN: %if tool_available %{ %tool %} %else %{ true %}
or:
RUN: %if tool_available %{ %tool %}
If tool_available feature is missing, we'll have an empty command in
this RUN line. LIT used to emit an error for empty commands, but now
it treats such commands as nop in all cases.
Multi-line expressions are also supported:
RUN: %if tool_available %{ \
RUN: %tool \
RUN: %} %else %{ \
RUN: true \
RUN: %}
Background and motivation:
D121727 [NVPTX] Integrate ptxas to LIT tests
https://reviews.llvm.org/D121727
Differential Revision: https://reviews.llvm.org/D122569
This continues the push away from hard-coded knowledge about functions
towards attributes. We'll use this to annotate free(), realloc() and
cousins and obviate the hard-coded list of free functions.
Differential Revision: https://reviews.llvm.org/D123083
This change introduces a new intrinsic, `llvm.is.fpclass`, which checks
if the provided floating-point number belongs to any of the the specified
value classes. The intrinsic implements the checks made by C standard
library functions `isnan`, `isinf`, `isfinite`, `isnormal`, `issubnormal`,
`issignaling` and corresponding IEEE-754 operations.
The primary motivation for this intrinsic is the support of strict FP
mode. In this mode using compare instructions or other FP operations is
not possible, because if the value is a signaling NaN, floating-point
exception `Invalid` is raised, but the aforementioned functions must
never raise exceptions.
Currently there are two solutions for this problem, both are
implemented partially. One of them is using integer operations to
implement the check. It was implemented in https://reviews.llvm.org/D95948
for `isnan`. It solves the problem of exceptions, but offers one
solution for all targets, although some can do the check in more
efficient way.
The other, implemented in https://reviews.llvm.org/D96568, introduced a
hook 'clang::TargetCodeGenInfo::testFPKind', which injects a target
specific code into IR to implement `isnan` and some other functions. It is
convenient for targets that have dedicated instruction to determine FP data
class. However using target-specific intrinsic complicates analysis and can
prevent some optimizations.
A special intrinsic for value class checks allows representing data class
tests with enough flexibility. During IR transformations it represents the
check in target-independent way and saves it from undesired transformations.
In the instruction selector it allows efficient lowering depending on the
used target and mode.
This implementation is an extended variant of `llvm.isnan` introduced
in https://reviews.llvm.org/D104854. It is limited to minimal intrinsic
support. Target-specific treatment will be implemented in separate
patches.
Differential Revision: https://reviews.llvm.org/D112025
This patch provides the response and reporting guides for Code of Conduct reports. It also removes the word draft from all the documents,
adds information about the CoC committee, and a place to put transparency reports. A post will also be made on Discourse to provide more details about the history of the code of conduct, this patch, and next steps. Please see that post for full details. I will put a link below once I have it.
Reviewed By: aaron.ballman, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D122937
This patch unifies the wording used for readnone, readonly and writeonly
attributes. The definitions now more specifically refer to memory visible
outside the function
The motivation for the clarification is D123473.
Reviewed By: nlopes
Differential Revision: https://reviews.llvm.org/D124124
This emits an `st_size` that represents the actual useable size of an object before the redzone is added.
Reviewed By: vitalybuka, MaskRay, hctim
Differential Revision: https://reviews.llvm.org/D123010
In some case, GitHub will not send notification for commit access invitation. Add invitation link for people don't get notification from GitHub.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D124191
Debugify in OriginalDebugInfo mode, does (DebugInfo) collect-before-pass & check-after-pass
for each instruction, which is pretty expensive. When used to analyze DebugInfo losses
in large projects (like LLVM), this raises the build time unacceptably.
This patch introduces a limit for the number of processed functions per compile unit.
By default, the limit is set to UINT_MAX (practically unlimited), and by using the introduced
option -debugify-func-limit the limit could be set to any positive integer number.
Differential revision: https://reviews.llvm.org/D115714
This patch contains enough for lib/Target/SPIRV to compile: a basic
SPIRVTargetMachine and SPIRVTargetInfo.
Differential Revision: https://reviews.llvm.org/D115009
Authors: Aleksandr Bezzubikov, Lewis Crawford, Ilia Diachkov,
Michal Paszkowski, Andrey Tretyakov, Konrad Trifunovic
Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Ilia Diachkov <iliya.diyachkov@intel.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
specifying DW_AT_trampoline as a string. Also update the signature
of DIBuilder::createFunction to reflect this addition.
Differential Revision: https://reviews.llvm.org/D123697
We've found that there are cases where it's useful to be able to include
the same target in multiple distributions (e.g. if you want a
distribution that's a superset of another distribution, for convenience
purposes), and that there are cases where the distribution of a target
and its umbrella can legitimately differ (e.g. the LTO library would
commonly be distributed alongside your tools, but it also falls under
the llvm-libraries umbrella, which would commonly be distributed
separately). Relax the restrictions while providing an option to restore
them (which is mostly useful to ensure you aren't accidentally placing
targets in the wrong distributions).
There could be further refinements here (e.g. excluding a target from an
umbrella if it's explicitly included in some other distribution, or
having variables to control which targets are allowed to be duplicated
or placed in a separate distribution than their umbrellas), but we can
punt on those until there's an actual need.
This patch adds support for inline assembly address operands using the "p"
constraint on X86 and SystemZ.
This was in fact broken on X86 (see example at
https://reviews.llvm.org/D110267, Nov 23).
These operands should probably be treated the same as memory operands by
CodeGenPrepare, which have been commented with "TODO" there.
Review: Xiang Zhang and Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D122220
Summary:
Introduce a new function attribute, amdgpu-no-multigrid-sync-arg, which is default.
We use implicitarg_ptr + offset to check whether the multigrid synchronization
pointer is used. If yes, we remove this attribute and also remove
amdgpu-no-implicitarg-ptr. We generate metadata for the hidden_multigrid_sync_arg
only when the amdgpu-no-multigrid-sync-arg attribute is removed from the function.
Reviewers: arsenm, sameerds, b-sumner and foad
Differential Revision: https://reviews.llvm.org/D123548
The description has been updated to reflect AMDGPU MC changes:
- enabled literals for src0 of v_fmaak_f*, v_fmamk_f*, v_madak_f32, v_madmk_f32;
- enabled global_atomic_fcmpswap and global_atomic_fcmpswap_x2;
- enabled dlc with flat_atomic* and global_atomic_*.
Bug fixing and improvements:
- enabled s_wait_idle;
- enabled s_waitcnt_depctr;
- added description of s_waitcnt_depctr syntactic sugar;
- disabled SYSMSG_OP_HOST_TRAP_ACK (it is not supported on GFX10);
- corrected description of lgkmcnt (accept values from 0 to 63).
Or rather, error out if it is set to something other than ON. This
removes the ability to enable the legacy pass manager by default,
but does not remove the ability to explicitly enable it through
various flags like -flegacy-pass-manager or -enable-new-pm=0.
I checked, and our test suite definitely doesn't pass with
LLVM_ENABLE_NEW_PASS_MANAGER=OFF anymore.
Differential Revision: https://reviews.llvm.org/D123126
The diagnostic is unreliable, and triggers even for dead uses of
hostcall that may exist when linking the device-libs at lower
optimization levels.
Eliminate the diagnostic, and directly document the limitation for
OpenCL before code object V5.
Make some NFC changes to clarify the related code in the
MetadataStreamer.
Add a clang test to tie OCL sources containing printf to the backend IR
tests for this situation.
Reviewed By: sameerds, arsenm, yaxunl
Differential Revision: https://reviews.llvm.org/D121951
Point people to the cc1 instead of the mllvm flag, as the mllvm
flag will stop working for clang usage at some point.
Update transition state to mention that support in Clang/LLVM is
complete, and only the default switch is pending.
I noticed that when --update-section was added to llvm-objcopy it was
not added to the command guide, see
25bcd94234. This change adds it to the
docs and updates the help text.
Differential Revision: https://reviews.llvm.org/D122907
This option tells the host clang to use the new pass manager.
Given that it's been the default for a while, this seems unnecessary.
This was added in D57068.
(this does not affect any LLVM/Clang functionality)
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D122947
https://reviews.llvm.org/D116787
This reverts commit 33b3c86afa.
New change: fixed build failures:
- in stabs-sorted:restore the the ERR-KEY statements, which were accidentally deleted during refactoring
- in ObjDumper.h/MachODumper.cpp: refactor so that current dumpers which didn't provide an impl that accept a SymCom still works
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
Add documentation describing how to
- Use `llvm-remark-size-diff`
- Interpret the output from the tool
Differential Revision: https://reviews.llvm.org/D122744
This patch mostly follows up on D121292 which introduced the vp.fcmp
intrinsic.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122729
This patch adds the first support for vector-predicated comparison
intrinsics, starting with vp.fcmp. It uses metadata to encode its
condition code, like the llvm.experimental.constrained.fcmp intrinsic.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D121292