This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.
The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.
High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.
Differential Revision: https://reviews.llvm.org/D135780
This was previously attempted in 2016 by colinl's D18770, but LLD tests
were missed, which caused the change to be reverted.
Setting --print-imm-hex by default brings llvm-objdump's behavior closer
in line with objdump, and it makes it easier to read addresses and
alignment from the disassembly. It may make non-address immediates
harder to interpret, but it still seems the better default, barring more
context-sensitive base selection logic.
Differential Revision: https://reviews.llvm.org/D136972
Although i32 type is illegal in the backend, RV64I has pretty good support for i32 types by using W instructions.
By adding n32 to the DataLayout string, middle end optimizations will consider i32 to be a native type. One known effect of this is enabling LoopStrengthReduce on loops with i32 induction variables. This can be beneficial because C/C++ code often has loops with i32 induction variables due to the use of `int` or `unsigned int`.
If this patch exposes performance issues, those are better addressed by tuning LSR or other passes.
Reviewed By: asb, frasercrmck
Differential Revision: https://reviews.llvm.org/D116735
This reverts commit 233659c7ae.
I see some sanitizer build bot failures. Not sure if it is change
causing it, but let's see if a revert returns the bots to green...
The current implementation outputs JSON in the following way:
[{'<filename>':{'FileSummary':{},...}}]
Using the filename as a key makes processing the JSON data awkward, and
should be avoided. This patch removes that outer key, since the
'FileSummary' data also includes a 'File' field, and so we lose no data.
Reviewed By: jhenderson, leonardchan
Differential Revision: https://reviews.llvm.org/D134843
LoopFlatten has been in the code base off by default for years, but this
enables it to run by default. Downstream this has been running for
years, so it has been exposed to quite some code. Then around the time
we switched to the NPM, several fixes went in related to updating the
MemorySSA state and we moved it to a loop pass manager, which both
helped preventing rerunning certain analysis passes, and thus helped a
bit with compile-times.
About compile-times, adding a pass isn't free, but this should see only
very minor increases. The pass is relatively simple and there shouldn't
be anything algorithmically expensive because all it does is looking at
inner/outer loops and it checks assumptions on loop increments and
indices. If we see increases, I expect this to mainly come from
invalidation of analysis info, and perhaps subsequent passes to trigger
and do more. Despite its simplicity/restrictions, it triggers in most
code-bases, which makes it worth to enable this by default.
Differential Revision: https://reviews.llvm.org/D109958
Another alternative to fix the thread identification problem in
coroutines.
We plan to fix this problem by unifying memory effecting attributes. See
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
But it may be a long-term project. And it is a pity that the coroutines
can't resume in different threads for years. So this one is temporary
fix. It may cause unnecessary performance regression for coroutines. But
correctness are more important. And this one is planned to be reverted
after we are able to unify the memory effecting attributes actually.
Reviewed By: jdoerfert, rjmccall
Differential Revision: https://reviews.llvm.org/D135550
This extension does not appear to be on its way to ratification.
Out of the unratified bitmanip extensions, this one had the
largest impact on the compiler.
Posting this patch to start a discussion about whether we should
remove these extensions. We'll talk more at the RISC-V sync meeting this
Thursday.
Reviewed By: asb, reames
Differential Revision: https://reviews.llvm.org/D133834
This works with the automatic export of all symbols; in MinGW mode,
when a DLL has no explicit dllexports, it exports all symbols (except
for some that are hardcoded to be excluded, including some toolchain
libraries).
By hooking up the hidden visibility to the -exclude-symbols: directive,
the automatic export of all symbols can be controlled in an easier
way (with a mechanism that doesn't require strict annotation of every
single symbol, but which allows gradually marking more unnecessary
symbols as hidden).
The primary use case is dylib builds of LLVM/Clang. These can be done
in MinGW mode but not in MSVC mode, as MinGW builds can export all
symbols (and the calling code can use APIs without corresponding
dllimport directives). However, as all symbols are exported, it can
easily overflow the max number of exported symbols in a DLL (65536).
In the llvm-mingw distribution, only the X86, ARM and AArch64 backends
are enabled; for the LLVM 13.0.0 release, libLLVM-13.dll ended up with
58112 exported symbols. For LLVM 14.0.0, it was 62015 symbols. Current
builds of the 15.x branch end up at around 64650 symbols - i.e. extremely
close to the limit.
The msys2 packages of LLVM have had to progressively disable more
of their backends in their builds, to be able to keep building with a
dylib.
This allows improving the current mingw dylib situation significantly,
by using the same hidden visibility options and attributes as on Unix.
With those in place, a current build of LLVM git main ends up at 35142
symbols instead of 64650.
For code using hidden visibility, this now requires linking with either
a current git lld or ld.bfd. (Older lld error out on the unknown
directives, older ld.bfd will successfully link, but will print huge
amounts of warnings.)
Differential Revision: https://reviews.llvm.org/D130121
Also make the soft toolchain requirements hard. This allows
us to use C++17 features in LLVM now.
If we find patterns with C++17 that improve readability
it should be recommended in the coding standards.
Reviewed By: jhenderson, cor3ntin, MaskRay
Differential Revision: https://reviews.llvm.org/D130689
This teaches ProcessElfCore to recognise the MTE tag segments.
https://www.kernel.org/doc/html/latest/arm64/memory-tagging-extension.html#core-dump-support
These segments contain all the tags for a matching memory segment
which will have the same size in virtual address terms. In real terms
it's 2 tags per byte so the data in the segment is much smaller.
Since MTE is the only tag type supported I have hardcoded some
things to those values. We could and should support more formats
as they appear but doing so now would leave code untested until that
happens.
A few things to note:
* /proc/pid/smaps is not in the core file, only the details you have
in "maps". Meaning we mark a region tagged only if it has a tag segment.
* A core file supports memory tagging if it has at least 1 memory
tag segment, there is no other flag we can check to tell if memory
tagging was enabled. (unlike a live process that can support memory
tagging even if there are currently no tagged memory regions)
Tests have been added at the commands level for a core file with
mte and without.
There is a lot of overlap between the "memory tag read" tests here and the unit tests for
MemoryTagManagerAArch64MTE::UnpackTagsFromCoreFileSegment, but I think it's
worth keeping to check ProcessElfCore doesn't cause an assert.
Depends on D129487
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D129489
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:
; Before:
%res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
to label %asm.fallthrough [label %foo]
; After:
%res = callbr i8* asm "", "=r,r,!i"(i8* %x)
to label %asm.fallthrough [label %foo]
The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:
* Allow unrolling/peeling/rotation of callbr, or any other
clone-based optimizations
(https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
(https://github.com/llvm/llvm-project/issues/45248)
This is just the IR representation change though, I will follow up
with patches to remove limtations in various transformation passes
that are no longer needed.
Differential Revision: https://reviews.llvm.org/D129288
* GNU objcopy supports --set-section-flags src=... --rename-section src=tst and --set-section-flags runs first.
* GNU objcopy processes --update-section before --rename-section.
To match the two behaviors, postpone --rename-section and allow its use together
with --set-section-flags.
As a side effect, --rename-section=.foo1=.foo2 --add-section=.foo1=/dev/null
leads to .foo2 while GNU objcopy surprisingly produces .foo1 (so
--set-section-flags --add-section --rename-section do not form a total order).
I think the deviation is fine as a total order makes more sense.
Rename set-section-flags-and-rename.test to
set-section-attr-and-rename.test and additionally test --set-section-alignment
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D129336
* Remove crc32 from zlib compression namespace, people should use the `llvm::crc32` instead.
Reviewed By: MaskRay, leonardchan
Differential Revision: https://reviews.llvm.org/D128754
* Refactor compression namespaces across the project, making way for a possible
introduction of alternatives to zlib compression.
Changes are as follows:
* Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`.
Reviewed By: MaskRay, leonardchan, phosek
Differential Revision: https://reviews.llvm.org/D128953
Not deleting the loose instruction with metadata associated to it causes
an assertion when the LLVMContext is destroyed. This was previously
hidden by the fact that llvm-c-test does not call LLVMShutdown. The
planned removal of ManagedStatic exposed this issue.
Differential Revision: https://reviews.llvm.org/D129114
This patch adds the support for `fmax` and `fmin` operations in `atomicrmw`
instruction. For now (at least in this patch), the instruction will be expanded
to CAS loop. There are already a couple of targets supporting the feature. I'll
create another patch(es) to enable them accordingly.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D127041
Add support for the RDPRU instruction on Zen2 processors.
User-facing features:
- Clang option -m[no-]rdpru to enable/disable the feature
- Support is implicit for znver2/znver3 processors
- Preprocessor symbol __RDPRU__ to indicate support
- Header rdpruintrin.h to define intrinsics
- "rdpru" mnemonic supported for assembler code
Internal features:
- Clang builtin __builtin_ia32_rdpru
- IR intrinsic @llvm.x86.rdpru
Differential Revision: https://reviews.llvm.org/D128934
This adds a release note entry for the new -mframe-chain option
introduced on D125094.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D129085