* Change `Symbol::flags` to a `std::atomic<uint16_t>`
* Add `llvm::parallel::threadIndex` as a thread-local non-negative integer
* Add `relocsVec` to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex
* Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output.
MIPS and PPC64 use global states for relocation scanning. Keep serial scanning.
Speed-up with mimalloc and --threads=8 on an Intel Skylake machine:
* clang (Release): 1.27x as fast
* clang (Debug): 1.06x as fast
* chrome (default): 1.05x as fast
* scylladb (default): 1.04x as fast
Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64):
* clang (Release): 1.31x as fast
* scylladb (default): 1.06x as fast
Reviewed By: andrewng
Differential Revision: https://reviews.llvm.org/D133003
`llvm` and downstream internal callers no longer use `array_lengthof`, so drop
the include everywhere.
Differential Revision: https://reviews.llvm.org/D133600
Clang has support of virtual file system for the purpose of testing, but
treatment of config files did not use it. This change enables VFS in it
as well.
Differential Revision: https://reviews.llvm.org/D132867
Clang has support of virtual file system for the purpose of testing, but
treatment of config files did not use it. This change enables VFS in it
as well.
Differential Revision: https://reviews.llvm.org/D132867
LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.
Change call sites to use `std::size` instead.
Differential Revision: https://reviews.llvm.org/D133429
as high-level API on top of `llvm::compression::{zlib,zstd}::*`:
* getReasonIfUnsupported: return nullptr if the specified format is
supported, or (if unsupported) a string like `LLVM was not built with LLVM_ENABLE_ZLIB ...`
* compress: dispatch to zlib::uncompress or zstd::uncompress
* decompress: dispatch to zlib::uncompress or zstd::uncompress
Move `llvm::DebugCompressionType` from MC to Support to avoid Support->MC cyclic
dependency. There are 40+ uses in llvm-project.
Add another enum class `llvm::compression::Format` to represent supported
compression formats, which may be a superset of ELF compression formats.
See D130458 (llvm-objcopy --{,de}compress-debug-sections for zstd) for a use
case.
Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")
---
Note: this patch alone will cause -Wswitch to llvm/lib/ObjCopy/ELF/ELFObject.cpp
Reviewed By: ckissane, dblaikie
Differential Revision: https://reviews.llvm.org/D130506
as high-level API on top of `llvm::compression::{zlib,zstd}::*`:
* getReasonIfUnsupported: return nullptr if the specified format is
supported, or (if unsupported) a string like `LLVM was not built with LLVM_ENABLE_ZLIB ...`
* compress: dispatch to zlib::uncompress or zstd::uncompress
* decompress: dispatch to zlib::uncompress or zstd::uncompress
Move `llvm::DebugCompressionType` from MC to Support to avoid Support->MC cyclic
dependency. There are 40+ uses in llvm-project.
Add another enum class `llvm::compression::Format` to represent supported
compression formats, which may be a superset of ELF compression formats.
See D130458 (llvm-objcopy --{,de}compress-debug-sections for zstd) for a use
case.
Link: https://discourse.llvm.org/t/rfc-zstandard-as-a-second-compression-method-to-llvm/63399
("[RFC] Zstandard as a second compression method to LLVM")
Differential Revision: https://reviews.llvm.org/D130506
This is a minimalist implementation which simply adds the extension (in the experimental namespace since its not ratified), and wires up the setting of the required ELF header flag. Future changes will include codegen changes to exploit the stronger memory model.
This is intended to implement v0.1 of the proposed specification which can be found in Chapter 25 of https://github.com/riscv/riscv-isa-manual/releases/download/draft-20220723-10eea63/riscv-spec.pdf.
Differential Revision: https://reviews.llvm.org/D133239
Add mapped_file_region::sync(), equivalent to POSIX msync,
synchronizing written content to disk without unmapping the region.
Asserts if the mode is not mapped_file_region::readwrite.
Note that I don't have access to a Windows machine, so I can't
easily run those unit tests.
Change by dexonsmith
Differential Revision: https://reviews.llvm.org/D95494
Part of patchset to add initial support for ARM64EC.
Per discussion on review, using the triple arm64ec-pc-windows-msvc. The
parsing works the same way as Apple's alternate Arm ABI "arm64e".
Differential Revision: https://reviews.llvm.org/D125412
add LLVM_PREFER_STATIC_ZSTD (default TRUE) cmake config flag
(compression test seems to fail for shared zstd on windows, note that zstd multithread is by default disabled in the static build so it may be a hidden variable)
propagate variable zstd_DIR in LLVMConfig.cmake.in
fix llvm-config CMakeLists.txt behavior for absolute libs windows
get zstd lib name
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D132870
Follow the pattern used in MLIR for the cl::opt instances.
v2:
- make DebugCounter::isCountingEnabled public so that the
DebugCounterOwner doesn't have to be a nested class. This simplifies
later changes
v3:
- remove the indirection via DebugCounterOwner::instance()
Differential Revision: https://reviews.llvm.org/D129116
This reverts commit 51d82502d9.
There is a regression in the flang-aarch64-dylib buildbot which is most
likely caused by this change. Reverting until I can investigate.
It was observed in D129117 that the subtle dependency between statistic
and timer code is not entirely robust: the global destructor
~StatisticInfo indirectly calls CreateInfoOutputFile, which requires
the LibSupportInfoOutputFilename to not have been destructed.
By constructing LibSupportInfoOutputFilename before the StatisticInfo
object, the order of destruction is guaranteed.
Differential Revision: https://reviews.llvm.org/D131059
Follow the pattern used in MLIR for the cl::opt instances.
v2:
- make DebugCounter::isCountingEnabled public so that the
DebugCounterOwner doesn't have to be a nested class. This simplifies
later changes
Differential Revision: https://reviews.llvm.org/D129116
We currently process one OutputSection at a time and for each OutputSection
write contained input sections in parallel. This strategy does not leverage
multi-threading well. Instead, parallelize writes of different OutputSections.
The default TaskSize for parallelFor often leads to inferior sharding. We
prepare the task in the caller instead.
* Move llvm::parallel::detail::TaskGroup to llvm::parallel::TaskGroup
* Add llvm::parallel::TaskGroup::execute.
* Change writeSections to declare TaskGroup and pass it to writeTo.
Speed-up with --threads=8:
* clang -DCMAKE_BUILD_TYPE=Release: 1.11x as fast
* clang -DCMAKE_BUILD_TYPE=Debug: 1.10x as fast
* chrome -DCMAKE_BUILD_TYPE=Release: 1.04x as fast
* scylladb build/release: 1.09x as fast
On M1, many benchmarks are a small fraction of a percentage faster. Mozilla showed the largest difference with the patch being about 1.03x as fast.
Differential Revision: https://reviews.llvm.org/D131247
Check ScopedPrinter pointer before attempting to print the attribute's
parsed information.
Patch by Michael Platings and Victor Campos
Reviewed By: pratlucas
Differential Revision: https://reviews.llvm.org/D132214
The previous implementation translated from names like sifive-7-series
to sifive-7-rv32 or sifive-7-rv64. This also required sifive-7-rv32
and sifive-7-rv64 to be valid CPU names. As those are not real
CPUs it doesn't make sense to accept them in -mcpu.
This patch does away with the translation and adds sifive-7-series
directly to RISCV.td. Removing sifive-7-rv32 and sifive-7-rv64.
sifive-7-series is only allowed in -mtune.
I've also added "rocket" to RISCV.td but have not removed rocket-rv32
or rocket-rv64.
To prevent -mcpu=sifive-7-series or -mcpu=rocket being used with llc,
I've added a Feature32Bit to all rv32 CPUs. And made it an error to
have an rv32 triple without Feature32Bit. sifive-7-series and rocket
do not have Feature32Bit or Feature64Bit set so the user would need
to provide -mattr=+32bit or -mattr=+64bit along with the -mcpu to
avoid the error.
SiFive no longer names their newer products with 3, 5, or 7 series.
Instead we have p200 series, x200 series, p500 series, and p600 series.
Following the previous behavior would require a sifive-p500-rv32 and
sifive-p500-rv64 in order to support -mtune=sifive-p500-series. There
is currently no p500 product, but it could start getting confusing if
there was in the future.
I'm open to hearing alternatives for how to achieve my main goal
of removing sifive-7-rv32/rv64 as a CPU name.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D131708
This patch adds support for part of Zc extension which will be frozen soon.
This extension is designed to continue reducing the binary size of RISC-V programs.
In this patch:
`Zca` is a subset of C extension instructions that are compatible with the Zc extension.
The spec of Zc ext is [[ https://github.com/riscv/riscv-code-size-reduction/releases | Here ]]
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D130141
There is an existing mechanism to escape strings, therefore the
functions created to escape Tag_also_compatible_with values are not
really needed. We can simply use the pre-existing utilities.
Reviewed By: pratlucas
Differential Revision: https://reviews.llvm.org/D131680
The ARM Attribute Parser used to parse the value of also_compatible_with
as it is, disregarding the way it is encoded.
This patch does a context aware parsing of the also_compatible_with
attribute. Additionally, some error handling is also done for incorrect
cases.
Reviewed By: pratlucas
Differential Revision: https://reviews.llvm.org/D130913
Make the sched_getaffinity based implementation available to all architectures
(except s390x/x86 which have a custom implementation). The `CPU_ALLOC(2048)`
code supports all `CONFIG_NR_CPUS` values in Linux kernel `arch/*/configs/`.
The function is mainly used by in-process ThinLTO to decide the default number
of threads. Returning -1 will use just one thread.
Android is excluded because of the higher API level requirement:
`sched_getaffinity; # introduced-arm=12 introduced-arm64=21 introduced-x86=12 introduced-x86_64=21`
We were dereferencing an empty Optional if IgnoreErrors was true and the
stat failed.
rdar://60887887
Differential Revision: https://reviews.llvm.org/D131791
The code was relicensed by its owner (Unicode.org) a long time back,
but we still had the old (problematic) license in our fork.
Note that the source files have not been distributed from unicode.org
since 2009 (due to being buggy and unmaintained upstream), but they
were given this license before that.
Fixes https://github.com/llvm/llvm-project/issues/32309
Differential Revision: https://reviews.llvm.org/D66390
As mentioned on D128934 - we weren't including the CPUID bit handling for the RDPRU instruction
AMD's APMv3 (24594) lists it as CPUID Fn8000_0008_EBX Bit#4
This is not used as general CPU alias. Only to support -mtune. Name it as such.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D131602
This allow optimization without LTO. Also remove some useless else-ifs.
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D131313