The result must be less than or equal to the LHS side, so any
leading zeros in the left hand side must also exist in the result.
This is stronger than the previous behavior where we only considered
the sign bit being 0.
The affected test case used the sign bit being known 0 to change
a sign extend to a zero extend pre type legalization. After type
legalization the types were promoted to i64, but we no longer
knew bit 31 was zero. This shifts are are the equivalent of an
AND with 0xffffffff or zext_inreg X, i32. This patch allows us to
see that bit 31 is zero and remove the shifts.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D97124
This matches the platform default for GCC. It primarily matters when the
integrated assembler is not used as there is no default CPU defined for
ARMv7-A and GNU as is upset with -mcpu=generic.
real_path returns an `std::error_code` which evaluates to `true` in case an
error happens and `false` if not. This code was checking the inverse, so
case-insensitive file systems ended up being detected as case sensitive.
Tested using an LLDB reproducer test as we anyway need a real file system and
also some matching logic to detect whether the respective file system is
case-sensitive (which the test is doing via some Python checks that we can't
really emulate with the usual FileCheck logic).
Fixes rdar://67003004
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D96795
Adds an *unaudited* SHA-256 implementation to `llvm/Support`. The ongoing lld-macho effort needs this to emit an adhoc code signature for macho files on macOS Big Sur.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D96540
As of binutils 2.36, GNU strip calls chown(2) for "sudo strip foo" and
"sudo strip foo -o foo", but no "sudo strip foo -o bar" or "sudo strip
foo -o ./foo". In other words, while "sudo strip foo -o bar" creates a
new file bar with root access, "sudo strip foo" will keep the owner and
group of foo unchanged. Currently llvm-objcopy and llvm-strip behave
differently, always changing the owner and gropu to root. The
discrepancy prevents Chrome OS from migrating to llvm-objcopy and
llvm-strip as they change file ownership and cause intended users/groups
to lose access when invoked by sudo with the following sequence
(recommended in man page of GNU strip).
1.<Link the executable as normal.>
1.<Copy "foo" to "foo.full">
1.<Run "strip --strip-debug foo">
1.<Run "objcopy --add-gnu-debuglink=foo.full foo">
This patch makes llvm-objcopy and llvm-strip follow GNU's behavior.
Link: crbug.com/1108880
In addition to wall time etc. this should allow us to get less noisy
values for time measurements.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D96049
If context is enabled/disabled and queried concurrently then this
results in a data race/TSAN failure with RunSafely (where boolean
variable was not locked).
There doesn't seem to be a reasonable way to enable threads that enable
and disable recovery in parallel (without also keeping
gCrashRecoveryEnabled's lock held during Fn execution which seems
undesirable). This makes enable checking if enabled thread local and
consistent with other thread local usage of crash context here.
Differential Revision: https://reviews.llvm.org/D93907
glibc deprecates `mallinfo` in the latest version of 2.33. This patch replaces the usage of `mallinfo` with the new `mallinfo2` when it's available.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D96359
We implement getHostCPUName() for AIX via systemcfg interfaces since access to the processor version register is a privileged operation. We return a value based on the current processor implementation mode.
This fixes the cpu detection used by clang for `-mcpu=native`.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D95966
As noted in https://reviews.llvm.org/D93459, the formatting of
multi-line descriptions of clEnumValN and the likes is unfavorable.
Thus this patch adds support for correctly indenting these.
Reviewed By: serge-sans-paille
Differential Revision: https://reviews.llvm.org/D93494
Previously file entries in the -ivfsoverlay yaml could map to a file in the
external file system, but directories had to list their contents in the form of
other file entries or directories. Allowing directory entries to map to a
directory in the external file system makes it possible to present an external
directory's contents in a different location and (in combination with the
'fallthrough' option) overlay one directory's contents on top of another.
rdar://problem/72485443
Differential Revision: https://reviews.llvm.org/D94844
As a fixme notes, both of these directory iterator implementations are
conceptually similar and duplicate the functionality of returning and uniquing
entries across two or more directories. This patch combines them into a single
class 'CombiningDirIterImpl'.
This also drops the 'Redirecting' prefix from RedirectingDirEntry and
RedirectingFileEntry to save horizontal space. There's no loss of clarity as
they already have to be prefixed with 'RedirectingFileSystem::' whenever
they're referenced anyway.
rdar://problem/72485443
Differential Revision: https://reviews.llvm.org/D94857
This change fixes two issues with building LLVM on Haiku. The first issue is
that LLVM requires wait4(), which on Haiku is hidden behind the _BSD_SOURCE
feature flag when using the --std=c++14 flag. Additionally, the wait4()
function is only available in libbsd.so, so this is now a dependency.
The other fix is that Haiku does not have the (non-standard) rusage.maxrss
member, so by default the used memory info will be set to 0 on this platform.
Reviewed By: sepavloff
Differential Revision: https://reviews.llvm.org/D87920
Patch by Niels Sascha Reedijk.
Add a new `raw_pwrite_ostream` variant, `buffer_unique_ostream`, which
is like `buffer_ostream` but with unique ownership of the stream it's
wrapping. Use this in CompilerInstance to simplify the ownership of
non-seeking output streams, avoiding logic sprawled around to deal with
them specially.
This also simplifies future work to encapsulate output files in a
different class.
Differential Revision: https://reviews.llvm.org/D93260
Refactor the duplicated canonicalize-path logic in `FileCollector` and
`ModulesDependencyCollector` into a new utility called
`PathCanonicalizer` that's shared. This popped up when tracking down a
bug common to both in https://reviews.llvm.org/D95202.
As drive-bys, update a few names and comments to better reflect the
effect of the code, delay removal of `..`s to avoid an unnecessary extra
string copy, and leave behind a couple of FIXMEs for future
consideration.
Differential Revision: https://reviews.llvm.org/D95279
Don't emit an output dash for an empty sequence. Take emitting a vector
of strings for example:
std::vector<std::string> Strings = {"foo", "bar"};
LLVM_YAML_IS_SEQUENCE_VECTOR(std::string)
yout << Strings;
This emits the following YAML document.
---
- foo
- bar
...
When the vector is empty, this generates the following result:
---
- []
...
Although this is valid YAML, it does not match what we meant to emit.
The result is a one-element sequence consisting of an empty list.
Indeed, if we were to try to read this again we get an error:
YAML:2:4: error: not a mapping
- []
The problem is the output dash before the empty list. The correct output
would be:
---
[]
...
This patch fixes that by not emitting the output dash for an empty
sequence.
Differential revision: https://reviews.llvm.org/D95280
Rather than reimplement, use a `using` declaration to bring in
`SmallVectorImpl<char>`'s assign and append implementations in
`SmallString`.
The `SmallString` versions were missing reference invalidation
assertions from `SmallVector`. This patch also fixes a bug in
`llvm::FileCollector::addFileImpl`, which was a copy/paste from
`clang::ModuleDependencyCollector::copyToRoot`, both caught by the
no-longer-skipped assertions.
As a drive-by, this also sinks the `const SmallVectorImpl&` versions of
these methods down into `SmallVectorImpl`, since I imagine they'd be
useful elsewhere.
Differential Revision: https://reviews.llvm.org/D95202
This patch addresses inconsistencies in the way fallthrough is handled
in the RedirectingFileSystem. Rather than trying to change the working
directory of the external filesystem, the RedirectingFileSystem will
canonicalize every path before handing it down. This guarantees that
relative paths are resolved relative to the RedirectingFileSystem's
working directory.
This allows us to have a strictly virtual working directory, and still
fallthrough for absolute paths, but not for relative paths that would
get resolved incorrectly at the lower layer (for example, in case of the
RealFileSystem, because the strictly virtual path does not exist).
Differential revision: https://reviews.llvm.org/D95188
This fixes the final (I think?) reference invalidation in `SmallVector`
that we need to fix to align with `std::vector`. (There is still some
left in the range insert / append / assign, but the standard calls that
UB for `std::vector` so I think we don't care?)
For POD-like types, reimplement `emplace_back()` in terms of
`push_back()`, taking a copy even for large `T` rather than lose the
realloc optimization in `grow_pod()`.
For other types, split the grow operation in three and construct the new
element in the middle.
- `mallocForGrow()` calculates the new capacity and returns the result
of `safe_malloc()`. We only need a single definition per
`SmallVectorBase` so this is defined in SmallVector.cpp to avoid code
size bloat. Moving this part of non-POD grow to the source file also
allows the logic to be easily shared with `grow_pod`, and
`report_size_overflow()` and `report_at_maximum_capacity()` can move
there too.
- `moveElementsForGrow()` moves elements from the old to the new
allocation.
- `takeAllocationForGrow()` frees the old allocation and saves the
new allocation and capacity .
`SmallVector:assign(size_type, const T&)` also uses the split-grow
operations for non-POD, but it also has a semantic change when not
growing. Previously, assign would start with `clear()`, and so the old
elements were destructed and all elements of the new vector were
copy-constructed (potentially invalidating references). The new
implementation skips destruction and uses copy-assignment for the prefix
of the new vector that fits. The new semantics match what libc++ does
for `std::vector::assign()`.
Note that the following is another possible implementation:
```
void assign(size_type NumElts, ValueParamT Elt) {
std::fill_n(this->begin(), std::min(NumElts, this->size()), Elt);
this->resize(NumElts, Elt);
}
```
The downside of this simpler implementation is that if the vector has to
grow there will be `size()` redundant copy operations.
(I had planned on splitting this patch up into three for committing
(after getting performance numbers / initial review), but I've realized
that if this does for some reason need to be reverted we'll probably
want to revert the whole package...)
Differential Revision: https://reviews.llvm.org/D94739
All of these families were claiming to be a73 based, which was causing
-mcpu/mtune=native to never use the newer features available to these
cores.
Goes through each and bumps the individual cores to their respective Big
counterparts. Since this code path doesn't support big.little detection,
there was already a precedent set with the Qualcomm line to choose the
big cores only.
Adds a comment on each line for the product's name that the part number
refers to. Confirmed on-device and through Linux header naming
convections.
Additionally newer SoCs mix CPU implementer parts from multiple
implementers. Both 0x41 (ARM) and 0x51 (Qualcomm) in the Snapdragon case
This was causing a desync in information where the scan at the start to
find the implementer would mismatch the part scan later on.
Now scan for both implementer and part at the start so these stay in
sync.
Differential Revision: https://reviews.llvm.org/D94954
Add the aarch64[_be]-*-gnu_ilp32 targets to support the GNU ILP32 ABI for AArch64.
The needed codegen changes were mostly already implemented in D61259, which added support for the watchOS ILP32 ABI. The main changes are:
- Wiring up the new target to enable ILP32 codegen and MC.
- ILP32 va_list support.
- ILP32 TLSDESC relocation support.
There was existing MC support for ELF ILP32 relocations from D25159 which could be enabled by passing "-target-abi ilp32" to llvm-mc. This was changed to check for "gnu_ilp32" in the target triple instead. This shouldn't cause any issues since the existing support was slightly broken: it was generating ELF64 objects instead of the ELF32 object files expected by the GNU ILP32 toolchain.
This target has been tested by running the full rustc testsuite on a big-endian ILP32 system based on the GCC ILP32 toolchain.
Reviewed By: kristof.beyls
Differential Revision: https://reviews.llvm.org/D94143
Use a mutex to protect concurrent access to the signpost map. This fixes
nondeterministic crashes in LLDB that appeared after using signposts in
the timer implementation.
Differential revision: https://reviews.llvm.org/D94285
Such files (Thin-%%%%%%.tmp.o) are supposed to be deleted immediately
after they're used (either by renaming or deletion). However, we've seen
instances on Windows where this doesn't happen, probably due to the
filesystem being flaky. This is effectively a resource leak which has
prevented us from using the ThinLTO cache on Windows.
Since those temporary files are in the thinlto cache directory which we
prune periodically anyway, allowing them to be pruned too seems like a
tidy way to solve the problem.
Differential revision: https://reviews.llvm.org/D94962
The number of hardware threads available to a ThreadPool can be limited if setting an affinity mask.
For example:
> start /B /AFFINITY 0xF lld-link.exe ...
Would let LLD only use 4 hyper-threads.
Previously, there was an outstanding issue on Windows Server 2019 on dual-CPU machines, which was preventing from using both CPU sockets. In normal conditions, when no affinity mask was set, ProcessorGroup::AllThreads was different from ProcessorGroup::UsableThreads. The previous code in llvm/lib/Support/Windows/Threading.inc L201 was improperly assuming those two values to be equal, and consequently was limiting the execution to only one CPU socket.
Differential Revision: https://reviews.llvm.org/D92419
The number of hardware threads available to a ThreadPool can be limited if setting an affinity mask.
For example:
> start /B /AFFINITY 0xF lld-link.exe ...
Would let LLD only use 4 hyper-threads.
Previously, there was an outstanding issue on Windows Server 2019 on dual-CPU machines, which was preventing from using both CPU sockets. In normal conditions, when no affinity mask was set, ProcessorGroup::AllThreads was different from ProcessorGroup::UsableThreads. The previous code in llvm/lib/Support/Windows/Threading.inc L201 was improperly assuming those two values to be equal, and consequently was limiting the execution to only one CPU socket.
Differential Revision: https://reviews.llvm.org/D92419
The pipe signal handler must be installed before any other handlers are
registered. This is because the Unix RegisterHandlers function does not
perform a sigaction() for SIGPIPE unless a one-shot handler is present,
to allow long-lived processes (like lldb) to fully opt-out of llvm's
SIGPIPE handling and ignore the signal safely.
Fixes a bug introduced in D70277.
Tested by running Nick's test case:
% xcrun ./bin/clang -E -fno-integrated-cc1 x.c | tee foo.txt | head
I verified that child cc1 process exits with IO_ERR, and that the parent
recognizes the error code, exiting cleanly.
Differential Revision: https://reviews.llvm.org/D94324
Make llvm::Signpost more generic by untying from llvm::Timer. This
allows signposts to be used in a different context.
My motivation for doing this is being able to use signposts in LLDB.
Differential revision: https://reviews.llvm.org/D93655
Check if all possible values for a pair of knownbits give the same icmp result - these are based off the checks performed in InstCombineCompares.cpp and D86578.
Add exhaustive unit test coverage - a followup will update InstCombineCompares.cpp to use this.
Add a triple for powerpcle-*-*.
This is a little-endian encoding of the 32-bit PowerPC ABI, useful in certain niche situations:
1) A loader such as the FreeBSD loader which will be loading a little endian kernel. This is required for PowerPC64LE to load properly in pseries VMs.
Such a loader is implemented as a freestanding ELF32 LSB binary.
2) Userspace emulation of a 32-bit LE architecture such as x86 on 64-bit hosts such as PowerPC64LE with tools like box86 requires having a 32-bit LE toolchain and library set, as they operate by translating only the main binary and switching to native code when making library calls.
3) The Void Linux for PowerPC project is experimenting with running an entire powerpcle userland.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93918
This introduces asm support for the Branch Record Buffer extension, through
the new 'brbe' subtarget feature. It consists of a new set of system registers
that enable the handling of branch records.
Patch written by Simon Tatham.
Reviewed By: ostannard
Differential Revision: https://reviews.llvm.org/D92389
This extends the command-line support for the 'armv8.7-a' architecture
name to the ARM target.
Based on a patch written by Momchil Velikov.
Reviewed By: ostannard
Differential Revision: https://reviews.llvm.org/D93231
This introduces command-line support for the 'armv8.7-a' architecture name
(and an alias without the '-', as usual), and for the 'ls64' extension name.
Based on patches written by Simon Tatham.
Reviewed By: ostannard
Differential Revision: https://reviews.llvm.org/D91776
Currently unknown keys when inputting mapping traits have the location set to the Value.
Example:
```
YAML:1:14: error: unknown key 'UnknownKey'
{UnknownKey: SomeValue}
^~~~~~~~~
```
This is unhelpful for a user as it draws them to fix the wrong item.
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D93037
This is the first in a series of patches that attempts to migrate
existing cost instructions to return a new InstructionCost class
in place of a simple integer. This new class is intended to be
as light-weight and simple as possible, with a full range of
arithmetic and comparison operators that largely mirror the same
sets of operations on basic types, such as integers. The main
advantage to using an InstructionCost is that it can encode a
particular cost state in addition to a value. The initial
implementation only has two states - Normal and Invalid - but these
could be expanded over time if necessary. An invalid state can
be used to represent an unknown cost or an instruction that is
prohibitively expensive.
This patch adds the new class and changes the getInstructionCost
interface to return the new class. Other cost functions, such as
getUserCost, etc., will be migrated in future patches as I believe
this to be less disruptive. One benefit of this new class is that
it provides a way to unify many of the magic costs in the codebase
where the cost is set to a deliberately high number to prevent
optimisations taking place, e.g. vectorization. It also provides
a route to represent the extremely high, and unknown, cost of
scalarization of scalable vectors, which is not currently supported.
Differential Revision: https://reviews.llvm.org/D91174
Add an overload of `RedirectingFileSystem::create` that builds a
redirecting filesystem off of a simple vector of string pairs. This is
intended to be used to support `clang::arcmt::FileRemapper` and
`clang::PreprocessorOptions::RemappedFiles`.
Differential Revision: https://reviews.llvm.org/D91317
Uniformly return uniquely-owned filesystems from VFS creation APIs. The
one exception is `getRealFileSystem`, which has a single instance and
needs to be shared.
This is almost NFC, except that it fixes a memory leak in
`vfs::collectVFSFromYAML()`.
Depends on https://reviews.llvm.org/D92888
Differential Revision: https://reviews.llvm.org/D92890
Update all the users of `AlignedCharArrayUnion` to stop peeking inside
(to look at `buffer`) so that a follow-up patch can replace it with an
alias to `std::aligned_union_t`.
This was reviewed as part of https://reviews.llvm.org/D92512, but I'm
splitting this bit out to commit first to reduce churn in case the
change to `AlignedCharArrayUnion` needs to be reverted for some
unexpected reason.
Use a fast path for column width computation for ascii characters. Especially
relevant for llvm-objdump.
before:
% time ./bin/llvm-objdump -D -j .text /lib/libc.so.6 >/dev/null
./bin/llvm-objdump -D -j .text /lib/libc.so.6 > /dev/null 0.75s user 0.01s system 99% cpu 0.757 total
after:
% time ./bin/llvm-objdump -D -j .text /lib/libc.so.6 >/dev/null
./bin/llvm-objdump -D -j .text /lib/libc.so.6 > /dev/null 0.37s user 0.01s system 99% cpu 0.378 total
Differential Revision: https://reviews.llvm.org/D92180
This also teaches MachO writers/readers about the MachO cpu subtype,
beyond the minimal subtype reader support present at the moment.
This also defines a preprocessor macro to allow users to distinguish
__arm64__ from __arm64e__.
arm64e defaults to an "apple-a12" CPU, which supports v8.3a, allowing
pointer-authentication codegen.
It also currently defaults to ios14 and macos11.
Differential Revision: https://reviews.llvm.org/D87095
This implementation of `ELFDumper<ELFT>::printAttributes()` in llvm-readobj has issues:
1) It crashes when the content of the attribute section is empty.
2) It uses `unwrapOrError` and `reportWarning` calls, though
ideally we want to use `reportUniqueWarning`.
3) It contains a TODO about redundant format version check.
`lib/Support/ELFAttributeParser.cpp` uses a hardcoded constant instead of the named constant.
This patch fixes all these issues.
Differential revision: https://reviews.llvm.org/D92318
The static_assert in "libcxx/include/memory" was the main offender here,
but then I figured I might as well `git grep -i instantat` and fix all
the instances I found. One was in user-facing HTML documentation;
the rest were in comments or tests.
This retrieves CPU affinity via FreeBSD's cpuset(2) API, and makes LLVM
respect affinity settings configured by the user via the cpuset(1)
command.
In particular, this allows to reduce the number of threads used on
machines with high core counts, which can interact badly with
parallelized build systems. This is particularly noticable with lld,
which spawns lots of threads even for linking e.g. hello_world!
This fix is related to PR48193, but does not adress the more fundamental
problem, which is that LLVM by default grabs as many CPUs and/or threads
as possible.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D92271
I took the "Permitted"/"Not Permitted" combo from the `Tag_ARM_ISA_use` case (GNU tools print "Yes").
Reviewed By: compnerd, MaskRay, simon_tatham
Differential Revision: https://reviews.llvm.org/D90305
Truncates the APInt if the bit width is greater than the width specified,
otherwise do nothing
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D91445
In some places the parser guards against dereferencing `End`, while in
others it relies on the presence of a trailing `'\0'` to elide checks.
Add the remaining guards needed to ensure the parser never attempts to
dereference `End`, making it safe to not require a null-terminated input
buffer.
Update the parser fuzzer harness so that it tests with buffers that are
guaranteed to be non-null-terminated, null-terminated, and 1-terminated,
additionally ensuring the result of the parse is the same in each case.
Some of the regression tests were written by inspection, and some are
cases caught by the fuzzer which required additional fixes in the
parser.
Differential Revision: https://reviews.llvm.org/D84050
When we produce an YAML output, we also print leading zeroes currently.
An output might look like this:
```
- Name: .dynsym
Type: SHT_DYNSYM
Address: 0x0000000000001000
EntSize: 0x0000000000000018
```
There are probably no reason to print leading zeroes.
It just makes harder to read values. This patch stops printing them.
The output becomes like:
```
- Name: .dynsym
Type: SHT_DYNSYM
Address: 0x1000
EntSize: 0x18
```
This affects obj2yaml mostly, but also dsymutil and llvm-xray tools output.
Differential revision: https://reviews.llvm.org/D90930
A common routine is to have the compiler crash, and attempt to rerun the
cc1 command-line by copying and pasting the arguments printed by
`llvm::Support::PrettyStackProgram::print`. However, these arguments are
not quoted or escaped which means they must be manually edited before
working correctly. This patch ensures that shell-unfriendly characters
are C-escaped, and arguments with spaces are double-quoted reducing the
frustration of running cc1 inside a debugger.
As the quoting is C, this is "best effort for most shells", but should
be fine for at least bash, zsh, csh, and cmd.exe.
Reviewed by: jhenderson
Differential Revision: https://reviews.llvm.org/D90759
The `Range` of an alias/anchor token includes the leading `&` or `*`,
but it is skipped while parsing the name. The check for an empty name
fails to account for the skipped leading character and so the error is
never hit.
Fix the off-by-one and add a couple regression tests.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D91462
ValueTracking was using a more powerful abs() implementation. Roll
it into KnownBits::abs(). Also add an exhaustive test for abs(),
in both the poisoning and non-poisoning variants.
By starting with the source shift value minimum leading/trailing bits, we can then add the minimum known shift amount to more accurately predict the minimum leading/trailing bits of the result.
This is currently only covered by the exhaustive unit tests in KnownBitsTests.cpp, but will help with some of the regressions encountered in D90479 (PR44526).
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848
This is a follow-up for D70378 (Cover usage of LLD as a library).
While debugging an intermittent failure on a bot, I recalled this scenario which
causes the issue:
1.When executing lld/test/ELF/invalid/symtab-sh-info.s L45, we reach
lld:🧝:Obj-File::ObjFile() which goes straight into its base ELFFileBase(),
then ELFFileBase::init().
2.At that point fatal() is thrown in lld/ELF/InputFiles.cpp L381, leaving a
half-initialized ObjFile instance.
3.We then end up in lld::exitLld() and since we are running with LLD_IN_TEST, we
hapily restore the control flow to CrashRecoveryContext::RunSafely() then back
in lld::safeLldMain().
4.Before this patch, we called errorHandler().reset() just after, and this
attempted to reset the associated SpecificAlloc<ObjFile<ELF64LE>>. That tried
to free the half-initialized ObjFile instance, and more precisely its
ObjFile::dwarf member.
Sometimes that worked, sometimes it failed and was catched by the
CrashRecoveryContext. This scenario was the reason we called
errorHandler().reset() through a CrashRecoveryContext.
But in some rare cases, the above repro somehow corrupted the heap, creating a
stack overflow. When the CrashRecoveryContext's filter (that is,
__except (ExceptionFilter(GetExceptionInformation()))) tried to handle the
exception, it crashed again since the stack was exhausted -- and that took the
whole application down. That is the issue seen on the bot. Locally it happens
about 1 times out of 15.
Now this situation can happen anywhere in LLD. Since catching stack overflows is
not a reliable scenario ATM when using CrashRecoveryContext, we're now
preventing further re-entrance when such failures occur, by signaling
lld::SafeReturn::canRunAgain=false. When running with LLD_IN_TEST=2 (or above),
only one iteration will be executed, instead of two.
Differential Revision: https://reviews.llvm.org/D88348
We have a frequent pattern where we're merging two KnownBits to get the common/shared bits, and I just fell for the gotcha where I tried to use the & operator to merge them........
When LLDB Python bindings are used and stack backtraces are enabled
for logging, getMainExecutable() is called with argv0 being null.
This caused the fallback function getprogpath() (used on FreeBSD, NetBSD
and Linux) to segfault. Make it handle null executable name gracefully.
Differential Revision: https://reviews.llvm.org/D91012
Add unit tests for this behavior, since the integration test for
clang-cl did not catch these bugs.
Fixes PR47604
Differential Revision: https://reviews.llvm.org/D90866
As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before.
We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.
Differential Revision: https://reviews.llvm.org/D90419
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before.
We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.
This patch mainly made the following changes:
1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.
Differential Revision: https://reviews.llvm.org/D89105
On Windows, after commit 881ba10465, tools
using TempFile would error with "bad file descriptor" when writing the
file on a network drive. It appears that setting the delete-on-close bit via
SetFileInformationByHandle/FileDispositionInfo prevented it from
accessing the file on network drives, and although using
FILE_DISPOSITION_INFO seems to work, it causes other troubles.
Differential Revision: https://reviews.llvm.org/D81803
These logically belong together since it's a base commit plus
followup fixes to less common build configurations.
The patches are:
Revert "CfgInterface: rename interface() to getInterface()"
This reverts commit a74fc48158.
Revert "Wrap CfgTraitsFor in namespace llvm to please GCC 5"
This reverts commit f2a06875b6.
Revert "Try to make GCC5 happy about the CfgTraits thing"
This reverts commit 03a5f7ce12.
Revert "Introduce CfgTraits abstraction"
This reverts commit c0cdd22c72.
It appears for Swift there was confusing errors when trying to parse APINotes, when libAPINotes and libInterfaceStub are linked, they both export symbol
`__ZN4llvm4yaml7yamlizeINS_12VersionTupleEEENSt3__19enable_ifIXsr16has_ScalarTraitsIT_EE5valueEvE4typeERNS0_2IOERS5_bRNS0_12EmptyContextE`, and discovered
same symbol defined within llvm-ifs.
This consolidates the boilerplate into YAMLTraits and defers the specific validation in reading the whole input.
fixes: rdar://problem/70450563
Reviewed By: phosek, dblaikie
Differential Revision: https://reviews.llvm.org/D89764
Avoid having to instantiate and compile a subset of the dominator tree logic
separately for each node type. More importantly, this allows generic
algorithms to be built on top of dominator trees without writing them as
templates -- such algorithms can now use opaque CfgBlockRef and
CfgInterface instead.
A type-erased implementation of dominator trees could be written in
terms of CfgInterface as well, but doing so would change the current
trade-off: it would slightly reduce code size at the cost of a slight
runtime overhead.
This patch does not change the trade-off, as it only does type-erasure
where basic blocks can be treated in a fully opaque way, i.e. it only
moves methods that don't require iteration over CFG successors and
predecessors.
v5:
- rename generic_{begin,end,children} back without the generic_ prefix
and refer explictly to base class methods in NewGVN, which wants to
mutate the order of dominator tree node children directly
v6:
- style change: iDom -> idom; it's arguable whether this is really
invalid, since it is actually standard camelCase, but clang-tidy
complains about it so... *shrug*
- rename {to,from}Generic -> {wrap,unwrap}Ref
Change-Id: Ib860dc04cf8bb093d8ed00be7def40d662213672
Differential Revision: https://reviews.llvm.org/D83089
The CfgTraits abstraction simplfies writing algorithms that are
generic over the type of CFG, and enables writing such algorithms
as regular non-template code that operates on opaque references
to CFG blocks and values.
Implementations of CfgTraits provide operations on the concrete
CFG types, e.g. `IrCfgTraits::BlockRef` is `BasicBlock *`.
CfgInterface is an abstract base class which provides operations
on opaque types CfgBlockRef and CfgValueRef. Those opaque types
encapsulate a `void *`, but the meaning depends on the concrete
CFG type. For example, MachineCfgTraits -- for use with MachineIR
in SSA form -- encodes a Register inside CfgValueRef. Converting
between concrete references and opaque/generic ones is done by
CfgTraits::{fromGeneric,toGeneric}. Convenience methods
CfgTraits::{un}wrap{Iterator,Range} are available as well.
Writing algorithms in terms of CfgInterface adds some overhead
(virtual method calls, plus in same cases it removes the
opportunity to inline iterators), but can be much more convenient
since generic algorithms can be written as non-templates.
This patch adds implementations of CfgTraits for all CFGs on
which dominator trees are calculated, so that the dominator
tree can be ported to this machinery. Only IrCfgTraits (LLVM IR)
and MachineCfgTraits (Machine IR in SSA form) are complete, the
other implementations are limited to the absolute minimum
required to make the upcoming dominator tree changes work.
v5:
- fix MachineCfgTraits::blockdef_iterator and allow it to iterate over
the instructions in a bundle
- use MachineBasicBlock::printName
v6:
- implement predecessors/successors for all CfgTraits implementations
- fix error in unwrapRange
- rename toGeneric/fromGeneric into wrapRef/unwrapRef to have naming
that is consistent with {wrap,unwrap}{Iterator,Range}
- use getVRegDef instead of getUniqueVRegDef
v7:
- std::forward fix in wrapping_iterator
- fix typos
v8:
- cleanup operators on CfgOpaqueType
- address other review comments
Change-Id: Ia75f4f268fded33fca11218a7d578c9aec1f3f4d
Differential Revision: https://reviews.llvm.org/D83088
For the reproducers in LLDB we want to switch to an "immediate mode"
FileCollector that writes every file encountered straight to disk so we
can generate the actual mapping out-of-process. This patch moves the
interface into a separate base class.
Differential revision: https://reviews.llvm.org/D89742
Measure amount of high-level or fixed-cost operations performed during
building/loading modules and during header search. High-level operations
like building a module or processing a .pcm file are motivated by
previous issues where clang was re-building modules or re-reading .pcm
files unnecessarily. Fixed-cost operations like `stat` calls are tracked
because clang cannot change how long each operation takes but it can
perform fewer of such operations to improve the compile time.
Also tracking such stats over time can help us detect compile-time
regressions. Added stats are more stable than the actual measured
compilation time, so expect the detected regressions to be less noisy.
On relanding drop stats in MemoryBuffer.cpp as their value is pretty low
but affects a lot of clients and many of those aren't interested in
modules and header search.
rdar://problem/55715134
Reviewed By: aprantl, bruno
Differential Revision: https://reviews.llvm.org/D86895
- The goal of this patch is improve option compatible with RISCV-V GCC,
-mcpu support on GCC side will sent patch in next few days.
- -mtune only affect the pipeline model and non-arch/extension related
target feature, e.g. instruction fusion; in td file it called
TuneFeatures, which is introduced by X86 back-end[1].
- -mtune accept all valid option for -mcpu and extra alias processor
option, e.g. `generic`, `rocket` and `sifive-7-series`, the purpose is
option compatible with RISCV-V GCC.
- Processor alias for -mtune will resolve according the current target arch,
rv32 or rv64, e.g. `rocket` will resolve to `rocket-rv32` or `rocket-rv64`.
- Interaction between -mcpu and -mtune:
* -mtune has higher priority than -mcpu for pipeline model and
TuneFeatures.
[1] https://reviews.llvm.org/D85165
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D89025
This relands commit 53b3873cf4. The failure
of `ConvertUTFTest.UTF16WrappersForConvertUTF16ToUTF8String` detected the
first time is fixed.
Differential Revision: https://reviews.llvm.org/D88824
Split out from https://reviews.llvm.org/D66782, use `Optional<MemoryBufferRef>`
in `line_iterator` so you don't need access to a `MemoryBuffer*`. Follow up
patches in `clang/` will leverage this.
Differential Revision: https://reviews.llvm.org/D89280
As preparation for changing `LineIterator` to work with `MemoryBufferRef`:
- Add an `operator==` that uses buffer pointer identity to ensure two buffers
are equivalent.
- Split out `MemoryBufferRef.h`, to avoid polluting `LineIterator.h` includers
with everything from `MemoryBuffer.h`. This also means moving the
`MemoryBuffer` constructor to a source file.
Differential Revision: https://reviews.llvm.org/D89279
I have introduced a new template PolySize class, where the template
parameter determines the type of quantity, i.e. for an element
count this is just an unsigned value. The ElementCount class is
now just a simple derivation of PolySize<unsigned>, whereas TypeSize
is more complicated because it still needs to contain the uint64_t
cast operator, since there are still many places in the code that
rely upon this implicit cast. As such the class also still needs
some of it's own operators.
I've tried to minimise the amount of code in the base PolySize
class, which led to a couple of changes:
1. In some places we were relying on '==' operator comparisons
between ElementCounts and the scalar value 1. I didn't put this
operator in the new PolySize class, and thought it was actually
clearer to use the isScalar() function instead.
2. I removed the isByteSized function and replaced it with calls
to isKnownMultipleOf(8).
I've also renamed NextPowerOf2 to be coefficientNextPowerOf2 so
that it's more consistent with coefficientDivideBy.
Differential Revision: https://reviews.llvm.org/D88409
At AMD, in an internal audit of our code, we found some corner cases
where we were not quite differentiating targets enough for some old
hardware. This commit is part of fixing that by adding three new
targets:
* The "Oland" and "Hainan" variants of gfx601 are now split out into
gfx602. LLPC (in the GPUOpen driver) and other front-ends could use
that to avoid using the shaderZExport workaround on gfx602.
* One variant of gfx703 is now split out into gfx705. LLPC and other
front-ends could use that to avoid using the
shaderSpiCsRegAllocFragmentation workaround on gfx705.
* The "TongaPro" variant of gfx802 is now split out into gfx805.
TongaPro has a faster 64-bit shift than its former friends in gfx802,
and a subtarget feature could be set up for that to take advantage of
it. This commit does not make that change; it just adds the target.
V2: Add clang changes. Put TargetParser list in order.
V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order,
so fix the GPUKind order.
Differential Revision: https://reviews.llvm.org/D88916
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
This patch refactors the logic in ValueTracking.cpp so that
computeKnownBitsForMul now uses a helper function from KnownBits.
NFC
Differential Revision: https://reviews.llvm.org/D88935
(Based on D87170 by dsanders)
I recently had need to call out to an external API to emit a JSON object as part
of one an LLVM tool was emitting. However, our JSON support didn't provide a way
to delegate part of the JSON output to that API.
Add rawValueBegin() and rawValueEnd() to maintain and check the internal state
while something else is writing to the stream. It's the users responsibility to
ensure that the resulting JSON output is still valid.
Differential Revision: https://reviews.llvm.org/D88902
`LLVM-Unit :: Support/./SupportTests/ConvertUTFTest.ConvertUTF16LittleEndianToUTF8String`
`FAIL`s on Solaris/sparcv9:
In `llvm/lib/Support/ConvertUTFWrapper.cpp` (`convertUTF16ToUTF8String`)
the `SrcBytes` arg is reinterpreted/accessed as `UTF16` (`unsigned short`,
which requires 2-byte alignment on strict-alignment targets like Sparc)
without anything guaranteeing the alignment, so the access yields a
`SIGBUS`.
This patch avoids this by enforcing the required alignment in the callers.
Tested on `sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D88824