This diff adjusts binaryAnd to take advantage of the analysis
based on KnownBits.
Differential revision: https://reviews.llvm.org/D125603
Test plan:
1/ ninja check-llvm
2/ ninja check-llvm-unit
Add toKnownBits() method to mirror fromKnownBits(). We know the
top bits that are constant between min and max.
The return value for an empty range is chosen to be conservative.
This is based on https://reviews.llvm.org/D125168 which adds a
wrapper to allow use of opaque pointers from the C API.
I added an opaque pointer mode test to echo.ll, and to fix assertions
that forbid the use of mixed typed and opaque pointers that were
triggering in it I had to also add wrappers for setOpaquePointers()
and isOpaquePointer().
I also changed echo.ll to remove a bitcast i32* %x to i8*, because
passing it through llvm-as and llvm-dis was generating a
%0 = bitcast ptr %x to ptr, but when building that same bitcast in
echo.cpp it was getting elided by IRBuilderBase::CreateCast
(08ac661248/llvm/include/llvm/IR/IRBuilder.h (L1998-L1999)).
Differential Revision: https://reviews.llvm.org/D125183
This patch is refactoring the allocation, initialization and deletion
of MDNodes. It is intended as a preparatory patch for the upcoming
addition of dynamic resizability of MDNodes. It is fundamentally NFC,
but removes the necessity for suppressing the memory sanitizer for
MDNode's operator delete.
Reviewers: dexonsmith
Differential Revision: https://reviews.llvm.org/D125489
Even if the minimum number of elements is 1 and the length doesn't change,
we don't know what vscale is so we can't classify it as identity mask. Instead it
is a zero element splat.
For reverse, we shouldn't classify it as a reverse unless there are at least 2 elements
in the mask. This applies to both fixed and scalable vectors. For fixed vectors, a single
element would be an identity shuffle. For scalable vector it's a zero elt splat.
Reviewed By: sdesmalen, liaolucy
Differential Revision: https://reviews.llvm.org/D124655
Normally the index type will already be canonicalized here, but
this is not guaranteed depending on visitation order. The code
was already accounting for a potentially needed sext, but a trunc
may also be needed.
Add a ConstantExpr::getSExtOrTrunc() helper method to make this
simpler. This matches the corresponding IRBuilder method in behavior.
Fixes https://github.com/llvm/llvm-project/issues/55228.
X86 codegen uses function attribute `min-legal-vector-width` to select the proper ABI. The intention of the attribute is to reflect user's requirement when they passing or returning vector arguments. So Clang front-end will iterate the vector arguments and set `min-legal-vector-width` to the width of the maximum for both caller and callee.
It is assumed any middle end optimizations won't care of the attribute expect inlining and argument promotion.
- For inlining, we will propagate the attribute of inlined functions because the inlining functions become the newer caller.
- For argument promotion, we check the `min-legal-vector-width` of the caller and callee and refuse to promote when they don't match.
The problem comes from the optimizations' combination, as shown by https://godbolt.org/z/zo3hba8xW. The caller `foo` has two callees `bar` and `baz`. When doing argument promotion, both `foo` and `bar` has the same `min-legal-vector-width`. So the argument was promoted to vector. Then the inlining inlines `baz` to `foo` and updates `min-legal-vector-width`, which results in ABI mismatch between `foo` and `bar`.
This patch fixes the problem by expanding the concept of `min-legal-vector-width` to indicator of functions arguments. That says, any passes touch functions arguments have to set `min-legal-vector-width` to the value reflects the width of vector arguments. It makes sense to me because any arguments modifications are ABI related and should response for the ABI compatibility.
Differential Revision: https://reviews.llvm.org/D123284
This continues the push away from hard-coded knowledge about functions
towards attributes. We'll use this to annotate free(), realloc() and
cousins and obviate the hard-coded list of free functions.
Differential Revision: https://reviews.llvm.org/D123083
As implemented this patch assumes that Typed pointer support remains in
the llvm::PointerType class, however this could be modified to use a
different subclass of llvm::Type that could be disallowed from use in
other contexts.
This does not rely on inserting typed pointers into the Module, it just
uses the llvm::PointerType class to track and unique types.
Fixes#54918
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D122268
This emits an `st_size` that represents the actual useable size of an object before the redzone is added.
Reviewed By: vitalybuka, MaskRay, hctim
Differential Revision: https://reviews.llvm.org/D123010
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
Most of insertelement constant folding is blocked if the vector type
is scalable. I believe we can make an exception for inserting null
into an all zeros vector.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D123413
specifying DW_AT_trampoline as a string. Also update the signature
of DIBuilder::createFunction to reflect this addition.
Differential Revision: https://reviews.llvm.org/D123697
The original AutoUpgrade code from 1e68724d24
did not retain existing attributes. I noticed this in some downstream test
cases, but it turns out there are also two affected testcase upstream.
Differential Revision: https://reviews.llvm.org/D121971
LTO objects might compiled with different `mbranch-protection` flags which will cause an error in the linker.
Such a setup is allowed in the normal build with this change that is possible.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D123493
This got changed to use hasAttrSomewhere() during review, and I didn't
notice until today when I was writing some tests for another part of
this system that using hasAttrSomewhere only checked the callsite for
allocalign, rather than both the callsite and the definition. This fixes
that by introducing a helper method.
Differential Revision: https://reviews.llvm.org/D121641
This way it can be inlined to its caller. This method
shows up in the profile and it is essentially a fancy
getter. It would benefit from inlining into its callers.
NFC.
The LLVM IR verifier and analysis linter defines and uses several macros in
code that performs validation of IR expectations. Previously, these macros
were named with an 'Assert' prefix. These names were misleading since the
macro definitions are not conditioned on build kind; they are defined
identically in builds that have asserts enabled and those that do not. This
was confusing since an LLVM developer might expect these macros to be
conditionally enabled as 'assert' is. Further confusion was possible since
the LLVM IR verifier is implicitly disabled (in Clang::ConstructJob()) for
builds without asserts enabled, but only for Clang driver invocations; not
for clang -cc1 invocations. This could make it appear that the macros were
not active for builds without asserts enabled, e.g. when investigating
behavior using the Clang driver, and thus lead to surprises when running
tests that exercise the clang -cc1 interface.
This change renames this set of macros as follows:
Assert -> Check
AssertDI -> CheckDI
AssertTBAA -> CheckTBAA
This allows both explicitly enabling and explicitly disabling
opaque pointers, in anticipation of the default switching at some
point.
This also slightly changes the rules by allowing calls if either
the opaque pointer mode has not yet been set (explicitly or
implicitly) or if the value remains unchanged.
With opaque pointers, we can eliminate zero-index GEPs even if
they have multiple indices, as this no longer impacts the result
type of the GEP.
This optimization is already done for instructions in InstSimplify,
but we were missing the corresponding constant expression handling.
The constexpr transform is a bit more powerful, because it can
produce a vector splat constant and also handles undef values --
it is an extension of an existing single-index transform.
Prior to this change, CallBase::hasFnAttr checked the called function to
see if it had an attribute if it wasn't set on the CallBase, but
getFnAttr didn't do the same delegation, which led to very confusing
behavior. This patch fixes the issue by making CallBase::getFnAttr also
check the function under the same circumstances.
Test changes look (to me) like they're cleaning up redundant attributes
which no longer get specified both on the callee and call. We also clean
up the one ad-hoc implementation of this getter over in InlineCost.cpp.
Differential Revision: https://reviews.llvm.org/D122821
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
A new function 'getConstrainedIntrinsic' is added, which for any gived
instruction returns id of the corresponding constrained intrinsic. If
there is no constrained counterpart for the instruction or the instruction
is already a constrained intrinsic, the function returns zero.
This is recommit of 115b3ace36, reverted in
8160dd582b.
Differential Revision: https://reviews.llvm.org/D69562
This patch mostly follows up on D121292 which introduced the vp.fcmp
intrinsic.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122729
We only want to do the upgrade from named to anonymous struct
return if the intrinsic is declared to return a struct, but not
if it has an overloaded return type that just happens to be a
struct. In that case the struct type will be mangled into the
intrinsic name and there is no problem.
This should address the problem reported in
https://reviews.llvm.org/D122471#3416598.
This patch adds the first support for vector-predicated comparison
intrinsics, starting with vp.fcmp. It uses metadata to encode its
condition code, like the llvm.experimental.constrained.fcmp intrinsic.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D121292
This reverts commit 115b3ace36.
Starting from this commit the buildbot sanitizer-x86_64-linux-bootstrap-msan
starts failing (build 10071). Reverted for investigation.
This is an alternative to D122376. Rather than working around the
problem, this patch requires that struct return types in intrinsics
are anonymous/literal and adds auto-upgrade code to convert
existing uses of intrinsics with named struct types.
This ensures that the mapping between intrinsic name and
intrinsic function type is actually bijective, as it is supposed
to be.
This also fixes https://github.com/llvm/llvm-project/issues/37891.
Differential Revision: https://reviews.llvm.org/D122471
A new function 'getConstrainedIntrinsic' is added, which for any gived
instruction returns id of the corresponding constrained intrinsic. If
there is no constrained counterpart for the instruction or the instruction
is already a constrained intrinsic, the function returns zero.
Differential Revision: https://reviews.llvm.org/D69562
Inline assembly is scary but we need to support it for the OpenMP GPU
device runtime. The new assumption expresses the fact that it may not
have call semantics, that is, it will not call another function but
simply perform an operation or side-effect. This is important for
reachability in the presence of inline assembly.
Differential Revision: https://reviews.llvm.org/D109986
Before we gave up if a call through bitcast had parameter attributes.
Interestingly, we allowed attributes for the return value already. We
now handle both the same way, namely, we drop the ones that are
incompatible with the new type and keep the rest. This cannot cause
"more UB" than initially present.
Differential Revision: https://reviews.llvm.org/D119967
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
In DAGISel, the parameter alignment only have 4 bits to hold the value.
The encode(alignment) would plus the value by 1, so the max aligment that
ISel can support is 2^14. This patch verify align attribute for parameter.
Differential Revision: https://reviews.llvm.org/D122130
In DAGISel, the parameter alignment only have 4 bits to hold the value.
The encode(alignment) would plus the shift value by 1, so the max aligment
ISel can support is 2^14. This patch verify the parameter and return
value for alignment.
Differential Revision: https://reviews.llvm.org/D121898
This adds LLVMAnyPointerToElt to use instead of LLVMPointerToElt.
This allows us to preserve the address space as part of the type
overload for the intrinsic, but still require the vector element
type to match the pointer type.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D122042
Allows for skipping the pointer to vector type if opaque pointers
are enabled and the matching pointer is a vector pointer when
matching an intrinsic signature in the verifier.
No test added since lacking a target using intrinsic with pointer
to vector arguments.
Differential Revision: https://reviews.llvm.org/D122203
VPIntrinsic::getStaticVectorLength infers the operational vector length
of a VPIntrinsic instance from a type that is used with the intrinsic.
The function used the mask operand before. Yet, vp.merge|select do not
have a mask operand (in the predicating sense that the other VP
intrinsics are using them - it is a selection mask for them). Fallback
to the return type to fix this.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D121913
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be
populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be
lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.
Differential Revision: https://reviews.llvm.org/D115907
This allows us to not have to specify -opaque-pointers when updating
IR tests from typed pointers to opaque pointers.
We detect opaque pointers in .ll files by looking for relevant tokens,
either "ptr" or "*".
Reviewed By: #opaque-pointers, nikic
Differential Revision: https://reviews.llvm.org/D119482
Move structural hashing into virtual methods on Pass. This will
allow MachineFunctionPass to override the method to add hashing of
the MachineFunction.
Differential Revision: https://reviews.llvm.org/D120123
Includes verifier changes checking the elementtype, clang codegen
changes to emit the elementtype, and ISel changes using the elementtype.
Basically the same as D120527.
Reviewed By: #opaque-pointers, nikic
Differential Revision: https://reviews.llvm.org/D121847
Includes verifier changes checking the elementtype, clang codegen
changes to emit the elementtype, and ISel changes using the elementtype.
Reviewed By: #opaque-pointers, nikic
Differential Revision: https://reviews.llvm.org/D120527
According to LangRef, an access scope must have zero operands and
be distinct. The access group may either be a single access scope
or a list of access scopes.
LoopInfo may assert if this is not the case.
Per LangRef, swifterror alloca must be a pointer.
Not checking this may result in a verifier error after transforms
instead, so make sure it's discarded early.
This patch introduces two new experimental IR intrinsics and SDAG nodes
to represent vector strided loads and stores.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D114884
This patch adds PrettyStackEntries before running passes. The entries
include the pass name and the IR unit the pass runs on.
The information is used the print additional information when a pass
crashes, including the name and a reference to the IR unit on which it
crashed. This is similar to the behavior of the legacy pass manager.
The improved stack trace now includes:
Stack dump:
0. Program arguments: bin/opt -loop-vectorize -force-vector-width=4 crash.ll
1. Running pass 'ModuleToFunctionPassAdaptor' on module 'crash.ll'
2. Running pass 'LoopVectorizePass' on function '@a'
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D120993
Without opaque pointers, this code currently treats a call through
a bitcast as the function being address taken, and IPSCCP relies
on this for correctness. Match the same behavior under opaque
pointers by checking that the function types are the same.
Fixes https://github.com/llvm/llvm-project/issues/54258.
VectorBuilder wraps around an IRBuilder and
VectorBuilder::createVectorInstructions emits VP intrinsics as if they
were regular instructions.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D105283
This check is not compatible with opaque pointers. We can avoid
it by adjusting the getPointerAlignment() implementation to avoid
creating unnecessary ptrtoint expressions for bitcasted pointers.
The code already uses OnlyIfReduced to not create an expression
if it does not simplify, and this makes sure that folding a
bitcast and ptrtoint into a ptrtoint doesn't count as a
simplification.
Differential Revision: https://reviews.llvm.org/D120904
This will let us start moving away from hard-coded attributes in
MemoryBuiltins.cpp and put the knowledge about various attribute
functions in the compilers that emit those calls where it probably
belongs.
Differential Revision: https://reviews.llvm.org/D117921
VectorBuilder wraps around an IRBuilder and
VectorBuilder::createVectorInstructions emits VP intrinsics as if they
were regular instructions.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D105283
Now that clang no longer emits GlobalIFunc-s with a
declaration for a resolver, we can restore that check.
In addition, add a linkage check like the one we have
on GlobalAlias-es, and a Verifier test for ifuncs.
Signed-off-by: Itay Bookstein <ibookstein@gmail.com>
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D120267
These aliases are produced by MergeFunctions and need to be mangled according to the calling convention of the function they are pointing to instead of defaulting to the C calling convention.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D120382
Add traverse through gc.relocate in determining whether base is
isExclusivelyDerivedFromNull OR ExclusivelyNull.
Reviewers: reames, anna
Reviewed By: reames, anna
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D119712
This introduces a new "ptrauth" operand bundle to be used in
call/invoke. At the IR level, it's semantically equivalent to an
@llvm.ptrauth.auth followed by an indirect call, but it additionally
provides additional hardening, by preventing the intermediate raw
pointer from being exposed.
This mostly adds the IR definition, verifier checks, and support in
a couple of general helper functions. Clang IRGen and backend support
will come separately.
Note that we'll eventually want to support this bundle in indirectbr as
well, for similar reasons. indirectbr currently doesn't support bundles
at all, and the IR data structures need to be updated to allow that.
Differential Revision: https://reviews.llvm.org/D113685
We have the `clang -cc1` command-line option `-funwind-tables=1|2` and
the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind
tables (1) or asynchronous unwind tables (2)`. However, this is
encoded in LLVM IR by the presence or the absence of the `uwtable`
attribute, i.e. we lose the information whether to generate want just
some unwind tables or asynchronous unwind tables.
Asynchronous unwind tables take more space in the runtime image, I'd
estimate something like 80-90% more, as the difference is adding
roughly the same number of CFI directives as for prologues, only a bit
simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even
more, if you consider tail duplication of epilogue blocks.
Asynchronous unwind tables could also restrict code generation to
having only a finite number of frame pointer adjustments (an example
of *not* having a finite number of `SP` adjustments is on AArch64 when
untagging the stack (MTE) in some cases the compiler can modify `SP`
in a loop).
Having the CFI precise up to an instruction generally also means one
cannot bundle together CFI instructions once the prologue is done,
they need to be interspersed with ordinary instructions, which means
extra `DW_CFA_advance_loc` commands, further increasing the unwind
tables size.
That is to say, async unwind tables impose a non-negligible overhead,
yet for the most common use cases (like C++ exceptions), they are not
even needed.
This patch extends the `uwtable` attribute with an optional
value:
- `uwtable` (default to `async`)
- `uwtable(sync)`, synchronous unwind tables
- `uwtable(async)`, asynchronous (instruction precise) unwind tables
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D114543
Need to add an assert about this->takeName(this). This restriction is already documented, so this is just an NFC check.
Without this assertion (as prescribed by original comments for this API), name deletion or down-stream assert failures may occur in other routines: e.g. at the beginning of replaceAllUsesWith() below.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D119636
Add a new llvm.fptrunc.round intrinsic to precisely control
the rounding mode when converting from f32 to f16.
Differential Revision: https://reviews.llvm.org/D110579
We currently don't have any specialized upgrades for intrinsics
that can be used in invokes, but they can still be subject to
a generic remangling upgrade. In particular, this happens when
upgrading statepoint intrinsics under -opaque-pointers.
This patch just changes the upgrade code to work on CallBase
instead of CallInst in particular.
This header is very large (3M Lines once expended) and was included in location
where dwarf-specific information were not needed.
More specifically, this commit suppresses the dependencies on
llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and
llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used,
this has a decent impact on number of preprocessed lines generated during
compilation of LLVM, as showcased below.
This is achieved by moving some definitions back to the .cpp file, no
performance impact implied[0].
As a consequence of that patch, downstream user may need to manually some extra
files:
llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h
llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h
In some situations, codes maybe relying on the fact that
llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden
dependency now needs to be explicit.
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after: 10978519
before: 11245451
Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
[0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions
Differential Revision: https://reviews.llvm.org/D118781
This makes the statepoint methods in IRBuilder accept a
FunctionCallee, which carries both the callee and function type.
This is used to add the elementtype attribute to the statepoint call.
RS4GC requires an additional tweak to actually preserve that attribute
-- previously the attributes on the call were completely overwritten.
Differential Revision: https://reviews.llvm.org/D118886
This patch extends clang frontend to add metadata that can be used to emit macho files with two build version load commands.
It utilizes "darwin.target_variant.triple" and "darwin.target_variant.SDK Version" metadata names for that.
MachO uses two build version load commands to represent an object file / binary that is targeting both the macOS target,
and the Mac Catalyst target. At runtime, a dynamic library that supports both targets can be loaded from either a native
macOS or a Mac Catalyst app on a macOS system. We want to add support to this to upstream to LLVM to be able to build
compiler-rt for both targets, to finish the complete support for the Mac Catalyst platform, which is right now targetable
by upstream clang, but the compiler-rt bits aren't supported because of the lack of this multiple build version support.
Differential Revision: https://reviews.llvm.org/D115415
Based on the output of include-what-you-use.
This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden ehader dependencies, something
the LLVM codebase doesn't do that well :-/
I've tried to summarize the biggest change below:
- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h
And the usual count of preprocessed lines:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after: 6189948
200k lines less to process is no that bad ;-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118652
Currently, the clang.arc.attachedcall bundle takes an optional function
argument. Depending on whether the argument is present, calls with this
bundle have the following semantics:
- on x86, with the argument present, the call is lowered to:
call _target
mov rax, rdi
call _objc_retainAutoreleasedReturnValue
- on AArch64, without the argument, the call is lowered to:
bl _target
mov x29, x29
and the objc runtime call is expected to be emitted separately.
That's because, on x86, the objc runtime checks for both the mov and
the call on x86, and treats the combination as the ARC autorelease elision
marker.
But on AArch64, it only checks for the dedicated NOP marker, as that's
historically been sufficiently unique. Thanks to that, the runtime call
wasn't required to be adjacent to the NOP marker, so it wasn't emitted
as part of the bundle sequence.
This patch unifies both architectures: on AArch64, we now emit all
3 instructions for the bundle. This guarantees that the runtime call
is adjacent to the marker in the sequence, and that's information the
runtime can use to further optimize this.
This helps simplify some of the handling, in particular
BundledRetainClaimRVs, which no longer needs to know whether the bundle
is sufficient or not: it now always should be.
Note that this does not include an AutoUpgrade for the nullary bundles,
as they are only produced in ObjCContract as part of the obj/asm emission
pipeline, and are not expected to be in bitcode.
Differential Revision: https://reviews.llvm.org/D118214
Once again, this fold is meaningless with opaque pointers, as there
is no pointer element type to canonicalize. At some point, we may
want to do GEP type canonicalizations.
DIStringType is used to encode the debug info of a character object
in Fortran. A Fortran deferred-length character object is typically
implemented as a pair of the following two pieces of info: An address
of the raw storage of the characters, and the length of the object.
The stringLocationExp field contains the DIExpression to get to the
raw storage.
This patch also enables the emission of DW_AT_data_location attribute
in a DW_TAG_string_type debug info entry based on stringLocationExp
in DIStringType.
A test is also added to ensure that the bitcode reader is backward
compatible with the old DIStringType format.
Differential Revision: https://reviews.llvm.org/D117586
This reverts commit ef82063207.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
This patch removes an incorrect behaviour in Constants.cpp, which would
replace dead constant references in metadata with an undef value. This
blanket replacement resulted in undef values being inserted into
metadata that would not accept them. The replacement was intended for
debug info metadata, but this is now instead handled in the RAUW
handler.
Differential Revision: https://reviews.llvm.org/D117300
This method is intended for use in places that cannot be reached
with opaque pointers, or part of deprecated methods. This makes
it easier to see that some uses of getPointerElementType() don't
need further action.
Differential Revision: https://reviews.llvm.org/D117870
MSVC currently doesn't support 80 bits long double. ICC supports it when
the option `/Qlong-double` is specified. Changing the alignment of f80
to 16 bytes so that we can be compatible with ICC's option.
Reviewed By: rnk, craig.topper
Differential Revision: https://reviews.llvm.org/D115942
I tried to look over the file and didn't see any other non-use of *Len variables.
Reviewed By: deadalnix
Differential Revision: https://reviews.llvm.org/D116482
This follows up on the work in D116599, which changed AttrBuilder
to store string attributes as SmallVector<Attribute>. This patch
changes the implementation to store *all* attributes as a sorted
vector.
This both makes the implementation simpler and improves compile-time.
We get a -0.5% geomean compile-time improvement on CTMark at O0.
Differential Revision: https://reviews.llvm.org/D117558
Currently, the behavior when adding an attribute with the same key
as an existing attribute is inconsistent, depending on the type of
the attribute and the method used to add it. When going through
AttrBuilder::addAttribute(), the new attribute always overwrites
the old one. When going through AttrBuilder::merge() the new
attribute overwrites the existing one if it is a string attribute,
but keeps the existing one for int and type attributes. One
particular API also asserts that you can't overwrite an align
attribute, but does not handle any of the other int, type or string
attributes.
This patch makes the behavior consistent by always overwriting with
the new attribute, which is the behavior I would intuitively expect.
Two tests are affected, which now make a different (but equally
valid) choice. Those tests could be improved by taking the maximum
deref bytes, but I haven't bothered with that, since this is testing
a degenerate case -- the important bit is that it doesn't crash.
Differential Revision: https://reviews.llvm.org/D117552
I based this off of the API already create for llvm.dbg.value since both
intrinsics have the same arguments at the API level.
I added some tests exercising the API a little as well as an additional small
test that shows how one can use llvm.dbg.addr to limit the PC range where an
address value is available in the debugger. This is done by calling
llvm.dbg.value with undef and the same metadata info as one used to create the
llvm.dbg.addr.
rdar://83957028
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D117442
The `InstrProfInstBase` class is for all `llvm.instrprof.*` intrinsics. In a
later diff we will add new instrinsic of this type. Also refactor some
logic in `InstrProfiling.cpp`.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D117261
This is unused, and doesn't make a lot of sense as an API. The
usual pattern would be to combine the AttrBuilder(AttributeSet)
constructor with the overlaps() method.
Use the AttributeSet constructor instead. There's no good reason
why AttrBuilder itself should exact the AttributeSet from the
AttributeList. Moving this out of the AttrBuilder generally results
in cleaner code.
The invalid undef value already triggers a verifier failure, but
then the upwards scan from the cleanuppad ends up asserting. Make
sure this is handled gacefully instead.
This fixes -fno-semantic-interposition -fsanitize-coverage incompatibility.
-fPIC -fno-semantic-interposition may add dso_local to an external linkage
function. -fsanitize-coverage instrumentation does not clear dso_local when
adding comdat nodeduplicate. This causes a compatibility issue: the function
symbol may be referenced by a PC-relative relocation without using the local
alias. In -shared mode, ld will report a relocation error.
The fix is to either clear dso_local when adding comdat nodeduplicate, or
supporting comdat nodeduplicate. The latter is more appropriate, because a
comdat nodeduplicate is like not using comdat.
Note: The comdat condition was originally added by D77429 to not use local alias
for a hidden external linkage function in a deduplicate comdat. The condition
has been unused since the code was refactored to only use local alias for
default visibility symbols.
Note: `canBenefitFromLocalAlias` is used by clang/lib/CodeGen/CodeGenModule.cpp
and we don't want to add dso_local to default visibility external linkage comdat any
(clang/test/CodeGenCUDA/usual-deallocators.cu).
Differential Revision: https://reviews.llvm.org/D117190
Since 26c6a3e736, LLVM's inliner will "upgrade" the caller's stack protector
attribute based on the callee. This lead to surprising results with Clang's
no_stack_protector attribute added in 4fbf84c173 (D46300). Consider the
following code compiled with clang -fstack-protector-strong -Os
(https://godbolt.org/z/7s3rW7a1q).
extern void h(int* p);
inline __attribute__((always_inline)) int g() {
return 0;
}
int __attribute__((__no_stack_protector__)) f() {
int a[1];
h(a);
return g();
}
LLVM will inline g() into f(), and f() would get a stack protector, against the
users explicit wishes, potentially breaking the program e.g. if h() changes the
value of the stack cookie. That's a miscompile.
More recently, bc044a88ee (D91816) addressed this problem by preventing
inlining when the stack protector is disabled in the caller and enabled in the
callee or vice versa. However, the problem remained if the callee is marked
always_inline as in the example above. This affected users, see e.g.
http://crbug.com/1274129 and http://llvm.org/pr52886.
One way to fix this would be to prevent inlining also in the always_inline
case. Despite the name, always_inline does not guarantee inlining, so this
would be legal but potentially surprising to users.
However, I think the better fix is to not enable the stack protector in a
caller based on the callee. The motivation for the old behaviour is unclear, it
seems counter-intuitive, and causes real problems as we've seen.
This commit implements that fix, which means in the example above, g() gets
inlined into f() (also without always_inline), and f() is emitted without stack
protector. I think that matches most developers' expectations, and that's also
what GCC does.
Another effect of this change is that a no_stack_protector function can now be
inlined into a stack protected function, e.g. (https://godbolt.org/z/hafP6W856):
extern void h(int* p);
inline int __attribute__((__no_stack_protector__)) __attribute__((always_inline)) g() {
return 0;
}
int f() {
int a[1];
h(a);
return g();
}
I think that's fine. Such code would be unusual since no_stack_protector is
normally applied to a program entry point which sets up the stack canary. And
even if such code exists, inlining doesn't change the semantics: there is still
no stack cookie setup/check around entry/exit of the g() code region, but there
may be in the surrounding context, as there was before inlining. This also
matches GCC.
See also the discussion at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722
Differential revision: https://reviews.llvm.org/D116589
llvm.vp.merge interprets the %evl operand differently than the other vp
intrinsics: all lanes at positions greater or equal than the %evl
operand are passed through from the second vector input. Otherwise it
behaves like llvm.vp.select.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116725
MSVC currently doesn't support 80 bits long double. ICC supports it when
the option `/Qlong-double` is specified. Changing the alignment of f80
to 16 bytes so that we can be compatible with ICC's option.
Reviewed By: rnk, craig.topper
Differential Revision: https://reviews.llvm.org/D115942
I've changed the definition of the experimental.vector.splice
instrinsic to reject indices that are known to be or possibly
out-of-bounds. In practice, this means changing the definition so that
the index is now only valid in the range [-VL, VL-1] where VL is the
known minimum vector length. We use the vscale_range attribute to
take the minimum vscale value into account so that we can permit
more indices when the attribute is present.
The splice intrinsic is currently only ever generated by the vectoriser,
which will never attempt to splice vectors with out-of-bounds values.
Changing the definition also makes things simpler for codegen since we
can always assume that the index is valid.
This patch was created in response to review comments on D115863
Differential Revision: https://reviews.llvm.org/D115933
We need to explicitly visit a number of types, as these are no
longer reachable through the pointer type if opaque pointers are
enabled. This is similar to ValueEnumerator changes that have
been done previously.
Track all GlobalObjects that reference a given comdat, which allows
determining whether a function in a comdat is dead without scanning
the whole module.
In particular, this makes filterDeadComdatFunctions() have complexity
O(#DeadFunctions) rather than O(#SymbolsInModule), which addresses
half of the compile-time issue exposed by D115545.
Differential Revision: https://reviews.llvm.org/D115864
The naming has come up as a source of confusion in several recent reviews. onlyWritesMemory is consist with onlyReadsMemory which we use for the corresponding readonly case as well.
This folded (null + X) == g to false, but of course this is
incorrect if X == g.
Possibly this got confused with the null == g case, which is
already handled elsewhere.
This class is solely used as a lightweight and clean way to build a set of
attributes to be removed from an AttrBuilder. Previously AttrBuilder was used
both for building and removing, which introduced odd situation like creation of
Attribute with dummy value because the only relevant part was the attribute
kind.
Differential Revision: https://reviews.llvm.org/D116110
The fold for merging a GEP of GEP into a single GEP currently bails
if doing so would result in notional overindexing. The justification
given in the comment above this check is dangerously incorrect: GEPs
with notional overindexing are perfectly fine, and if some code
treats them incorrectly, then that code is broken, not the GEP.
Such a GEP might legally appear in source IR, so only preventing
its creation cannot be sufficient. (The constant folder also ends
up canonicalizing the GEP to remove the notional overindexing, but
that's neither here nor there.)
This check dates back to
bd4fef4a89,
and as far as I can tell the original issue this was trying to
patch around has since been resolved.
Differential Revision: https://reviews.llvm.org/D116587
This fold is not correct, because indices might evaluate to zero
even if they are not a literal zero integer. Additionally, this
fold would be wrong (in the general case) for non-i8 types as well,
due to index overflow.
Drop this fold and instead let the target-dependent constant
folder compute the actual offset and fold the comparison based
on that.
Indirect inline asm operands may require the materialization of a
memory access according to the pointer element type. As this will
no longer be available with opaque pointers, we require it to be
explicitly annotated using the elementtype attribute, for example:
define void @test(i32* %p, i32 %x) {
call void asm "addl $1, $0", "=*rm,r"(i32* elementtype(i32) %p, i32 %x)
ret void
}
This patch only includes the LangRef change and Verifier updates to
allow adding the elementtype attribute in this position. It does not
yet enforce this, as this will require changes on the clang side
(and test updates) first.
Something I'm a bit unsure about is whether we really need the
elementtype for all indirect constraints, rather than only indirect
register constraints. I think indirect memory constraints might not
strictly need it (though the backend code is written in a way that
does require it). I think it's okay to just make this a general
requirement though, as this means we don't need to carefully deal
with multiple or alternative constraints. In addition, I believe
that MemorySanitizer benefits from having the element type even in
cases where it may not be strictly necessary for normal lowering
(cd2b050fa4/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (L4066)).
Differential Revision: https://reviews.llvm.org/D116531
This reverts commit fd4808887e.
This patch causes gcc to issue a lot of warnings like:
warning: base class ‘class llvm::MCParsedAsmOperand’ should be
explicitly initialized in the copy constructor [-Wextra]
This patch extends the available uses of the 'align' parameter attribute
to include vectors of pointers. The attribute specifies pointer
alignment element-wise.
This change was previously requested and discussed in D87304.
The vector predication (VP) intrinsics intend to use this for scatter
and gather operations, as they lack the explicit alignment parameter
that the masked versions use.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D115161
We can fold an equality or unsigned icmp between base+offset1 and
base+offset2 with inbounds offsets by comparing the offsets directly.
This replaces a pair of specialized folds that tried to reason
based on the GEP structure instead. One of those folds was plain
wrong (because it does not account for negative offsets), while
the other is unnecessarily complicated and limited (e.g. it will
fail with bitcasts involved).
The disadvantage of this change is that it requires data layout,
so the fold is no longer performed by datalayout-independent
constant folding. I don't think this is a loss in practice, but
it does regress the ConstantExprFold.ll test, which checks folding
without running any passes.
Differential Revision: https://reviews.llvm.org/D116332
The function `ConstantFoldCompareInstruction` uses `unsigned short` to
represent compare predicate, although all usesrs of the respective
include file use definition of CmpInst also. This change replaces
predicate argument type in this function to `ICmpInst::Predicate`,
which allows to make code a bit clearer and simpler.
No functional changes.
Differential Revision: https://reviews.llvm.org/D116379
An inbounds GEP may still cross the sign boundary, so signed icmps
cannot be folded (https://alive2.llvm.org/ce/z/XSgi4D). This was
previously fixed for other folds in this function, but this one
was missed.
New method `FCmpInst::compare` is added, which evaluates the given
compare predicate for constant operands. Interface is made similar to
`ICmpInst::compare`.
Differential Revision: https://reviews.llvm.org/D116168
The recursive implementation can run into stack overflows, e.g. like in PR52844.
The order the users are visited changes, but for the current use case
this only impacts the order error messages are emitted.
With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces
function references with CFI jump table references, which is a problem
for low-level code that needs the address of the actual function body.
For example, in the Linux kernel, the code that sets up interrupt
handlers needs to take the address of the interrupt handler function
instead of the CFI jump table, as the jump table may not even be mapped
into memory when an interrupt is triggered.
This change adds the no_cfi constant type, which wraps function
references in a value that LowerTypeTestsModule::replaceCfiUses does not
replace.
Link: https://github.com/ClangBuiltLinux/linux/issues/1353
Reviewed By: nickdesaulniers, pcc
Differential Revision: https://reviews.llvm.org/D108478
It was already possible to create an AttributeList from an Index
and an AttributeSet. However, this would actually end up using
the implicit constructor on AttrBuilder, thus doing an unnecessary
conversion from AttributeSet to AttrBuilder to AttributeSet.
Instead we can accept the AttributeSet directly, as that is what
we need anyway.
As requested in D115787, I've added a test for LLVMConstGEP2 and
LLVMConstInBoundsGEP2. However, to make this work in the echo test,
I also had to change a couple of APIs to work on GEP operators,
rather than only GEP instructions.
Differential Revision: https://reviews.llvm.org/D115858
Weirdly, the opaque pointer compatible variants LLVMConstGEP2 and
LLVMConstInBoundsGEP2 were already declared in the header, but not
actually implemented. This adds the missing implementations and
deprecates the incompatible functions.
Differential Revision: https://reviews.llvm.org/D115787
These are deprecated and should be replaced with getAlign().
Some of these asserts don't do anything because Load/Store/AllocaInst never have a 0 align value.
These flags are documented as generating poison values for particular input values. As such, we should really be consistent about their handling with how we handle nsw/nuw/exact/inbounds.
Differential Revision: https://reviews.llvm.org/D115460
The vp.load and vp.gather intrinsics require the intrinsic return
type to determine the correct function signature. With opaque pointers,
it cannot be derived from the parameter pointee types.
Differential Revision: https://reviews.llvm.org/D115632
Rust allows enums to be scopes, as shown by the previous change. Sadly,
D111770 disallowed enums-as-scopes in the LLVM Verifier, which means
that LLVM HEAD stopped working for Rust compiles. As a result, we back
out the verifier part of D111770 with a modification to the testcase so
we don't break this in the future.
The testcase is now actual IR from rustc at commit 8f8092cc3, which is
the nightly as of 2021-09-28. I would expect rustc 1.57 to produce
similar or identical IR if someone wants to reproduce this IR in the
future with minimal changes. A recipe for reproducing the IR using rustc
is included in the test file.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D115353
This exposes the core logic of getGEPIndicesForOffset() as a
getGEPIndexForOffset() method that only returns a single offset,
instead of following the whole chain.
Reverts 02940d6d22. Fixes breakage in the modules build.
LLVM loops cannot represent irreducible structures in the CFG. This
change introduce the concept of cycles as a generalization of loops,
along with a CycleInfo analysis that discovers a nested
hierarchy of such cycles. This is based on Havlak (1997), Nesting of
Reducible and Irreducible Loops.
The cycle analysis is implemented as a generic template and then
instatiated for LLVM IR and Machine IR. The template relies on a new
GenericSSAContext template which must be specialized when used for
each IR.
This review is a restart of an older review request:
https://reviews.llvm.org/D83094
Original implementation by Nicolai Hähnle <nicolai.haehnle@amd.com>,
with recent refactoring by Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>
Differential Revision: https://reviews.llvm.org/D112696
Currently, it is impossible to specify a DataLayout with pointer
size and index size that is not a whole number of bytes.
This patch modifies
the DataLayout class to accept arbitrary pointer sizes and to
store the size as a number of bits, rather than as a number of bytes.
Generally speaking, the external interface of the class as used
by in-tree architectures remains the same and shouldn't affect the
behavior of architecures with pointer sizes equal to a whole number
of bytes.
Note the interface of setPointerAlignment has changed and takes
a pointer and index size that is a number of bits, rather than a number
of bytes.
Patch originally by Ajit Kumar Agarwal
Differential Revision: https://reviews.llvm.org/D114141
This patch extends LLVM IR to add metadata that can be used to emit macho files with two build version load commands.
It utilizes "darwin.target_variant.triple" and "darwin.target_variant.SDK Version" metadata names for that,
which will be set by a future patch in clang.
MachO uses two build version load commands to represent an object file / binary that is targeting both the macOS target,
and the Mac Catalyst target. At runtime, a dynamic library that supports both targets can be loaded from either a native
macOS or a Mac Catalyst app on a macOS system. We want to add support to this to upstream to LLVM to be able to build
compiler-rt for both targets, to finish the complete support for the Mac Catalyst platform, which is right now targetable
by upstream clang, but the compiler-rt bits aren't supported because of the lack of this multiple build version support.
Differential Revision: https://reviews.llvm.org/D112189