Commit Graph

1031 Commits

Author SHA1 Message Date
Simon Moll d05ddb86f6 [VP] vp.sitofp cast intrinsic and docs
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119922
2022-03-02 10:16:19 +01:00
serge-sans-paille a494ae43be Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output:
before: 1065307662
after:  1064800684

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
2022-03-01 21:00:07 +01:00
Simon Moll 03e83cc8eb [VP] vp.fptosi cast intrinsic and docs
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D119535
2022-02-15 18:17:19 +01:00
Momchil Velikov 6398903ac8 Extend the `uwtable` attribute with unwind table kind
We have the `clang -cc1` command-line option `-funwind-tables=1|2` and
the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind
tables (1) or asynchronous unwind tables (2)`. However, this is
encoded in LLVM IR by the presence or the absence of the `uwtable`
attribute, i.e.  we lose the information whether to generate want just
some unwind tables or asynchronous unwind tables.

Asynchronous unwind tables take more space in the runtime image, I'd
estimate something like 80-90% more, as the difference is adding
roughly the same number of CFI directives as for prologues, only a bit
simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even
more, if you consider tail duplication of epilogue blocks.
Asynchronous unwind tables could also restrict code generation to
having only a finite number of frame pointer adjustments (an example
of *not* having a finite number of `SP` adjustments is on AArch64 when
untagging the stack (MTE) in some cases the compiler can modify `SP`
in a loop).
Having the CFI precise up to an instruction generally also means one
cannot bundle together CFI instructions once the prologue is done,
they need to be interspersed with ordinary instructions, which means
extra `DW_CFA_advance_loc` commands, further increasing the unwind
tables size.

That is to say, async unwind tables impose a non-negligible overhead,
yet for the most common use cases (like C++ exceptions), they are not
even needed.

This patch extends the `uwtable` attribute with an optional
value:
      -  `uwtable` (default to `async`)
      -  `uwtable(sync)`, synchronous unwind tables
      -  `uwtable(async)`, asynchronous (instruction precise) unwind tables

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D114543
2022-02-14 14:35:02 +00:00
YASHASVI KHATAVKAR 70fdbf35de Adding DiBuilder interface for assumed length strings 2022-02-11 14:40:02 -05:00
Paul Robinson d2495b69f2 [RGT] Exercise both paths through a test
BitcastToGEP had an opaque/typed pointer decision point, make sure it
exercises both sides.

Found by the Rotten Green Tests project.
2022-02-11 10:47:00 -08:00
YASHASVI KHATAVKAR 93d1a623ce Reverting an entire stack of changes causing build failures 2022-02-10 17:58:22 -05:00
YASHASVI KHATAVKAR 0e7341b7b1 worked on review comments 2022-02-10 15:24:51 -05:00
YASHASVI KHATAVKAR c26a0d1cda Updated the test to include proper string get functions 2022-02-10 15:24:50 -05:00
YASHASVI KHATAVKAR 929499eb64 Updated the test to include addtional details 2022-02-10 15:24:50 -05:00
YASHASVI KHATAVKAR 99f990be64 Added StringLocationExp to the new apis 2022-02-10 15:24:50 -05:00
YASHASVI KHATAVKAR 43d421cda3 Adding DIBuilder interface for assumed length string 2022-02-10 15:24:50 -05:00
Craig Topper 60745fb16f [VP] llvm.vp.fneg intrinsic and LangRef
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D119262
2022-02-09 07:54:36 -08:00
Craig Topper cef177d186 [VP] llvm.vp.fma intrinsic and LangRef
Differential Revision: https://reviews.llvm.org/D119185
2022-02-07 15:53:27 -08:00
serge-sans-paille ffe8720aa0 Reduce dependencies on llvm/BinaryFormat/Dwarf.h
This header is very large (3M Lines once expended) and was included in location
where dwarf-specific information were not needed.

More specifically, this commit suppresses the dependencies on
llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and
llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used,
this has a decent impact on number of preprocessed lines generated during
compilation of LLVM, as showcased below.

This is achieved by moving some definitions back to the .cpp file, no
performance impact implied[0].

As a consequence of that patch, downstream user may need to manually some extra
files:

llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h
llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h

In some situations, codes maybe relying on the fact that
llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden
dependency now needs to be explicit.

$ clang++ -E  -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after:   10978519
before:  11245451

Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
[0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions

Differential Revision: https://reviews.llvm.org/D118781
2022-02-04 11:44:03 +01:00
serge-sans-paille e188aae406 Cleanup header dependencies in LLVMCore
Based on the output of include-what-you-use.

This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden ehader dependencies, something
the LLVM codebase doesn't do that well :-/

I've tried to summarize the biggest change below:

- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h

And the usual count of preprocessed lines:
$ clang++ -E  -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after:  6189948

200k lines less to process is no that bad ;-)

Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup

Differential Revision: https://reviews.llvm.org/D118652
2022-02-02 06:54:20 +01:00
Michael Gottesman 7ed95d1577 [debug-info] Add support for llvm.dbg.addr in DIBuilder.
I based this off of the API already create for llvm.dbg.value since both
intrinsics have the same arguments at the API level.

I added some tests exercising the API a little as well as an additional small
test that shows how one can use llvm.dbg.addr to limit the PC range where an
address value is available in the debugger. This is done by calling
llvm.dbg.value with undef and the same metadata info as one used to create the
llvm.dbg.addr.

rdar://83957028

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D117442
2022-01-18 18:26:50 -08:00
Simon Moll 33efbc8184 [VP] llvm.vp.merge intrinsic and LangRef
llvm.vp.merge interprets the %evl operand differently than the other vp
intrinsics: all lanes at positions greater or equal than the %evl
operand are passed through from the second vector input. Otherwise it
behaves like llvm.vp.select.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116725
2022-01-12 14:06:56 +01:00
Serge Guelton d2cc6c2d0c Use a sorted array instead of a map to store AttrBuilder string attributes
Using and std::map<SmallString, SmallString> for target dependent attributes is
inefficient: it makes its constructor slightly heavier, and involves extra
allocation for each new string attribute. Storing the attribute key/value as
strings implies extra allocation/copy step.

Use a sorted vector instead. Given the low number of attributes generally
involved, this is cheaper, as showcased by

https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions

Differential Revision: https://reviews.llvm.org/D116599
2022-01-10 14:49:53 +01:00
Nikita Popov 32808cfb24 [IR] Track users of comdats
Track all GlobalObjects that reference a given comdat, which allows
determining whether a function in a comdat is dead without scanning
the whole module.

In particular, this makes filterDeadComdatFunctions() have complexity
O(#DeadFunctions) rather than O(#SymbolsInModule), which addresses
half of the compile-time issue exposed by D115545.

Differential Revision: https://reviews.llvm.org/D115864
2022-01-06 09:13:58 +01:00
serge-sans-paille 9290ccc3c1 Introduce the AttributeMask class
This class is solely used as a lightweight and clean way to build a set of
attributes to be removed from an AttrBuilder. Previously AttrBuilder was used
both for building and removing, which introduced odd situation like creation of
Attribute with dummy value because the only relevant part was the attribute
kind.

Differential Revision: https://reviews.llvm.org/D116110
2022-01-04 15:37:46 +01:00
Florian Hahn 5d68dc184e
[Verifier] Iteratively traverse all indirect users.
The recursive implementation can run into stack overflows, e.g. like in PR52844.

The order the users are visited changes, but for the current use case
this only impacts the order error messages are emitted.
2021-12-23 23:20:12 +01:00
Fraser Cormack 8002fa6760 [LangRef] Remove incorrect vector alignment rules
The LangRef incorrectly says that if no exact match is found when
seeking alignment for a vector type, the largest vector type smaller
than the sought-after vector type. This is incorrect as vector types
require an exact match, else they fall back to reporting the natural
alignment.

The corrected rule was not added in its place, as rules for other types
(e.g., floating-point types) aren't documented.

A unit test was added to demonstrate this.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D112463
2021-12-14 14:35:40 +00:00
Nikita Popov 6213f1dd03 [IR] Make VPIntrinsic::getDeclarationForParams() opaque pointer compatible
The vp.load and vp.gather intrinsics require the intrinsic return
type to determine the correct function signature. With opaque pointers,
it cannot be derived from the parameter pointee types.

Differential Revision: https://reviews.llvm.org/D115632
2021-12-14 14:20:59 +01:00
Nikita Popov 9cbab13282 [ConstantsTest] Avoid crash with opaque pointers
With opaque pointers there will be no bitcast, so don't assume
that.
2021-12-13 15:23:12 +01:00
Arthur Eubanks 93a20ecee4 [DebugInfo] Check DIEnumerator bit width when comparing for equality
As mentioned in D106585, this causes non-determinism, which can also be
shown by this test case being flaky without this patch.

We were using the APSInt's bit width for hashing, but not for checking
for equality. APInt::isSameValue() does not check bit width.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D115054
2021-12-03 13:40:22 -08:00
Sanjay Patel 97e921c81f [PatternMatch] create and use matcher for 'not' that excludes undef elements
We needed a stricter version of m_Not for D114462, but I wasn't
sure if that was going to be required anywhere else, so I didn't bother
to make that reusable.

It turns out we have one more existing simplification that needs
this (currently miscompiles):
https://alive2.llvm.org/ce/z/9-nTKi

And there's at least one more fold in that family that we could add.

Differential Revision: https://reviews.llvm.org/D114882
2021-12-02 08:51:13 -05:00
Nikita Popov 55d392cc30 [llvm-c] Make LLVMAddAlias opaque pointer compatible
Deprecate LLVMAddAlias in favor of LLVMAddAlias2, which accepts a
value type and an address space. Previously these were extracted
from the pointer type.

Differential Revision: https://reviews.llvm.org/D114860
2021-12-02 09:21:16 +01:00
Ellis Hoag 0150645bf5 [DebugInfo] Do not replace existing nodes from DICompileUnit
When creating a new DIBuilder with an existing DICompileUnit, load the
DINodes from the current DICompileUnit so they don't get overwritten.
This is done in the MachineOutliner pass, but it didn't change the CU so
the bug never appeared. We need this if we ever want to add DINodes to
the CU after it has been created, e.g., DIGlobalVariables.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D114556
2021-11-29 19:46:10 -08:00
Simon Moll 1e65b93f3a [VP] Canonicalize macros of VPIntrinsics.def
Usage and naming of macros in VPIntrinsics.def has been inconsistent. Rename all property macros to VP_PROPERTY_<name>.  Use BEGIN/END scope macros to attach properties to vp intrinsics and SDNodes (instead of specifying either directly with the property macro).
A follow-up patch has documentation on how the macros are (intended) to be used.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D114144
2021-11-23 16:51:11 +01:00
David Sherwood ca18fcc2c0 [IR] Change CreateStepVector to work with element types smaller than i8
Currently the stepvector intrinsic only supports element types that
are integers of size 8 bits or more. This patch adds support for the
creation of stepvectors with smaller element types by creating
the intrinsic with i8 elements that we then truncate to the requested
size.

It's not currently possible to write a vectoriser test to exercise
this code path so I have added a unit test here:

  llvm/unittests/IR/IRBuilderTest.cpp

Differential Revision: https://reviews.llvm.org/D113767
2021-11-17 10:47:50 +00:00
Mircea Trofin a32c2c3808 [NFC] Use Optional<ProfileCount> to model invalid counts
ProfileCount could model invalid values, but a user had no indication
that the getCount method could return bogus data. Optional<ProfileCount>
addresses that, because the user must dereference the optional. In
addition, the patch removes concept duplication.

Differential Revision: https://reviews.llvm.org/D113839
2021-11-14 19:03:30 -08:00
Nikita Popov 1c5d636af1 [ConstantRangeTest] Add helper to enumerate APInts (NFC)
While ForeachNumInConstantRange(ConstantRange::getFull(Bits))
works, it's somewhat roundabout, and I keep looking for this
function.
2021-11-12 18:18:51 +01:00
Nikita Popov 2060895c9c [ConstantRange] Add exact union/intersect (NFC)
For some optimizations on comparisons it's necessary that the
union/intersect is exact and not a superset. Add methods that
return Optional<ConstantRange> only if the result is exact.

For the sake of simplicity this is implemented by comparing
the subset and superset approximations for now, but it should be
possible to do this more directly, as unionWith() and intersectWith()
already distinguish the cases where the result is imprecise for the
preferred range type functionality.
2021-11-07 21:46:06 +01:00
Nikita Popov cf71a5ea8f [ConstantRange] Support zero size in isSizeLargerThan()
From an API perspective, it does not make a lot of sense that 0
is not a valid argument to this function. Add the exact check needed
to support it.
2021-11-07 21:22:45 +01:00
Luke Benes 2249ecee8d
[IR][ShuffleVector] Fix Wdangling-else warning in InstructionsTest
Fix a dangling else that gcc-11 warned about. The EXPECT_EQ macro
expands to an if-else, so the whole construction contains a hidden
dangling else.

Differential Revision: https://reviews.llvm.org/D113346
2021-11-07 00:07:01 +03:00
Nikita Popov 9f0194be45 [ConstantRange] Add getEquivalentICmp() variant with offset (NFCI)
Add a variant of getEquivalentICmp() that produces an optional
offset. This allows us to create an equivalent icmp for all ranges.

Use this in the with.overflow folding code, which was doing this
adjustment separately -- this clarifies that the fold will indeed
always apply.
2021-11-06 21:59:45 +01:00
Roman Lebedev a5cd27880a
[IR] Improve member `ShuffleVectorInst::isReplicationMask()`
When we have an actual shuffle, we can impose the additional restriction
that the mask replicates the elements of the first operand, so we know
the replication factor as a ratio of output and op0 vector sizes.
2021-11-06 00:09:27 +03:00
Roman Lebedev 0b36431810
[NFCI] InstructionTest: trim `InstructionsTest.ShuffleMaskIsReplicationMask_*` complexity
These tests have pretty high O() complexity due to their nature,
which leads to potentially-long runtimes.

While in release build for me they took ~1 and ~2 sec,
as noted in https://reviews.llvm.org/D113214#inline-1080479
they take minutes in debug build.

Fine-tune the amount of permutations they deal with,
without affecting the test coverage. After this,
they take <~10ms each for me (in release build),
hopefully that is good-enough for debug build too.
2021-11-05 19:22:48 +03:00
Roman Lebedev 01d8759ac9
[IR][ShuffleVector] Introduce `isReplicationMask()` matcher
Avid readers of this saga may recall from previous installments,
that replication mask replicates (lol) each of the `VF` elements
in a vector `ReplicationFactor` times. For example, the mask for
`ReplicationFactor=3` and `VF=4` is: `<0,0,0,1,1,1,2,2,2,3,3,3>`.
More importantly, replication mask is used by LoopVectorizer
when using masked interleaved memory operations.

As discussed in previous installments, while it is used by LV,
and we **seem** to support masked interleaved memory operations on X86,
it's support in cost model leaves a lot to be desired:
until basically yesterday even for AVX512 we had no cost model for it.

As it has been witnessed in the recent
AVX2 `X86TTIImpl::getInterleavedMemoryOpCost()`
costmodel patches, while it is hard-enough to query the cost
of a particular assembly sequence [from llvm-mca],
afterwards the check lines LV costmodel tests must be updated manually.
This is, at the very least, boring.

Okay, now we have decent costmodel coverage for interleaving shuffles,
but now basically the same mind-killing sequence has to be performed
for replication mask. I think we can improve at least the second half
of the problem, by teaching
the `TargetTransformInfoImplCRTPBase::getUserCost()` to recognize
`Instruction::ShuffleVector` that are repetition masks,
adding exhaustive test coverage
using `-cost-model -analyze` + `utils/update_analyze_test_checks.py`

This way we can have good exhaustive coverage for cost model,
and only basic coverage for the LV costmodel.

This patch adds precise undef-aware `isReplicationMask()`,
with exhaustive test coverage.
* `InstructionsTest.ShuffleMaskIsReplicationMask` shows that
   it correctly detects all the known masks.
* `InstructionsTest.ShuffleMaskIsReplicationMask_undef`
  shows that replacing some mask elements in a known replication mask
  still allows us to recognize it as a replication mask.
  Note, with enough undef elts, we may detect a different tuple.
* `InstructionsTest.ShuffleMaskIsReplicationMask_Exhaustive_Correctness`
  shows that if we detected the replication mask with given params,
  then if we actually generate a true replication mask with said params,
  it matches element-wise ignoring undef mask elements.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D113214
2021-11-05 16:53:47 +03:00
Jakub Kuderski 3348b841d3 Make enum iteration with seq safe by default
By default `llvm::seq` would happily iterate over enums, which may be unsafe if the enum values are not continuous. This patch disable enum iteration with `llvm::seq` and `llvm::seq_inclusive` and adds two new functions: `enum_seq` and `enum_seq_inclusive`.

To make sure enum iteration is safe, we require users to declare their enum types as iterable by specializing `enum_iteration_traits<SomeEnum>`. Because it's not always possible to add these traits next to enum definition (e.g., for enums defined in external libraries), we provide an escape hatch to allow iteration on per-callsite basis by passing `force_iteration_on_noniterable_enum`.

The main benefit of this approach is that these global declarations via traits can appear just next to enum definitions, making easy to spot when enums are miss-labeled, e.g., after introducing new enum values, whereas `force_iteration_on_noniterable_enum` should stand out and be easy to grep for.

This emerged from a discussion with gchatelet@ about reusing llvm's `Sequence.h` in lieu of https://github.com/GPUOpen-Drivers/llpc/blob/dev/lgc/interface/lgc/EnumIterator.h.

Reviewed By: dblaikie, gchatelet, aaron.ballman

Differential Revision: https://reviews.llvm.org/D107378
2021-11-03 20:52:21 -04:00
Roman Lebedev 03a4f1f3b8
[ConstantRange] Sign-flipping of signedness-invariant comparisons
For certain combination of LHS and RHS constant ranges,
the signedness of the relational comparison predicate is irrelevant.

This implements complete and precise model for all predicates,
as confirmed by the brute-force tests. I'm not sure if there are
some more cases that we can handle here.

In a follow-up, CVP will make use of this.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D90924
2021-10-31 22:53:17 +03:00
Roman Lebedev 25043c8276
[NFCI] Introduce `ICmpInst::compare()` and use it where appropriate
As noted in https://reviews.llvm.org/D90924#inline-1076197
apparently this is a pretty common pattern,
let's not repeat it yet again, but have it in a common place.

There may be some more places where it could be used,
but these are the most obvious ones.
2021-10-30 17:50:06 +03:00
Roman Lebedev 42712698fd
Revert "[IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x`"
Clang OpenMP codegen tests are failing.

This reverts commit 288f1f8abe.
This reverts commit cb90e5356a.
2021-10-27 22:21:37 +03:00
Roman Lebedev cb90e5356a
[IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x`
There's precedent for that in `CreateOr()`/`CreateAnd()`.

The motivation here is to avoid bloating the run-time check's IR
in `SCEVExpander::generateOverflowCheck()`.

Refs. https://reviews.llvm.org/D109368#3089809
2021-10-27 21:34:38 +03:00
Yuanfang Chen 7c3fa52785 [DebugInfo] Skip ODRUniquing for mismatched tags
Otherwise, ODRUniquing would map some member method/variable MDNodes
to have enum type DIScope, resulting in invalid debug info and bad
DWARF.

- Add a Verifier check that when a 'scope:' operand is an ODR type that is not an enum.
- Makes ODRUniquing apply to only ODR types with the same tag so that the debuginfo/DWARF is well-formed.

Reviewed By: probinson, aprantl

Differential Revision: https://reviews.llvm.org/D111770
2021-10-26 15:28:25 -07:00
Jakub Kuderski 763ae1d2c6 [DomTree][NFC] Clean up nits in DomTree code
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D112482
2021-10-25 16:05:34 -04:00
Itay Bookstein 08ed216000 [IR] Refactor GlobalIFunc to inherit from GlobalObject, Remove GlobalIndirectSymbol
As discussed in:
* https://reviews.llvm.org/D94166
* https://lists.llvm.org/pipermail/llvm-dev/2020-September/145031.html

The GlobalIndirectSymbol class lost most of its meaning in
https://reviews.llvm.org/D109792, which disambiguated getBaseObject
(now getAliaseeObject) between GlobalIFunc and everything else.
In addition, as long as GlobalIFunc is not a GlobalObject and
getAliaseeObject returns GlobalObjects, a GlobalAlias whose aliasee
is a GlobalIFunc cannot currently be modeled properly. Creating
aliases for GlobalIFuncs does happen in the wild (e.g. glibc). In addition,
calling getAliaseeObject on a GlobalIFunc will currently return nullptr,
which is undesirable because it should return the object itself for
non-aliases.

This patch refactors the GlobalIFunc class to inherit directly from
GlobalObject, and removes GlobalIndirectSymbol (while inlining the
relevant parts into GlobalAlias and GlobalIFunc). This allows for
calling getAliaseeObject() on a GlobalIFunc to return the GlobalIFunc
itself, making getAliaseeObject() more consistent and enabling
alias-to-ifunc to be properly modeled in the IR.

I exercised some judgement in the API clients of GlobalIndirectSymbol:
some were 'monomorphized' for GlobalAlias and GlobalIFunc, and
some remained shared (with the type adapted to become GlobalValue).

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D108872
2021-10-20 10:29:47 -07:00
Nikita Popov 274b2439f8 [ConstantRange] Add fast signed multiply
The multiply() implementation is very slow -- it performs six
multiplications in double the bitwidth, which means that it will
typically work on allocated APInts and bypass fast-path
implementations. Add an additional implementation that doesn't
try to produce anything better than a full range if overflow is
possible. At least for the BasicAA use-case, we really don't care
about more precise modeling of overflow behavior. The current
use of multiply() is fine while the implementation is limited to
a single index, but extending it to the multiple-index case makes
the compile-time impact untenable.
2021-10-17 16:41:49 +02:00
Nikita Popov 587493b441 [ConstantRange] Compute precise shl range for single elements
For the common case where the shift amount is constant (a single
element range) we can easily compute a precise range (up to
unsigned envelope), so do that.
2021-10-15 23:44:41 +02:00
Nikita Popov 9eb8040a28 [ConstantRange] Support checking optimality for subset of inputs (NFC)
We always want to check correctness, but for some operations we
can only guarantee optimality for a subset of inputs. Accept an
additional predicate that determines whether optimality for a
given pair of ranges should be checked.
2021-10-15 22:48:07 +02:00
Nikita Popov 82e858d1bf [ConstantRange] Better diagnostic for correctness test failure (NFC)
Print a friendly error message including the inputs, result and
not-contained element if an exhaustive correctness test fails,
same as we do if the optimality test fails.
2021-10-15 21:52:17 +02:00
Sanjay Patel 519752062c [PatternMatch] add matchers for commutative logical and/or
We need these to add folds with the same structure as
regular commuted logic ops.
2021-10-07 10:37:34 -04:00
Arthur Eubanks 05392466f0 Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.

This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.

The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.

Updating clang's max allowed alignment will come in a future patch.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D110451
2021-10-06 13:29:23 -07:00
Arthur Eubanks 569346f274 Revert "Reland [IR] Increase max alignment to 4GB"
This reverts commit 8d64314ffe.
2021-10-06 11:38:11 -07:00
Arthur Eubanks 8d64314ffe Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.

This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.

The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.

Updating clang's max allowed alignment will come in a future patch.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D110451
2021-10-06 11:03:51 -07:00
Arthur Eubanks 72cf8b6044 Revert "[IR] Increase max alignment to 4GB"
This reverts commit df84c1fe78.

Breaks some bots
2021-10-06 10:21:35 -07:00
Arthur Eubanks df84c1fe78 [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.

This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.

The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.

Updating clang's max allowed alignment will come in a future patch.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D110451
2021-10-06 09:54:14 -07:00
Kazu Hirata 3081de8c72 [llvm] Migrate from getNumArgOperands to arg_size (NFC)
Note that getNumArgOperands is considered a legacy name.  See
llvm/include/llvm/IR/InstrTypes.h for details.
2021-10-05 08:29:19 -07:00
Jay Foad a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Min-Yih Hsu 475de8da01 [IR]PATCH 2/2: Add MDNode::printTree and dumpTree
This patch adds the functionalities to print MDNode in tree shape. For
example, instead of printing a MDNode like this:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
```
The printTree/dumpTree functions can give you:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
  <0x5643e11c9740> = distinct !DISubprogram(scope: null, spFlags: 0)
  <0x5643e11c6ec0> = distinct !DIFile(filename: "file.c", directory: "/path/to/dir")
  <0x5643e11ca8e0> = distinct !DIDerivedType(tag: DW_TAG_pointer_type, baseType: <0x5643e11668d8>, size: 1, align: 2)
    <0x5643e11668d8> = !DIBasicType(tag: DW_TAG_unspecified_type, name: "basictype")
```
Which is useful when using it in debugger. Where sometimes printing the
whole module to see all MDNodes is too expensive.

Differential Revision: https://reviews.llvm.org/D110113
2021-10-02 21:19:52 -07:00
Kazu Hirata f631173d80 [llvm] Migrate from arg_operands to args (NFC)
Note that arg_operands is considered a legacy name.  See
llvm/include/llvm/IR/InstrTypes.h for details.
2021-09-30 08:51:21 -07:00
Simon Moll 72a08c0b94 [VP] Vector predicated vector splice intrinsic
This patch introduces the vector-predicated version of the
experimental_vector_splice intrinsic [1] at the IR level. It considers
the active vector length for both vectors and and uses a vector mask to
disable certain lanes in the result.

[1] https://reviews.llvm.org/D94708

Change originally authored by Vineet Kumar <vineet.kumar@bsc.es>

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D103898
2021-09-29 10:43:36 +02:00
hyeongyu kim 86bf234d0b [IR] Change the default value of InstertElement to poison (1/4)
This patch is for fixing potential insertElement-related bugs like D93818.
```
V = UndefValue::get(VecTy);
for(...)
  V = Builder.CreateInsertElementy(V, Elt, Idx);
=>
V = PoisonValue::get(VecTy);
for(...)
  V = Builder.CreateInsertElementy(V, Elt, Idx);
```
Like above, this patch changes the placeholder V to poison.
The patch will be separated into several commits.

Reviewed By: aqjune

Differential Revision: https://reviews.llvm.org/D110311
2021-09-28 22:29:16 +09:00
Anirudh Prasad e09a1dc475 [SystemZ][z/OS] Add GOFF Support to the DataLayout
- This patch adds in the GOFF mangling support to the LLVM data layout string. A corresponding additional line has been added into the data layout section in the language reference documentation.
- Furthermore, this patch also sets the right data layout string for the z/OS target in the SystemZ backend.

Reviewed By: uweigand, Kai, abhina.sreeskantharajan, MaskRay

Differential Revision: https://reviews.llvm.org/D109362
2021-09-24 14:09:01 -04:00
Alok Kumar Sharma a5b72abc9e [DebugInfo] Enhance DIImportedEntity to accept children entities
New field `elements` is added to '!DIImportedEntity', representing
list of aliased entities.
This is needed to dump optimized debugging information where all names
in a module are imported, but a few names are imported with overriding
aliases.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D109343
2021-09-16 10:41:55 +05:30
Nikita Popov 90ec6dff86 [OpaquePtr] Forbid mixing typed and opaque pointers
Currently, opaque pointers are supported in two forms: The
-force-opaque-pointers mode, where all pointers are opaque and
typed pointers do not exist. And as a simple ptr type that can
coexist with typed pointers.

This patch removes support for the mixed mode. You either get
typed pointers, or you get opaque pointers, but not both. In the
(current) default mode, using ptr is forbidden. In -opaque-pointers
mode, all pointers are opaque.

The motivation here is that the mixed mode introduces additional
issues that don't exist in fully opaque mode. D105155 is an example
of a design problem. Looking at D109259, it would probably need
additional work to support mixed mode (e.g. to generate GEPs for
typed base but opaque result). Mixed mode will also end up
inserting many casts between i8* and ptr, which would require
significant additional work to consistently avoid.

I don't think the mixed mode is particularly valuable, as it
doesn't align with our end goal. The only thing I've found it to
be moderately useful for is adding some opaque pointer tests in
between typed pointer tests, but I think we can live without that.

Differential Revision: https://reviews.llvm.org/D109290
2021-09-10 15:18:23 +02:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Simon Moll ea2cdbf5e6 [VP] Declaration and docs for vp.select intrinsic
llvm.vp.select extends the regular select instruction with an explicit
vector length (%evl).

All lanes with indexes at and above %evl are
undefined. Lanes below %evl are taken from the first input where the
mask is true and from the second input otherwise.

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D105351
2021-09-02 11:17:14 +02:00
Yonghong Song 1bebc31c61 [DebugInfo] generate btf_tag annotations for func parameters
Generate btf_tag annotations for function parameters.
A field "annotations" is introduced to DILocalVariable, and
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates how
annotations are encoded in IR:
    distinct !DILocalVariable(name: "info",, arg: 1, ..., annotations: !10)
    !10 = !{!11, !12}
    !11 = !{!"btf_tag", !"a"}
    !12 = !{!"btf_tag", !"b"}

Differential Revision: https://reviews.llvm.org/D106620
2021-08-26 14:18:30 -07:00
Yonghong Song 30c288489a [DebugInfo] generate btf_tag annotations for DIGlobalVariable
Generate btf_tag annotations for DIGlobalVariable.
A field "annotations" is introduced to DIGlobalVariable, and
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates how
annotations are encoded in IR:
    distinct !DIGlobalVariable(..., annotations: !10)
    !10 = !{!11, !12}
    !11 = !{!"btf_tag", !"a"}
    !12 = !{!"btf_tag", !"b"}

Differential Revision: https://reviews.llvm.org/D106619
2021-08-26 10:03:44 -07:00
Roman Lebedev 564d85e090
The maximal representable alignment in LLVM IR is 1GiB, not 512MiB
In LLVM IR, `AlignmentBitfieldElementT` is 5-bit wide
But that means that the maximal alignment exponent is `(1<<5)-2`,
which is `30`, not `29`. And indeed, alignment of `1073741824`
roundtrips IR serialization-deserialization.

While this doesn't seem all that important, this doubles
the maximal supported alignment from 512MiB to 1GiB,
and there's actually one noticeable use-case for that;
On X86, the huge pages can have sizes of 2MiB and 1GiB (!).

So while this doesn't add support for truly huge alignments,
which i think we can easily-ish do if wanted, i think this adds
zero-cost support for a not-trivially-dismissable case.

I don't believe we need any upgrade infrastructure,
and since we don't explicitly record the IR version,
we don't need to bump one either.

As @craig.topper speculates in D108661#2963519,
this might be an artificial limit imposed by the original implementation
of the `getAlignment()` functions.

Differential Revision: https://reviews.llvm.org/D108661
2021-08-26 12:53:39 +03:00
Yonghong Song 0b32dca12e Reland [DebugInfo] generate btf_tag annotations for DIComposite types
Clang patch D106614 added attribute btf_tag support. This patch
generates btf_tag annotations for DIComposite types.
A field "annotations" is introduced to DIComposite, and the
annotations are represented as an DINodeArray, similar to
DIComposite elements. The following example illustrates
how annotations are encoded in IR:
  distinct !DICompositeType(..., annotations: !10)
  !10 = !{!11, !12}
  !11 = !{!"btf_tag", !"a"}
  !12 = !{!"btf_tag", !"b"}
Each btf_tag annotation is represented as a 2D array of
meta strings. Each record may have more than one
btf_tag annotations, as in the above example.

Reland with additional fixes for llvm/unittests/IR/DebugTypeODRUniquingTest.cpp.

Differential Revision: https://reviews.llvm.org/D106615
2021-08-19 17:33:50 -07:00
Arthur Eubanks 7c8206cd2a [NFC] Cleanup AttributeList::getStackAlignment()
So that we don't use a confusing index.
2021-08-19 14:21:40 -07:00
Arthur Eubanks 3f4d00bc3b [NFC] More get/removeAttribute() cleanup 2021-08-17 21:05:41 -07:00
Arthur Eubanks de0ae9e89e [NFC] Cleanup more AttributeList::addAttribute() 2021-08-17 21:05:41 -07:00
Arthur Eubanks ad727ab7d9 [NFC] Migrate some callers away from Function/AttributeLists methods that take an index
These methods can be confusing.
2021-08-17 21:05:40 -07:00
Fraser Cormack f3e9047249 [VP] Add vector-predicated reduction intrinsics
This patch adds vector-predicated ("VP") reduction intrinsics corresponding to
each of the existing unpredicated `llvm.vector.reduce.*` versions. Unlike the
unpredicated reductions, all VP reductions have a start value. This start value
is returned when the no vector element is active.

Support for expansion on targets without native vector-predication support is
included.

This patch is based on the ["reduction
slice"](https://reviews.llvm.org/D57504#1732277) of the LLVM-VP reference patch
(https://reviews.llvm.org/D57504).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D104308
2021-08-17 17:56:35 +01:00
Arthur Eubanks d7593ebaee [NFC] Clean up users of AttributeList::hasAttribute()
AttributeList::hasAttribute() is confusing, use clearer methods like
hasParamAttr()/hasRetAttr().

Add hasRetAttr() since it was missing from AttributeList.
2021-08-13 11:59:18 -07:00
Arthur Eubanks a9831cce1e [NFC] Remove public uses of AttributeList::getAttributes()
Use methods that better convey the intent.
2021-08-13 11:38:12 -07:00
Arthur Eubanks 80ea2bb574 [NFC] Rename AttributeList::getParam/Ret/FnAttributes() -> get*Attributes()
This is more consistent with similar methods.
2021-08-13 11:16:52 -07:00
Arthur Eubanks 92ce6db9ee [NFC] Rename AttributeList::hasFnAttribute() -> hasFnAttr()
This is more consistent with similar methods.
2021-08-13 11:09:18 -07:00
Arthur Eubanks a0c42ca56c [NFC] Remove AttributeList::hasParamAttribute()
It's the same as AttributeList::hasParamAttr().
2021-08-13 10:58:21 -07:00
Paul Robinson 75aa3d520d Add a DIExpression const-folder to prevent silly expressions.
It's entirely possible (because it actually happened) for a bool
variable to end up with a 256-bit DW_AT_const_value.  This came about
when a local bool variable was initialized from a bitfield in a
32-byte struct of bitfields, and after inlining and constant
propagation, the variable did have a constant value. The sequence of
optimizations had it carrying "i256" values around, but once the
constant made it into the llvm.dbg.value, no further IR changes could
affect it.

Technically the llvm.dbg.value did have a DIExpression to reduce it
back down to 8 bits, but the compiler is in no way ready to emit an
oversized constant *and* a DWARF expression to manipulate it.
Depending on the circumstances, we had either just the very fat bool
value, or an expression with no starting value.

The sequence of optimizations that led to this state did seem pretty
reasonable, so the solution I came up with was to invent a DWARF
constant expression folder.  Currently it only does convert ops, but
there's no reason it couldn't do other ops if that became useful.

This broke three tests that depended on having convert ops survive
into the DWARF, so I added an operator that would abort the folder to
each of those tests.

Differential Revision: https://reviews.llvm.org/D106915
2021-08-05 06:14:40 -07:00
Dylan Fleming 80e0bd1496 [SVE][IR] Fix Binary op matching in PatternMatch::m_VScale
Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D105978
2021-07-23 11:39:13 +01:00
Guillaume Chatelet d6da02d952 [llvm] Add enum iteration to Sequence
This patch allows iterating typed enum via the ADT/Sequence utility.

It also changes the original design to better separate concerns:
 - `StrongInt` only deals with safe `intmax_t` operations,
 - `SafeIntIterator` presents the iterator and reverse iterator
 interface but only deals with safe `StrongInt` internally.
 - `iota_range` only deals with `SafeIntIterator` internally.

 This design ensures that operations are always valid. In particular,
 "Out of bounds" assertions fire when:
  - the `value_type` is not representable as an `intmax_t`
  - iterator operations make internal computation underflow/overflow
  - the internal representation cannot be converted back to `value_type`

Differential Revision: https://reviews.llvm.org/D106279
2021-07-21 12:48:53 +00:00
Fraser Cormack 5cbd5c62be [VP][NFC] Correct formatting in unit test 2021-07-15 12:38:47 +01:00
Guillaume Chatelet 2c47b8847e Revert "[llvm] Add enum iteration to Sequence"
This reverts commit a006af5d6e.
2021-07-13 16:44:42 +00:00
Guillaume Chatelet a006af5d6e [llvm] Add enum iteration to Sequence
This patch allows iterating typed enum via the ADT/Sequence utility.

Differential Revision: https://reviews.llvm.org/D103900
2021-07-13 16:22:19 +00:00
Nikita Popov 7ed3e87825 [Attributes] Determine attribute properties from TableGen data
Continuing from D105763, this allows placing certain properties
about attributes in the TableGen definition. In particular, we
store whether an attribute applies to fn/param/ret (or a combination
thereof). This information is used by the Verifier, as well as the
ForceFunctionAttrs pass. I also plan to use this in LLParser,
which also duplicates info on which attributes are valid where.

This keeps metadata about attributes in one place, and makes it
more likely that it stays in sync, rather than in various
functions spread across the codebase.

Differential Revision: https://reviews.llvm.org/D105780
2021-07-12 22:13:38 +02:00
Arthur Eubanks d338e79a4c [OpaquePtr] Remove checking pointee type for byval/preallocated type
These currently always require a type parameter. The bitcode reader
already upgrades old bitcode without the type parameter to use the
pointee type.

In cases where the caller does not have byval but the callee does, we
need to follow CallBase::paramHasAttr() and also look at the callee for
the byval type so that CallBase::isByValArgument() and
CallBase::getParamByValType() are in sync. Do the same for preallocated.

While we're here add a corresponding version for inalloca since we'll
need it soon.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D104663
2021-07-07 14:28:55 -07:00
Hussain Kadhem d21a35ac0a [VP] Implementation of intrinsic and SDNode definitions for VP load, store, gather, scatter.
This patch adds intrinsic definitions and SDNodes for predicated
load/store/gather/scatter, based on the work done in D57504.

Reviewed By: simoll, craig.topper

Differential Revision: https://reviews.llvm.org/D99355
2021-07-01 13:34:44 +02:00
Fraser Cormack 983972bfb0 [VP][NFCI] Address various clang-tidy warnings
Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D104288
2021-06-21 10:57:42 +01:00
Simon Moll 74d45b884c [VP] Binary floating-point intrinsics.
This patch implements vector-predicated intrinsics on IR level for fadd,
fsub, fmul, fdiv and frem.  There operate in the default floating-point
environment. We will use constrained fp operand bundles for constrained
vector-predicated fp math (D93455).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D93470
2021-06-14 08:51:41 +02:00
Simon Moll 0f9d299122 [VP] getDeclarationForParams
`VPIntrinsic::getDeclarationForParams` creates a vp intrinsic
declaration for parameters you want to call it with.  This is in
preparation of a new builder class that makes emitting vp intrinsic code
nearly as convenient as using a plain ir builder (aka `VectorBuilder`,
to be used by D99750).

Reviewed By: frasercrmck, craig.topper, vkmr

Differential Revision: https://reviews.llvm.org/D102686
2021-06-08 14:21:28 +02:00
Arthur Eubanks 8961293851 [OpaquePtr] Create API to make a copy of a PointerType with some address space
Some existing places use getPointerElementType() to create a copy of a
pointer type with some new address space.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103429
2021-06-01 16:52:32 -07:00
Craig Topper 2830d924b0 [VP] Make getMaskParamPos/getVectorLengthParamPos return unsigned. Lowercase function names.
Parameter positions seem like they should be unsigned.

While there, make function names lowercase per coding standards.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D103224
2021-05-28 11:28:47 -07:00
Simon Moll 66963bf381 [VP] make getFunctionalOpcode return an Optional
The operation of some VP intrinsics do/will not map to regular
instruction opcodes.  Returning 'None' seems more intuitive here than
'Instruction::Call'.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D102778
2021-05-19 17:08:34 +02:00
Benjamin Kramer 05de4b4139 Put back the trailing commas on TYPED_TEST_SUITE
This avoids a -pedantic warning:
warning: ISO C++11 requires at least one argument for the "..." in a variadic macro

See also https://github.com/google/googletest/issues/2271
2021-05-17 14:14:13 +02:00
Benjamin Kramer 34fa3f8733 Clean up uses of gmock Invoke in an attempt to make it work with GCC 6.2. NFCI. 2021-05-17 13:48:45 +02:00