SelectionDAG's equivalents in ISD::InputArg/OutputArg track the
original argument index. Mips relies on this, and its currently
reinventing its own parallel CallLowering infrastructure which tracks
these indexes on the side. Add this to help move towards deleting the
custom mips handling.
This enables proper lowering of non-byte sized loads. We still aren't
faithfully preserving memory types everywhere, so the legality checks
still only consider the size.
This will currently accept the old number of bytes syntax, and convert
it to a scalar. This should be removed in the near future (I think I
converted all of the tests already, but likely missed a few).
Not sure what the exact syntax and policy should be. We can continue
printing the number of bytes for non-generic instructions to avoid
test churn and only allow non-scalar types for generic instructions.
This will currently print the LLT in parentheses, but accept parsing
the existing integers and implicitly converting to scalar. The
parentheses are a bit ugly, but the parser logic seems unable to deal
without either parentheses or some keyword to indicate the start of a
type.
This patch relands https://reviews.llvm.org/D104454, but fixes some failing
builds on Mac OS which apparently has a different definition for size_t,
that caused 'ambiguous operator overload' for the implicit conversion
of TypeSize to a scalar value.
This reverts commit b732e6c9a8.
To reflect that the size may be scalable, a TypeSize is returned
instead of an unsigned. In places where the result is used,
it currently relies on an implicit cast of TypeSize -> uint64_t,
which asserts that the type is not scalable.
This patch is NFC for fixed-width vectors.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D104454
This also adds new interfaces for the fixed- and scalable case:
* LLT::fixed_vector
* LLT::scalable_vector
The strategy for migrating to the new interfaces was as follows:
* If the new LLT is a (modified) clone of another LLT, taking the
same number of elements, then use LLT::vector(OtherTy.getElementCount())
or if the number of elements is halfed/doubled, it uses .divideCoefficientBy(2)
or operator*. That is because there is no reason to specifically restrict
the types to 'fixed_vector'.
* If the algorithm works on the number of elements (as unsigned), then
just use fixed_vector. This will need to be fixed up in the future when
modifying the algorithm to also work for scalable vectors, and will need
then need additional tests to confirm the behaviour works the same for
scalable vectors.
* If the test used the '/*Scalable=*/true` flag of LLT::vector, then
this is replaced by LLT::scalable_vector.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D104451
This patch aims to add the scalable property to LLT. The rest of the
patch-series changes the interfaces to take/return ElementCount and
TypeSize, which both have the ability to represent the scalable property.
The changes are mostly mechanical and aim to be non-functional changes
for fixed-width vectors.
For scalable vectors some unit tests have been added, but no effort has
been put into making any of the GlobalISel algorithms work with scalable
vectors yet. That will be left as future work.
The work is split into a series of 5 patches to make reviews easier.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D104450
- Distinct metadata needs generating in the codegen to attach correct
AAInfo on the loads/stores after lowering, merging, and other relevant
transformations.
- This patch adds 'MachhineModuleSlotTracker' to help assign slot
numbers to these newly generated unnamed metadata nodes.
- To help 'MachhineModuleSlotTracker' track machine metadata, the
original 'SlotTracker' is rebased from 'AbstractSlotTrackerStorage',
which provides basic interfaces to create/retrive metadata slots. In
addition, once LLVM IR is processsed, additional hooks are also
introduced to help collect machine metadata and assign them slot
numbers.
- Finally, if there is any such machine metadata, 'MIRPrinter' outputs
an additional 'machineMetadataNodes' field containing all the
definition of those nodes.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D103205
G_INSERT legalization is incomplete and doesn't work very
well. Instead try to use sequences of G_MERGE_VALUES/G_UNMERGE_VALUES
padding with undef values (although this can get pretty large).
For the case of load/store narrowing, this is still performing the
load/stores in irregularly sized pieces. It might be cleaner to split
this down into equal sized pieces, and rely on load/store merging to
optimize it.
Fixes getTypeConversion to return `TypeScalarizeScalableVector` when a scalable vector
type cannot be legalized by widening/splitting. When this is the method of legalization
found, getTypeLegalizationCost will return an Invalid cost.
The getMemoryOpCost, getMaskedMemoryOpCost & getGatherScatterOpCost functions already call
getTypeLegalizationCost and will now also return an Invalid cost for unsupported types.
Reviewed By: sdesmalen, david-arm
Differential Revision: https://reviews.llvm.org/D102515
It's still in use in a few places so we can't delete it yet but there's not
many at this point.
Differential Revision: https://reviews.llvm.org/D103352
This makes it possible for targets to define their own MCObjectFileInfo.
This MCObjectFileInfo is then used to determine things like section alignment.
This is a follow up to D101462 and prepares for the RISCV backend defining the
text section alignment depending on the enabled extensions.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101921
We're trying to move DebugLogging into instrumentation, rather than
being part of PassManagers/AnalysisManagers.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D102093
This untangles the MCContext and the MCObjectFileInfo. There is a circular
dependency between MCContext and MCObjectFileInfo. Currently this dependency
also exists during construction: You can't contruct a MOFI without a MCContext
without constructing the MCContext with a dummy version of that MOFI first.
This removes this dependency during construction. In a perfect world,
MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the
MCContext, like other MC information. This is future work.
This also shifts/adds more information to the MCContext making it more
available to the different targets. Namely:
- TargetTriple
- ObjectFileType
- SubtargetInfo
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101462
This utility allows more efficient start of pattern match.
Often MachineInstr(MI) is available and instead of using
mi_match(MI.getOperand(0).getReg(), MRI, ...) followed by
MRI.getVRegDef(Reg) that gives back MI we now use
mi_match(MI, MRI, ...).
Differential Revision: https://reviews.llvm.org/D99735
ConstantFoldingMIRBuilder was an experiment which is not used for
anything. The constant folding functionality is now part of
CSEMIRBuilder.
Differential Revision: https://reviews.llvm.org/D101050
Change the definition of G_SBFX and G_UBFX so that the lsb and width
can have different types than the src and dst operands.
Differential Revision: https://reviews.llvm.org/D99739
In order to bring up scalable vector support in LLVM incrementally,
we introduced behaviour to emit a warning, instead of an error, when
asking the wrong question of a scalable vector, like asking for the
fixed number of elements.
This patch puts that behaviour under a flag. The default behaviour is
that the compiler will always error, which means that all LLVM unit
tests and regression tests will now fail when a code-path is taken that
still uses the wrong interface.
The behaviour to demote an error to a warning can be individually enabled
for tools that want to support experimental use of scalable vectors.
This patch enables that behaviour when driving compilation from Clang.
This means that for users who want to try out scalable-vector support,
fixed-width codegen support, or build user-code with scalable vector
intrinsics, Clang will not crash and burn when the compiler encounters
such a case.
This allows us to do away with the following pattern in many of the SVE tests:
RUN: .... 2>%t
RUN: cat %t | FileCheck --check-prefix=WARN
WARN-NOT: warning: ...
The behaviour to emit warnings is only temporary and we expect this flag
to be removed in the future when scalable vector support is more stable.
This patch also has fixes the following tests:
unittests:
ScalableVectorMVTsTest.SizeQueries
SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects
AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG
regression tests:
Transforms/InstCombine/vscale_gep.ll
Reviewed By: paulwalker-arm, ctetreau
Differential Revision: https://reviews.llvm.org/D98856
Also, make it structurally required so it can't be forgotten and re-introduce
the bug that led to the rotten green tests.
Differential Revision: https://reviews.llvm.org/D99692
This patch adds a new llvm.experimental.stepvector intrinsic,
which takes no arguments and returns a linear integer sequence of
values of the form <0, 1, ...>. It is primarily intended for
scalable vectors, although it will work for fixed width vectors
too. It is intended that later patches will make use of this
new intrinsic when vectorising induction variables, currently only
supported for fixed width. I've added a new CreateStepVector
method to the IRBuilder, which will generate a call to this
intrinsic for scalable vectors and fall back on creating a
ConstantVector for fixed width.
For scalable vectors this intrinsic is lowered to a new ISD node
called STEP_VECTOR, which takes a single constant integer argument
as the step. During lowering this argument is set to a value of 1.
The reason for this additional argument at the codegen level is
because in future patches we will introduce various generic DAG
combines such as
mul step_vector(1), 2 -> step_vector(2)
add step_vector(1), step_vector(1) -> step_vector(2)
shl step_vector(1), 1 -> step_vector(2)
etc.
that encourage a canonical format for all targets. This hopefully
means all other targets supporting scalable vectors can benefit
from this too.
I've added cost model tests for both fixed width and scalable
vectors:
llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll
llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll
as well as codegen lowering tests for fixed width and scalable
vectors:
llvm/test/CodeGen/AArch64/neon-stepvector.ll
llvm/test/CodeGen/AArch64/sve-stepvector.ll
See this thread for discussion of the intrinsic:
https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html
There is a bunch of similar bitfield extraction code throughout *ISelDAGToDAG.
E.g, ARMISelDAGToDAG, AArch64ISelDAGToDAG, and AMDGPUISelDAGToDAG all contain
code that matches a bitfield extract from an and + right shift.
Rather than duplicating code in the same way, this adds two opcodes:
- G_UBFX (unsigned bitfield extract)
- G_SBFX (signed bitfield extract)
They work like this
```
%x = G_UBFX %y, %lsb, %width
```
Where `lsb` and `width` are
- The least-significant bit of the extraction
- The width of the extraction
This will extract `width` bits from `%y`, starting at `lsb`. G_UBFX zero-extends
the result, while G_SBFX sign-extends the result.
This should allow us to use the combiner to match the bitfield extraction
patterns rather than duplicating pattern-matching code in each target.
Differential Revision: https://reviews.llvm.org/D98464
Suppresses an implicit TypeSize to uint64_t conversion warning.
We might be able to just not offset it since we're writing to a
Fixed stack object, but I wasn't sure so I just did what
DAGTypeLegalizer::IncrementPointer does.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D98736
It is good to have a combined `divrem` instruction when the
`div` and `rem` are computed from identical input operands.
Some targets can lower them through a single expansion that
computes both division and remainder. It effectively reduces
the number of instructions than individually expanding them.
Reviewed By: arsenm, paquette
Differential Revision: https://reviews.llvm.org/D96013
This is recommit of 4c8fb7ddd6.
MIR in one unit test had mismatched types.
For vectors we consider a bit as known if it is the same for all demanded
vector elements (all elements by default). KnownBits BitWidth for vector
type is size of vector element. Add support for G_BUILD_VECTOR.
This allows combines of urem_pow2_to_mask in pre-legalizer combiner.
Differential Revision: https://reviews.llvm.org/D96122
:: (store 1 + 4, addrspace 1)
->
:: (store 1 into undef + 4, addrspace 1)
An offset without a base isn't terribly useful but it's convenient to update
the offset without checking the value. For example, when breaking apart
stores into smaller units
Differential Revision: https://reviews.llvm.org/D97812
For vectors we consider a bit as known if it is the same for all demanded
vector elements (all elements by default). KnownBits BitWidth for vector
type is size of vector element. Add support for G_BUILD_VECTOR.
This allows combines of urem_pow2_to_mask in pre-legalizer combiner.
Differential Revision: https://reviews.llvm.org/D96122
This merges more AMDGPU ABI lowering code into the generic call
lowering. Start cleaning up by factoring away more of the pack/unpack
logic into the buildCopy{To|From}Parts functions. These could use more
improvement, and the SelectionDAG versions are significantly more
complex, and we'll eventually have to emulate all of those cases too.
This is mostly NFC, but does result in some minor instruction
reordering. It also removes some of the limitations with mismatched
sizes the old code had. However, similarly to the merge on the input,
this is forcing gfx6/gfx7 to use the gfx8+ ABI (which is what we
actually want, but SelectionDAG is stuck using the weird emergent
ABI).
This also changes the load/store size for stack passed EVTs for
AArch64, which makes it consistent with the DAG behavior.
remove `Hi` `Lo` argument from `emitDwarfUnitLength`, so we
can make caller of emitDwarfUnitLength easier.
Reviewed By: MaskRay, dblaikie, ikudrin
Differential Revision: https://reviews.llvm.org/D96409
We may need to do some customization for DWARF unit length in DWARF
section headers for some targets for some code generation path.
For example, for XCOFF in assembly path, AIX assembler does not require
the debug section containing its debug unit length in the header.
Move emitDwarfUnitLength to MCStreamer class so that we can do
customization in different Streamers
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D95932
Same implementation as G_SEXT_INREG.
Add a testcase to combine-sext-inreg for a concrete example, and a testcase
to KnownBitsTest.
Differential Revision: https://reviews.llvm.org/D96897
Some of these accidentally disabled tests failed as a result; updated
tests per @qcolombet instructions. A small number needed additional
updates because legalization has actually changed since they were
written.
Found by the Rotten Green Tests project.
Differential Revision: https://reviews.llvm.org/D95257
It's the same as the ZEXT/TRUNC case, except SrcBitWidth is given by the
immediate operand.
Update KnownBitsTest.cpp and a MIR test for a concrete example.
Differential Revision: https://reviews.llvm.org/D95566
These are widened to a wider UADDE/USUBE, with the overflow value
unused, and with the same synthesis of a new overflow value as for the
O operations.
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D95326
Just use the existing `Known.sextInReg` implementation.
- Update KnownBitsTest.cpp.
- Update combine-redundant-and.mir for a more concrete example.
Differential Revision: https://reviews.llvm.org/D95484
The widenScalar implementation for signed and unsigned overflowing
operations were very similar: both are checked by truncating the result
and then re-sign/zero-extending it and checking that it matches the
computed operation.
Using a truncate + zero-extend for the unsigned case instead of manually
producing the AND instruction like before leads to an extra copy
instruction during legalization, but this should be harmless.
Differential Revision: https://reviews.llvm.org/D95035
Add a matcher that checks if the given subpattern has only one non-debug use.
Also improve existing m_OneUse testcase.
Differential Revision: https://reviews.llvm.org/D94705
Revert "Delete llvm::is_trivially_copyable and CMake variable HAVE_STD_IS_TRIVIALLY_COPYABLE"
This reverts commit 4d4bd40b57.
This reverts commit 557b00e0af.
If the size of memory access is unknown, do not use it to analysis. One
example of unknown size memory access is to load/store scalable vector
objects on the stack.
Differential Revision: https://reviews.llvm.org/D91833
Add a convenience matcher which handles
```
G_XOR %not_reg, -1
```
And a convenience matcher which returns true if an integer constant is
all-ones.
Differential Revision: https://reviews.llvm.org/D91459
It's fairly common to need matchers for a specific constant value, or for
common idioms like finding a negated register.
Add
- `m_SpecificICst`, which returns true when matching a specific value..
- `m_ZeroInt`, which returns true when an integer 0 is matched.
- `m_Neg`, which returns when a register is negated.
Also update a few places which use idioms related to the new matchers.
Differential Revision: https://reviews.llvm.org/D91397
This lets external consumers customize the output, similar to how
AssemblyAnnotationWriter lets the caller define callbacks when printing
IR. The array of handlers already existed, this just cleans up the code
so that it can be exposed publically.
Replaces https://reviews.llvm.org/D74158
Differential Revision: https://reviews.llvm.org/D89613
A SMLoc allows MCStreamer to report location-aware diagnostics, which
were previously done by adding SMLoc to various methods (e.g. emit*) in an ad-hoc way.
Since the file:line is most important, the column is less important and
the start token location suffices in many cases, this patch reverts
b7e7131af2
```
// old
symbol-binding-changed.s:6:8: error: local changed binding to STB_GLOBAL
.globl local
^
// new
symbol-binding-changed.s:6:1: error: local changed binding to STB_GLOBAL
.globl local
^
```
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D90511
Replace the X86 specific isSplatZeroExtended helper with a generic BuildVectorSDNode method.
I've just used this to simplify the X86ISD::BROADCASTM lowering so far (and remove isSplatZeroExtended), but we should be able to use this in more places to lower to complex broadcast patterns.
Differential Revision: https://reviews.llvm.org/D87930
The EXPECT_XY comparison functions all rely upon using the existing
TypeSize comparison operators, which we are deprecating in favour
of isKnownXY. I've changed all such cases to compare either the known
minimum size or the fixed size.
Differential Revision: https://reviews.llvm.org/D89531
This lets external consumers customize the output, similar to how
AssemblyAnnotationWriter lets the caller define callbacks when printing
IR. The array of handlers already existed, this just cleans up the code
so that it can be exposed publically.
Differential Revision: https://reviews.llvm.org/D74158
If the known shift amount is bigger than or equal to the bitwidth of the type of the value to be shifted,
the result is target dependent, so don't try to infer any bits.
This fixes a crash we've seen in one of our internal test suites.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D89232
If a CSEMIRBuilder query hits the instruction at the current insert point,
move insert point ahead one so that subsequent uses of the builder don't end up with
uses before defs.
This fix also shows that AMDGPU was also affected by this bug often, but got away
with it because it was using a G_IMPLICIT_DEF before the use.
Differential Revision: https://reviews.llvm.org/D88605
When we know that a particular type is always going to be fixed
width we have so far been writing code like this:
getSizeInBits().getFixedSize()
Since we are doing this in quite a few places now it seems to make
sense to add a new helper function that allows us to replace
these calls with a single getFixedSizeInBits() call.
Differential Revision: https://reviews.llvm.org/D88649
Added unittests. In the process, separated core construction - which just
needs the hits, order, and 'HardHints' values - from construction from
current register allocation state, to simplify testing.
Differential Revision: https://reviews.llvm.org/D88455
Fix creation of illegal unmerge when widen was requested to a type which
is not a multiple of the destination type. E.g. when trying to widen
an s48 unmerge to s64 the existing code would create an illegal unmerge
from s64 to s48.
Instead, create further unmerges to a GCD type, then use this to remerge
these intermediate results to the actual destinations.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D88422
After some recent upstream discussion we decided that it was best
to avoid having the / operator for both ElementCount and TypeSize,
since this could give the impression that these classes can be used
in the same way as basic integer integer types. However, division
for scalable types is a bit odd because we are only dividing the
minimum quantity by a value, as opposed to something like:
(MinSize * Vscale) / SomeValue
This is why when performing division it's important the caller
first establishes whether the operation makes sense, perhaps by
calling isKnownMultipleOf() prior to division. The caller must now
explictly call divideCoefficientBy() on the class to perform the
operation.
Differential Revision: https://reviews.llvm.org/D87700
An AsmPrinter should always be provided to the method because some forms
depend on its parameters. The only place in the codebase which passed
a nullptr value was found in the unit tests, so the patch updates it to
use some dummy AsmPrinter instead.
Differential Revision: https://reviews.llvm.org/D85293
This is mostly an NFC patch because the involved methods are used when
emitting DWO files, which is incompatible with DWARFv3, or for platforms
where DWARF64 is not supported yet.
Differential Revision: https://reviews.llvm.org/D87015
The patch also adds a method to choose an appropriate DWARF form
to represent section offsets according to the version and the format
of producing debug info.
Differential Revision: https://reviews.llvm.org/D87014
DW_FORM_sec_offset and DW_FORM_strp imply values of different sizes with
DWARF32 and DWARF64. The patch fixes DIE value classes to use correct
sizes when emitting their values. For DIELocList it ensures that the
requested DWARF form matches the current DWARF format because that class
uses a method that selects the size automatically.
Differential Revision: https://reviews.llvm.org/D87009
These methods are used to emit values which are 32-bit in DWARF32 and
64-bit in DWARF64. The patch fixes them so that they choose the length
automatically, depending on the DWARF format set in the Context.
Differential Revision: https://reviews.llvm.org/D87008
Halide users reported this here: https://llvm.org/pr46176
I reported the issue to MSVC here:
https://developercommunity.visualstudio.com/content/problem/1179643/msvc-copies-overaligned-non-trivially-copyable-par.html
This codepath is apparently not covered by LLVM's unit tests, so I added
coverage in a unit test.
If we want to support this configuration going forward, it means that is
in general not safe to pass a SmallVector<T, N> by value if alignof(T)
is greater than 4. This doesn't appear to come up often because passing
a SmallVector by value is inefficient and not idiomatic: it copies the
inline storage. In this case, the SmallVector<LLT,4> is captured by
value by a lambda, and the lambda is passed by value into std::function,
and that's how we hit the bug.
Differential Revision: https://reviews.llvm.org/D87475
Use forward declarations and move the include down to dependent files that actually use it.
This also exposes a number of implicit dependencies on KnownBits.h
This relands e9a3d1a401 which was originally
missing linking LLVMSupport into LLMVFileCheck which broke the SHARED_LIBS build.
Original summary:
The actual FileCheck logic seems to be implemented in LLVMSupport. I don't see a
good reason for having FileCheck implemented there as it has a very specific use
while LLVMSupport is a dependency of pretty much every LLVM tool there is. In
fact, the only use of FileCheck I could find (outside the FileCheck tool and the
FileCheck unit test) is a single call in GISelMITest.h.
This moves the FileCheck logic to its own LLVMFileCheck library. This way only
FileCheck and the GlobalISelTests now have a dependency on this code.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D86344
The actual FileCheck logic seems to be implemented in LLVMSupport. I don't see a
good reason for having FileCheck implemented there as it has a very specific use
while LLVMSupport is a dependency of pretty much every LLVM tool there is. In
fact, the only use of FileCheck I could find (outside the FileCheck tool and the
FileCheck unit test) is a single call in GISelMITest.h.
This moves the FileCheck logic to its own LLVMFileCheck library. This way only
FileCheck and the GlobalISelTests now have a dependency on this code.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D86344
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().
Differential Revision: https://reviews.llvm.org/D86065