There are two different senses in which a block can be "address-taken".
There can be a BlockAddress involved, which means we need to map the
IR-level value to some specific block of machine code. Or there can be
constructs inside a function which involve using the address of a basic
block to implement certain kinds of control flow.
Mixing these together causes a problem: if target-specific passes are
marking random blocks "address-taken", if we have a BlockAddress, we
can't actually tell which MachineBasicBlock corresponds to the
BlockAddress.
So split this into two separate bits: one for BlockAddress, and one for
the machine-specific bits.
Discovered while trying to sort out related stuff on D102817.
Differential Revision: https://reviews.llvm.org/D124697
r175673 changed repairIntervalsInRange to find anchoring end points for
ranges automatically, but the calculation of Begin included the first
instruction found that already had an index. This patch changes it to
exclude that instruction:
1. For symmetry, so that the half open range [Begin,End) only includes
instructions that do not already have indexes.
2. As a possible performance improvement, since repairOldRegInRange
will scan fewer instructions.
3. Because repairOldRegInRange hits assertion failures in some cases
when it sees a def that already has a live interval.
(3) fixes about ten tests in the CodeGen lit test suite when
-early-live-intervals is forced on.
Differential Revision: https://reviews.llvm.org/D110182
-march=hexagon uses the default target triple and changes the arch part of
hexagon. On linux-musl, this essentially becomes hexagon-unknown-linux-musl
which has different code generation. Use -mtriple instead.
Link: https://github.com/llvm/llvm-project/issues/48936
Such vector types cannot be split at the moment, because splitting expects
an even number of elements in the source type. If a target marks such a type
as "split", TargetLoweringBase::computeRegisterProperties will override it
with widening to the next power of 2. This could lead to issues during
instruction selection when conflicting information about how to handle this
type is present.
The test specifies a CPU arch but not a particular OS. So if run on
OpenBSD it acts as if it's an OpenBSD/hexagon system. OpenBSD uses
__guard_local instead of __stack_chk_guard so the test will fail. So
specify an OS other than OpenBSD fixes the test.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D126265
At some point in instruction selection, A2_tfrsi Constant:i32<...> was
created, where the "Constant" came from SelectAnyInt. Since it wasn't
a TargetConstant, it was selected again, leading to
%vreg = A2_tfrsi ...
... = A2_tfrsi %vreg
which is not a valid code.
Place PersistentId declaration under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to
reduce memory usage when it is not needed.
Differential Revision: https://reviews.llvm.org/D120714
This is a fix for a crash in the HexagonOptAddrMode pass that was looking
for the third operand (offset) in the following instruction that does not,
in fact, have a third operand:
$r1 = L2_loadw_locked $r1
Additionally, this patch also adds an addrMode value to vgather pseudos
in the Hexagon backend.
Differential Revision: https://reviews.llvm.org/D117133
This commit fixes a missed opportunity in merging consecutive stores.
The code that searches for stores skipped the case of stores that
directly connect to the root. The comment above the implementation lists
this case but the code did not handle it. I found this pattern when
looking into the shared_ptr destructor. GCC generates the right
sequence. Here is a small repo:
int foo(int* buff) {
buff[0] = 0;
int x = buff[1];
buff[1] = 0;
return x;
}
Differential Revision: https://reviews.llvm.org/D116895
Lower select(I1,Q,Q) by converting vector predicate Q to vector register V,
doing select(I1,V,V), and then converting the resulting V back to Q. Also,
try to avoid creating such situations in the first place.
This change extends the addressing mode optimization
pass to HVX vgather. This is specifically intended to
resolve compiler not generating indexed addresses for
vgather stores to vtcm. Changed the vgather pseudo
instructions to accept an immediate operand and handled
addition of appropriate immediate operand in addressing
mode optimization pass.
Ideally we should make USR as Def for these floating point instructions.
However, it violates some assembler MCChecker rules. This patch fixes
the issue by marking these FP instructions as non-sinkable.
For code below:
{
r7 = addasl(r3,r0,#2)
r8 = addasl(r3,r2,#2)
r5 = memw(r3+r0<<#2)
r6 = memw(r3+r2<<#2)
}
{
p1 = cmp.gtu(r6,r5)
if (p1.new) memw(r8+#0) = r5
if (p1.new) memw(r7+#0) = r6
}
{
r0 = mux(p1,r2,r4)
}
In packetizer, a new packet is created for the cmp instruction since
there arent enough resources in previous packet. Also it is determined
that the cmp stalls by 2 cycles since it depends on the prior load of r5.
In current packetizer implementation, the predicated store is evaluated
for whether it can go in the same packet as compare, and since the compare
stalls, the stall of the predicated store does not matter and it can go in
the same packet as the cmp. However the predicated store will stall for
more cycles because of its dependence on the addasl instruction and to
avoid that stall we can put it in a new packet.
Improve the packetizer to check if an instruction being added to packet
will stall longer than instruction already in packet and if so create a
new packet.
The HexagonVectorCombine pass was moving an instruction
incorrectly, which caused a use in a GEP that was not yet
defined.
HexagonVectorCombine removes a load from a group due to its
dependences, but in realignGroup, the load is processed anyways.
In realignGroup, when determining the maximum alignment, only
those instructions still in the group should be considered.
This reverts commit ba07f300c6.
A build-vector sequence is made of pairs: rotate+insert. When constructing
a single vector, this results in a chain of 2*N instructions. The rotate
operation is a permute operation, but the insert uses a multiplication
resource: insert and rotate can execute in the same cycle, but obviously
they cannot operate on the same vector. The original halving idea is still
beneficial since it does allow for insert/rotate overlap, and for hiding
insert's latency.
For vectors with repeating values, old codegen would rotate and insert
every duplicate element. This patch replaces that behavior with a splat
of the most common element, vinsert/vror only occur when needed.
They are the same as for the other HVX vectors, but types need to be
listed explicitly. Also, add a detailed codegen testcase.
Co-authored-by: Abhikrant Sharma <quic_abhikran@quicinc.com>
Adds x-ray support for hexagon to llvm codegen, clang driver,
compiler-rt libs.
Differential Revision: https://reviews.llvm.org/D113638
Reapplying this after 543a9ad7c4,
which fixes the leak introduced there.