Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jholewinski, arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76348
Extends the existing support for spilling and restoring the condition
register to the linkage area for 32-bit targets, and enables for AIX.
Differential Revision: https://reviews.llvm.org/D74349
Remove some cumbersome Darwin specific logic for updating the frame
offsets of the condition-register spill slots. The containing function has an
early return if the subtarget is not ELF based which makes the Darwin logic
dead.
Create preprocessor defines for callee saved floating-point register spill
slots, vector register spill slots, and both 32-bit and 64-bit general
purpose register spill slots. This is an NFC refactor to prepare for
adding ABI compliant callee saves and restores for AIX.
Skip the loop over the CalleSavedInfos in 'restoreCalleeSavedRegisters' when
the register is a CR field and we are not targeting 32-bit ELF. This is safe
because:
1) The helper function 'restoreCRs' returns if the target is not 32-bit ELF,
making all the code in the loop related to CR fields dead for every other
subtarget. This code is only called on ELF right now, but the patch
to extend it for AIX also needs to skip 'restoreCRs'.
2) The loop will not otherwise modify the iterator, so the iterator
manipulations at the bottom of the loop end up setting 'I' to its
current value.
This simplifciation allows us to remove one argument from 'restoreCRs'.
Also add a helper function to determine if a register is one of the
callee saved condition register fields.
This patch:
- enable frame pointer for AIX;
- update some of red zone comments;
- add/update testcases;
Differential Revision: https://reviews.llvm.org/D72454
On little endian targets prior to Power9, we spill vector registers using a
swapping store (i.e. stdxvd2x saves the vector with the two doublewords in
big endian order regardless of endianness). This is generally not a problem
since we restore them using the corresponding swapping load (lxvd2x). However
if the restore is done by the unwinder, the vector register contains data in
the incorrect order.
This patch fixes that by using Altivec loads/stores for vector saves and
restores in PEI (which keep the order correct) under those specific conditions:
- EH aware function
- Subtarget requires swaps for VSX memops (Little Endian prior to Power9)
Differential revision: https://reviews.llvm.org/D73692
hasReservedSpillSlot returns a dummy frame index of '0' on PPC64 for the
non-volatile condition registers, which leads to the CalleSavedInfo
either referencing an unrelated stack object, or an invalid object if
there are no stack objects. The latter case causes the mir-printer to
crash due to assertions that checks if the frame index referenced by a
CalleeSavedInfo is valid.
To fix the problem create an immutable FixedStack object at the correct offset
in the linkage area of the previous stack frame (ie SP + positive offset).
Differential Revision: https://reviews.llvm.org/D73709
Extends the desciptor-based indirect call support for 32-bit codegen,
and enables indirect calls for AIX.
In-depth Description:
In a function descriptor based ABI, a function pointer points at a
descriptor structure as opposed to the function's entry point. The
descriptor takes the form of 3 pointers: 1 for the function's entry
point, 1 for the TOC anchor of the module containing the function
definition, and 1 for the environment pointer:
struct FunctionDescriptor {
void *EntryPoint;
void *TOCAnchor;
void *EnvironmentPointer;
};
An indirect call has several steps of loading the the information from
the descriptor into the proper registers for setting up the call. Namely
it has to:
1) Save the caller's TOC pointer into the TOC save slot in the linkage
area, and then load the callee's TOC pointer into the TOC register
(GPR 2 on AIX).
2) Load the function descriptor's entry point into the count register.
3) Load the environment pointer into the environment pointer register
(GPR 11 on AIX).
4) Perform the call by branching on count register.
5) Restore the caller's TOC pointer after returning from the indirect call.
A couple important caveats to the above:
- There is no way to directly load a value from memory into the count register.
Instead we populate the count register by loading the entry point address into
a gpr and then moving the gpr to the count register.
- The TOC restore has to come immediately after the branch on count register
instruction (i.e., the 1st instruction executed after we return from the
call). This is an implementation limitation. We could, in theory, schedule
the restore elsewhere as long as no uses of the TOC pointer fall in between
the call and the restore; however, to keep it simple, we insert a pseudo
instruction that represents both the indirect branch instruction and the
load instruction that restores the caller's TOC from the linkage area. As
they flow through the compiler as a single pseudo instruction, nothing can be
inserted between them and the caller's TOC is then valid at any use.
Differtential Revision: https://reviews.llvm.org/D70724
This adds a flag to LLVM and clang to always generate a .debug_frame
section, even if other debug information is not being generated. In
situations where .eh_frame would normally be emitted, both .debug_frame
and .eh_frame will be used.
Differential Revision: https://reviews.llvm.org/D67216
Summary:
Since the SPE4RC register class contains an identical set of registers
and an identical spill size to the GPRC class its slightly confusing
the tablegen emitter. It's preventing the GPRC_and_GPRC_NOR0 synthesized
register class from inheriting VTs and AltOrders from GPRC or GPRC_NOR0.
This is because SPE4C is found first in the super register class list
when inheriting these properties and it doesn't set the VTs or
AltOrders the same way as GPRC or GPRC_NOR0.
This patch replaces all uses of GPE4RC with GPRC and allows GPRC and
GPRC_NOR0 to contain f32.
The test changes here are because the AltOrders are being inherited
to GPRC_NOR0 now.
Found while trying to determine if getCommonSubClass needs to take
a VT argument. It was originally added to support fp128 on x86-64,
I've changed some things about that so that it might be needed
anymore. But a PowerPC test crashed without it and I think its
due to this subclass issue.
Reviewers: jhibbits, nemanjai, kbarton, hfinkel
Subscribers: wuzish, nemanjai, mehdi_amini, hiraditya, kbarton, MaskRay, dexonsmith, jsji, shchenz, steven.zhang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67513
llvm-svn: 371779
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
Trying again with the code changes (and not just the new test).
Summary:
This patch fixes the offsets of fields in the stack frame linkage save
area for AIX.
Reviewers: sfertile, hubert.reinterpretcast, jasonliu, Xiangling_L, xingxue, ZarkoCA, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64424
Patch by Chris Bowler!
llvm-svn: 368721
In preperation for AIX support in FrameLowering: replace a number of literal
'8' that represent the stack offset of the condition register save area with
a member in PPCFrameLowering.
Patch by Chris Bowler.
llvm-svn: 367111
The indirect call sequence on PPC requires that the TOC base register be saved
prior to the indirect call and restored after the call since the indirect call
may branch to a global entry point in another DSO which will update the TOC
base. Over the last couple of years, we have improved this to:
- be able to hoist TOC saves from loops (with changes to MachineLICM)
- avoid multiple saves when one dominates the other[s]
However, it is still possible to have multiple TOC saves dynamically in the
execution path if there is no dominance relationship between them.
This patch moves the TOC save to the prologue when one of the TOC saves is in a
block that post-dominates entry (i.e. it cannot be avoided) or if it is in a
block that is hotter than entry.
Differential revision: https://reviews.llvm.org/D63803
llvm-svn: 365232
Summary:dd
This patch implements call lowering for calls without parameters
on AIX as initial support.
Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma
Differential Revision: https://reviews.llvm.org/D61948
llvm-svn: 361669
Move the stdu instruction in the prologue and epilogue.
This should provide a small performance boost in functions that are able to do
this. I've kept this change rather conservative at the moment and functions
with frame pointers or base pointers will not try to move the stack pointer
update.
Differential Revision: https://reviews.llvm.org/D42590
llvm-svn: 355085
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Currently in llvm, CalleeSavedInfo can only assign a callee saved register to
stack frame index to be spilled in the prologue. We would like to enable
spilling gprs to vector registers. This patch adds the capability to spill to
other registers aside from just the stack. It also adds the changes for power9
to spill gprs to volatile vector registers when they are available.
This happens only for leaf functions when using the option
-ppc-enable-pe-vector-spills.
Differential Revision: https://reviews.llvm.org/D39386
llvm-svn: 346512
Summary:
The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1,
e500v2, and several e200 cores. This adds support targeting the e500v2,
as this is more common than the e500v1, and is in SoCs still on the
market.
This patch is very intrusive because the SPE is binary incompatible with
the traditional FPU. After discussing with others, the cleanest
solution was to make both SPE and FPU features on top of a base PowerPC
subset, so all FPU instructions are now wrapped with HasFPU predicates.
Supported by this are:
* Code generation following the SPE ABI at the LLVM IR level (calling
conventions)
* Single- and Double-precision math at the level supported by the APU.
Still to do:
* Vector operations
* SPE intrinsics
As this changes the Callee-saved register list order, one test, which
tests the precise generated code, was updated to account for the new
register order.
Reviewed by: nemanjai
Differential Revision: https://reviews.llvm.org/D44830
llvm-svn: 337347
The instruction sequence used to get the address of the PC into a GPR requires
that we clobber the link register. Doing so without having first saved it in
the prologue leaves the function unable to return. Currently, this sequence is
emitted into the entry block. To ensure the prologue is inserted before this
sequence, disable shrink-wrapping.
This fixes PR33547.
Differential Revision: https://reviews.llvm.org/D43677
llvm-svn: 325972
candidates with coldcc attribute.
This recommits r322721 reverted due to sanitizer memory leak build bot failures.
Original commit message:
This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non cold functions.
Differential Revision: https://reviews.llvm.org/D38413
llvm-svn: 323778
candidates with coldcc attribute.
This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non cold functions.
Differential Revision: https://reviews.llvm.org/D38413
llvm-svn: 322721
This patch makes the following changes to the schedule of instructions in the
prologue and epilogue.
The stack pointer update is moved down in the prologue so that the callee saves
do not have to wait for the update to happen.
Saving the lr is moved down in the prologue to hide the latency of the mflr.
The stack pointer is moved up in the epilogue so that restoring of the lr can
happen sooner.
The mtlr is moved up in the epilogue so that it is away form the blr at the end
of the epilogue. The latency of the mtlr can now be hidden by the loads of the
callee saved registers.
This commit is almost identical to this one: r322036 except that two warnings
that broke build bots have been fixed.
The revision number is D41737 as before.
llvm-svn: 322124
This patch makes the following changes to the schedule of instructions in the
prologue and epilogue.
The stack pointer update is moved down in the prologue so that the callee saves
do not have to wait for the update to happen.
Saving the lr is moved down in the prologue to hide the latency of the mflr.
The stack pointer is moved up in the epilogue so that restoring of the lr can
happen sooner.
The mtlr is moved up in the epilogue so that it is away form the blr at the end
of the epilogue. The latency of the mtlr can now be hidden by the loads of the
callee saved registers.
Differential Revision: https://reviews.llvm.org/D41737
llvm-svn: 322036
The function createTailCallBranchInstr assumes that the iterator MBBI is valid.
However, only one use of MBBI is guarded in the function.
Fix this by adding an assert.
Differential Revision: https://reviews.llvm.org/D41358
llvm-svn: 321205
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.
Differential Revision: https://reviews.llvm.org/D36160
llvm-svn: 310619
1. The available program storage region of the red zone to compilers is 288
bytes rather than 244 bytes.
2. The formula for negative number alignment calculation should be
y = x & ~(n-1) rather than y = (x + (n-1)) & ~(n-1).
Differential Revision: https://reviews.llvm.org/D34337
llvm-svn: 307672
The variables MinGPR/MinG8R were not updated properly when resetting the
offsets, which in the included testcase lead to saving the CR register
in the same location as R30.
This fixes another issue reported in PR26519.
Differential Revision: https://reviews.llvm.org/D33017
llvm-svn: 303257
This happened on the PPC32/SVR4 path and was discovered when building
FreeBSD on PPC32. It was a typo-class error in the frame lowering code.
This fixes PR26519.
llvm-svn: 302183
1. RegisterClass::getSize() is split into two functions:
- TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
- TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
- TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;
This will allow making those values depend on subtarget features in the
future.
Differential Revision: https://reviews.llvm.org/D31783
llvm-svn: 301221
This is per function data so it is better kept at the function instead
of the module.
This is a necessary step to have machine module passes work properly.
Differential Revision: https://reviews.llvm.org/D27185
llvm-svn: 288291