This patch extends support for generating huge stack frames on 64-bit XPLINK by implementing the ABI-mandated call to the stack extension routine.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D120450
Add support for va intrinsics for the XPLINK ABI.
Only the extended vararg variant, which uses a pointer to next
argument, is supported. The standard variant will build on this.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D120148
The tablegen lines that specify the XPLINK64 calling convention for promoting an f32 vararg to an f64 are effectively overwritten by the following tablegen line which bitcast an f64 vararg to an i64 (so that it can be used in the GPRs). It becomes a bitcast from f32 to i64.
Since we don't handle a bitcast for f32s this caused an assertion.
With XPLINK, dynamic stack allocations requires calling
a runtime function, which allocates the stack memory,
moves the register save area, and returns the new
stack pointer.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D119732
With XPLINK, a no-op with information about the call type is emitted
after each call instruction. Centralizing it has the advantage that it is
easy to document all cases, and it makes it easier to extend it later
(e.g. dynamic stack allocation, 32 bit mode).
Also add a test checking the call types emitted so far.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D119557
The XPLINK return `b 2(7)` has size 4 bytes, while the Linux return
`br 7` only has size 2 bytes. Thus a new alias is required to have correct
instruction byte count. It also fixes the conditional return code.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D119437
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:
llvm/MC/MCParser/MCAsmParser.h no longer includes llvm/MC/MCParser/MCAsmLexer.h
Preprocessed lines to build llvm on my setup:
after: 1068185081
before: 1068324320
So no compile time benefit to expect, but we still get the looser coupling
between files which is great.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119359
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:
llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h
Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after: 1049293745
Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
Return false from ShouldShrinkFPConstant(), so that these constants are stored
in their full size on the constant pool, even if they could have been shrunk
and used with an extending load.
This is better since LD is faster than LDE, and it also enables reg/mem opcodes.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D117927
By reordering the objects on the stack frame after looking at the users, a
better utilization of displacement operands will result. This means less
needed Load Address instructions for the accessing of these objects.
This is important for very large functions where otherwise small changes
could cause a lot more/less accesses go out of range.
Note: this is not yet enabled for SystemZXPLINKFrameLowering, but should be.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D115690
This reverts commit ef82063207.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
This flag was set in the presence of stacksave/stackrestore in order to force
a frame pointer.
This should however not be needed per the comment in MachineFrameInfo.h
stating that a a variable sized object "...is the sole condition which
prevents frame pointer elimination", and experiments have also shown that
there seems to be no effect whatsoever on code generation with ManipulatesSP.
Review: Ulrich Weigand
This reverts commit fd4808887e.
This patch causes gcc to issue a lot of warnings like:
warning: base class ‘class llvm::MCParsedAsmOperand’ should be
explicitly initialized in the copy constructor [-Wextra]
This patch adds support for prologue and epilogue generation for the z/OS target under the XPLINK64 ABI for functions with a stack size of less than 1048576 bytes (huge stack frames).
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D114457
This patch adds support for prologue and epilogue generation for
the z/OS target under the XPLINK64 ABI for functions with a stack
size of less than 1048576 bytes (huge stack frames).
Reviewed by: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D114457
The AsmParser checks the range of a PC-relative operand, but only if it is
immediate.
This patch adds range checks for operands in applyFixup(), at which point the
offset to a label is known.
The diagnostic message for an operand that is out of range is explicit (with
given value and min/max limits). This is now also done for displacement
fixups.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D114194
Memset with a constant length was implemented with a single store followed by
a series of MVC:s. This patch changes this so that one store of the byte is
emitted for each MVC, which avoids data dependencies between the MVCs. An
MVI/STC + MVC(len-1) is done for each block.
In addition, memset with a variable length is now also handled without a
libcall. Since the byte is first stored and then MVC is used from that
address, a length of two must now be subtracted instead of one for the loop
and EXRL. This requires an extra check for the one-byte case, which is
handled in a special block with just a single MVI/STC (like GCC).
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D112004
AMDGPU is unusual in that the both stack is indexed in the same
direction as stack growth (up). We therefore always need the emergency
stack slots placed as low as possible to ensure they are in range of
load/store instruction immediate offsets. The existing logic is mostly
OK, but failed if we required stack realignment.
I don't understand what the existing control isFPCloseToIncomingSP is
supposed to mean, but can only be used to stop placing the scavenge
slots earlier. Make this explicit so that targets can opt-in rather
than opt-out only.
Delegate updating of LiveIntervals to each target's
convertToThreeAddress implementation, instead of repairing LiveIntervals
after the fact in TwoAddressInstruction::convertInstTo3Addr.
Differential Revision: https://reviews.llvm.org/D113493
This patch adds support for symbolic displacements, e.g. like 'lg %r0,
sym(%r1)', which is done using relocations. This is needed to compile the
kernel without disabling the integrated assembler.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D113341
It was discovered that an extra register COPY remained when expanding a
(variable length) memory operation with a loop and there was another use of
the involved address register(s) afterwards.
A simple fix for this is to COPY the address registers before the loop and
use that new vreg instead.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D112065
All instructions must have a correct size value close to emission when
SystemZLongBranch runs, or a necessary branch relaxation may be missed.
This patch also adds an assert for instruction sizes in SystemZLongBranch.
Review: Ulrich Weigand
This pseudo is expanded very late (AsmPrinter) and therefore has to have a
correct size value, or the branch relaxation pass may make a wrong decision.
Review: Ulrich Weigand
- This patch provides the initial implementation for lowering a call on z/OS according to the XPLINK64 calling convention
- A series of changes have been made to SystemZCallingConv.td to account for these additional XPLINK64 changes including adding a new helper function to shadow the stack along with allocation of a register wherever appropriate
- For the cases of copying a f64 to a gr64 and a f128 / 128-bit vector type to a gr64, a `CCBitConvertToType` has been added and has been bitcasted appropriately in the lowering phase
- Support for the ADA register (R5) will be provided in a later patch.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D111662
Inspired by D111968, provide a isNegatedPowerOf2() wrapper instead of obfuscating code with (-Value).isPowerOf2() patterns, which I'm sure are likely avenues for typos.....
Differential Revision: https://reviews.llvm.org/D111998
This patch fixes the bug that consisted of treating variable / immediate
length mem operations (such as memcpy, memset, ...) differently. The variable
length case needs to have the length minus 1 passed due to the use of EXRL
target instructions. However, the DAGCombiner can convert a register length
argument into a constant one, and whenever that happened one byte too little
would end up being performed.
This is also a refactorization by reducing the number of opcodes and variants
involved. For any opcode (variable or constant length), only the length minus
one is passed on to the ISD node. The rest of the logic is now instead
handled during isel pseudo expansion.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D111729
This PR implements the save of the XPLINK callee-saved registers
on z/OS.
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D111653
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
Seem to cause test failures in compiler-rt.
Revert "[SystemZ] Implement memcmp of variable length with CLC."
This reverts commit 7a4e9a0c73.
Revert "[SystemZ] Implement memcpy of variable length with MVC."
This reverts commit c6c13c58ee.
Following the same pattern of memset/memcpy, this patch implements a variable
length memcmp with a CLC loop followed by an EXRL instruction.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D107380
Instead of making a memcpy libcall, emit an MVC loop and an EXRL instruction
the same way as is already done for memset 0.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D106874
Note that SystemZMnemonicSpellCheck is defined in
SystemZGenAsmMatcher.inc, which SystemZAsmParser.cpp includes.
Identified with readability-redundant-declaration.
- This patch adds in the GOFFMCAsmInfo interfaces for the z/OS target.
- This patch decouples the previously existing SystemZMCAsmInfo interface for the ELF target and the z/OS target.
- This patch also removes a small test in the SystemZAsmLexerTest.cpp. The reason for this is because, the test is set up for the s390x-ibm-linux (SystemZ ELF triple), and the test checks a function which is overridden only for the z/OS target. The reason we can't change the test to use a z/OS triple outright is because there is still missing support which prevents the successful running of a test (assert in AsmParser.cpp due to missing GOFFAsmParser support)
Reviewed By: uweigand, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D110077
This patch changes hard-coded usages of SystemZ::R15D with calls to the getStackPointerRegister function. Uses in the LowerCall function are avoided to avoid merge conflicts with an expected upcoming patch.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D109702
- This patch adds in the GOFF mangling support to the LLVM data layout string. A corresponding additional line has been added into the data layout section in the language reference documentation.
- Furthermore, this patch also sets the right data layout string for the z/OS target in the SystemZ backend.
Reviewed By: uweigand, Kai, abhina.sreeskantharajan, MaskRay
Differential Revision: https://reviews.llvm.org/D109362
The type legalizer has by default no method of doing this bitcast other than
storing and reloading the value from stack.
This patch implements a custom lowering of this operation using extractions
of subregs (z13 and earlier using FP128 register pairs), or of vector
elements (with 'vector enhancements 1' using VR128 FP registers).
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D110346
This simplifies the API and addresses a FIXME in
TwoAddressInstructionPass::convertInstTo3Addr.
Differential Revision: https://reviews.llvm.org/D110229
Based off a discussion on D110100, we should be avoiding default CostKinds whenever possible.
This initial patch removes them from the 'inner' target implementation callbacks - these should only be used by the main TTI calls, so this should guarantee that we don't cause changes in CostKind by missing it in an inner call. This exposed a few missing arguments in getGEPCost and reduction cost calls that I've cleaned up.
Differential Revision: https://reviews.llvm.org/D110242
SystemZ adds the EXRL target instructions in the end of each file. This must
be done before debug info emission since that may end the text section, and
therefore this is now done in emitConstantPools() (instead of in
emitEndOfAsmFile).
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D109513
The .machine directive can be used in assembly files to specify the ISA for
the instructions following it.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D109660
This patch adds class SystemZFrameLowering which is a SystemZ-specific class
detailing special registers used by calling conventions on the target.
SystemZELFFrameLowering and SystemZXPLINKFrameLowering implement this class
for ELF and XPLINK64 respectively. Previous functionality in SystemZFrameLowering
is moved to SystemZELFFrameLowering. SystemZXPLINKFrameLowering can then be
implemented in future patches.
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D108777
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.
On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.
For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.
This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.
I've attempted to take into account the in tree experimental backends.
Differential Revision: https://reviews.llvm.org/D45962
The backend generally uses 64-bit immediates (e.g. what
MachineOperand::getImm() returns), so use that for analyzeCompare()
and optimizeCompareInst() as well. This avoids truncation for
targets that support immediates larger 32-bit. In particular, we
can avoid the bugprone value normalization hack in the AArch64
target.
This is a followup to D108076.
Differential Revision: https://reviews.llvm.org/D108875
This patch replaces the SpecialRegisters field with a unique_ptr instead of a raw pointer. This is better practice, and allows us to remove the definition of the dtor for the SystemZSubtarget class.
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D108639
I'm not sure this is the best way to approach this,
but the situation is rather not very detectable unless we explicitly call it out when refusing to advise to unroll.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D107271
- This patch consists of the bare basic code needed in order to generate some assembly for the z/OS target.
- Only the .text and the .bss sections are added for now.
- The relevant MCSectionGOFF/Symbol interfaces have been added. This enables us to print out the GOFF machine code sections.
- This patch enables us to add simple lit tests wherever possible, and contribute to the testing coverage for the z/OS target
- Further improvements and additions will be made in future patches.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D106380
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10304.
Note: No currently available Z system supports the arch14
architecture. Once new systems become available, the
official system name will be added as supported -march name.
Don't use a local MachineOperand copy in SystemZAsmPrinter::PrintAsmOperand()
and change the register as it may break the MRI tracking of register
uses. Use an MCOperand instead.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D105757
The odd register of a (128 bit) register pair is accessed with the 'N' code
with an inline assembly operand.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D105502
Benchmarking has shown that it is worthwhile to implement a variable length
memset of 0 with XC (exclusive or) like gcc does, instead of using a libcall.
This requires the use of the EXecute Relative Long (EXRL) instruction which
can now be done in a framework that can also be used with other target
instructions (not just XC).
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D103865
Add support for the .reloc directive along the lines of
other back-ends.
This fixes a regression after https://reviews.llvm.org/D104080
was merged, since that patch presupposed support for .reloc.
- Currently, the emitting of labels in the parsePrimaryExpr function is case independent. It just takes the identifier and emits it.
- However, for HLASM the emitting of labels is case independent. We are emitting them in the upper case only, to enforce case independency. So we need to ensure that at the time of parsing the label we are emitting the upper case (in `parseAsHLASMLabel`), but also, when we are processing a PC-relative relocatable expression, we need to ensure we emit it in upper case (in `parsePrimaryExpr`)
- To achieve this a new MCAsmInfo attribute has been introduced which corresponding targets can override if needed.
Reviewed By: abhina.sreeskantharajan, uweigand
Differential Revision: https://reviews.llvm.org/D104715
Since this method can apply to cmpxchg operations, make sure it's clear
what value we're actually retrieving. This will help ensure we don't
accidentally ignore the failure ordering of cmpxchg in the future.
We could potentially introduce a getOrdering() method on AtomicSDNode
that asserts the operation isn't cmpxchg, but not sure that's
worthwhile.
Differential Revision: https://reviews.llvm.org/D103338
The implementation of subword atomics does not actually
guarantee the result is zero-extended, which now caused
build bot failures after https://reviews.llvm.org/D101342
was landed.
Support virtual, physical and tied i128 register operands in inline assembly.
i128 is on SystemZ not really supported and is not a legal type and generally
such a value will be split into two i64 parts. There are however some
instructions that require a pair of two GPR64 registers contained in the GR128
bit reg class, which is untyped.
For inline assmebly operands, it proved to be very cumbersome to first follow
the general behavior of splitting an i128 operand into two parts and then
later rebuild the INLINEASM MI to have one GR128 register. Instead, some
minor common code changes were made to SelectionDAGBUilder to only create one
GR128 register part to begin with. In particular:
- getNumRegisters() now has an optional parameter "RegisterVT" which is
passed by AddInlineAsmOperands() and GetRegistersForValue().
- The bitcasting in GetRegistersForValue is not performed if RegVT is
Untyped.
- The RC for a tied use in AddInlineAsmOperands() is now computed either from
the tied def (virtual register), or by getMinimalPhysRegClass() (physical
register).
- InstrEmitter.cpp:EmitCopyFromReg() has been fixed so that the register
class (DstRC) can also be computed for an illegal type.
In the SystemZ backend getNumRegisters(), splitValueIntoRegisterParts() and
joinRegisterPartsIntoValue() have been implemented to handle i128 operands.
Differential Revision: https://reviews.llvm.org/D100788
Review: Ulrich Weigand
- Currently, LLVM supports symbols of the name "token1@token2".
- "token2" is used to identify whether an appropriate symbol reference can be used for the symbol.
- Now, if the symbol reference couldn't be found, the AsmParser usually emits an error, unless the backend is configured to accept the "@" in a symbol name
- Thus, this patch aims to do that. It sets the `AllowAtInName` attribute in the SystemZ backend for the HLASM dialect.
- Setting this attribute ensures that, if a particular symbol reference is found, it uses that. If it doesn't, and there exists an "@" in the symbol name, it will use that instead of explicitly erroring out.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D103111
- Currently, before printing a label in MCSymbol.cpp (MCSymbol::print), the current code "validates" the label that is to be printed.
- If it fails the validation step, then it prints the label within double quotes.
- However, the validation is provided as a virtual function in MCAsmInfo.h (i.e. isAcceptableChar() function). So we can override this for the AD_HLASM dialect in SystemZMCAsmInfo.cpp.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D103091