There's a few relevant forward declarations in there that may require downstream
adding explicit includes:
llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h
Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after: 1049293745
Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
AArch32/Armv8A introduced the performance deprecation of certain patterns
of IT instructions. After some debate internal to ARM, this is now being
reverted; i.e. no IT instruction patterns are performance deprecated
anymore, as the perfomance degredation is not significant enough.
This reverts the following:
"ARMv8-A deprecates some uses of the T32 IT instruction. All uses of
IT that apply to instructions other than a single subsequent 16-bit
instruction from a restricted set are deprecated, as are explicit
references to the PC within that single 16-bit instruction. This permits
the non-deprecated forms of IT and subsequent instructions to be treated
as a single 32-bit conditional instruction."
The deprecation no longer applies, but the behaviour may be controlled
by the -arm-restrict-it and -arm-no-restrict-it command-line options,
with the latter being the default. No warnings about complex IT blocks
will be generated.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D118044
This patch introduces support for targetting the Armv9.3-A architecture,
which should map to the existing Armv8.8-A extensions.
Differential Revision: https://reviews.llvm.org/D116158
This patch implements PAC return address signing for armv8-m. This patch roughly
accomplishes the following things:
- PAC and AUT instructions are generated.
- They're part of the stack frame setup, so that shrink-wrapping can move them
inwards to cover only part of a function
- The auth code generated by PAC is saved across subroutine calls so that AUT
can find it again to check
- PAC is emitted before stacking registers (so that the SP it signs is the one
on function entry).
- The new pseudo-register ra_auth_code is mentioned in the DWARF frame data
- With CMSE also in use: PAC is emitted before stacking FPCXTNS, and AUT
validates the corresponding value of SP
- Emit correct unwind information when PAC is replaced by PACBTI
- Handle tail calls correctly
Some notes:
We make the assembler accept the `.save {ra_auth_code}` directive that is
emitted by the compiler when it saves a register that contains a
return address authentication code.
For EHABI we need to have the `FrameSetup` flag on the instruction and
handle the `t2PACBTI` opcode (identically to `t2PAC`), so we can emit
`.save {ra_auth_code}`, instead of `.save {r12}`.
For PACBTI-M, the instruction which computes return address PAC should use SP
value before adjustment for the argument registers save are (used for variadic
functions and when a parameter is is split between stack and register), but at
the same it should be after the instruction that saves FPCXT when compiling a
CMSE entry function.
This patch moves the varargs SP adjustment after the FPCXT save (they are never
enabled at the same time), so in a following patch handling of the `PAC`
instruction can be placed between them.
Epilogue emission code adjusted in a similar manner.
PACBTI-M code generation should not emit any instructions for architectures
v6-m, v8-m.base, and for A- and R-class cores. Diagnostic message for such cases
is handled separately by a future ticket.
note on tail calls:
If the called function has four arguments that occupy registers `r0`-`r3`, the
only option for holding the function pointer itself is `r12`, but this register
is used to keep the PAC during function/prologue epilogue and clobbers the
function pointer.
When we do the tail call we need the five registers (`r0`-`r3` and `r12`) to
keep six values - the four function arguments, the function pointer and the PAC,
which is obviously impossible.
One option would be to authenticate the return address before all callee-saved
registers are restored, so we have a scratch register to temporarily keep the
value of `r12`. The issue with this approach is that it violates a fundamental
invariant that PAC is computed using CFA as a modifier. It would also mean using
separate instructions to pop `lr` and the rest of the callee-saved registers,
which would offset the advantages of doing a tail call.
Instead, this patch disables indirect tail calls when the called function take
four or more arguments and the return address sign and authentication is enabled
for the caller function, conservatively assuming the caller function would spill
LR.
This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:
https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension
The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:
https://developer.arm.com/documentation/ddi0553/latest
The following people contributed to this patch:
- Momchil Velikov
- Ties Stuij
Reviewed By: danielkiss
Differential Revision: https://reviews.llvm.org/D112429
Some instructions with i8 immediate ranges can only hold negative values
(like t2LDRHi8), only hold positive values (like t2STRT) or hold +/-
depending on the U bit (like the pre/post inc instructions. e.g
t2LDRH_POST). This patch splits the AddrModeT2_i8 into AddrModeT2_i8,
AddrModeT2_i8pos and AddrModeT2_i8neg to make this clear.
This allows us to get the offset ranges of t2LDRHi8 correct in the
load/store optimizer, fixing issues where we could end up creating
instructions with positive offsets (which may then be encoded as ldrht).
Differential Revision: https://reviews.llvm.org/D114638
armv9-a, armv9.1-a and armv9.2-a can be targeted using the -march option
both in ARM and AArch64.
- Armv9-A maps to Armv8.5-A.
- Armv9.1-A maps to Armv8.6-A.
- Armv9.2-A maps to Armv8.7-A.
- The SVE2 extension is enabled by default on these architectures.
- The cryptographic extensions are disabled by default on these
architectures.
The Armv9-A architecture is described in the Arm® Architecture Reference
Manual Supplement Armv9, for Armv9-A architecture profile
(https://developer.arm.com/documentation/ddi0608/latest).
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D109517
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
This allows VMOVL in tail predicated loops so long as the the vector
size the VMOVL is extending into is less than or equal to the size of
the VCTP in the tail predicated loop. These cases represent a
sign-extend-inreg (or zero-extend-inreg), which needn't block tail
predication as in https://godbolt.org/z/hdTsEbx8Y.
For this a vecsize has been added to the TSFlag bits of MVE
instructions, which stores the size of the elements that the MVE
instruction operates on. In the case of multiple size (such as a
MVE_VMOVLs8bh that extends from i8 to i16, the largest size was be
chosen). The sizes are encoded as 00 = i8, 01 = i16, 10 = i32 and 11 =
i64, which often (but not always) comes from the instruction encoding
directly. A unit test was added, and although only a subset of the
vecsizes are currently used, the rest should be useful for other cases.
Differential Revision: https://reviews.llvm.org/D109706
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.
On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.
For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.
This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.
I've attempted to take into account the in tree experimental backends.
Differential Revision: https://reviews.llvm.org/D45962
In preparation for passing the MCSubtargetInfo (STI) through to writeNops
so that it can use the STI in operation at the time, we need to record the
STI in operation when a MCAlignFragment may write nops as padding. The
STI is currently unused, a further patch will pass it through to
writeNops.
There are many places that can create an MCAlignFragment, in most cases
we can find out the STI in operation at the time. In a few places this
isn't possible as we are in initialisation or finalisation, or are
emitting constant pools. When possible I've tried to find the most
appropriate existing fragment to obtain the STI from, when none is
available use the per module STI.
For constant pools we don't actually need to use EmitCodeAlign as the
constant pools are data anyway so falling through into it via an
executable NOP is no better than falling through into data padding.
This is a prerequisite for D45962 which uses the STI to emit the
appropriate NOP for the STI. Which can differ per fragment.
Note that involves an interface change to InitSections. It is now
called initSections and requires a SubtargetInfo as a parameter.
Differential Revision: https://reviews.llvm.org/D45961
Add assert to provoke failure in object file output, not just in disassembly output.
Reviewed By: yroux
Differential Revision: https://reviews.llvm.org/D107259
This does the same as D96259, but for ARM, just like AArch64,
using the same comment char as for ELF and MinGW mode.
As the assembly input/output of LLVM is GAS style, trying to
match what MS armasm.exe does isn't needed (because the comment
char used is the least concern when it comes to that; all
directives differ too). If a separate armasm compatible mode
is implemented, it can use its own comment style (just like
llvm-ml implements MS ml.exe compatible assembly parsing).
This fixes building compiler-rt assembly files for ARM in MSVC
mode.
The updated testcase literals-comments.s was only intended to
make sure that '#' isn't interpreted as a comment char.
Differential Revision: https://reviews.llvm.org/D107251
This implements `MCInstrAnalysis::evaluateMemoryOperandAddress()` for
Arm so that the disassembler can print the target address of memory
operands that use PC+immediate addressing.
Differential Revision: https://reviews.llvm.org/D105979
Previously we would emit constant pool entries for ldr inline asm at the
very end of AsmPrinter::doFinalization(). However, if we're emitting
dwarf aranges, that would end all sections with aranges. Then if we have
constant pool entries to be emitted in those same sections, we'd hit an
assert that the section has already been ended.
We want to emit constant pool entries before emitting dwarf aranges.
This patch splits out arm32/64's constant pool entry emission into its
own MCTargetStreamer virtual method.
Fixes PR51208
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107314
We can build it with -Werror=global-constructors now. This helps
in situation where libSupport is embedded as a shared library,
potential with dlopen/dlclose scenario, and when command-line
parsing or other facilities may not be involved. Avoiding the
implicit construction of these cl::opt can avoid double-registration
issues and other kind of behavior.
Reviewed By: lattner, jpienaar
Differential Revision: https://reviews.llvm.org/D105959
We can build it with -Werror=global-constructors now. This helps
in situation where libSupport is embedded as a shared library,
potential with dlopen/dlclose scenario, and when command-line
parsing or other facilities may not be involved. Avoiding the
implicit construction of these cl::opt can avoid double-registration
issues and other kind of behavior.
Reviewed By: lattner, jpienaar
Differential Revision: https://reviews.llvm.org/D105959
We can build it with -Werror=global-constructors now. This helps
in situation where libSupport is embedded as a shared library,
potential with dlopen/dlclose scenario, and when command-line
parsing or other facilities may not be involved. Avoiding the
implicit construction of these cl::opt can avoid double-registration
issues and other kind of behavior.
Reviewed By: lattner, jpienaar
Differential Revision: https://reviews.llvm.org/D105959
Enable the emission of a GNU attributes section by reusing the code for
emitting the ARM build attributes section.
The GNU attributes follow the exact same section format as the ARM
BuildAttributes section, so this can be factored out and reused for GNU
attributes generally.
The immediate motivation for this is to emit a GNU attributes section for the
vector ABI on SystemZ (https://reviews.llvm.org/D105067).
Review: Logan Chien, Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D102894
`ARMInstPrinter::printMveAddrModeQOperand()` was added in D62680, but
was never used. It looks like `printT2AddrModeImm8Operand<false>()` is
used instead.
Differential Revision: https://reviews.llvm.org/D105124
Previously xscale was known to everything apart
from the ELF streamer so we would crash as soon
as you tried to output an object file.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D104776
The instruction can be 16-bit aligned while targeting 32-bit aligned
code. To calculate the target address correctly, the address of the
instruction has to be adjusted.
Differential Revision: https://reviews.llvm.org/D104446
Currently the ARM backend only accpets constant expressions as the
immediate operand in load and store instructions. This allows the
result of symbolic expressions to be used in memory instructions. For
example,
0:
.space 2048
strb r2, [r0, #(.-0b)]
would be assembled into the following instructions.
strb r2, [r0, #2048]
This only adds support to ldr, ldrb, str, and strb in arm mode to
address the build failure of Linux kernel for now, but should facilitate
adding support to similar instructions in the future if the need arises.
Link:
https://github.com/ClangBuiltLinux/linux/issues/1329
Reviewed By: peter.smith, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D98916
llvm-objdump only uses one MCInstrAnalysis object, so if ARM and Thumb
code is mixed in one object, or if an object is disassembled without
explicitly setting the triple to match the ISA used, then branch and
call targets will be printed incorrectly.
This could be fixed by creating two MCInstrAnalysis objects in
llvm-objdump, like we currently do for SubtargetInfo. However, I don't
think there's any reason we need two separate sub-classes of
MCInstrAnalysis, so instead these can be merged into one, and the ISA
determined by checking the opcode of the instruction.
Differential revision: https://reviews.llvm.org/D97766
Make sure to set the bottom bit of the symbol even when the type
attribute of a label is set after the label.
GNU as sets the thumb state according to the thumb state of the label.
If a .type directive is placed after the label, set the symbol's thumb
state according to the thumb state of the .type directive. This matches
GNU as in most cases.
From: Stefan Agner <stefan@agner.ch>
This fixes:
https://bugs.llvm.org/show_bug.cgi?id=44860https://github.com/ClangBuiltLinux/linux/issues/866
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D74927
Early versions of the ARMv7 reference manuals considered the sp register
as a deprecated register for ldm/stm familiy of instructions. However,
later versions such as ARM DDI 0406C.d added a note to the Appendix:
D9.3 Use of the SP as a general-purpose register
Most ARM instructions, unlike Thumb instructions, provide exactly the
same access to the SP as to R0-R12. This means that it is possible to
use the SP as a general-purpose register. Earlier issues of this manual
deprecated the use of SP in an ARM instruction, in any way that is
deprecated, not permitted, or not possible in the corresponding
Thumb instruction. However, user feedback indicates a number of cases
where these instructions are useful. Therefore, ARM no longer deprecates
these instruction uses.
Also Armv8 manuals no longer consider SP as deprecated register for ldm/
stm A32 instructions.
Furthermore, GNU as also does not print a deprecated warning when using
SP with those instructions.
Drop deprecation warning for pop/ldm/push/stm instructions.
Patch by: Stefan Agner.
Differential Revision: https://reviews.llvm.org/D82692
This change introduces support for zero flag ELF section groups to LLVM.
LLVM already supports COMDAT sections, which in ELF are a special type
of ELF section groups. These are generally useful to enable linker GC
where you want a group of sections to always travel together, that is to
be either retained or discarded as a whole, but without the COMDAT
semantics. Other ELF assemblers already support zero flag ELF section
groups and this change helps us reach feature parity.
Differential Revision: https://reviews.llvm.org/D95851
As mentioned in TODO comment, casting double to float causes NaNs to change bits.
To avoid the change, this patch adds support for single-floating-point immediate value on MachineCode.
Patch by Yuta Saito.
Differential Revision: https://reviews.llvm.org/D77384
This patch factors out the part of printInstruction that gets the
mnemonic string for a given MCInst. This is intended to be used
subsequently for the instruction-mix remarks to display the final
mnemonic (D90040).
Unfortunately making `getMnemonic` available to the AsmPrinter
seems to require making it virtual. Not sure if there's a way around
that with the current layering of the AsmPrinters.
Reviewed By: Paul-C-Anagnostopoulos
Differential Revision: https://reviews.llvm.org/D90039
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848