Commit Graph

81 Commits

Author SHA1 Message Date
Fangrui Song de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
David Spickett c3b98194df Reland "[llvm][AArch64] Insert "bti j" after call to setjmp"
This reverts commit edb7ba714a.

This changes BLR_BTI to take variable_ops meaning that we can accept
a register or a label. The pattern still expects one argument so we'll
never get more than one. Then later we can check the type of the operand
to choose BL or BLR to emit.

(this is what BLR_RVMARKER does but I missed this detail of it first time around)

Also require NoSLSBLRMitigation which I missed in the first version.
2022-03-23 11:43:43 +00:00
David Spickett edb7ba714a Revert "[llvm][AArch64] Insert "bti j" after call to setjmp"
This reverts commit eb5ecbbcbb
due to failures on buildbots with expensive checks enabled.
2022-03-23 10:43:20 +00:00
David Spickett eb5ecbbcbb [llvm][AArch64] Insert "bti j" after call to setjmp
Some implementations of setjmp will end with a br instead of a ret.
This means that the next instruction after a call to setjmp must be
a "bti j" (j for jump) to make this work when branch target identification
is enabled.

The BTI extension was added in armv8.5-a but the bti instruction is in the
hint space. This means we can emit it for any architecture version as long
as branch target enforcement flags are passed.

The starting point for the hint number is 32 then call adds 2, jump adds 4.
Hence "hint #36" for a "bti j" (and "hint #34" for the "bti c" you see
at the start of functions).

The existing Arm command line option -mno-bti-at-return-twice has been
applied to AArch64 as well.

Support is added to SelectionDAG Isel and GlobalIsel. FastIsel will
defer to SelectionDAG.

Based on the change done for M profile Arm in https://reviews.llvm.org/D112427

Fixes #48888

Reviewed By: danielkiss

Differential Revision: https://reviews.llvm.org/D121707
2022-03-23 09:51:02 +00:00
Ahmed Bougacha 634ca7349d [ObjCARC] Require the function argument in the clang.arc.attachedcall bundle.
Currently, the clang.arc.attachedcall bundle takes an optional function
argument.  Depending on whether the argument is present, calls with this
bundle have the following semantics:

- on x86, with the argument present, the call is lowered to:
    call _target
    mov rax, rdi
    call _objc_retainAutoreleasedReturnValue

- on AArch64, without the argument, the call is lowered to:
    bl _target
    mov x29, x29

  and the objc runtime call is expected to be emitted separately.

That's because, on x86, the objc runtime checks for both the mov and
the call on x86, and treats the combination as the ARC autorelease elision
marker.

But on AArch64, it only checks for the dedicated NOP marker, as that's
historically been sufficiently unique.  Thanks to that, the runtime call
wasn't required to be adjacent to the NOP marker, so it wasn't emitted
as part of the bundle sequence.

This patch unifies both architectures: on AArch64, we now emit all
3 instructions for the bundle.  This guarantees that the runtime call
is adjacent to the marker in the sequence, and that's information the
runtime can use to further optimize this.

This helps simplify some of the handling, in particular
BundledRetainClaimRVs, which no longer needs to know whether the bundle
is sufficient or not: it now always should be.

Note that this does not include an AutoUpgrade for the nullary bundles,
as they are only produced in ObjCContract as part of the obj/asm emission
pipeline, and are not expected to be in bitcode.

Differential Revision: https://reviews.llvm.org/D118214
2022-01-28 12:41:45 -08:00
Jim Lin d6b0734837 [NFC] Use Register instead of unsigned 2022-01-19 20:17:04 +08:00
Kazu Hirata 387927bbaf [Target] Use range-based for loops (NFC) 2021-11-26 21:21:17 -08:00
Jameson Nash e20f69f612 [Aarch64] Correct register class for pseudo instructions
This constrains the Mov* and similar pseudo instruction to take
GPR64common register classes rather than GPR64. GPR64 includs XZR
which is invalid here, because this pseudo instructions expands
into an adrp/add pair sharing a destination register. XZR is invalid
on add and attempting to encode it will instead increment the stack
pointer causing crashes (downstream report at [1]). The test case
there reproduces on LLVM11, but I do not have a test case that
reaches this code path on main, since it is being masked by
improved dead code elimination introduced in D91513. Nevertheless,
this seems like a good thing to fix in case there are other cases
that dead code elimination doesn't clean up (e.g. if `optnone` is
used and the optimization is skipped).

I think it would be worth auditing uses of GPR64 in pseudo
instructions to see if there are any similar issues, but I do not
have a high enough view of the backend or knowledge of the
Aarch64 architecture to do this quickly.

[1] https://github.com/JuliaLang/julia/issues/39818

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D97435
2021-09-09 14:31:49 -04:00
Andrew Wei da9ed3dc71 [AArch64] Avoid adding duplicate implicit operands when expanding pseudo insts.
When expanding pseudo insts, in order to create a new machine instr, we use BuildMI,
which will add implicit operands by default. And transferImpOps will also copy implicit
operands from old ones. Finally, duplicate implicit operands are added to the same inst.
Sometimes this can cause correctness issues. Like below inst,
    renamable $w18 = nsw SUBSWrr renamable $w30, renamable $w14, implicit-def dead $nzcv
After expanding, it will become
    $w18 = SUBSWrs renamable $w13, renamable $w14, 0, implicit-def $nzcv, implicit-def dead $nzcv
A redundant implicit-def $nzcv is added, but the dead flag is missing.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D109069
2021-09-07 17:11:58 +08:00
Bradley Smith 81eafb8a37 [AArch64][SVE] Break false dependencies for inactive lanes of unary operations
Differential Revision: https://reviews.llvm.org/D105889
2021-07-26 15:01:21 +00:00
Eli Friedman 843c614058 [AArch64] Fix i128 cmpxchg using ldxp/stxp.
Basically two parts to this fix:

1. Stop using AtomicExpand to expand cmpxchg i128
2. Fix AArch64ExpandPseudoInsts to use a correct expansion.

From ARM architecture reference:

To atomically load two 64-bit quantities, perform a Load-Exclusive
pair/Store-Exclusive pair sequence of reading and writing the same value
for which the Store-Exclusive pair succeeds, and use the read values
from the Load-Exclusive pair.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51102

Differential Revision: https://reviews.llvm.org/D106039
2021-07-20 12:38:12 -07:00
Tim Northover ea0eec69f1 IR+AArch64: add a "swiftasync" argument attribute.
This extends any frame record created in the function to include that
parameter, passed in X22.

The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001
in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect
of this is that tools walking the stack should expect to see one of three
values there:

  * 0b0000 => a normal, non-extended record with just [FP, LR]
  * 0b0001 => the extended record [X22, FP, LR]
  * 0b1111 => kernel space, and a non-extended record.

All other values are currently reserved.

If compiling for arm64e this context pointer is address-discriminated with the
discriminator 0xc31a and the DB (process-specific) key.

There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing
front-ends access to this slot (and forcing its creation initialized to nullptr
if necessary).
2021-05-14 11:43:58 +01:00
Florian Hahn 75805dce5f
[AArch64] Add implicit uses for operands when expanding BLR_RVMARKER.
Make sure we preserve info about passed arguments as implicit uses, to
make sure later passes still have access to this information.

This fixes a mis-compile where the machine-combiner would pick an
incorrect free register.
2021-03-03 21:56:05 +00:00
Bradley Smith 8bad8a43c3 [AArch64][SVE] Add patterns to generate FMLA/FMLS/FNMLA/FNMLS/FMAD
Adjust generateFMAsInMachineCombiner to return false if SVE is present
in order to combine fmul+fadd into fma. Also add new pseudo instructions
so as to select the most appropriate of FMLA/FMAD depending on register
allocation.

Depends on D96599

Differential Revision: https://reviews.llvm.org/D96424
2021-02-18 16:55:16 +00:00
Tim Northover c93d50dd71 AArch64: use a constpool for blockaddress(...) on MachO
More MachO madness for everyone. MachO relocations are only 32-bits, which
means the ARM64_RELOC_ADDEND one only actually has 24 (signed) bits for the
actual addend. This is a problem when calculating the address of a basic block;
because it has no symbol of its own, the sequence

	adrp x0, Ltmp0@PAGE
	add x0, x0, x0 Ltmp0@PAGEOFF

is represented by relocation with an addend that contains the offset from the
function start to Ltmp, and so the largest function where this is guaranteed to
work is 8MB. That's not quite big enough that we can call it user error (IMO).

So this patch puts the any blockaddress into a constant-pool, where the addend
is instead stored in the (x)word being relocated, which is obviously big enough
for any function.
2021-02-08 15:13:29 +00:00
Florian Hahn 46bc40e502
Recommit "[AArch64] Lower calls with rv_marker attribute."
This recommits a87fccb3ff with a fix to mark the destination operand
of the marker instruction as def, to fix a machine verifier failure.

This reverts the revert commit c0f2cea7c0.
2020-12-13 16:20:39 +00:00
Florian Hahn c0f2cea7c0
Revert "[AArch64] Lower calls with rv_marker attribute ."
This reverts commit a87fccb3ff.

A test appears to fail with expensive checks. Reverting while I
investigate.
2020-12-11 20:12:59 +00:00
Florian Hahn a87fccb3ff
[AArch64] Lower calls with rv_marker attribute .
This patch adds support for lowering function calls with the
rv_marker attribute. The goal is to expand such calls to the
following sequence of instructions:

    BL @fn
    mov x29, x29

This sequence of instructions triggers Objective-C runtime optimizations,
hence we want to ensure no instructions get moved in between them.
This patch achieves that by adding a new CALL_RVMARKER ISD node,
which gets turned into the BLR_RVMARKER pseudo, which eventually gets
expanded into the sequence mentioned above. The sequence is then marked
as instruction bundle, to avoid anything being moved in between.

@ahatanak is working on using this attribute in the front- & middle-end.

Together with the front- & middle-end changes, this should address
PR31925 for AArch64.

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D92569
2020-12-11 19:45:44 +00:00
Paul Walker 7356b4243a [SVE] Fix invalid assert in expand_DestructiveOp.
AArch64ExpandPseudo::expand_DestructiveOp contains an assert to
ensure the destructive operand's register is unique.  However,
this is only required when psuedo expansion emits a movprfx.

A simple example when a movprfx is not required is
  Z0 = FADD_ZPZZ_UNDEF_S P0, Z0, Z0
which expands to an unprefixed FADD_ZPmZ_S instruction.

This patch moves the assert to the places where a movprfx is emitted.

Differential Revision: https://reviews.llvm.org/D83029
2020-07-04 09:21:40 +00:00
Sander de Smalen 8b7b0ad24c [AArch64][SVE] NFC: Rename isOrig -> isReverseInstr
This is a non-functional to clarify some of the terminology in the
AArch64SVEInstrInfo/SVEInstrFormats.td files around the tables
for mapping an instruction to it's reverse instruction counter part,
and vice versa. e.g. DIV -> DIVR and DIVR -> DIV.

Reviewers: paulwalker-arm, cameron.mcinally, rengolin, efriedma

Reviewed By: paulwalker-arm, efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82979
2020-07-02 17:01:15 +01:00
Cullen Rhodes 8a397b66b2 [AArch64][SVE] Add support for spilling/filling ZPR2/3/4
Summary:
This patch enables the register allocator to spill/fill lists of 2, 3
and 4 SVE vectors registers to/from the stack. This is implemented with
pseudo instructions that get expanded to individual LDR_ZXI/STR_ZXI
instructions in AArch64ExpandPseudoInsts.

Patch by Sander de Smalen.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D75988
2020-05-28 10:02:57 +00:00
Matt Arsenault 2481f26ac3 CodeGen: Use Register in TargetFrameLowering 2020-04-07 17:07:44 -04:00
Cameron McInally 018dde4ce5 [AArch64][SVE] Add support for DestructiveBinaryImm DestructiveInstType
Support prefixing destructive operations, with the MOVPRFX instruction, to build constructive operations.

Differential Revision: https://reviews.llvm.org/D75064
2020-03-19 13:11:46 -05:00
Cameron McInally a5b22b768f [AArch64][SVE] Add support for DestructiveBinary and DestructiveBinaryComm DestructiveInstTypes
Add support for DestructiveBinaryComm DestructiveInstType, as well as the lowering code to expand the new Pseudos into the final movprfx+instruction pairs.

Differential Revision: https://reviews.llvm.org/D73711
2020-02-21 15:19:54 -06:00
Pavel Iliin dc0b815989 [AArch64][FIX] Correct register live range during pseudo expansion.
This commit fixes the broken tests after
commit b6a9fe2099
on the expensive check builder:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/2884
2020-02-15 12:16:56 +00:00
David Green da147ef0a5 [AArch64] Fixup kill flags on BSL generation
This hopefully fixes up the expensive checks bot.
2020-02-15 11:44:23 +00:00
Pavel Iliin b6a9fe2099 [AArch64] Add BIT/BIF support.
This patch added generation of SIMD bitwise insert BIT/BIF instructions.
In the absence of GCC-like functionality for optimal constraints satisfaction
during register allocation the bitwise insert and select patterns are matched
by pseudo bitwise select BSP instruction with not tied def.
It is expanded later after register allocation with def tied
to BSL/BIT/BIF depending on operands registers.
This allows to get rid of redundant moves.

Reviewers: t.p.northover, samparker, dmgreen

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D74147
2020-02-14 14:19:39 +00:00
Roland McGrath 2b0e6fe2e2 [Fuchsia] Remove aarch64-fuchsia target-specific -mcmodel=kernel
Under --target=aarch64-fuchsia, -mcmodel=kernel has the effect of
(the default) -mcmodel=small plus -mtp=el1 (which did not exist when
this behavior was added). Fuchsia's kernel now uses -mtp=el1
directly instead of -mcmodel=kernel, so remove this special support.

Patch By: mcgrathr

Differential Revision: https://reviews.llvm.org/D73409
2020-01-28 11:32:08 -08:00
Evgenii Stepanov d081962dea Merge memtag instructions with adjacent stack slots.
Summary:
Detect a run of memory tagging instructions for adjacent stack frame slots,
and replace them with a shorter instruction sequence
* replace STG + STG with ST2G
* replace STGloop + STGloop with STGloop

This code needs to run when stack slot offsets are already known, but before
FrameIndex operands in STG instructions are eliminated; that's the
reason for the new hook in PrologueEpilogue.

This change modifies STGloop and STZGloop pseudos to take the size as an
immediate integer operand, and adds _untied variants of those pseudos
that are allowed to take the base address as a FI operand. This is needed to
simplify recognizing an STGloop instruction as operating on a stack slot
post-regalloc.

This improves memtag code size by ~0.25%, and it looks like an additional ~0.1%
is possible by rearranging the stack frame such that consecutive STG
instructions reference adjacent slots (patch pending).

Reviewers: pcc, ostannard

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70286
2020-01-17 15:19:29 -08:00
Evgenii Stepanov 58deb20dd2 Revert "Merge memtag instructions with adjacent stack slots."
*** Bad machine code: Tied use must be a register ***
- function:    stg_alloca17
- basic block: %bb.0 entry (0x20076710580)
- instruction: early-clobber %0:gpr64common, early-clobber %1:gpr64sp = STGloop 272, %stack.0.a :: (store 272 into %ir.a, align 16)
- operand 3:   %stack.0.a

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/21481/steps/test-check-all/logs/stdio

This reverts commit b675a7628c.
2020-01-08 14:36:12 -08:00
Evgenii Stepanov b675a7628c Merge memtag instructions with adjacent stack slots.
Summary:
Detect a run of memory tagging instructions for adjacent stack frame slots,
and replace them with a shorter instruction sequence
* replace STG + STG with ST2G
* replace STGloop + STGloop with STGloop

This code needs to run when stack slot offsets are already known, but before
FrameIndex operands in STG instructions are eliminated; that's the
reason for the new hook in PrologueEpilogue.

This change modifies STGloop and STZGloop pseudos to take the size as an
immediate integer operand, and base address as a FI operand when
possible. This is needed to simplify recognizing an STGloop instruction
as operating on a stack slot post-regalloc.

This improves memtag code size by ~0.25%, and it looks like an additional ~0.1%
is possible by rearranging the stack frame such that consecutive STG
instructions reference adjacent slots (patch pending).

Reviewers: pcc, ostannard

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70286
2020-01-08 11:02:03 -08:00
Evgenii Stepanov 40a80a0a19 Lower TAGPstack with negative offset to SUBG.
Summary:
This never really occurs in the current codegen, so only a MIR test is
possible.

Reviewers: ostannard, pcc

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72123
2020-01-06 11:48:35 -08:00
Florian Hahn 636412bf31 [AArch64ExpandPseudos] Preserve renamable state when expanding MOVi64 & co.
If the MOVi operand was renamable, the operands of the expanded
instructions are also renamable.

Reviewers: thegameg, samparker, zatrazz

Reviewed By: thegameg

Differential Revision: https://reviews.llvm.org/D70061
2019-11-12 11:29:04 +00:00
Sander de Smalen 7774812965 [AArch64] Stackframe accesses to SVE objects.
Materialize accesses to SVE frame objects from SP or FP, whichever is
available and beneficial.

This patch still assumes the objects are pre-allocated. The automatic
layout of SVE objects within the stackframe will be added in a separate
patch.

Reviewers: greened, cameron.mcinally, efriedma, rengolin, thegameg, rovka

Reviewed By: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D67749

llvm-svn: 374772
2019-10-14 13:11:34 +00:00
Tim Northover 52a89cc07d AArch64: fix EXPENSIVE_CHECKS for arm64_32.
For some reason I'd decided to mark the end-result of a GOT load as
dead. It's clearly not (necessarily).

llvm-svn: 371883
2019-09-13 18:55:38 +00:00
Tim Northover f1c2892912 AArch64: support arm64_32, an ILP32 slice for watchOS.
This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM.
FastISel is mostly disabled for now since it would generate incorrect code for
ILP32.

llvm-svn: 371722
2019-09-12 10:22:23 +00:00
Daniel Sanders 5ae66e56cf [aarch64] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Manual fixups in:
AArch64InstrInfo.cpp - genFusedMultiply() now takes a Register* instead of unsigned*
AArch64LoadStoreOptimizer.cpp - Ternary operator was ambiguous between Register/MCRegister. Settled on Register

Depends on D65919

Reviewers: aemerson

Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision for full review was: https://reviews.llvm.org/D65962

llvm-svn: 368628
2019-08-12 22:40:53 +00:00
Sander de Smalen 612b038966 [AArch64] NFC: Add generic StackOffset to describe scalable offsets.
To support spilling/filling of scalable vectors we need a more generic
representation of a stack offset than simply 'int'.

For this we introduce the StackOffset struct, which comprises multiple
offsets sized by their respective MVTs. Byte-offsets will thus be a simple
tuple such as { offset, MVT::i8 }. Adding two byte-offsets will result in a
byte offset { offsetA + offsetB, MVT::i8 }. When two offsets have different
types, we can canonicalise them to use the same MVT, as long as their
runtime sizes are guaranteed to have the same size-ratio as they would have
at compile-time.

When we have both scalable- and fixed-size objects on the stack, we can 
create an offset that is: 

  ({ offset_fixed, MVT::i8 } + { offset_scalable, MVT::nxv1i8 })

The struct also contains a getForFrameOffset() method that is specific to
AArch64 and decomposes the frame-offset to be used directly in instructions
that operate on the stack or index into the stack.

Note: This patch adds StackOffset as an AArch64-only concept, but we would
like to make this a generic concept/struct that is supported by all 
interfaces that take or return stack offsets (currently as 'int'). Since
that would be a bigger change that is currently pending on D32530 landing,
we thought it makes sense to first show/prove the concept in the AArch64
target before proposing to roll this out further.

Reviewers: thegameg, rovka, t.p.northover, efriedma, greened

Reviewed By: rovka, greened

Differential Revision: https://reviews.llvm.org/D61435

llvm-svn: 368024
2019-08-06 13:06:40 +00:00
Peter Collingbourne 09f39967a2 AArch64: Add a tagged-globals backend feature.
This feature instructs the backend to allow locally defined global variable
addresses to contain a pointer tag in bits 56-63 that will be ignored by
the hardware (i.e. TBI), but may be used by an instrumentation pass such
as HWASAN. It works by adding a MOVK instruction to the regular ADRP/ADD
sequence that sets bits 48-63 to the corresponding bits of the global, with
the linker bounds check disabled on the ADRP instruction to prevent the tag
from causing a link failure.

This implementation of the feature omits the MOVK when loading from or storing
to a global, which is sufficient for TBI. If the same approach is extended
to MTE, assuming that 0 is not configured as a catch-all tag, we will most
likely also need the MOVK in this case in order to avoid a tag mismatch.

Differential Revision: https://reviews.llvm.org/D65364

llvm-svn: 367475
2019-07-31 20:14:19 +00:00
Evgeniy Stepanov d752f5e953 Basic codegen for MTE stack tagging.
Implement IR intrinsics for stack tagging. Generated code is very
unoptimized for now.

Two special intrinsics, llvm.aarch64.irg.sp and llvm.aarch64.tagp are
used to implement a tagged stack frame pointer in a virtual register.

Differential Revision: https://reviews.llvm.org/D64172

llvm-svn: 366360
2019-07-17 19:24:02 +00:00
Oliver Stannard defdb1070f [AArch64] Allow -mattr=tpidr-el[1|2|3]
Added subtarget features for AArch64 to use TPIDR_EL[1|2|3] as the TLS base
register, rather than the default TPIDR_EL0.

Patch by Philip Derrin!

Differential revision: https://reviews.llvm.org/D54685

llvm-svn: 356657
2019-03-21 11:30:17 +00:00
Adhemerval Zanella 8a595b1d2e [AArch64] Refactor floating point materialization. NFC
It splits the login of actual instruction emission away from the logic
that figures out the appropriate sequence on AArch64ExpandPseudo::expandMOVImm.
The new function AArch64_IMM::expandMOVImm, which return the list of the 
instructions to materialize the immediate constant, is implemented on a 
separated unit because it will be used in a subsequent patch to optimize
floating point materialization.

Reviewers: efriedma

Differential Revision: https://reviews.llvm.org/D58915

llvm-svn: 356387
2019-03-18 18:23:23 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
David Green 9dd1d451d9 [AArch64] Add Tiny Code Model for AArch64
This adds the plumbing for the Tiny code model for the AArch64 backend. This,
instead of loading addresses through the normal ADRP;ADD pair used in the Small
model, uses a single ADR. The 21 bit range of an ADR means that the code and
its statically defined symbols need to be within 1MB of each other.

This makes it mostly interesting for embedded applications where we want to fit
as much as we can in as small a space as possible.

Differential Revision: https://reviews.llvm.org/D49673

llvm-svn: 340397
2018-08-22 11:31:39 +00:00
Eli Friedman 9e177882aa [AArch64] Improve orr+movk sequences for MOVi64imm.
The existing code has three different ways to try to lower a 64-bit
immediate to the sequence ORR+MOVK.  The result is messy: it misses
some possible sequences, and the order of the checks means we sometimes
emit two MOVKs when we only need one.

Instead, just use a simple loop to try all possible two-instruction
ORR+MOVK sequences.

Differential Revision: https://reviews.llvm.org/D47176

llvm-svn: 333218
2018-05-24 19:38:23 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Martin Storsjo 7bc64bd889 [AArch64] Fold adds with tprel_lo12_nc and secrel_lo12 into a following ldr/str
Differential Revision: https://reviews.llvm.org/D44355

llvm-svn: 327316
2018-03-12 18:47:43 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Matthias Braun c9056b834d Insert IMPLICIT_DEFS for undef uses in tail merging
Tail merging can convert an undef use into a normal one when creating a
common tail. Doing so can make the register live out from a block which
previously contained the undef use. To keep the liveness up-to-date,
insert IMPLICIT_DEFs in such blocks when necessary.

To enable this patch the computeLiveIns() function which used to
compute live-ins for a block and set them immediately is split into new
functions:
- computeLiveIns() just computes the live-ins in a LivePhysRegs set.
- addLiveIns() applies the live-ins to a block live-in list.
- computeAndAddLiveIns() is a convenience function combining the other
  two functions and behaving like computeLiveIns() before this patch.

Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D37034

llvm-svn: 312668
2017-09-06 20:45:24 +00:00
Tim Northover 869fa74d4b Revert "[AArch64] Simplify AES*Tied pseudo expansion (NFC)."
This reverts commit r309821.

My suggestion was wrong because it left the MachineOperands tied which
confused the verifier. Since there's no easy way to untie operands, the
original BuildMI solution is probably best.

llvm-svn: 309962
2017-08-03 16:59:36 +00:00