Commit Graph

807 Commits

Author SHA1 Message Date
Ahmed Bougacha 2b6917b020 [SelectionDAG] Allow targets to specify legality of extloads' result
type (in addition to the memory type).

The *LoadExt* legalization handling used to only have one type, the
memory type.  This forced users to assume that as long as the extload
for the memory type was declared legal, and the result type was legal,
the whole extload was legal.

However, this isn't always the case.  For instance, on X86, with AVX,
this is legal:
    v4i32 load, zext from v4i8
but this isn't:
    v4i64 load, zext from v4i8
Whereas v4i64 is (arguably) legal, even without AVX2.

Note that the same thing was done a while ago for truncstores (r46140),
but I assume no one needed it yet for extloads, so here we go.

Calls to getLoadExtAction were changed to add the value type, found
manually in the surrounding code.

Calls to setLoadExtAction were mechanically changed, by wrapping the
call in a loop, to match previous behavior.  The loop iterates over
the MVT subrange corresponding to the memory type (FP vectors, etc...).
I also pulled neighboring setTruncStoreActions into some of the loops;
those shouldn't make a difference, as the additional types are illegal.
(e.g., i128->i1 truncstores on PPC.)

No functional change intended.

Differential Revision: http://reviews.llvm.org/D6532

llvm-svn: 225421
2015-01-08 00:51:32 +00:00
Ahmed Bougacha 67dd2d25a3 [CodeGen] Use MVT iterator_ranges in legality loops. NFC intended.
A few loops do trickier things than just iterating on an MVT subset,
so I'll leave them be for now.
Follow-up of r225387.

llvm-svn: 225392
2015-01-07 21:27:10 +00:00
Karthik Bhat 9ba55334dc Revert r225165 and r225169
Even thouh gcc produces simialr instructions as Owen pointed out the two patterns aren’t equivalent in the case
where the original subtraction could have caused an overflow.
Reverting the same.

llvm-svn: 225341
2015-01-07 06:34:34 +00:00
Lang Hames 04b37c4043 Revert r225048: It broke ObjC on AArch64.
I've filed http://llvm.org/PR22100 to track this issue.

llvm-svn: 225228
2015-01-06 00:54:32 +00:00
Ahmed Bougacha d54c448d34 [AArch64] Improve codegen of store lane instructions by avoiding GPR usage.
We used to generate code similar to:

  umov.b        w8, v0[2]
  strb  w8, [x0, x1]

because the STR*ro* patterns were preferred to ST1*.
Instead, we can avoid going through GPRs, and generate:

  add   x8, x0, x1
  st1.b { v0 }[2], [x8]

This patch increases the ST1* AddedComplexity to achieve that.

rdar://16372710
Differential Revision: http://reviews.llvm.org/D6202

llvm-svn: 225183
2015-01-05 17:10:26 +00:00
Ahmed Bougacha f964df3640 [AArch64] Improve codegen of store lane 0 instructions by directly storing the subregister.
For 0-lane stores, we used to generate code similar to:

  fmov w8, s0
  str w8, [x0, x1, lsl #2]

instead of:

  str s0, [x0, x1, lsl #2]

To correct that: for store lane 0 patterns, directly match to STR <subreg>0.

Byte-sized instructions don't have the special case for a 0 index,
because FPR8s are defined to have untyped content.

rdar://16372710
Differential Revision: http://reviews.llvm.org/D6772

llvm-svn: 225181
2015-01-05 17:02:28 +00:00
Karthik Bhat 93f27ce886 Select lower fsub,fabs pattern to fabd on AArch64
This patch lowers patterns such as-
  fsub   v0.4s, v0.4s, v1.4s
  fabs   v0.4s, v0.4s
to
  fabd  v0.4s, v0.4s, v1.4s
on AArch64.

Review: http://reviews.llvm.org/D6791
llvm-svn: 225169
2015-01-05 13:57:59 +00:00
Karthik Bhat 8ec742c2f9 Select lower sub,abs pattern to sabd on AArch64
This patch lowers patterns such as-
  sub	v0.4s, v0.4s, v1.4s
  abs	v0.4s, v0.4s
to
  sabd	v0.4s, v0.4s, v1.4s
on AArch64.

Review: http://reviews.llvm.org/D6781
llvm-svn: 225165
2015-01-05 13:11:07 +00:00
Craig Topper d3c02f177a Replace several 'assert(false' with 'llvm_unreachable' or fold a condition into the assert.
llvm-svn: 225160
2015-01-05 10:15:49 +00:00
Saleem Abdulrasool 67f729933f ARM: permit tail calls to weak externals on COFF
Weak externals are resolved statically, so we can actually generate the tail
call on PE/COFF targets without breaking the requirements.  It is questionable
whether we want to propagate the current behaviour for MachO as the requirements
are part of the ARM ELF specifications, and it seems that prior to the SVN
r215890, we would have tail'ed the call.  For now, be conservative and only
permit it on PE/COFF where the call will always be fully resolved.

llvm-svn: 225119
2015-01-03 21:35:00 +00:00
Craig Topper 589ceee7f4 Minor cleanup to all the switches after MatchInstructionImpl in all the AsmParsers.
Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation.

llvm-svn: 225114
2015-01-03 08:16:34 +00:00
Rafael Espindola 54b435ec3c Add r224985 back with a fix.
The issues was that AArch64 has additional restrictions on when local
relocations can be used. We have to take those into consideration when
deciding to put a L symbol in the symbol table or not.

Original message:

Remove doesSectionRequireSymbols.

In an assembly expression like

bar:
.long L0 + 1

the intended semantics is that bar will contain a pointer one byte past L0.

In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.

The solution used in ELF to use relocation with symbols if there is a non-zero
addend.

In MachO before this patch we would just keep all symbols in some sections.

This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.

This patch implements the non-zero addend logic for MachO too.

llvm-svn: 225048
2014-12-31 17:19:34 +00:00
Rafael Espindola d4da9040de Revert "Remove doesSectionRequireSymbols."
This reverts commit r224985.

I am investigating why it made an Apple bot unhappy.

llvm-svn: 225044
2014-12-31 16:06:48 +00:00
Rafael Espindola b22d5aa49a Remove doesSectionRequireSymbols.
In an assembly expression like

bar:
.long L0 + 1

the intended semantics is that bar will contain a pointer one byte past L0.

In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.

The solution used in ELF to use relocation with symbols if there is a non-zero
addend.

In MachO before this patch we would just keep all symbols in some sections.

This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.

This patch implements the non-zero addend logic for MachO too.

llvm-svn: 224985
2014-12-30 13:13:27 +00:00
Karthik Bhat bf662901c1 Lower multiply-negate operation to mneg on AArch64
This patch pattern matches code such as-
neg	 w8, w8
mul	 w8, w9, w8
to
mneg	 w8, w8, w9

Review: http://reviews.llvm.org/D6754
llvm-svn: 224706
2014-12-22 13:38:58 +00:00
Adrian Prantl b9fa945d51 ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions.
Debug info marks the first instruction without the FrameSetup flag
as being the end of the function prologue. Any CFI instructions in the
middle of the function prologue would cause debug info to end the prologue
too early and worse, attach the line number of the CFI instruction, which
incidentally is often 0.

llvm-svn: 224294
2014-12-16 00:20:49 +00:00
Michael Ilseman addddc441f Silence more static analyzer warnings.
Add in definedness checks for shift operators, null checks when
pointers are assumed by the code to be non-null, and explicit
unreachables.

llvm-svn: 224255
2014-12-15 18:48:43 +00:00
Matthias Braun b2f2388a76 Enable MachineVerifier in debug mode for X86, ARM, AArch64, Mips.
llvm-svn: 224075
2014-12-11 23:18:03 +00:00
Matthias Braun 7e37a5f523 [CodeGen] Add print and verify pass after each MachineFunctionPass by default
Previously print+verify passes were added in a very unsystematic way, which is
annoying when debugging as you miss intermediate steps and allows bugs to stay
unnotice when no verification is performed.

To make this change practical I added the possibility to explicitely disable
verification. I used this option on all places where no verification was
performed previously (because alot of places actually don't pass the
MachineVerifier).
In the long term these problems should be fixed properly and verification
enabled after each pass. I'll enable some more verification in subsequent
commits.

This is the 2nd attempt at this after realizing that PassManager::add() may
actually delete the pass.

llvm-svn: 224059
2014-12-11 21:26:47 +00:00
Rafael Espindola 01c73610d0 This reverts commit r224043 and r224042.
check-llvm was failing.

llvm-svn: 224045
2014-12-11 20:03:57 +00:00
Matthias Braun 199aeff7dd Enable machineverifier in debug mode for X86, ARM, AArch64, Mips
llvm-svn: 224043
2014-12-11 19:42:09 +00:00
Matthias Braun a7c82a9f1d [CodeGen] Add print and verify pass after each MachineFunctionPass by default
Previously print+verify passes were added in a very unsystematic way, which is
annoying when debugging as you miss intermediate steps and allows bugs to stay
unnotice when no verification is performed.

To make this change practical I added the possibility to explicitely disable
verification. I used this option on all places where no verification was
performed previously (because alot of places actually don't pass the
MachineVerifier).
In the long term these problems should be fixed properly and verification
enabled after each pass. I'll enable some more verification in subsequent
commits.

llvm-svn: 224042
2014-12-11 19:42:05 +00:00
Juergen Ributzka 2326650ceb [AArch64] MachO large code-model: Materialize FP constants in code.
In the large code model we have to first get the address of the GOT entry, load
the address of the constant, and then load the constant itself.

To avoid these loads and the GOT entry alltogether this commit changes the way
how FP constants are materialized in the large code model. The constats are now
materialized in a GPR and then bitconverted/moved into the FPR.

Reviewed by Tim Northover

Fixes rdar://problem/16572564.

llvm-svn: 223941
2014-12-10 19:43:32 +00:00
Juergen Ributzka c6f314b8ed [FastISel][AArch64] Fix a missing nullptr check in 'computeAddress'.
The load/store value type is currently not available when lowering the memcpy
intrinsic. Add the missing nullptr check to support this in 'computeAddress'.

Fixes rdar://problem/19178947.

llvm-svn: 223818
2014-12-09 19:44:38 +00:00
Tim Northover 67be569a31 AArch64: treat HFAs containing "half" types as blocks too.
llvm-svn: 223669
2014-12-08 17:54:58 +00:00
Benjamin Kramer 89e5306f43 Make the DenseMap bucket type configurable and use a smaller bucket for DenseSet.
DenseSet used to be implemented as DenseMap<Key, char>, which usually doubled
the memory footprint of the map. Now we use a compressed set so the second
element uses no memory at all. This required some surgery on DenseMap as
all accesses to the bucket now have to go through methods; this should
have no impact on the behavior of DenseMap though. The new default bucket
type for DenseMap is a slightly extended std::pair as we expose it through
DenseMap's iterator and don't want to break any existing users.

llvm-svn: 223588
2014-12-06 19:22:44 +00:00
Tim Northover 5e84fe3ed4 AArch64: use explicit MVT::i64 when creating EXTRACT_SUBVECTOR nodes.
All our patterns use MVT::i64, but the ISelLowering nodes were inconsistent in
their choice.

No functional change.

llvm-svn: 223551
2014-12-06 00:33:37 +00:00
Weiming Zhao cc4bf3ff3d [AArch64] Combining Load and IntToFp should check for neon availability
llvm-svn: 223382
2014-12-04 20:25:50 +00:00
Matt Arsenault 4e27343eec Allow target to specify prefix for labels
Use the MCAsmInfo instead of the DataLayout, and allow
specifying a custom prefix for labels specifically. HSAIL
requires that labels begin with @, but global symbols with &.

llvm-svn: 223323
2014-12-04 00:06:57 +00:00
Tim Northover 293d414380 AArch64: fix wrong-endian parameter passing.
The blocked arguments code didn't take account of the hacks needed to support
it.

llvm-svn: 223247
2014-12-03 17:49:26 +00:00
Tim Northover 4a8ac260cc AArch64: strengthen Darwin ABI alignment assumptions
A global variable without an explicit alignment specified should be assumed to
be ABI-aligned according to its type, like on other platforms. This allows us
to use better memory operations when accessing it.

rdar://18533701

llvm-svn: 223180
2014-12-02 23:53:43 +00:00
Tim Northover ec7ebebe55 AArch64: don't be too greedy when folding :lo12: accesses into mem ops.
This frequently leads to cases like:
   ldr xD, [xN, :lo12:var]
   add xA, xN, :lo12:var
   ldr xD, [xA, #8]

where the ADD would have been needed anyway, and the two distinct addressing
modes can prevent the formation of an ldp. Because of how we handle ADRP
(aggressively forming an ADRP/ADD pseudo-inst at ISel time), this pattern also
results in duplicated ADRP instructions (one on its own to cover the ldr, and
one combined with the add).

llvm-svn: 223172
2014-12-02 23:13:39 +00:00
Lang Hames a7395bf49b [AArch64][Stackmaps] Optimize stackmap shadows on AArch64.
Reduce the number of nops emitted for stackmap shadows on AArch64 by counting
non-stackmap instructions up to the next branch target towards the requested
shadow.

<rdar://problem/14959522>

llvm-svn: 223156
2014-12-02 21:36:24 +00:00
Tim Northover 24ec87debb AArch64: make register block rules apply to vector types too.
The blocking code originated in ARM, which is more aggressive about casting
types to a canonical representative before doing anything else, so I missed out
most vector HFAs and broke the ABI. This should fix it.

llvm-svn: 223126
2014-12-02 17:15:22 +00:00
Ahmed Bougacha d0ce058f2c [AArch64] Don't combine "select (setcc i1 LHS, RHS), vL, vR".
r208210 introduced an optimization that improves the vector select
codegen by doing the setcc on vectors directly.
This is a problem they the setcc operands are i1s, because the
optimization would create vectors of i1, which aren't legal.

Part of PR21549.

Differential Revision: http://reviews.llvm.org/D6308

llvm-svn: 223075
2014-12-01 20:59:00 +00:00
Ahmed Bougacha 879463206e [AArch64] Fix v2i8->i16 bitcast legalization.
r213378 improved f16 bitcasts, so that they go directly through subregs,
instead of through the stack.  That code now causes an assertion failure
for bitcasts from other 16-bits types (most importantly v2i8).

Correct that by doing the custom lowering for i16 bitcasts only when the
input is an f16.

Part of PR21549.

Differential Revision: http://reviews.llvm.org/D6307

llvm-svn: 223074
2014-12-01 20:52:32 +00:00
Akira Hatanaka 107d13c228 Fix capitalization. NFC.
llvm-svn: 222988
2014-12-01 06:14:52 +00:00
Craig Topper 44586dc4d6 Add missing 'override' keyword.
llvm-svn: 222911
2014-11-28 03:58:26 +00:00
Tim Northover a38e5cbf20 Stop using ArrayRef of a const type.
I *think* this is what the GCC bots are complaining about.

llvm-svn: 222905
2014-11-27 21:29:20 +00:00
Tim Northover 3c55ccac48 AArch64: treat [N x Ty] as a block during procedure calls.
The AAPCS treats small structs and homogeneous floating (or vector) aggregates
specially, and guarantees they either get passed as a contiguous block of
registers, or prevent any future use of those registers and get passed on the
stack.

This concept can fit quite neatly into LLVM's own type system, mapping an HFA
to [N x float] and so on, and small structs to [N x i64]. Doing so allows
front-ends to emit AAPCS compliant code without having to duplicate the
register counting logic.

llvm-svn: 222903
2014-11-27 21:02:42 +00:00
Will Newton 40f08faa70 Update AArch64 ELF relocations to ABI 1.0
This mostly entails adding relocations, however there are a couple of
changes to existing relocations:

1. R_AARCH64_NONE is defined to be zero rather than 256

R_AARCH64_NONE has been defined to be zero for a long time elsewhere
e.g. binutils and glibc since the submission of the AArch64 port in
2012 so this is required for compatibility.

2. R_AARCH64_TLSDESC_ADR_PAGE renamed to R_AARCH64_TLSDESC_ADR_PAGE21

I don't think there is any way for relocation names to leak out of LLVM
so this should not break anything.

Tested with check-all with no regressions.

llvm-svn: 222821
2014-11-26 10:49:18 +00:00
Craig Topper c50d64b07b Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files.
llvm-svn: 222801
2014-11-26 00:46:26 +00:00
Juergen Ributzka eb67bd8d74 [FastISel][AArch64] Fix and extend the tbz/tbnz pattern matching.
The pattern matching failed to recognize all instances of "-1", because when
comparing against "-1" we didn't use an APInt of the same bitwidth.

This commit fixes this and also adds inverse versions of the conditon to catch
more cases.

llvm-svn: 222722
2014-11-25 04:16:15 +00:00
Chad Rosier ba0e0664ff [AArch64] Fix clobber computation in A57LoadBalancing pass.
Extremely difficult to reproduce, so no test case included.
PR21637

llvm-svn: 222677
2014-11-24 18:57:58 +00:00
Hao Liu 44e5d7a131 DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor info FMULs by the reciprocal.
E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip)

A hook is added to allow the target to control whether it needs to do such combine.

Reviewed in http://reviews.llvm.org/D6334

llvm-svn: 222510
2014-11-21 06:39:58 +00:00
Reid Kleckner 343c395f11 Fix more instances of -Wsentinel on Windows with s/NULL/nullptr/
Follow up to r221940, where I must not have caught em all. NFC

llvm-svn: 222481
2014-11-20 23:51:47 +00:00
Reid Kleckner 357600eab5 Add out of line virtual destructors to all LLVMTargetMachine subclasses
These recently all grew a unique_ptr<TargetLoweringObjectFile> member in
r221878.  When anyone calls a virtual method of a class, clang-cl
requires all virtual methods to be semantically valid. This includes the
implicit virtual destructor, which triggers instantiation of the
unique_ptr destructor, which fails because the type being deleted is
incomplete.

This is just part of the ongoing saga of PR20337, which is affecting
Blink as well. Because the MSVC ABI doesn't have key functions, we end
up referencing the vtable and implicit destructor on any virtual call
through a class. We don't actually end up emitting the dtor, so it'd be
good if we could avoid this unneeded type completion work.

llvm-svn: 222480
2014-11-20 23:37:18 +00:00
David Blaikie 70573dcd9f Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.

This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...

llvm-svn: 222334
2014-11-19 07:49:26 +00:00
Hao Liu 2aa06a989d [AArch64] Disable useAA for Cortex-A57.
Using AA during CodeGen is very useful for in-order cores. It is less useful for ooo cores. Also I find
enabling useAA for Cortex-A57 may generate worse code for some test cases. If useAA in codegen is improved 
and benefical for ooo cores, we can enable it again.

llvm-svn: 222333
2014-11-19 06:48:56 +00:00
Hao Liu fd46bea46a [AArch64] Enable SeparateConstOffsetFromGEP, EarlyCSE and LICM passes on AArch64 backend.
SeparateConstOffsetFromGEP can gives more optimizaiton opportunities related to GEPs, which benefits EarlyCSE
and LICM. By enabling these passes we can have better address calculations and generate a better addressing
mode. Some SPEC 2006 benchmarks (astar, gobmk, namd) have obvious improvements on Cortex-A57.

Reviewed in http://reviews.llvm.org/D5864.

llvm-svn: 222331
2014-11-19 06:39:53 +00:00