Commit Graph

1260 Commits

Author SHA1 Message Date
Quentin Colombet a80b9c824e [AArch64][CollectLOH] Remove an invalid assertion and add a test case exposing it.
rdar://problem/22491525

llvm-svn: 246472
2015-08-31 19:02:00 +00:00
Matthias Braun 0acbd08f3c AArch64: Fix loads to lower NEON vector lanes using GPR registers
The ISelLowering code turned insertion turned the element for the
lowest lane of a BUILD_VECTOR into an INSERT_SUBREG, this prohibited
the patterns for SCALAR_TO_VECTOR(Load) to match later. Restrict this
to cases without a load argument.

Reported in rdar://22223823

Differential Revision: http://reviews.llvm.org/D12467

llvm-svn: 246462
2015-08-31 18:25:15 +00:00
Hal Finkel 982e8d48f8 [MIR Serialization] static -> static const in getSerializable*MachineOperandTargetFlags
Make the arrays 'static const' instead of just 'static'. Post-commit review
comment from Roman Divacky on IRC. NFC.

llvm-svn: 246376
2015-08-30 08:07:29 +00:00
Quentin Colombet fa4ecb4b9a [AArch64][CollectLOH] Fix a regression that prevented us to detect chains of
more than 2 instructions.

I introduced this regression a while back and did not noticed it because I
somehow forgot to push the initial test cases for the pass!

Fix that as well!

llvm-svn: 246239
2015-08-27 23:47:10 +00:00
Chad Rosier 9f4709b261 [AArch64] Remove a use-after-free when collecting stats.
The call to mergePairedInsns() deletes MI, so the later use by isUnscaledLdSt()
is referencing freed memory.

llvm-svn: 246033
2015-08-26 13:39:48 +00:00
Silviu Baranga db1ddb32ce [AArch64] Unify the integer min/max vector selection patterns with the intrinsic ones
Summary:
This change lowers the aarch64 integer vector min/max intrinsic nodes to
generic min/max nodes and replaces the intrinsic selection patterns with
the generic ones.

There should already be testing in place for this, so no further tests
were added.

Reviewers: jmolloy

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12276

llvm-svn: 246030
2015-08-26 11:11:14 +00:00
Matthias Braun 17af607796 FastISel: Factor out common code; NFC intended
This should be no functional change but for the record: For three cases
in X86FastISel this will change the order in which the FalseMBB and
TrueMBB of a conditional branch is addedd to the successor/predecessor
lists.

llvm-svn: 245997
2015-08-26 01:38:00 +00:00
Mehdi Amini a758398833 Add missing break in AArch64DAGToDAGISel::Select() switch case
Reported by coverity.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245800
2015-08-23 00:42:57 +00:00
Matthias Braun 46e5639806 AArch64: Fix cmp;ccmp ordering
When producing conditional compare sequences for or operations we need
to negate the operands and the finally tested flags. The thing is if we negate
the finally tested flags this equals a logical negation of all previously
emitted expressions. There was a case missing where we have to order OR
expressions so they get emitted first.

This fixes http://llvm.org/PR24459

llvm-svn: 245641
2015-08-20 23:33:34 +00:00
Matthias Braun 266204b7dc AArch64: Do not create CCMP on multiple users.
Create CMP;CCMP sequences from and/or trees does not gain us anything if
the and/or tree is materialized to a GP register anyway. While most of
the code already checked for hasOneUse() there was one important case
missing.

llvm-svn: 245640
2015-08-20 23:33:31 +00:00
Juergen Ributzka b12248e9cd [AArch64][FastISel] Don't fold shifts with UB.
We are already falling back to SelectionDAG when encountering an shift with UB.
This adds the same checks for shifts with UB that get folded into arithmetic or
logical operations.

This fixes rdar://problem/22345295.

llvm-svn: 245499
2015-08-19 20:52:55 +00:00
Ahmed Bougacha 9e00ec6195 [AArch64] Improve short-form diags on long-form Match_InvalidOperand.
Since r244955, we try to use the short-form ErrorInfo when both
tries failed, and the long-form match failed on a suffix operand.
However, this means we sometimes mix ErrorInfo and MatchResult
(one manifestation of this being PR24498). Instead, restore both.

llvm-svn: 245469
2015-08-19 17:40:19 +00:00
Renato Golin eb552e83e0 Revert "[AArch64] Simplify/refactor code to ease code review. NFC."
This reverts commit r245443, as it broke AArch64 test-suite tramp3d
with an assert "Reg && "Null register has no regunits".

llvm-svn: 245455
2015-08-19 16:29:53 +00:00
Chad Rosier 494abf1ad8 [AArch64] Simplify/refactor code to ease code review. NFC.
llvm-svn: 245443
2015-08-19 14:34:54 +00:00
Alex Lorenz f3630113cd MIR Serialization: Serialize the operand's bit mask target flags.
This commit adds support for bit mask target flag serialization to the MIR
printer and the MIR parser. It also adds support for the machine operand's
target flag serialization to the AArch64 target.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 245383
2015-08-18 22:52:15 +00:00
Chad Rosier 3dd0e942b6 [AArch64] Simplify the logic for computing in bounds offset. NFC.
llvm-svn: 245307
2015-08-18 16:20:03 +00:00
Silviu Baranga b322aa6f53 [CostModel][AArch64] Increase cost of vector insert element and add missing cast costs
Summary:
Increase the estimated costs for insert/extract element operations on
AArch64. This is motivated by results from benchmarking interleaved
accesses.

Add missing costs for zext/sext/trunc instructions and some integer to
floating point conversions. These costs were previously calculated
by scalarizing these operation and were affected by the cost increase of
the insert/extract element operations.

Reviewers: rengolin

Subscribers: mcrosier, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11939

llvm-svn: 245226
2015-08-17 16:05:09 +00:00
James Molloy 88edc8243d Remove hand-rolled matching for fmin and fmax.
SDAGBuilder now does this all for us.

llvm-svn: 245198
2015-08-17 07:13:20 +00:00
James Y Knight 5567bafe93 Remove redundant TargetFrameLowering::getFrameIndexOffset virtual
function.

This was the same as getFrameIndexReference, but without the FrameReg
output.

Differential Revision: http://reviews.llvm.org/D12042

llvm-svn: 245148
2015-08-15 02:32:35 +00:00
Ahmed Bougacha cd35787217 [AArch64] Fix FMLS scalar-indexed-from-2s-after-neg patterns.
We canonicalize V64 vectors to V128 through insert_subvector: the other
FMLA/FMLS/FMUL/FMULX patterns match that already, but this one doesn't,
so we'd fail to match fmls and generate fneg+fmla instead.
The vector equivalents are already tested and functional.

llvm-svn: 245107
2015-08-14 22:06:05 +00:00
Rafael Espindola dbaf0498a9 Revert "Centralize the information about which object format we are using."
This reverts commit r245047.

It was failing on the darwin bots. The problem was that when running

./bin/llc -march=msp430

llc gets to

  if (TheTriple.getTriple().empty())
    TheTriple.setTriple(sys::getDefaultTargetTriple());

Which means that we go with an arch of msp430 but a triple of
x86_64-apple-darwin14.4.0 which fails badly.

That code has to be updated to select a triple based on the value of
march, but that is not a trivial fix.

llvm-svn: 245062
2015-08-14 15:48:41 +00:00
Rafael Espindola 90eb70c8a7 Centralize the information about which object format we are using.
Other than some places that were handling unknown as ELF, this should
have no change. The test updates are because we were detecting
arm-coff or x86_64-win64-coff as ELF targets before.

It is not clear if the enum should live on the Triple. At least now it lives
in a single location and should be easier to move somewhere else.

llvm-svn: 245047
2015-08-14 13:31:17 +00:00
James Molloy 63be198712 [AArch64] FMINNAN/FMAXNAN on f16 is not legal.
Spotted by Ahmed - in r244594 I inadvertently marked f16 min/max as legal.

I've reverted it here, and marked min/max on scalar f16's as promote. I've also added a testcase. The test just checks that the compiler doesn't fall over - it doesn't create fmin nodes for f16 yet.

llvm-svn: 245035
2015-08-14 09:08:50 +00:00
Ahmed Bougacha 80e4ac802a [AArch64] Provide "too few operands" diags on short-form NEON also.
We used to just say "invalid type suffix for instruction", which is
misleading. This is because we fallback to the long-form matcher if the
short-form matcher failed, losing the error information on the way.

Save it, so that we can provide a little better diagnostics when the
long-form matcher thinks a suffix is the cause of the error.

llvm-svn: 244955
2015-08-13 21:09:13 +00:00
Ahmed Bougacha 2a97b1bcf8 [AArch64] Also custom-lowering mismatched vector/f16 FCOPYSIGN.
We can lower them using our cool tricks if we fpext/fptrunc the second
input, like we do for f32/f64.

Follow-up to r243924, r243926, and r244858.

llvm-svn: 244860
2015-08-13 01:13:56 +00:00
Alex Lorenz e40c8a2b26 PseudoSourceValue: Replace global manager with a manager in a machine function.
This commit removes the global manager variable which is responsible for
storing and allocating pseudo source values and instead it introduces a new
manager class named 'PseudoSourceValueManager'. Machine functions now own an
instance of the pseudo source value manager class.

This commit also modifies the 'get...' methods in the 'MachinePointerInfo'
class to construct pseudo source values using the instance of the pseudo
source value manager object from the machine function.

This commit updates calls to the 'get...' methods from the 'MachinePointerInfo'
class in a lot of different files because those calls now need to pass in a
reference to a machine function to those methods.

This change will make it easier to serialize pseudo source values as it will
enable me to transform the mips specific MipsCallEntry PseudoSourceValue
subclass into two target independent subclasses.

Reviewers: Akira Hatanaka
llvm-svn: 244693
2015-08-11 23:09:45 +00:00
James Molloy b7b2a1e9b4 [AArch64] Match fminnum/fmaxnum for vector fminnm/fmaxnm instead of an intrinsic.
Lower Intrinsic::aarch64_neon_fmin/fmax to fminnum/fmannum and match that instead. Minimal functional change:

  - Extra tests added because coverage of scalar fminnm/fmaxnm instructions was nonexistant.
  - f16 test updated because now we actually generate scalar fminnm/fmaxnm we no longer need to bail out to a libcall!

llvm-svn: 244595
2015-08-11 12:06:37 +00:00
James Molloy edf38f0cb0 [AArch64] Replace the custom AArch64ISD::FMIN/MAX nodes with ISD::FMINNAN/MAXNAN
NFCI. This just removes custom ISDNodes that are no longer needed.

llvm-svn: 244594
2015-08-11 12:06:33 +00:00
Chad Rosier c56a9132d0 [AArch64] Convert a conditional check that will always be true to an assert. NFC.
llvm-svn: 244479
2015-08-10 18:42:45 +00:00
Chad Rosier caed6db51e Typo. Move comment closer to relevant code. NFC.
llvm-svn: 244465
2015-08-10 17:17:19 +00:00
Benjamin Kramer df005cbe19 Fix some comment typos.
llvm-svn: 244402
2015-08-08 18:27:36 +00:00
Quentin Colombet 7d8c74ff3f [AArch64][LoadStoreOptimizer] Turn a test into an assert. NFC.
At this point the given Opc must be valid, otherwise we should
not look for a matching pair to form paired load or store.

Thanks to Chad to point out this piece of code!

llvm-svn: 244366
2015-08-07 22:40:51 +00:00
Juergen Ributzka f09c7a3d0f [AArch64][FastISel] Always use AND before checking the branch flag.
When we are not emitting the condition for the branch, because the condition is
in another BB or SDAG did the selection for us, then we have to mask the flag in
the register with AND.

This is required when the condition comes from a truncate, because SDAG only
truncates down to a legal size of i32.

This fixes rdar://problem/22161062.

llvm-svn: 244291
2015-08-06 22:44:15 +00:00
Juergen Ributzka 9f54dbe7a1 Revert "[AArch64][FastISel] Add more truncation tests." and "[AArch64][FastISel] Always use an AND instruction when truncating to non-legal types."
This reverts commit r243198 and 243304.

Turns out this wasn't the correct fix for this problem. It works only within
FastISel, but fails when the truncate is selected by SDAG.

llvm-svn: 244287
2015-08-06 22:13:48 +00:00
Nico Rieck 78199518c4 Rename inst_range() to instructions() for consistency. NFC
llvm-svn: 244248
2015-08-06 19:10:45 +00:00
Chad Rosier 22eb71056d [AArch64] Use a static function and other minor cleanup for readability. NFC.
llvm-svn: 244233
2015-08-06 17:37:18 +00:00
Chad Rosier f77e909f0a [AArch64] Improve the readability of the ld/st optimization pass. NFC.
llvm-svn: 244222
2015-08-06 15:50:12 +00:00
Chandler Carruth 93205eb966 [TTI] Make the cost APIs in TargetTransformInfo consistently use 'int'
rather than 'unsigned' for their costs.

For something like costs in particular there is a natural "negative"
value, that of savings or saved cost. As a consequence, there is a lot
of code that subtracts or creates negative values based on cost, all of
which is prone to awkwardness or bugs when dealing with an unsigned
type. Similarly, we *never* want these values to wrap, as that would
cause Very Bad code generation (likely percieved as an infinite loop as
we try to emit over 2^32 instructions or some such insanity).

All around 'int' seems a much better fit for these basic metrics. I've
added asserts to ensure that at least the TTI interface never returns
negative numbers here. If we ever have a use case for negative numbers,
we can remove this, but this way a bug where someone used '-1' to
produce a 'very large' cost will be caught by the assert.

This passes all tests, and is also UBSan clean.

No functional change intended.

Differential Revision: http://reviews.llvm.org/D11741

llvm-svn: 244080
2015-08-05 18:08:10 +00:00
Pete Cooper 3ae0ee5453 Move BB succ_iterator to be inside TerminatorInst. NFC.
To get the successors of a BB we currently do successors(BB) which
ultimately walks the successors of the BB's terminator.

This moves the iterator to TerminatorInst as thats what we're actually
using to do the iteration, and adds a member function to TerminatorInst
to allow us to iterate directly over successors given an instruction.

For example, we can now do

  for (auto *Succ : BI->successors())

instead of

  for (unsigned i = 0, e = BI->getNumSuccessors(); i != e; ++i)

Reviewed by Tobias Grosser.

llvm-svn: 244074
2015-08-05 17:43:01 +00:00
Chad Rosier 69e3eb3c79 [AArch64] Register AArch64DeadRegisterDefinition pass with LLVM pass manager.
llvm-svn: 244067
2015-08-05 17:35:34 +00:00
Chad Rosier 1c81432eb6 [AArch64] Register (existing) AArch64BranchRelaxation pass with LLVM pass manager.
Summary: Among other things, this allows -print-after-all/-print-before-all to
dump IR around this pass.

llvm-svn: 244060
2015-08-05 16:12:10 +00:00
Chad Rosier 0c6c5fc303 [AArch64] Make the naming of the Address Type Promotion pass consistent.
llvm-svn: 244057
2015-08-05 15:32:23 +00:00
Chad Rosier 794b9b2fdd [AArch64] Register (existing) AArch64AdvSIMDScalar pass with LLVM pass manager.
Summary: Among other things, this allows -print-after-all/-print-before-all to
dump IR around this pass.

IIRC, this pass is off by default, but it's still helpful when debugging.

llvm-svn: 244056
2015-08-05 15:18:58 +00:00
Chad Rosier 084b78632e Make this less error prone by using a #define. NFC.
llvm-svn: 244048
2015-08-05 14:48:44 +00:00
Chad Rosier 9378c16ac8 [AArch64] Register (existing) AArch64ExpandPseudo pass with LLVM pass manager.
Summary: Among other things, this allows -print-after-all/-print-before-all to
dump IR around this pass.

llvm-svn: 244046
2015-08-05 14:22:53 +00:00
Chad Rosier 96530b3a43 [AArch64] Register (existing) AArch64LoadStoreOpt pass with LLVM pass manager.
Summary: Among other things, this allows -print-after-all/-print-before-all to
dump IR around this pass.

This is the AArch64 version of r243052.

llvm-svn: 244041
2015-08-05 13:44:51 +00:00
Chad Rosier 43f5c84cfc Update comment. NFC.
llvm-svn: 244038
2015-08-05 12:40:13 +00:00
Sanjay Patel 924879ad2c wrap OptSize and MinSize attributes for easier and consistent access (NFCI)
Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of "or'ing" them together when optimizing
just for size (-Os).

Currently, we are not consistent about this and rely on a front-end to always set
OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here
that should be added as follow-on patches with regression tests.

This patch is NFC-intended: it just replaces existing direct accesses of the attributes
by the equivalent wrapper call.

Differential Revision: http://reviews.llvm.org/D11734

llvm-svn: 243994
2015-08-04 15:49:57 +00:00
Ahmed Bougacha 81fda188f9 [AArch64] Rename FP formats to be more consistent. NFC.
Some are named "FP", others "SD", others still "FP*SD".
Rename all this to just use "FP", which, except for conversions
(which don't use this format naming scheme), implies "SD" anyway.

llvm-svn: 243936
2015-08-04 01:38:08 +00:00
Ahmed Bougacha e0e12db8c8 [AArch64] Add isel support for f16 indexed LD/ST.
llvm-svn: 243935
2015-08-04 01:29:38 +00:00
Ahmed Bougacha e8ea9ac32b [AArch64][v8.1a] The "pan" sysreg isn't MSR-specific. NFCI.
It's already in SysRegMappings, no need to also have it in MSRMappings:
the latter is only used if we didn't find a match in the former.

llvm-svn: 243933
2015-08-04 00:55:11 +00:00
Ahmed Bougacha 0cbe2efcd6 [AArch64] Remove unnecessary "break". NFC.
llvm-svn: 243931
2015-08-04 00:49:08 +00:00
Ahmed Bougacha 239d635d3d [AArch64] Use SDValue bool operator. NFC.
llvm-svn: 243930
2015-08-04 00:48:02 +00:00
Ahmed Bougacha b0ae36f0d1 [AArch64] Vector FCOPYSIGN supports Custom-lowering: mark it as such.
There's a bunch of code in LowerFCOPYSIGN that does smart lowering, and
is actually already vector-aware; let's use it instead of scalarizing!

The only interesting change is that for v2f32, we previously always used
use v4i32 as the integer vector type.
Use v2i32 instead, and mark FCOPYSIGN as Custom.

llvm-svn: 243926
2015-08-04 00:42:34 +00:00
Pete Cooper 7be8f8f018 Convert some AArch64 code to foreach loops. NFC.
Also converted a cast<> to dyn_cast while i was working on the same
line of code.

llvm-svn: 243894
2015-08-03 19:04:32 +00:00
Craig Topper e3dcce9700 De-constify pointers to Type since they can't be modified. NFC
This was already done in most places a while ago. This just fixes the ones that crept in over time.

llvm-svn: 243842
2015-08-01 22:20:21 +00:00
Geoff Berry 8a7ef3b2ee [AArch64] Favor extended reg patterns for sub
Summary:
Favor the extended reg patterns over the shifted reg patterns that match
only the operand shift and not the full sign/zero extend and shift.

Reviewers: jmolloy, t.p.northover

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11569

llvm-svn: 243753
2015-07-31 15:55:54 +00:00
Nick Lewycky c3890d2969 Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.)
Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h.

llvm-svn: 243585
2015-07-29 22:32:47 +00:00
Tim Northover 2a9d801fd5 AArch64: use 32-bit MOV rather than UBFX to truncate registers.
It's potentially more efficient on Cyclone, and from the optimization guides &
schedulers looks like it has no effect on Cortex-A53 or A57. In general you'd
expect a MOV to be about the most efficient instruction with its semantics,
even though the official "UXTW" alias is really a UBFX.

llvm-svn: 243576
2015-07-29 21:34:32 +00:00
Tim Northover cf739b8c3d AArch64: use AddressingModes.h accessors for compare shifts
No functional change because "lsl #12" is actually encoded as 12, but one less
bug if someone ever decides to change that for the giggles.

llvm-svn: 243536
2015-07-29 16:39:56 +00:00
Akira Hatanaka f53b0403f8 [AArch64] Define subtarget feature strict-align.
This commit defines subtarget feature strict-align and uses it instead of
cl::opt -aarch64-strict-align to decide whether strict alignment should be
forced.

rdar://problem/21529937

llvm-svn: 243516
2015-07-29 14:17:26 +00:00
Sanjay Patel 1dd15598cf fix TLI's combineRepeatedFPDivisors interface to return the minimum user threshold
This fix was suggested as part of D11345 and is part of fixing PR24141.

With this change, we can avoid walking the uses of a divisor node if the target
doesn't want the combineRepeatedFPDivisors transform in the first place.

There is no NFC-intended other than that.

Differential Revision: http://reviews.llvm.org/D11531

llvm-svn: 243498
2015-07-28 23:05:48 +00:00
Tim Northover 17ae83a25f AArch64: be careful of large immediates when optimising cmps.
llvm-svn: 243492
2015-07-28 22:42:32 +00:00
Chih-Hung Hsieh 1e859582d6 Implement target independent TLS compatible with glibc's emutls.c.
The 'common' section TLS is not implemented.
Current C/C++ TLS variables are not placed in common section.
DWARF debug info to get the address of TLS variables is not generated yet.

clang and driver changes in http://reviews.llvm.org/D10524

  Added -femulated-tls flag to select the emulated TLS model,
  which will be used for old targets like Android that do not
  support ELF TLS models.

Added TargetLowering::LowerToTLSEmulatedModel as a target-independent
function to convert a SDNode of TLS variable address to a function call
to __emutls_get_address.

Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel
for TLSModel::Emulated. Although all targets supporting ELF TLS models are
enhanced, emulated TLS model has been tested only for Android ELF targets.
Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for
emulated TLS variables.
Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls.

TODO: Add proper DIE for emulated TLS variables.
      Added new unit tests with emulated TLS.

Differential Revision: http://reviews.llvm.org/D10522

llvm-svn: 243438
2015-07-28 16:24:05 +00:00
Geoff Berry c573bf7a5f [AArch64] Match float round and convert to int instructions.
Summary:
Add patterns for doing floating point round with various rounding modes
followed by conversion to int as a single FCVT* instruction.

Reviewers: t.p.northover, jmolloy

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D11424

llvm-svn: 243422
2015-07-28 15:24:10 +00:00
Adhemerval Zanella 7bc3319d84 Implement __builtin_thread_pointer
This path add the aarch64 lowering of __builtin_thread_pointer.  It uses
the already implemented AArch64ISD::THREAD_POINTER used in TLS generation.

llvm-svn: 243412
2015-07-28 13:03:31 +00:00
Colin LeMahieu fe2c8b8015 [llvm-mc] Pushing plumbing through for --fatal-warnings flag.
llvm-svn: 243334
2015-07-27 21:56:53 +00:00
Akira Hatanaka 2541e0241c [AArch64] Remove check for Darwin that was needed to decide if x18 should
be reserved.

The decision to reserve x18 is going to be made solely by the front-end,
so it isn't necessary to check if the OS is Darwin in the backend.

llvm-svn: 243308
2015-07-27 19:18:47 +00:00
Silviu Baranga 7581d22512 [ARM/AArch64] Fix cost model for interleaved accesses
Summary:
Fix the cost of interleaved accesses for ARM/AArch64.
We were calling getTypeAllocSize and using it to check
the number of bits, when we should have called
getTypeAllocSizeInBits instead.

This would pottentially cause the vectorizer to
generate loads/stores and shuffles which cannot
be matched with an interleaved access instruction.

No performance changes are expected for now since
matching/generating interleaved accesses is still
disabled by default.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11524

llvm-svn: 243270
2015-07-27 14:39:34 +00:00
Juergen Ributzka 6364985b58 [AArch64][FastISel] Always use an AND instruction when truncating to non-legal types.
When truncating to non-legal types (such as i16, i8 and i1) always use an AND
instruction to mask out the upper bits. This was only done when the source type
was an i64, but not when the source type was an i32.

This commit fixes this and adds the missing i32 truncate tests.

This fixes rdar://problem/21990703.

llvm-svn: 243198
2015-07-25 02:16:53 +00:00
Akira Hatanaka 0d4c9ea6e0 [AArch64] Define subtarget feature "reserve-x18", which is used to decide
whether register x18 should be reserved.

This change is needed because we cannot use a backend option to set
cl::opt "aarch64-reserve-x18" when doing LTO.

Out-of-tree projects currently using cl::opt option "-aarch64-reserve-x18"
to reserve x18 should make changes to add subtarget feature "reserve-x18"
to the IR.

rdar://problem/21529937

Differential Revision: http://reviews.llvm.org/D11463

llvm-svn: 243186
2015-07-25 00:18:31 +00:00
Pete Cooper 7679afda82 Use make_range(rbegin(), rend()) to allow foreach loops. NFC.
Instead of the pattern

for (auto I = x.rbegin(), E = x.end(); I != E; ++I)

we can use make_range to construct the reverse range and iterate using
that instead.

llvm-svn: 243163
2015-07-24 21:13:43 +00:00
Luke Cheeseman b5c627aba8 When lowering vector shifts a check is performed to see if the value to shift by
is an immediate, in this check the value is negated and stored in and int64_t.
The value can be -2^63 yet the result cannot be stored in an int64_t and this
gives some undefined behaviour causing failures. The negation is only necessary
when the values is within a certain range and so it should not need to negate
-2^63, this patch introduces this and also a regression test.

Differential Revision: http://reviews.llvm.org/D11408

llvm-svn: 243100
2015-07-24 09:31:48 +00:00
Lawrence Hu 687097a0a9 test commit, only added one space
llvm-svn: 243070
2015-07-23 23:55:28 +00:00
Weiming Zhao b33a5557f4 This patch eanble register coalescing to coalesce the following:
%vreg2<def> = MOVi32imm 1; GPR32:%vreg2
  %W1<def> = COPY %vreg2; GPR32:%vreg2
into:
  %W1<def> = MOVi32imm 1
Patched by Lawrence Hu (lawrence@codeaurora.org)

llvm-svn: 243033
2015-07-23 19:24:53 +00:00
Chad Rosier 1bf48a6a69 Simplify switch as all cases other than default return true. NFC.
llvm-svn: 242922
2015-07-22 18:41:57 +00:00
Chad Rosier fe5399fe88 Follow up to r242810. NFC.
llvm-svn: 242812
2015-07-21 17:47:56 +00:00
Chad Rosier 96a18a96cc [AArch64] Simplify the passing of arguments. NFC.
This is setup for future work planned for the AArch64 Load/Store Opt pass.

llvm-svn: 242810
2015-07-21 17:42:04 +00:00
Matthias Braun c8b67e656b AArch64: Restrict macroop fusion heuristics to cyclone.
Even though this is just some hinting for the scheduler it doesn't make
sense to do that unless you know the target can perform the fusion.

llvm-svn: 242732
2015-07-20 23:11:42 +00:00
JF Bastien e4d22d59d1 Targets: commonize some stack realignment code
This patch does the following:
* Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`.
* Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute.

Multiple targets duplicated the same `needsStackRealignment` code:
 - Aarch64.
 - ARM.
 - Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has.
 - PowerPC.
 - WebAssembly.
 - x86 almost: has an extra `-force-align-stack` option, which the default implementation now has.

The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects:
 - AMDGPU
 - BPF
 - CppBackend
 - MSP430
 - NVPTX
 - Sparc
 - SystemZ
 - XCore
 - Out-of-tree targets
This is a breaking change! `make check` passes.

The only implementation of the `virtual` function (besides the slight different in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation.

`needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone.

Reviewers: sunfish

Subscribers: aemerson, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11160

llvm-svn: 242727
2015-07-20 22:51:32 +00:00
Matthias Braun e536f4f681 AArch64: Add aditional Cyclone macroop fusion opportunities
Related to rdar://19205407

Differential Revision: http://reviews.llvm.org/D10746

llvm-svn: 242724
2015-07-20 22:34:47 +00:00
Geoff Berry e41c2df0ef Fix comment typo (test commit). NFC
llvm-svn: 242719
2015-07-20 22:03:52 +00:00
Chad Rosier 3da0ea7f5d [AArch64] Change EON pattern to match more often.
Phabricator: http://reviews.llvm.org/D11359
Patch by Geoff Berry <gberry@codeaurora.org>

llvm-svn: 242694
2015-07-20 18:42:27 +00:00
James Molloy faf4e3c33b [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
No functional change, but it preps codegen for the future when SABSDIFF
will start getting generated in anger.

llvm-svn: 242545
2015-07-17 17:10:45 +00:00
Tim Northover a5740e0874 AArch64: add comment missed out from earlier patch.
Helps explain some of the background behind this bit of code.

llvm-svn: 242503
2015-07-17 03:31:50 +00:00
Tim Northover ca0ffc3561 AArch64: make inexact signalling on round Darwin-specific
C11 leaves the choice on whether round-to-integer operations set the inexact
flag implementation-defined. Darwin does expect it to be set, but this seems to
be against the intent of the IEEE document and slower to implement anyway. So
it should be opt-in.

llvm-svn: 242446
2015-07-16 21:30:21 +00:00
Matthias Braun af7d7709d6 AArch64: Implement conditional compare sequence matching.
This is a new iteration of the reverted r238793 /
http://reviews.llvm.org/D8232 which wrongly assumed that any and/or
trees can be represented by conditional compare sequences, however there
are some restrictions to that. This version fixes this and adds comments
that explain exactly what types of and/or trees can actually be
implemented as conditional compare sequences.

Related to http://llvm.org/PR20927, rdar://18326194

Differential Revision: http://reviews.llvm.org/D10579

llvm-svn: 242436
2015-07-16 20:02:37 +00:00
Mehdi Amini bd7287ebe5 Move most user of TargetMachine::getDataLayout to the Module one
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

This patch is quite boring overall, except for some uglyness in
ASMPrinter which has a getDataLayout function but has some clients
that use it without a Module (llmv-dsymutil, llvm-dwarfdump), so
some methods are taking a DataLayout as parameter.

Reviewers: echristo

Subscribers: yaron.keren, rafael, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D11090

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 242386
2015-07-16 06:11:10 +00:00
Petr Pavlu 097adfb98c [AArch64] Fix problems in decoding generic MSR instructions
Bitpatterns rejected by the decoder method of `MSR (immediate)` should be
decoded as the `extended MSR (register)` instruction.

Differential Revision: http://reviews.llvm.org/D7174

llvm-svn: 242276
2015-07-15 08:10:30 +00:00
JF Bastien c8f48c19d3 WebAssembly: fix build breakage.
Summary:
processFunctionBeforeCalleeSavedScan was renamed to determineCalleeSaves and now takes a BitVector parameter as of rL242165, reviewed in http://reviews.llvm.org/D10909

WebAssembly is still marked as experimental and therefore doesn't build by default. It does, however, grep by default! I notice that processFunctionBeforeCalleeSavedScan is still mentioned in a few comments and error messages, which I also fixed.

Reviewers: qcolombet, sunfish

Subscribers: jfb, dsanders, hfinkel, MatzeB, llvm-commits

Differential Revision: http://reviews.llvm.org/D11199

llvm-svn: 242242
2015-07-14 23:06:07 +00:00
Matthias Braun 9912bb817c MachineRegisterInfo: Remove UsedPhysReg infrastructure
We have a detailed def/use lists for every physical register in
MachineRegisterInfo anyway, so there is little use in maintaining an
additional bitset of which ones are used.

Removing it frees us from extra book keeping. This simplifies
VirtRegMap.

Differential Revision: http://reviews.llvm.org/D10911

llvm-svn: 242173
2015-07-14 17:52:07 +00:00
Matthias Braun 0256486532 PrologEpilogInserter: Rewrite API to determine callee save regsiters.
This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan():

- Rename the function to determineCalleeSaves()
- Pass a bitset of callee saved registers by reference, thus avoiding
  the function-global PhysRegUsed bitset in MachineRegisterInfo.
- Without PhysRegUsed the implementation is fine tuned to not save
  physcial registers which are only read but never modified.

Related to rdar://21539507

Differential Revision: http://reviews.llvm.org/D10909

llvm-svn: 242165
2015-07-14 17:17:13 +00:00
Tim Northover c962d4f28b AArch64: add rev64 alias for 64-bit rev instruction.
It could be useful to assembly programmers and makes the permitted variants a
little more uniform.

llvm-svn: 242164
2015-07-14 17:07:29 +00:00
Duncan P. N. Exon Smith 754e21f244 MC: Remove MCSubtargetInfo() default constructor
Force all creators of `MCSubtargetInfo` to immediately initialize it,
merging the default constructor and the initializer into an initializing
constructor.  Besides cleaning up the code a little, this makes it clear
that the initializer is never called again later.

Out-of-tree backends need a trivial change: instead of calling:

    auto *X = new MCSubtargetInfo();
    InitXYZMCSubtargetInfo(X, ...);
    return X;

they should call:

    return createXYZMCSubtargetInfoImpl(...);

There's no real functionality change here.

llvm-svn: 241957
2015-07-10 22:43:42 +00:00
Evgeniy Stepanov 00b3020453 Fix AArch64 prologue for empty frame with dynamic allocas.
Fixes PR23804: assertion failure in emitPrologue in the case of a
function with an empty frame and a dynamic alloca that needs stack
realignment. This is a typical case for AddressSanitizer.

llvm-svn: 241943
2015-07-10 21:24:07 +00:00
JF Bastien b73a2ed20e Target RegisterInfo: devirtualize TargetFrameLowering
Summary:
The target frame lowering's concrete type is always known in RegisterInfo, yet it's only sometimes devirtualized through a static_cast. This change adds an auto-generated static function <Target>GenRegisterInfo::getFrameLowering(const MachineFunction &MF) which does this devirtualization, and uses this function in all targets which can.

This change was suggested by sunfish in D11070 for WebAssembly, I figure that I may as well improve the other targets while I'm here.

Subscribers: sunfish, ted, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11093

llvm-svn: 241921
2015-07-10 18:13:17 +00:00
Pat Gavlin a717f255b6 Allow {e,r}bp as the target of {read,write}_register.
This patch allows the read_register and write_register intrinsics to
read/write the RBP/EBP registers on X86 iff the targeted register is
the frame pointer for the containing function.

Differential Revision: http://reviews.llvm.org/D10977

llvm-svn: 241827
2015-07-09 17:40:29 +00:00
Mehdi Amini eaabc51e78 Re-instate the EVT parameter to getScalarShiftAmountTy() for OOT user
A documentation for this function would be nice by the way.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241807
2015-07-09 15:12:23 +00:00
Arnaud A. de Grandmaison f40f99e3a4 [AArch64] Select SBFIZ or UBFIZ instead of left + right shifts
And rename LSB to Immr / MSB to Imms to match the ARM ARM terminology.

llvm-svn: 241803
2015-07-09 14:33:38 +00:00
Renato Golin 17d4efe7c1 Add support for nest attribute to AArch64 backend
The nest attribute is currently supported on the x86 (32-bit) and x86-64
backends, but not on ARM (32-bit) or AArch64. This patch adds support for
nest to the AArch64 backend.

Register x18 is used by GCC for this purpose and hence is used here.
As discussed on the GCC mailing list the register choice is an ABI issue
and so choosing the same register as GCC means __builtin_call_with_static_chain
is compatible.

Patch by Stephen Cross.

llvm-svn: 241794
2015-07-09 10:18:02 +00:00