Commit Graph

1476 Commits

Author SHA1 Message Date
Tim Northover 46e58354da ARM: lower "fence singlethread" to a pure compiler barrier.
Single-threaded fences aren't required to provide any synchronization with
other processing elements so there's no need for a DMB. They should still be a
barrier for compiler optimizations though.

llvm-svn: 300904
2017-04-20 21:56:52 +00:00
Tim Northover 8b1240b0f0 ARM: handle post-indexed NEON ops where the offset isn't the access width.
Before, we assumed that any ConstantInt offset was precisely the access width,
so we could use the "[rN]!" form. ISelLowering only ever created that kind, but
further simplification during combining could lead to unexpected constants and
incorrect codegen.

Should fix PR32658.

llvm-svn: 300878
2017-04-20 19:54:02 +00:00
Diana Picus 7c6dee9f16 [ARM] Rename HW div feature to HW div Thumb. NFCI.
The hardware div feature refers only to Thumb, but because of its name
it is tempting to use it to check for hardware division in general,
which may cause problems in ARM mode. See https://reviews.llvm.org/D32005.

This patch adds "Thumb" to its name, to make its scope clear. One
notable place where I haven't made the change is in the feature flag
(used with -mattr), which is still hwdiv. Changing it would also require
changes in a lot of tests, including clang tests, and it doesn't seem
like it's worth the effort.

Differential Revision: https://reviews.llvm.org/D32160

llvm-svn: 300827
2017-04-20 09:38:25 +00:00
Eli Friedman 70ad2751d5 [ARM] Remove redundant computeKnownBits helper.
Move the BFI logic to computeKnownBitsForTargetNode, and delete
the redundant CMOV logic.

This is intended as a cleanup, but it's probably possible to construct
a case where moving the BFI logic allows more combines.

Differential Revision: https://reviews.llvm.org/D31795

llvm-svn: 300752
2017-04-19 20:50:57 +00:00
Eli Friedman f281d490cc [ARM] Use TableGen patterns to select vtbl. NFC.
Differential Revision: https://reviews.llvm.org/D32103

llvm-svn: 300749
2017-04-19 20:39:39 +00:00
Matt Arsenault 3138075dd4 DAG: Make mayBeEmittedAsTailCall parameter const
llvm-svn: 300603
2017-04-18 21:16:46 +00:00
Diana Picus e2626bb7c2 [ARM] Check for correct HW div when lowering divmod
For subtargets that use the custom lowering for divmod, e.g. gnueabi,
we used to check if the subtarget has hardware divide and then lower to
a div-mul-sub sequence if true, or to a libcall if false.

However, judging by the usage of hasDivide vs hasDivideInARMMode, it
seems that hasDivide only refers to Thumb. For instance, in the
ARMTargetLowering constructor, the code that specifies whether to use
libcalls for (S|U)DIV looks like this:

bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide()
                                      : Subtarget->hasDivideInARMMode();

In the case of divmod for arm-gnueabi, using only hasDivide() to
determine what to do means that instead of lowering to __aeabi_idivmod
to get the remainder, we lower to div-mul-sub and then further lower the
div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but
not in Thumb, we generate a libcall instead of using it (this is not an
issue in practice since AFAICT none of the cores that we support have
hardware divide in ARM but not Thumb).

This patch fixes the code dealing with custom lowering to take into
account the mode (Thumb or ARM) when deciding whether or not hardware
division is available.

Differential Revision: https://reviews.llvm.org/D32005

llvm-svn: 300536
2017-04-18 08:32:27 +00:00
Matthew Simpson 1468d3e04e [ARM/AArch64] Ensure valid vector element types for interleaved accesses
This patch refactors and strengthens the type checks performed for interleaved
accesses. The primary functional change is to ensure that the interleaved
accesses have valid element types. The added test cases previously failed
because the element type is f128.

Differential Revision: https://reviews.llvm.org/D31817

llvm-svn: 299864
2017-04-10 18:34:37 +00:00
Huihui Zhang 98240e9643 [SelectionDAG] [ARM CodeGen] Fix chain information of LowerMUL
In LowerMUL, the chain information is not preserved for the new
created Load SDNode.

For example, if a Store alias with one of the operand of Mul.
The Load for that operand need to be scheduled before the Store.
The dependence is recorded in the chain of Store, in TokenFactor.
However, when lowering MUL, the SDNodes for the new Loads for
VMULL are not updated in the TokenFactor for the Store. Thus the
chain is not preserved for the lowered VMULL.

llvm-svn: 299701
2017-04-06 20:22:51 +00:00
Simon Pilgrim 37b536e4b3 [DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode
Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes.

Differential Revision: https://reviews.llvm.org/D31249

llvm-svn: 299201
2017-03-31 11:24:16 +00:00
Eli Friedman 95ddd18703 [ARM] Fix mixup between Lo and Hi in SMLALBB formation.
llvm-svn: 298752
2017-03-25 00:13:24 +00:00
Pirama Arumuga Nainar bc26482717 [ARM] Fix computeKnownBits for ARMISD::CMOV
Summary:
The true and false operands for the CMOV are operands 0 and 1.
ARMISelLowering.cpp::computeKnownBits was looking at operands 1 and 2
instead.  This can cause CMOV instructions to be incorrectly folded into
BFI if value set by the CMOV is another CMOV, whose known bits are
computed incorrectly.

This patch fixes the issue and adds a test case.

Reviewers: kristof.beyls, jmolloy

Subscribers: llvm-commits, aemerson, srhines, rengolin

Differential Revision: https://reviews.llvm.org/D31265

llvm-svn: 298624
2017-03-23 16:47:47 +00:00
Artyom Skrobov 92c0653095 Reapply r298417 "[ARM] Recommit the glueless lowering of addc/adde in Thumb1"
The UB in t2_so_imm_neg conversion has been addressed under D31242 / r298512

This reverts commit r298482.

llvm-svn: 298562
2017-03-22 23:35:51 +00:00
Vitaly Buka e69c137f90 Revert "[ARM] Recommit the glueless lowering of addc/adde in Thumb1, including the amended (no UB anymore) fix for adding/subtracting -2147483648."
Fails check-llvm with ubsan

This reverts commit r298417.

llvm-svn: 298482
2017-03-22 05:07:44 +00:00
Artyom Skrobov 40a4f40679 [ARM] Recommit the glueless lowering of addc/adde in Thumb1,
including the amended (no UB anymore) fix for adding/subtracting -2147483648.

This reverts r298328 "[ARM] Revert r297443 and r297820."
and partially reverts r297842 "Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648""

llvm-svn: 298417
2017-03-21 18:39:41 +00:00
Eli Friedman 76732acc23 [ARM] Revert r297443 and r297820.
The glueless lowering of addc/adde in Thumb1 has known serious
miscompiles (see https://reviews.llvm.org/D31081), and r297820
causes an infinite loop for certain constructs.  It's not
clear when they will be fixed, so let's just take them out
of the tree for now.

(I resolved a small conflict with r297453.)

llvm-svn: 298328
2017-03-21 00:26:39 +00:00
Vadzim Dambrouski ba789cbd3d [ARM] Fix PR32130: Handle promotion of zero sized constants.
The special case of zero sized values was previously not handled correctly.
This patch handles this by not promoting if the size is zero.

Patch by Tim Neumann.

Differential Revision: https://reviews.llvm.org/D31116

llvm-svn: 298320
2017-03-20 22:59:57 +00:00
Nirav Dave ac6081cb67 Make library calls sensitive to regparm module flag (Fixes PR3997).
Reviewers: mkuper, rnk

Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D27050

llvm-svn: 298179
2017-03-18 00:44:07 +00:00
Nirav Dave 6de2c77944 Capitalize ArgListEntry fields. NFC.
llvm-svn: 298178
2017-03-18 00:43:57 +00:00
Artyom Skrobov e72e1ba434 Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648"
This reverts r297820 which apparently fails on A15 hosts.

llvm-svn: 297842
2017-03-15 14:50:43 +00:00
Artyom Skrobov 3fa5fd1dd2 [Thumb1] Fix the bug when adding/subtracting -2147483648
Differential Revision: https://reviews.llvm.org/D30829

llvm-svn: 297820
2017-03-15 10:19:16 +00:00
Sam Parker 654cb8263a [ARM] Enable SMLAL[B|T] isel
Enable the selection of the 64-bit signed multiply accumulate
instructions which operate on 16-bit operands. These are enabled for
ARMv5TE onwards for ARM and for V6T2 and other DSP enabled Thumb
architectures.

Differential Revision: https://reviews.llvm.org/D30044

llvm-svn: 297809
2017-03-15 08:27:11 +00:00
Sam Parker 916b1ba617 [ARM] Move SMULW[B|T] isel to DAG Combine
Create nodes for smulwb and smulwt and move their selection from
DAGToDAG to DAG combine. smlawb and smlawt can then be selected
using tablegen. Added some helper functions to detect shift patterns
as well as a wrapper around SimplifyDemandBits. Added a couple of
extra tests.

Differential Revision: https://reviews.llvm.org/D30708

llvm-svn: 297716
2017-03-14 09:13:22 +00:00
Artyom Skrobov bf19d4bc29 [Thumb1] combine ADDC/SUBC with a negative immediate
Summary: This simple optimization has been split out of https://reviews.llvm.org/D30400

Reviewers: efriedma, jmolloy

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30829

llvm-svn: 297682
2017-03-13 22:36:14 +00:00
Artyom Skrobov 0cc80c1f5a Refactor the multiply-accumulate combines to act on
ARMISD::ADD[CE] nodes, instead of the generic ISD::ADD[CE].

Summary:
This allows for some simplification because the combines
are no longer limited to just one go at the node before
it gets legalized into an ARM target-specific one.

Reviewers: jmolloy, rogfer01

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30401

llvm-svn: 297453
2017-03-10 12:41:33 +00:00
Artyom Skrobov 0c93ceb5d8 For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,
same as already done for ARM and Thumb2.

Reviewers: jmolloy, rogfer01, efriedma

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D30400

llvm-svn: 297443
2017-03-10 07:40:27 +00:00
Ranjeet Singh 3d0af578cc [ARM] Reapply r296865 "[ARM] fpscr read/write intrinsics not aware of each other""
The original patch r296865 was reverted as it broke the chromium builds for
Android https://bugs.llvm.org/show_bug.cgi?id=32134, this patch reapplies
r296865 with a fix to make sure it doesn't cause the build regression.

The problem was that intrinsic selection on int_arm_get_fpscr was failing in
ISel this was because the code to manually select this intrinsic still thought
it was the version with no side-effects (INTRINSIC_WO_CHAIN) which is wrong as
it doesn't semantically match the definition in the tablegen code which says it
does have side-effects, I've fixed this by updating the intrinsic type to
INTRINSIC_W_CHAIN (has side-effects). I've also added a test for this based on
Hans original reproducer.

Differential Revision: https://reviews.llvm.org/D30645

llvm-svn: 297137
2017-03-07 11:17:53 +00:00
Matthew Simpson 1bfa159db9 [ARM/AArch64] Support wide interleaved accesses
This patch teaches (ARM|AArch64)ISelLowering.cpp to match illegal vector types
to interleaved access intrinsics as long as the types are multiples of the
vector register width. A "wide" access will now be mapped to multiple
interleave intrinsics similar to the way in which non-interleaved accesses with
illegal types are legalized into multiple accesses. I'll update the associated
TTI costs (in getInterleavedMemoryOpCost) as a follow-on.

Differential Revision: https://reviews.llvm.org/D29466

llvm-svn: 296750
2017-03-02 15:11:20 +00:00
Sanjay Patel ae7873fe55 [ARM] don't transform an add(ext Cond), C to select unless there's a setcc of the condition
The transform in question claims to be doing:

// fold (add (select cc, 0, c), x) -> (select cc, x, (add, x, c))

...starting in PerformADDCombineWithOperands(), but it wasn't actually checking for a setcc node
for the sext/zext patterns.

This is exactly the opposite of a transform I'd like to add to DAGCombiner's foldSelectOfConstants(),
so I was seeing infinite loops with my draft of a patch applied.

The changes in select_const.ll look positive (less instructions). The change in arm-and-tst-peephole.ll
is unrelated. We're changing the input IR in that test to preserve the intent of the test, but that's 
not affected by this code change.

Differential Revision:
https://reviews.llvm.org/D30355

llvm-svn: 296389
2017-02-27 21:30:54 +00:00
Evgeniy Stepanov 1fd19c6e5d Fix PR31896.
Address of an alias of a global with offset is incorrectly lowered as an address of the global (i.e. ignoring offset).

llvm-svn: 295762
2017-02-21 20:17:34 +00:00
Sam Parker 58af0c55d2 [ARM] Replace HasT2ExtractPack with HasDSP
Removed the HasT2ExtractPack feature and replaced its references
with HasDSP. This then allows the Thumb2 extend instructions to be
selected for ARMv8M +dsp. These instruction descriptions have also
been refactored and more target tests have been added for their isel.

Differential Revision: https://reviews.llvm.org/D29623

llvm-svn: 295452
2017-02-17 15:42:44 +00:00
James Molloy 0ae2202235 [ARM] Fix crash caused by r294945
I'd missed a creator of FCMP nodes - duplicateCmp().

Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite.

llvm-svn: 294968
2017-02-13 17:18:00 +00:00
James Molloy d508789668 [ARM] Use VCMP, not VCMPE, for floating point equality comparisons
When generating a floating point comparison we currently unconditionally
generate VCMPE. This has the sideeffect of setting the cumulative Invalid
bit in FPSCR if any of the operands are QNaN.

It is expected that use of a relational predicate on a QNaN value should
raise Invalid. Quoting from the C standard:

  The relational and equality operators support the usual mathematical
  relationships between numeric values. For any ordered pair of numeric
  values exactly one of relationships the less, greater, equal and is true.
  Relational operators may raise the floating-point exception when argument
  values are NaNs.

The standard doesn't explicitly state the expectation for equality operators,
but the implication and obvious expectation is that equality operators
should not raise Invalid on a QNaN input, as those predicates are wholly
defined on unordered inputs (to return not equal).

Therefore, add a new operand to ARMISD::FPCMP and FPCMPZ indicating if
QNaN should raise Invalid, and pipe that through to TableGen.

llvm-svn: 294945
2017-02-13 12:32:47 +00:00
Ahmed Bougacha fc979dc9dd [ARM] Don't lower f16 interleaved accesses.
There are no vldN/vstN f16 variants, even with +fullfp16.
We could use the i16 variants, but, in practice, even with +fullfp16,
the f16 sequence leading to the i16 shuffle usually gets scalarized.
We'd need to improve our support for f16 codegen before getting there.

Reject f16 interleaved accesses.  If we try to emit the f16 intrinsics,
we'll just end up with a selection failure.

llvm-svn: 294818
2017-02-11 01:53:00 +00:00
Arnold Schwaighofer db7bbcbe78 [ARM/AArch ISel] SwiftCC: First parameters that are marked swiftself are not 'this returns'
We mark X0 as preserved by a call that passes the returned parameter.

 x0 = ...
 fun(x0) // no implicit def of x0

This no longer is valid if we pass the parameter in a different register then
the returned value as is the case with a swiftself parameter (passed in x20).

x20 = ...
fun(x20) // there should be an implict def of x8

rdar://30425845

llvm-svn: 294527
2017-02-08 22:30:47 +00:00
Christof Douma d3ed8380e0 [ARM] Make RWPI use movw/movt when available
When constructing global address literals while targeting the RWPI
relocation model. LLVM currently only uses literal pools. If MOVW/MOVT
instructions are available we can use these instead. Beside being more
efficient it allows -arm-execute-only to work with
-relocation-model=RWPI as well.

When we generate MOVW/MOVT for global addresses when targeting the RWPI
relocation model, we need to use base relative relocations. This patch
does the needed plumbing in MC to generate these for MOVW/MOVT.

Differential Revision: https://reviews.llvm.org/D29487

Change-Id: I446786e43a6f5aa9b6a5bb2cd216d60d41c7755d
llvm-svn: 294298
2017-02-07 13:07:12 +00:00
Reid Kleckner c35139ec0d [CodeGen] Remove dead call-or-prologue enum from CCState
This enum has been dead since Olivier Stannard re-implemented ARM byval
handling in r202985 (2014).

llvm-svn: 293943
2017-02-02 21:58:22 +00:00
Matthew Simpson ba5cf9dfee [LV] Move interleaved access helper functions to VectorUtils (NFC)
This patch moves some helper functions related to interleaved access
vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like
to use these functions in a follow-on patch that improves interleaved load and
store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was
already duplicated there and has been removed.

Differential Revision: https://reviews.llvm.org/D29398

llvm-svn: 293788
2017-02-01 17:45:46 +00:00
Matthias Braun 8c209aa877 Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html

For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif

llvm-svn: 293359
2017-01-28 02:02:38 +00:00
Eugene Zelenko e79c077ef9 [ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 293348
2017-01-27 23:58:02 +00:00
Saleem Abdulrasool 26c00e3700 ARM: fix vectorized division on WoA
The Windows on ARM target uses custom division for normal division as
the backend needs to insert division-by-zero checks.  However, it is
designed to only handle non-vectorized division.  ARM has custom
lowering for vectorized division as that can avoid loading registers
with the values and invoke a division routine for each one, preferring
to lower using NEON instructions.  Fall back to the custom lowering for
the NEON instructions if we encounter a vectorized division.

Resolves PR31778!

llvm-svn: 293259
2017-01-27 03:41:53 +00:00
Diana Picus bd66b7dc87 [ARM] Use helpers for adding pred / CC operands. NFC
Hunt down some of the places where we use bare addReg(0) or addImm(AL).addReg(0)
and replace with add(condCodeOp()) and add(predOps()). This should make it
easier to understand what those operands represent (without having to look at
the definition of the instruction that we're adding to).

Differential Revision: https://reviews.llvm.org/D27984

llvm-svn: 292587
2017-01-20 08:15:24 +00:00
Saleem Abdulrasool 6ef45916c6 ARM: match GCC's behaviour for builtins
GCC changes the CC between the user-code and the builtins based on the
value of `-target` rather than `-mfloat-abi`.  When a HF target is used,
the VFP variant of the AAPCS CC is used.  Otherwise, the AAPCS variant
is used.  In all cases, the AEABI functions use the AAPCS CC.  Adjust
the calling convention based on the target.

Resolves PR30543!

llvm-svn: 291909
2017-01-13 16:25:33 +00:00
Benjamin Kramer 061f4a5fe6 Apply clang-tidy's performance-unnecessary-value-param to LLVM.
With some minor manual fixes for using function_ref instead of
std::function. No functional change intended.

llvm-svn: 291904
2017-01-13 14:39:03 +00:00
Diana Picus a2c59149e1 [ARM] CodeGen: Replace AddDefaultT1CC and AddNoT1CC. NFC
For AddDefaultT1CC, we add a new helper t1CondCodeOp, which creates the
appropriate register operand. For AddNoT1CC, we use the existing condCodeOp
helper - we only had two uses of AddNoT1CC, so at this point it's probably not
worth having yet another helper just for them.

Differential Revision: https://reviews.llvm.org/D28603

llvm-svn: 291894
2017-01-13 10:37:37 +00:00
Diana Picus 8a73f5562f [ARM] CodeGen: Remove AddDefaultCC. NFC.
Replace all uses of AddDefaultCC with add(condCodeOp()).
The transformation has been done automatically with a custom tool based on Clang
AST Matchers + RefactoringTool.

Differential Revision: https://reviews.llvm.org/D28557

llvm-svn: 291893
2017-01-13 10:18:01 +00:00
Diana Picus 116bbab4e4 [CodeGen] Rename MachineInstrBuilder::addOperand. NFC
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.

See https://reviews.llvm.org/D28057 for the whole discussion.

Differential Revision: https://reviews.llvm.org/D28556

llvm-svn: 291891
2017-01-13 09:58:52 +00:00
Diana Picus 4f8c3e1882 [ARM] CodeGen: Remove AddDefaultPred. NFC.
Replace all uses of AddDefaultPred with MachineInstrBuilder::add(predOps()).
This makes the code building MachineInstrs more readable, because it allows us
to write code like:

MIB.addSomeOperand(blah)
   .add(predOps())
   .addAnotherOperand(blahblah)

instead of

AddDefaultPred(MIB.addSomeOperand(blah))
    .addAnotherOperand(blahblah)

This commit also adds the predOps helper in the ARM backend, as well as the add
method taking a variable number of operands to the MachineInstrBuilder.

The transformation has been done mostly automatically with a custom tool based
on Clang AST Matchers + RefactoringTool.

Differential Revision: https://reviews.llvm.org/D28555

llvm-svn: 291890
2017-01-13 09:37:56 +00:00
Saleem Abdulrasool 555e5980a5 ARM: slightly more table driven libcall setup
Switch some additional library call setup to be table driven.  This
makes it more immediately obvious what the library call looks like.
This is important for ARM since the calling conventions for the builtins
change based on the target/libcall name.  NFC

llvm-svn: 291789
2017-01-12 18:46:11 +00:00
Eli Friedman 3a03742c37 [ARM] More aggressive matching for vpadd and vpaddl.
The new matchers work after legalization to make them simpler, and to avoid
blocking other optimizations.

Differential Revision: https://reviews.llvm.org/D27779

llvm-svn: 291693
2017-01-11 19:33:38 +00:00