Commit Graph

90 Commits

Author SHA1 Message Date
Kevin Qin 50944eb638 fix some spell mistakes around 'ConcatVector' and 'ShuffleVector' in AArch64 backend.
llvm-svn: 199858
2014-01-23 01:35:13 +00:00
Kevin Qin ce0190c6d5 [AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR.
llvm-svn: 199791
2014-01-22 06:11:03 +00:00
Kevin Qin 6d379abd8f [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT.
It was commited as r199628 but reverted in r199628 as causing
regression test failed. It's because of old vervsion of patch
I used to commit. Sorry for mistake.

llvm-svn: 199704
2014-01-21 01:48:52 +00:00
Chandler Carruth f835fc6f4f Revert r199628: "[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT."
This test fails the newly added regression tests.

llvm-svn: 199631
2014-01-20 08:18:01 +00:00
Kevin Qin ff42e06ef4 [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT.
llvm-svn: 199628
2014-01-20 07:32:26 +00:00
Kevin Qin e0faea11b1 [AArch64 NEON] Expand vector for UDIV/SDIV/UREM/SREM/FREM as neon doesn't support these operations.
llvm-svn: 199485
2014-01-17 09:54:30 +00:00
Kevin Qin 212d9b4a56 [AArch64 NEON] Custom lower conversion between vector integer and vector floating point if element bit-width doesn't match.
llvm-svn: 199462
2014-01-17 05:52:35 +00:00
Hao Liu 18d92262c5 [AArch64]Fix the problem can't select concat_vectors of two v1i32 types.
Also fix the problem can't select scalar_to_vector from f32 to v2f32/v4f32.

llvm-svn: 199461
2014-01-17 05:44:46 +00:00
Jiangning Liu 0a791c348b For AArch64, lowering sext_inreg and generate optimized code by using SXTL.
llvm-svn: 199296
2014-01-15 05:08:01 +00:00
Tim Northover 6e219cd588 AArch64: don't try to handle [SU]MUL_LOHI nodes
We should set them to expand for now since there are no patterns
dealing with them. Actually, there are no instructions either so I
doubt they'll ever be acceptable.

llvm-svn: 199265
2014-01-14 22:53:22 +00:00
Lang Hames 06234ec147 Add FPExt option to CCValAssign::LocInfo. When generating calling-convention
promotion code, Tablegen will now select FPExt for floating point promotions
(previously it had returned AExt, which is not valid for floating point types).

Any out-of-tree targets that were relying on AExt being returned for FP
promotions will need to update their code check for FPExt instead.

llvm-svn: 199252
2014-01-14 19:56:36 +00:00
Andrea Di Biagio 9bc0415c1f [AArch64] Fix assertion failure caused by an invalid comparison between APInt values.
APInt only knows how to compare values with the same BitWidth and asserts
in all other cases.

With this fix, function PerformORCombine does not use the APInt equality
operator if the APInt values returned by 'isConstantSplat' differ in BitWidth.
In that case they are different and no comparison is needed.

llvm-svn: 199119
2014-01-13 16:51:00 +00:00
Kevin Qin 21e8f1c4eb [AArch64 NEON] Add more scenarios to use perm instructions when lowering shuffle_vector
This patch covered 2 more scenarios:

1.  Two operands of shuffle_vector are the same, like
%shuffle.i = shufflevector <8 x i8> %a, <8 x i8> %a, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14>

2. One of operands is undef, like
%shuffle.i = shufflevector <8 x i8> %a, <8 x i8> undef, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14>

After this patch, perm instructions will have chance to be emitted instead of lots of INS.

llvm-svn: 199069
2014-01-13 01:56:29 +00:00
Kristof Beyls 90ff80e329 Silence unused variable warning for non-asserting builds that was introduced in r198937.
llvm-svn: 198941
2014-01-10 14:20:45 +00:00
Kristof Beyls 58306ad903 Make sure -use-init-array has intended effect on all AArch64 ELF targets, not just linux.
llvm-svn: 198937
2014-01-10 13:41:49 +00:00
Kevin Qin 44946439e1 [AArch64 NEON] Fix generating incorrect value type of NEON_VDUPLANE
when lower build_vector if result value type mismatch with operand
value type.

llvm-svn: 198743
2014-01-08 08:06:14 +00:00
Bill Wendling 13199b17f8 Remove unnecessary #includes.
llvm-svn: 198585
2014-01-06 06:00:00 +00:00
Bill Wendling 908bf814e7 Refactor function that checks that __builtin_returnaddress's argument is constant.
This moves the check up into the parent class so that all targets can use it
without having to copy (and keep in sync) the same error message.

llvm-svn: 198579
2014-01-06 00:43:20 +00:00
Bill Wendling df7dd28dc8 Emit an error message if the value passed to __builtin_returnaddress isn't a constant
__builtin_returnaddress requires that the value passed into is be a constant.
However, at -O0 even a constant expression may not be converted to a constant.
Emit an error message intead of crashing.

llvm-svn: 198531
2014-01-05 01:47:20 +00:00
Rafael Espindola 6994fdf33c Remove the 's' DataLayout specification
During the years there have been some attempts at figuring out how to
align byval arguments. A look at the commit log suggests that they
were

* Use the ABI alignment.
* When that was not sufficient for x86-64, I added the 's' specification to
  DataLayout.
* When that was not sufficient Evan added the virtual getByValTypeAlignment.
* When even that was not sufficient, we just got the FE to add the alignment
  to the byval.

This patch is just a simple cleanup that removes my first attempt at fixing the
problem. I also added an AArch64 implementation of getByValTypeAlignment to
make sure this patch is a nop. I also left the 's' parsing for backward
compatibility.

I will send a short email to llvmdev about the change for anyone maintaining
an out of tree target.

llvm-svn: 198287
2014-01-01 22:29:43 +00:00
Hao Liu 74107fe526 [AArch64]Fix the problem that can't select mul of v1i64/v2i64 types.
E.g. Can't select such IR:
     %tmp = mul <2 x i64> %a, %b

llvm-svn: 198188
2013-12-30 01:38:41 +00:00
Kevin Qin 82bd84aadf [AArch64 NEON] Fix a bug when lowering BUILD_VECTOR.
DAG.getVectorShuffle() doesn't always return a vector_shuffle node.
If mask is the exact sequence of it's operand(For example, operand_0
is v8i8, and  the mask is 0, 1, 2, 3, 4, 5, 6, 7), it will directly
return that operand. So a check is added here.

llvm-svn: 197967
2013-12-24 08:16:06 +00:00
Kevin Qin cd5f3153f5 [AArch64 NEON] Fix a pattern match failure with NEON_VDUP.
This failure caused by improper condition when lowering shuffle_vector
to scalar_to_vector. After this patch NEON_VDUP with v1i64 will not
be generated.

llvm-svn: 197966
2013-12-24 08:11:47 +00:00
Kevin Qin 53eaea0104 [AArch64 NEON]Implment loading vector constant form constant pool.
llvm-svn: 197551
2013-12-18 06:26:04 +00:00
Hao Liu 774cabb538 [AArch64]Fix the pattern match failure for v1i8/v1i16/v1i32 types.
Currently we have such types as legal vector types. The DAG combiner may generate some DAG nodes having such types but we don't have patterns to match them.
E.g. a load i32 and a bitcast i32 to v1i32 will be combined into a load v1i32:
     bitcast (load i32) to v1i32 -> load v1i32.
So this patch fixes such problems for load/dup instructions.
If v1i8/v1i16/v1i32 are not legal any more, the code in this patch can be deleted. So I also add some FIXME.

llvm-svn: 197361
2013-12-16 02:51:28 +00:00
Chad Rosier 4055f42d22 [AArch64] Removed unnecessary copy patterns with v1fx types.
- Copy patterns with float/double types are enough.
- Fix typos in test case names that were using v1fx.
- There is no ACLE intrinsic that uses v1f32 type.  And there is no conflict of
  neon and non-neon ovelapped operations with this type, so there is no need to
  support operations with this type.
- Remove v1f32 from FPR32 register and disallow v1f32 as a legal type for
  operations.

Patch by Ana Pazos!

llvm-svn: 197159
2013-12-12 15:46:29 +00:00
Kevin Qin 310b6c08ba [AArch64 NEON] Get instruction BSL matched to VSELECT.
llvm-svn: 196998
2013-12-11 02:33:50 +00:00
Hao Liu 868caea6d1 [AArch64]Pattern match failures for truncate store and extend load
llvm-svn: 196748
2013-12-09 03:34:08 +00:00
Jiangning Liu 65d8e3422a For AArch64, add missing register cost calculation for big value types like v4i64 and v8i64.
llvm-svn: 196456
2013-12-05 02:12:01 +00:00
Hao Liu dca64f4a20 [AArch64]Add missing floating point convert, round and misc intrinsics.
E.g. int64x1_t vcvt_s64_f64(float64x1_t a) -> FCVTZS Dd, Dn

llvm-svn: 196210
2013-12-03 06:06:55 +00:00
Jiangning Liu 94a7bb2130 Add some missing pattern matches for AArch64 Neon intrinsics like vmull_high_n_s16 and friends.
llvm-svn: 196190
2013-12-03 01:29:32 +00:00
Benjamin Kramer ea1982aff9 Silence sign-compare warning and reduce nesting.
No functionality change.

llvm-svn: 195932
2013-11-28 19:58:56 +00:00
Jiangning Liu 4bc9dbd846 Remove the variable only used by assert to avoid the build failure
caused by build options [-Werror,-Wunused-variable].

llvm-svn: 195905
2013-11-28 01:34:55 +00:00
Jiangning Liu 97aa8cf8b7 Fix the AArch64 NEON bug exposed by checking constant integer argument range of ACLE intrinsics.
llvm-svn: 195843
2013-11-27 14:02:25 +00:00
Kevin Qin 599c47d0de Refactored the implementation of AArch64 NEON instruction ZIP, UZP
and TRN.
Fix a bug when mixed use of vget_high_u8() and vuzp_u8().

llvm-svn: 195716
2013-11-26 03:26:47 +00:00
Hao Liu 16edc4675c Implement AArch64 neon instructions class SIMD lsone and SIMD lone-post.
llvm-svn: 195078
2013-11-19 02:17:05 +00:00
Hao Liu 5a4e4e107d Implement the newly added ACLE functions for ld1/st1 with 2/3/4 vectors.
The functions are like: vst1_s8_x2 ...

llvm-svn: 194990
2013-11-18 06:31:53 +00:00
Kevin Qin aec95baf1a Implement aarch64 neon instruction class SIMD misc.
llvm-svn: 194656
2013-11-14 02:44:13 +00:00
Jiangning Liu a50e22ca4f Implement AArch64 Neon instruction set Bitwise Extract.
llvm-svn: 194118
2013-11-06 02:25:49 +00:00
Hao Liu d6b40b51c7 Implement AArch64 post-index vector load/store multiple N-element structure class SIMD(lselem-post).
Including following 14 instructions:
4 ld1 insts: post-index load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: post-index load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: post-index store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: post-index store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 194043
2013-11-05 03:39:32 +00:00
Kevin Qin 97f6aaa8ad Implemented aarch64 neon intrinsic vcopy_lane with float type.
llvm-svn: 194041
2013-11-05 02:03:59 +00:00
Amara Emerson f80f95fcc7 [AArch64] Make the use of FP instructions optional, but enabled by default.
This adds a new subtarget feature called FPARMv8 (implied by NEON), and
predicates the support of the FP instructions and registers on this feature.

llvm-svn: 193739
2013-10-31 09:32:11 +00:00
Chad Rosier be020d0309 [AArch64] Add support for NEON scalar floating-point compare instructions.
llvm-svn: 193691
2013-10-30 15:19:37 +00:00
Weiming Zhao ffade617bd [AArch64] Implement FrameAddr and ReturnAddr
Fixes PR17690

llvm-svn: 193625
2013-10-29 17:00:25 +00:00
Amara Emerson c5cae0f20c [AArch64] Fix NZCV reg live-in bug in F128CSEL codegen.
When generating the IfTrue basic block during the F128CSEL pseudo-instruction
handling, the NZCV live-in for the newly created BB wasn't being added. This
caused a fault during MI-sched/live range calculation when the predecessor
for the fall-through BB didn't have a live-in for phys-reg as expected.

llvm-svn: 193316
2013-10-24 08:28:24 +00:00
Kevin Qin a89e7a0e1c Implement aarch64 neon instruction set AdvSIMD (copy).
llvm-svn: 192410
2013-10-11 02:33:55 +00:00
Hao Liu 99eac7ee44 Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 192361
2013-10-10 17:00:52 +00:00
Rafael Espindola 9558af461d Revert "Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem). Including following 14 instructions: 4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers. ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4). 4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers. st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4)."
This reverts commit r192352. It broke the build.

llvm-svn: 192354
2013-10-10 15:15:17 +00:00
Hao Liu 9123ad8ab9 Implement AArch64 vector load/store multiple N-element structure class SIMD(lselem).
Including following 14 instructions:
4 ld1 insts: load multiple 1-element structure to sequential 1/2/3/4 registers.
ld2/ld3/ld4: load multiple N-element structure to sequential N registers (N=2,3,4).
4 st1 insts: store multiple 1-element structure from sequential 1/2/3/4 registers.
st2/st3/st4: store multiple N-element structure from sequential N registers (N = 2,3,4).

llvm-svn: 192352
2013-10-10 15:01:24 +00:00
Chad Rosier b6ceeb9126 [AArch64] Add support for NEON scalar arithmetic instructions:
SQDMULH, SQRDMULH, FMULX, FRECPS, and FRSQRTS.

llvm-svn: 192107
2013-10-07 16:36:15 +00:00