llvm-project

Commit Graph

Author	SHA1	Message	Date
Chad Rosier	f3491496dc	[AArch64] Expand vector SDIVREM/UDIVREM operations. http://reviews.llvm.org/D15214 Patch by Ana Pazos <apazos@codeaurora.org>! llvm-svn: 254773	2015-12-04 21:38:44 +00:00
Tim Northover	f3be9d5c0b	AArch64: fix 128-bit shifts We mustn't introduce a shift of exactly 64-bits for any inputs, since that's an UNDEF value (and worse, it's not what you want with the natural Arch64 implementation). The generated code is pretty horrific, but I couldn't come up with an obviously better alternative (if the amount is constant EXTR could help). Turns out 128-bit shifts are just nasty. rdar://22491037 llvm-svn: 254475	2015-12-02 00:33:54 +00:00
Artyom Skrobov	314ee04268	Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC) Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 llvm-svn: 254085	2015-11-25 19:41:11 +00:00
Ahmed Bougacha	88ddeae8bd	[AArch64] Promote f16 SELECT_CC CC operands when op is legal. SELECT_CC has the nasty property of having operands with unrelated types. So if you do something like: f32 = select_cc f16, f16, f32, f32, cc You'd only look for the action for <select_cc, f32>, but never f16. If the types are all legal, but the op isn't (as for f16 on AArch64, or for f128 on x86_64/AArch64?), then you get into trouble. For f128, we have softenSetCCOperands to handle this case. Similarly, for f16, we can directly promote the CC operands. llvm-svn: 253344	2015-11-17 16:45:40 +00:00
Tim Northover	339c83e27f	AArch64: add experimental support for address tagging. AArch64 has the ability to use the top 8-bits of an "address" for extra information, with the memory subsystem automatically masking them off for loads and stores. When that's happening, we can sometimes skip masks on memory operations in the compiler. However, this requires the host OS and support stack to preserve those bits so it can't be enabled everywhere. In principle iOS 8.0 and above do take the required precautions and but we'll put it under a flag for now. llvm-svn: 252573	2015-11-10 00:44:23 +00:00
Charlie Turner	7b7b06f737	[AArch64] Handle extract_subvector(..., 0) in ISel. Summary: Lowering this pattern early to an `EXTRACT_SUBREG` was making it impossible to match larger patterns in tblgen that use `extract_subvector(..., 0)` as part of the their input pattern. It seems like there will exist somewhere a better way of specifying this pattern over all relevant register value types, but I didn't manage to find it. Reviewers: t.p.northover, jmolloy Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14207 llvm-svn: 252464	2015-11-09 12:45:11 +00:00
Joseph Tremoulet	f748c8937e	[WinEH] Update exception pointer registers Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383	2015-11-07 01:11:31 +00:00
Charlie Turner	458e79b814	[ARM] Expand ROTL and ROTR of vector value types Summary: After D13851 landed, we saw backend crashes when compiling the reduced test case included in this patch. The right fix seems to be to allow these vector types for expansion in instruction selection. Reviewers: rengolin, t.p.northover Subscribers: RKSimon, t.p.northover, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14082 llvm-svn: 251401	2015-10-27 10:25:20 +00:00
Evgeniy Stepanov	d1aad26589	[safestack] Fast access to the unsafe stack pointer on AArch64/Android. Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers builting_thread_pointer as aeabi_read_tp, which is not available on Android. The previous iteration of this change was reverted in r250461. This version leaves the generic, compiler-rt based implementation in SafeStack.cpp instead of moving it to TargetLoweringBase in order to allow testing without a TargetMachine. llvm-svn: 251324	2015-10-26 18:28:25 +00:00
Craig Topper	8fe40e0ed5	Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC llvm-svn: 251032	2015-10-22 17:05:00 +00:00
Charlie Turner	434d4599d4	[AArch64] Implement vector splitting on UADDV. Summary: Fixes PR25056. Reviewers: mcrosier, junbuml, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13466 llvm-svn: 250520	2015-10-16 15:38:25 +00:00
Evgeniy Stepanov	9addbc9fc1	Revert "[safestack] Fast access to the unsafe stack pointer on AArch64/Android." Breaks the hexagon buildbot. llvm-svn: 250461	2015-10-15 21:26:49 +00:00
Evgeniy Stepanov	142947e9f0	[safestack] Fast access to the unsafe stack pointer on AArch64/Android. Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers builting_thread_pointer as aeabi_read_tp, which is not available on Android. llvm-svn: 250456	2015-10-15 20:50:16 +00:00
Duncan P. N. Exon Smith	d3b9df02b3	AArch64: Remove implicit ilist iterator conversions, NFC llvm-svn: 250216	2015-10-13 20:02:15 +00:00
Jun Bum Lim	0aace13d18	Improve ISel across lane float min/max reduction In vectorized float min/max reduction code, the final "reduce" step is sub-optimal. In AArch64, this change wll combine : svn0 = vector_shuffle t0, undef<2,3,u,u> fmin = fminnum t0,svn0 svn1 = vector_shuffle fmin, undef<1,u,u,u> cc = setcc fmin, svn1, ole n0 = extract_vector_elt cc, #0 n1 = extract_vector_elt fmin, #0 n2 = extract_vector_elt fmin, #1 result = select n0, n1,n2 into : result = llvm.aarch64.neon.fminnmv t0 This change extends r247575. llvm-svn: 249834	2015-10-09 14:11:25 +00:00
Chad Rosier	7c6ac2b8f9	[AArch64] Fold a floating-point divide by power of two into fp conversion. Part of http://reviews.llvm.org/D13442 llvm-svn: 249579	2015-10-07 17:51:37 +00:00
Chad Rosier	fa30c9b436	[AArch64] Fold a floating-point multiply by power of two into fp conversion. Part of http://reviews.llvm.org/D13442 llvm-svn: 249576	2015-10-07 17:39:18 +00:00
Jeroen Ketema	aebca09543	[ARM][AArch64] Only lower to interleaved load/store if the target has NEON Without an additional check for NEON, the compiler crashes during legalization of NEON ldN/stN. Differential Revision: http://reviews.llvm.org/D13508 llvm-svn: 249550	2015-10-07 14:53:29 +00:00
Chad Rosier	1f385618c0	[ARM] Typo. NFC. llvm-svn: 249153	2015-10-02 16:42:59 +00:00
Sanjay Patel	bbbf9a1a34	merge vector stores into wider vector stores and fix AArch64 misaligned access TLI hook (PR21711) This is a redo of D7208 ( r227242 - http://llvm.org/viewvc/llvm-project?view=revision&revision=227242 ). The patch was reverted because an AArch64 target could infinite loop after the change in DAGCombiner to merge vector stores. That happened because AArch64's allowsMisalignedMemoryAccesses() wasn't telling the truth. It reported all unaligned memory accesses as fast, but then split some 128-bit unaligned accesses up in performSTORECombine() because they are slow. This patch attempts to fix the problem in AArch's allowsMisalignedMemoryAccesses() while preserving existing (perhaps questionable) lowering behavior. The x86 test shows that store merging is working as intended for a target with fast 32-byte unaligned stores. Differential Revision: http://reviews.llvm.org/D12635 llvm-svn: 248622	2015-09-25 21:49:48 +00:00
Ahmed Bougacha	07a844d758	[AArch64] Emit clrex in the expanded cmpxchg fail block. In the comparison failure block of a cmpxchg expansion, the initial ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64, this unnecessarily ties up the execution monitor, which might have a negative performance impact on some uarchs. Instead, release the monitor in the failure block. The clrex instruction was designed for this: use it. Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and Shareable memory locations". Differential Revision: http://reviews.llvm.org/D13033 llvm-svn: 248291	2015-09-22 17:21:44 +00:00
Jun Bum Lim	34b9bd0435	Improve ISel using across lane min/max reduction In vectorized integer min/max reduction code, the final "reduce" step is sub-optimal. In AArch64, this change wll combine : %svn0 = vector_shuffle %0, undef<2,3,u,u> %smax0 = smax %0, svn0 %svn3 = vector_shuffle %smax0, undef<1,u,u,u> %sc = setcc %smax0, %svn3, gt %n0 = extract_vector_elt %sc, #0 %n1 = extract_vector_elt %smax0, #0 %n2 = extract_vector_elt $smax0, #1 %result = select %n0, %n1, n2 becomes : %1 = smaxv %0 %result = extract_vector_elt %1, 0 This change extends r246790. llvm-svn: 247575	2015-09-14 16:19:52 +00:00
Ahmed Bougacha	5246867384	[CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit. We used to have this magic "hasLoadLinkedStoreConditional()" callback, which really meant two things: - expand cmpxchg (to ll/sc). - expand atomic loads using ll/sc (rather than cmpxchg). Remove it, and, instead, introduce explicit callbacks: - bool shouldExpandAtomicCmpXchgInIR(inst) - AtomicExpansionKind shouldExpandAtomicLoadInIR(inst) Differential Revision: http://reviews.llvm.org/D12557 llvm-svn: 247429	2015-09-11 17:08:28 +00:00
Ahmed Bougacha	9d677131c4	[CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind. This lets us generalize its usage to the other atomic instructions. llvm-svn: 247428	2015-09-11 17:08:17 +00:00
Jun Bum Lim	4d3c5986f2	Remove white space (test commit) llvm-svn: 247021	2015-09-08 16:11:22 +00:00
Chad Rosier	6c36eff1d6	[AArch64] Improve ISel using across lane addition reduction. In vectorized add reduction code, the final "reduce" step is sub-optimal. This change wll combine : ext v1.16b, v0.16b, v0.16b, #8 add v0.4s, v1.4s, v0.4s dup v1.4s, v0.s[1] add v0.4s, v1.4s, v0.4s into addv s0, v0.4s PR21371 http://reviews.llvm.org/D12325 Patch by Jun Bum Lim <junbuml@codeaurora.org>! llvm-svn: 246790	2015-09-03 18:13:57 +00:00
Ahmed Bougacha	b0ff6437cb	[AArch64] Lower READCYCLECOUNTER using MRS PMCCTNR_EL0. This matches the ARM behavior. In both cases, the register is part of the optional Performance Monitors extension, so, add the feature, and enable it for the A-class processors we support. Differential Revision: http://reviews.llvm.org/D12425 llvm-svn: 246555	2015-09-01 16:23:45 +00:00
Matthias Braun	0acbd08f3c	AArch64: Fix loads to lower NEON vector lanes using GPR registers The ISelLowering code turned insertion turned the element for the lowest lane of a BUILD_VECTOR into an INSERT_SUBREG, this prohibited the patterns for SCALAR_TO_VECTOR(Load) to match later. Restrict this to cases without a load argument. Reported in rdar://22223823 Differential Revision: http://reviews.llvm.org/D12467 llvm-svn: 246462	2015-08-31 18:25:15 +00:00
Silviu Baranga	db1ddb32ce	[AArch64] Unify the integer min/max vector selection patterns with the intrinsic ones Summary: This change lowers the aarch64 integer vector min/max intrinsic nodes to generic min/max nodes and replaces the intrinsic selection patterns with the generic ones. There should already be testing in place for this, so no further tests were added. Reviewers: jmolloy Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12276 llvm-svn: 246030	2015-08-26 11:11:14 +00:00
Matthias Braun	46e5639806	AArch64: Fix cmp;ccmp ordering When producing conditional compare sequences for or operations we need to negate the operands and the finally tested flags. The thing is if we negate the finally tested flags this equals a logical negation of all previously emitted expressions. There was a case missing where we have to order OR expressions so they get emitted first. This fixes http://llvm.org/PR24459 llvm-svn: 245641	2015-08-20 23:33:34 +00:00
Matthias Braun	266204b7dc	AArch64: Do not create CCMP on multiple users. Create CMP;CCMP sequences from and/or trees does not gain us anything if the and/or tree is materialized to a GP register anyway. While most of the code already checked for hasOneUse() there was one important case missing. llvm-svn: 245640	2015-08-20 23:33:31 +00:00
James Molloy	88edc8243d	Remove hand-rolled matching for fmin and fmax. SDAGBuilder now does this all for us. llvm-svn: 245198	2015-08-17 07:13:20 +00:00
James Molloy	63be198712	[AArch64] FMINNAN/FMAXNAN on f16 is not legal. Spotted by Ahmed - in r244594 I inadvertently marked f16 min/max as legal. I've reverted it here, and marked min/max on scalar f16's as promote. I've also added a testcase. The test just checks that the compiler doesn't fall over - it doesn't create fmin nodes for f16 yet. llvm-svn: 245035	2015-08-14 09:08:50 +00:00
Ahmed Bougacha	2a97b1bcf8	[AArch64] Also custom-lowering mismatched vector/f16 FCOPYSIGN. We can lower them using our cool tricks if we fpext/fptrunc the second input, like we do for f32/f64. Follow-up to r243924, r243926, and r244858. llvm-svn: 244860	2015-08-13 01:13:56 +00:00
Alex Lorenz	e40c8a2b26	PseudoSourceValue: Replace global manager with a manager in a machine function. This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka llvm-svn: 244693	2015-08-11 23:09:45 +00:00
James Molloy	b7b2a1e9b4	[AArch64] Match fminnum/fmaxnum for vector fminnm/fmaxnm instead of an intrinsic. Lower Intrinsic::aarch64_neon_fmin/fmax to fminnum/fmannum and match that instead. Minimal functional change: - Extra tests added because coverage of scalar fminnm/fmaxnm instructions was nonexistant. - f16 test updated because now we actually generate scalar fminnm/fmaxnm we no longer need to bail out to a libcall! llvm-svn: 244595	2015-08-11 12:06:37 +00:00
James Molloy	edf38f0cb0	[AArch64] Replace the custom AArch64ISD::FMIN/MAX nodes with ISD::FMINNAN/MAXNAN NFCI. This just removes custom ISDNodes that are no longer needed. llvm-svn: 244594	2015-08-11 12:06:33 +00:00
Benjamin Kramer	df005cbe19	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
Sanjay Patel	924879ad2c	wrap OptSize and MinSize attributes for easier and consistent access (NFCI) Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994	2015-08-04 15:49:57 +00:00
Ahmed Bougacha	e0e12db8c8	[AArch64] Add isel support for f16 indexed LD/ST. llvm-svn: 243935	2015-08-04 01:29:38 +00:00
Ahmed Bougacha	0cbe2efcd6	[AArch64] Remove unnecessary "break". NFC. llvm-svn: 243931	2015-08-04 00:49:08 +00:00
Ahmed Bougacha	239d635d3d	[AArch64] Use SDValue bool operator. NFC. llvm-svn: 243930	2015-08-04 00:48:02 +00:00
Ahmed Bougacha	b0ae36f0d1	[AArch64] Vector FCOPYSIGN supports Custom-lowering: mark it as such. There's a bunch of code in LowerFCOPYSIGN that does smart lowering, and is actually already vector-aware; let's use it instead of scalarizing! The only interesting change is that for v2f32, we previously always used use v4i32 as the integer vector type. Use v2i32 instead, and mark FCOPYSIGN as Custom. llvm-svn: 243926	2015-08-04 00:42:34 +00:00
Pete Cooper	7be8f8f018	Convert some AArch64 code to foreach loops. NFC. Also converted a cast<> to dyn_cast while i was working on the same line of code. llvm-svn: 243894	2015-08-03 19:04:32 +00:00
Akira Hatanaka	f53b0403f8	[AArch64] Define subtarget feature strict-align. This commit defines subtarget feature strict-align and uses it instead of cl::opt -aarch64-strict-align to decide whether strict alignment should be forced. rdar://problem/21529937 llvm-svn: 243516	2015-07-29 14:17:26 +00:00
Sanjay Patel	1dd15598cf	fix TLI's combineRepeatedFPDivisors interface to return the minimum user threshold This fix was suggested as part of D11345 and is part of fixing PR24141. With this change, we can avoid walking the uses of a divisor node if the target doesn't want the combineRepeatedFPDivisors transform in the first place. There is no NFC-intended other than that. Differential Revision: http://reviews.llvm.org/D11531 llvm-svn: 243498	2015-07-28 23:05:48 +00:00
Chih-Hung Hsieh	1e859582d6	Implement target independent TLS compatible with glibc's emutls.c. The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target//ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438	2015-07-28 16:24:05 +00:00
Adhemerval Zanella	7bc3319d84	Implement __builtin_thread_pointer This path add the aarch64 lowering of __builtin_thread_pointer. It uses the already implemented AArch64ISD::THREAD_POINTER used in TLS generation. llvm-svn: 243412	2015-07-28 13:03:31 +00:00
Luke Cheeseman	b5c627aba8	When lowering vector shifts a check is performed to see if the value to shift by is an immediate, in this check the value is negated and stored in and int64_t. The value can be -2^63 yet the result cannot be stored in an int64_t and this gives some undefined behaviour causing failures. The negation is only necessary when the values is within a certain range and so it should not need to negate -2^63, this patch introduces this and also a regression test. Differential Revision: http://reviews.llvm.org/D11408 llvm-svn: 243100	2015-07-24 09:31:48 +00:00
Geoff Berry	e41c2df0ef	Fix comment typo (test commit). NFC llvm-svn: 242719	2015-07-20 22:03:52 +00:00

1 2 3 4 5 ...

335 Commits