llvm-project

Commit Graph

Author	SHA1	Message	Date
Manman Ren	1602605bf8	CXX_FAST_TLS calling convention: Add support for ARM on Darwin. rdar://9001553 llvm-svn: 257417	2016-01-11 23:50:43 +00:00
Weiming Zhao	4b3b13d3bc	RBIT Instruction only available for ARMv6t2 and above. Summary: r255334 matches bit-reverse pattern in InstCombine and generates calls to Instrinsic::bitreverse. RBIT instruction is only available for ARMv6t2 and above. This patch has the intrinsic expanded during legalization for ARMv4 and ARMv5. Patch by Z. Zheng <zhaoshiz@codeaurora.org> Reviewers: apazos, jmolloy, weimingz Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D15932 llvm-svn: 257188	2016-01-08 18:43:41 +00:00
Eric Christopher	b793230797	Add some testing for thumb1 and thumb2 inline asm immediate constraints and fix a couple of bugs on inspection. Also fixes PR26061. llvm-svn: 257122	2016-01-08 00:34:44 +00:00
Tim Northover	bd41cf880c	ARM: support TLS accesses on Darwin platforms Darwin TLS accesses most closely resemble ELF's general-dynamic situation, since they have to be able to handle all possible situations. The descriptors and so on are obviously slightly different though. llvm-svn: 257039	2016-01-07 09:03:03 +00:00
Chad Rosier	353d71914a	Remove extra whitespace. NFC. llvm-svn: 256173	2015-12-21 18:08:05 +00:00
Cong Hou	c106989fd5	Normalize MBB's successors' probabilities in several locations. This patch adds some missing calls to MBB::normalizeSuccProbs() in several locations where it should be called. Those places are found by checking if the sum of successors' probabilities is approximate one in MachineBlockPlacement pass with some instrumented code (not in this patch). Differential revision: http://reviews.llvm.org/D15259 llvm-svn: 255455	2015-12-13 09:26:17 +00:00
Hal Finkel	cd8664c3c2	Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics After much discussion, ending here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html it has been decided that, instead of having the vectorizer directly generate special absdiff and horizontal-add intrinsics, we'll recognize the relevant reduction patterns during CodeGen. Accordingly, these intrinsics are not needed (the operations they represent can be pattern matched, as is already done in some backends). Thus, we're backing these out in favor of the current development work. r248483 - Codegen: Fix llvm.*absdiff semantic. r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation llvm-svn: 255387	2015-12-11 23:11:52 +00:00
Ahmed Bougacha	97564c3a1b	[AArch64][ARM] Don't base interleaved op legality on type alloc size. Otherwise, we think that most types that look like they'd fit in a legal vector type are legal (so, basically, any vector type with a size between 33 and 128 bits, I think, since we use pow2 alignment; e.g., v2i25, v3f32, ...). DataLayout::getTypeAllocSize rounds up based on alignment. When checking for target intrinsic legality, that's not what we want: if rounding makes a difference, the type isn't legal, and the target intrinsics shouldn't be used, as they are always assumed legal. One could make the argument that alloc size is ultimately the most relevant here, since we're dealing with LD/ST intrinsics. That's only true if we did legalize them though; that's a problem for another day. Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits. Type::getSizeInBits can't be used because that'd gratuitously break pointer vector support. Some of these uses are currently fine, because we only hit them when the type is already known legal (e.g., r114454). Update them for consistency. It's faster to avoid the rounding anyway! llvm-svn: 255089	2015-12-09 01:19:50 +00:00
Quentin Colombet	901f036353	[ARM] When a bitcast is about to be turned into a VMOVDRR, try to combine it with its source instead of forcing the values on GPRs. This improves the lowering of vector code when such bitcasts happen in the middle of vector computations. rdar://problem/23691584 llvm-svn: 254684	2015-12-04 01:53:14 +00:00
Tim Northover	f520eff782	AArch64: use ldxp/stxp pair to implement 128-bit atomic loads. The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic if there has been a corresponding successful stxp. It's less clear for AArch32, so I'm leaving that alone for now. llvm-svn: 254524	2015-12-02 18:12:57 +00:00
Cong Hou	d97c100dc4	Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. (This is the second attempt to submit this patch. The first caused two assertion failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687) The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254377	2015-12-01 05:29:22 +00:00
Hans Wennborg	1dbaf67537	Revert r254348: "Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces." and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction." Asserts were firing in Chromium builds. See PR25687. llvm-svn: 254366	2015-12-01 03:49:42 +00:00
Cong Hou	fa1917c673	Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254348	2015-12-01 00:02:51 +00:00
Martell Malone	d12292480a	ARM: address WOA unsigned division overflow crash Building on r253865 the crash is not limited to signed overflows. Disable custom handling of unsigned 32-bit and 64-bit integer divide. Add test cases for both 32-bit and 64-bit unsigned integer overflow. llvm-svn: 254158	2015-11-26 15:34:03 +00:00
Artyom Skrobov	314ee04268	Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC) Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 llvm-svn: 254085	2015-11-25 19:41:11 +00:00
Martell Malone	a6b867eb0d	ARM: address WoA division overflow crash Disable custom handling of signed 32-bit and 64-bit integer divide. Add test cases for both 32-bit and 64-bit integer overflow crashes. llvm-svn: 253865	2015-11-23 13:11:39 +00:00
James Molloy	2018091e87	Properly check if a CMPZ node is in fact comparing against zero This was left implicit and never ever checked, which means we could have a CMPZ against some non-zero value and we were carrying on with BFI conversion regardless. Caught by Oliver Stannard using csmith; regression test added. llvm-svn: 253195	2015-11-16 10:49:25 +00:00
James Molloy	b564098c62	[ARM] Replace ARMISD::RBIT with ISD::BITREVERSE ISD::BITREVERSE matches "rbit" completely, so remove ARMISD::RBIT and mark ISD::BITREVERSE as legal, adding a test for lowering. llvm-svn: 253047	2015-11-13 16:05:22 +00:00
James Molloy	8e99e97f2a	[ARM] CMOV->BFI combining: handle both senses of CMPZ I completely misunderstood what ARMISD::CMPZ means. It's not "compare equal to zero", it's "compare, only setting the zero/Z flag". It can either be equal-to-zero or not-equal-to-zero, and we weren't checking what sense it was. If it's equal-to-zero, we can swap the operands around and pretend like it is not-equal-to-zero, which is both a bug fix and lets us handle more cases. llvm-svn: 252891	2015-11-12 13:49:17 +00:00
Diego Novillo	0767ae5896	Properly fix unused variable in disable-assert builds. I missed the side-effects of ParseBFI in my previous attempt (r252748). Thanks dblaikie for the suggestion of adding a void use of the unused variable instead. llvm-svn: 252751	2015-11-11 16:39:22 +00:00
Diego Novillo	29f88a2460	Remove unused variable in disable-assert builds. NFC. llvm-svn: 252748	2015-11-11 16:14:52 +00:00
James Molloy	ce12c92f66	[ARM] Combine BFIs together If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits. llvm-svn: 252740	2015-11-11 15:40:40 +00:00
Sanjay Patel	af1b48bfdc	[ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz() ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2 implementation. The net result of allowing this speculation for the regression tests in this patch is that we get this code: ctlz: clz r0, r0 bx lr cttz: rbit r0, r0 clz r0, r0 bx lr Instead of: ctlz: cmp r0, #0 moveq r0, #32 clzne r0, r0 bx lr cttz: cmp r0, #0 moveq r0, #32 rbitne r0, r0 clzne r0, r0 bx lr This will help solve a general speculation/despeculation problem noted in PR24818: https://llvm.org/bugs/show_bug.cgi?id=24818 Differential Revision: http://reviews.llvm.org/D14469 llvm-svn: 252639	2015-11-10 19:24:31 +00:00
James Molloy	9d55f19cfa	Reapply "[ARM] Combine CMOV into BFI where possible" Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh! Original commit message: If we have a CMOV, OR and AND combination such as: if (x & CN) y \|= CM; And: * CN is a single bit; * All bits covered by CM are known zero in y; Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction). llvm-svn: 252606	2015-11-10 14:22:05 +00:00
Renato Golin	6d435f12f0	[EABI] Add LLVM support for -meabi flag "GCC requires the freestanding environment provide memcpy, memmove, memset and memcmp": https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Standards.html Hence in GNUEABI targets LLVM should not convert 'memops' to their equivalent '__aeabi_memops'. This convertion violates GCC contract. The -meabi flag controls whether or not LLVM will modify 'memops' in GNUEABI targets. Without -meabi: use the triple default EABI. With -meabi=default: use the triple default EABI. With -meabi=gnu: use 'memops'. With -meabi=4 or -meabi=5: use '__aeabi_memops'. With -meabi set to an unknown value: same as -meabi=default. Patch by Vinicius Tinti. llvm-svn: 252462	2015-11-09 12:40:30 +00:00
Renato Golin	1d8a2c952f	Revert "[ARM] Combine CMOV into BFI where possible" This reverts commit r252057, as it broke ARM self-hosting buildbots, probably due to a code-gen fault. llvm-svn: 252460	2015-11-09 12:19:10 +00:00
Joseph Tremoulet	f748c8937e	[WinEH] Update exception pointer registers Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383	2015-11-07 01:11:31 +00:00
James Molloy	bef6e43107	[ARM] Compute known bits for ARMISD::CMOV We can conservatively know that CMOV's known bits are the intersection of known bits for each of its operands. This helps PerformCMOVToBFICombine find more opportunities. I tried hard to create a testcase for this and failed - we have to sufficiently confuse DAG.computeKnownBits which can see through all the cheap tricks I tried to narrow my larger testcase down :( This code is actually exercised in CodeGen/ARM/bfi.ll, there's just no functional difference because DAG.computeKnownBits gets the right answer in that case. llvm-svn: 252168	2015-11-05 15:21:58 +00:00
James Molloy	e7d679cf4c	[ARM] Combine CMOV into BFI where possible If we have a CMOV, OR and AND combination such as: if (x & CN) y \|= CM; And: * CN is a single bit; * All bits covered by CM are known zero in y; Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction). llvm-svn: 252057	2015-11-04 16:55:07 +00:00
Tim Northover	f8e47e4868	ARM: add support for WatchOS's compact unwind information. llvm-svn: 251573	2015-10-28 22:56:36 +00:00
Tim Northover	8b40366b54	ARM: teach backend about WatchOS and TvOS libcalls. The most substantial changes are again for watchOS: libcalls are hard-float if needed and sincos has a different calling convention. llvm-svn: 251571	2015-10-28 22:51:16 +00:00
Charlie Turner	458e79b814	[ARM] Expand ROTL and ROTR of vector value types Summary: After D13851 landed, we saw backend crashes when compiling the reduced test case included in this patch. The right fix seems to be to allow these vector types for expansion in instruction selection. Reviewers: rengolin, t.p.northover Subscribers: RKSimon, t.p.northover, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14082 llvm-svn: 251401	2015-10-27 10:25:20 +00:00
Peter Collingbourne	99fac80db2	ARM/ELF: Restore original (pre-r251322) logic for deciding whether to use GOT. Unbreaks linking with gold, which cannot resolve direct relocations referring to global symbols. llvm-svn: 251342	2015-10-26 20:46:44 +00:00
Peter Collingbourne	97aae40880	ARM/ELF: Better codegen for global variable addresses. In PIC mode we were previously computing global variable addresses (or GOT entry addresses) by adding the PC, the PC-relative GOT displacement and the GOT-relative symbol/GOT entry displacement. Because the latter two displacements are fixed, we ended up performing one more addition than necessary. This change causes us to compute addresses using a single PC-relative displacement, resulting in a shorter code sequence. This reduces code size by about 4% in a recent build of Chromium for Android. As a result of this change we no longer need to compute the GOT base address in the ARM backend, which allows us to remove the Global Base Reg pass and SDAG lowering for the GOT. We also now no longer use the GOT when addressing a symbol which is known to be defined in the same linkage unit. Specifically, the symbol must have either hidden visibility or a strong definition in the current module in order to not use the the GOT. This is a change from the previous behaviour where we would use the GOT to address externally visible symbols defined in the same module. I think the only cases where this could matter are cases involving symbol interposition, but we don't really support that well anyway. Differential Revision: http://reviews.llvm.org/D13650 llvm-svn: 251322	2015-10-26 18:23:16 +00:00
Craig Topper	8fe40e0ed5	Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC llvm-svn: 251032	2015-10-22 17:05:00 +00:00
Artyom Skrobov	7fd67e25aa	Adding support for TargetLoweringBase::LibCall Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way: - if DIVREM is legal, expand to DIVREM - if DIVREM has a custom lowering, expand to DIVREM - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall - else, expand to a DIV libcall This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used. The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations. The useful effect is that ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend. Reviewers: t.p.northover, rengolin Subscribers: t.p.northover, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13862 llvm-svn: 250826	2015-10-20 13:14:52 +00:00
Duncan P. N. Exon Smith	9f9559e807	ARM: Remove implicit ilist iterator conversions, NFC llvm-svn: 250759	2015-10-19 23:25:57 +00:00
Craig Topper	2626094fa1	Make a bunch of static arrays const. llvm-svn: 250642	2015-10-18 05:15:34 +00:00
Chad Rosier	169865ffda	[ARM] Promote helper function to SelectionDAG. I'll be using the function in a similar combine for AArch64. The helper was also improved to handle undef values. Part of http://reviews.llvm.org/D13442 llvm-svn: 249572	2015-10-07 17:28:58 +00:00
Oliver Stannard	d3d114ba54	[ARM] Use correct half-precision functions in EABI mode The ARM RTABI defines the half- to single-precision float conversion functions with an __aeabi prefix, but libgcc only has them with a __gnu prefix. Therefore we need to emit the __aeabi version when compiling with an eabi or eabihf triple, and the __gnu version with a gnueabi or gnueabihf triple. llvm-svn: 249565	2015-10-07 16:58:49 +00:00
Chad Rosier	17436bf64e	[ARM] Prevent PerformVDIVCombine from combining a vcvt/vdiv with 8 lanes. This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 249560	2015-10-07 16:15:40 +00:00
Jeroen Ketema	aebca09543	[ARM][AArch64] Only lower to interleaved load/store if the target has NEON Without an additional check for NEON, the compiler crashes during legalization of NEON ldN/stN. Differential Revision: http://reviews.llvm.org/D13508 llvm-svn: 249550	2015-10-07 14:53:29 +00:00
Chad Rosier	db71abf2d4	[ARM] Push more complex check down to reduce compile time. NFC. llvm-svn: 249547	2015-10-07 13:40:44 +00:00
Chad Rosier	dca46b426f	[ARM] Minor refactoring. NFC. llvm-svn: 249465	2015-10-06 20:58:42 +00:00
Chad Rosier	aed910b7d7	[ARM] Minor refactoring. NFC. llvm-svn: 249464	2015-10-06 20:51:26 +00:00
Chad Rosier	9df4aff86d	[ARM] Minor refactoring. NFC. llvm-svn: 249463	2015-10-06 20:45:45 +00:00
Chad Rosier	a087fd21da	[ARM] Minor refactoring to improve readability. NFC. llvm-svn: 249454	2015-10-06 20:23:42 +00:00
Scott Douglass	953f908173	[ARM] Modify codegen for memcpy intrinsic to prefer LDM/STM. We were previously codegen'ing memcpy as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achieving better codegen is to create MEMCPY pseudo instructions, attach scratch virtual registers to them and then, post register allocation, expand the MEMCPYs into LDM/STM pairs using the scratch registers. The register allocator will have picked arbitrary registers which we sort when expanding the MEMCPY. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. Fixes PR9199 and PR23768. [This is based on Peter Collingbourne's r238473 which was reverted.] Differential Revision: http://reviews.llvm.org/D13239 Change-Id: I727543c2e94136e0f80b8e22d5642d7b9ee5b458 Author: Peter Collingbourne <peter@pcc.me.uk> llvm-svn: 249322	2015-10-05 14:49:54 +00:00
Jeroen Ketema	ab99b59e8c	[ARM][NEON] Use address space in vld([1234]\|[234]lane) and vst([1234]\|[234]lane) instructions This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234], vst[234]lane ARM neon intrinsics and associates an address space with the pointer that these intrinsics take. This changes, e.g., <2 x i32> @llvm.arm.neon.vld1.v2i32(i8, i32) to <2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8, i32) This change ensures that address spaces are fully taken into account in the ARM target during lowering of interleaved loads and stores. Differential Revision: http://reviews.llvm.org/D12985 llvm-svn: 248887	2015-09-30 10:56:37 +00:00
Artyom Skrobov	ad8a0638f7	[ARM] Avoid redundant checks for isThumb1Only() after supportsTailCall() supportsTailCall() has two callers. Both of them double-check isThumb1Only(), and refuse to proceed with tail-calling in that case. Therefore, it makes sense to move this check to ARMSubtarget::initSubtargetFeatures, where SupportsTailCall is initialized; and to eliminate the extra checks at the call sites. Following a review comment, added an "assert(supportsTailCall())" in IsEligibleForTailCall. NFC. llvm-svn: 248703	2015-09-28 09:44:11 +00:00
Ahmed Bougacha	e81610fabb	[ARM] Don't generate clrex for pre-v7 targets. Since r248294, we emit clrex, but it doesn't exist on v6. llvm-svn: 248640	2015-09-26 00:14:02 +00:00
Saleem Abdulrasool	8e99f50768	ARM: make -Asserts,-Werror=unused-variable build happy The value was only used in an assertion. Sink the variable usage into the assertion. llvm-svn: 248562	2015-09-25 05:41:02 +00:00
Saleem Abdulrasool	fe83b50289	ARM: address WoA division limitation We now emit the compiler generated divide by zero check that was needed for the MSVC routines. We construct a psuedo-instruction for the DBZ check as the operation requires splitting up the BB. For the 64-bit operations, we need to custom expand the node as we need to insert the DBZ check and then emit the libcall to the appropriate name. Because this is target specific, it seemed better to reproduce the expansion operation from the target-agnostic type legalization rather than sink this there to avoid the duplication. The division library calls now match MSVC semantically. llvm-svn: 248561	2015-09-25 05:15:46 +00:00
Artyom Skrobov	cf296444ab	[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with a FIXME: attached. This patch changes the handling of +t2dsp to be in line with other architecture extensions. Following a revert of r248152 and new review comments, this patch also includes renaming FeatureDSPThumb2 -> FeatureDSP, hasThumb2DSP() -> hasDSP(), etc. The spelling of "t2dsp" is preserved, pending a further investigation of its possible external usage. Differential Revision: http://reviews.llvm.org/D12937 llvm-svn: 248519	2015-09-24 17:31:16 +00:00
Ahmed Bougacha	81616a72ea	[ARM] Emit clrex in the expanded cmpxchg fail block. ARM counterpart to r248291: In the comparison failure block of a cmpxchg expansion, the initial ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64, this unnecessarily ties up the execution monitor, which might have a negative performance impact on some uarchs. Instead, release the monitor in the failure block. The clrex instruction was designed for this: use it. Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and Shareable memory locations". Differential Revision: http://reviews.llvm.org/D13033 llvm-svn: 248294	2015-09-22 17:22:58 +00:00
Jeroen Ketema	41681a5329	[ARM] Do not scale vext with a factor The vext pseudo-instruction takes the number of elements that need to be extracted, not the number of bytes. Hence, use the number of elements directly instead of scaling them with a factor. Reviewers: Silviu Baranga, James Molloy (not reflected in the differential revision) Differential Revision: http://reviews.llvm.org/D12974 llvm-svn: 248208	2015-09-21 20:28:04 +00:00
Saleem Abdulrasool	4966f58ac2	ARM: cleanup formatting clang-format a line which was poorly formatted. NFC. llvm-svn: 248110	2015-09-20 03:19:09 +00:00
Sanjay Patel	a260701bbb	propagate fast-math-flags on DAG nodes After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815	2015-09-16 16:31:21 +00:00
Ahmed Bougacha	5246867384	[CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit. We used to have this magic "hasLoadLinkedStoreConditional()" callback, which really meant two things: - expand cmpxchg (to ll/sc). - expand atomic loads using ll/sc (rather than cmpxchg). Remove it, and, instead, introduce explicit callbacks: - bool shouldExpandAtomicCmpXchgInIR(inst) - AtomicExpansionKind shouldExpandAtomicLoadInIR(inst) Differential Revision: http://reviews.llvm.org/D12557 llvm-svn: 247429	2015-09-11 17:08:28 +00:00
Ahmed Bougacha	9d677131c4	[CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind. This lets us generalize its usage to the other atomic instructions. llvm-svn: 247428	2015-09-11 17:08:17 +00:00
James Molloy	8c995a93ce	[ARM] Do not use vtrn for vectorshuffle if the order is reversed The tests in isVTRNMask and isVTRN_v_undef_Mask should also check that the elements of the upper and lower half of the vectorshuffle occur in the correct order when both halves are used. Without this test the code assumes that it is correct to use vector transpose (vtrn) for the masks <1, 1, 0, 0> and <1, 3, 0, 2>, among others, but the transpose actually incorrectly generates shuffles for <0, 0, 1, 1> and <0, 2, 1, 3> in this case. Patch by Jeroen Ketema! llvm-svn: 247254	2015-09-10 08:42:28 +00:00
Ahmed Bougacha	699a9dd7c3	[ARM] Don't abort on variable-idx extractelt in ReconstructShuffle. The code introduced in r244314 assumed that EXTRACT_VECTOR_ELT only takes constant indices, but it does accept variables. Bail out for those: we can't use them, as the shuffles we want to reconstruct do require constant masks. llvm-svn: 246594	2015-09-01 21:56:00 +00:00
Ahmed Bougacha	f9c19da03a	[CodeGen] Support (and default to) expanding READCYCLECOUNTER to 0. For targets that didn't support this, this will let us respect the langref instead of failing to select. Note that we don't need to change the 32-bit x86/PPC lowerings (to account for the result type/# difference) because they're both custom and bypass type legalization. llvm-svn: 246258	2015-08-28 01:49:59 +00:00
Reid Kleckner	0e2882345d	[WinEH] Add some support for code generating catchpad We can now run 32-bit programs with empty catch bodies. The next step is to change PEI so that we get funclet prologues and epilogues. llvm-svn: 246235	2015-08-27 23:27:47 +00:00
Scott Douglass	bdef60462d	[ARM] Use AEABI helpers for i64 div and rem Differential Revision: http://reviews.llvm.org/D12232 llvm-svn: 245830	2015-08-24 09:17:18 +00:00
Scott Douglass	d2974a6afa	[ARM] Refactor LowerDivRem before adding LowerREM (nfc) Differential Revision: http://reviews.llvm.org/D12230 llvm-svn: 245829	2015-08-24 09:17:11 +00:00
James Molloy	bf17009a97	[ARM] Don't try and custom lower a vNi64 SETCC. It won't go well. We've already marked 64-bit SETCCs as non-Custom, but it's just possible that a SETCC has a legal result type but an illegal operand type. If this happens, bail out before we create unselectable nodes. Fixes PR24292. I tried to create a testcase but in 99% of cases we can't trigger this - not surprising that this bug has been latent since 2009. llvm-svn: 245577	2015-08-20 16:33:44 +00:00
Silviu Baranga	ad1b19fcb7	[ARM] Add instruction selection patterns for vmin/vmax Summary: The mid-end was generating vector smin/smax/umin/umax nodes, but we were using vbsl to generatate the code. This adds the vmin/vmax patterns and a test to check that we are now generating vmin/vmax instructions. Reviewers: rengolin, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D12105 llvm-svn: 245439	2015-08-19 14:11:27 +00:00
James Molloy	974838f294	[ARM] Fix crash when targetting CPU without NEON We emulate a scalar vmin/vmax with NEON instructions as they don't exist in the VFP ISA. So only mark these as legal when NEON is available. Found here: https://code.google.com/p/chromium/issues/detail?id=521671 llvm-svn: 245231	2015-08-17 19:37:12 +00:00
James Molloy	c617be559a	Rip out hand-rolled matching code for VMIN, VMAX, VMINNM and VMAXNM This is no longer needed - SDAGBuilder will do this for us. llvm-svn: 245197	2015-08-17 07:13:15 +00:00
James Molloy	31117875c2	[ARM] FMINNAN/FMAXNAN of f64 are not legal. This was my error. We've got f32 marked as legal because they're simulated using a v2f32 instruction, but there's no equivalent for f64. This will get test coverage imminently when D12015 lands. llvm-svn: 244916	2015-08-13 17:28:26 +00:00
Alex Lorenz	e40c8a2b26	PseudoSourceValue: Replace global manager with a manager in a machine function. This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka llvm-svn: 244693	2015-08-11 23:09:45 +00:00
James Molloy	d616c642bb	[ARM] Match fminnan/fmaxnan for vector vmin/vmax instead of an intrinsic Lower Intrinsic::arm_neon_vmins/vmaxs to fminnan/fmaxnan and match that instead. This is important because SDAG will soon be able to select FMINNAN itself, so we need a unified lowering path for intrinsics and SDAG. NFCI. llvm-svn: 244593	2015-08-11 12:06:28 +00:00
James Molloy	ee868b2a3e	[ARM] Match fminnum/fmaxnum for vector vminnm/vmaxnm instead of an intrinsic Lower the intrinsic to a FMINNUM/FMAXNUM node and select that instead. This is important because soon SDAG will be able to select FMINNUM/FMAXNUM itself, so we need an integrated lowering path between SDAG and intrinsics. NFCI. llvm-svn: 244592	2015-08-11 12:06:25 +00:00
James Molloy	ea3a687a33	[ARM] Replace ARMISD::VMINNM/VMAXNM with ISD::FMINNUM/FMAXNUM NFCI. This replaces another custom ISDNode with a generic equivalent. llvm-svn: 244591	2015-08-11 12:06:22 +00:00
James Molloy	db8ee4b5a9	[ARM] Replace ARMISD::FMIN/FMAX with the shiny new ISD::FMINNAN/FMAXNAN. NFCI. This removes a custom ISDNode. llvm-svn: 244590	2015-08-11 12:06:15 +00:00
Benjamin Kramer	df005cbe19	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
Silviu Baranga	a07090f7fa	Fix unused variable warning introduced in r244314 llvm-svn: 244315	2015-08-07 12:05:46 +00:00
Silviu Baranga	3e8e51c1a9	[ARM] Update ReconstructShuffle to handle mismatched types Summary: Port the ReconstructShuffle function from AArch64 to ARM to handle mismatched incoming types in the BUILD_VECTOR node. This fixes an outstanding FIXME in the ReconstructShuffle code. Reviewers: t.p.northover, rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11720 llvm-svn: 244314	2015-08-07 11:40:46 +00:00
Sanjay Patel	924879ad2c	wrap OptSize and MinSize attributes for easier and consistent access (NFCI) Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994	2015-08-04 15:49:57 +00:00
Saleem Abdulrasool	0a2672bb43	ARM: support windows division routines This adds the software division routines for the Windows RTABI. These are not expected to be used often though as most modern Windows ARM capable targets support hardware division. In the case that the target CPU doesnt support hardware division, this will be the fallback. llvm-svn: 243952	2015-08-04 03:57:56 +00:00
Saleem Abdulrasool	67697a7ea9	ARM: make Darwin libcall registration table driven (NFC) Make the libcall updating table driven similar to the approach that the Linux and Windows codepath does below. NFC. llvm-svn: 243951	2015-08-04 03:57:52 +00:00
Craig Topper	e3dcce9700	De-constify pointers to Type since they can't be modified. NFC This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842	2015-08-01 22:20:21 +00:00
Sumanth Gundapaneni	532a13691c	[ARM] Lower modulo operation to generate __aeabi_divmod on Android For a modulo (reminder) operation, clang -target armv7-none-linux-gnueabi generates "__modsi3" clang -target armv7-none-eabi generates "__aeabi_idivmod" clang -target armv7-linux-androideabi generates "__modsi3" Android bionic libc doesn't provide a __modsi3, instead it provides a "__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate the correct call when ever there is a modulo operation. Differential Revision: http://reviews.llvm.org/D11661 llvm-svn: 243717	2015-07-31 00:45:12 +00:00
Sanjay Patel	1166f2ff9f	fix memcpy/memset/memmove lowering when optimizing for size Fixing MinSize attribute handling was discussed in D11363. This is a prerequisite patch to doing that. The handling of OptSize when lowering mem* functions was broken on Darwin because it wants to ignore -Os for these cases, but the existing logic also made it ignore -Oz (MinSize). The Linux change demonstrates a widespread problem. The backend doesn't usually recognize the MinSize attribute by itself; it assumes that if the MinSize attribute exists, then the OptSize attribute must also exist. Fixing this more generally will be a follow-on patch or two. Differential Revision: http://reviews.llvm.org/D11568 llvm-svn: 243693	2015-07-30 21:41:50 +00:00
Chih-Hung Hsieh	1e859582d6	Implement target independent TLS compatible with glibc's emutls.c. The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target//ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438	2015-07-28 16:24:05 +00:00
Luke Cheeseman	4d45ff2b87	[ARM] - Fix lowering of shufflevectors in AArch32 Some shufflevectors are currently being incorrectly lowered in the AArch32 backend as the existing checks for detecting the NEON operations from the shufflevector instruction expects the shuffle mask and the vector operands to be of the same length. This is not always the case as the mask may be twice as long as the operand; here only the lower half of the shufflemask gets checked, so provided the lower half of the shufflemask looks like a vector transpose (or even is just all -1 for undef) then the intrinsics may get incorrectly lowered into a vector transpose (VTRN) instruction. This patch fixes this by accommodating for both cases and adds regression tests. Differential Revision: http://reviews.llvm.org/D11407 llvm-svn: 243103	2015-07-24 09:57:05 +00:00
Luke Cheeseman	b5c627aba8	When lowering vector shifts a check is performed to see if the value to shift by is an immediate, in this check the value is negated and stored in and int64_t. The value can be -2^63 yet the result cannot be stored in an int64_t and this gives some undefined behaviour causing failures. The negation is only necessary when the values is within a certain range and so it should not need to negate -2^63, this patch introduces this and also a regression test. Differential Revision: http://reviews.llvm.org/D11408 llvm-svn: 243100	2015-07-24 09:31:48 +00:00
James Molloy	a6702e2f14	[ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA No functional change, but it preps codegen for the future when SABSDIFF will start getting generated in anger. llvm-svn: 242546	2015-07-17 17:10:55 +00:00
Matthias Braun	3cd00c1739	Fix __builtin_setjmp in combination with sjlj exception handling. llvm.eh.sjlj.setjmp was used as part of the SjLj exception handling style but is also used in clang to implement __builtin_setjmp. The ARM backend needs to output additional dispatch tables for the SjLj exception handling style, these tables however can't be emitted if llvm.eh.sjlj.setjmp is simply used for __builtin_setjmp and no actual landing pad blocks exist. To solve this issue a new llvm.eh.sjlj.setup_dispatch intrinsic is introduced which is used instead of llvm.eh.sjlj.setjmp in the SjLj exception handling lowering, so we can differentiate between the case where we actually need to setup a dispatch table and the case where we just need the __builtin_setjmp semantic. Differential Revision: http://reviews.llvm.org/D9313 llvm-svn: 242481	2015-07-16 22:34:16 +00:00
Logan Chien	0a43abc9f8	ARM: Fix cttz expansion on vector types. The 64/128-bit vector types are legal if NEON instructions are available. However, there was no matching patterns for @llvm.cttz.*() intrinsics and result in fatal error. This commit fixes the problem by lowering cttz to: a. ctpop((x & -x) - 1) b. width - ctlz(x & -x) - 1 llvm-svn: 242037	2015-07-13 15:37:30 +00:00
Pat Gavlin	a717f255b6	Allow {e,r}bp as the target of {read,write}_register. This patch allows the read_register and write_register intrinsics to read/write the RBP/EBP registers on X86 iff the targeted register is the frame pointer for the containing function. Differential Revision: http://reviews.llvm.org/D10977 llvm-svn: 241827	2015-07-09 17:40:29 +00:00
Mehdi Amini	a749f2ad47	Remove getDataLayout() from TargetLowering Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779	2015-07-09 02:09:52 +00:00
Mehdi Amini	0cdec1e2ab	Make isLegalAddressingMode() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11040 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241778	2015-07-09 02:09:40 +00:00
Mehdi Amini	44ede33a69	Make TargetLowering::getPointerTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775	2015-07-09 02:09:04 +00:00
Mehdi Amini	ffc1402fad	Remove IsLittleEndian from TargetLowering and redirect to DataLayout Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11017 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241655	2015-07-08 01:00:38 +00:00
Akira Hatanaka	1bc8af78f4	[ARM] Define a subtarget feature and use it to decide whether long calls should be emitted. This is needed to enable ARM long calls for LTO and enable and disable it on a per-function basis. Out-of-tree projects currently using EnableARMLongCalls to emit long calls should start passing "+long-calls" to the feature string (see the changes made to clang in r241565). rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D9364 llvm-svn: 241566	2015-07-07 06:54:42 +00:00
Peter Collingbourne	6a9d1774d0	IR: Do not consider available_externally linkage to be linker-weak. From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413	2015-07-05 20:52:35 +00:00
Benjamin Kramer	9bfb627a0e	[TargetLowering] StringRefize asm constraint getters. There is some functional change here because it changes target code from atoi(3) to StringRef::getAsInteger which has error checking. For valid constraints there should be no difference. llvm-svn: 241411	2015-07-05 19:29:18 +00:00
Hao Liu	2cd34bb585	[ARM] Lower interleaved memory accesses to vldN/vstN intrinsics. This patch also adds a function to calculate the cost of interleaved memory accesses. E.g. Lower an interleaved load: %wide.vec = load <8 x i32>, <8 x i32>* %ptr, align 4 %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6> %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7> into: %vld2 = { <4 x i32>, <4 x i32> } call llvm.arm.neon.vld2(%ptr, 4) %vec0 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 0 %vec1 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 1 E.g. Lower an interleaved store: %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11> store <12 x i32> %i.vec, <12 x i32>* %ptr, align 4 into: %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3> %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7> %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11> call void llvm.arm.neon.vst3(%ptr, %sub.v0, %sub.v1, %sub.v2, 4) Differential Revision: http://reviews.llvm.org/D10533 llvm-svn: 240755	2015-06-26 02:45:36 +00:00
Alexander Kornienko	f00654e31b	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390	2015-06-23 09:49:53 +00:00
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-,llvm-namespace-comment -header-filter='llvm/.\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
Ahmed Bougacha	9a9094260d	[ARM] Look through concat when lowering in-place shuffles (VZIP, ..) Currently, we canonicalize shuffles that produce a result larger than their operands with: shuffle(concat(v1, undef), concat(v2, undef)) -> shuffle(concat(v1, v2), undef) because we can access quad vectors (see PerformVECTOR_SHUFFLECombine). This is useful in the general case, but there are special cases where native shuffles produce larger results: the two-result ops. We can look through the concat when lowering them: shuffle(concat(v1, v2), undef) -> concat(VZIP(v1, v2):0, :1) This lets us generate the native shuffles instead of scalarizing to dozens of VMOVs. Differential Revision: http://reviews.llvm.org/D10424 llvm-svn: 240118	2015-06-19 02:32:35 +00:00
Ahmed Bougacha	2ffa91f908	[ARM] Factor out two-result shuffle matching. NFCI. In preparation for a future patch: makes it easier to do the same matching to generate different nodes, without duplication. llvm-svn: 240116	2015-06-19 02:25:01 +00:00
Daniel Sanders	c81f450f1a	Clean up redundant copies of Triple objects. NFC Summary: Reviewers: rengolin Reviewed By: rengolin Subscribers: llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10382 llvm-svn: 239823	2015-06-16 15:44:21 +00:00
Reid Kleckner	c35e7f52ba	Revert "Move dllimport name mangling to IR mangler." This reverts commit r239437. This broke clang-cl self-hosts. We'd end up calling the __imp_ symbol directly instead of using it to do an indirect function call. llvm-svn: 239502	2015-06-11 01:31:48 +00:00
Peter Collingbourne	9fe51fdf18	Move dllimport name mangling to IR mangler. This ensures that LTO clients see the correct external symbol name. Differential Revision: http://reviews.llvm.org/D10318 llvm-svn: 239437	2015-06-09 22:09:53 +00:00
Akira Hatanaka	d9699bc7bd	Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions that was resetting it. Remove the uses of DisableTailCalls in subclasses of TargetLowering and use the value of function attribute "disable-tail-calls" instead. Also, unconditionally add pass TailCallElim to the pipeline and check the function attribute at the start of runOnFunction to disable the pass on a per-function basis. This is part of the work to remove TargetMachine::resetTargetOptions, and since DisableTailCalls was the last non-fast-math option that was being reset in that function, we should be able to remove the function entirely after the work to propagate IR-level fast-math flags to DAG nodes is completed. Out-of-tree users should remove the uses of DisableTailCalls and make changes to attach attribute "disable-tail-calls"="true" or "false" to the functions in the IR. rdar://problem/13752163 Differential Revision: http://reviews.llvm.org/D10099 llvm-svn: 239427	2015-06-09 19:07:19 +00:00
Peter Collingbourne	6679fc1a79	Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM." as it caused miscompilations and assertion failures (PR23768, http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html). llvm-svn: 239169	2015-06-05 18:01:28 +00:00
Luke Cheeseman	85fd06d389	Re-commit of r238201 with fix for building with shared libraries. llvm-svn: 238739	2015-06-01 12:02:47 +00:00
Matt Arsenault	bd7d80a4a6	Add address space argument to isLegalAddressingMode This is important because of different addressing modes depending on the address space for GPU targets. This only adds the argument, and does not update any of the uses to provide the correct address space. llvm-svn: 238723	2015-06-01 05:31:59 +00:00
Peter Collingbourne	450fbee6b2	Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM. We were previously codegen'ing these as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achiveing better codegen is to create LDM/STM instructions with identical sets of virtual registers, let the register allocator pick arbitrary registers and order register lists when printing an MCInst. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. This is implemented by lowering the memcpy intrinsic to a series of SD-only MCOPY pseudo-instructions which performs a memory copy using a given number of registers. During SD->MI lowering, we lower MCOPY to LDM/STM. This is a little unusual, but it avoids the need to encode register lists in the SD, and we can take advantage of SD use lists to decide whether to use the _UPD variant of the instructions. Fixes PR9199. Differential Revision: http://reviews.llvm.org/D9508 llvm-svn: 238473	2015-05-28 20:02:45 +00:00
Diego Novillo	bfecc06656	Revert "Re-commit changes in r237579 with fix for bug breaking windows builds." This reverts commit r238201 to fix linking problems in x86 Linux http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html llvm-svn: 238223	2015-05-26 17:45:38 +00:00
Luke Cheeseman	a5d053d6f4	Re-commit changes in r237579 with fix for bug breaking windows builds. llvm-svn: 238201	2015-05-26 13:40:31 +00:00
Luke Cheeseman	0af4f635f1	Test Commit llvm-svn: 238199	2015-05-26 13:10:35 +00:00
Matthias Braun	6091208331	ARM: Fix comment and make it slightly more readable llvm-svn: 237820	2015-05-20 18:40:06 +00:00
David Blaikie	ff6409d096	Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced init only llvm-svn: 237624	2015-05-18 22:13:54 +00:00
Oliver Stannard	6cb23465e0	Revert r237579, as it broke windows buildbots llvm-svn: 237583	2015-05-18 16:39:16 +00:00
Oliver Stannard	0c553afe6a	[LLVM - ARM/AArch64] Add ACLE special register intrinsics This patch implements LLVM support for the ACLE special register intrinsics in section 10.1, __arm_{w,r}sr{,p,64}. This patch is intended to lower the read/write_register instrinsics, used to implement the special register intrinsics in the clang patch for special register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor registers in AArch32 and AArch64. This is done by inspecting the register string passed to the intrinsic and then lowering to the appropriate instruction. Patch by Luke Cheeseman. Differential Revision: http://reviews.llvm.org/D9699 llvm-svn: 237579	2015-05-18 16:23:33 +00:00
Artyom Skrobov	a70dfe18d3	Re-apply r237247 - [AArch64] Codegen VMAX/VMIN for safe math cases No longer breaks SPEC2000/2006 llvm-svn: 237361	2015-05-14 12:59:46 +00:00
Tim Northover	4998a47f73	ARM: remove custom jump table UID We were creating and propagating two separate indices for each jump table (from back in the mists of time). However, the generic index used by other backends is sufficient to emit a unique symbol so this was unneeded. llvm-svn: 237294	2015-05-13 20:28:38 +00:00
Silviu Baranga	780a3b3be7	Revert r237247 - [AArch64] Codegen VMAX/VMIN.. as it is causing failures in SPEC2000/2006 llvm-svn: 237256	2015-05-13 14:03:18 +00:00
Artyom Skrobov	b526681e08	[AArch64] Codegen VMAX/VMIN for safe math cases llvm-svn: 237247	2015-05-13 12:01:09 +00:00
Eric Christopher	824f42f209	Migrate existing backends that care about software floating point to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. llvm-svn: 237079	2015-05-12 01:26:05 +00:00
Arnold Schwaighofer	f54b73d681	ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects The code that builds the dependence graph assumes that two PseudoSourceValues don't alias. In a tail calling function two FixedStackObjects might refer to the same location. Worse 'immutable' fixed stack objects like function arguments are not immutable and will be clobbered. Change this so that a load from a FixedStackObject is not invariant in a tail calling function and don't return a PseudoSourceValue for an instruction in tail calling functions when building the dependence graph so that we handle function arguments conservatively. Fix for PR23459. rdar://20740035 llvm-svn: 236916	2015-05-08 23:52:00 +00:00
Matthias Braun	f45afee3dc	Fix typo. llvm-svn: 236785	2015-05-07 22:16:10 +00:00
Matthias Braun	d04893fa36	Change getTargetNodeName() to produce compiler warnings for missing cases, fix them llvm-svn: 236775	2015-05-07 21:33:59 +00:00
Artyom Skrobov	3f8eae92a4	[ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too llvm-svn: 236590	2015-05-06 11:44:10 +00:00
Pete Cooper	5111881cfc	Don't always apply kill flag in thumb2 ABS pseudo expansion. The expansion for t2ABS was always setting the kill flag on the rsb instruction. It should instead only be set on rsb if it was set on the original ABS instruction. rdar://problem/20752113 llvm-svn: 236272	2015-04-30 22:15:59 +00:00
Sergey Dmitrouk	842a51bad8	Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" [DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989	2015-04-28 14:05:47 +00:00
Daniel Jasper	48e93f7181	Revert "[DebugInfo] Add debug locations to constant SD nodes" This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987	2015-04-28 13:38:35 +00:00
Sergey Dmitrouk	adb4c69d5c	[DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977	2015-04-28 11:56:37 +00:00
Matthias Braun	eec4efcca5	Cleanup, remove unused return value llvm-svn: 235952	2015-04-28 00:37:05 +00:00
Scott Douglass	7ad7792088	[ARM] make vminnm/vmaxnm work with ?le, ?ge and no-nans-fp-math Because -menable-no-nans causes fcmp conditions to be rewritten without 'o' or 'u' the recognition code in needs to cope. Also extended it to handle 'le' and 'ge. Differential Revision: http://reviews.llvm.org/D8725 llvm-svn: 234421	2015-04-08 17:18:28 +00:00
Derek Schuff	b051389f04	Use movw/movt instead of constant pool loads to lower byval parameter copies Summary: The ARM backend can use a loop to implement copying byval parameters before a call. In non-thumb2 mode it uses a constant pool load to materialize the trip count. For targets that need movt instead (e.g. Native Client), use the same code as in thumb2 mode to materialize the trip count. Reviewers: jfb, t.p.northover Differential Revision: http://reviews.llvm.org/D8442 llvm-svn: 233324	2015-03-26 22:11:00 +00:00
Benjamin Kramer	799003bf8c	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998	2015-03-23 19:32:43 +00:00
James Molloy	fa041153e5	[ARM] Remove target-specific ITOFP/FPTOI nodes Anton tried this 5 years ago but it was reverted due to extra VMOVs being emitted. This can be easily fixed with a liberal application of patterns - matching loads/stores and extractelts. llvm-svn: 232958	2015-03-23 16:15:16 +00:00
John Brawn	0dbcd65442	[ARM] Align stack objects passed to memory intrinsics Memcpy, and other memory intrinsics, typically tries to use LDM/STM if the source and target addresses are 4-byte aligned. In CodeGenPrepare look for calls to memory intrinsics and, if the object is on the stack, 4-byte align it if it's large enough that we expect that memcpy would want to use LDM/STM to copy it. Differential Revision: http://reviews.llvm.org/D7908 llvm-svn: 232627	2015-03-18 12:01:59 +00:00
Eric Christopher	ae32649ff2	In preparation for moving ARM's TargetRegisterInfo to the TargetMachine merge Thumb1RegisterInfo and Thumb2RegisterInfo. This will enable us to match the TargetMachine for our TargetRegisterInfo classes. llvm-svn: 232117	2015-03-12 22:48:50 +00:00
Aaron Ballman	c579d66b9a	Silencing an "enumeral and non-enumeral type in conditional expression" warning; NFC. llvm-svn: 232035	2015-03-12 13:24:06 +00:00
Eric Christopher	9deb75d176	Have getCallPreservedMask and getThisCallPreservedMask take a MachineFunction argument so that we can grab subtarget specific features off of it. llvm-svn: 231979	2015-03-11 22:42:13 +00:00
Tim Northover	8cda34f5e7	ARM: simplify and extend byval handling The main issue being fixed here is that APCS targets handling a "byval align N" parameter with N > 4 were miscounting what objects were where on the stack, leading to FrameLowering setting the frame pointer incorrectly and clobbering the stack. But byval handling had grown over many years, and had multiple layers of cruft trying to compensate for each other and calculate padding correctly. This only really needs to be done once, in the HandleByVal function. Elsewhere should just do what it's told by that call. I also stripped out unnecessary APCS/AAPCS distinctions (now that Clang emits byvals with the correct C ABI alignment), which simplified HandleByVal. rdar://20095672 llvm-svn: 231959	2015-03-11 18:54:22 +00:00
Benjamin Kramer	7bd1f7cb58	Remove the remaining uses of abs64 and nuke it. std::abs works just fine and we're already using it in many places. NFC intended. llvm-svn: 231696	2015-03-09 20:20:16 +00:00
Benjamin Kramer	867bfc53ee	Make constant arrays that are passed to functions as const. In theory this allows the compiler to skip materializing the array on the stack. In practice clang often fails to do that, but that's a different story. NFC. llvm-svn: 231571	2015-03-07 17:41:00 +00:00
Ahmed Bougacha	4200cc95b4	[ARM] Enable vector extload combine for legal types. This commit enables forming vector extloads for ARM. It only does so for legal types, and when we can't fold the extension in a wide/long form of the user instruction. Enabling it for larger types isn't as good an idea on ARM as it is on X86, because: - we pretend that extloads are legal, but end up generating vld+vmov - we have instructions like vld {dN, dM}, which can't be generated when we "manually expand" extloads to vld+vmov. For legal types, the combine doesn't fire that often: in the integration tests only in a big endian testcase, where it removes a pointless AND. Related to rdar://19723053 Differential Revision: http://reviews.llvm.org/D7423 llvm-svn: 231396	2015-03-05 19:37:53 +00:00
JF Bastien	f14889ee34	Mutate TargetLowering::shouldExpandAtomicRMWInIR to specifically dictate how AtomicRMWInsts are expanded. Summary: In PNaCl, most atomic instructions have their own @llvm.nacl.atomic.* function, each one, with a few exceptions, represents a consistent behaviour across all NaCl-supported targets. Unfortunately, the atomic RMW operations nand, [u]min, and [u]max aren't directly represented by any such @llvm.nacl.atomic.* function. This patch refines shouldExpandAtomicRMWInIR in TargetLowering so that a future `Le32TargetLowering` class can selectively inform the caller how the target desires the atomic RMW instruction to be expanded (ie via load-linked/store-conditional for ARM/AArch64, via cmpxchg for X86/others?, or not at all for Mips) if at all. This does not represent a behavioural change and as such no tests were added. Patch by: Richard Diamond. Reviewers: jfb Reviewed By: jfb Subscribers: jfb, aemerson, t.p.northover, llvm-commits Differential Revision: http://reviews.llvm.org/D7713 llvm-svn: 231250	2015-03-04 15:47:57 +00:00
Pete Cooper	ef21bd444d	Remove MCStreamer.h include from MCContext.h and explictly include it where necessary. NFC llvm-svn: 231193	2015-03-04 01:24:11 +00:00
Eric Christopher	11e4df73c8	getRegForInlineAsmConstraint wants to use TargetRegisterInfo for a lookup, pass that in rather than use a naked call to getSubtargetImpl. This involved passing down and around either a TargetMachine or TargetRegisterInfo. Update all callers/definitions around the targets and SelectionDAG. llvm-svn: 230699	2015-02-26 22:38:43 +00:00
Eric Christopher	23a3a7c871	Remove an argument-less call to getSubtargetImpl from TargetLoweringBase. This required plumbing a TargetRegisterInfo through computeRegisterProperties and into findRepresentativeClass which uses it for register class iteration. This required passing a subtarget into a few target specific initializations of TargetLowering. llvm-svn: 230583	2015-02-26 00:00:24 +00:00
Tim Northover	e95c5b3236	ARM: treat [N x i32] and [N x i64] as AAPCS composite types The logic is almost there already, with our special homogeneous aggregate handling. Tweaking it like this allows front-ends to emit AAPCS compliant code without ever having to count registers or add discarded padding arguments. Only arrays of i32 and i64 are needed to model AAPCS rules, but I decided to apply the logic to all integer arrays for more consistency. llvm-svn: 230348	2015-02-24 17:22:34 +00:00

1 2 3 4 5 ...

1383 Commits