llvm-project

Commit Graph

Author	SHA1	Message	Date
Sjoerd Meijer	834f7dc7ab	[ARM] FP16 vmaxnm/vminnm scalar instructions This adds code generation support for the FP16 vmaxnm/vminnm scalar instructions. Differential Revision: https://reviews.llvm.org/D44675 llvm-svn: 330034	2018-04-13 15:34:26 +00:00
Sjoerd Meijer	ac96d7c4b3	[ARM] FP16 VSEL codegen This is a follow up of rL327695 to instruction select more variants of VSELGT and VSELGE, for which it is necessary to custom lower SELECT. More work is required in this area, which will be addressed soon: - more variants need to be regression tested, but this depends on the next point. - first LowerConstantFP need to be adjusted for fp16 values. Differential Revision: https://reviews.llvm.org/D45205 llvm-svn: 329788	2018-04-11 09:28:04 +00:00
Craig Topper	2fa1436206	[IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806	2018-03-29 17:21:10 +00:00
Christof Douma	a1e77c0e02	[ARM] Support float literals under XO Follow up patch of r328313 to support the UseVMOVSR constraint. Removed some unneeded instructions from the test and removed some stray comments. Differential Revision: https://reviews.llvm.org/D44941 llvm-svn: 328691	2018-03-28 10:02:26 +00:00
David Blaikie	36a0f226b1	Fix layering by moving ValueTypes.h from CodeGen to IR ValueTypes.h is implemented in IR already. llvm-svn: 328397	2018-03-23 23:58:31 +00:00
David Blaikie	13e77db2df	Fix layering of MachineValueType.h by moving it from CodeGen to Support This is used by llvm tblgen as well as by LLVM Targets, so the only common place is Support for now. (maybe we need another target for these sorts of things - but for now I'm at least making them correct & we can make them better if/when people have strong feelings) llvm-svn: 328395	2018-03-23 23:58:25 +00:00
Christof Douma	4a025cc79d	[ARM] Support float literals under XO When targeting execute-only and fp-armv8, float constants in a compare resulted in instruction selection failures. This is now fixed by using vmov.f32 where possible, otherwise the floating point constant is lowered into a integer constant that is moved into a floating point register. This patch also restores using fpcmp with immediate 0 under fp-armv8. Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443 llvm-svn: 328313	2018-03-23 13:02:03 +00:00
Martin Storsjo	9a55c1b0dc	[ARM, AArch64] Check the no-stack-arg-probe attribute for dynamic stack probes This extends the use of this attribute on ARM and AArch64 from SVN r325900 (where it was only checked for fixed stack allocations on ARM/AArch64, but for all stack allocations on X86). This also adds a testcase for the existing use of disabling the fixed stack probe with the attribute on ARM and AArch64. Differential Revision: https://reviews.llvm.org/D44291 llvm-svn: 327897	2018-03-19 20:06:50 +00:00
Sjoerd Meijer	d16037d9bb	[ARM] Support for v4f16 and v8f16 vectors This is the groundwork for adding the Armv8.2-A FP16 vector intrinsics, which uses v4f16 and v8f16 vector operands and return values. All the moving parts are tested with two intrinsics, a 1-operand v8f16 and a 2-operand v4f16 intrinsic. In a follow-up patch the rest of the intrinsics and tests will be added. Differential Revision: https://reviews.llvm.org/D44538 llvm-svn: 327839	2018-03-19 13:35:25 +00:00
Sjoerd Meijer	d391a1a985	[ARM] FP16 codegen support for VSEL This implements lowering of SELECT_CC for f16s, which enables codegen of VSEL with f16 types. Differential Revision: https://reviews.llvm.org/D44518 llvm-svn: 327695	2018-03-16 08:06:25 +00:00
Sjoerd Meijer	af30f06d5c	[ARM] Fix for PR36577 Don't PerformSHLSimplify if the given node is used by a node that also uses a constant because we may get stuck in an infinite combine loop. bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36577 Patch by Sam Parker. Differential Revision: https://reviews.llvm.org/D44097 llvm-svn: 326882	2018-03-07 09:10:44 +00:00
Chih-Hung Hsieh	9f9e4681ac	[TLS] use emulated TLS if the target supports only this mode Emulated TLS is enabled by llc flag -emulated-tls, which is passed by clang driver. When llc is called explicitly or from other drivers like LTO, missing -emulated-tls flag would generate wrong TLS code for targets that supports only this mode. Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether emulated TLS code should be generated. Unit tests are modified to run with and without the -emulated-tls flag. Differential Revision: https://reviews.llvm.org/D42999 llvm-svn: 326341	2018-02-28 17:48:55 +00:00
Pablo Barrio	512f7ee315	[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations Summary: Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before. Patch by Martin Svanfeldt Reviewers: fhahn, pbarrio, rogfer01 Reviewed By: pbarrio, rogfer01 Subscribers: chrib, yroux, eugenis, efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42574 llvm-svn: 326333	2018-02-28 17:13:07 +00:00
Geoff Berry	f8bf2ec0a8	[MachineOperand][Target] MachineOperand::isRenamable semantics changes Summary: Add a target option AllowRegisterRenaming that is used to opt in to post-register-allocation renaming of registers. This is set to 0 by default, which causes the hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq fields of all opcodes to be set to 1, causing MachineOperand::isRenamable to always return false. Set the AllowRegisterRenaming flag to 1 for all in-tree targets that have lit tests that were effected by enabling COPY forwarding in MachineCopyPropagation (AArch64, AMDGPU, ARM, Hexagon, Mips, PowerPC, RISCV, Sparc, SystemZ and X86). Add some more comments describing the semantics of the MachineOperand::isRenamable function and how it is set and maintained. Change isRenamable to check the operand's opcode hasExtraSrcRegAllocReq/hasExtraDstRegAllocReq bit directly instead of relying on it being consistently reflected in the IsRenamable bit setting. Clear the IsRenamable bit when changing an operand's register value. Remove target code that was clearing the IsRenamable bit when changing registers/opcodes now that this is done conservatively by default. Change setting of hasExtraSrcRegAllocReq in AMDGPU target to be done in one place covering all opcodes that have constant pipe read limit restrictions. Reviewers: qcolombet, MatzeB Subscribers: aemerson, arsenm, jyknight, mcrosier, sdardis, nhaehnle, javed.absar, tpr, arichardson, kristof.beyls, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, escha, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D43042 llvm-svn: 325931	2018-02-23 18:25:08 +00:00
Sjoerd Meijer	4d5c40492a	[ARM] Lower BR_CC for f16 This case wasn't handled yet. Differential Revision: https://reviews.llvm.org/D43508 llvm-svn: 325616	2018-02-20 19:28:05 +00:00
Roger Ferrer Ibanez	d41059a9f6	[ARM] Materialise some boolean values to avoid a branch This patch combines some cases of ARMISD::CMOV for integers that arise in comparisons of the form a != b ? x : 0 a == b ? 0 : x and that currently (e.g. in Thumb1) are emitted as branches. Differential Revision: https://reviews.llvm.org/D34515 llvm-svn: 325323	2018-02-16 09:23:59 +00:00
Pablo Barrio	e28cb8399a	[ARM] Allow 64- and 128-bit types with 't' inline asm constraint Summary: In LLVM, 't' selects a floating-point/SIMD register and only supports 32-bit values. This is appropriately documented in the LLVM Language Reference Manual. However, this behaviour diverges from that of GCC, where 't' selects the s0-s31 registers and its qX and dX variants depending on additional operand modifiers (q/P). For example, the following C code: #include <arm_neon.h> float32x4_t a, b, x; asm("vadd.f32 %0, %1, %2" : "=t" (x) : "t" (a), "t" (b)) results in the following assembly if compiled with GCC: vadd.f32 s0, s0, s1 whereas LLVM will show "error: couldn't allocate output register for constraint 't'", since a, b, x are 128-bit variables, not 32-bit. This patch extends the use of 't' to mean that of GCC, thus allowing selection of the lower Q vector regs and their D/S variants. For example, the earlier code will now compile as: vadd.f32 q0, q0, q1 This behaviour still differs from that of GCC but I think it is actually more correct, since LLVM picks up the right register type based on the datatype of x, while GCC would need an extra operand modifier to achieve the same result, as follows: asm("vadd.f32 %q0, %q1, %q2" : "=t" (x) : "t" (a), "t" (b)) Since this is only an extension of functionality, existing code should not be affected by this change. Note that operand modifiers q/P are already supported by LLVM, so this patch should suffice to support inline assembly with constraint 't' originally built for GCC. Reviewers: grosbach, rengolin Reviewed By: rengolin Subscribers: rogfer01, efriedma, olista01, aemerson, javed.absar, eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42962 llvm-svn: 325244	2018-02-15 14:44:22 +00:00
Sjoerd Meijer	89ea2648bb	[ARM] Armv8.2-A FP16 code generation (part 3/3) This adds most of the FP16 codegen support, but these areas need further work: - FP16 literals and immediates are not properly supported yet (e.g. literal pool needs work), - Instructions that are generated from intrinsics (e.g. vabs) haven't been added. This will be addressed in follow-up patches. Differential Revision: https://reviews.llvm.org/D42849 llvm-svn: 324321	2018-02-06 08:43:56 +00:00
Sjoerd Meijer	9d9a86535e	[ARM] FullFP16 LowerReturn Fix Commit r323512 introduced an optimisation in LowerReturn for half-precision return values. A missing check caused a crash when the return value is "undef" (i.e. a node that has no operands). Differential Revision: https://reviews.llvm.org/D42743 llvm-svn: 323968	2018-02-01 13:48:40 +00:00
Evgeniy Stepanov	7746899f48	Revert "[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations" Miscompiles code. Testcase pending. This reverts commit r323869. llvm-svn: 323929	2018-01-31 22:55:19 +00:00
Pablo Barrio	2e442a7831	[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations Summary: Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before. Patch by Marten Svanfeldt. Reviewers: fhahn, pbarrio Reviewed By: pbarrio Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42574 llvm-svn: 323869	2018-01-31 13:20:10 +00:00
Sjoerd Meijer	98d5359ea2	[ARM] Armv8.2-A FP16 code generation (part 2/3) Half-precision arguments and return values are passed as if it were an int or float for ARM. This results in truncates and bitcasts to/from i16 and f16 values, which are legalized very early to stack stores/loads. When FullFP16 is enabled, we want to avoid codegen for these bitcasts as it is unnecessary and inefficient. Differential Revision: https://reviews.llvm.org/D42580 llvm-svn: 323861	2018-01-31 10:18:29 +00:00
Sjoerd Meijer	011de9c0ca	[ARM] Armv8.2-A FP16 code generation (part 1/3) This is the groundwork for Armv8.2-A FP16 code generation . Clang passes and returns _Float16 values as floats, together with the required bitconverts and truncs etc. to implement correct AAPCS behaviour, see D42318. We will implement half-precision argument passing/returning lowering in the ARM backend soon, but for now this means that this: _Float16 sub(_Float16 a, _Float16 b) { return a + b; } gets lowered to this: define float @sub(float %a.coerce, float %b.coerce) { entry: %0 = bitcast float %a.coerce to i32 %tmp.0.extract.trunc = trunc i32 %0 to i16 %1 = bitcast i16 %tmp.0.extract.trunc to half <SNIP> %add = fadd half %1, %3 <SNIP> } When FullFP16 is not supported, we don't make f16 a legal type, and we get legalization for "free", i.e. nothing changes and everything works as before. And also f16 argument passing/returning is handled. When FullFP16 is supported, we do make f16 a legal type, and have 2 places that we need to patch up: f16 argument passing and returning, which involves minor tweaks to avoid unnecessary code generation for some bitcasts. As a "demonstrator" that this works for the different FP16, FullFP16, softfp modes, etc., I've added match rules to the VSUB instruction description showing that we can codegen this instruction from IR, but more importantly, also to some conversion instructions. These conversions were causing issue before in the FP16 and FullFP16 cases. I've also added match rules to the VLDRH and VSTRH desriptions, so that we can actually compile the entire half-precision sub code example above. This showed that these loads and stores had the wrong addressing mode specified: AddrMode5 instead of AddrMode5FP16, which turned out not be implemented at all, so that has also been added. This is the minimal patch that shows all the different moving parts. In patch 2/3 I will add some efficient lowering of bitcasts, and in 2/3 I will add the remaining Armv8.2-A FP16 instruction descriptions. Thanks to Sam Parker and Oliver Stannard for their help and reviews! Differential Revision: https://reviews.llvm.org/D38315 llvm-svn: 323512	2018-01-26 09:26:40 +00:00
Weiming Zhao	665784f170	[ARM] Expand long shifts for Thumb1 to __aeabi_ calls Summary: For long shifts, the inlined version takes about 20 instructions on Thumb1. To avoid the code bloat, expand to __aeabi_ calls if target is Thumb1. Reviewers: samparker Reviewed By: samparker Subscribers: samparker, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42401 llvm-svn: 323354	2018-01-24 18:00:57 +00:00
Martin Storsjo	4ed94a06ac	[ARM] Call __chkstk for dynamic stack allocation in all windows environments This matches what MSVC does for alloca() function calls on ARM. Even if MSVC doesn't support VLAs at the language level, it does support the alloca function. On the clang level, both the _alloca() (when emulating MSVC, which is what the alloca() function expands to) and __builtin_alloca() builtin functions, and VLAs, map to the same LLVM IR "alloca" function - so within LLVM they're not distinguishable from each other. Differential Revision: https://reviews.llvm.org/D42292 llvm-svn: 323308	2018-01-24 06:40:11 +00:00
Joel Galenson	bbcaf4ac5c	[ARM] Optimize {s,u}mul.with.overflow. This extends my previous patches to also optimize overflow-checked multiplies during SelectionDAG. Differential revision: https://reviews.llvm.org/D40922 llvm-svn: 322738	2018-01-17 19:19:05 +00:00
Andre Vieira	5627c218e1	[ARM] Add codegen for SMMULR, SMMLAR and SMMLSR This patch teaches the Arm back-end to generate the SMMULR, SMMLAR and SMMLSR instructions from equivalent IR patterns. Differential Revision: https://reviews.llvm.org/D41775 llvm-svn: 322361	2018-01-12 09:24:41 +00:00
Joel Galenson	6f4e827e4c	[ARM] Optimize {s,u}{add,sub}.with.overflow. The AArch64 backend contains code to optimize {s,u}{add,sub}.with.overflow during SelectionDAG. This commit ports that code to the ARM backend. Differential revision: https://reviews.llvm.org/D35635 llvm-svn: 321224	2017-12-20 22:25:39 +00:00
Florian Hahn	c3aa6d83fd	[ARM] Lower unsigned saturation to USAT Summary: Implement lower of unsigned saturation on an interval [0, k] where k + 1 is a power of two using USAT instruction in a similar way to how [~k, k] is lowered using SSAT on ARM models that supports it. Patch by Marten Svanfeldt Reviewers: t.p.northover, pbarrio, eastig, SjoerdMeijer, javed.absar, fhahn Reviewed By: fhahn Subscribers: fhahn, aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D41348 llvm-svn: 321164	2017-12-20 11:13:57 +00:00
Adrian Prantl	0e6694d111	Silence a bunch of implicit fallthrough warnings llvm-svn: 321114	2017-12-19 22:05:25 +00:00
Matthias Braun	a4852d2c19	X86/AArch64/ARM: Factor out common sincos_stret logic; NFCI Note: - X86ISelLowering: setLibcallName(SINCOS) was superfluous as InitLibcalls() already does it. - ARMISelLowering: Setting libcallnames for sincos/sincosf seemed superfluous as in the darwin case it wouldn't be used while for all other cases InitLibcalls already does it. llvm-svn: 321036	2017-12-18 23:19:42 +00:00
Matthias Braun	f1caa2833f	MachineFunction: Return reference from getFunction(); NFC The Function can never be nullptr so we can return a reference. llvm-svn: 320884	2017-12-15 22:22:58 +00:00
Matt Arsenault	7d7adf4f2e	TLI: Allow using PSV for intrinsic mem operands llvm-svn: 320756	2017-12-14 22:34:10 +00:00
Matt Arsenault	1117133687	DAG: Expose all MMO flags in getTgtMemIntrinsic Rather than adding more bits to express every MMO flag you could want, just directly use the MMO flags. Also fixes using a bunch of bool arguments to getMemIntrinsicNode. On AMDGPU, buffer and image intrinsics should always have MODereferencable set, but currently there is no way to do that directly during the initial intrinsic lowering. llvm-svn: 320746	2017-12-14 21:39:51 +00:00
Geoff Berry	60c431022e	[MachineOperand][MIR] Add isRenamable to MachineOperand. Summary: Add isRenamable() predicate to MachineOperand. This predicate can be used by machine passes after register allocation to determine whether it is safe to rename a given register operand. Register operands that aren't marked as renamable may be required to be assigned their current register to satisfy constraints that are not captured by the machine IR (e.g. ABI or ISA constraints). Reviewers: qcolombet, MatzeB, hfinkel Subscribers: nemanjai, mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D39400 llvm-svn: 320503	2017-12-12 17:53:59 +00:00
Roger Ferrer Ibanez	5ea0f2501f	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 - fixes PR34564 - fixes PR35103 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 320355	2017-12-11 12:13:45 +00:00
Francis Visoiu Mistrih	a8a83d150f	[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register. Work towards the unification of MIR and debug output by refactoring the interfaces. For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print. Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr). https://reviews.llvm.org/D40836 * find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/<def>//g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<def[ ],[ ]dead>/dead \1/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ],[ ]dead>/implicit-def dead \1/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g' find . \( -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name "*.s" \) -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g' llvm-svn: 320022	2017-12-07 10:40:31 +00:00
Pablo Barrio	2b4385846c	Fix function pointer tail calls in armv8-M.base Summary: The compiler fails with the following error message: fatal error: error in backend: ran out of registers during register allocation Tail call optimization for Armv8-M.base fails to meet all the required constraints when handling calls to function pointers where the arguments take up r0-r3. This is because the pointer to the function to be called can only be stored in r0-r3, but these are all occupied by arguments. This patch makes sure that tail call optimization does not try to handle this type of calls. Reviewers: chill, MatzeB, olista01, rengolin, efriedma Reviewed By: olista01, efriedma Subscribers: efriedma, aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40706 llvm-svn: 319664	2017-12-04 16:55:49 +00:00
Martin Storsjo	c85cc41801	[ARM] Allow using emulated tls on platforms other than ELF This matches how it is done on X86. This allows using emulated tls on windows; in MinGW environments, native tls isn't supported at the moment. Differential Revision: https://reviews.llvm.org/D40769 llvm-svn: 319643	2017-12-04 09:08:55 +00:00
David Blaikie	b3bde2ea50	Fix a bunch more layering of CodeGen headers that are in Target All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490	2017-11-17 01:07:10 +00:00
Yi Kong	39bcd4ed3e	[ARM] 't' asm constraint should accept i32 't' constraint normally only accepts f32 operands, but for VCVT the operands can be i32. LLVM is overly restrictive and rejects asm like: float foo() { float result; __asm__ __volatile__( "vcvt.f32.s32 %[result], %[arg1]\n" : [result]"=t"(result) : [arg1]"t"(0x01020304) ); return result; } Relax the value type for 't' constraint to either f32 or i32. Differential Revision: https://reviews.llvm.org/D40137 llvm-svn: 318472	2017-11-16 23:38:17 +00:00
Martin Storsjo	4629f52312	[ARM, AArch64] Fix an assert message, Darwin isn't the only target supporting TLS. NFC. llvm-svn: 318184	2017-11-14 19:57:59 +00:00
Evgeniy Stepanov	76d5ac4906	[arm] Fix Unnecessary reloads from GOT. Summary: This fixes PR35221. Use pseudo-instructions to let MachineCSE hoist global address computation. Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39871 llvm-svn: 318081	2017-11-13 20:45:38 +00:00
Momchil Velikov	842aa90192	[ARM] Place jump table as the first operand in additions When generating table jump code for switch statements, place the jump table label as the first operand in the various addition instructions in order to enable addressing mode selectors to better match index computation and possibly fold them into the addressing mode of the table entry load instruction. Differential revision: https://reviews.llvm.org/D39752 llvm-svn: 318033	2017-11-13 11:56:48 +00:00
David Blaikie	3f833edc7c	Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the layering of its implementation. llvm-svn: 317647	2017-11-08 01:01:31 +00:00
Sam Parker	242052c6b4	[ARM] and, or, xor and add with shl combine The generic dag combiner will fold: (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2) This can create constants which are too large to use as an immediate. Many ALU operations are also able of performing the shl, so we can unfold the transformation to prevent a mov imm instruction from being generated. Other patterns, such as b + ((a << 1) \| 510), can also be simplified in the same manner. Differential Revision: https://reviews.llvm.org/D38084 llvm-svn: 317197	2017-11-02 10:43:10 +00:00
Roger Ferrer Ibanez	9dfbc10522	Revert r313618 "[ARM] Use ADDCARRY / SUBCARRY" That change causes PR35103, so reverting until I figure it out. llvm-svn: 317092	2017-11-01 14:06:57 +00:00
Eli Friedman	d5dfb62de7	[ARM] Honor -mfloat-abi for libcall calling convention As far as I can tell, this matches gcc: -mfloat-abi determines the calling convention for all functions except those explicitly defined as soft-float in the ARM RTABI. This change only affects cases where the user specifies -mfloat-abi to override the default calling convention derived from the target triple. Fixes https://bugs.llvm.org//show_bug.cgi?id=34530. Differential Revision: https://reviews.llvm.org/D38299 llvm-svn: 316708	2017-10-26 21:42:32 +00:00
Sam Parker	1f742117bd	[ARM] OrCombineToBFI function Extract the functionality to combine OR to BFI into its own function. Differential Revision: https://reviews.llvm.org/D39001 llvm-svn: 316563	2017-10-25 08:37:33 +00:00
Sam Parker	ccb209bb97	[ARM] Swap cmp operands for automatic shifts Swap the compare operands if the lhs is a shift and the rhs isn't, as in arm and T2 the shift can be performed by the compare for its second operand. Differential Revision: https://reviews.llvm.org/D39004 llvm-svn: 316562	2017-10-25 08:33:06 +00:00
Sjoerd Meijer	7a22a4948f	ISel type legalization: add debug messages. NFCI. This adds some more debug messages to the type legalizer and functions like PromoteNode, ExpandNode, ExpandLibCall in an attempt to make the debug messages a little bit more informative and useful. Differential Revision: https://reviews.llvm.org/D38450 llvm-svn: 314773	2017-10-03 08:54:15 +00:00
Sam Parker	211f47aa37	[ARM] isTruncateFree fix I implemented isTruncateFree in rL313533, this patch fixes the logic to match my comment, as the previous logic was too general. Now the only truncates that are free are i64 -> i32. Differential Revision: https://reviews.llvm.org/D38234 llvm-svn: 314280	2017-09-27 08:30:45 +00:00
Eugene Zelenko	076468c0d0	[ARM] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 313823	2017-09-20 21:35:51 +00:00
Roger Ferrer Ibanez	8d0180c955	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 - fixes PR34564 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 313618	2017-09-19 09:05:39 +00:00
Sam Parker	71efbe4c68	[ARM] Implement isTruncateFree Implement the isTruncateFree hooks, lifted from AArch64, that are used by TargetTransformInfo. This allows simplifycfg to reduce the test case into a single basic block. Differential Revision: https://reviews.llvm.org/D37516 llvm-svn: 313533	2017-09-18 14:28:51 +00:00
Hans Wennborg	8c1eb106bd	Revert r313009 "[ARM] Use ADDCARRY / SUBCARRY" This was causing PR34045 to fire again. > This is a preparatory step for D34515 and also is being recommitted as its > first version caused PR34045. > > This change: > - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 > - lowering is done by first converting the boolean value into the carry flag > using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value > using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two > operations does the actual addition. > - for subtraction, given that ISD::SUBCARRY second result is actually a > borrow, we need to invert the value of the second operand and result before > and after using ARMISD::SUBE. We need to invert the carry result of > ARMISD::SUBE to preserve the semantics. > - given that the generic combiner may lower ISD::ADDCARRY and > ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering > as well otherwise i64 operations now would require branches. This implies > updating the corresponding test for unsigned. > - add new combiner to remove the redundant conversions from/to carry flags > to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C > - fixes PR34045 > > Differential Revision: https://reviews.llvm.org/D35192 Also revert follow-up r313010: > [ARM] Fix typo when creating ISD::SUB nodes > > In D35192, I accidentally introduced a typo when creating ISD::SUB nodes, > giving them two values instead of one. > > This fails when the merge_values combiner finds one of these nodes. > > This change fixes PR34564. > > Differential Revision: https://reviews.llvm.org/D37690 llvm-svn: 313044	2017-09-12 16:24:17 +00:00
Roger Ferrer Ibanez	9df2527b0b	[ARM] Fix typo when creating ISD::SUB nodes In D35192, I accidentally introduced a typo when creating ISD::SUB nodes, giving them two values instead of one. This fails when the merge_values combiner finds one of these nodes. This change fixes PR34564. Differential Revision: https://reviews.llvm.org/D37690 llvm-svn: 313010	2017-09-12 07:42:28 +00:00
Roger Ferrer Ibanez	4f92b4162f	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515 and also is being recommitted as its first version caused PR34045. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 313009	2017-09-12 07:40:09 +00:00
Hans Wennborg	075e5a2e2b	Revert r312898 "[ARM] Use ADDCARRY / SUBCARRY" It caused PR34564. > This is a preparatory step for D34515 and also is being recommitted as its > first version caused PR34045. > > This change: > - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 > - lowering is done by first converting the boolean value into the carry flag > using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value > using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two > operations does the actual addition. > - for subtraction, given that ISD::SUBCARRY second result is actually a > borrow, we need to invert the value of the second operand and result before > and after using ARMISD::SUBE. We need to invert the carry result of > ARMISD::SUBE to preserve the semantics. > - given that the generic combiner may lower ISD::ADDCARRY and > ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering > as well otherwise i64 operations now would require branches. This implies > updating the corresponding test for unsigned. > - add new combiner to remove the redundant conversions from/to carry flags > to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C > - fixes PR34045 > > Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 312980	2017-09-11 23:52:02 +00:00
Roger Ferrer Ibanez	12b20f2307	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515 and also is being recommitted as its first version caused PR34045. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 312898	2017-09-11 07:38:05 +00:00
Diana Picus	1d101d76c0	Move static helper into ARMTargetLowering. NFC This exposes the isReadOnly(GlobalValue *) in the ARMTargetLowering so we can make use of it in GlobalISel as well. llvm-svn: 312320	2017-09-01 10:44:48 +00:00
Evgeny Astigeevich	540a39adf7	[ARM, Thumb1] Prevent ARMTargetLowering::isLegalAddressingMode from accepting illegal modes ARMTargetLowering::isLegalAddressingMode can accept illegal addressing modes for the Thumb1 target. This causes generation of redundant code and affects performance. This fixes PR34106: https://bugs.llvm.org/show_bug.cgi?id=34106 Differential Revision: https://reviews.llvm.org/D36467 llvm-svn: 311649	2017-08-24 10:00:25 +00:00
Renato Golin	f63d701669	[ARM] Call setBooleanContents(ZeroOrOneBooleanContent) The ARM backend should call setBooleanContents so that it can use known bits to make some optimizations. Review: D35821 Patch by Joel Galenson <jgalenson@google.com> llvm-svn: 311446	2017-08-22 11:02:37 +00:00
Chandler Carruth	b866178067	Fix a typo in r311435. llvm-svn: 311437	2017-08-22 09:20:52 +00:00
Alex Bradbury	080f6976c0	Use report_fatal_error for unsupported calling conventions The calling convention can be specified by the user in IR. Failing to support a particular calling convention isn't a programming error, and so relying on llvm_unreachable to catch and report an unsupported calling convention is not appropriate. Differential Revision: https://reviews.llvm.org/D36830 llvm-svn: 311435	2017-08-22 09:11:41 +00:00
Martin Storsjo	d606d2a643	[ARM] Factorize the calculation of WhichResult in isV*Mask. NFC. Differential Revision: https://reviews.llvm.org/D36930 llvm-svn: 311260	2017-08-19 20:26:51 +00:00
Martin Storsjo	91522ffa12	[ARM] Check the right order for halves of VZIP/VUZP if both parts are used This is the exact same fix as in SVN r247254. In that commit, the fix was applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue is present for VZIP/VUZP as well. This fixes PR33921. Differential Revision: https://reviews.llvm.org/D36899 llvm-svn: 311258	2017-08-19 19:47:48 +00:00
Saleem Abdulrasool	dd8c16b58e	ARM: mark CPSR as clobbered for Windows VLAs When lowering a VLA, we emit a __chstk call. However, this call can internally clobber CPSR. We did not mark this register as an ImpDef, which could potentially allow a comparison to be hoisted above the call to `__chkstk`. In such a case, the CPSR could be clobbered, and the check invalidated. When the support was initially added, it seemed that the call would take care of preventing CPSR from being clobbered, but this is not the case. Mark the register as clobbered to fix a possible state corruption. llvm-svn: 311061	2017-08-17 02:42:24 +00:00
Craig Topper	2251ef95a3	[X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap Summary: Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not. For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors. Reviewers: RKSimon, zvi, efriedma Reviewed By: RKSimon Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D36649 llvm-svn: 310793	2017-08-13 17:29:07 +00:00
Eli Friedman	e9499dd4c7	[ARM] Clarify legal addressing modes for ARM and Thumb2. NFC The existing code is very clever, but not clear, which seems like the wrong tradeoff here. Differential Revision: https://reviews.llvm.org/D36559 llvm-svn: 310653	2017-08-10 19:31:27 +00:00
Nico Weber	bfde70b097	Revert r309923, it caused PR34045. llvm-svn: 309950	2017-08-03 15:41:26 +00:00
Roger Ferrer Ibanez	854980341b	[ARM] Use ADDCARRY / SUBCARRY This patch: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) <- (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) <- (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRY into ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) -> C Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 309923	2017-08-03 07:45:10 +00:00
Rafael Espindola	79e238afee	Delete Default and JITDefault code models IMHO it is an antipattern to have a enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911	2017-08-03 02:16:21 +00:00
Peter Collingbourne	081ffe2ff2	Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI. This was a use-after-free waiting to happen. llvm-svn: 309159	2017-07-26 19:15:29 +00:00
Zvi Rackover	1b73682243	TargetLowering: Change isShuffleMaskLegal's mask argument type to ArrayRef<int>. NFCI. Changing mask argument type from const SmallVectorImpl<int>& to ArrayRef<int>. This came up in D35700 where a mask is received as an ArrayRef<int> and we want to pass it to TargetLowering::isShuffleMaskLegal(). Also saves a few lines of code. llvm-svn: 309085	2017-07-26 08:06:58 +00:00
Jonas Paulsson	024e319489	[SystemZ, LoopStrengthReduce] This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729	2017-07-21 11:59:37 +00:00
Konstantin Zhuravlyov	bb80d3e1d3	Enhance synchscope representation OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use syncscope("<scope>"), where <scope> can be "singlethread" (this replaces singlethread keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722	2017-07-11 22:23:00 +00:00
Matthew Simpson	12eaef75ce	[ARM] Implement interleaved access bug fix from r306334 r306334 fixed a bug in AArch64 dealing with wide interleaved accesses having pointer types. The bug also exists in ARM, so this patch copies over the fix. llvm-svn: 307409	2017-07-07 16:15:05 +00:00
Simon Pilgrim	ce1fb22c6a	[Arm] Fix -Wimplicit-fallthrough warnings. NFCI. llvm-svn: 307375	2017-07-07 10:05:45 +00:00
Tim Northover	2b5f03aa12	ARM: fix big-endian 64-bit cmpxchg. On big-endian machines the high and low parts of the value accessed by ldrexd and strexd are swapped around. To account for this we swap inputs and outputs in ISelLowering. Patch by Bharathi Seshadri. llvm-svn: 306865	2017-06-30 19:51:02 +00:00
Alexandros Lamprineas	2b2b420563	[ARM] Support constant pools in data when generating execute-only code. Resubmission of r305387, which was reverted at r305390. The Address Sanitizer caught a stack-use-after-scope of a Twine variable. This is now fixed by passing the Twine directly as a function parameter. The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code and also it is guaranteed that execute-only behaves correct with the position-independent addressing modes that support execute-only code. Differential Revision: https://reviews.llvm.org/D33773 llvm-svn: 305776	2017-06-20 07:20:52 +00:00
Alexandros Lamprineas	1c15ee2631	Revert "[ARM] Support constant pools in data when generating execute-only code." This reverts commit 3a204faa093c681a1e96c5e0622f50649b761ee0. I've upset a buildbot which runs the address sanitizer: ERROR: AddressSanitizer: stack-use-after-scope lib/Target/ARM/ARMISelLowering.cpp:2690 That Twine variable is used illegally. llvm-svn: 305390	2017-06-14 15:00:08 +00:00
Alexandros Lamprineas	c582d6e133	[ARM] Support constant pools in data when generating execute-only code. The ARM backend asserts against constant pool lowering when it generates execute-only code in order to prevent the generation of constant pools in the text section. It appears that target independent optimizations might generate DAG nodes that represent constant pools. By lowering such nodes as global addresses we don't violate the semantics of execute-only code and also it is guaranteed that execute-only behaves correct with the position-independent addressing modes that support execute-only code. Differential Revision: https://reviews.llvm.org/D33773 llvm-svn: 305387	2017-06-14 13:22:41 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Craig Topper	f6d4dc5b4a	[SelectionDAG] Set ISD::FPOWI to Expand by default Summary: Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie". This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default. Reviewers: spatel, RKSimon, efriedma Reviewed By: RKSimon Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits Differential Revision: https://reviews.llvm.org/D33530 llvm-svn: 304215	2017-05-30 15:27:55 +00:00
John Brawn	9009d2905d	[ARM] Fix lowering of misaligned memcpy/memset Currently getOptimalMemOpType returns i32 for large enough sizes without checking for alignment, leading to poor code generation when misaligned accesses aren't permitted as we generate a word store then later split it up into byte stores. This means we inadvertantly go over the MaxStoresPerMemcpy limit and for memset we splat the memset value into a word then immediately split it up again. Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type to use, but also fix a bug there where it wasn't correctly checking if misaligned memory accesses are allowed. Differential Revision: https://reviews.llvm.org/D33442 llvm-svn: 303990	2017-05-26 13:59:12 +00:00
Oleg Ranevskyy	09df0020fc	[ARM] Temporarily disable globals promotion to constant pools to prevent miscompilation Summary: A temporary workaround for PR32780 - rematerialized instructions accessing the same promoted global through different constant pool entries. The patch turns off the globals promotion optimization leaving all its code in place, so that it can be easily turned on once PR32780 is fixed. Since this is a miscompilation issue causing generation of misbehaving code, and the problem is very subtle, the patch might be valuable enough to get into 4.0.1. Reviewers: efriedma, jmolloy Reviewed By: efriedma Subscribers: aemerson, javed.absar, llvm-commits, rengolin, asl, tstellar Differential Revision: https://reviews.llvm.org/D33446 llvm-svn: 303679	2017-05-23 19:38:37 +00:00
Serge Guelton	e38003f839	Suppress all uses of LLVM_END_WITH_NULL. NFC. Use variadic templates instead of relying on <cstdarg> + sentinel. This enforces better type checking and makes code more readable. Differential Revision: https://reviews.llvm.org/D32541 llvm-svn: 302571	2017-05-09 19:31:13 +00:00
Tim Shen	04de70d3a7	[Atomic] Remove IsStore/IsLoad in the interface, and pass the instruction instead. NFC. Now both emitLeadingFence and emitTrailingFence take the instruction itself, instead of taking IsLoad/IsStore pairs. Instruction::mayReadFromMemory and Instrucion::mayWriteToMemory are used for determining those two booleans. The instruction argument is also useful for later D32763, in emitTrailingFence. For emitLeadingFence, it seems to have cleaner interface with the proposed change. Differential Revision: https://reviews.llvm.org/D32762 llvm-svn: 302539	2017-05-09 15:27:17 +00:00
Serge Pavlov	d526b13e61	Add extra operand to CALLSEQ_START to keep frame part set up previously Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527	2017-05-09 13:35:13 +00:00
Tim Northover	c48c993b75	ARM: use divmod libcalls on embedded MachO platforms too. The separated libcalls are implemented in terms of __divmodsi4 and __udivmodsi4 anyway, so we should always use them if possible. llvm-svn: 302462	2017-05-08 20:00:14 +00:00
Simon Pilgrim	f5ca255d18	[ARM][NEON] Add support for ISD::ABS lowering Update NEON int_arm_neon_vabs intrinsic to use the ISD::ABS opcode directly Added constant folding tests. Differential Revision: https://reviews.llvm.org/D32938 llvm-svn: 302417	2017-05-08 10:37:34 +00:00
Matthias Braun	4682ac6c83	ARM: Compute MaxCallFrame size early This exposes a method in MachineFrameInfo that calculates MaxCallFrameSize and calls it after instruction selection in the ARM target. This avoids ARMBaseRegisterInfo::canRealignStack()/ARMFrameLowering::hasReservedCallFrame() giving different answers in early/late phases of codegen. The testcase shows a particular nasty example result of that where we would fail to properly align an alloca. Differential Revision: https://reviews.llvm.org/D32622 llvm-svn: 302303	2017-05-05 22:04:05 +00:00
Craig Topper	f0aeee01c3	[KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits. This adds routines for reseting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown. Differential Revision: https://reviews.llvm.org/D32637 llvm-svn: 302262	2017-05-05 17:36:09 +00:00
Sam Parker	df337704f0	[ARM] ACLE Chapter 9 intrinsics Added the integer data processing intrinsics from ACLE v2.1 Chapter 9 but I have missed out the saturation_occurred intrinsics for now. For the instructions that read and write the GE bits, a chain is included and the only instruction that reads these flags (sel) is only selectable via the implemented intrinsic. Differential Revision: https://reviews.llvm.org/D32281 llvm-svn: 302126	2017-05-04 07:31:28 +00:00
Craig Topper	d0af7e8ab8	[SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently. This is largely a mechanical transformation from KnownZero to Known.Zero. Differential Revision: https://reviews.llvm.org/D32569 llvm-svn: 301620	2017-04-28 05:31:46 +00:00
Tim Northover	e31cf3f824	ARM: make sure we use all entries in a vector before forming a vpaddl. Otherwise there's some mismatch, and we'll either form an illegal type or an illegal node. Thanks to Eli Friedman for pointing out the problem with my original solution. llvm-svn: 301036	2017-04-21 20:35:52 +00:00
Tim Northover	1061ccca8c	ARM: don't try to create an i8 -> i32 vpaddl. DAG combine was mistakenly assuming that the step-up it was looking at was always a doubling, but it can sometimes be a larger extension in which case we'd crash. llvm-svn: 301002	2017-04-21 17:21:59 +00:00
Tim Northover	46e58354da	ARM: lower "fence singlethread" to a pure compiler barrier. Single-threaded fences aren't required to provide any synchronization with other processing elements so there's no need for a DMB. They should still be a barrier for compiler optimizations though. llvm-svn: 300904	2017-04-20 21:56:52 +00:00
Tim Northover	8b1240b0f0	ARM: handle post-indexed NEON ops where the offset isn't the access width. Before, we assumed that any ConstantInt offset was precisely the access width, so we could use the "[rN]!" form. ISelLowering only ever created that kind, but further simplification during combining could lead to unexpected constants and incorrect codegen. Should fix PR32658. llvm-svn: 300878	2017-04-20 19:54:02 +00:00
Diana Picus	7c6dee9f16	[ARM] Rename HW div feature to HW div Thumb. NFCI. The hardware div feature refers only to Thumb, but because of its name it is tempting to use it to check for hardware division in general, which may cause problems in ARM mode. See https://reviews.llvm.org/D32005. This patch adds "Thumb" to its name, to make its scope clear. One notable place where I haven't made the change is in the feature flag (used with -mattr), which is still hwdiv. Changing it would also require changes in a lot of tests, including clang tests, and it doesn't seem like it's worth the effort. Differential Revision: https://reviews.llvm.org/D32160 llvm-svn: 300827	2017-04-20 09:38:25 +00:00
Eli Friedman	70ad2751d5	[ARM] Remove redundant computeKnownBits helper. Move the BFI logic to computeKnownBitsForTargetNode, and delete the redundant CMOV logic. This is intended as a cleanup, but it's probably possible to construct a case where moving the BFI logic allows more combines. Differential Revision: https://reviews.llvm.org/D31795 llvm-svn: 300752	2017-04-19 20:50:57 +00:00
Eli Friedman	f281d490cc	[ARM] Use TableGen patterns to select vtbl. NFC. Differential Revision: https://reviews.llvm.org/D32103 llvm-svn: 300749	2017-04-19 20:39:39 +00:00
Matt Arsenault	3138075dd4	DAG: Make mayBeEmittedAsTailCall parameter const llvm-svn: 300603	2017-04-18 21:16:46 +00:00
Diana Picus	e2626bb7c2	[ARM] Check for correct HW div when lowering divmod For subtargets that use the custom lowering for divmod, e.g. gnueabi, we used to check if the subtarget has hardware divide and then lower to a div-mul-sub sequence if true, or to a libcall if false. However, judging by the usage of hasDivide vs hasDivideInARMMode, it seems that hasDivide only refers to Thumb. For instance, in the ARMTargetLowering constructor, the code that specifies whether to use libcalls for (S\|U)DIV looks like this: bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide() : Subtarget->hasDivideInARMMode(); In the case of divmod for arm-gnueabi, using only hasDivide() to determine what to do means that instead of lowering to __aeabi_idivmod to get the remainder, we lower to div-mul-sub and then further lower the div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but not in Thumb, we generate a libcall instead of using it (this is not an issue in practice since AFAICT none of the cores that we support have hardware divide in ARM but not Thumb). This patch fixes the code dealing with custom lowering to take into account the mode (Thumb or ARM) when deciding whether or not hardware division is available. Differential Revision: https://reviews.llvm.org/D32005 llvm-svn: 300536	2017-04-18 08:32:27 +00:00
Matthew Simpson	1468d3e04e	[ARM/AArch64] Ensure valid vector element types for interleaved accesses This patch refactors and strengthens the type checks performed for interleaved accesses. The primary functional change is to ensure that the interleaved accesses have valid element types. The added test cases previously failed because the element type is f128. Differential Revision: https://reviews.llvm.org/D31817 llvm-svn: 299864	2017-04-10 18:34:37 +00:00
Huihui Zhang	98240e9643	[SelectionDAG] [ARM CodeGen] Fix chain information of LowerMUL In LowerMUL, the chain information is not preserved for the new created Load SDNode. For example, if a Store alias with one of the operand of Mul. The Load for that operand need to be scheduled before the Store. The dependence is recorded in the chain of Store, in TokenFactor. However, when lowering MUL, the SDNodes for the new Loads for VMULL are not updated in the TokenFactor for the Store. Thus the chain is not preserved for the lowered VMULL. llvm-svn: 299701	2017-04-06 20:22:51 +00:00
Simon Pilgrim	37b536e4b3	[DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes. Differential Revision: https://reviews.llvm.org/D31249 llvm-svn: 299201	2017-03-31 11:24:16 +00:00
Eli Friedman	95ddd18703	[ARM] Fix mixup between Lo and Hi in SMLALBB formation. llvm-svn: 298752	2017-03-25 00:13:24 +00:00
Pirama Arumuga Nainar	bc26482717	[ARM] Fix computeKnownBits for ARMISD::CMOV Summary: The true and false operands for the CMOV are operands 0 and 1. ARMISelLowering.cpp::computeKnownBits was looking at operands 1 and 2 instead. This can cause CMOV instructions to be incorrectly folded into BFI if value set by the CMOV is another CMOV, whose known bits are computed incorrectly. This patch fixes the issue and adds a test case. Reviewers: kristof.beyls, jmolloy Subscribers: llvm-commits, aemerson, srhines, rengolin Differential Revision: https://reviews.llvm.org/D31265 llvm-svn: 298624	2017-03-23 16:47:47 +00:00
Artyom Skrobov	92c0653095	Reapply r298417 "[ARM] Recommit the glueless lowering of addc/adde in Thumb1" The UB in t2_so_imm_neg conversion has been addressed under D31242 / r298512 This reverts commit r298482. llvm-svn: 298562	2017-03-22 23:35:51 +00:00
Vitaly Buka	e69c137f90	Revert "[ARM] Recommit the glueless lowering of addc/adde in Thumb1, including the amended (no UB anymore) fix for adding/subtracting -2147483648." Fails check-llvm with ubsan This reverts commit r298417. llvm-svn: 298482	2017-03-22 05:07:44 +00:00
Artyom Skrobov	40a4f40679	[ARM] Recommit the glueless lowering of addc/adde in Thumb1, including the amended (no UB anymore) fix for adding/subtracting -2147483648. This reverts r298328 "[ARM] Revert r297443 and r297820." and partially reverts r297842 "Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648"" llvm-svn: 298417	2017-03-21 18:39:41 +00:00
Eli Friedman	76732acc23	[ARM] Revert r297443 and r297820. The glueless lowering of addc/adde in Thumb1 has known serious miscompiles (see https://reviews.llvm.org/D31081), and r297820 causes an infinite loop for certain constructs. It's not clear when they will be fixed, so let's just take them out of the tree for now. (I resolved a small conflict with r297453.) llvm-svn: 298328	2017-03-21 00:26:39 +00:00
Vadzim Dambrouski	ba789cbd3d	[ARM] Fix PR32130: Handle promotion of zero sized constants. The special case of zero sized values was previously not handled correctly. This patch handles this by not promoting if the size is zero. Patch by Tim Neumann. Differential Revision: https://reviews.llvm.org/D31116 llvm-svn: 298320	2017-03-20 22:59:57 +00:00
Nirav Dave	ac6081cb67	Make library calls sensitive to regparm module flag (Fixes PR3997). Reviewers: mkuper, rnk Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27050 llvm-svn: 298179	2017-03-18 00:44:07 +00:00
Nirav Dave	6de2c77944	Capitalize ArgListEntry fields. NFC. llvm-svn: 298178	2017-03-18 00:43:57 +00:00
Artyom Skrobov	e72e1ba434	Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648" This reverts r297820 which apparently fails on A15 hosts. llvm-svn: 297842	2017-03-15 14:50:43 +00:00
Artyom Skrobov	3fa5fd1dd2	[Thumb1] Fix the bug when adding/subtracting -2147483648 Differential Revision: https://reviews.llvm.org/D30829 llvm-svn: 297820	2017-03-15 10:19:16 +00:00
Sam Parker	654cb8263a	[ARM] Enable SMLAL[B\|T] isel Enable the selection of the 64-bit signed multiply accumulate instructions which operate on 16-bit operands. These are enabled for ARMv5TE onwards for ARM and for V6T2 and other DSP enabled Thumb architectures. Differential Revision: https://reviews.llvm.org/D30044 llvm-svn: 297809	2017-03-15 08:27:11 +00:00
Sam Parker	916b1ba617	[ARM] Move SMULW[B\|T] isel to DAG Combine Create nodes for smulwb and smulwt and move their selection from DAGToDAG to DAG combine. smlawb and smlawt can then be selected using tablegen. Added some helper functions to detect shift patterns as well as a wrapper around SimplifyDemandBits. Added a couple of extra tests. Differential Revision: https://reviews.llvm.org/D30708 llvm-svn: 297716	2017-03-14 09:13:22 +00:00
Artyom Skrobov	bf19d4bc29	[Thumb1] combine ADDC/SUBC with a negative immediate Summary: This simple optimization has been split out of https://reviews.llvm.org/D30400 Reviewers: efriedma, jmolloy Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30829 llvm-svn: 297682	2017-03-13 22:36:14 +00:00
Artyom Skrobov	0cc80c1f5a	Refactor the multiply-accumulate combines to act on ARMISD::ADD[CE] nodes, instead of the generic ISD::ADD[CE]. Summary: This allows for some simplification because the combines are no longer limited to just one go at the node before it gets legalized into an ARM target-specific one. Reviewers: jmolloy, rogfer01 Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30401 llvm-svn: 297453	2017-03-10 12:41:33 +00:00
Artyom Skrobov	0c93ceb5d8	For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes, same as already done for ARM and Thumb2. Reviewers: jmolloy, rogfer01, efriedma Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30400 llvm-svn: 297443	2017-03-10 07:40:27 +00:00
Ranjeet Singh	3d0af578cc	[ARM] Reapply r296865 "[ARM] fpscr read/write intrinsics not aware of each other"" The original patch r296865 was reverted as it broke the chromium builds for Android https://bugs.llvm.org/show_bug.cgi?id=32134, this patch reapplies r296865 with a fix to make sure it doesn't cause the build regression. The problem was that intrinsic selection on int_arm_get_fpscr was failing in ISel this was because the code to manually select this intrinsic still thought it was the version with no side-effects (INTRINSIC_WO_CHAIN) which is wrong as it doesn't semantically match the definition in the tablegen code which says it does have side-effects, I've fixed this by updating the intrinsic type to INTRINSIC_W_CHAIN (has side-effects). I've also added a test for this based on Hans original reproducer. Differential Revision: https://reviews.llvm.org/D30645 llvm-svn: 297137	2017-03-07 11:17:53 +00:00
Matthew Simpson	1bfa159db9	[ARM/AArch64] Support wide interleaved accesses This patch teaches (ARM\|AArch64)ISelLowering.cpp to match illegal vector types to interleaved access intrinsics as long as the types are multiples of the vector register width. A "wide" access will now be mapped to multiple interleave intrinsics similar to the way in which non-interleaved accesses with illegal types are legalized into multiple accesses. I'll update the associated TTI costs (in getInterleavedMemoryOpCost) as a follow-on. Differential Revision: https://reviews.llvm.org/D29466 llvm-svn: 296750	2017-03-02 15:11:20 +00:00
Sanjay Patel	ae7873fe55	[ARM] don't transform an add(ext Cond), C to select unless there's a setcc of the condition The transform in question claims to be doing: // fold (add (select cc, 0, c), x) -> (select cc, x, (add, x, c)) ...starting in PerformADDCombineWithOperands(), but it wasn't actually checking for a setcc node for the sext/zext patterns. This is exactly the opposite of a transform I'd like to add to DAGCombiner's foldSelectOfConstants(), so I was seeing infinite loops with my draft of a patch applied. The changes in select_const.ll look positive (less instructions). The change in arm-and-tst-peephole.ll is unrelated. We're changing the input IR in that test to preserve the intent of the test, but that's not affected by this code change. Differential Revision: https://reviews.llvm.org/D30355 llvm-svn: 296389	2017-02-27 21:30:54 +00:00
Evgeniy Stepanov	1fd19c6e5d	Fix PR31896. Address of an alias of a global with offset is incorrectly lowered as an address of the global (i.e. ignoring offset). llvm-svn: 295762	2017-02-21 20:17:34 +00:00
Sam Parker	58af0c55d2	[ARM] Replace HasT2ExtractPack with HasDSP Removed the HasT2ExtractPack feature and replaced its references with HasDSP. This then allows the Thumb2 extend instructions to be selected for ARMv8M +dsp. These instruction descriptions have also been refactored and more target tests have been added for their isel. Differential Revision: https://reviews.llvm.org/D29623 llvm-svn: 295452	2017-02-17 15:42:44 +00:00
James Molloy	0ae2202235	[ARM] Fix crash caused by r294945 I'd missed a creator of FCMP nodes - duplicateCmp(). Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite. llvm-svn: 294968	2017-02-13 17:18:00 +00:00
James Molloy	d508789668	[ARM] Use VCMP, not VCMPE, for floating point equality comparisons When generating a floating point comparison we currently unconditionally generate VCMPE. This has the sideeffect of setting the cumulative Invalid bit in FPSCR if any of the operands are QNaN. It is expected that use of a relational predicate on a QNaN value should raise Invalid. Quoting from the C standard: The relational and equality operators support the usual mathematical relationships between numeric values. For any ordered pair of numeric values exactly one of relationships the less, greater, equal and is true. Relational operators may raise the floating-point exception when argument values are NaNs. The standard doesn't explicitly state the expectation for equality operators, but the implication and obvious expectation is that equality operators should not raise Invalid on a QNaN input, as those predicates are wholly defined on unordered inputs (to return not equal). Therefore, add a new operand to ARMISD::FPCMP and FPCMPZ indicating if QNaN should raise Invalid, and pipe that through to TableGen. llvm-svn: 294945	2017-02-13 12:32:47 +00:00
Ahmed Bougacha	fc979dc9dd	[ARM] Don't lower f16 interleaved accesses. There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Reject f16 interleaved accesses. If we try to emit the f16 intrinsics, we'll just end up with a selection failure. llvm-svn: 294818	2017-02-11 01:53:00 +00:00
Arnold Schwaighofer	db7bbcbe78	[ARM/AArch ISel] SwiftCC: First parameters that are marked swiftself are not 'this returns' We mark X0 as preserved by a call that passes the returned parameter. x0 = ... fun(x0) // no implicit def of x0 This no longer is valid if we pass the parameter in a different register then the returned value as is the case with a swiftself parameter (passed in x20). x20 = ... fun(x20) // there should be an implict def of x8 rdar://30425845 llvm-svn: 294527	2017-02-08 22:30:47 +00:00
Christof Douma	d3ed8380e0	[ARM] Make RWPI use movw/movt when available When constructing global address literals while targeting the RWPI relocation model. LLVM currently only uses literal pools. If MOVW/MOVT instructions are available we can use these instead. Beside being more efficient it allows -arm-execute-only to work with -relocation-model=RWPI as well. When we generate MOVW/MOVT for global addresses when targeting the RWPI relocation model, we need to use base relative relocations. This patch does the needed plumbing in MC to generate these for MOVW/MOVT. Differential Revision: https://reviews.llvm.org/D29487 Change-Id: I446786e43a6f5aa9b6a5bb2cd216d60d41c7755d llvm-svn: 294298	2017-02-07 13:07:12 +00:00
Reid Kleckner	c35139ec0d	[CodeGen] Remove dead call-or-prologue enum from CCState This enum has been dead since Olivier Stannard re-implemented ARM byval handling in r202985 (2014). llvm-svn: 293943	2017-02-02 21:58:22 +00:00
Matthew Simpson	ba5cf9dfee	[LV] Move interleaved access helper functions to VectorUtils (NFC) This patch moves some helper functions related to interleaved access vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like to use these functions in a follow-on patch that improves interleaved load and store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was already duplicated there and has been removed. Differential Revision: https://reviews.llvm.org/D29398 llvm-svn: 293788	2017-02-01 17:45:46 +00:00
Matthias Braun	8c209aa877	Cleanup dump() functions. We had various variants of defining dump() functions in LLVM. Normalize them (this should just consistently implement the things discussed in http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html For reference: - Public headers should just declare the dump() method but not use LLVM_DUMP_METHOD or #if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP) - The definition of a dump method should look like this: #if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP) LLVM_DUMP_METHOD void MyClass::dump() { // print stuff to dbgs()... } #endif llvm-svn: 293359	2017-01-28 02:02:38 +00:00
Eugene Zelenko	e79c077ef9	[ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293348	2017-01-27 23:58:02 +00:00
Saleem Abdulrasool	26c00e3700	ARM: fix vectorized division on WoA The Windows on ARM target uses custom division for normal division as the backend needs to insert division-by-zero checks. However, it is designed to only handle non-vectorized division. ARM has custom lowering for vectorized division as that can avoid loading registers with the values and invoke a division routine for each one, preferring to lower using NEON instructions. Fall back to the custom lowering for the NEON instructions if we encounter a vectorized division. Resolves PR31778! llvm-svn: 293259	2017-01-27 03:41:53 +00:00
Diana Picus	bd66b7dc87	[ARM] Use helpers for adding pred / CC operands. NFC Hunt down some of the places where we use bare addReg(0) or addImm(AL).addReg(0) and replace with add(condCodeOp()) and add(predOps()). This should make it easier to understand what those operands represent (without having to look at the definition of the instruction that we're adding to). Differential Revision: https://reviews.llvm.org/D27984 llvm-svn: 292587	2017-01-20 08:15:24 +00:00
Saleem Abdulrasool	6ef45916c6	ARM: match GCC's behaviour for builtins GCC changes the CC between the user-code and the builtins based on the value of `-target` rather than `-mfloat-abi`. When a HF target is used, the VFP variant of the AAPCS CC is used. Otherwise, the AAPCS variant is used. In all cases, the AEABI functions use the AAPCS CC. Adjust the calling convention based on the target. Resolves PR30543! llvm-svn: 291909	2017-01-13 16:25:33 +00:00
Benjamin Kramer	061f4a5fe6	Apply clang-tidy's performance-unnecessary-value-param to LLVM. With some minor manual fixes for using function_ref instead of std::function. No functional change intended. llvm-svn: 291904	2017-01-13 14:39:03 +00:00
Diana Picus	a2c59149e1	[ARM] CodeGen: Replace AddDefaultT1CC and AddNoT1CC. NFC For AddDefaultT1CC, we add a new helper t1CondCodeOp, which creates the appropriate register operand. For AddNoT1CC, we use the existing condCodeOp helper - we only had two uses of AddNoT1CC, so at this point it's probably not worth having yet another helper just for them. Differential Revision: https://reviews.llvm.org/D28603 llvm-svn: 291894	2017-01-13 10:37:37 +00:00
Diana Picus	8a73f5562f	[ARM] CodeGen: Remove AddDefaultCC. NFC. Replace all uses of AddDefaultCC with add(condCodeOp()). The transformation has been done automatically with a custom tool based on Clang AST Matchers + RefactoringTool. Differential Revision: https://reviews.llvm.org/D28557 llvm-svn: 291893	2017-01-13 10:18:01 +00:00
Diana Picus	116bbab4e4	[CodeGen] Rename MachineInstrBuilder::addOperand. NFC Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891	2017-01-13 09:58:52 +00:00
Diana Picus	4f8c3e1882	[ARM] CodeGen: Remove AddDefaultPred. NFC. Replace all uses of AddDefaultPred with MachineInstrBuilder::add(predOps()). This makes the code building MachineInstrs more readable, because it allows us to write code like: MIB.addSomeOperand(blah) .add(predOps()) .addAnotherOperand(blahblah) instead of AddDefaultPred(MIB.addSomeOperand(blah)) .addAnotherOperand(blahblah) This commit also adds the predOps helper in the ARM backend, as well as the add method taking a variable number of operands to the MachineInstrBuilder. The transformation has been done mostly automatically with a custom tool based on Clang AST Matchers + RefactoringTool. Differential Revision: https://reviews.llvm.org/D28555 llvm-svn: 291890	2017-01-13 09:37:56 +00:00
Saleem Abdulrasool	555e5980a5	ARM: slightly more table driven libcall setup Switch some additional library call setup to be table driven. This makes it more immediately obvious what the library call looks like. This is important for ARM since the calling conventions for the builtins change based on the target/libcall name. NFC llvm-svn: 291789	2017-01-12 18:46:11 +00:00
Eli Friedman	3a03742c37	[ARM] More aggressive matching for vpadd and vpaddl. The new matchers work after legalization to make them simpler, and to avoid blocking other optimizations. Differential Revision: https://reviews.llvm.org/D27779 llvm-svn: 291693	2017-01-11 19:33:38 +00:00
Chad Rosier	d0114fc1dd	[ARM] Remove rbit intrinsics and autoupgrade to generic bitreverse. Testing already covered by CodeGen/ARM/rbit.ll llvm-svn: 291587	2017-01-10 19:23:51 +00:00
Aaron Ballman	58a61e723e	Caught a simple typo. I do not know of a way to test this, but it seems like an unlikely thing to regress in the future. llvm-svn: 290757	2016-12-30 15:57:56 +00:00

1 2 3 4 5 ...

1674 Commits