llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	58407ca045	[ARM] Use unique_ptr to fix memory leak introduced in r337701 llvm-svn: 337714	2018-07-23 17:43:21 +00:00
Jordan Rupprecht	e5daf61229	OpChain has subclasses, so add a virtual destructor. Summary: OpChain has subclasses, so add a virtual destructor. This fixes an issue when deleting subclasses of OpChain (see MatchSMLAD() specifically) in r337701. Reviewers: javed.absar Subscribers: llvm-commits, SjoerdMeijer, samparker Differential Revision: https://reviews.llvm.org/D49681 llvm-svn: 337713	2018-07-23 17:38:05 +00:00
Matt Morehouse	d773cb99f1	[ARM] Follow-up to r337709. Fix double-free. llvm-svn: 337711	2018-07-23 17:22:53 +00:00
Matt Morehouse	a70685f63a	[ARM] Add doFinalization() to ARMCodeGenPrepare pass. Attempt to fix the leak introduced in r337687 and make sanitizer buildbots green again. llvm-svn: 337709	2018-07-23 17:00:45 +00:00
Sam Parker	89a3799a69	[ARM][NFC] ParallelDSP reorganisation In preparing to allow ARMParallelDSP pass to parallelise more than smlads, I've restructed some elements: - The ParallelMAC struct has been renamed to BinOpChain. - The BinOpChain struct holds two value lists: LHS and RHS, as well as inheriting from the OpChain base class. - The OpChain struct holds all the values of the represented chain and has had the memory locations functionality inserted into it. - ParallelMACList becomes OpChainList and it now holds pointers instead of objects. Differential Revision: https://reviews.llvm.org/D49020 llvm-svn: 337701	2018-07-23 15:25:59 +00:00
Sam Parker	3828c6ff94	[ARM] ARMCodeGenPrepare backend pass Arm specific codegen prepare is implemented to perform type promotion on icmp operands, which can enable the removal of uxtb and uxth (unsigned extend) instructions. This is possible because performing type promotion before ISel alleviates this duty from the DAG builder which has to perform legalisation, but has a limited view on data ranges. The pass visits any instruction operand of an icmp and creates a worklist to traverse the use-def tree to determine whether the values can simply be promoted. Our concern is values in the registers overflowing the narrow (i8, i16) data range, so instructions marked with nuw can be promoted easily. For add and sub instructions, we are able to use the parallel dsp instructions to operate on scalar data types and avoid overflowing bits. Underflowing adds and subs are also permitted when the result is only used by an unsigned icmp. Differential Revision: https://reviews.llvm.org/D48832 llvm-svn: 337687	2018-07-23 12:27:47 +00:00
Evandro Menezes	fffa9b5897	[ARM] Add new feature to enable optimizing the VFP registers Enable the optimization of operations on DPR and SPR via a feature instead of checking the target. Differential revision: https://reviews.llvm.org/D49463 llvm-svn: 337575	2018-07-20 16:49:28 +00:00
Tim Northover	a7c18ee8c4	ARM: switch armv7em MachO triple to hard-float defaults and libcalls. We were emitting incorrect calls to libm functions that LLVM had decided it knew about because the default is soft-float. Recommitted without breaking ELF this time. llvm-svn: 337450	2018-07-19 12:44:51 +00:00
Tim Northover	da142d10d9	Revert "ARM: switch armv7em triple to hard-float defaults and libcalls." This reverts commit r337385 until it can be targeted at MachO only. llvm-svn: 337424	2018-07-18 21:32:49 +00:00
Tim Northover	e00cf4fc68	ARM: stop explicitly marking armv7k libcalls as hard-float. NFC. Since the triple's default is hard float, the libcalls will already use VFP registers. llvm-svn: 337386	2018-07-18 12:37:43 +00:00
Tim Northover	d4abd14c1b	ARM: switch armv7em triple to hard-float defaults and libcalls. We were emitting incorrect calls to libm functions that LLVM had decided it knew about because the default is soft-float. llvm-svn: 337385	2018-07-18 12:37:04 +00:00
Tim Northover	097a3e3d95	ARM: deduplicate hard-float detection code. NFC. ARMSubtarget had a copy/pasted block to determine whether the target was hard-float, but it just delegated to triple features anyway so it's better at the TargetMachine level. llvm-svn: 337384	2018-07-18 12:36:25 +00:00
Sjoerd Meijer	53449daa96	[ARM] ParallelDSP: multiple reduction stmts in loop This fixes an issue that we were not properly supporting multiple reduction stmts in a loop, and not generating SMLADs for these cases. The alias analysis checks were done too early, making it too conservative. Differential revision: https://reviews.llvm.org/D49125 llvm-svn: 336795	2018-07-11 12:36:25 +00:00
Eli Friedman	d2c739230c	[ARM] Treat cmn immediates as legal in isLegalICmpImmediate. The original code attempted to do this, but the std::abs() call didn't actually do anything due to implicit type conversions. Fix the type conversions, and perform the correct check for negative immediates. This probably has very little practical impact, but it's worth fixing just to avoid confusion in the future, I think. Differential Revision: https://reviews.llvm.org/D48907 llvm-svn: 336742	2018-07-10 23:44:37 +00:00
Sjoerd Meijer	b3e06faa28	[ARM] ParallelDSP: added statistics, NFC. Added statistics for the number of SMLAD instructions created, and als renamed the pass name to -arm-parallel-dsp. Differential Revision: https://reviews.llvm.org/D48971 llvm-svn: 336441	2018-07-06 14:47:09 +00:00
Sjoerd Meijer	2a57b357a3	[AArch64][ARM] Armv8.4-A: Trace synchronization barrier instruction This adds the Armv8.4-A Trace synchronization barrier (TSB) instruction. Differential Revision: https://reviews.llvm.org/D48918 llvm-svn: 336418	2018-07-06 08:03:12 +00:00
Ivan A. Kosarev	466037900c	[NEON] Fix combining of vldx_dup intrinsics with updating of base addresses Resolves: Unsupported ARM Neon intrinsics in Target-specific DAG combine function for VLDDUP https://bugs.llvm.org/show_bug.cgi?id=38031 Related diff: D48439 Differential Revision: https://reviews.llvm.org/D48920 llvm-svn: 336325	2018-07-05 08:59:49 +00:00
Sjoerd Meijer	27be58b307	[ARM] ParallelDSP: only support i16 loads for now We were miscompiling i8 loads, so reject them as unsupported narrow operations for now. Differential Revision: https://reviews.llvm.org/D48944 llvm-svn: 336319	2018-07-05 08:21:40 +00:00
Volodymyr Turanskyy	17c0c4e742	[ARM] [Assembler] Support negative immediates: cover few missing cases Support for negative immediates was implemented in https://reviews.llvm.org/rL298380, however few instruction options were missing. This change adds negative immediates support and respective tests for the following: ADD ADDS ADDS.W AND.W ANDS BIC.W BICS BICS.W SUB SUBS SUBS.W Differential Revision: https://reviews.llvm.org/D48649 llvm-svn: 336286	2018-07-04 16:11:15 +00:00
Fangrui Song	68169343a5	[ARM] Fix inconsistent declaration parameter name in r336195 llvm-svn: 336223	2018-07-03 19:12:27 +00:00
Sam Parker	ffc1681620	[ARM][NFC] Refactor sequential access for DSP With a view to support parallel operations that have their results stored to memory, refactor the consecutive access helper out so it could support stores instructions. Differential Revision: https://reviews.llvm.org/D48872 llvm-svn: 336195	2018-07-03 12:44:16 +00:00
Vadzim Dambrouski	fd10286e04	[ARM] Fix PR37382: Don't optimize mul.with.overflow on thumbv6m. Reviewers: efriedma, rogfer01, javed.absar Reviewed By: efriedma, rogfer01 Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D48846 llvm-svn: 336144	2018-07-02 21:05:26 +00:00
David Green	963401d2be	[UnrollAndJam] New Unroll and Jam pass This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder Loop So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 336062	2018-07-01 12:47:30 +00:00
Sjoerd Meijer	195e904002	[ARM][AArch64] Armv8.4-A Enablement Initial patch adding assembly support for Armv8.4-A. Besides adding v8.4 as a supported architecture to the usual places, this also adds target features for the different crypto algorithms. Armv8.4-A introduced new crypto algorithms, made them optional, and allows different combinations: - none of the v8.4 crypto functions are supported, which is independent of the implementation of the Armv8.0 SHA1 and SHA2 instructions. - the v8.4 SHA512 and SHA3 support is implemented, in this case the Armv8.0 SHA1 and SHA2 instructions must also be implemented. - the v8.4 SM3 and SM4 support is implemented, which is independent of the implementation of the Armv8.0 SHA1 and SHA2 instructions. - all of the v8.4 crypto functions are supported, in this case the Armv8.0 SHA1 and SHA2 instructions must also be implemented. The v8.4 crypto instructions are added to AArch64 only, and not AArch32, and are made optional extensions to Armv8.2-A. The user-facing Clang options will map on these new target features, their naming will be compatible with GCC and added in follow-up patches. The Armv8.4-A instruction sets can be downloaded here: https://developer.arm.com/products/architecture/a-profile/exploration-tools Differential Revision: https://reviews.llvm.org/D48625 llvm-svn: 335953	2018-06-29 08:43:19 +00:00
Eli Friedman	65d885e376	[ARM] Assert that ARMDAGToDAGISel creates valid UBFX/SBFX nodes. We don't ever check these again (unless you're using -fno-integrated-as), so make sure the extracted bits are well-defined. I don't think it's possible to trigger any of the assertions on trunk, but it's difficult to prove. (The first one depends on DAGCombine to minimize the number of set bits in AND masks; I think the others are mathematically impossible to hit.) llvm-svn: 335931	2018-06-28 21:49:41 +00:00
Eli Friedman	6613efbd4e	[ARM] Add missing Thumb2 assembler diagnostics. Mostly just adding checks for Thumb2 instructions which correspond to ARM instructions which already had diagnostics. While I'm here, also fix ARM-mode strd to check the input registers correctly. Differential Revision: https://reviews.llvm.org/D48610 llvm-svn: 335909	2018-06-28 19:53:12 +00:00
Simon Pilgrim	c09b5e31d7	Remove unnecessary semicolon. NFCI. Fixes -Wpedantic warning. llvm-svn: 335901	2018-06-28 18:37:16 +00:00
Matthias Braun	da5e7e11d1	SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O) Add NoTrapAfterNoreturn target option which skips emission of traps behind noreturn calls even if TrapUnreachable is enabled. Enable the feature on Mach-O to save code size; Comments suggest it is not possible to enable it for the other users of TrapUnreachable. rdar://41530228 DifferentialRevision: https://reviews.llvm.org/D48674 llvm-svn: 335877	2018-06-28 17:00:45 +00:00
Sjoerd Meijer	c89ca5582a	[ARM] Parallel DSP Pass Armv6 introduced instructions to perform 32-bit SIMD operations. The purpose of this pass is to do some straightforward IR pattern matching to create ACLE DSP intrinsics, which map on these 32-bit SIMD operations. Currently, only the SMLAD instruction gets recognised. This instruction performs two multiplications with 16-bit operands, and stores the result in an accumulator. We will follow this up with patches to recognise SMLAD in more cases, and also to generate other DSP instructions (like e.g. SADD16). Patch by: Sam Parker and Sjoerd Meijer Differential Revision: https://reviews.llvm.org/D48128 llvm-svn: 335850	2018-06-28 12:55:29 +00:00
Hans Wennborg	a257376003	s/TablesChecked/TableChecked/ after r335823 llvm-svn: 335831	2018-06-28 10:24:38 +00:00
Benjamin Kramer	f9613b2995	Unify sorted asserts to use the existing atomic pattern These are all benign races and only visible in !NDEBUG. tsan complains about it, but a simple atomic bool is sufficient to make it happy. llvm-svn: 335823	2018-06-28 10:03:45 +00:00
Ivan A. Kosarev	7231598fce	[NEON] Support vldNq intrinsics in AArch32 (LLVM part) This patch adds support for the q versions of the dup (load-to-all-lanes) NEON intrinsics, such as vld2q_dup_f16() for example. Currently, non-q versions of the dup intrinsics are implemented in clang by generating IR that first loads the elements of the structure into the first lane with the lane (to-single-lane) intrinsics, and then propagating it other lanes. There are at least two problems with this approach. First, there are no double-spaced to-single-lane byte-element instructions. For example, there is no such instruction as 'vld2.8 { d0[0], d2[0] }, [r0]'. That means we cannot rely on the to-single-lane intrinsics and instructions to implement the q versions of the dup intrinsics. Note that to-all-lanes instructions do support all sizes of data items, including bytes. The second problem with the current approach is that we need a separate vdup instruction to propagate the structure to each lane. So for vld4q_dup_f16() we would need four vdup instructions in addition to the initial vld instruction. This patch introduces dup LLVM intrinsics and reworks handling of the currently supported (non-q) NEON dup intrinsics to expand them into those LLVM intrinsics, thus eliminating the need for using to-single-lane intrinsics and instructions. Additionally, this patch adds support for u64 and s64 dup NEON intrinsics. These are marked as Arch64-only in the ARM NEON Reference, but it seems there are no reasons to not support them in AArch32 mode. Please correct, if that is wrong. That's what we generate with this patch applied: vld2q_dup_f16: vld2.16 {d0[], d2[]}, [r0] vld2.16 {d1[], d3[]}, [r0] vld3q_dup_f16: vld3.16 {d0[], d2[], d4[]}, [r0] vld3.16 {d1[], d3[], d5[]}, [r0] vld4q_dup_f16: vld4.16 {d0[], d2[], d4[], d6[]}, [r0] vld4.16 {d1[], d3[], d5[], d7[]}, [r0] Differential Revision: https://reviews.llvm.org/D48439 llvm-svn: 335733	2018-06-27 13:57:52 +00:00
Than McIntosh	3190993a02	[X86,ARM] Retain split-stack prolog check for sibling calls Summary: If a routine with no stack frame makes a sibling call, we need to preserve the stack space check even if the local stack frame is empty, since the call target could be a "no-split" function (in which case the linker needs to be able to fix up the prolog sequence in order to switch to a larger stack). This fixes PR37807. Reviewers: cherry, javed.absar Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D48444 llvm-svn: 335604	2018-06-26 14:11:30 +00:00
Tim Northover	b73efb85ba	ARM: correctly decode VFP instructions following unpredictable t2IT When the condition code for an IT instruction is "AL" we get strange "15" predicates on subsequent instructions. These are dealt with for most instructions by treating them as "ARMCC::AL", but VFP takes a different path which didn't have this code. llvm-svn: 335594	2018-06-26 11:39:20 +00:00
Tim Northover	bf54858115	ARM: diagnose unpredictable IT instructions IT instructions are allowed to have the 'AL' predicate, but it must never result in an 'NV' predicated instruction. Essentially this means that all branches must be 't' rather than 'e' if the predicate is 'AL'. This patch adds a diagnostic for this during assembly (error because parsing hits an assertion if allowed to continue) and an annotation during disassembly. llvm-svn: 335593	2018-06-26 11:38:41 +00:00
Sjoerd Meijer	1043dffbd3	Recommit of r335326, with the test fixed that I missed. llvm-svn: 335331	2018-06-22 10:03:03 +00:00
Sjoerd Meijer	7ee5b090de	Reverting r335326 while I look at the test failure llvm-svn: 335328	2018-06-22 09:17:08 +00:00
Sjoerd Meijer	8d2f1565b7	[ARM] ARMv6m and v8m.baseline strict align This sets target feature FeatureStrictAlign for Armv6-m and Armv8-m.baseline, because it has no support for unaligned accesses. It looks like we always pass target feature "+strict-align" from Clang, so this is not a user facing problem, but querying the subtarget (in e.g. llc) for unaligned access support is incorrect. Differential Revision: https://reviews.llvm.org/D48437 llvm-svn: 335326	2018-06-22 08:48:13 +00:00
David Green	21a2973cc4	[ARM] Enable useAA() for the in-order Cortex-R52 This option allows codegen (such as DAGCombine or MI scheduling) to use alias analysis information, which can help with the codegen on in-order cpu's, especially machine scheduling. Here I have done things the same way as AArch64, adding a subtarget feature to enable this for specific cores, and enabled it for the R52 where we have a schedule to make use of it. Differential Revision: https://reviews.llvm.org/D48074 llvm-svn: 335249	2018-06-21 15:48:29 +00:00
Tim Northover	644a819534	ARM: convert ORR instructions to ADD where possible on Thumb. Thumb has more 16-bit encoding space dedicated to ADD than ORR, allowing both a 3-address encoding and a wider range of immediates. So, particularly when optimizing for code size (but it doesn't make things worse elsewhere) it's beneficial to select an OR operation to an ADD if we know overflow won't occur. This is made even better by LLVM's penchant for putting operations in canonical form by converting the other way. llvm-svn: 335119	2018-06-20 12:09:44 +00:00
Daniel Sanders	8ead1290e6	[globalisel][tablegen] Add support for C++ predicates on PatFrags and use it to support BFC on ARM. So far, we've only handled special cases of PatFrag like ImmLeaf. This patch adds support for the remaining cases using similar mechanisms. Like most C++ code from SelectionDAG, GISel and DAGISel expect to operate on different types and representations and as such the code is not compatible between the two. It's therefore necessary to add an alternative implementation in the GISelPredicateCode field. The target test for this feature could easily be done with IntImmLeaf and this would save on a little boilerplate. The reason I've chosen to implement this using PatFrag.GISelPredicateCode and not IntImmLeaf is because I was unable to find a rule that was blocked solely by lack of support for PatFrag predicates. I found that the ones I investigated as being likely candidates for the test were further blocked by other things. llvm-svn: 334871	2018-06-15 23:13:43 +00:00
Krzysztof Parzyszek	82d284c1d2	[DAGCombiner] Recognize more patterns for ABS Differential Revision: https://reviews.llvm.org/D47831 llvm-svn: 334553	2018-06-12 21:51:49 +00:00
Simon Pilgrim	e39fa6cbbb	[CostModel] Replace ShuffleKind::SK_Alternate with ShuffleKind::SK_Select (PR33744) As discussed on PR33744, this patch relaxes ShuffleKind::SK_Alternate which requires shuffle masks to only match an alternating pattern from its 2 sources: e.g. v4f32: <0,5,2,7> or <4,1,6,3> This seems far too restrictive as most SIMD hardware which will implement it using a general blend/bit-select instruction, so replaces it with SK_Select, permitting elements from either source as long as they are inline: e.g. v4f32: <0,5,2,7>, <4,1,6,3>, <0,1,6,7>, <4,1,2,3> etc. This initial patch just updates the name and cost model shuffle mask analysis, later patch reviews will update SLP to better utilise this - it still limits itself to SK_Alternate style patterns. Differential Revision: https://reviews.llvm.org/D47985 llvm-svn: 334513	2018-06-12 16:12:29 +00:00
Ivan A. Kosarev	847daa11f8	[NEON] Support VST1xN intrinsics in AArch32 mode (LLVM part) We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47447 llvm-svn: 334361	2018-06-10 09:27:27 +00:00
Eli Friedman	864df22307	[ARM] Allow CMPZ transforms even if the input has multiple uses. It looks like this got left in by accident in r289794; I can't think of any reason this check would be necessary. (Maybe it was meant to be a check that the AND has one use? But we check that a few lines earlier.) Differential Revision: https://reviews.llvm.org/D47921 llvm-svn: 334322	2018-06-08 21:16:56 +00:00
Evandro Menezes	b2c8244715	[AArch64, ARM] Add support for Samsung Exynos M4 Create a separate feature set for Exynos M4 and add test cases. llvm-svn: 334115	2018-06-06 18:56:00 +00:00
Petar Jovanovic	8cb6a521be	Change TII isCopyInstr way of returning arguments(NFC) Make TII isCopyInstr() return MachineOperands through pointer to pointer instead via reference. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D47364 llvm-svn: 334105	2018-06-06 16:36:30 +00:00
Peter Smith	57f661bd7d	[MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup On targets like Arm some relaxations may only be performed when certain architectural features are available. As functions can be compiled with differing levels of architectural support we must make a judgement on whether we can relax based on the MCSubtargetInfo for the function. This change passes through the MCSubtargetInfo for the function to fixupNeedsRelaxation so that the decision on whether to relax can be made per function. In this patch, only the ARM backend makes use of this information. We must also pass the MCSubtargetInfo to applyFixup because some fixups skip error checking on the assumption that relaxation has occurred, to prevent code-generation errors applyFixup must see the same MCSubtargetInfo as fixupNeedsRelaxation. Differential Revision: https://reviews.llvm.org/D44928 llvm-svn: 334078	2018-06-06 09:40:06 +00:00
Peter Smith	ef945b2240	[MC][ARM] Add range checking for Thumb2 resolved fixups. When the branch target of a Thumb2 unconditional or conditonal branch is resolved at assembly time, no range checking is performed on the result leading to incorrect immediates. This change adds a range check: +- 16 Megabytes for unconditional branches, +- 1 Megabyte for the conditional branch. Differential Revision: https://reviews.llvm.org/D46306 llvm-svn: 333997	2018-06-05 10:00:56 +00:00
Peter Smith	0aafe0cee5	[MC][ARM] Correct Thumb BL instruction range The Thumb BL range is + or - either 16 Megabytes or 4 Megabytes depending on whether the CPU supports Thumb2 or the v8-m baseline ops. The existing check for BL range is incorrectly set at +- 32 Megabytes. This change corrects the higher range and uses the lower range if the featurebits don't have the necessary support for it. Differential Revision: https://reviews.llvm.org/D46305 llvm-svn: 333991	2018-06-05 09:32:28 +00:00
Ivan A. Kosarev	60a991ed1a	[NEON] Support VLD1xN intrinsics in AArch32 mode (LLVM part) We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47120 llvm-svn: 333825	2018-06-02 16:40:03 +00:00
Ivan A. Kosarev	73c5337a64	Revert r333819 "[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)" The LLVM part was committed instead of the Clang part. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333824	2018-06-02 16:38:38 +00:00
Ivan A. Kosarev	51f19b9ee1	[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part) We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333819	2018-06-02 16:26:42 +00:00
Roman Tereshin	667c7581ed	[GlobalISel][ARM] LegalizerInfo verifier: Adding LegalizerInfo::verify(...) call and fixing bugs exposed Reviewers: aemerson, qcolombet Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D46339 llvm-svn: 333663	2018-05-31 16:16:48 +00:00
Amaury Sechet	f47d9f30b0	[ARM] Remove code handling ADDC/ADDE/SUBC/SUBE Summary: This code is now dead as the ARM backend uses ADDCARRY/SUBCARRY/SETCCCARRY . Reviewers: rogfer01, efriedma, rengolin, javed.absar Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D47413 llvm-svn: 333544	2018-05-30 13:45:43 +00:00
Eli Friedman	63fead0f43	[ARM] Enable SETCCCARRY lowering for Thumb1. We've had Thumb1 support for ARMISD::SUBE for a while now, so this just works. Reduces codesize a bit for 64-bit integer comparisons. Differential Revision: https://reviews.llvm.org/D47387 llvm-svn: 333445	2018-05-29 18:17:16 +00:00
David Green	aee7ad0cde	Revert 333358 as it's failing on some builders. I'm guessing the tests reply on the ARM backend being built. llvm-svn: 333359	2018-05-27 12:54:33 +00:00
David Green	3034281b43	[UnrollAndJam] Add a new Unroll and Jam pass This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now-jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 llvm-svn: 333358	2018-05-27 12:11:21 +00:00
Petar Jovanovic	c051000b83	[X86][MIPS][ARM] New machine instruction property 'isMoveReg' This property is needed in order to follow values movement between registers. This property is used in TII to implement method that returns true if simple copy like instruction is recognized, along with source and destination machine operands. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D45204 llvm-svn: 333093	2018-05-23 15:28:28 +00:00
Roman Tereshin	e79d656c33	[GlobalISel][ARM] Adding HPR and QPR regclasses to FPRB regbank Also bringing ARMRegisterBankInfo::getRegBankFromRegClass implementation up to speed with the *.td-definition. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D43982 llvm-svn: 333056	2018-05-23 02:59:31 +00:00
Peter Collingbourne	dcd7d6c331	MC: Separate creating a generic object writer from creating a target object writer. NFCI. With this we gain a little flexibility in how the generic object writer is created. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47045 llvm-svn: 332868	2018-05-21 19:20:29 +00:00
Peter Collingbourne	571a3301ae	MC: Change MCAsmBackend::writeNopData() to take a raw_ostream instead of an MCObjectWriter. NFCI. To make this work I needed to add an endianness field to MCAsmBackend so that writeNopData() implementations know which endianness to use. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47035 llvm-svn: 332857	2018-05-21 17:57:19 +00:00
Tim Northover	4e3eec39fa	ARM: be conservative when asked load/store alignment of weird type. Chances are we'll be asked again after type legalization, but before that point it's better to claim misaligned accesses aren't allowed than to assert. llvm-svn: 332840	2018-05-21 12:43:54 +00:00
Peter Collingbourne	f7b81db715	MC: Change the streamer ctors to take an object writer instead of a stream. NFCI. The idea is that a client that wants split dwarf would create a specific kind of object writer that creates two files, and use it to create the streamer. Part of PR37466. Differential Revision: https://reviews.llvm.org/D47050 llvm-svn: 332749	2018-05-18 18:26:45 +00:00
Amara Emerson	0d6a26dffc	[GlobalISel][IRTranslator] Split aggregates during IR translation. We currently handle all aggregates by creating one large LLT, and letting the legalizer deal with splitting them up. However using this approach means that we can't support big endian code correctly. This patch changes the way that the IRTranslator deals with aggregate values, by splitting them up into their constituent element values. To do this, parts of the translator need to be modified to deal with multiple VRegs for a single Value. A new Value to VReg mapper is introduced to help keep compile time under control, currently there is no measurable impact on CTMark despite the extra code being generated in some cases. Patch is based on the original work of Tim Northover. Differential Revision: https://reviews.llvm.org/D46018 llvm-svn: 332449	2018-05-16 10:32:02 +00:00
Peter Collingbourne	ec8236ead1	ARM: Remove unnecessary argument. NFCI. IsLittleEndian is already a field of ARMAsmBackend. llvm-svn: 332420	2018-05-16 00:21:47 +00:00
Peter Collingbourne	76d463af0a	ARM: Deduplicate code and remove unnecessary declaration. NFCI. llvm-svn: 332419	2018-05-16 00:21:31 +00:00
Martin Storsjo	ace7ae935f	[ARM] Back up R4 and LR if calling the stack probe function Differential Revision: https://reviews.llvm.org/D46777 llvm-svn: 332298	2018-05-14 21:32:52 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Amaury Sechet	4f729f6a67	[ARM] Add support for SETCCCARRY instead of SETCCE Summary: As per title. SETCCE is deprecated and will eventually be removed. Reviewers: rogfer01, efriedma, rengolin, javed.absar Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D46512 llvm-svn: 331929	2018-05-09 22:15:51 +00:00
Shiva Chen	801bf7ebbe	[DebugInfo] Examine all uses of isDebugValue() for debug instructions. Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844	2018-05-09 02:42:00 +00:00
Amaury Sechet	f91b6a8cf7	[ARM] Select result 1 from ConvertBooleanCarryToCarryFlag's result automatically. NFC The old behavior return the value 0, which is error prone. llvm-svn: 331614	2018-05-07 01:43:42 +00:00
Craig Topper	781aa181ab	Fix a bunch of places where operator-> was used directly on the return from dyn_cast. Inspired by r331508, I did a grep and found these. Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa. llvm-svn: 331577	2018-05-05 01:57:00 +00:00
Tim Northover	28e0a6f7dd	ARM: don't try to over-align large vectors as arguments. By default LLVM thinks very large vectors get aligned to their size when passed across functions. Unfortunately no-one told the ARM backend so it doesn't trigger stack realignment and so accesses can cause the usual misalignment issues (e.g. a data abort). This changes the ABI alignment to the stack alignment, which in practice (and as a bonus) also coincides with the alignment "natural" vectors get. llvm-svn: 331451	2018-05-03 12:54:25 +00:00
Clement Courbet	6794660828	[TableGen][NFC] Make ResourceCycles definitions more explicit. https://reviews.llvm.org/D46356 llvm-svn: 331439	2018-05-03 06:08:47 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Nico Weber	432a38838d	IWYU for llvm-config.h in llvm, additions. See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff \| diffstat -l \| xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184	2018-04-30 14:59:11 +00:00
Oliver Stannard	f3632143da	[ARM] Codegen for v8.2A dot product intrinsics This adds IR intrinsics for the ARM dot-product instructions introduced in v8.2-A. Differential revision: https://reviews.llvm.org/D46106 llvm-svn: 331032	2018-04-27 12:50:40 +00:00
David Green	c4cccea4c9	[ARM] Enable misched for R52. Back when the R52 schedule was added in rL286949, there was no way to enable machine schedules in ARM for specific cores. Since then a target feature has been added. This enables the feature for R52, removing the need to manually specify compiler flags. llvm-svn: 331027	2018-04-27 11:29:49 +00:00
Nico Weber	77c5471d9f	List cpp file only once (was added in 147117 and 147117 as build fix each). llvm-svn: 330587	2018-04-23 13:11:51 +00:00
Nico Weber	5d53aed419	Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txt llvm-svn: 330584	2018-04-23 12:49:34 +00:00
Tim Northover	271d3d2771	MachO: trap unreachable instructions Debugability is more important than saving 4 bytes to let us to fall through to nonense. llvm-svn: 330073	2018-04-13 22:25:20 +00:00
Sjoerd Meijer	834f7dc7ab	[ARM] FP16 vmaxnm/vminnm scalar instructions This adds code generation support for the FP16 vmaxnm/vminnm scalar instructions. Differential Revision: https://reviews.llvm.org/D44675 llvm-svn: 330034	2018-04-13 15:34:26 +00:00
Ivan A. Kosarev	f533a6e5aa	[NEON] Support intrinsic for scalar and vector versions of the VRINTN instruction Differential Revision: https://reviews.llvm.org/D45514 llvm-svn: 330011	2018-04-13 12:45:12 +00:00
Sjoerd Meijer	ac96d7c4b3	[ARM] FP16 VSEL codegen This is a follow up of rL327695 to instruction select more variants of VSELGT and VSELGE, for which it is necessary to custom lower SELECT. More work is required in this area, which will be addressed soon: - more variants need to be regression tested, but this depends on the next point. - first LowerConstantFP need to be adjusted for fp16 values. Differential Revision: https://reviews.llvm.org/D45205 llvm-svn: 329788	2018-04-11 09:28:04 +00:00
Hiroshi Inoue	9ff2380ea6	[NFC] fix trivial typos in comments and error message "is is" -> "is", "are are" -> "are" llvm-svn: 329546	2018-04-09 04:37:53 +00:00
Tim Northover	e25e458d52	Reapply ARM: Do not spill CSR to stack on entry to noreturn functions Should fix UBSan bot by also checking there's no "uwtable" attribute before skipping. Otherwise the unwind table will be useless since its moves expect CSRs to actually be preserved. A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch mostly by myeisha (pmb). llvm-svn: 329494	2018-04-07 10:57:03 +00:00
Vitaly Buka	de5f196530	Revert "ARM: Do not spill CSR to stack on entry to noreturn functions" Breaks ubsan test TestCases/Misc/missing_return.cpp on ARM This reverts commit r329287 llvm-svn: 329486	2018-04-07 05:36:44 +00:00
Mandeep Singh Grang	9893fe218c	[ARM] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: t.p.northover, RKSimon, MatzeB, bkramer Reviewed By: bkramer Subscribers: javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D44855 llvm-svn: 329329	2018-04-05 18:31:50 +00:00
Tim Northover	b30388bf11	ARM: Do not spill CSR to stack on entry to noreturn functions A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch by myeisha (pmb). llvm-svn: 329287	2018-04-05 14:26:06 +00:00
Nico Weber	1cbd096914	Sort targetgen calls in lib/Target/*/CMakeLists. Makes it easier to see mistakes such as the one fixed in r329178 and makes the different target CMakeLists more consistent. Also remove some stale-looking comments from the Nios2 target cmakefile. No intended behavior change. llvm-svn: 329181	2018-04-04 12:37:44 +00:00
Mikhail Maltsev	68f35bcc85	[ARM] Do not convert some vmov instructions Summary: Patch https://reviews.llvm.org/D44467 implements conversion of invalid vmov instructions into valid ones. It turned out that some valid instructions also get converted, for example vmov.i64 d2, #0xff00ff00ff00ff00 -> vmov.i16 d2, #0xff00 Such behavior is incorrect because according to the ARM ARM section F2.7.7 Modified immediate constants in T32 and A32 Advanced SIMD instructions, "On assembly, the data type must be matched in the table if possible." This patch fixes the isNEONmovReplicate check so that the above instruction is not modified any more. Reviewers: rengolin, olista01 Reviewed By: rengolin Subscribers: javed.absar, kristof.beyls, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D44678 llvm-svn: 329158	2018-04-04 08:54:19 +00:00
David Blaikie	bd0c88078a	Remove some unneeded #includes to fix layering llvm-svn: 328838	2018-03-29 22:31:36 +00:00
Craig Topper	2fa1436206	[IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806	2018-03-29 17:21:10 +00:00
Christof Douma	a1e77c0e02	[ARM] Support float literals under XO Follow up patch of r328313 to support the UseVMOVSR constraint. Removed some unneeded instructions from the test and removed some stray comments. Differential Revision: https://reviews.llvm.org/D44941 llvm-svn: 328691	2018-03-28 10:02:26 +00:00
Martin Storsjo	439824622a	[ARM] Simplify constructing the ARMArchFeature string. NFC. Differential Revision: https://reviews.llvm.org/D44819 llvm-svn: 328478	2018-03-26 08:41:10 +00:00
Simon Pilgrim	351e4fa0e2	[ARM] Remove sched model instregex entries that don't match any instructions (D44687) Reviewed by @javed.absar llvm-svn: 328457	2018-03-25 19:07:17 +00:00
David Blaikie	36a0f226b1	Fix layering by moving ValueTypes.h from CodeGen to IR ValueTypes.h is implemented in IR already. llvm-svn: 328397	2018-03-23 23:58:31 +00:00
David Blaikie	13e77db2df	Fix layering of MachineValueType.h by moving it from CodeGen to Support This is used by llvm tblgen as well as by LLVM Targets, so the only common place is Support for now. (maybe we need another target for these sorts of things - but for now I'm at least making them correct & we can make them better if/when people have strong feelings) llvm-svn: 328395	2018-03-23 23:58:25 +00:00
David Blaikie	6054e650ff	Move TargetLoweringObjectFile from CodeGen to Target to fix layering It's implemented in Target & include from other Target headers, so the header should be in Target. llvm-svn: 328392	2018-03-23 23:58:19 +00:00

1 2 3 4 5 ...

9736 Commits